# profiling
## intro
### instrumentation: overall timing (usual idea)
- while timing a clock with `time_the_clock.cc`, i erroneously default-initialized a
`chrono::microseconds` (no initializer). the STL source shows why this is a problem: the default
constructor is defaulted, so the `Rep` (the representation type that stores the tick count) is
left uninitialized. the `Rep` of `microseconds` is a `long` here, so the value was garbage.
- NOTE: timing the clock (with no optimization) took ~0ns and is not significant for my profiling. i checked this by compiling with `-O0` and de-mangling the assembly with `c++filt`: both calls to `now()` are present, so nothing is being optimized out:
```asm
.L585:
call std::chrono::_V2::system_clock::now()@PLT
movq %rax, -64(%rbp)
call std::chrono::_V2::system_clock::now()@PLT
```
However, I acknowledge that the CPU does not necessarily run this code one-for-one. I don't have the
knowledge to assess effects inside the actual CPU itself, such as caching, or compiler-side
effects like inlining in optimized builds.
So far, though, one lesson of profiling is:
> Only time the code you *actually* mean to profile
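
A minimal sketch of the two points above, not the contents of `time_the_clock.cc`: value-initializing the duration with `{}` avoids the garbage value, and the cost of reading the clock can be estimated from back-to-back `now()` calls. The helper names `zeroed_duration` and `average_clock_read_ns` are hypothetical, and `steady_clock` is swapped in for `system_clock` here on the assumption that a monotonic clock is wanted for intervals.

```cpp
#include <chrono>

// Value-initialization (`{}`) zeroes the tick count. Writing
// `std::chrono::microseconds us;` instead would leave it indeterminate.
std::chrono::microseconds zeroed_duration() {
    std::chrono::microseconds us{};  // count() == 0
    return us;
}

// Hypothetical helper: average cost of one clock read in nanoseconds,
// estimated from `reads` back-to-back calls to steady_clock::now().
// Assumes reads > 0.
double average_clock_read_ns(int reads) {
    using clock = std::chrono::steady_clock;
    const auto start = clock::now();
    auto last = start;
    for (int i = 0; i < reads; ++i) {
        last = clock::now();
    }
    const std::chrono::duration<double, std::nano> total = last - start;
    return total.count() / reads;
}
```

Calling `average_clock_read_ns(1'000'000)` gives a rough per-read figure for the machine at hand. If it is tiny next to the region being timed, the clock overhead can be ignored, as the note above concludes.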
### statistical profiling
- more accurate by *experimental* virtue: a probabilistic look at where the time goes
- imagine the execution of the program as a "strip" of time, sampled at regular intervals
- system-specific, but so was the instrumentation approach before
- statistical: stats
- program simulation: ll
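
The "strip" picture can be sketched as a small simulation, assuming a made-up trace of which function ran for how long. This is not a real profiler: `Segment` and `sample_strip` are hypothetical names, and the sampling here is deterministic, whereas a real sampler interrupts the running program on a timer.

```cpp
#include <map>
#include <string>
#include <vector>

// One segment of the "strip": `name` ran for `ticks` units of time.
struct Segment {
    std::string name;
    long ticks;
};

// Sample the strip every `period` ticks, starting at tick `offset`, and
// count how many samples land in each function. A function's share of
// the samples estimates its share of the run time.
std::map<std::string, long> sample_strip(const std::vector<Segment>& strip,
                                         long period, long offset = 0) {
    std::map<std::string, long> hits;
    if (period <= 0) {
        return hits;  // a non-positive period would never advance
    }
    long segment_start = 0;
    long next_sample = offset;
    for (const Segment& seg : strip) {
        const long segment_end = segment_start + seg.ticks;
        while (next_sample < segment_end) {
            ++hits[seg.name];
            next_sample += period;
        }
        segment_start = segment_end;
    }
    return hits;
}
```

For a strip of `{"parse", 30}, {"solve", 60}, {"print", 10}` sampled every 10 ticks, the counts come out as 3, 6 and 1, matching the 30/60/10 split. A coarser period gives a noisier estimate, which is the probabilistic side of the approach.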
## exercise 1
- use cachegrind/valgrind/asan on cf problem
- apply bentley's rules to some code
- begin developing a strategy for how to profile things