# profiling

## intro

### instrumentation: overall timing (usual idea)
- while timing the clock with `time_the_clock.cc`, i erroneously default-initialized a
  `chrono::microseconds`. this can be seen in the STL source: the default constructor
  is defaulted (`= default`), so the `Rep` (the representation type, i.e. the arithmetic
  type that stores the tick count) is only default-initialized. A `microseconds` has a
  `long` as its `Rep` here, so the uninitialized value was garbage. Value-initializing
  with `{}` (or using `microseconds::zero()`) gives a well-defined 0 instead.
- NOTE: timing the clock itself (with no optimization) took ~0ns, so it is not significant for my profiling. I checked this by compiling with `-O0` and demangling the assembly with `c++filt`; both calls to `now()` are still there, so nothing is being optimized out:
```asm
.L585:
    call    std::chrono::_V2::system_clock::now()@PLT
    movq    %rax, -64(%rbp)
    call    std::chrono::_V2::system_clock::now()@PLT
```
However, I acknowledge that the CPU does not necessarily run this code one-for-one. I don't have the
knowledge to assess caching done by the actual CPU itself, or compiler-side
effects like inlining.

So far, though, one lesson of profiling is:

> Only time the code *actually* being profiled
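The two notes above (the uninitialized `chrono::microseconds` and the cost of the clock itself) can be sketched together. This is a minimal sketch, not the contents of `time_the_clock.cc`: `average_clock_overhead` is a hypothetical helper, and `steady_clock` is an assumption swapped in for the `system_clock` seen in the assembly, because it is monotonic and so the differences cannot go negative.

```cpp
#include <cassert>
#include <chrono>

// A value-initialized duration is guaranteed to hold 0 ticks.
static_assert(std::chrono::microseconds{}.count() == 0,
              "value-initialized duration is zero");

// Hypothetical helper: average cost of one back-to-back pair of now() calls.
std::chrono::nanoseconds average_clock_overhead(long iterations) {
    assert(iterations > 0);
    // Assumption: steady_clock instead of system_clock (monotonic).
    using clock = std::chrono::steady_clock;

    // Value-initialized with {}: the tick count starts at 0.
    // `std::chrono::nanoseconds total;` would leave it indeterminate.
    std::chrono::nanoseconds total{};

    for (long i = 0; i < iterations; ++i) {
        const auto start = clock::now();
        const auto stop = clock::now();  // nothing timed in between: pure clock cost
        total += std::chrono::duration_cast<std::chrono::nanoseconds>(stop - start);
    }
    return total / iterations;
}
```

Calling `average_clock_overhead(1000000).count()` from `main` gives a rough per-measurement overhead in nanoseconds. If that number is small next to the code being timed, the clock can be ignored; if not, it should be subtracted or the timed region made longer.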
### statistical profiling
- more accurate by *experimental* virtue: a probabilistic look at where the time goes
- imagine the execution of the program as a "strip" of time, sampled at intervals
- system-specific, but so was the instrumentation approach above

- statistical: stats
- program simulation: ll
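The "strip" picture can be turned into a toy simulation. This is only a sketch under stated assumptions: `Segment` and `sample_strip` are hypothetical names, the strip is given up front rather than observed from a running program, and samples are taken at a fixed period instead of by a real timer interrupt.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Hypothetical model of the strip: consecutive segments, each spent in one function.
struct Segment {
    std::string function;  // which function is running
    long duration_us;      // how long it runs, in microseconds
};

// Sample the strip every `period_us` microseconds and return, per function,
// the fraction of samples that landed in it.
std::map<std::string, double> sample_strip(const std::vector<Segment>& strip,
                                           long period_us) {
    assert(period_us > 0);
    std::map<std::string, long> hits;
    long total_samples = 0;
    long segment_start = 0;  // where the current segment begins on the strip
    long next_sample = 0;    // position of the next sample on the strip

    for (const Segment& s : strip) {
        const long segment_end = segment_start + s.duration_us;
        while (next_sample < segment_end) {
            ++hits[s.function];  // the sample landed inside this segment
            ++total_samples;
            next_sample += period_us;
        }
        segment_start = segment_end;
    }

    std::map<std::string, double> share;
    for (const auto& entry : hits) {
        share[entry.first] = static_cast<double>(entry.second) / total_samples;
    }
    return share;
}
```

For a strip of 300us in `parse` followed by 700us in `solve`, sampling every 100us gives shares of 0.3 and 0.7, which match the true split. Sampling every 400us gives 1/3 and 2/3, so the estimate is only as good as the number of samples behind it.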
## exercise 1

- use cachegrind/valgrind/asan on cf problem
- apply bentley's rules to some code
- begin developing a strategy for how to profile things
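As a starting point for the second item, here is a small sketch of one rule, code motion out of loops (my pick, as I recall it from Bentley's list; the notes don't say which rules to apply). Both function names are hypothetical, and the input is assumed to be non-zero.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Before: the norm does not depend on the loop variable,
// yet it is recomputed for every element (O(n^2) work).
std::vector<double> normalize_naive(const std::vector<double>& v) {
    std::vector<double> out;
    out.reserve(v.size());
    for (double x : v) {
        double sum_sq = 0.0;
        for (double y : v) sum_sq += y * y;
        out.push_back(x / std::sqrt(sum_sq));
    }
    return out;
}

// After: the loop-invariant norm is computed once, outside the loop (O(n) work).
std::vector<double> normalize_hoisted(const std::vector<double>& v) {
    double sum_sq = 0.0;
    for (double y : v) sum_sq += y * y;
    const double norm = std::sqrt(sum_sq);
    assert(v.empty() || norm > 0.0);  // an all-zero input cannot be normalized

    std::vector<double> out;
    out.reserve(v.size());
    for (double x : v) out.push_back(x / norm);
    return out;
}
```

The two versions should return the same values, so timing them against each other on a large input is a self-contained first profiling target.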