centralized performance resources
commit
50b15a1522
63 changed files with 328466 additions and 0 deletions
39
ocw/profiling/notes.md
Normal file
@@ -0,0 +1,39 @@
# profiling

## intro

### instrumentation: overall timing (usual idea)

- while timing a clock with `time_the_clock.cc`, i erroneously default-initialized a
  `chrono::microseconds`. this can be seen in the STL source: the default constructor
  is defaulted, so the `Rep` (the representation type that stores the tick count; the
  unit is the separate `Period` parameter) is default-initialized too. the `Rep` of
  `microseconds` is a `long` here, so it was never set and held garbage.
- NOTE: timing the clock (with no optimization) took ~0ns, so it is not significant for my profiling. to check that this was a real measurement, i compiled with `-O0` and de-mangled the assembly with `c++filt`; both calls to `now()` are present and nothing is being optimized out:

```asm
.L585:
    call    std::chrono::_V2::system_clock::now()@PLT
    movq    %rax, -64(%rbp)
    call    std::chrono::_V2::system_clock::now()@PLT
```

However, I acknowledge that the CPU does not necessarily execute this code
one-for-one as written. I don't have the knowledge to assess caching done by
the CPU itself, or other effects such as inlining.
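
a minimal sketch of both points above, under my own assumptions: `average_clock_read_ns` and `zeroed_microseconds` are hypothetical helpers (not the contents of `time_the_clock.cc`), and i swapped in `steady_clock` for `system_clock` because it is monotonic, which suits interval timing. it shows value-initializing a duration so the tick count starts at zero, and averaging the cost of back-to-back `now()` calls.

```cpp
#include <chrono>
#include <cstdint>

// Default-initialization leaves the tick count (Rep) indeterminate:
//   std::chrono::microseconds d;    // reading d.count() here is undefined behavior
// Value-initialization zeroes it:
//   std::chrono::microseconds d{};  // d.count() == 0

// Hypothetical helper: returns a value-initialized (zeroed) duration.
inline std::chrono::microseconds zeroed_microseconds() {
    return std::chrono::microseconds{};
}

// Hypothetical helper: average cost of one pair of back-to-back clock reads,
// in nanoseconds, over `iterations` repetitions (assumed to be > 0).
inline double average_clock_read_ns(std::int64_t iterations) {
    using clock = std::chrono::steady_clock;  // assumption: monotonic clock
    std::chrono::nanoseconds total{};         // value-initialized, starts at zero
    for (std::int64_t i = 0; i < iterations; ++i) {
        const auto start = clock::now();
        const auto stop = clock::now();
        total += std::chrono::duration_cast<std::chrono::nanoseconds>(stop - start);
    }
    return static_cast<double>(total.count()) / static_cast<double>(iterations);
}
```

averaging over many iterations matters because a single pair of reads can land below the clock's resolution and report 0ns, which may be what the ~0ns figure reflects.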

So far, though, one lesson of profiling is:

> Only time the code you *actually* intend to profile

### statistical profiling

- more accurate by *experimental* virtue: a probabilistic look at where time goes
- imagine the execution of the program as a "strip" of time, sampled at random instants
- system-specific, but so was the instrumentation approach before
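
the "strip" picture can be simulated directly. this is a toy sketch, not a real profiler: `Segment` and `estimate_shares` are names i made up, and the strip is a hand-written list of (function, duration) pairs rather than a running program. it picks uniformly random instants on the strip and counts which function each instant lands in; the hit fraction estimates that function's share of the total time.

```cpp
#include <cstddef>
#include <map>
#include <random>
#include <string>
#include <vector>

// One piece of the "strip": which function ran and for how long.
struct Segment {
    std::string function;
    double duration;  // arbitrary time units, assumed > 0
};

// Sample the strip at `samples` random instants (assumed > 0) and return the
// estimated share of total time per function, computed as hits / samples.
inline std::map<std::string, double> estimate_shares(
        const std::vector<Segment>& strip, std::size_t samples, unsigned seed) {
    double total = 0.0;
    for (const auto& s : strip) total += s.duration;

    std::mt19937 rng(seed);
    std::uniform_real_distribution<double> instant(0.0, total);

    std::map<std::string, std::size_t> hits;
    for (std::size_t i = 0; i < samples; ++i) {
        const double t = instant(rng);
        // Walk the strip to find the segment containing instant t.
        std::size_t index = strip.size() - 1;  // fallback guards against rounding
        double end = 0.0;
        for (std::size_t j = 0; j < strip.size(); ++j) {
            end += strip[j].duration;
            if (t < end) { index = j; break; }
        }
        ++hits[strip[index].function];
    }

    std::map<std::string, double> shares;
    for (const auto& entry : hits) {
        shares[entry.first] =
            static_cast<double>(entry.second) / static_cast<double>(samples);
    }
    return shares;
}
```

for a strip of `parse` (1 unit), `compute` (8 units), and `write` (1 unit), the estimate for `compute` should approach 0.8 as the sample count grows; the error shrinks roughly with the square root of the number of samples, which is the "experimental" part.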

- statistical: stats
- program simulation: ll

## exercise 1

- use cachegrind/valgrind/asan on cf problem
- apply bentley's rules to some code
- begin developing a strategy for how to profile things