notes
profiling a warehouse-scale computer
cassandra
bigtable
gfs
-
System design driven by the target use case
- Optimized for record append and random reads
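A minimal sketch of the record-append idea (toy class, hypothetical names, not the real GFS RPCs): the system picks the offset and returns it to the client, and retries under at-least-once semantics can leave duplicate records that readers must tolerate.

```python
# Toy sketch of GFS-style record append (hypothetical API, not real GFS).
# The system chooses the append offset and returns it; clients must
# tolerate at-least-once semantics (retries may duplicate records).

class ToyChunk:
    def __init__(self):
        self.records = []

    def record_append(self, record):
        """Append atomically; the system picks and returns the offset."""
        offset = len(self.records)
        self.records.append(record)
        return offset

    def read_at(self, offset):
        """Random read at a previously returned offset."""
        return self.records[offset]

chunk = ToyChunk()
off_a = chunk.record_append("event-a")
off_b = chunk.record_append("event-b")
assert chunk.read_at(off_a) == "event-a"
```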
-
Master-Slave
- Limitations: fault tolerance (despite replicas), throughput
-
Bottlenecks & network optimization
- Data & Control flow separation
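One way to picture the data/control split (a toy sketch, all names hypothetical): the master only decides the replica chain (control plane), while the bulk data is pushed linearly from one chunkserver to the next, so each machine's outbound link carries the payload only once.

```python
# Toy sketch of data/control-flow separation (hypothetical names).
# Control: the master picks the replica chain. Data: the payload is
# forwarded along the chain, replica to replica, instead of the client
# sending a separate copy to every replica.

def master_pick_chain(replicas):
    # Control-plane decision only: order the replicas
    # (e.g., by network distance in a real system).
    return list(replicas)

def push_data(chain, payload, storage):
    # Data-plane: forward the payload down the chain.
    for server in chain:
        storage.setdefault(server, []).append(payload)

storage = {}
chain = master_pick_chain(["rack1/cs1", "rack1/cs2", "rack2/cs1"])
push_data(chain, b"chunk-bytes", storage)
assert all(storage[s] == [b"chunk-bytes"] for s in chain)
```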
-
State restoration & logging (lots of things I don't get here)
- Related: OS journaling
-
Weak consistency - "tolerable errors" (e.g. clients reading different states)
-
Garbage Collection
- Amortized cost w/ FS scans
- Parallels w/ language design
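A sketch of the amortized-GC idea (toy class, hypothetical names and grace period): deletion only renames the file to a hidden name, and a periodic namespace scan reclaims anything hidden longer than the grace period, batching reclamation into regular scans.

```python
# Toy sketch of GFS-style lazy garbage collection (hypothetical names).
# delete() only hides the file; a background scan reclaims hidden files
# older than a grace period, amortizing reclamation into regular scans.

GRACE_SECONDS = 3 * 24 * 3600  # assumed grace period for this sketch

class ToyNamespace:
    def __init__(self):
        self.files = {}   # name -> data
        self.hidden = {}  # hidden name -> (deletion time, data)

    def delete(self, name, now):
        # "Deletion" is just a rename to a hidden name: cheap, undoable.
        self.hidden[f".deleted.{name}"] = (now, self.files.pop(name))

    def scan(self, now):
        # Periodic scan: reclaim files hidden past the grace period.
        expired = [n for n, (t, _) in self.hidden.items()
                   if now - t > GRACE_SECONDS]
        for n in expired:
            del self.hidden[n]
        return expired

ns = ToyNamespace()
ns.files["log"] = b"data"
ns.delete("log", now=0)
assert ns.scan(now=1) == []  # still within the grace period
assert ns.scan(now=GRACE_SECONDS + 1) == [".deleted.log"]
```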
-
Terms to learn:
- Network Bandwidth and per-machine limit
- Racks & data centers - how are these managed (i.e. "cross-{rack,DC} replication")?
-
Use the latest {soft,hard}ware or deal with slowdowns (an older kernel's fsync() required reading the entire file on append)
-
Getting to know the real numbers: 440 MB/s recovery throughput after killing two chunkservers, on Google's network
-
Network as the ultimate bottleneck & inefficiency
mapreduce
- mapreduce: map(k0, v0) -> list(k1, v1); reduce(k1, list(v1)) -> list(v1)
- Master-Slave assigns map/reduce tasks
- Separate M & R -> M >> R (usually) -> optimize worker allocation
- Map & reduce individually parallelized, but not overall
- Reducer is told by the master when map outputs are ready, then sorts all intermediate kv pairs by key -> this is why each reducer's output is sorted
- RPC remote file read for data transfer from M -> R
- Re-execute failed M/R tasks for fault tolerance (completed map tasks re-run too on worker failure, since their output is on local disk)
- Backup Tasks: dynamic performance adjustment -> 44% speedup (task slow on one machine -> schedule a duplicate elsewhere)
- Locality & network topology: schedule map workers on or near the GFS chunkservers holding their input to save network bandwidth
- Simplicity + abstraction - not optimal, but first of its kind and made waves
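The map/reduce signature can be exercised with the classic word-count example. A single-process sketch (no real cluster, master, or RPC; names are mine), keeping the map, shuffle, and reduce phases separate to mirror the model:

```python
from collections import defaultdict

# Single-process word-count sketch of the MapReduce signature
# map(k0, v0) -> list(k1, v1); reduce(k1, list(v1)) -> list(v1).
# No cluster/master/RPC here; phases are kept separate to mirror the model.

def map_fn(_doc_name, text):
    return [(word, 1) for word in text.split()]

def reduce_fn(_word, counts):
    return [sum(counts)]

def mapreduce(inputs, map_fn, reduce_fn):
    # Map phase (parallelizable per input split in a real system).
    intermediate = defaultdict(list)
    for k0, v0 in inputs:
        for k1, v1 in map_fn(k0, v0):
            intermediate[k1].append(v1)
    # Reduce phase: keys processed in sorted order, hence sorted output.
    return {k1: reduce_fn(k1, vs) for k1, vs in sorted(intermediate.items())}

result = mapreduce([("d1", "a b a"), ("d2", "b c")], map_fn, reduce_fn)
assert result == {"a": [2], "b": [2], "c": [1]}
```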