notes
profiling a warehouse-scale computer
cassandra
bigtable
gfs
-
System design driven by the target use case
- Optimized for record append and random reads
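A minimal sketch of the record-append idea (toy class, hypothetical names, not the real GFS RPCs): the system picks the offset and returns it to the client, and retries under at-least-once semantics can leave duplicate records that readers must tolerate.

```python
# Toy sketch of GFS-style record append (hypothetical API, not real GFS).
# The system chooses the append offset and returns it; clients must
# tolerate at-least-once semantics (retries may duplicate records).

class ToyChunk:
    def __init__(self):
        self.records = []

    def record_append(self, record):
        """Append atomically; the system picks and returns the offset."""
        offset = len(self.records)
        self.records.append(record)
        return offset

    def read_at(self, offset):
        """Random read at a previously returned offset."""
        return self.records[offset]

chunk = ToyChunk()
off_a = chunk.record_append("event-a")
off_b = chunk.record_append("event-b")
assert chunk.read_at(off_a) == "event-a"
```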
-
Master-Slave
- Limitations: fault tolerance (despite replicas), throughput
-
Bottlenecks & network optimization
- Data & Control flow separation
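One way to picture the data/control split (a toy sketch, all names hypothetical): the master only decides the replica chain (control plane), while the bulk data is pushed linearly from one chunkserver to the next, so each machine's outbound link carries the payload only once.

```python
# Toy sketch of data/control-flow separation (hypothetical names).
# Control: the master picks the replica chain. Data: the payload is
# forwarded along the chain, replica to replica, instead of the client
# sending a separate copy to every replica.

def master_pick_chain(replicas):
    # Control-plane decision only: order the replicas
    # (e.g., by network distance in a real system).
    return list(replicas)

def push_data(chain, payload, storage):
    # Data-plane: forward the payload down the chain.
    for server in chain:
        storage.setdefault(server, []).append(payload)

storage = {}
chain = master_pick_chain(["rack1/cs1", "rack1/cs2", "rack2/cs1"])
push_data(chain, b"chunk-bytes", storage)
assert all(storage[s] == [b"chunk-bytes"] for s in chain)
```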
-
State restoration & logging (lots of things I don't get here)
- Related: OS journaling
-
Weak consistency - "tolerable errors" (e.g. clients reading different states)
-
Garbage Collection
- Amortized cost w/ FS scans
- Parallels w/ language design
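A sketch of the amortized-GC idea (toy class, hypothetical names and grace period): deletion only renames the file to a hidden name, and a periodic namespace scan reclaims anything hidden longer than the grace period, batching reclamation into regular scans.

```python
# Toy sketch of GFS-style lazy garbage collection (hypothetical names).
# delete() only hides the file; a background scan reclaims hidden files
# older than a grace period, amortizing reclamation into regular scans.

GRACE_SECONDS = 3 * 24 * 3600  # assumed grace period for this sketch

class ToyNamespace:
    def __init__(self):
        self.files = {}   # name -> data
        self.hidden = {}  # hidden name -> (deletion time, data)

    def delete(self, name, now):
        # "Deletion" is just a rename to a hidden name: cheap, undoable.
        self.hidden[f".deleted.{name}"] = (now, self.files.pop(name))

    def scan(self, now):
        # Periodic scan: reclaim files hidden past the grace period.
        expired = [n for n, (t, _) in self.hidden.items()
                   if now - t > GRACE_SECONDS]
        for n in expired:
            del self.hidden[n]
        return expired

ns = ToyNamespace()
ns.files["log"] = b"data"
ns.delete("log", now=0)
assert ns.scan(now=1) == []  # still within the grace period
assert ns.scan(now=GRACE_SECONDS + 1) == [".deleted.log"]
```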
-
Terms to learn:
- Network Bandwidth and per-machine limit
- Racks & data centers - how are these managed (i.e. "cross-{rack,DC} replication")?
-
Use the latest {soft,hard}ware or deal with slowdowns (an older kernel's fsync() required reading the entire file on append)
-
Getting to know the real numbers: 440 MB/s recovery throughput after killing two chunkservers, on Google's network
-
Network as the ultimate bottleneck & inefficiency
mapreduce
- mapreduce: map(k0, v0) -> list(k1, v1); reduce(k1, list(v1)) -> list(v1)
- Master-Slave assigns map/reduce tasks
- Separate M & R -> M >> R (usually) -> optimize worker allocation
- Map & reduce individually parallelized, but not overall
- Reducer is told by the master when map outputs are ready, then sorts all intermediate kv pairs by key -> this is why each reducer's output is sorted
- RPC remote file read for data transfer from M -> R
- Re-execute failed M/R tasks for fault tolerance (completed map tasks re-run too on worker failure, since their output is on local disk)
- Backup Tasks: dynamic performance adjustment -> 44% speedup (task slow on one machine -> schedule a duplicate elsewhere)
- Locality & network topology: schedule map workers on or near the GFS chunkservers holding their input to save network bandwidth
- Simplicity + abstraction - not optimal, but first of its kind and made waves
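The map/reduce signature can be exercised with the classic word-count example. A single-process sketch (no real cluster, master, or RPC; names are mine), keeping the map, shuffle, and reduce phases separate to mirror the model:

```python
from collections import defaultdict

# Single-process word-count sketch of the MapReduce signature
# map(k0, v0) -> list(k1, v1); reduce(k1, list(v1)) -> list(v1).
# No cluster/master/RPC here; phases are kept separate to mirror the model.

def map_fn(_doc_name, text):
    return [(word, 1) for word in text.split()]

def reduce_fn(_word, counts):
    return [sum(counts)]

def mapreduce(inputs, map_fn, reduce_fn):
    # Map phase (parallelizable per input split in a real system).
    intermediate = defaultdict(list)
    for k0, v0 in inputs:
        for k1, v1 in map_fn(k0, v0):
            intermediate[k1].append(v1)
    # Reduce phase: keys processed in sorted order, hence sorted output.
    return {k1: reduce_fn(k1, vs) for k1, vs in sorted(intermediate.items())}

result = mapreduce([("d1", "a b a"), ("d2", "b c")], map_fn, reduce_fn)
assert result == {"a": [2], "b": [2], "c": [1]}
```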