whitepapers/notes.md
2025-01-03 11:50:49 -06:00

notes

profiling a warehouse-scale computer

cassandra

bigtable

gfs

  • System design driven by the target use case

    • Optimized for record appends and large streaming reads (small random reads supported)
  • Master-Slave

    • Limitations: fault tolerance despite replicas, single-master throughput
  • Bottlenecks & network optimization

    • Data & Control flow separation
  • State restoration & logging (lots of things I don't get here)

    • Related: OS journaling
  • Weak consistency - "tolerable errors" (e.g. clients reading different states)

  • Garbage Collection

    • Amortized cost w/ FS scans
    • Parallels w/ language design
  • Terms to learn:

    1. Network Bandwidth and per-machine limit
    2. Racks & data centers - how are these managed (i.e. "cross-{rack,DC} replication")?
  • Use the latest {soft,hard}ware or deal with slowdowns (e.g. older Linux kernels where fsync() cost scaled with total file size, not just the appended data)

  • Getting to know the real numbers: 440 MB/s re-replication throughput after killing two chunkservers, on Google's network

  • Network as the ultimate bottleneck & inefficiency
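
The state restoration & logging bullet above is the same pattern as OS journaling: log a mutation durably before applying it, and on restart rebuild in-memory state by replaying the log. A minimal sketch, assuming a JSON-lines log of key-value mutations (the `LoggedKV` class and file layout are illustrative, not the GFS master's actual format):

```python
import json
import os
import tempfile

class LoggedKV:
    """Write-ahead-logged key-value state with replay-based restoration."""

    def __init__(self, log_path):
        self.log_path = log_path
        self.state = {}
        self._replay()

    def _replay(self):
        # State restoration: rebuild in-memory state from the durable log.
        if not os.path.exists(self.log_path):
            return
        with open(self.log_path) as f:
            for line in f:
                rec = json.loads(line)
                self.state[rec["k"]] = rec["v"]

    def put(self, k, v):
        # Log first (durably), then apply: a crash between the two steps
        # is recovered on restart by replay.
        with open(self.log_path, "a") as f:
            f.write(json.dumps({"k": k, "v": v}) + "\n")
            f.flush()
            os.fsync(f.fileno())
        self.state[k] = v

# Simulate a crash/restart: a fresh instance replays the same log.
path = os.path.join(tempfile.mkdtemp(), "oplog.jsonl")
kv = LoggedKV(path)
kv.put("chunk1", ["serverA", "serverB"])
restored = LoggedKV(path)
```

A real system would bound replay time by periodically checkpointing `state` and replaying only log records written after the checkpoint.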
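
The garbage-collection bullet amortizes reclamation with scans: deletion is just a rename to a hidden tombstone, and a periodic namespace scan reclaims tombstones older than a grace period. A sketch of that idea, with illustrative names and an in-memory namespace (the ~3-day grace period is from the GFS paper; everything else here is assumed):

```python
import time

GRACE_SECONDS = 3 * 24 * 3600  # GFS paper's default grace period (~3 days)

class Namespace:
    """Lazy, scan-amortized garbage collection of deleted files."""

    def __init__(self):
        self.files = {}       # visible name -> payload
        self.tombstones = {}  # hidden name -> (payload, deleted_at)

    def delete(self, name):
        # O(1) delete: just hide the file; no eager chunk reclamation.
        self.tombstones[".deleted." + name] = (self.files.pop(name), time.time())

    def scan(self, now=None):
        # Periodic background scan reclaims expired tombstones in bulk,
        # piggybacking on namespace scans the master performs anyway.
        now = time.time() if now is None else now
        expired = [n for n, (_, t) in self.tombstones.items()
                   if now - t > GRACE_SECONDS]
        for n in expired:
            del self.tombstones[n]
        return expired

ns = Namespace()
ns.files["/logs/old"] = "payload"
ns.delete("/logs/old")
reclaimed = ns.scan(now=time.time() + GRACE_SECONDS + 1)
```

The language-design parallel in the notes fits: like tracing GC, reclamation is decoupled from the "free" operation and batched, trading immediacy for simplicity and safety (undelete within the grace period).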

mapreduce

  • mapreduce: map(k0, v0) -> list(k1, v1); shuffle groups by k1; reduce(k1, list(v1)) -> list(v2)
  • Master-Slave assigns map/reduce tasks
  • Separate M & R -> M >> R (usually) -> optimize worker allocation
  • Map & reduce individually parallelized, but not overall
    • Each reducer waits for all of its intermediate kv pairs (the master signals when map tasks finish), then sorts them by key -> output within each partition is sorted
  • RPC remote file read for data transfer from M -> R
  • Fault tolerance by re-execution: failed map/reduce tasks are rerun individually (completed map tasks too, since their output sits on the failed worker's local disk)
  • Backup tasks: speculatively reschedule straggler tasks near the end of a phase -> 44% speedup (task slow on one machine -> duplicate it elsewhere)
  • Caching & Network Topology: schedule workers close to internal GFS chunkservers to minimize latency
  • Simplicity + abstraction - not optimal, but first of its kind and made waves
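
The dataflow above can be sketched as a single-process word count (the canonical example): map emits (k1, v1) pairs, a shuffle step groups values by key, and reduce folds each group. Function names and the dict-based shuffle are illustrative, not the paper's C++ API:

```python
from collections import defaultdict

def map_fn(doc_id, text):
    # map(k0, v0) -> list(k1, v1): emit (word, 1) per occurrence.
    for word in text.split():
        yield word, 1

def reduce_fn(word, counts):
    # reduce(k1, list(v1)) -> list(v2): sum the counts for one word.
    yield sum(counts)

def run_mapreduce(inputs, map_fn, reduce_fn):
    # "Shuffle": collect all intermediate pairs, grouped by key.
    intermediate = defaultdict(list)
    for k0, v0 in inputs:
        for k1, v1 in map_fn(k0, v0):
            intermediate[k1].append(v1)
    # Reduce each key's value list; iterating keys in sorted order mirrors
    # the sorted per-partition output noted above.
    return {k: list(reduce_fn(k, vs)) for k, vs in sorted(intermediate.items())}

result = run_mapreduce([("d1", "a b a"), ("d2", "b c")], map_fn, reduce_fn)
# result == {"a": [2], "b": [2], "c": [1]}
```

In the real system the shuffle is the distributed part: map workers partition intermediate pairs by hash(k1) onto local disk, and reducers pull their partition over RPC, as the notes describe.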

spark