feat(notes): gfs
## [gfs](https://static.googleusercontent.com/media/research.google.com/en//archive/gfs-sosp2003.pdf)
- System design as development for a specific use case
- Optimized for record append and random reads
- Master-slave architecture (single master, many chunkservers)
- Limitations: fault tolerance despite replicas, throughput
- Bottlenecks & network optimization
- Data & control flow separation
- State restoration & logging (lots of things I don't get here)
  - Related: OS journaling
- Weak consistency - "tolerable errors" (i.e. clients reading different states)
- Garbage collection
  - Amortized cost w/ FS scans
  - Parallels w/ language design
- Terms to learn:
  1. Network bandwidth and _per-machine_ limit
  2. Racks & data centers - how are these managed (i.e. "cross-{rack,DC} replication")?
- Use the latest {soft,hard}ware or deal with slowdowns (e.g. an older kernel whose `fsync()` cost scaled with total file size rather than with the appended portion)
- Getting to know the real numbers: 440 MB/s aggregate throughput after a double chunkserver kill, on Google's network
- Network as the ultimate bottleneck & inefficiency
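The state-restoration notes above can be sketched as a minimal operation log: the master records each metadata mutation durably before applying it, then rebuilds its in-memory state after a crash by replaying the log — the same idea as OS journaling. `OpLog`, `apply`, and `replay` are hypothetical names for illustration, not from the paper.

```python
# Hedged sketch of log-then-apply state restoration.
import json

class OpLog:
    def __init__(self):
        self.records = []  # stands in for an append-only log file on disk

    def append(self, op):
        self.records.append(json.dumps(op))  # the durable write happens first

def apply(state, op):
    # apply one metadata mutation to the in-memory namespace
    if op["kind"] == "create":
        state[op["path"]] = []
    elif op["kind"] == "add_chunk":
        state[op["path"]].append(op["chunk"])
    return state

def replay(log):
    # state restoration: rebuild everything from the log alone
    state = {}
    for rec in log.records:
        apply(state, json.loads(rec))
    return state

log = OpLog()
for op in [{"kind": "create", "path": "/a"},
           {"kind": "add_chunk", "path": "/a", "chunk": "c1"}]:
    log.append(op)   # log first; in-memory state is disposable
# ...crash here: memory is lost, the log is not
recovered = replay(log)
assert recovered == {"/a": ["c1"]}
```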
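The garbage-collection bullets work the same lazy way: in GFS, deleting a file only renames it to a hidden name, and a periodic namespace scan later reclaims hidden files older than a grace period (three days by default), amortizing reclamation cost across scans. The dict-based namespace and function names below are illustrative, not the paper's interfaces.

```python
# Hedged sketch of lazy, scan-amortized garbage collection.
GRACE = 3 * 24 * 3600  # assumed 3-day grace period (the paper's default)

namespace = {}  # path -> (hidden, deleted_at); stands in for master metadata

def create(path):
    namespace[path] = (False, None)

def delete(path, now):
    # deletion just hides the file; nothing is reclaimed yet
    namespace["." + path] = (True, now)
    del namespace[path]

def gc_scan(now):
    # periodic namespace scan: the amortized reclamation pass
    for path, (hidden, ts) in list(namespace.items()):
        if hidden and now - ts > GRACE:
            del namespace[path]  # actual reclamation happens here
```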
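A quick sanity check on the 440 MB/s figure, assuming the 100 Mbps full-duplex per-machine links the paper describes: the aggregate is only reachable because many chunkservers re-replicate in parallel, which is why the per-machine bandwidth limit matters.

```python
# Rough arithmetic on the recovery throughput noted above.
per_machine_mb_s = 100 / 8   # assumed 100 Mbps link -> 12.5 MB/s
aggregate_mb_s = 440         # figure from the notes above
machines = aggregate_mb_s / per_machine_mb_s
print(round(machines))       # ~35 machines cloning chunks concurrently
```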
## [mapreduce](https://static.googleusercontent.com/media/research.google.com/en//archive/mapreduce-osdi04.pdf)
## [spark](https://people.eecs.berkeley.edu/~matei/papers/2016/cacm_apache_spark.pdf)
## [rpc](https://www.h3c.com/en/Support/Resource_Center/EN/Home/Switches/00-Public/Trending/Technology_White_Papers/gRPC_Technology_White_Paper-6W100/)