Data Storage, Replication, and Integrity Syllabus

A reading list designed to help programmers understand the domain of data storage, replication, and integrity. There are basically no modern books that cover the topic as a whole.

Transactions are already very well covered elsewhere, so this reading list explicitly does not include a section on them.

THIS IS A WORK IN PROGRESS. Send me suggestions.

Part 1: Data structures

Reading list:

Key topics: LSM Tree, B-Tree

Part 2: Dealing with failure

Reading list:

Key topics: torn writes, torn reads, misdirected writes, misdirected reads, use of checksums, fsync
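
To make a couple of these topics concrete, here is a minimal sketch of how a checksum plus fsync can be used to detect a torn write in an append-only log. The record layout (a length and a CRC32 in front of each payload) and the names append_record and read_records are assumptions made for illustration; they are not taken from any particular system. Note that a checksum over the payload alone would not catch a misdirected write, since the misplaced block is still internally consistent.

```python
import os
import struct
import zlib

# Hypothetical record layout: 4-byte payload length, 4-byte CRC32, then payload.
HEADER = struct.Struct("<II")


def append_record(path, payload):
    """Append one length-prefixed, checksummed record and fsync the file."""
    record = HEADER.pack(len(payload), zlib.crc32(payload)) + payload
    with open(path, "ab") as f:
        f.write(record)
        f.flush()             # move data from Python's buffer to the OS
        os.fsync(f.fileno())  # ask the OS to flush it to stable storage


def read_records(path):
    """Return the valid prefix of the log, stopping at the first bad record."""
    with open(path, "rb") as f:
        data = f.read()
    records = []
    offset = 0
    while offset + HEADER.size <= len(data):
        length, crc = HEADER.unpack_from(data, offset)
        start = offset + HEADER.size
        payload = data[start:start + length]
        # A short payload suggests a torn write; a CRC mismatch suggests corruption.
        if len(payload) < length or zlib.crc32(payload) != crc:
            break
        records.append(payload)
        offset = start + length
    return records
```

Stopping at the first bad record is one possible recovery policy. It treats everything after a torn tail as lost, which fits a log where only the final record can be partially written.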

Part 3: IO models

Reading list:

Key topics: thread-per-core, io_uring, epoll, AIO, SPDK

Part 4: Replication

Reading list:

Key topics: Raft, Paxos, chain replication, Viewstamped Replication