Schedule

This page will be updated with the schedule for the course, including lecture topics, readings, and assignment due dates.

Some of the abbreviations used in the schedule include:

  • DDIA: Designing Data-Intensive Applications, Kleppmann.
  • DS4: Distributed Systems: Principles and Paradigms (4th Ed.), Tanenbaum, van Steen.

Graduate students: please submit your reading summaries using the Graduate Reading Summary Form.

Week Date Topic Readings Assignments (Released / Due)
1 Tue 03/24 Introduction & Motivation
Slides & Discussion
Suggested Reading:
DDIA Ch. 1
DS4 Ch. 1
Optional Readings:
Dean & Barroso — The Tail at Scale
HW1 Out
1 Thu 03/26 Processes, Threads, and RPC
Slides & Discussion
Required Reading:
DS4 Ch. 3.1
DDIA Ch. 4
Graduate Reading:
Waldo et al. — A Note on Distributed Computing
Optional Readings:
DS4 Ch. 3.2-3.6
Birrell & Nelson — Implementing Remote Procedure Calls
2 Tue 03/31 Logical Time and Coordination
Slides & Discussion
Required Reading:
DDIA Ch. 8 §"Unreliable Clocks"
DS4 Ch. 5.2
Graduate Reading:
Lamport — Time, Clocks, and the Ordering of Events
Optional Readings:
DDIA Ch. 9 §"Ordering Guarantees"
Mattern — Virtual Time and Global States
2 Thu 04/02 Failures and Fault Models
Slides & Discussion
Required Reading:
DDIA Ch. 8
Graduate Reading:
Chandra & Toueg — Unreliable Failure Detectors
Optional Readings:
Fischer, Lynch, Paterson — FLP Impossibility
Lamport et al. - The Byzantine Generals Problem
Hayashibara et al. - The ϕ Accrual Failure Detector
2 Fri 04/03 HW1 Due
3 Tue 04/07 Replication
Slides & Discussion
Required Reading:
DDIA Ch. 5
Graduate Reading:
van Renesse & Schneider — Chain Replication
Optional Readings:
Oki & Liskov — Viewstamped Replication
Terrace & Freedman — Object Storage on CRAQ
3 Thu 04/09 Partitioning
Slides & Discussion
Required Reading:
DDIA Ch. 6
Graduate Reading:
DeCandia et al. — Dynamo: Amazon's Highly Available Key-Value Store
Optional Readings:
Chang et al. — Bigtable
HW2 Out
4 Tue 04/14 Consistency Models
Slides & Discussion
Required Reading:
DDIA Ch. 9 (p321-352)
Graduate Reading:
Herlihy & Wing — Linearizability (§1-3 only, proofs optional)
Optional Readings:
Gilbert & Lynch — Brewer's Conjecture and the CAP Theorem
Terry et al. — Session Guarantees for Weakly Consistent Replicated Data
Vogels - Eventually Consistent
4 Thu 04/16 Consensus I (Paxos)
Slides & Discussion
Required Reading:
DS4 §8.2.4
Graduate Reading:
Lamport — Paxos Made Simple
Optional Readings:
Lamport — The Part-Time Parliament
Chandra et al. — Paxos Made Live
4 Sun 04/19 HW2 Due
5 Tue 04/21 Consensus II (Raft)
Slides & Discussion
Required Reading:
DDIA Ch. 9 §"Fault-Tolerant Consensus" (p364-369)
Graduate Reading:
Ongaro & Ousterhout — In Search of an Understandable Consensus Algorithm (Raft)
Optional Readings:
Howard et al. — Flexible Paxos: Quorum Intersection Revisited
5 Thu 04/23 Distributed Transactions
Slides & Discussion
Required Reading:
DDIA Ch. 7 §"The Slippery Concept of a Transaction" (p221-228)
DDIA Ch. 9 §"Distributed Transactions and Consensus" (up to "Fault-Tolerant Consensus") (p352-360)
Graduate Reading:
Gray & Lamport — Consensus on Transaction Commit (§1-5, proofs optional)
Optional Readings:
Helland — Life Beyond Distributed Transactions: An Apostate's Opinion
HW3 Out
6 Tue 04/28 Distributed File Systems
Required Reading:
DDIA Ch. 10 §"MapReduce and Distributed Filesystems"
Graduate Reading:
Ghemawat et al. — The Google File System
Optional Readings:
Shvachko et al. — The Hadoop Distributed File System
Weil et al. — Ceph: A Scalable, High-Performance Distributed File System
6 Thu 04/30 Coordination Services
Required Reading:
DS4 §5.3.6
DDIA Ch. 6 §"Request Routing"
Graduate Reading:
Hunt et al. — ZooKeeper: Wait-free Coordination for Internet-scale Systems
Optional Readings:
Burrows — The Chubby Lock Service
7 Tue 05/05 Global Distributed Databases
Required Reading:
DDIA Ch. 7 §"Snapshot Isolation and Repeatable Read" (p237-239)
DDIA Ch. 8 §"Synchronized Clocks for Global Snapshots" (p294)
Graduate Reading:
Corbett et al. — Spanner: Google's Globally Distributed Database
Optional Readings:
Bacon et al. — Spanner: Becoming a SQL System
Kulkarni et al. — Logical Physical Clocks and Consistent Snapshots in Globally Distributed Databases
7 Thu 05/07 Cluster Management and Orchestration
Required Reading:
DS4 §3.2.2-3.2.3
Graduate Reading:
Verma et al. — Large-scale Cluster Management at Google with Borg
Optional Readings:
Burns et al. — Borg, Omega, and Kubernetes
Hindman et al. — Mesos: A Platform for Fine-Grained Resource Sharing in the Data Center
7 Fri 05/08 HW3 Due
8 Tue 05/12 Distributed Computation Frameworks
Required Reading:
DDIA Ch. 10
Graduate Reading:
Dean & Ghemawat — MapReduce: Simplified Data Processing on Large Clusters
Optional Readings:
Zaharia et al. — Resilient Distributed Datasets: A Fault-Tolerant Abstraction for In-Memory Cluster Computing
Abadi et al. — TensorFlow: A System for Large-Scale Machine Learning
HW4 Out
8 Thu 05/14 Stream Processing
Required Reading:
DDIA Ch. 11
Graduate Reading:
Chandy & Lamport — Distributed Snapshots: Determining Global States of Distributed Systems
Optional Readings:
Akidau et al. — The Dataflow Model
Kreps et al. — Kafka: A Distributed Messaging System for Log Processing
Carbone et al. — Apache Flink: Stream and Batch Processing in a Single Engine
9 Tue 05/19 Distributed AI Infrastructure
Required and Graduate Reading:
Kwon et al. — Efficient Memory Management for Large Language Model Serving with PagedAttention
Optional Readings:
Dean et al. — Large Scale Distributed Deep Networks
Shoeybi et al. — Megatron-LM: Training Multi-Billion Parameter Language Models Using Model Parallelism
9 Thu 05/21 Wrap-up & Review
9 Fri 05/22 HW4 Due
Finals Week Fri 05/29 Final Exam