Sometimes people ask me which computer science papers they should read and I can't really answer that question, but I can list the papers I've enjoyed reading over the past years.
— Pedro Tavareλ (@ordepdev) August 14, 2021
Following the tweet above, I’ve decided to do a thread dump of my favorite computer science papers.
This is not a “you should read these papers” kind of post; it’s a curated list of great computer science papers that I’ve enjoyed reading and re-reading over the past years.
(I think you should read them as well!)
💡 You’ll learn about a technique called a log-structured file system that writes all modifications to disk sequentially, thereby speeding up both file writing and crash recovery.
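To get a feel for the idea, here’s a tiny Python sketch of an append-only key-value log (my own toy, not Sprite LFS): every write goes sequentially to the tail of one file, and crash recovery is just a single sequential replay of that file.

```python
import os

class AppendOnlyLog:
    """Toy log-structured store: every update is appended sequentially."""

    def __init__(self, path):
        self.path = path
        self.index = {}  # key -> latest value, rebuilt on recovery
        if os.path.exists(path):
            self._recover()

    def _recover(self):
        # Crash recovery is one sequential scan: replay the log in order,
        # so later entries for a key override earlier ones.
        with open(self.path) as f:
            for line in f:
                key, value = line.rstrip("\n").split("=", 1)
                self.index[key] = value

    def put(self, key, value):
        # Writes never seek: they always go to the tail of the log.
        with open(self.path, "a") as f:
            f.write(f"{key}={value}\n")
        self.index[key] = value

    def get(self, key):
        return self.index.get(key)
```

A real log-structured file system also has to clean (garbage-collect) old segments, which this sketch skips entirely.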
💡 You’ll learn about a disk-based index structure called B-Tree and its different variations. The paper does quite a good job of explaining why they have been so successful over the years.
💡 You’ll continue to learn about low-cost indexing for a file experiencing a high rate of record inserts over an extended period. The paper also provides a nice comparison of LSM-tree and B-tree I/O costs.
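The core LSM-tree idea can be sketched in a few lines of Python (an illustration of mine, far from the paper’s full multi-component design): writes land in an in-memory buffer, which is periodically flushed as a sorted immutable run; reads check the buffer first, then runs from newest to oldest.

```python
import bisect

class TinyLSM:
    """Toy LSM-tree: a memtable plus sorted, immutable runs."""

    def __init__(self, memtable_limit=4):
        self.memtable_limit = memtable_limit
        self.memtable = {}
        self.runs = []  # each run is a sorted list of (key, value), newest last

    def put(self, key, value):
        self.memtable[key] = value
        if len(self.memtable) >= self.memtable_limit:
            self._flush()

    def _flush(self):
        # The flush is a sequential write of a sorted run
        # (an on-disk "SSTable" in real systems).
        self.runs.append(sorted(self.memtable.items()))
        self.memtable = {}

    def get(self, key):
        if key in self.memtable:
            return self.memtable[key]
        for run in reversed(self.runs):  # newest run wins
            keys = [k for k, _ in run]
            i = bisect.bisect_left(keys, key)
            if i < len(keys) and keys[i] == key:
                return run[i][1]
        return None
```

Real LSM-trees also merge runs in the background (compaction) to keep read costs bounded, which is exactly where the paper’s I/O cost analysis comes in.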
💡 You’ll learn about log processing, Kafka’s architecture, and design principles including producers, brokers, and consumers.
💡 You’ll learn about the ZooKeeper wait-free coordination kernel, along with many distributed systems concepts that the paper describes nicely.
💡 You’ll learn about one-way functions, the Lamport-Diffie one-time signature, and a new “tree-signature” also known as Merkle tree.
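A Merkle tree is compact enough to sketch directly. This is a minimal Python version of the root computation (hash choice and odd-level handling are my own simplifications): hash the leaves, then repeatedly hash adjacent pairs until one root remains.

```python
import hashlib

def h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()

def merkle_root(leaves):
    """Compute the Merkle root of a non-empty list of byte strings."""
    level = [h(leaf) for leaf in leaves]
    while len(level) > 1:
        if len(level) % 2 == 1:
            level.append(level[-1])  # duplicate the last node on odd levels
        level = [h(level[i] + level[i + 1])
                 for i in range(0, len(level), 2)]
    return level[0]
```

Changing any single leaf changes the root, which is what makes the tree useful for authenticating large sets of values with one small hash.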
💡 Leslie Lamport’s most cited paper. You’ll learn about logical clocks, real-time synchronization, and concepts such as “total ordering” and “happened-before”.
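The logical clock itself fits in a few lines. Here’s a Python sketch (method names are mine): increment on every local event, attach the counter to outgoing messages, and on receipt jump past the sender’s timestamp, so timestamps respect “happened-before”.

```python
class LamportClock:
    """Lamport logical clock: counters consistent with happened-before."""

    def __init__(self):
        self.time = 0

    def tick(self):
        # A local event advances the clock.
        self.time += 1
        return self.time

    def send(self):
        # The timestamp attached to an outgoing message.
        return self.tick()

    def receive(self, msg_time):
        # On receive, jump past the sender's timestamp, then tick.
        self.time = max(self.time, msg_time) + 1
        return self.time
```

This guarantees that if event a happened-before event b, then a’s timestamp is smaller than b’s; the converse doesn’t hold, which is why the paper also discusses total ordering via tie-breaking.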
💡 You’ll learn about strategies for improving a system’s overall availability by tolerating some degree of graceful degradation.
💡 You’ll learn about reliability in computer systems and how a system copes with the failure of one or more of its components.
💡 You’ll learn about a strong correctness condition for concurrent objects that guarantees a strict time ordering of read and write operations in a multi-threaded environment.
💡 You’ll learn about a data structure that makes the eventual consistency of a distributed object possible without coordination between replicas.
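The simplest CRDT is probably the grow-only counter, sketched here in Python (my own minimal version): each replica increments only its own slot, and merging is a pointwise max, so replicas converge no matter the order in which they exchange states.

```python
class GCounter:
    """Grow-only counter CRDT: one slot per replica, merge by pointwise max."""

    def __init__(self, replica_id):
        self.replica_id = replica_id
        self.counts = {}

    def increment(self, n=1):
        # Each replica only ever writes to its own slot.
        self.counts[self.replica_id] = self.counts.get(self.replica_id, 0) + n

    def value(self):
        return sum(self.counts.values())

    def merge(self, other):
        # Commutative, associative, idempotent: no coordination needed.
        for rid, c in other.counts.items():
            self.counts[rid] = max(self.counts.get(rid, 0), c)
```

Because merge is a join in a lattice, any two replicas that have seen the same updates report the same value, regardless of message order or duplication.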
💡 You’ll learn about an optimization to state-based CRDTs that ensures convergence by disseminating only recently applied changes instead of the entire (possibly large) state.
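To make the delta idea concrete, here’s the grow-only counter again, this time with a delta-mutator (a simplified sketch of mine, not the paper’s full δ-CRDT framework): an increment returns just the changed slot, and that small delta is what gets shipped and merged with the same join as a full state.

```python
class DeltaGCounter:
    """G-Counter with delta-mutations: ship only the changed slot."""

    def __init__(self, replica_id):
        self.replica_id = replica_id
        self.counts = {}

    def increment(self, n=1):
        self.counts[self.replica_id] = self.counts.get(self.replica_id, 0) + n
        # The delta: a tiny state holding only this replica's slot.
        return {self.replica_id: self.counts[self.replica_id]}

    def merge(self, delta):
        # The same pointwise-max join, applied to a much smaller payload.
        for rid, c in delta.items():
            self.counts[rid] = max(self.counts.get(rid, 0), c)

    def value(self):
        return sum(self.counts.values())
```

The payoff is in the network cost: instead of the whole map, a replica only disseminates the slots it recently touched.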
💡 You’ll learn about Erlang, concurrent programming, message passing, fault-tolerance, and the concept of “let it crash”.
Looking for more papers?
These are my favorites.
I’m surely missing a few, though.