Posts

Paper Insights #38 - Availability in Globally Distributed Storage Systems

This paper, published by Google and presented at OSDI 2010, is quite challenging due to its highly mathematical nature. I was fortunate to work with one of the co-authors, Florentina Popovici. She's truly the best.

Recent posts

Paper Insights #37 - The Honey Badger of BFT Protocols

This paper was presented at ACM CCS 2016 (the ACM SIGSAC Conference on Computer and Communications Security), a premier conference in computer security. Authored by Andrew Miller et al., who are affiliated with leading global universities, the paper is inherently technical and academically focused.

Paper Insights #36 - ORCA: A Distributed Serving System for Transformer-Based Generative Models

This recent paper, presented at USENIX OSDI '22 by Seoul National University, offers an excellent introduction to modern ML serving systems.

Paper Insights #35 - Microsecond Consensus for Microsecond Applications

Presented at USENIX OSDI '20, this influential paper from VMware Research and EPFL (Switzerland) has since received significant attention within the distributed systems community. It was co-authored by Marcos Aguilera, a researcher at VMware.

Paper Insights #34 - CRDTs: Consistency without Concurrency Control

Authored in 2009, this noteworthy paper from the National Institute for Research in Computer Science and Automation (INRIA) was presented at the prestigious IEEE ICDCS. It introduces compelling design concepts for distributed systems.

Paper Insights #33 - Photon: Fault-tolerant and Scalable Joining of Continuous Data Streams

Presented at SIGMOD 2013, this paper from Google details another innovation stemming from Google Ads, a platform known for its planet-scale data processing. Notable authors include Ashish Gupta, a senior engineering leader within Google Ads, and Manpreet Singh, a principal engineer at Google. The year 2013 marked a significant period for stream processing, as Google was concurrently developing MillWheel and Dataflow, foundational technologies that influenced the creation of Apache Flink and Apache Beam.

Paper Insights #32 - Napa: Powering Scalable Data Warehousing with Robust Query Performance at Google

Napa represents the next generation of planet-scale data warehousing at Google, following Mesa. Napa is a key system for analytics workloads that stores enormous datasets for various tenants within Google. The extensive authorship of the paper underscores the collaborative effort behind its creation. This paper was presented at VLDB 2021.

Paper Insights #31 - F1 Query: Declarative Querying at Scale

We shift our focus from databases to a query engine. Google presented this paper in 2018 at VLDB, the premier global database conference. Notably, this paper has a large number of authors and is incredibly dense. Because the system has so many parts, the paper gives only a high-level view of each component.

Paper Insights #30 - Autopilot: Workload Autoscaling at Google Scale

This paper from Google was presented at EuroSys 2020. It is heavy on statistics, but it covers one of the most important concepts in cluster and cloud computing, scaling, and that concept is worth exploring in system design.

Paper Insights #29 - Mesos: A Platform for Fine-Grained Resource Sharing in the Data Center

This paper, presented at NSDI in 2011, comes from the UC Berkeley Systems Lab, with authorship by influential figures like Matei Zaharia (Spark's creator and CTO of Databricks), Ali Ghodsi (CEO of Databricks), Scott Shenker, and Ion Stoica. UC Berkeley's Systems Lab is a powerhouse in computer systems research. Their most recent work on Sky Computing, which envisions a cloud computing marketplace, is truly groundbreaking.

Paper Insights #28 - TAO: Facebook's Distributed Data Store for the Social Graph

Following our discussion of causal consistency in COPS, this paper presents an eventually consistent database designed for graph storage. It was presented at USENIX ATC 2013, a prestigious venue in the field of computer science.

Paper Insights #27 - Don't Settle for Eventual: Scalable Causal Consistency for Wide-Area Storage with COPS

This work provides a strong foundation for understanding causality, both within distributed systems and more broadly. Its principles underpin systems achieving causal consistency, a powerful form of consistency that remains compatible with high availability. Presented at SOSP 2011, this paper features contributions from prominent distributed systems researchers Wyatt Lloyd and Michael Freedman.