
Posts

Paper Insights #30 - Napa: Powering Scalable Data Warehousing with Robust Query Performance at Google

Napa is the next generation of planet-scale data warehousing at Google, following Mesa. A key system for analytics workloads, Napa stores enormous datasets for various tenants within Google. The paper, presented at VLDB 2021, has an extensive author list, which underscores the collaborative effort behind the system. Paper Link Work in progress!
Recent posts

Paper Insights #29 - Autopilot: Workload Autoscaling at Google Scale

This paper from Google was presented at EuroSys 2020. It is heavy on statistics, but it covers one of the most important concepts in cloud computing, autoscaling, so it is worth exploring in the context of system design. Paper Link Recommended Read: Paper Insights - Mesos: A Platform for Fine-Grained Resource Sharing in the Data Center, where I introduced cluster computing and resource allocation. Borg Borg, a cluster manager developed at Google, is a prominent example of a cluster orchestrator. It is important to note that the design of Kubernetes was significantly influenced by Borg. Jobs and Tasks A Borg cluster consists of roughly 100 to 10,000 physical machines connected through a high-speed network fabric. Users submit jobs for execution on these machines, and these jobs are categorized into several types: Services: These are long-duration jobs that frequently constitute components of larger systems, such as those employing a microservices ...

Paper Insights #28 - Mesos: A Platform for Fine-Grained Resource Sharing in the Data Center

This paper, presented at NSDI in 2011, comes from the UC Berkeley Systems Lab, with authorship by influential figures like Matei Zaharia (Spark's creator and Databricks co-founder), Ali Ghodsi (CEO of Databricks), Scott Shenker, and Ion Stoica. UC Berkeley's Systems Lab is a powerhouse in computer systems research, and their current work on Sky Computing, envisioning a cloud computing marketplace, is truly groundbreaking. The concepts explored in these papers are closely intertwined with the development of several influential projects at UC Berkeley, including Spark, Delay Scheduling, Dominant Resource Fairness, and the technical report detailing Mesos, all of which were being actively researched and built concurrently by the same authors. Paper Link Let's begin with some basic concepts (and there are many of them). Cluster Management A computing cluster comprises numerous interconnected machines, linked by a high-speed network, often referred to as a fabric. ...

Paper Insights #27 - TAO: Facebook's Distributed Data Store for the Social Graph

Following our discussion of causal consistency in COPS, this paper presents an eventually consistent database designed for graph storage. It was presented at USENIX ATC 2013, a prestigious venue in the field of computer science. Paper Link Let's begin with some basic concepts. Consistency Models Revisited We've previously explored linearizability and sequential consistency in ZooKeeper, as well as causal consistency in COPS. In this discussion, we'll examine eventual consistency models and provide an illustrative example to summarize them, within the context of single-key distributed key-value stores. The phrase "consistency model" in the context of a single data item read-write system means the following: Say clients are observing all the writes happening in the store. A consistency model determines the order in which the clients see the writes. After all the writes are applied, the consistency model determines what would be the ef...