
Introducing Google's Colossus File System

To say that Google's data storage infrastructure is extensive would be a vast understatement. It powers not only Google's traditional search engine, arguably the most popular online search engine in history, but also the rest of the company's offerings: services like Gmail, Google Cloud, Google Drive, and even YouTube all rely on this infrastructure for day-to-day usage.

Introducing Google File System
Google File System, or GFS, was designed to handle incredibly large datasets. Because that data flows in from the search engine and Google's many other services, the infrastructure has to stay efficient for every user – whether they're running a search, checking their email, or using any of the company's other products.

Nicknamed Colossus, the latest version of GFS isn't exactly new. It was released in 2010, after all. However, Google recently lifted the hood on their unique infrastructure and gave the general public an unprecedented look at how it operates.

Combining Multiple Components

Google's Colossus file system is built from five distinct components: a client library, curators, custodians, "D" file servers, and a database for storing metadata. Each component is responsible for a specific task.

The first component, the client library, serves as the client's gateway into the Google platform. In other words, this represents a user who is searching for a keyword online or using Gmail, Google Drive, or another Google service. The library applies software-based RAID, striping encoded data across many storage devices so that reads and writes stay fast and can tolerate individual failures.
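
Google hasn't published the exact encoding its client library uses, but the core idea of software-based RAID can be sketched in a few lines of Python. The snippet below is a minimal illustration assuming a simple RAID-4-style XOR parity scheme; the function names are hypothetical, and a production system would use richer erasure codes.

def stripe_with_parity(data: bytes, num_data_chunks: int) -> list[bytes]:
    """Split data into equal chunks and append one XOR parity chunk,
    so the loss of any single chunk can be repaired (RAID-4 style)."""
    chunk_size = -(-len(data) // num_data_chunks)  # ceiling division
    chunks = [
        data[i:i + chunk_size].ljust(chunk_size, b"\x00")
        for i in range(0, num_data_chunks * chunk_size, chunk_size)
    ]
    parity = bytearray(chunk_size)
    for chunk in chunks:
        for i, byte in enumerate(chunk):
            parity[i] ^= byte
    return chunks + [bytes(parity)]

def recover_chunk(chunks: list[bytes | None]) -> bytes:
    """Rebuild the single missing chunk by XOR-ing all surviving ones."""
    size = len(next(c for c in chunks if c is not None))
    missing = bytearray(size)
    for chunk in chunks:
        if chunk is not None:
            for i, byte in enumerate(chunk):
                missing[i] ^= byte
    return bytes(missing)

# Stripe a payload across three data chunks plus parity, lose one
# chunk, and reconstruct it from the survivors.
chunks = stripe_with_parity(b"hello, colossus!", num_data_chunks=3)
lost = chunks[1]
chunks[1] = None
assert recover_chunk(chunks) == lost

With one parity chunk per stripe, any single lost chunk can be rebuilt; real deployments tune the data-to-parity ratio to balance storage overhead against durability.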

From here, the system's curators take over. These automated systems handle operations like file creation, and they store the resulting metadata in Google's NoSQL database, BigTable. The data itself is transferred directly from the original client application to the "D" file servers, where it is looked after by custodians that ensure data integrity.
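
To make that division of labor concrete, here is a minimal sketch of the write path as described above. Every name in it is an assumption for illustration only, and a plain dictionary stands in for the BigTable-backed metadata store.

import hashlib

metadata_table = {}               # stand-in for the BigTable metadata store
d_servers = {"d1": {}, "d2": {}}  # "D" file servers: chunk ID -> bytes

def curator_create_file(path: str, chunk_ids: list[str]) -> None:
    """Curator records which chunks make up a file, plus their checksums."""
    metadata_table[path] = {"chunks": chunk_ids,
                            "checksums": {c: None for c in chunk_ids}}

def client_write_chunk(path: str, chunk_id: str, data: bytes, server: str) -> None:
    """Client ships the bytes straight to a "D" server, bypassing the
    curator, and records a checksum for later verification."""
    d_servers[server][chunk_id] = data
    metadata_table[path]["checksums"][chunk_id] = hashlib.sha256(data).hexdigest()

def custodian_verify(path: str) -> bool:
    """Custodian re-hashes stored chunks against the recorded checksums."""
    expected = metadata_table[path]["checksums"]
    for held in d_servers.values():
        for chunk_id, data in held.items():
            if chunk_id in expected:
                if hashlib.sha256(data).hexdigest() != expected[chunk_id]:
                    return False
    return True

curator_create_file("/gmail/inbox", ["c1", "c2"])
client_write_chunk("/gmail/inbox", "c1", b"mail data", server="d1")
client_write_chunk("/gmail/inbox", "c2", b"more mail", server="d2")
print(custodian_verify("/gmail/inbox"))  # True

The key design point survives even in this toy version: metadata and data take different paths, so the metadata store never becomes a bottleneck for bulk transfers.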

This concept was also summarized in a recent blog post by Google, which stated: "With Colossus, a single cluster is scalable to exabytes of storage and tens of thousands of machines. For example, in the example below, we have instances accessing Cloud Storage from Compute Engine VMs, YouTube serving nodes, and Ads MapReduce nodes—all of which can share the same underlying file system to complete requests. The key ingredient is having a shared storage pool managed by the Colossus control plane, providing the illusion that each has its isolated file system."
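
The "illusion" the quote describes can also be modeled with a toy example: one shared pool holds every tenant's data, yet each workload only ever sees its own namespace. All class and method names below are illustrative assumptions, not Colossus APIs.

class SharedStoragePool:
    """One physical pool of blobs shared by every workload."""
    def __init__(self):
        self._blobs: dict[tuple[str, str], bytes] = {}  # (tenant, path) -> data

    def view_for(self, tenant: str) -> "TenantView":
        return TenantView(self, tenant)

class TenantView:
    """What each workload sees: an apparently private file system."""
    def __init__(self, pool: SharedStoragePool, tenant: str):
        self._pool, self._tenant = pool, tenant

    def write(self, path: str, data: bytes) -> None:
        self._pool._blobs[(self._tenant, path)] = data

    def read(self, path: str) -> bytes:
        return self._pool._blobs[(self._tenant, path)]

    def listdir(self) -> list[str]:
        return [p for (t, p) in self._pool._blobs if t == self._tenant]

pool = SharedStoragePool()
youtube = pool.view_for("youtube")
ads = pool.view_for("ads")
youtube.write("/videos/cat.mp4", b"video bytes")
ads.write("/reports/q1.csv", b"report bytes")
print(ads.listdir())  # ['/reports/q1.csv'] -- YouTube's data is invisible here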

Looking at the Hardware

All of these systems must work in tandem to keep data flowing efficiently. For that to happen, Google depends on a vast network of next-gen hardware resources, including disk-based and flash storage, to maintain data on a long-term basis. Given the scale of their operations, however, it's safe to assume that individual hardware components are failing nearly constantly. As a result, Colossus was designed with fault tolerance and background recovery in mind.
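
Background recovery can likewise be sketched with a simple replication model: when a server dies, a monitor finds the chunks that fell below their target replica count and copies them onto healthy machines. Everything below is an assumed toy model, not how Colossus actually schedules repairs.

import random

TARGET_REPLICAS = 3  # assumed replication factor

class RecoveryMonitor:
    def __init__(self, servers: dict[str, set[str]]):
        # Maps each "D" file server to the set of chunk IDs it holds.
        self.servers = servers

    def handle_failure(self, dead: str) -> None:
        """Re-replicate every chunk the failed server was holding."""
        for chunk in self.servers.pop(dead):
            holders = [s for s, held in self.servers.items() if chunk in held]
            needed = max(0, TARGET_REPLICAS - len(holders))
            candidates = [s for s in self.servers if chunk not in self.servers[s]]
            # A real system would stream the bytes from a surviving
            # replica; here we only record the new placement.
            for target in random.sample(candidates, needed):
                self.servers[target].add(chunk)

cluster = {
    "d1": {"a", "b"}, "d2": {"a", "c"}, "d3": {"b", "c"},
    "d4": {"a", "b", "c"}, "d5": set(),
}
monitor = RecoveryMonitor(cluster)
monitor.handle_failure("d1")  # chunks "a" and "b" regain three replicas

Because these repairs run continuously in the background, a steady trickle of hardware failures never has to surface as user-visible downtime.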

Without these technologies, your search results would be far slower. Even worse, Google's entire data storage infrastructure would be prone to frequent crashes, which would ultimately result in untold amounts of lost data throughout their entire platform.
