

Dropbox originally used Amazon S3 and the Hadoop Distributed File System (HDFS) as the backbone of its data storage infrastructure. Although we migrated user file data to our internal block storage system Magic Pocket in 2015, Dropbox continued to use S3 and HDFS as a general-purpose store for other internal products and tools. Among these use cases were crash traces, build artifacts, test logs, and image caching.

Using these two legacy systems as generic blob storage caused many pain points, the worst of which was the cost inefficiency of using S3's API. For instance, crash traces wrote many objects which were rarely accessed unless specifically needed for an investigation, generating a large PUT bill. Caches built against S3 burned pricey GET requests with each cache miss. Looking at the bigger picture, S3 was simply an expensive default choice among many competitors, including our own Magic Pocket block store.

What we really desired was the ability to expose a meta-store, transparently backed by different cloud providers' storage offerings. As pricing plans, access patterns, and security requirements change over time and across use cases, having this extra layer would allow us to flexibly route traffic between options without migrations.
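
To make the idea of a meta-store concrete, here is a minimal sketch of what such a routing layer could look like, assuming a backend-agnostic `BlobStore` interface and a per-namespace policy function. The names and shape are illustrative, not Object Store's actual API.

```go
package blobrouter

import (
	"context"
	"errors"
)

// BlobStore is a hypothetical backend-agnostic interface; each store
// (S3, Magic Pocket, and so on) would provide its own implementation.
type BlobStore interface {
	Put(ctx context.Context, key string, data []byte) error
	Get(ctx context.Context, key string) ([]byte, error)
}

// Router picks a backend per namespace, so traffic can be shifted by
// changing configuration instead of migrating every caller.
type Router struct {
	backends map[string]BlobStore          // e.g. "s3", "magic-pocket"
	policy   func(namespace string) string // maps a namespace to a backend name
}

func (r *Router) backendFor(namespace string) (BlobStore, error) {
	b, ok := r.backends[r.policy(namespace)]
	if !ok {
		return nil, errors.New("no backend configured for namespace " + namespace)
	}
	return b, nil
}

// Put and Get expose one API to callers regardless of which store
// actually holds the bytes.
func (r *Router) Put(ctx context.Context, namespace, key string, data []byte) error {
	b, err := r.backendFor(namespace)
	if err != nil {
		return err
	}
	return b.Put(ctx, namespace+"/"+key, data)
}

func (r *Router) Get(ctx context.Context, namespace, key string) ([]byte, error) {
	b, err := r.backendFor(namespace)
	if err != nil {
		return nil, err
	}
	return b.Get(ctx, namespace+"/"+key)
}
```

With a layer like this, moving a use case such as crash traces from one backend to another becomes a configuration change rather than a client migration.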

Another desirable side effect of routing all blob traffic through a single service was centralization. At the service layer, we could provide additional features like granular per-object encryption, usage and performance monitoring, retention policies, and dedicated support for on-call upkeep. We built this service in 2020 and gave it the pedestrian name Object Store, but its impact has been anything but. Thanks to Object Store, we've been able to save millions of dollars in annual operating costs.

Each GET(key) involves a fetch and a slice. Object Store recovers the object and batch metadata from the key, then performs a ranged read on the batched blob in persistent storage, using the object's start and end offsets as range delimiters.
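
As a rough illustration of that read path, the sketch below assumes a metadata record mapping each object key to its containing batch plus start and end offsets, and a hypothetical `RangedStore` interface for whichever backend holds the batched blob. The types and method names are made up for illustration, not the service's real schema. It also shows where a lookup miss can be answered from metadata alone, which is how reads of nonexistent keys can avoid hitting S3 at all.

```go
package objectstore

import (
	"context"
	"errors"
)

// ErrNotFound lets the service answer a read for a missing key from its own
// metadata, without spending an S3 GET on the miss.
var ErrNotFound = errors.New("object not found")

// BatchLocation is an assumed shape for the per-object metadata: which
// batched blob holds the object, and where the object sits inside it.
type BatchLocation struct {
	BatchKey string // key of the batched blob in persistent storage
	Start    int64  // inclusive start offset of the object
	End      int64  // exclusive end offset of the object
}

// MetadataStore and RangedStore are hypothetical interfaces standing in for
// the metadata database and the backing blob store.
type MetadataStore interface {
	// Lookup returns ErrNotFound if the key has never been written.
	Lookup(ctx context.Context, key string) (BatchLocation, error)
}

type RangedStore interface {
	// GetRange reads bytes [start, end) of the named blob.
	GetRange(ctx context.Context, key string, start, end int64) ([]byte, error)
}

// Get is the "fetch and slice": resolve the object's batch metadata, then
// issue a single ranged read against the batched blob.
func Get(ctx context.Context, md MetadataStore, store RangedStore, key string) ([]byte, error) {
	loc, err := md.Lookup(ctx, key)
	if err != nil {
		return nil, err // covers the nonexistent-read case without touching the backend
	}
	return store.GetRange(ctx, loc.BatchKey, loc.Start, loc.End)
}
```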

This design solves our two biggest cost inefficiencies in accessing S3. A quick inspection of S3 pricing shows that users pay per gigabyte stored and per API request. By reducing the number of PUT requests for the same total volume of data, we save money. And when we use S3 as a cache, Object Store short-circuits S3 GETs in the nonexistent-read case, cutting even more API requests. The MySQL and service clusters don't run for free, but their cost is still a fraction of the API costs saved. In fact, by rerouting some writes from S3 to Magic Pocket, and making our remaining S3 requests more cost efficient, Object Store has helped us save millions of dollars each year.
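
To see why fewer PUTs for the same bytes matters, here is a back-of-the-envelope calculation. The request price and workload numbers are assumed placeholders, not actual AWS or Dropbox figures; the point is only that per-gigabyte storage cost is unchanged by batching, while per-request cost shrinks roughly in proportion to the batching factor.

```go
package main

import "fmt"

func main() {
	// All figures below are assumptions for illustration; real S3 prices
	// vary by region and storage class and change over time.
	const (
		putPricePer1000 = 0.005      // assumed $ per 1,000 PUT requests
		objectsPerDay   = 10_000_000 // small objects written per day
		batchFactor     = 1_000      // objects packed into one batched PUT
	)

	unbatchedPuts := float64(objectsPerDay)
	batchedPuts := float64(objectsPerDay) / float64(batchFactor)

	fmt.Printf("PUT cost per day, unbatched: $%.2f\n", unbatchedPuts/1000*putPricePer1000)
	fmt.Printf("PUT cost per day, batched:   $%.2f\n", batchedPuts/1000*putPricePer1000)
	// The per-gigabyte storage bill is identical in both cases, since the
	// same total bytes are stored either way; only the request count drops.
}
```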