Horizontal Sharding for MySQL Made Easy
For developers building out a web application, a transactional datastore is the obvious and proven choice, but with success come scale limitations. A monolithic database works well at first, but as an application grows, its data will eventually outgrow what a single server can handle well. If the application can tolerate eventually consistent data, read traffic can be scaled with relative ease by adding more replicas. Scaling write traffic is harder: past a certain point, even the largest MySQL server will run into performance problems.
This is not a new challenge; organizations have faced it for years, and one of the key patterns for solving it is horizontal sharding. Horizontal sharding means partitioning the data of a single MySQL database across several database servers, each with an identical schema. This spreads the workload across multiple database servers, which means you can scale linearly simply by adding more servers as needed. Each of these servers is called a “shard”. Having multiple shards reduces the read and write traffic any single database server has to handle and keeps the data on each server at an optimal size. However, because you are now dealing with multiple servers rather than one, query routing and operational tasks like backup and restore, schema migration, and monitoring all become more complex.
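To make the pattern concrete, here is a minimal sketch in Go of the key-to-shard mapping at the heart of horizontal sharding: a sharding key (a user ID in this hypothetical example) is hashed to pick one of a fixed number of shards, and every query for that key is sent to that shard. The shard count and key choice are illustrative, not a prescription.

```go
package main

import (
	"fmt"
	"hash/fnv"
)

// numShards is illustrative; in a real system the shard count is part of the
// sharding scheme, and changing it means re-mapping existing rows.
const numShards = 4

// shardFor hashes the sharding key (here, a user ID) to pick a shard.
// All reads and writes for that user are routed to the returned shard.
func shardFor(userID uint64) uint64 {
	h := fnv.New64a()
	fmt.Fprintf(h, "%d", userID)
	return h.Sum64() % numShards
}

func main() {
	for _, id := range []uint64{1, 42, 12345} {
		fmt.Printf("user %d -> shard %d\n", id, shardFor(id))
	}
}
```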
Many companies have implemented horizontal sharding at the application level. In this approach, all of the logic for routing queries to the correct database server lives in the application. That routing logic must be maintained and updated whenever a new feature is added, and any cross-shard features, such as joins or aggregations that span shards, have to be implemented in the application as well. Additionally, as data grows and the initial set of shards runs out of capacity, “resharding” (increasing the number of shards while continuing to serve traffic) becomes a daunting operational challenge.
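To illustrate the kind of logic the application ends up owning under this approach, here is a hedged Go sketch of a cross-shard aggregate: because no single shard holds all the data, the application itself fans the query out to every shard and merges the results. The connection handling and the `orders` table are hypothetical, not taken from any of the systems described here.

```go
package main

import (
	"context"
	"database/sql"
	"fmt"
	"log"

	_ "github.com/go-sql-driver/mysql" // hypothetical driver choice
)

// shards holds one connection pool per shard. With application-level
// sharding, the application must keep this list, and the key-to-shard
// mapping, in sync with the physical topology as shards are added or split.
var shards []*sql.DB

// totalOrders is a cross-shard aggregate: the application fans out the same
// query to every shard and merges the results itself.
func totalOrders(ctx context.Context) (int64, error) {
	var total int64
	for _, db := range shards {
		var n int64
		if err := db.QueryRowContext(ctx, "SELECT COUNT(*) FROM orders").Scan(&n); err != nil {
			return 0, err
		}
		total += n
	}
	return total, nil
}

func main() {
	// shards would be populated from configuration at startup; left empty here.
	total, err := totalOrders(context.Background())
	if err != nil {
		log.Fatal(err)
	}
	fmt.Println("total orders across all shards:", total)
}
```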
Pinterest took this approach after trying out the available NoSQL technology and determining that it was not mature enough at the time. Marty Weiner, a software engineer who worked on the project, noted, “We had several NoSQL technologies, all of which eventually broke catastrophically.” Pinterest sharded by primary key, using that key to determine which shard a given row lived on. Sharding this way provided scale, but gave up cross-shard joins and the use of foreign keys. Similarly, Etsy took this approach when moving to a sharded database system, but added a two-way lookup between primary key and shard_id and packed multiple shards onto each host, automating some of the work of managing shards. In both cases, however, ongoing management of shards, including splitting shards after the initial migration, presented significant challenges.
Alongside sharding at the application layer, another approach to horizontal sharding emerged. Engineers at YouTube began building out the open source project Vitess in 2010. Vitess sits between the application and MySQL databases, allowing horizontally sharded databases to appear monolithic to the application. In addition to removing the complexity of query routing from the application, Vitess provides master failover and backup solutions that remove the operational complexity of a sharded system, as well as features like connection pooling and query rewriting for improved performance.
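Because VTGate, Vitess’s routing layer, speaks the MySQL wire protocol, the application keeps using an ordinary MySQL client and a single connection string even when the keyspace behind it is sharded. A minimal Go sketch, with the host, port, credentials, and the `commerce` keyspace as placeholder assumptions:

```go
package main

import (
	"database/sql"
	"fmt"
	"log"

	_ "github.com/go-sql-driver/mysql"
)

func main() {
	// "vtgate-host:3306" and the "commerce" keyspace are placeholders. The
	// application connects to VTGate as if it were a single MySQL server;
	// Vitess routes each query to the shard(s) that hold the data.
	db, err := sql.Open("mysql", "app_user:app_pass@tcp(vtgate-host:3306)/commerce")
	if err != nil {
		log.Fatal(err)
	}
	defer db.Close()

	var count int
	err = db.QueryRow("SELECT COUNT(*) FROM orders WHERE customer_id = ?", 42).Scan(&count)
	if err != nil {
		log.Fatal(err)
	}
	fmt.Println("orders for customer 42:", count)
}
```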
Companies like Square (read about their journey), Slack, JD.com, and many more have used Vitess to scale their MySQL databases. JD.com, one of the largest online retailers in China, saw 35 million QPS run through Vitess at peak traffic on Singles’ Day. Slack has migrated almost all of its databases to Vitess, surviving the massive influx of traffic from the shift to working from home earlier this year. Both Etsy and Pinterest have moved some of their workloads to Vitess because of the management benefits it provides. Vitess has repeatedly demonstrated that it can handle high-volume production workloads.
However, running Vitess at scale still requires an engineering team, and not all organizations have the depth that Slack and Square do. At PlanetScale, we’ve built a database-as-a-service on top of Vitess so that anyone can access this level of scale with their MySQL databases. With PlanetScaleDB, you can start small with a single MySQL instance and scale up as you grow. When the time comes to horizontally shard, you’ll need to design a sharding scheme, but once you have decided how to organize your data across shards, you can shard your database through our UI without making significant changes to your application. In just a few clicks, plus the time it takes to copy your data, you can move from a single database server to a sharded database with as many shards as you need.
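In Vitess terms, that sharding scheme is expressed as a VSchema: for each table you pick the column to shard on and a vindex, the function that maps that column to a shard. A minimal sketch, assuming a hypothetical `orders` table sharded by `customer_id` with the built-in hash vindex:

```json
{
  "sharded": true,
  "vindexes": {
    "hash": { "type": "hash" }
  },
  "tables": {
    "orders": {
      "column_vindexes": [
        { "column": "customer_id", "name": "hash" }
      ]
    }
  }
}
```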
Check out this video demo to watch a database grow from one shard to 128 shards seamlessly, while serving traffic.
This feature is currently in beta and you can give it a try here.