Design a Key-Value Store (Part I)

Since many people have emailed us saying they want to read more about system design interviews, we’re going to cover more on this topic. I’m quite happy to hear a lot of feedbacks and if you have any suggestions or questions, do tell us by leaving a comment.

This week, I’m going to talk about key-value store. A key-value store is a very power technique that is used in almost every system in the world. It can be as simple as a hash table and at the same time, it can also be a distributed storage system. For instance, the underline system of Cassandra is a key-value storage system and Cassandra is widely used in many companies like Apple, Facebook etc..

In this post, I’ll cover topics like basic key-value storage system, distributed key-value storage and scaling issues including sharding, all of which are possible to be covered in system design interviews.

Basic key-value storage

How would you design a simple key-value storage system in a single machine?

The most straightforward thing is to use a hash table to store key-value pairs, which is how most this kind of systems work nowadays. A hash table allows you to read/write a key-value pair in constant time and it’s extremely easy to use. Most languages have built-in support for this.

However, the downside is also obvious. Using a hash table usually means you need to store everything in memory, which may not be possible when the data set is big. There are two common solutions:

Compress your data. This should be the first thing to think about and often there are a bunch of stuff you can compress. For example, you can store reference instead of the actual data. You can also use float32 rather than float64. In addition, using different data representations like bit array (integer) or vectors can be effective as well.
//store reference 具体技术方案
Storing in a disk. When it’s impossible to fit everything in memory, you may store part of the data into disk. To further optimize this, you can think of the system as a cache system. Frequently visited data is kept in memory and the rest is on a disk.
//a cache system, on a disk,如何交互，技术选型

Distributed key-value storage

The most interesting topic is definitely scaling the key-value storage into multiple machines. If you want to support big data, you will implement distributed system for sure. Let’s see how we can design a distributed key-value storage system.

If you have read Design a Cache System, you will notice that a lot of concepts here are exactly the same.

Since a single machine doesn’t have enough storage for all the data, the general idea here is to split the data into multiple machines by some rules and a coordinator machine can direct clients to the machine with requested resource. The question is how to split the data into multiple machines and more importantly, what is a good strategy to partition data?

Sharding

Suppose all the keys are URLs like http://gainlo.co and we have 26 machines. One approach is to divide all keys (URLs) based on the first character of the URL (after “www”) to these 26 machines. For example, http://gainlo.co will be stored at machine G and http://blog.gainlo.co will be stored at machine B. So what are disadvantages of this design?

Let’s ignore cases where URL contains ASCII characters. A good sharding algorithm should be able to balance traffic equally to all machines. In other words, each machine should receive equal requests ideally. Apparently, the above design doesn’t work well. First of all, the storage is not distributed equally. Probably there are much more URLs starting with “a” than “z”. Secondly, some URLs are much more popular like Facebook and Google.

In order to balance the traffic, you’d better make sure that keys are distributed randomly. Another solution is to use the hash of URL, which usually have much better performance. To design a good sharding algorithm, you should fully understand the application and can estimate the bottleneck of the system.

//分片;分表

Summary

There are too many interesting topics about key-value store to cover and I can hardly fit all of them into one article. As you can see, when scaling the system, there are much more issues to consider and that’s why many people find distributed system difficult.

In the next post, we’ll continue the discussion and will cover more about scaling the system. Things like system availability, consistency will be discussed.

The post is written by Gainlo - a platform that allows you to have mock interviews with employees from Google, Amazon etc..

//太空泛，需要写些具体案例，技术选型