First, let's look at what Redis is

The official description says that Redis is an open source (BSD-licensed), in-memory data structure store that can be used as a database, cache, and messaging middleware.


It supports data types such as strings, lists, hashes, sets, sorted sets, bitmaps, HyperLogLogs, and geospatial indexes.


It also has built-in replication, Lua scripting, LRU eviction, transactions, and publish/subscribe, and it provides high availability and automatic failover through Redis Sentinel and automatic sharding through Redis Cluster.

In summary, Redis provides a wealth of features that may seem overwhelming at first glance: what problems do they solve, and when do you need each of them? The following walks roughly through that evolution, starting from scratch and going step by step.

Starting from scratch

The initial requirement was very simple: we had an API that served a list of hot news: http://api.xxx.com/hot-news. The API's consumers complained that each request took about 2 seconds to return a result.

We then set out to improve the perceived performance for the API's consumers, and soon the simplest and most brute-force solution emerged: add HTTP cache control, Cache-Control: max-age=600, to the API response, i.e., allow consumers to cache the response for ten minutes.
If an API consumer makes effective use of the cache control information in the response, its perceived performance improves markedly (for those ten minutes). But there are two drawbacks: first, while the cache is valid the consumer may get stale data; second, if a client ignores the cache and hits the API directly, the request still takes 2 seconds, so this treats the symptom but not the cause.

Local in-memory caching

To solve the problem of the API call itself still taking 2 seconds, we investigated and found that the main cause was the SQL query that fetches the hot news, which consumed nearly 2 seconds. So we came up with another simple, brute-force solution: cache the SQL query results directly in the memory of the API server itself, with a cache validity of one minute.

Subsequent requests within that minute are served straight from the cache and no longer pay the 2-second SQL cost. If the API receives 100 requests per second, that is 6,000 requests per minute, and only the requests in the first couple of seconds have to wait 2 seconds; everything in the remaining 58 seconds is answered without the wait.
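
A minimal sketch of this in-process cache, assuming a Python API server; query_hot_news_from_db is a stand-in for the real SQL query:

import time

CACHE_TTL = 60            # cache validity: one minute, as in the text
_cache = {}               # key -> (value, expires_at)

def query_hot_news_from_db():
    # Stand-in for the ~2-second SQL query against the database.
    return ["story1", "story2"]

def get_hot_news():
    entry = _cache.get("hot-news")
    if entry and entry[1] > time.time():
        return entry[0]   # cache hit: no SQL query, near-instant response
    value = query_hot_news_from_db()
    _cache["hot-news"] = (value, time.time() + CACHE_TTL)
    return value
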
Other API teams found this to be a good solution too, and soon we discovered that the API server's memory was filling up.

Server-side Redis

By the time the API server's memory was stuffed full of caches, we knew we had to come up with another solution. The most straightforward idea was to throw all these caches onto a dedicated server with a large memory configuration, and so we set our sights on Redis.
How to configure and deploy Redis is not covered here; the official Redis documentation describes it in detail. We moved the cache to a separate Redis server, and the memory pressure on the API server was relieved.
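
The read path then becomes a cache-aside lookup against Redis. A minimal sketch using the redis-py client; the host name, key name, and the query_hot_news_from_db stub are assumptions:

import redis

r = redis.Redis(host="cache.internal", port=6379)  # the dedicated cache server

def query_hot_news_from_db():
    return "..."  # stand-in for the ~2-second SQL query

def get_hot_news():
    cached = r.get("hot-news")
    if cached is not None:
        return cached                     # cache hit: no SQL needed
    value = query_hot_news_from_db()
    r.set("hot-news", value, ex=60)       # cache for one minute, as before
    return value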

1 Persistence

A lone Redis server, like anyone, has a few bad days a month and goes on strike, and all the cached data is lost (Redis keeps its data in memory). The Redis server can be brought back online, but the loss of the in-memory data causes a cache avalanche, and the pressure on the API server and the database comes back all at once.
This is where Redis persistence comes in handy to mitigate the impact of a cache avalanche: persistence means that Redis writes its in-memory data to the hard drive and loads it back when it restarts, minimizing the effect of the cache loss.
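
As an illustrative sketch only: persistence is normally configured in redis.conf, through RDB snapshots and/or the append-only file (AOF), but the same switches can also be flipped at runtime, for example via redis-py:

import redis

r = redis.Redis()

# Turn on the append-only file: every write is logged to disk and
# replayed when Redis restarts (normally set in redis.conf instead).
r.config_set("appendonly", "yes")

# Ask Redis to write an RDB snapshot of the in-memory data in the background.
r.bgsave()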

2 Sentinel and Replication

Unannounced strikes by the Redis server are a nuisance. So what do you do? The answer: keep a standby, and switch over when the primary goes down. But how do you know when a Redis server has gone down, how do you switch over, and how do you ensure the standby is a complete copy of the original server? This is where Sentinel and Replication come into play.
Sentinel manages multiple Redis servers, providing monitoring, notification, and automatic failover; Replication is responsible for letting one Redis server keep multiple replica servers in sync. Together these are the means by which Redis ensures high availability. Incidentally, Sentinel is built on top of Redis's publish/subscribe capability.
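
From the client's side, consumers ask Sentinel which server is currently the master rather than hard-coding an address; after a failover the same call simply returns the new master. A sketch with redis-py's Sentinel support (the host names and the master name mymaster are assumptions):

from redis.sentinel import Sentinel

# Ask the Sentinel processes which server is currently the master.
sentinel = Sentinel([("sentinel1.internal", 26379),
                     ("sentinel2.internal", 26379)], socket_timeout=0.5)

master = sentinel.master_for("mymaster", socket_timeout=0.5)   # writes go here
replica = sentinel.slave_for("mymaster", socket_timeout=0.5)   # reads can go here

master.set("hot-news", "latest")
print(replica.get("hot-news"))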

3 Cluster

There is always an upper limit to the resources of a single server. For CPU and I/O, master-slave replication can shift part of the pressure onto replica servers. But what about memory? Master-slave replication only copies the same data; it cannot scale memory horizontally, and a single machine's memory can be enlarged only so far before hitting a ceiling.
So we needed a solution that lets us scale horizontally. The end goal is that each server is responsible for only a part of the data, while all the servers together form a whole, and to outside consumers this distributed set of servers looks just like a centralized server (the difference between distributed and network-based is explained in my earlier blog post interpreting REST and the architecture of network-based applications).
Before the official Redis solution for distribution came out, there were twemproxy and codis, both of which broadly rely on a proxy to handle distribution; in other words, Redis itself knows nothing about distribution and leaves that concern to twemproxy or codis.
The official Redis Cluster solution instead moves the distribution logic into each Redis server, so that no other component is needed to meet the distribution requirement. We are not concerned here with the pros and cons of these solutions; instead, let's focus on what "distribution" actually has to deal with. In other words, what problem are twemproxy and codis handling independently, and Redis Cluster folding into the Redis servers themselves?
As we said earlier, a distributed service should look like a centralized service from the outside. To achieve this, one problem must be solved: adding or removing servers in the distributed service should be invisible to the clients consuming it. This means a client must not see through the distributed service and bind itself to one particular server, because once it does, you can no longer add new servers or replace a failed one.
There are two ways to solve this problem.
The first way is the most straightforward: add an intermediate layer to isolate the specific dependency. This is what twemproxy does: all clients consume the Redis service only through twemproxy, which hides the individual servers behind it (though you will find that twemproxy itself becomes a single point of failure). In this arrangement each Redis server is independent and unaware of the others' existence.
The second way is to let the Redis servers know about one another and use redirection to guide the client to the server that can complete its operation: a client connects to some Redis server and asks for an operation; if that server finds it cannot complete the operation, it hands the client the address of the server that can, and the client then sends its request to that other server. Notice that for this to work, every Redis server must hold complete information about the whole distributed service; otherwise it cannot know which server to redirect the client to.
For all the differences between the two ways, notice what they have in common: the information about all the servers in the distributed service and the services each can provide. This information has to live somewhere in any case. The difference is that the first way manages it in a separate component, which uses it to coordinate the independent Redis servers behind it; the second way has every Redis server hold the information, so they know of one another, achieving the same end. The advantage of the second way is that no additional component is needed to handle this part of the logic.
The concrete implementation of Redis Cluster is based on the concept of hash slots: 16384 slots are allocated in advance, and the client finds the slot for a key by computing CRC16(key) % 16384. On the server side, each server is responsible for a portion of the slots; when a server is added or removed, the affected slots and their data are migrated, and every server holds the complete mapping of slots to servers, which is what allows the server side to redirect a client's request.
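
As a sketch, here is the slot computation in Python (Redis Cluster uses the CRC-16/XMODEM variant; real clusters also honor hash tags in braces, which this ignores):

def crc16_xmodem(data: bytes) -> int:
    # CRC-16/XMODEM: polynomial 0x1021, initial value 0.
    crc = 0
    for byte in data:
        crc ^= byte << 8
        for _ in range(8):
            crc = ((crc << 1) ^ 0x1021) if crc & 0x8000 else (crc << 1)
            crc &= 0xFFFF
    return crc

def key_slot(key: str) -> int:
    # The slot a key maps to: CRC16(key) % 16384.
    return crc16_xmodem(key.encode()) % 16384

print(key_slot("hot-news"))  # some slot in 0..16383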

Client-side Redis

The previous section focused on the evolution of the Redis server side, explaining how Redis grew from a standalone service into a highly available, decentralized, distributed storage system. This section focuses on the Redis features a client can consume.

1 Data types

Redis supports a rich set of data types, from the most basic strings to complex, commonly used data structures; a quick tour in code follows the list.
string: The most basic data type, a binary-safe string, up to 512 MB.
list: A list of strings, kept in the order in which elements were added.
set: An unordered collection of strings with no duplicate elements.
sorted set: A set of strings ordered by an associated score.
hash: A collection of field-value pairs.
bitmap: Bit-level operations on a string value.
hyperloglog: A probabilistic data structure for approximate distinct counting.
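
A quick tour of these types via redis-py (the key names are illustrative):

import redis

r = redis.Redis()

r.set("greeting", "hello")                            # string
r.rpush("news:list", "a", "b", "c")                   # list: keeps insertion order
r.sadd("news:tags", "tech", "tech", "db")             # set: duplicate ignored
r.zadd("news:rank", {"story1": 3.0, "story2": 1.0})   # sorted set: member -> score
r.hset("news:1", mapping={"title": "t", "views": 10}) # hash: field-value pairs
r.setbit("active:today", 1001, 1)                     # bitmap: set bit 1001
r.pfadd("visitors", "u1", "u2", "u1")                 # hyperloglog
print(r.pfcount("visitors"))                          # approximate distinct count: ~2
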
These numerous data types exist mainly to support the needs of various scenarios, and of course each operation comes with its own time complexity. In effect, these server-side data structures are a concrete implementation of the Remote Data Access (RDA) style introduced in my earlier "Interpreting REST" series on the architecture of network-based applications: by executing a standard set of commands on the server, the client obtains just the reduced result set it wants, which simplifies client code and improves network performance. For example, if there were no list type, you could only store a list as one big string; the client would have to fetch the whole list, apply its change, and submit the whole thing back to Redis, which is very wasteful.

2 Transactions

Each of the data types above has its own set of commands. In many cases, though, we need to execute several commands at once, and we need them to succeed or fail together. Redis's support for transactions stems from this need: a transaction executes multiple commands sequentially in one batch and ensures their atomicity.
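
A sketch of a MULTI/EXEC transaction through redis-py's pipeline interface (the keys are illustrative):

import redis

r = redis.Redis()

# transaction=True wraps the queued commands in MULTI ... EXEC.
tx = r.pipeline(transaction=True)
tx.incr("news:1:views")
tx.zincrby("news:rank", 1.0, "news:1")
tx.execute()  # both commands run back to back, with no other client interleaved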

3 Lua scripting

Beyond transactions, Lua comes in handy when we need to perform more complex operations, including conditional logic, in a single server-side step (for example, fetching a cached value while extending its expiration time). Redis guarantees that a Lua script executes atomically, and in certain scenarios Lua scripts can replace the transaction commands Redis provides. This corresponds to a concrete implementation of the Remote Evaluation (REV) style from the same series on the architecture of network-based applications.
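
A sketch of the "fetch while extending the expiration" example, run through redis-py's eval (the key and TTL are illustrative):

import redis

r = redis.Redis()

# Atomically read a cached value and, if present, push its expiry forward.
script = """
local value = redis.call('GET', KEYS[1])
if value then
  redis.call('EXPIRE', KEYS[1], ARGV[1])
end
return value
"""
value = r.eval(script, 1, "hot-news", 60)  # 1 = number of KEYS entries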

4 Pipeline

Redis client-server communication runs over TCP, and by default each command waits for its reply before the next one is sent, costing a network round trip per command. A pipeline instead sends a batch of commands over the connection at once, saving most of that round-trip overhead. The difference between a pipeline and a transaction is that a pipeline exists purely to save communication overhead; it does not guarantee atomicity.
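
A sketch of a pipeline in redis-py; note transaction=False, since plain pipelining needs no MULTI/EXEC:

import redis

r = redis.Redis()

pipe = r.pipeline(transaction=False)
for i in range(100):
    pipe.get(f"news:{i}")        # queued locally, nothing sent yet
results = pipe.execute()         # all 100 commands go out in one round trip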

5 Distributed locks

The official recommendation is the Redlock algorithm. Its building block uses the string type: to acquire the lock, SET a specific key to a random value, only if the key does not yet exist and with an expiry; to release the lock, use a Lua script that first compares the stored value against our random value and only then deletes the key. The specific commands are as follows.
SET resource_name my_random_value NX PX 30000

if redis.call("get", KEYS[1]) == ARGV[1] then
    return redis.call("del", KEYS[1])
else
    return 0
end
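
A sketch of this acquire/release pattern in redis-py (using uuid4 to generate the random value is an assumption):

import uuid
import redis

r = redis.Redis()

RELEASE_SCRIPT = """
if redis.call("get", KEYS[1]) == ARGV[1] then
    return redis.call("del", KEYS[1])
else
    return 0
end
"""

def acquire(resource, ttl_ms=30000):
    token = str(uuid.uuid4())              # random value marks this lock holder
    # SET resource token NX PX ttl_ms: only succeeds if the key is absent.
    if r.set(resource, token, nx=True, px=ttl_ms):
        return token
    return None                            # lock is held by someone else

def release(resource, token):
    # Delete the key only if it still holds our token; the compare-then-delete
    # must be atomic, hence the Lua script.
    return bool(r.eval(RELEASE_SCRIPT, 1, resource, token))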

Summary

This article has explained, at an abstract level, what Redis's features do and why they exist, without dwelling on the specific details. That lets us focus on the problems Redis solves; thinking at the level of concepts rather than implementation details helps us choose the more appropriate solution in a given scenario instead of being constrained by technical minutiae.

The above reflects my personal understanding; if anything is off, corrections are welcome.