1. What is the Redis big key problem?
The Redis big key problem refers to a situation where a single key in Redis holds a value that consumes a large amount of memory or contains a very large number of elements, leading to performance degradation, insufficient memory, data imbalance, and delayed master-slave synchronization, among other issues.
What size of data is considered a big key?
There is no fixed criterion for determining the size of a big key. Typically, a string-type key with a value occupying more than 1MB of memory or a set-type key with more than 10,000 elements is considered a big key. The definition and evaluation criteria for big keys in Redis may vary based on the actual usage and specific requirements of the application. For example, in high-concurrency and low-latency scenarios, a key of just 10KB might be considered a big key. However, in low-concurrency and high-capacity environments, the threshold for big keys may be around 100KB. Therefore, when designing and using Redis, it is important to establish reasonable thresholds for big keys based on business requirements and performance indicators.
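The thresholds above can be encoded as a simple check. The sketch below is illustrative only: the 1MB and 10,000-element limits are the example values from the text (not universal rules), and the function name is hypothetical.

```python
# Illustrative big-key classifier; tune the limits to your workload.
STRING_BYTES_LIMIT = 1 * 1024 * 1024   # 1 MB for string values (example threshold)
COLLECTION_LEN_LIMIT = 10_000          # element count for collections (example threshold)

def is_big_key(key_type: str, size: int) -> bool:
    """Return True if a key should be flagged as 'big'.

    size is bytes for 'string' keys, element count for collection types.
    """
    if key_type == "string":
        return size > STRING_BYTES_LIMIT
    if key_type in ("list", "set", "zset", "hash"):
        return size > COLLECTION_LEN_LIMIT
    return False

print(is_big_key("string", 2 * 1024 * 1024))  # True: a 2 MB string
print(is_big_key("hash", 500))                # False: a small hash
```

In a high-concurrency, low-latency deployment the limits might be set far lower, as the text notes.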
2. Impacts of big keys:
- High memory consumption: Big keys occupy a significant amount of memory, potentially leading to memory shortages and triggering memory eviction policies. In extreme cases, this can result in memory exhaustion and Redis instance crashes, affecting system stability.
- Performance degradation: Big keys increase memory fragmentation, which can impact the overall performance of Redis. Operations such as read, write, and deletion on big keys consume more CPU time and memory resources, further decreasing system performance.
- Blocking other operations: Certain operations on big keys can cause Redis instances to become unresponsive for a period of time, blocking other client requests and affecting response time and throughput.
- Network congestion: Retrieving a big key generates substantial network traffic, potentially saturating machine or LAN bandwidth and affecting other services. For example, if a big key occupies 1MB and is accessed 1000 times per second, it generates roughly 1000MB of network traffic per second, about 8Gbps, far more than a gigabit NIC can carry.
- Delayed master-slave synchronization: When Redis instances are configured for master-slave replication, big keys can introduce synchronization delays. Due to their large memory footprint, significant data needs to be transferred during the synchronization process, resulting in increased network latency between the master and slave and potential data consistency issues.
- Data skew: In Redis cluster mode, excessive memory usage by a specific data shard can disrupt the balanced distribution of memory resources among shards. It may lead to key eviction based on the maxmemory parameter and even result in memory overflow.
3. Causes of big keys:
- Poor business design: This is the most common cause where a large amount of data is stored under a single key instead of being distributed across multiple keys. For example, instead of storing data nationwide under a single key, it should be divided into multiple keys based on administrative regions or cities to reduce the probability of generating big keys.
- Failure to anticipate dynamic growth of values: If data keeps getting added to a key without proper mechanisms for deletion, expiration, or quantity limitations, it will eventually result in big keys. Examples include fan lists of celebrities on social media or popular comments.
- Improper expiration time setting: Failure to set an expiration time for a key, or setting an excessively long expiration time, can lead to rapid accumulation of values over time, ultimately resulting in big keys.
- Program bugs: Certain exceptional circumstances can cause the lifecycle of a key to exceed expectations or lead to abnormal growth in the number of values, resulting in big keys.
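One guard against the unbounded growth described above is to cap the collection on every write, as Redis users often do with LPUSH followed by LTRIM. The following is a minimal in-memory sketch of that pattern; the class name and the cap value are illustrative, not from any real API.

```python
from collections import deque

class CappedCommentList:
    """Mimics the LPUSH + LTRIM pattern: newest items first, bounded length."""

    def __init__(self, maxlen: int = 1000):  # illustrative cap
        self._items = deque(maxlen=maxlen)   # deque silently drops the oldest

    def push(self, comment: str) -> None:
        self._items.appendleft(comment)      # like: LPUSH key comment

    def latest(self, n: int):
        return list(self._items)[:n]         # like: LRANGE key 0 n-1

comments = CappedCommentList(maxlen=3)
for c in ["c1", "c2", "c3", "c4"]:
    comments.push(c)
print(comments.latest(3))  # ['c4', 'c3', 'c2'] -- 'c1' was trimmed away
```

Capping on every write keeps a popular comment list or fan list from ever becoming a big key, regardless of traffic.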
4. How to identify big keys:
4.1 SCAN command:
By using the SCAN command, one can iterate over all keys in the database incrementally. Combined with type-specific size commands such as STRLEN, LLEN, SCARD, and HLEN, the size of each key can be checked to identify big keys. The advantage of SCAN is that it walks the keyspace in small batches, so it does not block the Redis instance the way a full KEYS scan would.
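As a sketch, the SCAN-plus-size-commands approach might look like the following. FakeRedis is a tiny in-memory stand-in added so the example runs without a server; its scan_iter, type, strlen, llen, scard, and hlen methods mirror those of the real redis-py client, so find_big_keys should work against a real connection unchanged. Thresholds are the example values from the text.

```python
# In-memory stand-in for a redis-py client (illustrative, not a real library).
class FakeRedis:
    def __init__(self, data):
        self._data = data  # key -> value (str, list, set, or dict)

    def scan_iter(self, count=100):      # incremental and non-blocking in real Redis
        yield from list(self._data)

    def type(self, key):
        v = self._data[key]
        return {str: "string", list: "list", set: "set", dict: "hash"}[type(v)]

    def strlen(self, key): return len(self._data[key])
    def llen(self, key):   return len(self._data[key])
    def scard(self, key):  return len(self._data[key])
    def hlen(self, key):   return len(self._data[key])

# Which size command to use for each data type.
SIZERS = {"string": "strlen", "list": "llen", "set": "scard", "hash": "hlen"}

def find_big_keys(client, string_limit=1_048_576, elem_limit=10_000):
    """Yield (key, type, size) for keys above the big-key thresholds."""
    for key in client.scan_iter(count=100):
        ktype = client.type(key)
        sizer = SIZERS.get(ktype)
        if sizer is None:
            continue
        size = getattr(client, sizer)(key)
        limit = string_limit if ktype == "string" else elem_limit
        if size > limit:
            yield key, ktype, size

r = FakeRedis({
    "big:str": "x" * 2_000_000,          # 2 MB string -> flagged
    "big:set": set(range(20_000)),       # 20,000 elements -> flagged
    "ok:hash": {"f": "v"},               # small hash -> ignored
})
print(sorted(find_big_keys(r)))
```

Note that STRLEN measures bytes but LLEN/SCARD/HLEN only count elements; for a precise memory figure, the MEMORY USAGE command can be used instead.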
4.2 The redis-cli --bigkeys option:
When connecting to a Redis server with the redis-cli command-line tool, adding the --bigkeys option makes it scan the keyspace (using SCAN, so without blocking the server for long) and report the biggest key of each data type: by byte size for strings and by element count for collections.
Example: redis-cli -h 127.0.0.1 -p 6379 --bigkeys
4.3 Redis RDB Tools:
Using the open-source tool Redis RDB Tools, one can analyze an RDB file offline and scan for big keys without touching the running server. For example, the following command outputs the top 3 keys with memory usage larger than 1KB:
rdb --command memory --bytes 1024 --largest 3 dump.rdb
5. How to resolve big key issues:
- Split into multiple small keys: This is the most straightforward approach, reducing the size of individual keys, and utilizing batch retrieval methods like “MGET” for reading.
- Data compression: When using the String type, employ a compression algorithm (such as gzip or LZ4) on the application side to reduce the size of values. Alternatively, consider restructuring the data as a Hash, since small hashes are stored in a compact encoding (ziplist, or listpack since Redis 7) that saves memory.
- Set reasonable expiration times: Set expiration times for each key and ensure they are appropriate to automatically clean up data after it becomes invalid, avoiding long-term accumulation of big keys.
- Enable memory eviction policies: Enable one of Redis' memory eviction policies, such as allkeys-lru (Least Recently Used), to automatically evict the least recently used data when memory runs low, preventing big keys from occupying memory indefinitely.
- Data sharding: Use Redis Cluster to distribute data across multiple Redis instances, reducing the burden on individual instances and lowering the risk of big key problems.
- Delete big keys: Use the UNLINK command to delete big keys. UNLINK is a non-blocking variant of DEL that reclaims the key's memory in a background thread, avoiding blocking the Redis instance. On versions before 4.0, which lack UNLINK, big collections can instead be deleted progressively, for example with HSCAN plus HDEL.
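To make the first remedy concrete, here is a hedged sketch of splitting one large string value into fixed-size chunks and reading it back with a single batched fetch. A plain dict stands in for Redis so the example is self-contained; with a real client, each chunk write would be a SET and the read one MGET. The key-naming scheme and chunk size are illustrative choices, not a standard.

```python
CHUNK = 64 * 1024  # 64 KB per chunk; pick a size well under your big-key threshold

def split_set(store: dict, key: str, value: str, chunk: int = CHUNK) -> int:
    """Store value as key:0, key:1, ... plus key:count; return the chunk count."""
    n = (len(value) + chunk - 1) // chunk
    for i in range(n):
        store[f"{key}:{i}"] = value[i * chunk:(i + 1) * chunk]  # SET key:i <part>
    store[f"{key}:count"] = str(n)
    return n

def split_get(store: dict, key: str) -> str:
    """Reassemble the value with one batched read (MGET key:0 ... key:n-1)."""
    n = int(store[f"{key}:count"])
    parts = [store[f"{key}:{i}"] for i in range(n)]  # a single MGET in real Redis
    return "".join(parts)

db = {}
payload = "a" * 150_000                    # would otherwise be one ~150 KB value
split_set(db, "report:2024", payload)      # becomes three small keys + a counter
assert split_get(db, "report:2024") == payload
print(len(db))  # 4 keys: report:2024:{0,1,2} and report:2024:count
```

Each resulting key stays small, and MGET keeps the read a single round trip; the same idea applies to big hashes or sets, sharded by a hash of the field or member.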
6. Conclusion:
The big key problem is a common issue in Redis that can lead to performance degradation, excessive memory usage, blocked operations, delayed master-slave synchronization, and other challenges. This article provides an overview of the causes, impacts, detection methods, and solutions for big key problems. By optimizing data structure designs, setting appropriate data expiration policies, optimizing system architecture and configurations, and progressively deleting big keys, it is possible to effectively address and prevent big key issues, thereby improving the stability and performance of Redis systems.