Redis Memory Policy: Fix OOM Command Blocking During High Traffic Scaling
Quick Fix Summary
TL;DR: Temporarily increase maxmemory or switch to the volatile-lru eviction policy to unblock writes.
Diagnosis & Causes
Redis is configured with a maxmemory limit and an eviction policy (e.g., noeviction) that rejects new write commands with an OOM error once memory is full, causing application errors during traffic spikes.
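Client-side, the symptom is usually a rejected write rather than a hang. As a quick illustrative reproduction (the key and value are placeholders), a write against a full instance running with 'noeviction' fails with Redis's OOM error:
# Attempt a test write against the affected instance (placeholder key/value)
redis-cli SET oom:test "1"
# Expected failure while memory is full under 'noeviction':
# (error) OOM command not allowed when used memory > 'maxmemory'.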
Recovery Steps
Step 1: Verify Memory Usage and Configuration
Connect to the Redis instance and check current memory usage, maxmemory setting, and eviction policy.
redis-cli INFO memory | grep -E "(used_memory_human|maxmemory_human|maxmemory_policy)"
redis-cli CONFIG GET maxmemory
redis-cli CONFIG GET maxmemory-policy
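If the instance is on Redis 6.2 or newer, the INFO errorstats section can also confirm that clients are actually receiving OOM rejections (a rough check; requires 6.2+):
# Per-error counters, e.g. errorstat_OOM:count=...
redis-cli INFO errorstats | grep -i errorstat_OOM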
Step 2: Immediate Relief - Temporarily Raise Maxmemory
Dynamically increase the maxmemory limit to allow writes to proceed. This is a temporary fix.
# Set to a higher value (e.g., 2GB). Replace with your desired value.
redis-cli CONFIG SET maxmemory 2147483648
# Verify the change
redis-cli CONFIG GET maxmemory
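Also confirm the host has physical memory to spare before settling on the larger cap, so the increase doesn't push Redis into swap; a quick sanity check could be:
# Check free memory on the host
free -h
# Compare Redis's resident set size against the planned new limit
redis-cli INFO memory | grep used_memory_rss_human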
Step 3: Change Eviction Policy to Allow Writes
If the policy is 'noeviction', change it to an evicting policy such as 'volatile-lru' or 'allkeys-lru' so new writes can proceed by removing old keys.
# Set to volatile-lru (evicts the least recently used keys among those that have a TTL set)
redis-cli CONFIG SET maxmemory-policy volatile-lru
# Or for more aggressive eviction, use allkeys-lru
# redis-cli CONFIG SET maxmemory-policy allkeys-lru
Step 4: Identify and Reduce Memory Footprint
Find the largest keys and analyze memory patterns to identify candidates for cleanup or optimization.
# Scan for large keys (run on a replica or during low traffic if possible)
redis-cli --bigkeys
# Check the memory footprint of a specific key reported above (replace the key name)
redis-cli MEMORY USAGE some_large_key_name
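To go beyond the --bigkeys summary, per-key memory can be sampled with SCAN plus MEMORY USAGE. A rough sketch (the 'cache:*' pattern and the 20-key limit are placeholders to adjust for your keyspace):
# Sample the memory footprint of keys matching a pattern
redis-cli --scan --pattern 'cache:*' | head -n 20 | while read -r key; do
  printf '%s\t%s bytes\n' "$key" "$(redis-cli MEMORY USAGE "$key")"
done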
Step 5: Flush Volatile Data (If Applicable)
If the data can be regenerated or is non-critical, reclaim memory from the allocator or, as a last resort, flush the entire database.
# Ask the allocator to release unused (dirty) pages back to the OS; this does not delete keys
redis-cli MEMORY PURGE
# WARNING: Flushes ALL data in the current database. Use with extreme caution.
# redis-cli FLUSHDB ASYNC
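A more surgical option than flushing is to remove the specific oversized keys found in Step 4. UNLINK reclaims the memory asynchronously so the server isn't blocked (the key name is a placeholder):
# Delete a specific large key without blocking the event loop
redis-cli UNLINK some_large_key_name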
Step 6: Scale Resources Permanently
Permanently adjust the maxmemory configuration in redis.conf and restart, or scale the underlying infrastructure.
# Edit the Redis configuration file
sudo vi /etc/redis/redis.conf
# Find and update 'maxmemory' and 'maxmemory-policy'
# maxmemory 4gb
# maxmemory-policy volatile-lru
# Restart Redis service
sudo systemctl restart redis-server
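If you'd rather avoid hand-editing (and Redis was started from a config file), the runtime values set in Steps 2 and 3 can usually be persisted with CONFIG REWRITE instead:
# Persist the current runtime configuration back to redis.conf
redis-cli CONFIG REWRITE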
Step 7: Implement Monitoring and Alerts
Set up monitoring for memory usage to prevent future occurrences. Use the INFO command metrics.
# Key metrics to monitor:
# used_memory / maxmemory (percentage)
# evicted_keys (rate of eviction)
# blocked_clients (clients blocked on BRPOP, BLPOP, etc.)
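As a minimal sketch of an alert check, utilization can be computed directly from INFO (field names are standard; it prints nothing if maxmemory is 0, i.e. unlimited):
# Report used_memory as a percentage of maxmemory
redis-cli INFO memory | awk -F: '/^used_memory:/{u=$2} /^maxmemory:/{m=$2} END{if (m > 0) printf "memory utilization: %.1f%%\n", (u / m) * 100}'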
"The 'noeviction' policy is often the default in newer Redis versions. It's chosen for data safety but is catastrophic for write-heavy applications during traffic spikes. Always set an appropriate eviction policy (like volatile-lru) in production unless you have absolute data retention requirements and a separate scaling plan."
Frequently Asked Questions
What's the difference between 'volatile-lru' and 'allkeys-lru'?
'volatile-lru' evicts only keys that have an expiration (TTL) set, using the LRU algorithm. 'allkeys-lru' evicts any key, regardless of TTL. Use 'volatile-lru' if you have a mix of permanent and temporary data. Use 'allkeys-lru' if all data is equally evictable.
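Remember that under 'volatile-lru' only keys carrying a TTL are eviction candidates, so cached entries need an expiry to be evictable; for illustration (key, value, and TTL are placeholders):
# Write cache entries with a TTL so volatile-lru has keys it is allowed to evict
redis-cli SET cache:user:42 '{"name":"example"}' EX 3600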
Will changing maxmemory-policy cause data loss?
Potentially, yes. Once the maxmemory limit is reached, any policy other than 'noeviction' causes Redis to automatically delete keys to make space for new writes. That is the intended trade-off to maintain write availability, but it means some existing data will be removed.
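You can gauge how much eviction is actually happening from the stats section of INFO, for example:
# Cumulative count of keys removed by the eviction policy since startup
redis-cli INFO stats | grep evicted_keys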
Why did my memory usage spike suddenly?
Common reasons include: a large batch job inserting data, a cache stampede causing mass re-population, a bug creating excessively large data structures (e.g., huge lists/hashes), or a lack of TTLs on cached data leading to unbounded growth.
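If unbounded growth from missing TTLs is a suspect, a rough sample can be taken with SCAN and TTL, since TTL returns -1 for keys without an expiry (the 1000-key sample size is arbitrary):
# List sampled keys that have no expiration set, then count them
redis-cli --scan | head -n 1000 | while read -r key; do
  [ "$(redis-cli TTL "$key")" = "-1" ] && echo "$key"
done | wc -l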