Redis databases can often handle short-term traffic bursts above planned or configured throughput, but sustained spikes without sufficient headroom commonly lead to increased latency, client timeouts, or error rates. In Redis Cloud and Redis Software environments, Configured Ops/Sec is primarily a sizing and planning signal, not always a strict enforcement limit. What ultimately determines stability under load is how close the system is to CPU, memory, and network saturation.
This article explains how to interpret Configured Ops/Sec versus observed throughput, how to operate safely during unexpected traffic spikes, and what actions to take when real-world traffic significantly exceeds configuration. It also covers common pitfalls in development and test environments where irregular spikes are frequent.
Quick Fix Actions When Throughput Exceeds Configuration
| Symptom | Fast Check | Immediate Action |
|---|---|---|
| Observed ops/sec significantly above configured (for example, 50k configured, 150k observed) | Review per-shard CPU and p95/p99 latency | Reduce or pause non-critical background workloads and confirm no blocking commands are running |
| Latency spikes during traffic bursts | Check p95/p99 latency during peak windows | Enable or increase client-side pipelining and batching |
| CPU saturation on one or more shards | Per-shard CPU consistently above ~70% | Increase shard count or CPU to keep per-shard CPU below approximately 70% at peak |
| Evictions or memory pressure | Review memory usage and eviction metrics | Increase memory capacity and confirm eviction policy matches workload |
| Client timeouts or retry storms | Review client timeout and retry configuration | Implement client-side backoff to prevent overload cascades |
Configured Ops/Sec vs Observed Throughput
Configured Ops/Sec is typically used as a capacity planning and sizing guideline, and in some Redis Cloud plans it may also be used for automated scaling signals or protective behavior. It is not always a hard enforcement limit.
Sustained operation above the configured Ops/Sec increases the risk of elevated latency and client timeouts if CPU, memory, or network headroom is insufficient.
Key points:
Latency and error rates are the primary indicators of overload, not raw ops/sec.
Redis may continue to serve traffic above configured throughput as long as CPU, memory, and network headroom remain available.
Some Redis Cloud plans may warn or throttle when sustained traffic exceeds configuration, while others operate on a best-effort basis with increasing latency.
Always validate expected behavior against your deployment model.
Sizing and Configuration Guidance for Traffic Spikes
Measure peak traffic, not averages
Capture p95 and p99 throughput and latency during known busy windows. Average metrics often mask short but impactful spikes.
See Redis Latency Monitoring Guidance.
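As a rough illustration of why averages mislead, the sketch below summarizes a window of latency samples using only the Python standard library. The function name `latency_summary` and the sample data are hypothetical; in practice the samples would come from your own client-side or server-side measurements.

```python
import statistics

def latency_summary(samples_ms):
    """Return mean, p95, and p99 for a list of latency samples (milliseconds)."""
    # quantiles(n=100) returns 99 cut points: index 94 is p95, index 98 is p99.
    cuts = statistics.quantiles(samples_ms, n=100, method="inclusive")
    return {
        "mean": statistics.fmean(samples_ms),
        "p95": cuts[94],
        "p99": cuts[98],
    }

# Hypothetical window: 970 fast requests plus 30 slow ones during a short burst.
samples = [1.0] * 970 + [40.0] * 30
summary = latency_summary(samples)
print(summary)
```

In this made-up window the mean stays close to 2 ms and even p95 remains at 1 ms, while p99 jumps to 40 ms, which is the kind of short spike that an average hides.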
Scale capacity to preserve CPU headroom
Increase shard count or CPU to keep per-shard CPU below approximately 70% at peak.
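A back-of-the-envelope estimate of the shard count needed to reach that target might look like the following. This is a sketch only: it assumes load is spread evenly across shards and that CPU scales roughly linearly with shard count, which hot keys or uneven slot distribution can break. The function name `shards_for_cpu_target` is hypothetical.

```python
import math

def shards_for_cpu_target(current_shards, peak_cpu_pct, target_cpu_pct=70.0):
    """Estimate shards needed to bring peak per-shard CPU under a target.

    Assumes even key distribution and linear scaling (a simplification).
    """
    if current_shards < 1 or peak_cpu_pct < 0 or target_cpu_pct <= 0:
        raise ValueError("shard count and CPU percentages must be positive")
    # Total CPU demand expressed in "shard-percent" units.
    total_cpu = current_shards * peak_cpu_pct
    needed = math.ceil(total_cpu / target_cpu_pct)
    # Never recommend fewer shards than are already deployed.
    return max(current_shards, needed)

# Example: 2 shards peaking at 95% CPU each.
print(shards_for_cpu_target(2, 95.0))
```

Treat the result as a starting point for a sizing conversation and confirm it with a load test, since real workloads rarely scale perfectly.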
Maintain memory headroom
Plan for 20–30% free memory to avoid evictions, allocator pressure, and degraded performance. For background on how Redis behaves under memory pressure, see Memory and Performance.
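The headroom guideline translates into a simple calculation. The sketch below assumes you know the peak used memory of the dataset; the function name `capacity_for_headroom` is hypothetical, and replication or other overheads are not included.

```python
def capacity_for_headroom(used_gb, free_fraction=0.25):
    """Return the memory limit (GB) that leaves `free_fraction` unused at peak."""
    if used_gb < 0 or not 0 <= free_fraction < 1:
        raise ValueError("used_gb must be >= 0 and free_fraction in [0, 1)")
    # If used memory should be (1 - free_fraction) of capacity, solve for capacity.
    return used_gb / (1.0 - free_fraction)

# Example: a dataset that peaks at 12 GB, with 25% of capacity kept free.
print(capacity_for_headroom(12.0, 0.25))
```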
Optimize client access patterns
Use pipelining and batching commands such as MGET and MSET. Avoid chatty request patterns that amplify latency under load.
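One possible way to reduce round trips is sketched below. It assumes `client` is a redis-py style client exposing `mget()` and `pipeline()`; the helper names `chunked`, `fetch_many`, and `write_many` and the batch size of 500 are illustrative choices, not recommendations from Redis.

```python
from itertools import islice

def chunked(iterable, size):
    """Yield successive lists of at most `size` items."""
    it = iter(iterable)
    while True:
        chunk = list(islice(it, size))
        if not chunk:
            return
        yield chunk

def fetch_many(client, keys, batch_size=500):
    """Read keys with one MGET per batch instead of one GET per key."""
    result = {}
    for batch in chunked(keys, batch_size):
        values = client.mget(batch)  # one round trip for the whole batch
        result.update(zip(batch, values))
    return result

def write_many(client, items, batch_size=500):
    """Queue SETs in a non-transactional pipeline and flush once per batch."""
    for batch in chunked(items.items(), batch_size):
        pipe = client.pipeline(transaction=False)
        for key, value in batch:
            pipe.set(key, value)
        pipe.execute()  # sends all queued commands together

# Usage (assumes the redis-py package and a reachable database):
#   import redis
#   client = redis.Redis(host="localhost", port=6379)
#   write_many(client, {"user:1": "a", "user:2": "b"})
#   print(fetch_many(client, ["user:1", "user:2"]))
```

Bounding the batch size keeps any single request from becoming large enough to delay other clients, which matters most when the database is already close to saturation.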
Align eviction policy and TTLs with workload
For cache workloads, select an eviction policy such as allkeys-lru that matches access patterns, and ensure keys expire naturally. Read more on Data Eviction Policies.
Operating Safely Above Planned Throughput (Short Term)
Short periods above configured Ops/Sec are often survivable if sufficient headroom exists, but elevated latency should be expected as utilization rises.
Recommended mitigations:
Monitor latency and error rates rather than ops/sec alone.
Implement client-side timeouts and exponential backoff to prevent retry storms.
Temporarily pause or reduce non-essential workloads such as analytics jobs, scans, or maintenance tasks until capacity is increased.
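The backoff mitigation above can be sketched as a small retry wrapper. This is a minimal illustration, not a prescribed implementation: `call_with_backoff` and its default values are hypothetical, and `retry_on` should be set to the timeout and connection exception types that your client library actually raises (the built-in types here are placeholders).

```python
import random
import time

def call_with_backoff(operation, max_attempts=5, base_delay=0.05, max_delay=2.0,
                      retry_on=(TimeoutError, ConnectionError), sleep=time.sleep):
    """Call `operation()`, retrying with capped exponential backoff and jitter."""
    for attempt in range(max_attempts):
        try:
            return operation()
        except retry_on:
            if attempt == max_attempts - 1:
                raise  # give up and surface the last error to the caller
            # Full jitter: wait a random time up to the capped exponential delay,
            # so many clients do not retry in lockstep.
            cap = min(max_delay, base_delay * (2 ** attempt))
            sleep(random.uniform(0, cap))

# Usage (hypothetical `client` object):
#   value = call_with_backoff(lambda: client.get("session:42"))
```

Capping both the number of attempts and the delay keeps a struggling database from being hit by an ever-growing wave of retries, while the random jitter spreads the retries that do happen.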
Development and Test Environments with Irregular Spikes
Non-production environments are especially prone to mis-sizing and noisy workloads.
Common risks and mitigations:
Load tests that far exceed environment sizing can cause latency spikes and, in extreme cases, evictions or failovers.
Allocate additional headroom for environments that regularly run stress tests.
Isolate heavy test workloads on separate clusters or subscriptions.
Schedule load testing during off-peak hours where possible.
Step-by-Step Action Plan When Throughput Exceeds Configuration
Stabilize the system
Reduce or stop non-critical background jobs, and confirm that no blocking commands are running (e.g., KEYS, long-running SCAN, large Lua scripts, or heavy EVAL operations).
These commands can monopolize the Redis main thread and amplify latency during spikes.
Assess current performance
Review per-shard CPU, memory usage, network utilization, and p95/p99 latency. Verify that clients are using batching or pipelining.
Scale capacity
Add shards to reduce per-shard load or scale up CPU. Increase memory if evictions or allocator pressure are observed.
Align configuration with reality
Update configured Ops/Sec to reflect real peak usage and implement alerts based on latency, CPU, and error rates rather than ops/sec alone.
Success Criteria
Traffic spikes are considered well-handled when:
p95 and p99 latency remain within defined SLOs during peak load.
No sustained evictions or error-rate spikes occur.
Per-shard CPU and network utilization remain within safe operating ranges.
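These criteria can be turned into an automated post-spike check. The sketch below is one possible shape for such a check: the `PeakWindow` fields, the function name `spike_failures`, and every threshold are hypothetical and should be replaced with your own SLOs and metric sources. Treating any eviction in the window as a failure is a simplification of "no sustained evictions".

```python
from dataclasses import dataclass

@dataclass
class PeakWindow:
    """Metrics observed during one peak window (all fields are assumptions)."""
    p95_ms: float
    p99_ms: float
    evicted_keys: int
    error_rate: float          # fraction of failed requests, 0.0 to 1.0
    max_shard_cpu_pct: float
    network_util_pct: float

def spike_failures(w, p95_slo_ms, p99_slo_ms, max_error_rate=0.001,
                   cpu_limit_pct=70.0, network_limit_pct=70.0):
    """Return the list of success criteria that the window violated."""
    failures = []
    if w.p95_ms > p95_slo_ms:
        failures.append("p95 latency above SLO")
    if w.p99_ms > p99_slo_ms:
        failures.append("p99 latency above SLO")
    if w.evicted_keys > 0:
        failures.append("evictions observed")
    if w.error_rate > max_error_rate:
        failures.append("error rate above limit")
    if w.max_shard_cpu_pct > cpu_limit_pct:
        failures.append("per-shard CPU above limit")
    if w.network_util_pct > network_limit_pct:
        failures.append("network utilization above limit")
    return failures

# Example window with made-up numbers: latency is fine, one shard ran hot.
window = PeakWindow(p95_ms=2.0, p99_ms=6.0, evicted_keys=0,
                    error_rate=0.0002, max_shard_cpu_pct=85.0,
                    network_util_pct=40.0)
print(spike_failures(window, p95_slo_ms=5.0, p99_slo_ms=10.0))
```

An empty list means the spike met every criterion; any entries point to the resource that needs attention before the next peak.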