Handling Traffic Spikes and Throughput Above Configured Ops/Sec – Redis Knowledge Base

Redis databases can often handle short-term traffic bursts above planned or configured throughput, but sustained spikes without sufficient headroom commonly lead to increased latency, client timeouts, or error rates. In Redis Cloud and Redis Software environments, Configured Ops/Sec is primarily a sizing and planning signal, not always a strict enforcement limit. What ultimately determines stability under load is how close the system is to CPU, memory, and network saturation.

This article explains how to interpret Configured Ops/Sec versus observed throughput, how to operate safely during unexpected traffic spikes, and what actions to take when real-world traffic significantly exceeds configuration. It also covers common pitfalls in development and test environments where irregular spikes are frequent.

Quick Fix Actions When Throughput Exceeds Configuration

Symptom	Fast Check	Immediate Action
Observed ops/sec significantly above configured (for example, 50k configured, 150k observed)	Review per-shard CPU and p95/p99 latency	Reduce or pause non-critical background workloads and confirm no blocking commands are running
Latency spikes during traffic bursts	Check p95/p99 latency during peak windows	Enable or increase client-side pipelining and batching
CPU saturation on one or more shards	Per-shard CPU consistently above ~70%	Increase shard count or CPU to keep per-shard CPU below approximately 70% at peak
Evictions or memory pressure	Review memory usage and eviction metrics	Increase memory capacity and confirm eviction policy matches workload
Client timeouts or retry storms	Review client timeout and retry configuration	Implement client-side backoff to prevent overload cascades

Configured Ops/Sec vs Observed Throughput

Configured Ops/Sec is typically used as a capacity planning and sizing guideline, and in some Redis Cloud plans it may also be used for automated scaling signals or protective behavior. It is not always a hard enforcement limit.

Sustained operation above the configured Ops/Sec increases the risk of elevated latency and client timeouts if CPU, memory, or network headroom is insufficient.

Key points:

Latency and error rates are the primary indicators of overload, not raw ops/sec.
Redis may continue to serve traffic above configured throughput as long as CPU, memory, and network headroom remain available.
Some Redis Cloud plans may warn or throttle when sustained traffic exceeds configuration, while others operate on a best-effort basis with increasing latency.
Always validate expected behavior against your deployment model:
- Redis Cloud monitoring and scaling behavior
- Redis Software capacity and performance considerations
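
The key points above can be sketched as a small classification helper. This is an illustrative sketch only: the `ShardSnapshot` type, the `assess_load` function, and the latency and error-rate thresholds are hypothetical names and values chosen for the example. Only the 70% CPU figure comes from this article.

```python
from dataclasses import dataclass


@dataclass
class ShardSnapshot:
    """Hypothetical peak-window metrics for the busiest shard."""
    cpu_percent: float      # per-shard CPU utilization, 0-100
    p99_latency_ms: float   # observed p99 latency in milliseconds
    error_rate: float       # fraction of failed requests, 0.0-1.0


def assess_load(snapshot, observed_ops, configured_ops,
                cpu_limit=70.0, p99_slo_ms=5.0, max_error_rate=0.01):
    """Classify load from saturation signals rather than ops/sec alone.

    The p99 SLO and error-rate limits are placeholder assumptions;
    substitute the values from your own SLOs.
    """
    saturated = (
        snapshot.cpu_percent >= cpu_limit
        or snapshot.p99_latency_ms > p99_slo_ms
        or snapshot.error_rate > max_error_rate
    )
    if saturated:
        return "overloaded"
    if observed_ops > configured_ops:
        # Above the planning figure, but headroom is still available.
        return "above-plan-with-headroom"
    return "within-plan"
```

With this logic, a database running at three times its configured Ops/Sec but at 45% CPU and low latency is reported as above plan with headroom, while one below its configured figure but at 85% CPU is reported as overloaded.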

Sizing and Configuration Guidance for Traffic Spikes

Measure peak traffic, not averages
Capture p95 and p99 throughput and latency during known busy windows. Average metrics often mask short but impactful spikes.
See Redis Latency Monitoring Guidance.
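
To illustrate why averages mask spikes, the following sketch computes the mean alongside p95 and p99 from a list of latency samples using Python's standard `statistics` module. The `latency_summary` function name and the sample data are assumptions made for this example, not part of any Redis tooling.

```python
import statistics


def latency_summary(samples_ms):
    """Return mean, p95, and p99 for a list of latency samples (ms)."""
    # n=100 yields 99 cut points; index 94 is p95 and index 98 is p99.
    cuts = statistics.quantiles(samples_ms, n=100, method="inclusive")
    return {
        "mean": statistics.fmean(samples_ms),
        "p95": cuts[94],
        "p99": cuts[98],
    }


# Hypothetical window: 95 fast requests and 5 slow ones.
samples = [1.0] * 95 + [50.0] * 5
summary = latency_summary(samples)
```

In this hypothetical window the mean stays low even though a few requests take 50 ms, which is the pattern that p99 exposes and the average hides.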

Scale capacity to preserve CPU headroom
Increase shard count or CPU to keep per-shard CPU below approximately 70% at peak.
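
As a rough sizing aid, the arithmetic can be sketched as follows. The `shards_needed` function is a hypothetical helper, and it assumes that load is spread evenly across shards and that CPU cost scales linearly with shard count. Neither holds exactly in practice, for example when hot keys concentrate traffic on one shard, so treat the result as a starting point.

```python
import math


def shards_needed(current_shards, peak_cpu_percent, target_cpu_percent=70.0):
    """Estimate the shard count that keeps per-shard CPU at or under target.

    Assumes evenly distributed load and linear scaling (an approximation).
    """
    if current_shards < 1 or target_cpu_percent <= 0:
        raise ValueError("shard count and target CPU must be positive")
    total_cpu = current_shards * peak_cpu_percent
    estimate = math.ceil(total_cpu / target_cpu_percent)
    # Never recommend shrinking below the current shard count.
    return max(current_shards, estimate)
```

For example, 4 shards peaking at 90% CPU represent 360 "shard-percent" of work, which needs 6 shards to stay at or under 70% each.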

Maintain memory headroom
Plan for 20–30% free memory to avoid evictions, allocator pressure, and degraded performance. For background on how Redis behaves under memory pressure, see Memory and Performance.
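
The headroom target translates into a memory limit with simple arithmetic. The sketch below uses a hypothetical `memory_limit_for` helper and deliberately ignores replication and per-key overhead, which you would need to account for separately.

```python
def memory_limit_for(dataset_gb, free_fraction=0.25):
    """Return the memory limit (GB) that leaves `free_fraction` unused.

    Covers only the dataset itself; replication and overhead are
    outside the scope of this sketch.
    """
    if not 0 <= free_fraction < 1:
        raise ValueError("free_fraction must be in [0, 1)")
    return dataset_gb / (1 - free_fraction)
```

For example, a 30 GB dataset with 25% free memory calls for a 40 GB limit, because 30 / 0.75 = 40.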

Optimize client access patterns
Use pipelining and batching commands such as MGET and MSET. Avoid chatty request patterns that amplify latency under load.
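
A minimal sketch of both techniques follows. It assumes a client object that exposes `mget` and `pipeline(transaction=False)` in the style of the redis-py client. The helper names (`chunked`, `fetch_many`, `increment_many`) and the chunk size are illustrative choices. Multi-key commands may behave differently on clustered databases, so check this against your deployment.

```python
from itertools import islice


def chunked(items, size):
    """Yield successive lists of at most `size` items."""
    it = iter(items)
    while chunk := list(islice(it, size)):
        yield chunk


def fetch_many(client, keys, chunk_size=500):
    """Read keys with one MGET per chunk instead of one GET per key."""
    result = {}
    for chunk in chunked(keys, chunk_size):
        values = client.mget(chunk)  # one round trip for the whole chunk
        result.update(zip(chunk, values))
    return result


def increment_many(client, counters, chunk_size=500):
    """Send INCR commands in pipelined batches to cut round trips."""
    replies = []
    for chunk in chunked(counters, chunk_size):
        pipe = client.pipeline(transaction=False)  # batching only, no MULTI/EXEC
        for name in chunk:
            pipe.incr(name)
        replies.extend(pipe.execute())  # flush the batch in one round trip
    return replies
```

Bounding the chunk size keeps any single request from becoming large enough to occupy the server for a long time, which matters most during a spike.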

Align eviction policy and TTLs with workload
For cache workloads, select an eviction policy such as allkeys-lru that matches access patterns, and ensure keys expire naturally. Read more on Data Eviction Policies.
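
One way to make sure cache keys expire naturally is to attach a TTL to every cache write. The sketch below shows a cache-aside read under that assumption. It expects a client with `get` and `set(name, value, ex=...)` methods in the style of redis-py. The `get_or_load` helper, the `loader` callback, and the 300-second default are hypothetical.

```python
def get_or_load(client, key, loader, ttl_seconds=300):
    """Return a cached value, loading and caching it with a TTL on a miss."""
    if ttl_seconds <= 0:
        raise ValueError("ttl_seconds must be positive")
    cached = client.get(key)
    if cached is not None:
        return cached
    value = loader(key)  # hypothetical callback that reads the source of truth
    # Every cache write carries an expiry, so stale keys age out on their own
    # instead of relying solely on eviction under memory pressure.
    client.set(key, value, ex=ttl_seconds)
    return value
```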

Operating Safely Above Planned Throughput (Short Term)

Short periods above configured Ops/Sec are often survivable if sufficient headroom exists, but elevated latency should be expected as utilization rises.

Recommended mitigations:

Monitor latency and error rates rather than ops/sec alone.
Implement client-side timeouts and exponential backoff to prevent retry storms.
Temporarily pause or reduce non-essential workloads such as analytics jobs, scans, or maintenance tasks until capacity is increased.
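
The backoff mitigation above can be sketched as a small retry wrapper. This example uses exponential backoff with full jitter, which is one common variant and not necessarily what your client library does. The `call_with_backoff` name and its defaults are assumptions. The default `retry_on` tuple uses Python's built-in exceptions as placeholders, so pass your client library's timeout and connection exception types instead.

```python
import random
import time


def call_with_backoff(operation, attempts=5, base=0.05, cap=2.0,
                      retry_on=(TimeoutError, ConnectionError),
                      sleep=time.sleep):
    """Call `operation`, retrying on failure with exponential backoff and jitter."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(attempts):
        try:
            return operation()
        except retry_on:
            if attempt == attempts - 1:
                raise  # retry budget exhausted; surface the error
            # The delay ceiling doubles each attempt (up to `cap`); choosing a
            # random delay below it keeps clients from retrying in lockstep.
            ceiling = min(cap, base * 2 ** attempt)
            sleep(random.uniform(0, ceiling))
```

Capping both the number of attempts and the maximum delay bounds the extra load a struggling database receives, which is what prevents a latency spike from turning into a retry storm.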

Development and Test Environments with Irregular Spikes

Non-production environments are especially prone to mis-sizing and noisy workloads.

Common risks and mitigations:

Load tests that far exceed environment sizing can cause latency spikes and, in extreme cases, evictions or failovers.
Allocate additional headroom for environments that regularly run stress tests.
Isolate heavy test workloads on separate clusters or subscriptions.
Schedule load testing during off-peak hours where possible.
For sizing details, see:
- Redis Cloud subscription sizing
- Redis Software cluster sizing guidance

Step-by-Step Action Plan When Throughput Exceeds Configuration

Stabilize the system
Reduce or stop non-critical background jobs, and confirm that no long-running commands are occupying the server (for example, KEYS on a large keyspace, SCAN with a very large COUNT, or long-running Lua scripts invoked through EVAL or EVALSHA). Because Redis executes commands on a single main thread, these commands delay every other request and amplify latency during spikes.

Assess current performance
Review per-shard CPU, memory usage, network utilization, and p95/p99 latency. Verify that clients are using batching or pipelining.

Scale capacity
Add shards to reduce per-shard load or scale up CPU. Increase memory if evictions or allocator pressure are observed.

Align configuration with reality
Update configured Ops/Sec to reflect real peak usage and implement alerts based on latency, CPU, and error rates rather than ops/sec alone.
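
For the stabilization step, one way to look for long-running commands is to inspect the slow log. The sketch below assumes a client with a redis-py style `slowlog_get` method whose entries include a `command` field (bytes or text) and a `duration` field in microseconds. The `slow_suspects` helper, the command set, and the 10 ms threshold are illustrative assumptions.

```python
# Commands this article calls out as able to monopolize the main thread.
SUSPECT_COMMANDS = {"KEYS", "SCAN", "EVAL", "EVALSHA"}


def slow_suspects(client, min_duration_ms=10.0, limit=128):
    """Return (command, duration_ms) pairs for slow, blocking-prone commands."""
    suspects = []
    for entry in client.slowlog_get(limit):
        command = entry["command"]
        if isinstance(command, bytes):
            command = command.decode("utf-8", errors="replace")
        parts = command.split()
        name = parts[0].upper() if parts else ""
        duration_ms = entry["duration"] / 1000.0  # microseconds to milliseconds
        if name in SUSPECT_COMMANDS and duration_ms >= min_duration_ms:
            suspects.append((name, duration_ms))
    # Slowest first, so the most disruptive commands are reviewed first.
    return sorted(suspects, key=lambda item: item[1], reverse=True)
```

An empty result does not prove that nothing is blocking, because the slow log only records commands above its configured threshold, so use it alongside the latency and CPU checks in the next step.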

Success Criteria

Traffic spikes are considered well-handled when:

p95 and p99 latency remain within defined SLOs during peak load.
No sustained evictions or error-rate spikes occur.
Per-shard CPU and network utilization remain within safe operating ranges.
