As Redis workloads grow, higher traffic, larger datasets, or new application features can push a database to its throughput or capacity limits. Redis supports multiple scaling models to address these limits, but choosing the wrong one can introduce unnecessary complexity or constrain future growth.
This article helps you decide how to scale Redis based on real-world symptoms and growth patterns. It explains when scaling up is sufficient, when scaling out becomes necessary, what changes operationally when you scale, and what to validate before and after making changes. Product-specific execution details are linked inline where deeper guidance is required.
Throughput and Capacity Basics
Throughput is the number of operations per second a Redis database can process while meeting latency expectations.
Capacity is the amount of data Redis can store and serve efficiently, primarily constrained by memory, shard layout, and eviction behavior.
When either throughput or capacity approaches system limits, Redis performance becomes less predictable and scaling is required to restore headroom.
Quick Fix: Common Scaling Symptoms
| Symptom | What It Usually Indicates | Recommended Direction |
|---|---|---|
| CPU or memory consistently near 80 percent | Single shard saturation | Scale up or prepare to scale out |
| Ops per second capped below demand | Throughput limit reached | Increase throughput or enable clustering |
| One shard significantly hotter than others | Hot keys or uneven access patterns | Fix key design and reshard |
| Latency spikes as traffic grows | Insufficient headroom | Scale before increasing load |
| Errors after enabling clustering | Client or key pattern incompatibility | Validate cluster-safe access |
Scale Up (Vertical Scaling)
When scale up is the right choice
Scale up is appropriate when performance is constrained by CPU, memory, or network bandwidth on a single shard, and the workload still fits comfortably within one shard.
Scaling up increases the resources available to process requests without changing how data is distributed or how applications interact with Redis.
Scale up works best when:
The dataset is small to moderately sized
Traffic growth is incremental or bursty
The workload does not require horizontal parallelism
Scale up does not solve:
Hot keys or uneven access patterns
Sustained throughput growth beyond a single shard
High availability or fault isolation requirements
At a certain point, vertical scaling reaches physical or product limits. In Redis Cloud, increasing throughput or memory beyond supported single-shard thresholds automatically enables clustering, which is a permanent change.
For Redis Cloud–specific throughput limits and scaling behavior, see: Undersized Redis Cloud Databases: How to Diagnose and Fix
Scale Out (Horizontal Scaling)
When scale out is required
Scale out is required when throughput or dataset size exceeds what a single shard can handle, or when higher availability and fault isolation are needed.
Scaling out distributes data and traffic across multiple shards, allowing Redis to process requests in parallel and scale capacity more predictably over time.
Scale out is the right choice when:
Throughput growth is sustained rather than occasional
The dataset continues to grow over time
You need resilience to shard or node failures
What changes when you scale out
Scaling out improves aggregate throughput, but it also introduces new operational and application considerations:
Data is partitioned across shards
Clients must be cluster-aware
Multi-key operations require careful key design
Poor key patterns or unsupported clients can limit the benefits of sharding, even after adding resources.
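To make the key-design point concrete, the sketch below shows how open source Redis Cluster maps a key to a hash slot (CRC16 of the key, modulo 16384) and how a `{...}` hash tag forces related keys into the same slot. This illustrates the open source algorithm only. Redis Cloud and Redis Software hashing policies may be configured differently, so treat the slot count and the helper names `hash_tag`, `key_slot`, and `same_slot` as assumptions for illustration.

```python
import binascii

HASH_SLOTS = 16384  # slot count used by open source Redis Cluster


def hash_tag(key: bytes) -> bytes:
    """Return the part of the key that is hashed, honoring {...} hash tags."""
    start = key.find(b"{")
    if start != -1:
        end = key.find(b"}", start + 1)
        # Only a non-empty tag counts; "{}" falls back to the whole key.
        if end != -1 and end != start + 1:
            return key[start + 1:end]
    return key


def key_slot(key: str) -> int:
    """Map a key to a slot: CRC16 (XMODEM variant) modulo the slot count."""
    return binascii.crc_hqx(hash_tag(key.encode()), 0) % HASH_SLOTS


def same_slot(*keys: str) -> bool:
    """True when all keys land in one slot, so multi-key commands can run."""
    return len({key_slot(k) for k in keys}) == 1


# Keys that share the tag {user:1} hash to the same slot.
print(same_slot("{user:1}:profile", "{user:1}:cart"))
```

Running a check like `same_slot` against the keys used in multi-key commands before enabling clustering can surface cross-slot access early. Note that tagging too many keys with the same value concentrates them on one shard and can create a hot spot.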
Before scaling out, review clustering behavior, limitations, and query requirements: Redis Cloud Throughput Sizing Decision Guide: Optimize, Increase, or Upgrade
For Redis Software environments, detailed shard and node expansion workflows are covered here: Oversized Redis Cloud Databases: How to Right-Size Safely
Read Scaling with Replicas
For read-heavy workloads, adding read replicas can increase throughput without increasing write contention.
Read scaling is effective when:
The majority of traffic is read-only
Applications can tolerate eventual consistency
Replication lag is monitored and acceptable
Read replicas complement sharding but do not replace it for write-heavy workloads.
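One way to monitor replication lag is to compare the primary's replication offset with each replica's acknowledged offset, as reported by the `INFO replication` command on the primary. The sketch below parses that text output. The sample values are made up, and managed services may restrict or reshape `INFO` output, so treat this as an illustration rather than a supported monitoring method.

```python
from typing import Dict


def replica_lag_bytes(info_text: str) -> Dict[str, int]:
    """Return the replication backlog in bytes per replica.

    Expects the text returned by INFO replication on the primary.
    """
    fields = {}
    for line in info_text.splitlines():
        line = line.strip()
        if line and not line.startswith("#") and ":" in line:
            name, value = line.split(":", 1)
            fields[name] = value

    master_offset = int(fields["master_repl_offset"])
    lag = {}
    for name, value in fields.items():
        # Replica lines look like: slave0:ip=...,port=...,state=...,offset=...,lag=...
        if name.startswith("slave") and "offset=" in value:
            attrs = dict(part.split("=", 1) for part in value.split(","))
            lag[name] = master_offset - int(attrs["offset"])
    return lag


# Hypothetical sample output, for illustration only.
sample = """# Replication
role:master
connected_slaves:1
slave0:ip=10.0.0.2,port=6379,state=online,offset=1000,lag=1
master_repl_offset:1150
"""
print(replica_lag_bytes(sample))
```

What counts as acceptable lag depends on how much staleness the application can tolerate, so any alert threshold should come from that requirement.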
Best Practices for Scaling Decisions
Optimize before scaling
Use pipelining, connection pooling, and efficient data structures. Avoid blocking commands where possible.
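As a minimal sketch of pipelining, the helper below batches writes so that many commands share one network round trip. It assumes a client object that exposes a redis-py style `pipeline(transaction=False)` method. The helper name `set_many` and the batch size of 500 are illustrative choices, not recommendations from this article.

```python
from typing import Iterable, List, Tuple


def set_many(client, items: Iterable[Tuple[str, str]], batch_size: int = 500) -> int:
    """Write key/value pairs in pipelined batches; return the number of commands sent."""
    sent = 0
    batch: List[Tuple[str, str]] = []

    def flush() -> int:
        # Queue the whole batch locally, then send it in one round trip.
        pipe = client.pipeline(transaction=False)
        for key, value in batch:
            pipe.set(key, value)
        pipe.execute()
        count = len(batch)
        batch.clear()
        return count

    for item in items:
        batch.append(item)
        if len(batch) >= batch_size:
            sent += flush()
    if batch:
        sent += flush()
    return sent
```

Bounding the batch size keeps memory use on the client and server predictable. On a clustered database, keys in one pipeline may map to different shards, so confirm how your client library handles that case.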
Plan with headroom
Target CPU and memory utilization below 80 percent to absorb spikes, failovers, and maintenance operations.
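The headroom target translates into a simple growth estimate. The sketch below assumes utilization grows roughly linearly with load, which is a simplification that real workloads may not follow. The function name is hypothetical.

```python
def growth_headroom(current_utilization: float, target: float = 0.80) -> float:
    """Fraction by which load can grow before utilization reaches the target.

    Assumes utilization scales linearly with load. A negative result means
    the resource is already above the target.
    """
    if not 0 < current_utilization <= 1:
        raise ValueError("current_utilization must be in (0, 1]")
    return target / current_utilization - 1.0


# At 50 percent utilization, load can grow about 60 percent before hitting 80 percent.
print(round(growth_headroom(0.50), 2))
```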
Avoid hot keys and large keys
Distribute access patterns evenly and split oversized keys to prevent single-shard bottlenecks.
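One common way to split an oversized key is to spread its members across several smaller keys chosen by a stable hash. The sketch below is an assumption-laden illustration: the `base:index` naming scheme, the bucket count of 16, and the function name are all hypothetical.

```python
import zlib


def bucketed_key(base: str, member: str, buckets: int = 16) -> str:
    """Pick one of `buckets` smaller keys for a member.

    zlib.crc32 is used because it gives the same result in every process,
    unlike Python's built-in hash() for strings.
    """
    index = zlib.crc32(member.encode()) % buckets
    return f"{base}:{index}"


# The same member always maps to the same bucket key.
print(bucketed_key("followers:1234", "user:42"))
```

The trade-off is that any operation needing the full collection must now visit every bucket, so this pattern suits workloads dominated by per-member reads and writes.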
Validate client compatibility early
Confirm cluster support before enabling clustering or resharding.
Monitor after every change
Track latency percentiles, eviction rate, shard balance, and throughput to confirm scaling achieved the intended outcome.
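To compare latency before and after a scaling change, percentiles are more informative than averages because they expose tail behavior. The sketch below summarizes a list of latency samples using the Python standard library. How the samples are collected is left open, and the choice of p50, p95, and p99 is an assumption.

```python
import statistics
from typing import Dict, Sequence


def latency_percentiles(samples_ms: Sequence[float]) -> Dict[str, float]:
    """Summarize latency samples (milliseconds) as p50, p95, and p99.

    Requires at least two samples.
    """
    cuts = statistics.quantiles(samples_ms, n=100)  # 99 cut points
    return {"p50": cuts[49], "p95": cuts[94], "p99": cuts[98]}


before = latency_percentiles([1.2, 1.4, 1.1, 9.8, 1.3, 1.5, 1.2, 12.4])
print(before)
```

Capturing the same summary under comparable load before and after the change makes it easier to confirm that the change helped.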
Key Takeaways
Scale up for fast, short-term gains on smaller workloads
Scale out for sustained growth, high availability, and predictable throughput
In Redis Cloud, clustering is permanent once enabled
Poor key design limits scalability even after adding resources
Always validate and monitor before increasing traffic further