Increase Maximum Throughput and Capacity (Scale Up or Scale Out) – Redis Knowledge Base

As Redis workloads grow, throughput and capacity limits can be reached due to higher traffic, larger datasets, or new application features. Redis supports multiple scaling models to address these limits, but choosing the wrong approach can introduce unnecessary complexity or delay future growth.

This article helps you decide how to scale Redis based on real-world symptoms and growth patterns. It explains when scaling up is sufficient, when scaling out becomes necessary, what changes operationally when you scale, and what to validate before and after making changes. Product-specific execution details are linked inline where deeper guidance is required.

Throughput and Capacity Basics

Throughput is the number of operations per second a Redis database can process while meeting latency expectations.

Capacity is the amount of data Redis can store and serve efficiently, primarily constrained by memory, shard layout, and eviction behavior.

When either throughput or capacity approaches system limits, Redis performance becomes less predictable and scaling is required to restore headroom.

Quick Fix: Common Scaling Symptoms

Symptom	What It Usually Indicates	Recommended Direction
CPU or memory consistently near 80 percent	Single shard saturation	Scale up or prepare to scale out
Ops per second capped below demand	Throughput limit reached	Increase throughput or enable clustering
One shard significantly hotter than others	Hot keys or uneven access patterns	Fix key design and reshard
Latency spikes as traffic grows	Insufficient headroom	Scale before increasing load
Errors after enabling clustering	Client or key pattern incompatibility	Validate cluster-safe access

Scale Up (Vertical Scaling)

When scale up is the right choice

Scale up is appropriate when performance is constrained by CPU, memory, or network bandwidth on a single shard, and the workload still fits comfortably within one shard.

Scaling up increases the resources available to process requests without changing how data is distributed or how applications interact with Redis.

Scale up works best when:

The dataset is small to moderately sized
Traffic growth is incremental or bursty
The workload does not require horizontal parallelism

Scale up does not solve:

Hot keys or uneven access patterns
Sustained throughput growth beyond a single shard
High availability or fault isolation requirements

At a certain point, vertical scaling reaches physical or product limits. In Redis Cloud, increasing throughput or memory beyond supported single-shard thresholds automatically enables clustering, which is a permanent change.

For Redis Cloud–specific throughput limits and scaling behavior, see: Undersized Redis Cloud Databases: How to Diagnose and Fix

Scale Out (Horizontal Scaling)

When scale out is required

Scale out is required when throughput or dataset size exceeds what a single shard can handle, or when higher availability and fault isolation are needed.

Scaling out distributes data and traffic across multiple shards, allowing Redis to process requests in parallel and scale capacity more predictably over time.

Scale out is the right choice when:

Throughput growth is sustained rather than occasional
The dataset continues to grow over time
You need resilience to shard or node failures

What changes when you scale out

Scaling out improves aggregate throughput, but it also introduces new operational and application considerations:

Data is partitioned across shards
Clients must be cluster-aware
Multi-key operations require careful key design

Poor key patterns or unsupported clients can limit the benefits of sharding, even after adding resources.
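
To illustrate why key design matters for multi-key operations, the sketch below computes the hash slot a key maps to. It assumes the standard Redis Cluster scheme (CRC16 of the key modulo 16384, with hash tags in curly braces). The function name key_slot and the sample keys are illustrative only. Confirm which hashing policy your deployment uses before relying on it.

```python
from binascii import crc_hqx

HASH_SLOTS = 16384  # number of slots in the standard Redis Cluster keyspace


def key_slot(key: str) -> int:
    """Return the hash slot for a key, honoring {hash tag} syntax."""
    data = key.encode("utf-8")
    start = data.find(b"{")
    if start != -1:
        end = data.find(b"}", start + 1)
        # Only a non-empty tag between the first "{" and the next "}" counts.
        if end != -1 and end != start + 1:
            data = data[start + 1:end]
    # crc_hqx with an initial value of 0 is the CRC16 (XMODEM) variant assumed here.
    return crc_hqx(data, 0) % HASH_SLOTS


if __name__ == "__main__":
    # Keys that share a hash tag map to the same slot.
    print(key_slot("{user:42}:profile"), key_slot("{user:42}:cart"))
    # Untagged keys are hashed on the full key and may map to different slots.
    print(key_slot("user:42:profile"), key_slot("user:42:cart"))
```

Keys that must be read or written together in one command can share a hash tag so they stay on the same shard. Overusing a single tag concentrates traffic on one shard, so tags are best scoped to a natural unit such as one user or one order.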

Before scaling out, review clustering behavior, limitations, and query requirements: Redis Cloud Throughput Sizing Decision Guide: Optimize, Increase, or Upgrade

For Redis Software environments, detailed shard and node expansion workflows are covered here: Oversized Redis Cloud Databases: How to Right-Size Safely

Read Scaling with Replicas

For read-heavy workloads, adding read replicas can increase throughput without increasing write contention.

Read scaling is effective when:

The majority of traffic is read-only
Applications can tolerate eventual consistency
Replication lag is monitored and acceptable

Read replicas complement sharding but do not replace it for write-heavy workloads.
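
One way to monitor replication lag is to compare the primary's replication offset with each replica's offset. The sketch below assumes the deployment exposes INFO replication and that the client parses it into a dictionary with fields such as master_repl_offset and slave0, as redis-py does. Managed services may restrict or reshape this output, so treat the field names as assumptions to verify. The helper name replica_lag_bytes is hypothetical.

```python
def replica_lag_bytes(info: dict) -> dict:
    """Map each connected replica to the number of bytes it trails the primary."""
    primary_offset = int(info["master_repl_offset"])
    lags = {}
    for field, value in info.items():
        # Replica entries are assumed to appear as slave0, slave1, ... holding a dict of details.
        if field.startswith("slave") and isinstance(value, dict) and "offset" in value:
            lags[field] = primary_offset - int(value["offset"])
    return lags


if __name__ == "__main__":
    # Hand-written sample shaped like a parsed INFO replication reply (not real output).
    sample = {
        "role": "master",
        "connected_slaves": 1,
        "master_repl_offset": 1500,
        "slave0": {"ip": "10.0.0.2", "port": 6379, "state": "online", "offset": 1400, "lag": 1},
    }
    print(replica_lag_bytes(sample))
```

A lag that keeps growing under steady load suggests the replica cannot keep up, so reads served from it will be increasingly stale.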

Best Practices for Scaling Decisions

Optimize before scaling
Use pipelining, connection pooling, and efficient data structures. Avoid blocking commands where possible.
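
As a rough illustration of pipelining, the sketch below batches writes so that many commands share one network round trip. It assumes a client object with a redis-py style interface (pipeline(transaction=False), set, and execute). The helper name set_many and the batch size of 500 are illustrative choices, not recommendations from this article.

```python
def set_many(client, items: dict, batch_size: int = 500) -> int:
    """Write key/value pairs in pipelined batches and return the number of round trips."""
    round_trips = 0
    pending = 0
    # transaction=False requests plain pipelining without MULTI/EXEC wrapping.
    pipe = client.pipeline(transaction=False)
    for key, value in items.items():
        pipe.set(key, value)  # queued locally, not sent yet
        pending += 1
        if pending == batch_size:
            pipe.execute()  # send the whole batch in one round trip
            round_trips += 1
            pending = 0
    if pending:
        pipe.execute()  # flush the final partial batch
        round_trips += 1
    return round_trips
```

With redis-py installed, the client argument could be a redis.Redis instance pointed at your own endpoint. Bounded batches keep memory use on both the client and the server predictable while still removing most per-command round trips.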

Plan with headroom
Target CPU and memory utilization below 80 percent to absorb spikes, failovers, and maintenance operations.
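
The headroom target can be turned into a rough sizing calculation. The sketch below estimates how many shards keep each shard at or below a target utilization at peak load. The per-shard throughput figure is an assumption you would measure for your own workload, and the function name shards_for_headroom is hypothetical.

```python
import math


def shards_for_headroom(peak_ops_per_sec: float,
                        ops_per_shard: float,
                        target_utilization: float = 0.80) -> int:
    """Estimate the shard count that keeps each shard under the target utilization at peak."""
    if not 0 < target_utilization <= 1:
        raise ValueError("target_utilization must be in (0, 1]")
    if ops_per_shard <= 0:
        raise ValueError("ops_per_shard must be positive")
    # Only a fraction of each shard's measured capacity is treated as usable.
    usable_per_shard = ops_per_shard * target_utilization
    return max(1, math.ceil(peak_ops_per_sec / usable_per_shard))


if __name__ == "__main__":
    # Example with made-up numbers: 90,000 ops/sec at peak, 25,000 ops/sec measured per shard.
    print(shards_for_headroom(90_000, 25_000))
```

The estimate assumes traffic is spread evenly across shards. A hot key or skewed access pattern can saturate one shard well before the aggregate figure is reached.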

Avoid hot keys and large keys
Distribute access patterns evenly and split oversized keys to prevent single-shard bottlenecks.

Validate client compatibility early
Confirm cluster support before enabling clustering or resharding.

Monitor after every change
Track latency percentiles, eviction rate, shard balance, and throughput to confirm scaling achieved the intended outcome.
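
To compare latency before and after a change, percentiles can be computed from sampled request latencies. The sketch below uses only the Python standard library. The helper names, the choice of p50, p95, and p99, and the 10 percent tolerance are illustrative assumptions.

```python
from statistics import quantiles


def latency_percentiles(samples_ms):
    """Return p50, p95, and p99 for a list of latency samples in milliseconds."""
    # n=100 yields 99 cut points; index k-1 holds the k-th percentile.
    cuts = quantiles(samples_ms, n=100, method="inclusive")
    return {"p50": cuts[49], "p95": cuts[94], "p99": cuts[98]}


def regressed(before, after, tolerance=0.10):
    """List the percentiles that got worse by more than the tolerance."""
    return [name for name in before if after[name] > before[name] * (1 + tolerance)]


if __name__ == "__main__":
    # Made-up samples for illustration only.
    before = latency_percentiles([1.0, 1.2, 1.1, 1.3, 4.0, 1.2, 1.1, 1.0, 1.4, 1.2])
    after = latency_percentiles([1.0, 1.1, 1.1, 1.2, 9.0, 1.2, 1.0, 1.0, 1.3, 1.2])
    print(regressed(before, after))
```

Tail percentiles such as p99 often reveal saturation or uneven shard load that an average hides, which is why they are worth checking after each scaling step.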

Key Takeaways

Scale up for fast, short-term gains on smaller workloads
Scale out for sustained growth, high availability, and predictable throughput
In Redis Cloud, clustering is permanent once enabled
Poor key design limits scalability even after adding resources
Always validate and monitor before increasing traffic further
