Redis Software lets you raise a node’s effective memory limit (maxmemory) dynamically when the host has sufficient free RAM, which expands capacity without downtime. This is commonly done to accommodate growth, reduce eviction pressure, or restore memory headroom. This article covers when it is safe to increase memory, how to perform a rolling change without downtime, and how to validate stability afterward, along with DC/DR considerations and troubleshooting guidance.
Quick Fix Table
| Issue | Resolution |
|---|---|
| Hitting maxmemory or eviction thresholds | Increase maxmemory if the host has available RAM headroom |
| Need more capacity without downtime | Perform rolling maxmemory increase per node |
| High fragmentation ratio (>1.5) | Increase memory to restore effective usable capacity |
| Memory increase fails or is not applied | Verify host-level RAM availability and platform limits |
| Failover risk after scaling | Apply same changes to DR cluster and validate replication |
Prerequisites
Ensure the environment is ready for a safe change
Sufficient physical RAM available on each node
Cluster is healthy:
Low replication lag
Stable latency
No ongoing failovers
Confirm platform capabilities
Dynamic memory allocation is supported without restart
If not, plan for a rolling restart when resizing underlying infrastructure
Step-by-Step: Increase Memory Without Downtime
Validate available memory headroom
Confirm each node has unused RAM beyond current maxmemory
Check platform indicators (if available) such as reserved or provisional RAM
Determine how much memory to add
Size the new maxmemory so that 20–30% of it remains free at steady state
If the fragmentation ratio is high (>1.5), add extra buffer on top of that headroom
Prefer incremental increases if unsure
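The sizing guidance above can be sketched as a small calculation. This is an illustrative helper, not a Redis Software API: the function name, the 10% fragmentation buffer, and the `host_safe_limit` parameter (the largest maxmemory you consider safe on that host) are assumptions to adjust for your environment.

```python
GIB = 2**30

def recommended_maxmemory(used_memory, current_maxmemory, host_safe_limit,
                          free_fraction=0.25, frag_ratio=1.0, frag_buffer=0.10):
    """Return a proposed maxmemory in bytes (never below the current value)."""
    if not 0.0 < free_fraction < 1.0:
        raise ValueError("free_fraction must be between 0 and 1")
    if host_safe_limit < current_maxmemory:
        raise ValueError("host_safe_limit is below the current maxmemory")
    # Size the limit so that free_fraction of it stays unused at steady state.
    target = used_memory / (1.0 - free_fraction)
    # Add an extra buffer when fragmentation is high (>1.5); the 10% default
    # is an assumed value, not a documented recommendation.
    if frag_ratio > 1.5:
        target *= 1.0 + frag_buffer
    # Never propose a decrease, and never exceed what the host can safely give.
    target = max(target, current_maxmemory)
    return int(min(target, host_safe_limit))
```

For example, with 9 GiB in use, a current limit of 10 GiB, and 25% target headroom, the helper proposes 12 GiB, provided the host can safely accommodate it.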
Select a node for update
Choose a shard/node during a lower-traffic window
Avoid modifying multiple nodes simultaneously
Increase maxmemory
Update the node configuration using your platform tools
Ensure the new value does not exceed safe host limits
Monitor behavior (5–15 minutes)
Observe:
Memory usage
CPU
Latency
Replication status
Repeat for remaining nodes
Proceed node-by-node in a rolling fashion
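The rolling procedure above can be expressed as a loop that changes one node, observes it, and halts at the first sign of trouble. This is a minimal sketch: `apply_change`, `read_metrics`, and `is_healthy` are hypothetical callables that you would implement with your own platform tools and monitoring, and the default 10-minute window is simply a value inside the 5–15 minute range given above.

```python
import time

def rolling_increase(nodes, new_maxmemory, apply_change, read_metrics,
                     is_healthy, observe_seconds=600, poll_seconds=30,
                     sleep=time.sleep):
    """Apply new_maxmemory one node at a time; stop at the first problem.

    Returns (updated_nodes, halted_node); halted_node is None on success.
    """
    updated = []
    for node in nodes:
        # Do not touch a node that is already unhealthy.
        if not is_healthy(read_metrics(node)):
            return updated, node
        apply_change(node, new_maxmemory)
        updated.append(node)
        # Observe the node for the monitoring window before moving on.
        waited = 0
        while waited < observe_seconds:
            sleep(poll_seconds)
            waited += poll_seconds
            if not is_healthy(read_metrics(node)):
                return updated, node
    return updated, None
```

Injecting the callables keeps the loop independent of any particular management interface and makes it easy to dry-run against recorded metrics before using it on a live cluster.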
DC/DR Considerations
Keep capacity aligned across regions
Apply identical memory changes to both primary (DC) and disaster recovery (DR) clusters
Prevent failover scenarios where DR cannot handle production load
Validate replication capacity
A higher memory limit allows the dataset to grow, which can increase replication traffic between clusters
Ensure network links can support the change
Test failover readiness
Perform a controlled failover after changes
Confirm DR cluster stability and performance
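For the replication capacity check, a rough back-of-the-envelope estimate of how long a full copy of the dataset would take over the inter-site link can help decide whether the link is adequate. The sketch below is only arithmetic under stated assumptions: the function name and the 70% usable-bandwidth default are hypothetical, and it ignores compression, incremental sync, and competing traffic.

```python
def full_sync_seconds(dataset_bytes, link_bits_per_second, usable_fraction=0.7):
    """Estimate seconds needed to transfer dataset_bytes over the link."""
    if link_bits_per_second <= 0:
        raise ValueError("link_bits_per_second must be positive")
    if not 0.0 < usable_fraction <= 1.0:
        raise ValueError("usable_fraction must be in (0, 1]")
    # Convert bytes to bits, then divide by the bandwidth assumed usable
    # for replication (an assumption; measure your own link).
    return dataset_bytes * 8 / (link_bits_per_second * usable_fraction)
```

Comparing the estimate before and after the planned memory increase gives a quick sense of how much longer a worst-case resynchronization to DR could take.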
Post-Change Validation
Confirm memory stability
used_memory should remain below maxmemory with the 20–30% headroom targeted during sizing
Check eviction behavior
Evictions should be:
Zero, or
Within expected policy thresholds
Validate performance
Latency and CPU remain within SLOs
No increase in replication lag
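The validation checks above can be collected into one function that compares metric snapshots taken before and after the change. This is a sketch, not a definitive implementation: `used_memory`, `maxmemory`, and `evicted_keys` follow the field names reported by the Redis `INFO` command, while `latency_ms` and `replication_lag_s` are hypothetical keys standing in for whatever your monitoring system exposes.

```python
def validate_post_change(before, after, headroom=0.20,
                         allowed_evictions=0, latency_slo_ms=None):
    """Return a list of problems; an empty list means every check passed."""
    problems = []
    # Memory stability: used_memory should sit below maxmemory with headroom.
    if after["used_memory"] > after["maxmemory"] * (1.0 - headroom):
        problems.append("used_memory is inside the headroom margin")
    # evicted_keys is cumulative, so compare against the pre-change snapshot.
    if after["evicted_keys"] - before["evicted_keys"] > allowed_evictions:
        problems.append("evictions exceed the allowed threshold")
    # Replication lag should not be higher than it was before the change.
    if after["replication_lag_s"] > before["replication_lag_s"]:
        problems.append("replication lag increased")
    # Latency is checked only when an SLO value is supplied.
    if latency_slo_ms is not None and after["latency_ms"] > latency_slo_ms:
        problems.append("latency exceeds the SLO")
    return problems
```

Returning a list of findings rather than a single boolean makes it clear which check failed when deciding whether to continue the rollout.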
Troubleshooting
Memory increase not applied
Verify host-level RAM availability
Confirm platform supports dynamic allocation
Check for configuration limits
Unexpected instability after change
Roll back the recent memory increase, provided used_memory still fits below the previous limit
Re-evaluate headroom and workload patterns
Evictions continue after increase
Dataset growth may exceed expectations
Reassess sizing or consider scaling horizontally (adding shards)
Decrease in maxmemory causes issues
Reducing memory can trigger evictions or instability if used_memory exceeds new limits
Avoid decreasing memory without prior dataset reduction
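Before lowering maxmemory, a simple guard can confirm that the current dataset fits under the proposed limit. The helper below is a hypothetical sketch; the 20% headroom default is an assumption taken from the lower end of the 20–30% sizing guidance.

```python
def safe_to_decrease(used_memory, new_maxmemory, headroom=0.20):
    """True if used_memory fits under new_maxmemory with the given headroom."""
    if new_maxmemory <= 0:
        raise ValueError("new_maxmemory must be positive")
    # The dataset must fit below the new limit with headroom to spare;
    # otherwise the decrease is likely to trigger evictions.
    return used_memory <= new_maxmemory * (1.0 - headroom)
```

If the check returns False, reduce the dataset first or choose a higher limit.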
When to Scale Instead
Increase node size or shard count if:
No available RAM headroom exists
Dataset growth is continuous and predictable
Memory pressure persists after tuning