When a small number of "hot keys" receive a disproportionate amount of traffic in Redis, they can overload the shards on which they reside. This guide provides engineering-level insight into detecting, diagnosing, and resolving hot key imbalances to maintain consistent performance and avoid shard-level bottlenecks. Key sections include Detection, Step-by-Step Troubleshooting, and Remediation Techniques.
Detection
Monitor Shard Metrics
- Use observability tools like Prometheus and Grafana to track shard-level metrics.
- Focus on CPU usage >80%, elevated latency, and memory usage discrepancies between shards.
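As a rough illustration of the CPU check above, the following sketch flags shards whose CPU is above 80% or well above the median of their peers. The function name, the 1.5x skew ratio, and the sample readings are assumptions made for this example; in practice the numbers would come from your monitoring system.

```python
from statistics import median


def flag_hot_shards(cpu_by_shard, cpu_limit=80.0, skew_ratio=1.5):
    """Return shard IDs whose CPU exceeds cpu_limit or skew_ratio times the median.

    cpu_by_shard maps a shard ID to a CPU percentage. Both thresholds are
    illustrative defaults, not values recommended by Redis.
    """
    if not cpu_by_shard:
        return []
    typical = median(cpu_by_shard.values())
    hot = []
    for shard, cpu in sorted(cpu_by_shard.items()):
        if cpu > cpu_limit or (typical > 0 and cpu > skew_ratio * typical):
            hot.append(shard)
    return hot


# Hypothetical readings for a four-shard database.
sample = {"shard-1": 22.0, "shard-2": 25.0, "shard-3": 91.0, "shard-4": 24.0}
print(flag_hot_shards(sample))
```

Comparing against the median rather than a fixed limit alone helps surface a shard that is busier than its peers before it reaches the 80% mark.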
Identify Hot Keys
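- Run redis-cli --hotkeys to sample the most frequently accessed keys. This requires an LFU maxmemory-policy (allkeys-lfu or volatile-lfu).
- Use OBJECT FREQ <key> to get a key's access frequency counter, which is also available only under an LFU policy.
- Determine which shard a key resides on using a Redis key slot calculator.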
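If no key slot calculator is at hand, the slot can be computed locally. The sketch below follows the open source Redis Cluster specification (CRC16 of the key, modulo 16384, with hash tag handling). Whether your deployment uses the same slot count and hashing policy is an assumption to verify, and mapping a slot to a specific shard still depends on the database's current slot ranges.

```python
import binascii

SLOT_COUNT = 16384  # slot count from the open source Redis Cluster specification


def hash_tag_or_key(key: bytes) -> bytes:
    """Return the hash tag content if the key has a non-empty {...} tag, else the key."""
    start = key.find(b"{")
    if start != -1:
        end = key.find(b"}", start + 1)
        # Only the first '{' and the first '}' after it count; an empty tag is ignored.
        if end != -1 and end != start + 1:
            return key[start + 1:end]
    return key


def key_slot(key: str) -> int:
    """Compute the cluster hash slot for a key."""
    data = hash_tag_or_key(key.encode("utf-8"))
    # binascii.crc_hqx with an initial value of 0 is the CRC16 (XMODEM) variant
    # that the cluster specification describes.
    return binascii.crc_hqx(data, 0) % SLOT_COUNT


print(key_slot("user:1000"), key_slot("{user123}:profile"), key_slot("{user123}:cart"))
```

The last two keys share the tag {user123}, so they resolve to the same slot.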
Step-by-Step Troubleshooting
Initial Assessment
- Review Grafana dashboards for metrics that reveal imbalanced resource usage.
- Identify shards showing high CPU or memory usage consistently.
Pinpoint Hot Keys
- Run redis-cli --hotkeys and OBJECT FREQ on affected databases. Review the slowlog for each database to see whether a particular key appears frequently in the output.
- Match keys to shard locations using key slot mapping tools.
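To make the slowlog review less manual, a small script can tally which keys recur. This sketch assumes the slowlog entries have already been reduced to plain argument lists (command name first), and that the key is the first argument after the command name. That holds for many single-key commands but not for all commands, so treat the result as a hint rather than a verdict. The sample entries are invented for illustration.

```python
from collections import Counter


def count_slowlog_keys(entries):
    """Count how often each value appears as the first argument after the command name."""
    counts = Counter()
    for args in entries:
        # Skip commands without arguments, such as PING.
        if len(args) >= 2:
            counts[args[1]] += 1
    return counts


# Hypothetical slowlog argument lists.
sample_entries = [
    ["HGETALL", "cart:42"],
    ["GET", "session:9"],
    ["HGETALL", "cart:42"],
    ["HSET", "cart:42", "sku", "1"],
    ["PING"],
]
print(count_slowlog_keys(sample_entries).most_common(1))
```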
Deeper Analysis
- Use INFO, CLIENT LIST, and SLOWLOG to examine request volume and latency on hot shards.
- For Redis Search workloads, inspect query and index behavior using FT.INFO and FT.PROFILE.
Simulate & Benchmark
- Reproduce traffic using tools like memtier_benchmark or internal scripts.
- Observe impact in monitoring dashboards to validate hot key behavior.
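For the internal-script route, one possible starting point is a generator that produces a deliberately skewed sequence of key names, which a load script can then replay against a test database. Everything here (the function name, the 80% hot share, the key names) is a hypothetical setup for illustration. It only builds the access pattern and does not talk to Redis.

```python
import random
from collections import Counter


def skewed_key_stream(keys, hot_key, hot_share, count, seed=0):
    """Return `count` key names where roughly `hot_share` of them are `hot_key`."""
    rng = random.Random(seed)  # seeded so runs are repeatable
    cold = [k for k in keys if k != hot_key]
    stream = []
    for _ in range(count):
        if rng.random() < hot_share or not cold:
            stream.append(hot_key)
        else:
            stream.append(rng.choice(cold))
    return stream


keys = [f"item:{i}" for i in range(10)]
stream = skewed_key_stream(keys, hot_key="item:0", hot_share=0.8, count=10_000)
print(Counter(stream).most_common(3))
```

Replaying the same seeded stream before and after a remediation makes the two dashboard readings easier to compare.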
Remediation Techniques
1. Break Up Hot Keys
Split large hashes or JSON objects into smaller, distributed structures.
- Reduces contention on a single shard.
- Improves cache locality and write throughput.
- Simplifies resharding and scaling.
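One way to split a large hash is to route each field to one of several smaller hashes based on a checksum of the field name. The sketch below only computes the target key names. The ":<index>" naming scheme and the bucket count are assumptions chosen for this example, and the sub-keys deliberately carry no hash tag so that they are free to land in different slots.

```python
import zlib


def bucket_key(base_key: str, field: str, buckets: int) -> str:
    """Map a hash field to one of `buckets` sub-keys of base_key, deterministically."""
    index = zlib.crc32(field.encode("utf-8")) % buckets
    return f"{base_key}:{index}"


def split_hash(base_key: str, mapping: dict, buckets: int) -> dict:
    """Group the fields of one large hash into several smaller hashes by sub-key."""
    parts = {}
    for field, value in mapping.items():
        parts.setdefault(bucket_key(base_key, field, buckets), {})[field] = value
    return parts


big_hash = {f"field{i}": i for i in range(1000)}
parts = split_hash("product:1", big_hash, buckets=8)
print({key: len(fields) for key, fields in sorted(parts.items())})
```

The trade-off is that reading the whole object now takes one read per sub-key, so this suits workloads that mostly touch individual fields.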
2. Manual Hash Tagging
Hash tags can be useful—but they often cause hot shards if overused.
How hash tags work
- A tag (for example, {user123}) forces all keys with that tag onto the same shard.
- This can help co-locate related data when affinity is needed.
Risks of overuse
- Hot shard creation: Many keys share the same tag.
- Performance degradation: Traffic concentrates on one shard.
- Rebalancing difficulty: Requires key redesign or resharding.
Best practices
- Use tags only for predictable, related workloads.
- Don’t use tags to “manually” spread load.
- Continuously monitor shard traffic for imbalances.
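A quick way to check for tag overuse is to measure how much of a key sample shares each hash tag. This is a minimal sketch that assumes you already have a list of key names (for example, from a scan of a test database). The helper names and the sample keys are made up for illustration.

```python
from collections import Counter


def hash_tag(key: str):
    """Return the content of the first non-empty {...} tag, or None if there is none."""
    start = key.find("{")
    if start == -1:
        return None
    end = key.find("}", start + 1)
    if end == -1 or end == start + 1:
        return None
    return key[start + 1:end]


def tag_share(keys):
    """Return the fraction of sampled keys that carry each hash tag."""
    tags = Counter(tag for tag in map(hash_tag, keys) if tag is not None)
    total = len(keys)
    return {tag: n / total for tag, n in tags.items()} if total else {}


sample_keys = ["{tenant1}:a", "{tenant1}:b", "{tenant1}:c", "{tenant2}:a", "plain"]
print(tag_share(sample_keys))
```

A single tag that covers a large share of the keyspace is a candidate for redesign, since all of those keys are pinned to one shard.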
3. Modify Partitioning Strategy
Redis Software manages slot allocation automatically. To improve balance:
- Increase shard count using the Web UI or REST API (not rladmin).
- This redistributes slot ranges and spreads traffic.
- After resharding, monitor shard utilization to verify improvement.
4. Reconfigure Redis Cluster
In a native Redis Cluster, slot rebalancing alone won’t fix hot shards—high-traffic keys still map to the same slot.
- Distribute shards across additional nodes to reduce noisy-neighbor impact.
- Redis automatically manages slot-to-shard mapping (manual assignment is not supported).
- Monitor performance after scaling to confirm better balance.
5. Set Alerts
Add observability rules for CPU, memory, and latency per shard using Grafana or your preferred monitoring platform.
- Enables early detection of skewed workloads.
- Supports proactive scaling or resharding actions.
6. Application-Level Sharding
If hot keys persist, redesign at the client layer:
- Update application logic or use cluster-aware client libraries to distribute traffic.
- Hash or namespace key patterns to spread load dynamically.
- Validate distribution using request metrics or keyspace stats.
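For read-heavy hot keys, one client-side pattern is to keep several copies of the value under different key names and have each read pick a copy at random. The sketch below shows only the key naming. The ":copy:<n>" suffix and the helper names are assumptions for this example, and the application remains responsible for writing every copy when the value changes.

```python
import random


def copy_keys(key: str, copies: int) -> list:
    """Return the key names that hold the copies of a hot value (write to all of them)."""
    return [f"{key}:copy:{i}" for i in range(copies)]


def pick_read_key(key: str, copies: int, rng=random) -> str:
    """Pick one copy at random so that reads spread across the copies."""
    return f"{key}:copy:{rng.randrange(copies)}"


print(copy_keys("config:global", 3))
print(pick_read_key("config:global", 3))
```

Because the copies have different names and no shared hash tag, they can map to different slots. The cost is extra memory and a short window in which copies may disagree after a write.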
Common Troubleshooting Tips
- If no obvious hot key exists, investigate data ingestion or background operations as possible culprits.
- For Redis Search issues, inspect and refactor query/index patterns that hit the same shard repeatedly.
- During short traffic spikes (e.g., sales or peak hours), log access trends to inform future workload adjustments.
For complex performance issues, gather shard metrics, logs, and key samples, and contact Redis Support with a detailed support package.