Resharding in Redis Software lets you increase shard count to scale throughput and memory capacity without downtime. It’s a critical operation when dataset growth or key imbalance starts to impact performance. This guide explains how to prepare and execute a resharding operation safely, including Prerequisites and Safety Checks, Step-by-Step Instructions, Validation and Monitoring, and Troubleshooting for common issues such as CROSSSLOT errors, hot keys, and rack-aware reshard failures.
Quick answer
You can only increase the number of shards in place. To reduce shard count, create a new database with fewer shards and migrate data to it.
Before resharding, ensure CPU and memory utilization are safely below capacity on all nodes (ideally <80%) and schedule the operation during a maintenance window to absorb transient latency spikes.
Always back up the database and, when possible, rehearse the reshard in a non‑production environment first.
Prefer doubling the shard count (x2) where possible (for example, 2 → 4 → 8) to simplify slot redistribution, validation, and capacity planning.
When resharding rack‑aware databases, ensure replication is enabled.
Prerequisites and Safety Checks
To minimize risk during resharding, review the following grouped requirements before proceeding:
Access and Environment Readiness
Admin Access: Use an account with administrator privileges on the Redis Enterprise Cluster (REC) UI or the rladmin CLI.
Maintenance Window: Perform the operation during a scheduled maintenance window to account for short-lived latency spikes.
Monitoring Setup: Ensure Redis Insight, Prometheus, or Grafana are configured to track shard balance, latency, and throughput during the process.
Cluster Resource Validation
Maintain CPU and memory utilization below 80% across all nodes, with additional free memory for rebalancing.
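As a rough illustration of the 80% guideline, the check below flags nodes that are at or above the limit. The function name, the input shape, and the sample readings are hypothetical; in practice the figures would come from your monitoring system.

```python
def nodes_over_threshold(node_usage, limit=0.80):
    """Return the names of nodes whose CPU or memory utilization is at or above the limit.

    node_usage maps a node name to {"cpu": fraction, "memory": fraction},
    where each fraction is in the range 0..1.
    """
    return sorted(
        name
        for name, usage in node_usage.items()
        if usage["cpu"] >= limit or usage["memory"] >= limit
    )


# Hypothetical readings, for illustration only.
usage = {
    "node:1": {"cpu": 0.42, "memory": 0.61},
    "node:2": {"cpu": 0.55, "memory": 0.83},
    "node:3": {"cpu": 0.30, "memory": 0.58},
}

blocked = nodes_over_threshold(usage)
if blocked:
    print(f"Free up capacity before resharding on: {blocked}")
else:
    print("All nodes are below the limit")
```

A stricter limit can be passed when extra headroom for rebalancing is wanted.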
Replication and Rack Awareness
Enable replication before resharding.
Rack-aware databases require replication for safe resharding. Each primary shard should have a replica in a different rack or zone.
If resharding fails in a rack-aware deployment:
Enable replication if not already enabled
Or temporarily disable rack awareness, complete the reshard, then re-enable it
If issues persist, collect a support package and contact Redis Support.
Data Protection and Testing
Backups: Back up your database before initiating the operation.
Staging Validation: Run a resharding test in a non-production environment to validate hashing and replication behavior.
Key and Hashing Policy Review
Multi-key Operations:
Use INFO COMMANDSTATS to identify multi-key commands (for example, MSET, MGET).
Ensure all related keys share the same hash tag (e.g., user:{123}:cart, user:{123}:profile) or use a custom hashing policy.
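A minimal sketch of how INFO COMMANDSTATS output could be scanned for multi-key commands. The command set below is an illustrative, non-exhaustive assumption, and the sample text is made up for the example.

```python
# Illustrative, non-exhaustive set of commands that can take several keys.
MULTI_KEY_COMMANDS = {
    "mget", "mset", "msetnx", "del", "unlink", "exists",
    "sunion", "sinter", "sdiff", "rename", "smove",
}


def multi_key_calls(commandstats_text):
    """Return {command: calls} for multi-key commands in INFO COMMANDSTATS output."""
    found = {}
    for line in commandstats_text.splitlines():
        line = line.strip()
        if not line.startswith("cmdstat_"):
            continue  # skip the section header and blank lines
        name, _, fields = line[len("cmdstat_"):].partition(":")
        stats = dict(f.split("=", 1) for f in fields.split(",") if "=" in f)
        if name.lower() in MULTI_KEY_COMMANDS:
            found[name.lower()] = int(stats.get("calls", 0))
    return found


# Hypothetical sample output, for illustration only.
sample = """# Commandstats
cmdstat_get:calls=2100,usec=9000,usec_per_call=4.29
cmdstat_mget:calls=340,usec=5100,usec_per_call=15.00
cmdstat_mset:calls=12,usec=300,usec_per_call=25.00
"""

print(multi_key_calls(sample))
```

Commands such as DEL or EXISTS only matter here when the application passes more than one key, so treat the result as a list of call sites to review rather than a list of confirmed problems.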
Large and Hot Keys
Large keys
Run the following to identify very large keys that may significantly slow or stall resharding (especially when copied or trimmed):
redis-cli -h <host> -p <port> --bigkeys
Consider:
Breaking very large values into smaller chunks.
Refactoring large hashes/lists/streams into multiple keys.
Hot keys
Run (if supported):
redis-cli -h <host> -p <port> --hotkeys
This command samples frequently accessed keys and helps identify hotspots.
If the command is unavailable or does not return results, use Redis Insight or contact Redis Support to profile hot keys.
Address hot keys by:
Splitting or sharding hot data across multiple keys
Reducing per-key contention through caching or request distribution
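One way to break a very large value into smaller chunks is sketched below. The `<key>:chunk:<n>` naming scheme and both helper functions are hypothetical, and the sketch only shows the splitting logic; writing the chunks to Redis is left to the application.

```python
def split_value(key, value, chunk_size):
    """Split a bytes value into {chunk_key: chunk} using '<key>:chunk:<n>' names."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    return {
        f"{key}:chunk:{offset // chunk_size}": value[offset:offset + chunk_size]
        for offset in range(0, len(value), chunk_size)
    }


def join_chunks(chunks):
    """Reassemble the original value by ordering chunks on their numeric suffix."""
    ordered = sorted(chunks, key=lambda k: int(k.rsplit(":", 1)[1]))
    return b"".join(chunks[k] for k in ordered)


payload = bytes(range(256)) * 10           # 2,560 bytes of sample data
chunks = split_value("report:2024", payload, chunk_size=1000)
print(sorted(chunks))                      # three chunk keys
print(join_chunks(chunks) == payload)      # round trip preserves the data
```

Because the chunk keys carry no shared hash tag, they may land on different shards, which spreads the memory. If the chunks must be read together with a multi-key command, a shared hash tag would keep them in one slot instead.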
Step-by-Step Resharding
Prepare Key Naming and Hashing
Review application key patterns and ensure multi-key operations use consistent hash tags or follow the defined hashing policy.
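To see why a shared hash tag avoids CROSSSLOT errors, the sketch below computes the slot for a key. It assumes the database's hashing policy behaves like open-source Redis Cluster hashing (CRC16 of the tag, modulo 16384); a custom hashing policy would give different results, and the function names are illustrative.

```python
import binascii

HASH_SLOTS = 16384  # slot count used by open-source Redis Cluster


def hashed_part(key):
    """Return the substring that is hashed: the first non-empty {...} tag, else the whole key."""
    start = key.find("{")
    if start != -1:
        end = key.find("}", start + 1)
        if end > start + 1:  # a closing brace exists and the tag is not empty
            return key[start + 1:end]
    return key


def key_slot(key):
    """CRC16 (XMODEM variant) of the hashed part, modulo the slot count."""
    return binascii.crc_hqx(hashed_part(key).encode(), 0) % HASH_SLOTS


for key in ("user:{123}:cart", "user:{123}:profile", "user:123:cart"):
    print(key, key_slot(key))
```

The first two keys share the tag `123` and therefore the same slot, so a multi-key command across them stays on one shard. The third key has no tag, is hashed as a whole, and is not guaranteed to share that slot.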
Initiate Resharding
You can reshard either from the REC UI or via the REST API (for Redis Software clusters you administer directly).
Option A: Using the UI
Go to Databases in the Redis Enterprise UI.
Select the target database.
Open Configuration → Shards (or the equivalent “Shards / Reshard” panel).
Increase the number of shards to the desired value.
Apply the configuration change to start the resharding process.
Option B: Using the REST API (Recommended for automation)
Use the revamp action to safely update the database topology, including shard count.
Step 1: Dry run (recommended)
Validate the change before execution:
curl -k -u "<user>:<password>" \
  -X PUT "https://<cm-host>/v1/bdbs/<uid>/actions/revamp?dry_run=true" \
  -H "Content-Type: application/json" \
  -d '{
    "shards_count": <new_count>
  }'
Step 2: Execute resharding
curl -k -u "<user>:<password>" \
  -X PUT "https://<cm-host>/v1/bdbs/<uid>/actions/revamp" \
  -H "Content-Type: application/json" \
  -d '{
    "shards_count": <new_count>
  }'
<uid>: Database ID
<new_count>: Desired number of primary shards
Step 3: Track progress
If the request is accepted, the response returns an action_uid. Track progress using:
GET /v1/actions/<action_uid>
Notes
Resharding is an online operation; the database remains available, though temporary latency increases may occur.
Always perform a dry run before executing topology changes.
For large datasets, progress may take time depending on key distribution and system load.
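For automation, the dry-run, execute, and track steps above could be wrapped roughly as follows. This is a sketch under assumptions: the terminal status values are assumed rather than taken from this guide, `fetch_status` is a caller-supplied function that performs GET /v1/actions/<action_uid> and returns the parsed JSON, and the helper names are hypothetical.

```python
import json
import time

# Assumed terminal values of the action's "status" field; verify against
# the REST API reference for your Redis Software version.
TERMINAL_STATUSES = {"completed", "failed", "cancelled"}


def revamp_request(cm_host, uid, shards_count, dry_run=False):
    """Build (method, url, body) for the revamp call shown in Steps 1 and 2."""
    url = f"https://{cm_host}/v1/bdbs/{uid}/actions/revamp"
    if dry_run:
        url += "?dry_run=true"
    return "PUT", url, json.dumps({"shards_count": shards_count})


def wait_for_action(fetch_status, action_uid, interval=5.0, timeout=3600.0,
                    sleep=time.sleep):
    """Poll fetch_status(action_uid) until a terminal status or the timeout."""
    waited = 0.0
    while True:
        status = fetch_status(action_uid).get("status")
        if status in TERMINAL_STATUSES:
            return status
        if waited >= timeout:
            raise TimeoutError(
                f"action {action_uid} still {status!r} after {waited:.0f}s"
            )
        sleep(interval)
        waited += interval
```

Passing the HTTP call in as `fetch_status` keeps authentication and TLS handling in the caller's hands and lets the polling loop be exercised without a live cluster.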
Validation and Monitoring
Shard Balance: Confirm even utilization with rladmin status extra all or the REC UI.
Key Distribution: Re-run redis-cli --bigkeys or review Redis Insight analytics for uneven key sizes.
Performance Metrics: Compare latency, throughput, and memory usage before and after the operation.
Cluster Health: Run rlcheck to ensure all nodes and processes are healthy.
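Shard balance can be reduced to a single number for before/after comparison. The sketch below uses the ratio of the fullest shard to the average; the metric, the function name, and the sample figures are illustrative assumptions, with the per-shard memory values read manually from rladmin or the UI.

```python
def imbalance_ratio(used_memory_by_shard):
    """Return max/mean used memory across shards (1.0 means perfectly even)."""
    values = list(used_memory_by_shard.values())
    if not values:
        raise ValueError("no shards given")
    mean = sum(values) / len(values)
    return max(values) / mean if mean > 0 else 1.0


# Hypothetical per-shard used memory in GB, for illustration only.
before = {"redis:1": 6.1, "redis:2": 5.9}
after = {"redis:1": 3.0, "redis:2": 3.1, "redis:3": 2.9, "redis:4": 3.0}

print(f"before: {imbalance_ratio(before):.2f}")
print(f"after:  {imbalance_ratio(after):.2f}")
```

A ratio that stays well above 1.0 after resharding points to uneven key distribution, for example a hash tag that concentrates many keys in one slot.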
Quick Fix Table
| Symptom | Likely Cause | Quick Resolution |
|---|---|---|
| CROSSSLOT errors | Multi-key operations span multiple hash slots | Use hash tags or adjust hashing policy. |
| Latency spikes | Large or hot keys during resharding | Split large keys, add RAM, or schedule off-peak. |
| Shard imbalance | Uneven key distribution | Re-run resharding or review hash tag configuration. |
| Reshard fails with rack awareness | Rack-aware DB missing prerequisites (often no replication) | Enable replication, or temporarily disable rack awareness, reshard, then re-enable. |
| Resharding stuck or slow | Node or process issue | Restart cnm_exec, check event_log.log or cnm_exec.log, verify resources. |
If resharding repeatedly fails:
Inspect /var/opt/redislabs/log/ (event_log.log, redis-ID.log) for migration errors.
Check node resource usage (RAM, CPU, disk).
Collect a support package (rladmin cluster debug_info) and contact Redis Support.
Best Practices
Avoid oversized keys: Keep keys <512 MB (ideally <300 MB) for better replication and migration performance.
Monitor hot keys: Regularly identify and refactor keys that overload a single shard.
Scale predictably: Increase shard counts in powers of two (2→4→8) for optimal distribution.
Test in staging: Validate the process in non-production environments first.
Schedule off-peak: Perform resharding during low-traffic hours.
Back up data: Always back up before any resharding activity.
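The arithmetic behind the powers-of-two recommendation can be illustrated with a simplified model. The sketch assumes 16384 hash slots split into contiguous, near-equal ranges, one per shard; how Redis Software actually assigns slots may differ, and the function name is hypothetical.

```python
def slot_ranges(shards, total_slots=16384):
    """Split slots 0..total_slots-1 into contiguous, near-equal (first, last) ranges."""
    base, extra = divmod(total_slots, shards)
    ranges, start = [], 0
    for i in range(shards):
        size = base + (1 if i < extra else 0)  # spread any remainder over the first shards
        ranges.append((start, start + size - 1))
        start += size
    return ranges


print(slot_ranges(2))
print(slot_ranges(4))
print(slot_ranges(3))
```

In this model, going from 2 to 4 shards splits each existing range exactly in half, so every shard hands off half of its slots and keeps the rest. Going from 2 to 3 moves the range boundaries to positions that do not line up with the old ones, which makes the resulting layout harder to predict and validate.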
References
For extended guidance, see
Database clustering in Redis Enterprise,
Performance tuning best practices, and
Troubleshooting distributed system issues in Redis Software.