Redis Software deployed on Kubernetes uses the Redis Enterprise operator to manage RedisEnterpriseDatabase resources. If rladmin status shows ERROR: stale info for one or more shards and the Redis Enterprise UI displays Active Change pending, it usually means a database configuration change did not complete successfully and cluster state is no longer fully synchronized.
This most commonly occurs after running kubectl apply to create or modify a database where reconciliation was interrupted, cluster capacity was insufficient, or shard processes failed to initialize correctly. This guide provides immediate triage steps in the Quick Fix Table, explains what each symptom means, and then walks through Step-by-Step Troubleshooting and Recovery Options.
Prerequisites
kubectl access to the correct namespace
Access to Redis Enterprise cluster nodes (rladmin)
Permission to view RedisEnterpriseDatabase and RedisEnterpriseCluster resources
Access to operator logs
Quick Fix Table
| Symptom | What To Check Immediately |
|---|---|
| rladmin status shows ERROR: stale info (xxxxs) | Run rladmin cluster status and confirm all nodes are Online and connected. |
| Shards show N/A memory usage | Verify the Redis Enterprise cluster pods that host the shards are Running and not stuck in Pending or CrashLoopBackOff. |
| UI shows “Active Change pending” for an extended time | Check kubectl describe redb <db-name> for reconciliation errors or failed conditions. |
| Issue started right after kubectl apply | Validate the database YAML for shard count, memory, replication, and version compatibility. |
| Only specific shards affected | Confirm the hosting node is not Failed, recovering, or resource-constrained. |
What “ERROR: stale info” Means
The stale info state indicates that shard metadata has not been successfully refreshed in the cluster for an extended period. The timer (for example, 15917s) represents how long the shard has been reporting outdated state.
This typically happens when:
A node hosting the shard is unreachable
The shard process failed to start
Cluster heartbeat or gossip communication is interrupted
A database change operation was interrupted mid-process
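To see at a glance how long each shard has been stale, the timer can be pulled out of the status text and converted to hours and minutes. This is a minimal sketch that assumes the message keeps the ERROR: stale info (<seconds>s) shape shown in the Quick Fix Table; stale_ages is a hypothetical helper name, not part of rladmin.

```shell
# Hypothetical helper: read status text on stdin and print every stale timer
# both in seconds and as hours/minutes/seconds.
stale_ages() {
  # Keep only the number inside "stale info (<seconds>s)".
  sed -n 's/.*stale info (\([0-9][0-9]*\)s).*/\1/p' |
  while read -r secs; do
    printf '%ss = %dh%02dm%02ds\n' "$secs" \
      $((secs / 3600)) $((secs % 3600 / 60)) $((secs % 60))
  done
}
```

On a cluster node this could be used as rladmin status | stale_ages. A timer of 15917s prints as 4h25m17s, which helps match the start of the stale period to the time of the last kubectl apply.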
What “Active Change pending” Means
“Active Change pending” indicates that a database configuration change has not completed. Examples include:
Scaling shard count
Adding or removing replicas
Changing memory allocation
Modifying persistence settings
Updating replication topology
If reconciliation fails or is blocked, the database remains in a transitional state and shards may not fully initialize.
Step-by-Step Troubleshooting
1. Verify Database Resource Status
Check the RedisEnterpriseDatabase resource:
kubectl get redb -n <namespace>
kubectl describe redb <database-name> -n <namespace>
Look for:
Failed conditions
Allocation errors
Insufficient resources
Stuck Progressing state
If the resource remains in Progressing without change, reconciliation may be blocked.
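For a quick, scriptable view of the same information, the status fields can be read directly from the resource. The sketch below assumes the RedisEnterpriseDatabase resource exposes .status.status and .status.specStatus (the values behind the STATUS and SPEC STATUS columns of kubectl get redb); redb_state is a hypothetical helper name.

```shell
# Hypothetical helper: redb_state <database-name> <namespace>
# Prints the database status and the spec status reported on the REDB resource.
# Assumption: the resource exposes .status.status and .status.specStatus.
redb_state() {
  kubectl get redb "$1" -n "$2" \
    -o jsonpath='{.status.status}{" "}{.status.specStatus}{"\n"}'
}
```

Running redb_state <database-name> <namespace> a few minutes apart shows whether the state is moving. A status that stays at a transitional value such as active-change-pending across repeated checks suggests reconciliation is blocked rather than slow.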
2. Check Pod Health
kubectl get pods -n <namespace> -o wide
Confirm:
All Redis Enterprise cluster pods (which host the database shards) are Running
No pods are in CrashLoopBackOff
No pods are stuck in Pending
If pods are Pending:
kubectl describe pod <pod-name> -n <namespace>
Common causes include:
Insufficient CPU or memory
Node selector mismatch
Taints preventing scheduling
Storage or PVC binding failures
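In a busy namespace it can help to filter the pod list down to the pods that need attention. The sketch below reads the standard kubectl get pods table and keeps rows whose STATUS column is neither Running nor Completed; unhealthy_pods is a hypothetical helper name. Filtering on the STATUS column is deliberate: a pod in CrashLoopBackOff can still report the Running phase, so a phase-based field selector may miss it.

```shell
# Hypothetical helper: read `kubectl get pods` output on stdin and print the
# name and STATUS of every pod that is not Running or Completed.
# Column 3 is STATUS in both the default and the `-o wide` layouts.
unhealthy_pods() {
  awk 'NR > 1 && $3 != "Running" && $3 != "Completed" { print $1, $3 }'
}
```

Typical use: kubectl get pods -n <namespace> -o wide | unhealthy_pods. Empty output means no pod is Pending, crashing, or otherwise outside the Running state.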
3. Verify Cluster Health
On a Redis Enterprise node:
rladmin status
rladmin cluster status
Confirm:
All nodes are Online
No nodes are in Failed state
No quorum issues exist
Cluster state is consistent
If a node is unreachable, investigate:
Kubernetes node health
Network connectivity
Resource pressure
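The node check can also be scripted. The sketch below assumes the node table printed by rladmin status nodes has rows that start with node:<id> (with a leading * on the master node) and end with a STATUS column that reads OK for healthy nodes. That layout is an assumption and may differ between versions; nodes_not_ok is a hypothetical helper name.

```shell
# Hypothetical helper: read the rladmin node table on stdin and print the
# node ID and last column (assumed to be STATUS) of every node that is not OK.
nodes_not_ok() {
  awk '/^\*?node:[0-9]+/ && $NF != "OK" { sub(/^\*/, "", $1); print $1, $NF }'
}
```

Typical use on a cluster node: rladmin status nodes | nodes_not_ok. Any node listed here is a candidate for the Kubernetes node, network, and resource checks above.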
4. Confirm Shard Placement
Identify which node hosts the affected shards. If shards show:
N/A memory
ERROR: stale info
Confirm the hosting node:
Is Online
Has sufficient available memory
Is not configured as quorum_only
Is not under maintenance
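When several shards are affected, grouping them by node shows quickly whether one node is responsible. The sketch below assumes each shard row in the rladmin status output contains a node:<id> token and that stale shards carry the stale info text on the same row; stale_shards_by_node is a hypothetical helper name.

```shell
# Hypothetical helper: read rladmin shard rows on stdin and print each node
# that hosts stale shards, followed by the number of stale shards on it.
# Note: `grep -o` is not POSIX but is available in GNU, BSD, and BusyBox grep.
stale_shards_by_node() {
  grep 'stale info' | grep -o 'node:[0-9][0-9]*' | sort | uniq -c |
  awk '{ print $2, $1 }'
}
```

Typical use: rladmin status shards | stale_shards_by_node. If every stale shard maps to a single node, focus on that node first; stale shards spread across several nodes point to a cluster-wide problem.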
5. Validate the Applied YAML
Review the database specification:
kubectl get redb <database-name> -n <namespace> -o yaml
Verify:
Memory size fits cluster capacity
Shard count is supported by available nodes
Replication factor is achievable
Redis version is supported
Persistence configuration is valid
Common misconfigurations:
Requesting more shards than the cluster can host (capacity is limited by node resources and the license shard limit)
Insufficient memory for replication
Changing shard count and replication simultaneously
Applying topology changes during node instability
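A quick calculation before applying a change makes the capacity checks concrete. The sketch below assumes that enabling replication doubles the number of shards to place (one replica per primary) and that the configured memory size includes the replica shards, so the usable dataset size is roughly half. redb_footprint is a hypothetical helper name, and it works in whole gigabytes only.

```shell
# Hypothetical helper: redb_footprint <shardCount> <replication: true|false> <memory-GB>
# Prints the total shards to place and the approximate usable dataset size.
# Assumptions: replication adds one replica per primary shard, and the
# memory limit covers primaries and replicas together.
redb_footprint() {
  shards=$1; replication=$2; mem_gb=$3
  if [ "$replication" = "true" ]; then
    echo "total_shards=$((shards * 2)) dataset_gb=$((mem_gb / 2))"
  else
    echo "total_shards=$shards dataset_gb=$mem_gb"
  fi
}
```

For example, redb_footprint 4 true 8 prints total_shards=8 dataset_gb=4. Compare the total against the free shard slots and memory on the nodes before running kubectl apply.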
6. Review Operator Logs
kubectl logs deployment/redis-enterprise-operator -n <namespace>
Look for:
Allocation failures
Scheduling errors
Resource constraint violations
Internal API failures
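Operator logs can be long, so a keyword filter is a reasonable first pass. The sketch below is a hypothetical helper (operator_errors) with an illustrative keyword list; it is not an official set of operator messages and may need tuning for your version.

```shell
# Hypothetical helper: read log lines on stdin and keep those that mention
# common failure keywords (case-insensitive). The keyword list is illustrative.
operator_errors() {
  grep -iE 'error|fail|insufficient|unschedulable'
}
```

Typical use: kubectl logs deployment/redis-enterprise-operator -n <namespace> --since=1h | operator_errors. Limiting the window with --since keeps the output close to the time of the failed change.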
Recovery Options
Option 1: Correct and Re-Apply
If misconfiguration is identified, correct the YAML and reapply:
kubectl apply -f <corrected-file.yaml>
Monitor:
kubectl get redb -w
Option 2: Delete and Recreate (New Database Only)
If the database was newly created and does not contain critical data:
kubectl delete redb <database-name> -n <namespace>
kubectl apply -f <corrected-file.yaml>
Option 3: Restore or Engage Support
If this occurred during a production change:
Do not force-delete shards
Avoid manual metadata changes
Collect cluster status and operator logs
Generate a support package from all participating clusters if using CRDB or Replica Of
When to Escalate
Engage Redis Support if:
Multiple nodes show stale info
Cluster metadata appears inconsistent
Shards fail to attach after node recovery
CRDB or cross-cluster replication is involved
Reconciliation repeatedly fails without clear resource constraints
Before escalation, collect:
rladmin status
rladmin cluster status
kubectl describe redb
Operator logs
Support package from relevant clusters
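Most of this collection can be gathered into one directory with a short script. The sketch below is a hypothetical helper (collect_diagnostics); the output file names are arbitrary, and the pod argument is assumed to be a Redis Enterprise cluster pod in which rladmin is available. It covers the kubectl items and rladmin status only. The support package still needs to be generated separately.

```shell
# Hypothetical helper:
#   collect_diagnostics <namespace> <database-name> <rec-pod> <output-dir>
# Saves the resource description, resource YAML, operator logs, and
# rladmin status output as text files in <output-dir>.
collect_diagnostics() {
  ns=$1; db=$2; pod=$3; out=$4
  mkdir -p "$out"
  kubectl describe redb "$db" -n "$ns" > "$out/redb-describe.txt" 2>&1
  kubectl get redb "$db" -n "$ns" -o yaml > "$out/redb.yaml" 2>&1
  kubectl logs deployment/redis-enterprise-operator -n "$ns" > "$out/operator.log" 2>&1
  # rladmin runs inside a Redis Enterprise cluster pod (assumption: <rec-pod>).
  kubectl exec -n "$ns" "$pod" -- rladmin status > "$out/rladmin-status.txt" 2>&1
  echo "Diagnostics written to $out"
}
```

Each command writes both standard output and errors to its file, so a failing command still leaves a record of why it failed.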
Prevention Best Practices
Validate cluster capacity before scaling shards or replicas
Avoid combining shard count and replication changes in one operation
Confirm all nodes are Online before applying topology changes
Monitor operator logs during database modifications
Test configuration changes in staging environments