Redis Software for Kubernetes (operator-managed REC/REDB) uses a Kubernetes node label (for example, topology.kubernetes.io/zone) to spread Redis Software cluster pods and database shards across failure domains for higher availability. This guide focuses on cases where rack awareness appears enabled but placement does not behave as expected. It walks through Quick Fix actions, a Configuration Checklist, Verification Steps, and Troubleshooting patterns you can apply in production.
Quick Fix
| If you see | Do this |
|---|---|
| REC pods are landing in the same zone | Validate node labels and eligibility. Confirm every eligible node has the same label key configured in spec.rackAwarenessNodeLabel, and that values differ across zones. If any eligible node is missing the label, fix labels or narrow eligibility with nodeSelector. |
| Pods are spread, but primaries and replicas still collide in one zone | Confirm the database has replication enabled, then re-optimize shard placement. Generate an optimized shard placement blueprint and apply it to rearrange shards. |
| You set rackAwarenessNodeLabel but rladmin verify rack_aware indicates rack awareness is not configured | On older Redis Software versions, enable rack-zone awareness at the cluster layer. Assign a rack_id per node and enable the rack_aware cluster policy via the management REST API, then verify again. On newer versions, this is automated. |
| Operator logs mention permissions or can’t read node labels | Fix RBAC for node reads. Ensure the operator ServiceAccount has list/get/watch permissions on nodes, then recheck reconciliation. |
| Rack awareness was correct, then drifted after pod restarts | Detect collisions, then correct with shard optimization. Use rladmin verify rack_aware, then generate and apply an optimized shard placement blueprint to restore proper distribution. |
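For the older-version case, where rladmin verify rack_aware reports that rack awareness is not configured, the cluster-layer change could be sketched roughly as follows. The helpers only build JSON payloads. The endpoint paths (PUT /v1/nodes/<uid> and PUT /v1/cluster/policy), port 9443, and the REC_API, REC_USER, and REC_PASS variables in the comments are assumptions based on the Redis Software REST API and should be checked against your version before use.

```shell
# Build the request body that assigns a rack-zone ID to one node.
# $1 = rack/zone value, for example the zone label value of the Kubernetes node.
rack_id_payload() {
  printf '{"rack_id": "%s"}\n' "$1"
}

# Build the request body that enables the rack_aware cluster policy.
rack_aware_policy_payload() {
  printf '{"rack_aware": true}\n'
}

# Hypothetical usage (host, credentials, and node UID 1 are placeholders):
#   curl -k -u "$REC_USER:$REC_PASS" -X PUT -H 'Content-Type: application/json' \
#     -d "$(rack_id_payload us-east-1a)" "https://$REC_API:9443/v1/nodes/1"
#   curl -k -u "$REC_USER:$REC_PASS" -X PUT -H 'Content-Type: application/json' \
#     -d "$(rack_aware_policy_payload)" "https://$REC_API:9443/v1/cluster/policy"
```

Repeat the first request once per node with that node's own rack-zone value, then rerun rladmin verify rack_aware.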
Prerequisites
Redis Enterprise for Kubernetes, deployed via the Redis Enterprise operator, using RedisEnterpriseCluster (REC) and RedisEnterpriseDatabase (REDB) custom resources.
At least 3 REC nodes for a meaningful zone spread (recommended).
Node labels are present and stable across the nodes you consider eligible for REC pods.
Operator RBAC can read node labels (list/get/watch nodes).
Database replication enabled if you expect primary/replica separation across zones.
Concepts
What rack awareness means in this deployment
Kubernetes layer (REC pods): the operator can schedule Redis Software cluster pods across nodes based on a label key you provide via spec.rackAwarenessNodeLabel. Read more in the RedisEnterpriseCluster API Reference.
Redis Software layer (shards): rack-zone awareness uses node rack-zone IDs to avoid placing primaries and their corresponding replicas in the same rack or zone, improving availability during a rack/zone failure. See the rack awareness examples.
Configuration Checklist
1) Choose and validate the node label key
Preferred label key: topology.kubernetes.io/zone (standard). Use your own custom label only if you manage it consistently.
Run:
kubectl get nodes -o custom-columns="name:metadata.name","zone:metadata.labels.topology\.kubernetes\.io/zone"
If output is empty for some nodes, label those nodes or restrict REC eligibility so only labeled nodes are considered.
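To make the label audit repeatable, a small filter over the same custom-columns output can list only the nodes that lack a zone value. This is a minimal sketch: the function name unlabeled_nodes is hypothetical, and it assumes kubectl prints <none> for a missing label in custom-columns output.

```shell
# Read "name zone" rows (kubectl custom-columns output without headers) on
# stdin and print the names of nodes whose zone column is empty or "<none>".
unlabeled_nodes() {
  awk '$2 == "" || $2 == "<none>" { print $1 }'
}

# Usage against a live cluster:
#   kubectl get nodes --no-headers \
#     -o custom-columns='name:metadata.name,zone:metadata.labels.topology\.kubernetes\.io/zone' \
#     | unlabeled_nodes
```

An empty result means every node carries the label. Any node name printed needs a label or must be excluded from REC eligibility.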
2) Ensure the operator can read node labels
If you see “permission denied” symptoms in operator logs or reconciliation, confirm your operator has RBAC to read nodes (list/get/watch). The rack awareness YAML example explicitly calls this out as a common failure mode.
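The node-read permission described above might be granted with a ClusterRole and ClusterRoleBinding like the ones this sketch prints. The resource name rec-node-reader, the namespace redis, and the ServiceAccount name redis-enterprise-operator are illustrative assumptions. Substitute the values from your own deployment.

```shell
# Print a ClusterRole and ClusterRoleBinding that let a ServiceAccount read nodes.
# $1 = namespace of the ServiceAccount, $2 = ServiceAccount name.
render_node_reader_rbac() {
  cat <<EOF
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: rec-node-reader
rules:
  - apiGroups: [""]
    resources: ["nodes"]
    verbs: ["get", "list", "watch"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: rec-node-reader
subjects:
  - kind: ServiceAccount
    name: $2
    namespace: $1
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: rec-node-reader
EOF
}

# Usage (namespace and ServiceAccount name are assumptions):
#   render_node_reader_rbac redis redis-enterprise-operator | kubectl apply -f -
#   kubectl auth can-i list nodes \
#     --as=system:serviceaccount:redis:redis-enterprise-operator
```

The kubectl auth can-i check should answer yes once the binding is in place. Nodes are cluster-scoped, so a namespaced Role is not sufficient here.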
3) Set rack awareness on the REC
In the REC manifest, set:
spec:
  rackAwarenessNodeLabel: topology.kubernetes.io/zone
This is the canonical YAML pattern for a rack-aware cluster in Redis Software for Kubernetes.
Important: The value must match the node label key exactly (case and punctuation).
4) Configure databases so rack awareness can take effect
Rack-aware shard placement requires both database-level rack awareness and replica shards to separate from their primaries.
Ensure spec.rackAware: true is set on the RedisEnterpriseDatabase so rack awareness is enabled at the database level. Read more in the RedisEnterpriseDatabase API Reference.
If replication is off, there's no replica shard to place in a different zone.
If capacity or topology is insufficient, some co-location may be unavoidable.
(Keep the database HA configuration aligned with your zone count and node capacity.)
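As an illustration of the database settings above, the sketch below prints a RedisEnterpriseDatabase manifest with both replication and rack awareness turned on. The database name, memory size, and shard count are placeholders, and the apiVersion shown is an assumption to verify against the CRDs installed in your cluster.

```shell
# Print a RedisEnterpriseDatabase manifest with replication and rack awareness on.
# $1 = database name (hypothetical); memorySize and shardCount are placeholders.
render_rack_aware_redb() {
  cat <<EOF
apiVersion: app.redislabs.com/v1alpha1
kind: RedisEnterpriseDatabase
metadata:
  name: $1
spec:
  memorySize: 1GB
  shardCount: 1
  replication: true   # creates the replica shards that rack awareness separates
  rackAware: true     # enables rack awareness at the database level
EOF
}

# Usage:
#   render_rack_aware_redb example-db | kubectl apply -f -
```

With replication off, the rackAware setting has no replica shard to act on, which is why the two fields are shown together.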
Verification Steps
A) Confirm REC pods are spread across zones
kubectl get pods -l app=redis-enterprise -o wide
Validate that the pods are scheduled on nodes with different label values (zones).
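Because the wide pod listing shows node names rather than zones, it can help to join pods to their nodes' zone labels and count pods per zone. The following is a rough sketch. The function name and the nodes.txt and pods.txt file names are hypothetical, and the node file is assumed to be non-empty.

```shell
# Count REC pods per zone.
# $1 = file of "node zone" rows, $2 = file of "pod node" rows (no headers).
pods_per_zone() {
  awk 'NR == FNR { zone[$1] = $2; next }
       { z = ($2 in zone) ? zone[$2] : "<unknown>"; count[z]++ }
       END { for (z in count) print z, count[z] }' "$1" "$2" | sort
}

# Usage:
#   kubectl get nodes --no-headers \
#     -o custom-columns='name:metadata.name,zone:metadata.labels.topology\.kubernetes\.io/zone' > nodes.txt
#   kubectl get pods -l app=redis-enterprise --no-headers \
#     -o custom-columns='pod:metadata.name,node:spec.nodeName' > pods.txt
#   pods_per_zone nodes.txt pods.txt
```

A healthy spread shows roughly equal counts across zones. A single zone holding every pod points back to the label and RBAC checks in the Configuration Checklist.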
B) Confirm rack awareness at the Redis Software layer
From a REC pod:
kubectl exec -it <rec-pod> -- rladmin verify rack_aware
If the output indicates rack awareness is not configured, treat it as a Redis Software configuration issue (see Troubleshooting).
If enabled, use the output to identify collisions (primary and replica in the same rack/zone).
C) If you enabled rack awareness after placement existed, re-optimize shard placement
After enabling rack-zone awareness for an existing database, generate an optimized shard placement blueprint and apply it to rearrange shards.
Troubleshooting
1) REC pods are not distributed across zones
Most common causes
Missing or inconsistent node labels on eligible nodes.
Typo in rackAwarenessNodeLabel (wrong key, wrong case, punctuation differences).
Operator RBAC cannot read nodes (list/get/watch denied).
Not enough eligible nodes per zone to satisfy spread.
Fix
Audit labels for all eligible nodes.
Confirm spec.rackAwarenessNodeLabel exactly matches the label key on nodes.
Fix RBAC if the operator can't read node metadata.
If you restrict nodes with nodeSelector, ensure the selector doesn't accidentally exclude entire zones.
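To check whether a nodeSelector leaves enough nodes in each zone, you could count eligible nodes per zone from the same kind of name/zone listing. This is a sketch only. The function name is hypothetical, and the label selector in the usage comment is a placeholder for whatever your nodeSelector expresses.

```shell
# Read "name zone" rows on stdin and print "zone node_count", one zone per line.
# Rows with an empty or "<none>" zone are ignored.
nodes_per_zone() {
  awk '$2 != "" && $2 != "<none>" { n[$2]++ }
       END { for (z in n) print z, n[z] }' | sort
}

# Usage (replace the selector with the labels from your nodeSelector):
#   kubectl get nodes -l '<key>=<value>' --no-headers \
#     -o custom-columns='name:metadata.name,zone:metadata.labels.topology\.kubernetes\.io/zone' \
#     | nodes_per_zone
```

If the output lists fewer zones than you expect, the selector is excluding a zone entirely.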
2) Pods are spread, but database shards aren't
Most common causes
Replication disabled on the database (no replica separation possible).
Rack-zone awareness not enabled at the Redis Software layer, even if pods appear spread by Kubernetes scheduling.
Shards were placed before rack awareness was enabled (needs explicit optimization and rearrangement).
Fix
Confirm DB replication is enabled (and topology/capacity supports your desired separation).
Verify rack awareness inside Redis Software (rladmin verify rack_aware).
Generate and apply an optimized shard placement blueprint:
GET /v1/bdbs/{uid}/actions/optimize_shards_placement
Apply via PUT /v1/bdbs/{uid} using the cluster-state-id response header and shards_blueprint body.
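The two-request blueprint workflow above can be sketched with curl and two small helpers. This assumes the GET response body is the blueprint itself and that the REST API listens on port 9443. REC_API, REC_USER, REC_PASS, the file names, and database UID 1 are placeholders.

```shell
# Extract the cluster-state-id value from a saved response-header dump on stdin.
# Header names are matched case-insensitively; trailing carriage returns are removed.
cluster_state_id() {
  awk -F': *' 'tolower($1) == "cluster-state-id" { gsub(/\r/, "", $2); print $2 }'
}

# Wrap a blueprint (JSON read from stdin) in the body used by the PUT request.
shards_blueprint_body() {
  printf '{"shards_blueprint": %s}\n' "$(cat)"
}

# Hypothetical usage:
#   curl -sk -u "$REC_USER:$REC_PASS" -D headers.txt -o blueprint.json \
#     "https://$REC_API:9443/v1/bdbs/1/actions/optimize_shards_placement"
#   curl -sk -u "$REC_USER:$REC_PASS" -X PUT \
#     -H 'Content-Type: application/json' \
#     -H "cluster-state-id: $(cluster_state_id < headers.txt)" \
#     -d "$(shards_blueprint_body < blueprint.json)" \
#     "https://$REC_API:9443/v1/bdbs/1"
```

Sending the cluster-state-id back with the PUT lets the cluster reject the blueprint if its state changed after the blueprint was generated, so rerun the GET if the PUT is refused.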
3) Drift after pod restarts
Redis Software can initially deploy pods and shards according to rack constraints, but it doesn't automatically maintain that distribution after pod restarts. If you see drift, plan for periodic detection and manual correction. Read more in Control node selection.
Fix
Detect collisions (rladmin verify rack_aware).
Re-optimize and apply shard placement blueprint (REST API workflow above).
Notes for supportability and operations
Design target: At least three REC nodes in distinct zones for HA.
Automation idea: Schedule a recurring check for rack collisions using rladmin verify rack_aware, and alert if collisions appear.
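One possible shape for that recurring check is a filter that scans the rladmin output for a collision marker. The exact wording of rladmin verify rack_aware output is not covered here, so the pattern is a caller-supplied assumption. Inspect real output from your Redis Software version before choosing it.

```shell
# Return 0 when stdin contains no line matching the collision pattern, 1 otherwise.
# $1 = extended regex that marks a collision in your rladmin output (assumption).
no_rack_collisions() {
  ! grep -Eiq "$1"
}

# Hypothetical scheduled usage (pod name and pattern are placeholders):
#   kubectl exec <rec-pod> -- rladmin verify rack_aware \
#     | no_rack_collisions '<pattern>' || echo "rack collision detected" >&2
```

Empty input also counts as no collisions, so a production check should separately confirm that the kubectl exec call itself succeeded.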