Redis Pub/Sub delivers ultra-low latency, fire-and-forget message broadcasting using PUBLISH, SUBSCRIBE, and related commands. While it is extremely efficient for real-time fan-out, high message rates, large payloads, pattern subscriptions, and excessive subscriber connections can drive significant CPU and network load. In Redis OSS Cluster, global Pub/Sub can amplify cluster-bus traffic across shards, while in Redis Enterprise (Redis Software) and Redis Cloud, extreme message rates or high connection density per shard can cause localized CPU pressure.
Note: Pub/Sub behavior and scaling characteristics differ between Redis Open Source (standalone vs. Cluster) and Redis Software/Redis Cloud. This article calls out those differences where they matter most.
This article covers where Pub/Sub load originates, how to measure and confirm Pub/Sub as the bottleneck, architecture decisions that reduce CPU impact (including Sharded Pub/Sub and Redis Software/Redis Cloud behavior), application-level optimization strategies, when to use Streams or other data structures instead of Pub/Sub, and field-tested troubleshooting patterns.
Where Pub/Sub Load Comes From
Pub/Sub is broadcast by design. Each PUBLISH operation results in:
Delivery to every subscriber of the target channel
Delivery to every matching pattern subscription
Network transmission to each subscribed client
Output buffer management per subscriber
CPU and server load increase when:
Message rate is high (messages/sec)
Payload size is large
Subscriber count per channel is large
Broad PSUBSCRIBE patterns are used
Subscribers are slow to read
Global Pub/Sub is used in large OSS clusters
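As a rough illustration of how these factors multiply, the sketch below estimates per-second deliveries and payload egress for a single channel. The function name and the sample numbers are hypothetical, and the model ignores protocol framing and pattern matching, so treat the result as an order-of-magnitude guide and not a measurement.

```python
def estimate_fanout(messages_per_sec: int, subscribers: int, payload_bytes: int) -> dict:
    """Estimate broadcast work for one channel (illustrative model only)."""
    # Each message is written once to every subscriber of the channel.
    deliveries = messages_per_sec * subscribers
    # Payload bytes only; protocol framing overhead is ignored.
    egress_bytes = deliveries * payload_bytes
    return {"deliveries_per_sec": deliveries, "egress_bytes_per_sec": egress_bytes}


# Hypothetical workload: 2,000 msg/s, 500 subscribers, 1 KiB payloads.
estimate = estimate_fanout(2_000, 500, 1_024)
print(estimate)
```

With these sample numbers the server performs one million socket writes and pushes about 1 GB of payload per second, which is why halving any single factor halves the total.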
OSS Cluster Behavior
In Redis OSS Cluster (prior to Redis 7 Sharded Pub/Sub), every Pub/Sub message is propagated to every shard. Cluster traffic therefore grows with:
message_rate × shard_count
This can saturate the cluster bus bandwidth and increase CPU usage on all nodes.
Important: This amplification applies to “global” Pub/Sub behavior (PUBLISH/SUBSCRIBE) in Redis OSS Cluster. If you can use Redis 7+, prefer Sharded Pub/Sub to avoid cluster-wide propagation.
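To make the growth concrete, the following sketch applies the message_rate × shard_count relationship to a fixed publish rate at several cluster sizes. The function name and the 5,000 msg/s figure are assumptions chosen for illustration only.

```python
def propagated_messages_per_sec(message_rate: int, shard_count: int) -> int:
    """Apply the article's model: propagation work grows as message_rate x shard_count."""
    return message_rate * shard_count


# Hypothetical publish rate of 5,000 msg/s at three cluster sizes.
for shards in (3, 6, 12):
    print(f"{shards} shards: {propagated_messages_per_sec(5_000, shards)} propagated msg/s")
```

Under this model, doubling the shard count doubles the propagation work even though the application publishes at the same rate.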
Redis 7+ Sharded Pub/Sub
Redis 7 introduced Sharded Pub/Sub (SPUBLISH, SSUBSCRIBE). Messages are routed to the shard owning the channel’s hash slot, eliminating global broadcast across all shards.
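Shard channels are assigned to hash slots with the same algorithm Redis Cluster uses for keys: CRC16 of the name modulo 16384, with hash-tag handling. The sketch below reproduces that calculation so you can check which channels share a slot. The function name is hypothetical; it relies on Python's `binascii.crc_hqx`, which implements the same CRC16 variant (XMODEM) when seeded with 0.

```python
import binascii


def hash_slot(channel: str) -> int:
    """Compute the Redis Cluster hash slot for a channel (or key) name."""
    data = channel.encode("utf-8")
    start = data.find(b"{")
    if start != -1:
        end = data.find(b"}", start + 1)
        # A non-empty {...} section is a hash tag: only its content is hashed.
        if end != -1 and end != start + 1:
            data = data[start + 1:end]
    return binascii.crc_hqx(data, 0) % 16384


# Channels that share a hash tag land in the same slot, and so on the same shard.
print(hash_slot("{tenant42}.orders") == hash_slot("{tenant42}.payments"))
```

Hash tags are useful when related channels should stay together, but they also concentrate traffic: many busy channels with one tag all load the same shard.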
Redis Software / Redis Cloud Behavior
Redis Software / Redis Cloud differs architecturally:
Channel names hash to specific shards (like keys)
The proxy multiplexes publishers into high-density shard connections
Messages aren't broadcast across shards
Subscribers maintain connections to each shard
In Redis Software/Redis Cloud, Pub/Sub channels exist within a single database (DB). Messages are not “cluster-wide” across different databases.
Typical bottlenecks in Redis Software / Redis Cloud are:
Extremely high per-shard message rates
Large payload sizes
Very high subscriber connection counts per shard
Confirm Pub/Sub Is the Bottleneck
Before modifying the architecture, verify Pub/Sub is responsible for CPU load.
1. Correlate CPU with Publish Activity
Check:
CPU utilization per node or shard
instantaneous_ops_per_sec
total_net_input_bytes
total_net_output_bytes
Look for correlation between:
CPU spikes
PUBLISH rate increases
Subscriber count growth
Network throughput spikes
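One way to perform this correlation is to take two INFO snapshots a known interval apart and convert the cumulative counters into rates. The sketch below assumes each snapshot is a dictionary of INFO fields (for example, what redis-py's `info()` returns). The function name is hypothetical, and `used_cpu_sys` and `used_cpu_user` are INFO fields added here beyond the list above.

```python
def info_rates(before: dict, after: dict, interval_sec: float) -> dict:
    """Convert two INFO snapshots into per-second rates."""
    if interval_sec <= 0:
        raise ValueError("interval_sec must be positive")

    def rate(field: str) -> float:
        # Cumulative counter delta divided by elapsed time.
        return (float(after[field]) - float(before[field])) / interval_sec

    return {
        "net_in_bytes_per_sec": rate("total_net_input_bytes"),
        "net_out_bytes_per_sec": rate("total_net_output_bytes"),
        # used_cpu_* are cumulative CPU seconds, so the rate is "cores in use".
        "cpu_cores_used": rate("used_cpu_sys") + rate("used_cpu_user"),
        # Already a rate; report the latest sample as-is.
        "ops_per_sec": float(after["instantaneous_ops_per_sec"]),
    }
```

Output bytes that far exceed input bytes while CPU climbs is consistent with fan-out: a small number of PUBLISH commands producing many deliveries.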
2. Inspect Pub/Sub Fan-Out
Use:
PUBSUB CHANNELS
PUBSUB NUMSUB <channel>
CLIENT LIST (filter for subscribed clients)
INFO stats
CONFIG GET notify-keyspace-events
Look for:
Channels with extremely high subscriber counts
Heavy PSUBSCRIBE usage
Large client output buffers
Unnecessary keyspace notifications generating Pub/Sub traffic
Tip: In Redis Enterprise/Redis Cloud, also confirm whether CPU/egress is concentrated on a subset of shards, which often indicates a small set of “hot” channels mapped to those shards.
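The fan-out inspection above can be scripted. The sketch below ranks channels by subscriber count using PUBSUB CHANNELS and PUBSUB NUMSUB. It assumes a redis-py style client exposing `pubsub_channels()` and `pubsub_numsub()`; the function name is hypothetical.

```python
def hot_channels(client, top_n: int = 10) -> list:
    """Return the top_n (channel, subscriber_count) pairs, busiest first."""
    channels = client.pubsub_channels()        # PUBSUB CHANNELS
    if not channels:
        return []
    counts = client.pubsub_numsub(*channels)   # PUBSUB NUMSUB -> [(channel, count), ...]
    return sorted(counts, key=lambda pair: pair[1], reverse=True)[:top_n]
```

These commands do not count pattern subscribers (PUBSUB NUMPAT reports the number of active patterns separately), and in Redis OSS Cluster they describe only the node that receives the command, so run the check against each node.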
Architecture Changes That Reduce CPU Load
1. Use Sharded Pub/Sub (Redis 7+ OSS Cluster)
For high-traffic channels:
Identify hot channels.
Create shard channel equivalents.
Update publishers to use:
SPUBLISH <channel> <payload>
Update subscribers to:
SSUBSCRIBE <channel>
Gradually phase out global channels.
Result:
Eliminates cluster-wide propagation.
Reduces CPU and network load across all shards.
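One possible way to phase the migration is a publisher that can send to the global channel, the shard channel, or both while subscribers move over. This is a sketch and not a prescribed procedure: the function name and the `mode` values are hypothetical, and it assumes a cluster-aware client whose `execute_command` routes SPUBLISH to the shard that owns the channel.

```python
def publish_event(client, channel: str, payload: str, mode: str = "dual") -> list:
    """Publish during a phased migration. mode is 'global', 'dual', or 'sharded'."""
    if mode not in ("global", "dual", "sharded"):
        raise ValueError(f"unknown mode: {mode}")
    sent = []
    if mode in ("global", "dual"):
        client.execute_command("PUBLISH", channel, payload)    # legacy subscribers
        sent.append("PUBLISH")
    if mode in ("dual", "sharded"):
        client.execute_command("SPUBLISH", channel, payload)   # migrated subscribers
        sent.append("SPUBLISH")
    return sent
```

Global and shard channels are separate namespaces, so a subscriber that uses only SUBSCRIBE or only SSUBSCRIBE receives each event once. Dual mode temporarily adds load, so keep the overlap window short and switch to `sharded` once the last SUBSCRIBE client is gone.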
2. Scale Properly in Redis
Redis Open Source Cluster:
If you are using “global” Pub/Sub (PUBLISH/SUBSCRIBE), adding more shards can increase the amount of cluster-wide propagation work.
First priority: migrate hot paths to Sharded Pub/Sub (Redis 7+), then scale by adding shards if needed.
Redis Software / Redis Cloud:
Add shards to increase Pub/Sub throughput.
Rebalance databases to distribute channel load.
Monitor per-shard CPU rather than cluster-wide averages.
Each shard adds independent Pub/Sub capacity, but a single hot channel can still concentrate work on one shard.
3. Isolate Pub/Sub Workloads
If Pub/Sub traffic competes with key-value workloads:
Move Pub/Sub traffic to a dedicated database.
In high-volume cases, deploy a separate cluster for Pub/Sub.
Isolation prevents cross-impact between caching/transactions and messaging.
Application-Level Optimization Strategies
Most Pub/Sub CPU issues originate at the application layer.
1. Reduce Fan-Out Per Message
Partition channels by region, tenant, or topic.
Avoid global broadcast channels when possible.
Do not create per-user channels unless absolutely necessary.
Better:
events:{region}
Instead of:
events
2. Minimize Pattern Subscriptions
PSUBSCRIBE requires pattern matching on every message.
Avoid:
PSUBSCRIBE *
PSUBSCRIBE events.*
Prefer explicit:
SUBSCRIBE events.us
Broad patterns significantly increase CPU usage.
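When the set of partitions is known in advance, a pattern can be replaced by an explicit channel list built in the application. The sketch below assumes a redis-py style PubSub object with a `subscribe(*channels)` method; the helper names and the region list are hypothetical.

```python
def explicit_channels(prefix: str, partitions: list, separator: str = ".") -> list:
    """Build explicit channel names such as 'events.us' from a known partition list."""
    return [f"{prefix}{separator}{partition}" for partition in partitions]


def subscribe_explicit(pubsub, prefix: str, partitions: list) -> list:
    """Subscribe to each explicit channel instead of a pattern like 'events.*'."""
    channels = explicit_channels(prefix, partitions)
    pubsub.subscribe(*channels)   # a single SUBSCRIBE command with several channels
    return channels


print(explicit_channels("events", ["us", "eu"]))
```

The trade-off is that the subscriber must learn about new partitions itself, for example from configuration, instead of picking them up automatically through the pattern.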
3. Shrink Payload Size
Large payloads increase:
CPU serialization cost
Network overhead
Client buffer pressure
Best practice:
Publish IDs or references
Store full payloads in keys or external storage
Aggregate high-frequency events
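A minimal sketch of the "publish a reference" approach follows. It stores the full payload under a short-lived key and publishes only the key name. The `payload:` key prefix, the 60-second TTL, and the function name are assumptions; the client is expected to expose redis-py style `set(name, value, ex=...)` and `publish(channel, message)` methods.

```python
import json
import uuid


def publish_reference(client, channel: str, payload: dict, ttl_sec: int = 60) -> str:
    """Store the payload under an expiring key and publish only the key name."""
    key = f"payload:{uuid.uuid4().hex}"              # hypothetical key naming scheme
    client.set(key, json.dumps(payload), ex=ttl_sec)  # expires even if nobody reads it
    client.publish(channel, key)                      # small message: just the reference
    return key
```

Subscribers then fetch the body with GET when they need it. This helps most when payloads are large and only some subscribers need the full content; if every subscriber reads every body, the work moves from Pub/Sub delivery to GET traffic instead of disappearing.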
4. Protect Against Slow Consumers
Slow subscribers cause output buffer growth and memory pressure.
Mitigations:
Tune:
client-output-buffer-limit pubsub
Ensure subscribers continuously read from sockets.
Disconnect non-responsive clients.
Avoid performing heavy processing on the subscriber read loop.
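To keep heavy work off the read loop, a subscriber can hand each message to a worker thread through a bounded queue. The sketch below is one way to do that, with error handling omitted. `messages` is any iterable of messages (for example, the generator returned by redis-py's `pubsub.listen()`), and the function name, the queue size, and the drop-when-full policy are assumptions.

```python
import queue
import threading


def run_subscriber(messages, handler, max_pending: int = 1000) -> int:
    """Drain `messages` quickly and process them on a worker thread.

    Returns the number of messages dropped because the local queue was full.
    """
    pending = queue.Queue(maxsize=max_pending)
    dropped = 0

    def worker():
        while True:
            item = pending.get()
            if item is None:          # sentinel: the message source is exhausted
                break
            handler(item)             # slow processing happens off the read loop

    thread = threading.Thread(target=worker, daemon=True)
    thread.start()
    for message in messages:          # read loop: never blocks on processing
        try:
            pending.put_nowait(message)
        except queue.Full:
            dropped += 1              # shed load locally instead of stalling the socket
    pending.put(None)
    thread.join()
    return dropped
```

Dropping locally is a deliberate choice in this sketch: it keeps the socket drained so the server-side output buffer stays small, at the cost of losing messages the application could not process in time.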
5. Rate Limit or Batch Publishers
Pub/Sub has no back-pressure.
Implement:
Rate limiting at publishers
Event aggregation (publish every second instead of per event)
Debounce duplicate updates
Bursty publishers are a common cause of CPU spikes.
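Aggregation and debouncing can be combined in a small publisher wrapper that keeps only the latest value per key and publishes one batch per interval. The class below is a sketch under assumptions: the class name, the JSON batch format, and the one-second default are hypothetical, and the client is expected to expose a redis-py style `publish(channel, message)` method.

```python
import json
import time


class CoalescingPublisher:
    """Keep the latest value per key and publish one batch per interval."""

    def __init__(self, client, channel, interval_sec=1.0, clock=time.monotonic):
        self._client = client
        self._channel = channel
        self._interval = interval_sec
        self._clock = clock
        self._pending = {}
        self._last_flush = clock()

    def update(self, key, value):
        self._pending[key] = value    # a newer value overwrites the older one (debounce)
        if self._clock() - self._last_flush >= self._interval:
            self.flush()

    def flush(self):
        """Publish pending updates as one message; returns how many keys were sent."""
        self._last_flush = self._clock()
        if not self._pending:
            return 0
        batch, self._pending = self._pending, {}
        self._client.publish(self._channel, json.dumps(batch))
        return len(batch)
```

In this sketch a flush is triggered only by the next `update` call, so a real deployment would also call `flush()` from a timer or on shutdown to send the final batch.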
6. Reduce Connection Count
Each subscriber connection consumes CPU and memory.
Best practices:
Share subscription connections within application frameworks.
Use an intermediary layer (e.g., WebSocket server) to fan-out to users.
In Redis, monitor per-shard connection limits (default 10k).
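A shared subscription layer can be as simple as one Redis subscription per process that dispatches to many in-process listeners, such as WebSocket sessions. The sketch below assumes a redis-py style PubSub object with `subscribe(channel)` and messages shaped like redis-py's dictionaries (`type`, `channel`, `data`); the class and method names are hypothetical.

```python
from collections import defaultdict


class SharedSubscription:
    """Fan one upstream Redis subscription out to many in-process listeners."""

    def __init__(self, pubsub):
        self._pubsub = pubsub
        self._listeners = defaultdict(list)

    def add_listener(self, channel, callback):
        first = not self._listeners[channel]
        self._listeners[channel].append(callback)
        if first:
            # Only the first local listener costs a SUBSCRIBE on the shared connection.
            self._pubsub.subscribe(channel)

    def dispatch(self, message) -> int:
        """Deliver one upstream message locally; returns the number of callbacks run."""
        if message.get("type") != "message":
            return 0                  # ignore subscribe/unsubscribe confirmations
        callbacks = self._listeners.get(message["channel"], [])
        for callback in callbacks:
            callback(message["data"])
        return len(callbacks)
```

With this layout, a thousand end users on one application server cost Redis a single subscriber connection. Note that redis-py returns channel names as bytes unless the client is created with `decode_responses=True`, so register listeners with the matching type.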
7. Disable Unnecessary Keyspace Notifications
Keyspace notifications can generate additional Pub/Sub traffic and CPU usage.
If you don’t use them, disable them using the appropriate mechanism for your environment:
Redis Open Source / self-managed: adjust notify-keyspace-events.
Managed services: use the service’s supported configuration controls (direct CONFIG may be restricted).
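For self-managed deployments, the check and the change can be scripted. The sketch below assumes a redis-py style client with `config_get(pattern)` returning a dictionary and `config_set(name, value)`; the function names are hypothetical. An empty `notify-keyspace-events` value means notifications are disabled.

```python
def keyspace_notifications_enabled(client) -> bool:
    """Return True when notify-keyspace-events holds any flags."""
    settings = client.config_get("notify-keyspace-events")
    return settings.get("notify-keyspace-events", "") != ""


def disable_keyspace_notifications(client) -> bool:
    """Disable notifications if enabled; returns True when a change was made."""
    if not keyspace_notifications_enabled(client):
        return False
    client.config_set("notify-keyspace-events", "")   # empty string disables them
    return True
```

Before disabling, confirm that no application or framework depends on these events, since some libraries subscribe to expiry notifications without making it obvious.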
When Pub/Sub Is the Wrong Tool
Pub/Sub is at-most-once and non-persistent. If a subscriber disconnects or can’t keep up, messages are not replayed and may be permanently lost.
Use alternatives when you need:
| Requirement | Recommended Structure |
|---|---|
| Replay messages | Redis Streams |
| Back-pressure | Streams |
| Per-consumer acknowledgment | Streams |
| Durable queue | Lists or Streams |
| Priority queue | Sorted Sets |
Moving durability-heavy or replay-oriented workloads to Streams often frees substantial CPU capacity for real-time Pub/Sub traffic.
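For comparison with Pub/Sub, the sketch below shows the basic Streams pattern: a producer appends with XADD, and a consumer in a group reads with XREADGROUP and acknowledges each entry with XACK. It assumes a redis-py style client (`xadd`, `xreadgroup`, `xack`) and a consumer group that already exists (created with XGROUP CREATE). The function names and the `maxlen` value are hypothetical.

```python
def produce(client, stream: str, fields: dict, maxlen: int = 10_000):
    """Append an entry; maxlen caps stream growth (approximate trimming by default)."""
    return client.xadd(stream, fields, maxlen=maxlen)


def consume_once(client, stream: str, group: str, consumer: str, handler, count: int = 10) -> int:
    """Read up to `count` new entries for this consumer, handle them, and acknowledge.

    Returns the number of entries acknowledged.
    """
    # ">" asks for entries never delivered to any consumer in the group.
    entries = client.xreadgroup(group, consumer, {stream: ">"}, count=count)
    acked = 0
    for _stream_name, messages in entries:
        for message_id, fields in messages:
            handler(fields)
            client.xack(stream, group, message_id)   # acknowledge only after handling
            acked += 1
    return acked
```

Because the consumer pulls entries at its own pace and unacknowledged entries remain pending, a slow or restarted consumer can catch up instead of losing messages.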
Troubleshooting Patterns
High CPU on All Shards (OSS Cluster)
Likely cause: Global Pub/Sub broadcast
Fix: Migrate to Sharded Pub/Sub (Redis 7+)
High CPU on a Subset of Shards (Redis Software / Redis Cloud)
Likely cause: A small number of hot channels hashed to those shards (high message rate and/or large fan-out).
Fix: Partition channels, reduce fan-out/payload, and scale/rebalance to spread hot-channel load.
CPU Spikes During Traffic Bursts
Likely cause: Unthrottled publisher
Fix: Rate limit or batch events
Excessive Subscriber Connections
Likely cause: One connection per end user
Fix: Introduce shared subscription layer
High CPU with Moderate Message Rate
Likely cause: Broad pattern subscriptions
Fix: Replace PSUBSCRIBE with explicit channels
Messages “Lost”
Likely cause: At-most-once semantics
Fix: Use Redis Streams
Validation Checklist
After mitigation, confirm:
Subscriber count per hot channel has been reduced
Pattern subscriptions narrowed or removed
Average payload size decreased
Output buffers are stable
CPU flattened during previous spike windows
Network traffic proportional to message rate
Key Takeaways
Pub/Sub CPU cost scales with message rate × subscriber count × payload size.
Pattern subscriptions and global broadcasts are common multipliers.
Sharded Pub/Sub dramatically reduces cluster-wide amplification.
Redis scales horizontally but can still bottleneck per shard.
Most CPU problems originate from application design, not Redis itself.
Streams are often a better fit for durable or high-fan-out replay use cases.