Redis Software enables scalable Bloom filters via the RedisBloom module, allowing explicit sharding across nodes to optimize memory use and performance. This article explains how to distribute a Bloom filter across multiple Redis shards, using cluster-aware techniques like hash tags, slot-aware key naming, and custom partitioning logic. Topics include Bloom Filter Basics, choosing a Sharding Strategy, Implementation Steps, Operational Considerations, and Troubleshooting & Resources.
Bloom Filter Basics
A Bloom filter is a probabilistic structure that efficiently checks for set membership with no false negatives and a tunable false positive rate. Redis Enterprise supports this via RedisBloom with commands like BF.RESERVE, BF.ADD, BF.EXISTS, and BF.INFO.
- Common use cases: malware hash tracking, ad deduplication, IP blacklists.
- False positives can be reduced by pre-sizing filters correctly.
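The effect of pre-sizing can be estimated with the textbook Bloom filter formulas. The sketch below is a rough planning aid under that assumption; `bloom_parameters` is a hypothetical helper, and the memory RedisBloom actually allocates may differ from these figures.

```python
import math


def bloom_parameters(capacity: int, error_rate: float) -> dict:
    """Textbook sizing for a filter holding `capacity` items at `error_rate`."""
    if capacity <= 0 or not 0.0 < error_rate < 1.0:
        raise ValueError("capacity must be > 0 and error_rate must be in (0, 1)")
    # Optimal bits per item: -ln(p) / (ln 2)^2
    bits_per_item = -math.log(error_rate) / (math.log(2) ** 2)
    return {
        "bits_per_item": bits_per_item,
        # Optimal number of hash functions: (m / n) * ln 2, rounded up
        "hash_functions": math.ceil(bits_per_item * math.log(2)),
        "approx_bytes": math.ceil(bits_per_item * capacity / 8),
    }


# Example: one million items at a 1% false positive rate.
params = bloom_parameters(capacity=1_000_000, error_rate=0.01)
print(params)
```

Lower error rates cost more bits per item, so the per-shard capacity and error rate are worth settling before any filter is created.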
Note: Bloom filters are supported only in Standalone Redis Software databases. They are not available in Active-Active (CRDB) deployments. If you are using Active-Active, Bloom filter commands will not appear.
Sharding Strategy
Resharding a Bloom filter may become necessary when:
- Performance bottlenecks occur because a single shard is handling too much data.
- Data growth causes filters to exceed their planned capacity, leading to higher false positive rates.
- Operational scaling is required as workloads increase and filters need to be distributed across more shards.
Redis Software does not automatically reshard Bloom filters, so you must choose a strategy that fits your application:
- Manual sharding using key tags: Use {} in key names (e.g., BF:{shard0}:filter) so that keys sharing a tag are placed on the same shard. Best for simple setups where you fully control key naming conventions.
- Hash-slot mapping for control: Use CRC16 key-space knowledge to place filters deterministically across shards. Useful when you need predictable placement across larger clusters.
- App-level logic: Partition input data (for example, using the first character of a SHA-256 hash) to direct writes and reads to the appropriate filter or shard. Ideal when workloads already include an application-layer routing mechanism.
Implementation Steps
Estimate Capacity and Error Rate
- Run BF.RESERVE key error_rate capacity for each shard.
- Use the NONSCALING flag to prevent sub-filters if sizing is known.
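The reservation step can be sketched as a small helper that issues one BF.RESERVE per shard key. This is a minimal sketch, not a definitive implementation: `reserve_filters` and `RecordingClient` are hypothetical names, and the only assumption about a real client is that it exposes `execute_command(*args)`, as redis-py does.

```python
class RecordingClient:
    """Stand-in for a Redis client so the sketch runs without a server."""

    def __init__(self):
        self.calls = []

    def execute_command(self, *args):
        self.calls.append(args)
        return "OK"


def reserve_filters(client, keys, error_rate, capacity, non_scaling=True):
    """Send BF.RESERVE key error_rate capacity [NONSCALING] for each key."""
    for key in keys:
        args = ["BF.RESERVE", key, error_rate, capacity]
        if non_scaling:
            # NONSCALING prevents sub-filters from being created.
            args.append("NONSCALING")
        client.execute_command(*args)


# Dry run against the stand-in; with redis-py you would pass a real client.
client = RecordingClient()
reserve_filters(client, ["BF:{0}", "BF:{1}"], error_rate=0.01, capacity=500_000)
for call in client.calls:
    print(" ".join(str(part) for part in call))
```

BF.RESERVE fails if the key already exists, so a real run would need to handle that error or check for the key first.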
Create Bloom Filters Across Shards
- Use tagged keys: BF:{0}, BF:{1}, etc.
- Confirm placement using Redis UI or CLI tools.
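Two different tags are not guaranteed to land on two different shards, so it can help to pick tags against the database's slot layout. The sketch below assumes the open-source Redis Cluster rule (CRC16 of the tag, modulo 16,384) and a hypothetical two-shard slot layout; `pick_tags` and the ranges are illustrative, and the real ranges should be taken from your own database.

```python
import binascii

SLOTS = 16384


def tag_slot(tag: str) -> int:
    """Hash slot for a key whose hash tag is `tag` (CRC16/XMODEM mod 16384)."""
    return binascii.crc_hqx(tag.encode("utf-8"), 0) % SLOTS


def pick_tags(shard_ranges, max_candidates=10_000):
    """Return one numeric tag per shard whose slot falls in that shard's range."""
    chosen = {}
    for candidate in range(max_candidates):
        slot = tag_slot(str(candidate))
        for shard, (low, high) in shard_ranges.items():
            if shard not in chosen and low <= slot <= high:
                chosen[shard] = str(candidate)
                break
        if len(chosen) == len(shard_ranges):
            break
    return chosen


# Hypothetical layout: two shards splitting the slot space evenly.
ranges = {"shard-1": (0, 8191), "shard-2": (8192, 16383)}
tags = pick_tags(ranges)
keys = {shard: f"BF:{{{tag}}}" for shard, tag in tags.items()}
print(keys)
```

Placement should still be confirmed with the Redis UI or CLI tools, since the layout here is only an assumption.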
Shard Mapping Logic
- Use CRC16 hash-slot awareness or application-layer mapping (see Python sample below).
Example: SHA-256 Character Mapping
To distribute Bloom filters evenly across Redis shards, you can use Redis key tags ({}) in combination with CRC16 hash slot calculations. Redis assigns each key to one of 16,384 hash slots; when a key contains a tag, only the portion inside the curly braces is hashed. Keys with the same tag are therefore placed on the same shard.
For example, any keys that share the tag {shard0}, such as BF:{shard0}:filter, hash to the same slot and reside on the same shard.
To intentionally distribute filters across multiple shards, vary the contents of the key tags to target different hash slots. This lets you create filters like Bloom:{0}, Bloom:{1}, and Bloom:{2}. Because different slots can still belong to the same shard, confirm that each tag's slot maps to a different shard.
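The slot calculation can be reproduced locally. The sketch below follows the open-source Redis Cluster rule (hash only the text between the first `{` and the next `}` when it is non-empty, then take CRC16 modulo 16,384); `hash_tag` and `key_slot` are hypothetical helper names, and the second and third keys in `same_tag` are illustrative.

```python
import binascii


def hash_tag(key: str) -> str:
    """Return the part of `key` that is hashed: the tag if present, else the key."""
    start = key.find("{")
    if start != -1:
        end = key.find("}", start + 1)
        # An empty tag "{}" is ignored and the whole key is hashed.
        if end != -1 and end != start + 1:
            return key[start + 1:end]
    return key


def key_slot(key: str) -> int:
    """CRC16 (XMODEM variant) of the hashed part, modulo 16384."""
    return binascii.crc_hqx(hash_tag(key).encode("utf-8"), 0) % 16384


# Keys sharing a tag resolve to a single slot.
same_tag = [key_slot(k) for k in ("BF:{shard0}:filter", "BF:{shard0}:stats", "{shard0}")]
# Keys with different tags can be compared against the shard layout.
spread = {k: key_slot(k) for k in ("Bloom:{0}", "Bloom:{1}", "Bloom:{2}")}
print(same_tag, spread)
```

Comparing the slots in `spread` with the database's slot-to-shard assignment shows whether the filters really sit on different shards.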
Now, to map incoming data (e.g., SHA-256 hashes) to the correct Bloom filter, you can use the first character of the SHA-256 hash to decide which shard-specific filter to use. This mapping is done at the application layer using predefined lookup tables.
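A lookup table of this kind might look like the following sketch. The table names, the split of hex characters, and the `filter_key_for` helper are all assumptions for illustration; any split works as long as writes and reads use the same table.

```python
import hashlib

HEX_CHARS = "0123456789abcdef"

# Hypothetical two-filter layout: first hex character 0-7 -> Bloom:{0}, 8-f -> Bloom:{1}.
TWO_FILTERS = {c: ("Bloom:{0}" if c in "01234567" else "Bloom:{1}") for c in HEX_CHARS}

# Hypothetical four-filter layout: four hex characters per filter.
FOUR_FILTERS = {c: f"Bloom:{{{int(c, 16) // 4}}}" for c in HEX_CHARS}


def filter_key_for(sha256_hex: str, table: dict) -> str:
    """Pick the filter key from the first character of a SHA-256 hex digest."""
    return table[sha256_hex[0].lower()]


digest = hashlib.sha256(b"abc").hexdigest()
print(digest[0], filter_key_for(digest, TWO_FILTERS), filter_key_for(digest, FOUR_FILTERS))
```

Because SHA-256 output is close to uniform, each first character is about equally likely, which keeps the filters roughly balanced.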
To check for the presence of a SHA-256 value, apply the same mapping to select the filter key, then run BF.EXISTS against that key.
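The routing for writes and reads can be sketched as follows. `InMemoryClient` is a stand-in that uses exact set membership so the example runs without a server; `LOOKUP`, `add`, and `exists` are hypothetical names, and a real client is assumed only to expose `execute_command`, as redis-py does.

```python
import hashlib


class InMemoryClient:
    """Stand-in for a Redis client; exact sets replace the probabilistic filter."""

    def __init__(self):
        self.filters = {}

    def execute_command(self, command, key, item):
        members = self.filters.setdefault(key, set())
        if command == "BF.ADD":
            is_new = item not in members
            members.add(item)
            return int(is_new)
        if command == "BF.EXISTS":
            return int(item in members)
        raise ValueError(f"unsupported command: {command}")


# Hypothetical two-filter lookup table keyed by the first hex character.
LOOKUP = {c: ("Bloom:{0}" if c in "01234567" else "Bloom:{1}") for c in "0123456789abcdef"}


def filter_key(sha256_hex: str) -> str:
    return LOOKUP[sha256_hex[0].lower()]


def add(client, sha256_hex: str) -> bool:
    """Route the write to the filter selected by the lookup table."""
    return bool(client.execute_command("BF.ADD", filter_key(sha256_hex), sha256_hex))


def exists(client, sha256_hex: str) -> bool:
    """Route the read through the same table so it hits the same filter."""
    return bool(client.execute_command("BF.EXISTS", filter_key(sha256_hex), sha256_hex))


client = InMemoryClient()
known = hashlib.sha256(b"hello").hexdigest()
unknown = hashlib.sha256(b"abc").hexdigest()
add(client, known)
print(exists(client, known), exists(client, unknown))
```

With a real Bloom filter, a positive answer can occasionally be a false positive, while a negative answer is always definitive.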
This logic ensures that each Bloom filter is mapped to a specific shard and allows you to scale from 2 to 4 (or more) filters over time by expanding the lookup dictionaries and filter key naming scheme. Items already stored under the old mapping are not moved automatically, so filters must be reloaded after the mapping changes.
- Java (Jedis/Redisson): Show key-tag usage and hash-slot control.
- .NET (StackExchange.Redis): Demonstrate mapping keys with hash tags to specific shards.
Monitor Filters Using BF.INFO
- Watch for sub-filter creation and fill rate warnings.
- Use Redis’s dashboard for node-level visibility.
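A monitoring check along these lines could be sketched as below. It assumes BF.INFO returns a flat list of field names and values (or a mapping on newer protocol versions); `parse_bf_info`, `filter_health`, and `CannedClient` are hypothetical names, and the canned reply is illustrative data, not output from a real database.

```python
def parse_bf_info(reply):
    """Turn a BF.INFO reply (flat name/value list or dict) into a dict with str keys."""
    def text(value):
        return value.decode() if isinstance(value, bytes) else value

    if isinstance(reply, dict):
        return {text(k): v for k, v in reply.items()}
    return {text(reply[i]): reply[i + 1] for i in range(0, len(reply), 2)}


def filter_health(client, key):
    """Report fill ratio and whether sub-filters have been created."""
    info = parse_bf_info(client.execute_command("BF.INFO", key))
    sub_filters = info["Number of filters"]
    return {
        "key": key,
        "fill_ratio": info["Number of items inserted"] / info["Capacity"],
        "sub_filters": sub_filters,
        "expanded": sub_filters > 1,
    }


class CannedClient:
    """Stand-in client returning an illustrative BF.INFO reply."""

    def execute_command(self, *args):
        return [b"Capacity", 1000, b"Size", 1600, b"Number of filters", 2,
                b"Number of items inserted", 750, b"Expansion rate", 2]


report = filter_health(CannedClient(), "BF:{0}")
print(report)
```

Running such a check per shard key on a schedule gives an early signal before a filter fills up or starts expanding.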
Best Practices
- Pre-size filters to avoid performance loss due to auto-expansion.
- Use Redis Software slot grouping techniques to control shard placement.
- Track false positive rate drift and rebalance filters proactively.
- Avoid CROSSSLOT errors by grouping related keys into the same slot using hash tags.
Troubleshooting and Operational Considerations
- Uneven Load: Balance filter assignments across shards based on expected data volume.
- High False Positives: Usually indicates overfilled filters. Either increase the shard count or rebalance.
- Resharding Requirements: Redis Software doesn't auto-reshard filters. Manual intervention and reload are needed.
- Monitoring: Use Redis Insight or built-in stats to observe memory, CPU, and command patterns per shard.
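To see why overfilled filters raise the false positive rate, the textbook model for a single fixed-size filter can be evaluated at different fill levels. This is only an approximation under that assumption; `expected_fp_rate` is a hypothetical helper, and RedisBloom filters that scale with sub-filters behave differently.

```python
import math


def expected_fp_rate(capacity: int, error_rate: float, inserted: int) -> float:
    """Approximate false positive rate after `inserted` items in a filter
    sized for `capacity` items at `error_rate` (textbook model)."""
    bits_per_item = -math.log(error_rate) / (math.log(2) ** 2)
    hashes = math.ceil(bits_per_item * math.log(2))
    total_bits = bits_per_item * capacity
    # p ~= (1 - e^(-k * n / m)) ^ k
    return (1.0 - math.exp(-hashes * inserted / total_bits)) ** hashes


# Compare a filter at its planned capacity with one holding twice as many items.
at_capacity = expected_fp_rate(1_000_000, 0.01, 1_000_000)
overfilled = expected_fp_rate(1_000_000, 0.01, 2_000_000)
print(f"{at_capacity:.4f} {overfilled:.4f}")
```

In this model the rate grows quickly once a filter passes its planned capacity, which is why adding shards or reloading into larger filters is the usual remedy.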