Setsum is an order agnostic, commutative checksum. It was developed by Robert Escriva at Dropbox’s metadata team. In this short post, I’ll explain why they’re used and the math behind them. Jump to the end if you’d like to see the code.
Introduction
Say you’re building a database replication system. The primary sends logical operations to replicas, which apply them in order:
{"op": "add", "id": "apple"}
{"op": "add", "id": "apple"}
{"op": "remove", "id": "orange"}
After the replica processes these changes (add two apples, remove an orange), how do you verify both nodes ended up in the same state?
One naive (rather horrible) approach is to dump both states and compare them directly. It’s expensive, impractical, and doesn’t scale. Instead, you can maintain checksums that update with each operation. When you’re done, just compare the checksums; if they match, you’re in sync. That’s why distributed databases like Cassandra use Merkle trees for the same purpose.
Setsum is similar but has some nice properties that make it attractive over Merkle trees. They can be computed incrementally; the cost only depends on the change being applied, not the whole dataset. I also find them attractive because they let you remove items as well.
Properties
Setsum has some interesting properties:
1. Order doesn’t matter. Both of these yield the same result:
s1 = New()
s1.add("apple")
s1.add("banana")
s2 = New()
s2.add("banana")
s2.add("apple")
assert_eq(s1, s2)
2. You can remove items. These are equivalent:
s1.add("apple")
s1.add("banana")
s1.remove("apple")
s2.add("banana")
s2.add("banana")
s2.remove("banana")
assert_eq(s1, s2)
3. You can combine setsums. As you guessed already, these are equal:
s1.add("apple")
s1.add("banana")
s2.add("chikoo")
s3.add("banana")
s3.add("chikoo")
s3.add("apple")
assert_eq(s1+s2, s3)
The only state you need to maintain is 256 bits, and all operations are O(len(msg))
instead of depending on your entire dataset.
The internals
Each Setsum is an array of 8 unsigned 32-bit integers (u32), called “columns”. Each column starts at 0 when you create a new Setsum. Each column has an associated large prime number (close to u32::MAX
).
When you add an item:
- Compute the SHA3-256 hash of the item (produces 32 bytes)
- Split the hash into 8 chunks of 4 bytes each
- Interpret each chunk as a little-endian u32
- Add each number to its corresponding column
- If the sum exceeds the column’s prime, store the remainder (mod prime)
You can also remove an item that was previously added. The magic is in computing the inverse: first, derive the inverse of the item’s hashed value, then add that inverse to the setsum. This effectively cancels out the original, removing the item from the set!
To compute the inverse, we use modular arithmetic: it’s simply the prime minus the value.
The math behind setsum
Disclaimer: If we’re friends, you already know I’m no math person. If not, hey there, new friend! You can probably skip this if you understand modulo arithmetic, the Chinese remainder theorem, and a bit of probability.
Let’s simplify: instead of 8 columns, let’s use just one. The prime number for this column is 29. Consider adding these items with their hash and inverse values:
Item | Hash | Inverse |
---|---|---|
apple | 15 | 14 |
banana | 23 | 6 |
chikoo | 7 | 22 |
pomegranate | 18 | 11 |
watermelon | 26 | 3 |