Fast, Memory-Efficient Hash Table in Java: Borrowing the Best Ideas

https://news.ycombinator.com/rss Hits: 1
Summary

One day, I ran into SwissTable—the kind of design that makes you squint, grin, and immediately regret every naive linear-probing table you’ve ever shipped.This post is the story of how I tried to bring that same “why is this so fast?” feeling into Java. It’s part deep dive, part engineering diary, and part cautionary tale about performance work.1) The SwissTable project, explained the way it feels when you first understand it#SwissTable is an open-addressing hash table design that came out of Google’s work and was famously presented as a new C++ hash table approach (and later shipped in Abseil).At a high level, it still does the usual hash-table thing: compute hash(key), pick a starting slot, and probe until you find your key or an empty slot.The twist is that SwissTable separates metadata (tiny “control bytes”) from the actual key/value storage, and it uses those control bytes to avoid expensive key comparisons most of the time. Instead of immediately touching a bunch of keys (which are cold in cache and often pointer-heavy), it first scans a compact control bytes that is dense, cache-friendly, and easy to compare in bulk.To make probing cheap, SwissTable effectively splits the hash into two parts: h1 and h2. Think of h1 as the part that chooses where to start probing (which group to look at first), and h2 as a tiny fingerprint stored in the control bytes to quickly rule slots in or out. It’s not a full hash—just enough bits to filter candidates before we pay the cost of touching real keys.So on lookup, you compute a hash, derive (h1, h2), jump to the group from h1, and compare h2 against all control bytes in that group before you even look at any keys. That means most misses (and many hits) avoid touching key memory entirely until the metadata says “there’s a plausible candidate here.”Because probing stays cheap, SwissTable can tolerate higher load factors—up to about 87.5% (7/8) in implementations like Abseil’s flat_hash_map—without falling off a performance clif...

First seen: 2025-12-13 18:52

Last seen: 2025-12-13 18:52