Constitent Hashing

https://news.ycombinator.com/rss Hits: 13
Summary

This post is an introduction to consistent hashing, an algorithm for designing a hash table such that only a small portion of keys has to be recomputed when the table's size changes. Motivating use case Suppose we're designing a caching web proxy, but the expected storage demands are higher than what a single machine can handle. So we distribute the cache across multiple machines. How do we do that? Given a URL, how do we make sure that we can easily find out which server we should approach for a potentially cached version ? An approach that immediately comes to mind is hashing. Let's calculate a numeric hash of the URL and distribute it evenly between N nodes (that's what we'll call the servers in this post): hash := calculateHashFunction(url) nodeId := hash % N This process works but turns out to have serious downsides in real-world applications. The problem with the naive hashing approach Consider our caching use case again; in a realistic application at "internet scale", one of the assumptions we made implicitly doesn't hold - the cache nodes are not static. New nodes are added to the system if the load is high (or if new machines come into service); existing nodes can crash or be taken offline for maintenance. In other words, the number N in our application is not a constant. The problem may be apparent now; to demonstrate it directly, let's take an actual implementation of hashItem using Go's md5 package : // hashItem computes the slot an item hashes to, given a total number of slots. func hashItem(item string, nslots uint64) uint64 { digest := md5.Sum([]byte(item)) digestHigh := binary.BigEndian.Uint64(digest[8:16]) digestLow := binary.BigEndian.Uint64(digest[:8]) return (digestHigh ^ digestLow) % nslots } The terminology is slightly adjusted: Instead of url, we'll just refer to a generic item "Slot" is a common concept in hash tables: our hashItem computes a slot number for an item, given the total number of available slots Let's say we started with 32 slots...

First seen: 2025-10-03 04:51

Last seen: 2025-10-03 16:53