Locality, and Temporal-Spatial Hypothesis

https://news.ycombinator.com/rss Hits: 2
Summary

Locality, and Temporal-Spatial Hypothesis Good fences make good neighbors? Last week at PGConf NYC, I had the pleasure of hearing Andres Freund talking about the great work he’s been doing to bring async IO to Postgres 18. One particular result caught my eye: a large difference in performance between forward and reverse scans, seemingly driven by read ahead1. The short version is that IO layers (like Linux’s) optimize performance by proactively pre-fetching data ahead of the current read point in a file, so it’s already cached when needed. Notably, most of these systems don’t do this backwards. This leads to a big difference in performance between forward scans (where the pages are already in the cache when they’re needed) and backward scans (where the database needs to block on IO to fetch the next page). This lead me to thinking more about a particular hypothesis behind many database designs: a temporal-spatial locality hypothesis2. Before we get there, let’s talk about locality more generally, because it might be the single most important idea in database performance (and computer systems performance generally). Temporal locality is the idea that data accessed recently is likely to be accessed again soon. This idea is what’s behind CPU caches, database buffer pools, and most caches you’ll come across in computer systems. Spatial locality is the idea that when we access data, we’re likely to access nearby data soon. Almost all database systems take advantage of these forms of locality, and would lost significant performance without taking advantage of them. Stacks of books could be written about these ideas. Stacks of books have been written about these ideas. We could talk about cache-oblivious algorithms, or non-polluting read and write instructions, or have an argument about linked lists. Instead, I want to zoom in to a particular idea in databases: temporal-spatial hypothesis. The hypothesis I mean has a definition something like this: Temporal-spatial localit...

First seen: 2025-10-07 19:11

Last seen: 2025-10-07 20:11