Branch prediction: Why CPUs can't wait?

https://news.ycombinator.com/rss Hits: 3
Summary

Recently, I have had some free time and started learning some low-level computer fundamentals, trying to practice and better understand the concepts in greater detail. Along the way, I learned about a modern compiler infrastructure named LLVM, which is a target-independent optimizer and code generator that has been used as a part of many compilers for different programming languages like Rust, Zig, or Haskell. While diving into LLVM’s aggressive optimization features, I have a humbling realization: even the most advanced compiler optimizations cannot save you from a fundamental limitation at the hardware level. If our program has unpredictable branches, all of the clever LLVM optimizations become secondary to something called “branch prediction”. Poor branch prediction patterns can easily negate the benefits of multiple LLVM optimizations. There are some latency numbers that every programmer should know, according to Jeff Dean, and one of them is branch misprediction, which costs around 5ns in 2012, and the latency remains roughly the same as the time of writing this post. So what is branch prediction, what happens when it’s mispredicted, and why is it costly? Before diving into the explanation, I think it’s worth mentioning what kind of instruction execution pattern the CPU ideally expects. Because memory layout is linear, and when a program is compiled and stored in memory, its instructions are stored as a continuous sequence: MarkdownMemory Address Instruction PC Value 0x1000 MOV EAX, 5 PC = 0x1000 0x1003 ADD EAX, 10 PC = 0x1003 0x1006 MOV [0x2000], EAX PC = 0x1006 0x100A RET PC = 0x100A The natural way to read this is from top to bottom, just like reading a book. In a CPU, there is a special register named Program Counter (PC), and its job is to record the address of the current instruction. The CPU’s program counter naturally increments to the next sequential instruction address: [math] PC = PC + instruction\ size[/math] The instruction size is the size of the ...

First seen: 2025-08-19 21:02

Last seen: 2025-08-19 23:02