CPU Cache-Friendly Data Structures in Go: 10x Speed with Same Algorithm

Summary

Key Takeaways

- Cache misses can slow down your code by 60x compared to L1 cache hits
- False sharing occurs when multiple cores update different variables in the same cache line
- Proper data structure padding can improve performance by 5-10x in specific scenarios
- Data-oriented design beats object-oriented for high-performance systems
- Always measure with benchmarks - cache effects are hardware-specific

The Numbers That Matter

L1 Cache: 4 cycles (~1ns), 32KB
L2 Cache: 12 cycles (~3ns), 256KB
L3 Cache: 40 cycles (~10ns), 8MB
RAM: 200+ cycles (~60ns), 32GB
Cache line size: 64 bytes (on x86_64)

Reading from RAM is approximately 60x slower than L1 cache. One cache miss equals 60 cache hits. This is why cache-friendly code can run significantly faster - often 5-10x in specific scenarios.

False Sharing: The Silent Killer

False sharing occurs when multiple CPU cores modify different variables that happen to share the same cache line. This forces cache line invalidation across cores, causing significant performance degradation.

The problem is subtle: your variables might be logically independent, but if they're physically adjacent in memory (within 64 bytes), updating one invalidates the cache for all others on that line.

In our metrics collection system, we noticed 10x slower performance during high concurrency. The issue was multiple goroutines updating different counters that were packed in the same cache line.

Detection requires careful benchmarking with concurrent access patterns. The performance drop isn't visible in single-threaded tests, only under parallel load.

    // BROKEN: False sharing destroys performance
    type Counters struct {
        requests uint64 // 8 bytes
        errors   uint64 // 8 bytes - SAME cache line!
        latency  uint64 // 8 bytes - SAME cache line!
    }
    // Result: 45ns/op under contention

    // FIXED: Padding prevents false sharing
    type Counters struct {
        requests uint64
        _        [56]byte // Padding to 64 bytes (cache line)
        errors   uint64
        _        [56]byte
        latency  uint64
        _        [56]byte
        timeouts uint64
        _        [...
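
The packed and padded layouts described above can be compared with a small self-contained program. This is a minimal sketch, not the article's own benchmark: the names `PackedCounters`, `PaddedCounters`, `errorsOffsets`, and `hammer` are hypothetical, the 64-byte `cacheLineSize` is assumed from the x86_64 figure quoted earlier, and the timings it prints will vary by machine and core count.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
	"time"
	"unsafe"
)

// cacheLineSize is an assumption taken from the x86_64 figure in the text.
const cacheLineSize = 64

// PackedCounters keeps the counters adjacent, so all three fit in one cache line.
type PackedCounters struct {
	requests uint64
	errors   uint64
	latency  uint64
}

// PaddedCounters spaces the counters 64 bytes apart, so no two can share a line.
type PaddedCounters struct {
	requests uint64
	_        [cacheLineSize - 8]byte
	errors   uint64
	_        [cacheLineSize - 8]byte
	latency  uint64
	_        [cacheLineSize - 8]byte
}

// errorsOffsets reports the byte offset of the errors field in each layout.
func errorsOffsets() (packed, padded uintptr) {
	var a PackedCounters
	var b PaddedCounters
	return unsafe.Offsetof(a.errors), unsafe.Offsetof(b.errors)
}

// hammer starts one goroutine per counter, performs n atomic increments on
// each, and returns the total wall-clock time taken.
func hammer(n int, counters ...*uint64) time.Duration {
	start := time.Now()
	var wg sync.WaitGroup
	for _, c := range counters {
		wg.Add(1)
		go func(c *uint64) {
			defer wg.Done()
			for i := 0; i < n; i++ {
				atomic.AddUint64(c, 1)
			}
		}(c)
	}
	wg.Wait()
	return time.Since(start)
}

func main() {
	const n = 1_000_000
	var packed PackedCounters
	var padded PaddedCounters

	po, pd := errorsOffsets()
	fmt.Printf("errors field offset: packed=%d padded=%d\n", po, pd)
	fmt.Println("packed:", hammer(n, &packed.requests, &packed.errors, &packed.latency))
	fmt.Println("padded:", hammer(n, &padded.requests, &padded.errors, &padded.latency))
}
```

On a multi-core machine the padded run would typically finish faster, but the gap depends on the hardware and scheduler, so treat any single run as indicative only. For repeatable numbers, the same comparison could be moved into `testing` benchmarks and run under parallel load.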

First seen: 2025-10-09 14:19

Last seen: 2025-10-10 03:22