shared_ptr<T>: the (not always) atomic reference-counted smart pointer

Summary

13 Feb 2019

Introduction

This is a write-up of the “behavioral analysis” of the shared_ptr<T> reference count in GNU’s libstdc++. This smart pointer is used to share references to the same underlying pointer. The mechanism beneath works by tracking the number of references through a reference count, so the pointer is freed only after the last reference is destroyed. It is usually used in multi-threaded programs (in conjunction with other types) because its reference count is guaranteed to be tracked atomically.

Story time

A few months ago, I was running a micro-benchmark comparing data structures in Rust against their C++ counterparts. At one point, I found that my Rust port of an immutable RB tree insertion was significantly slower than the C++ one. This was unexpected: both codebases were idiomatic, and rustc optimizes very well, usually matching C++ speed. I proceeded to re-check that my code was correct. At first I thought that my re-balancing code could be wrong, so I put it side by side with the C++ one, but I couldn’t find any defect.

Profiling

On the second day, I started profiling with callgrind and cachegrind. This is where I had my aha moment. Every part of the code that copied a shared_ptr<T> was much faster than my equivalent Arc::clone calls in Rust. Inside KCachegrind, I saw something unexpected. The code was straightforward, but before increasing shared_ptr’s reference count during a pointer copy, there was a branch to decide whether it should do an atomic addition or a non-atomic one. The code path being taken was the non-atomic one! Certainly, my knowledge about shared_ptr was being challenged. As far as I knew, the reference count should be atomic so it could be used in parallel programs sharing the value without the risk of racing the count and ending up with dangling pointers or memory leaks.
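To make the branch described above concrete, here is a minimal sketch of a reference count that chooses between an atomic and a non-atomic update at run time. It is not libstdc++'s actual code: the class `DispatchedCount` and the flag `g_threads_active` are hypothetical names invented for illustration, and the flag merely stands in for whatever check the runtime uses to decide that threads are in play.

```cpp
#include <atomic>

// Hypothetical flag: an assumption standing in for the runtime's
// "is this process using threads?" check. Not a real libstdc++ symbol.
static bool g_threads_active = false;

// Hypothetical reference count that only pays for atomic
// read-modify-write operations when threads are reported active.
class DispatchedCount {
public:
    // Called when a pointer is copied.
    void add_ref() {
        if (g_threads_active) {
            // Atomic path: one indivisible read-modify-write.
            count_.fetch_add(1, std::memory_order_relaxed);
        } else {
            // Non-atomic path: separate load, add, and store.
            // Only safe while a single thread touches the count.
            count_.store(count_.load(std::memory_order_relaxed) + 1,
                         std::memory_order_relaxed);
        }
    }

    // Called when a pointer is destroyed.
    // Returns true if this was the last reference.
    bool release() {
        if (g_threads_active) {
            return count_.fetch_sub(1, std::memory_order_acq_rel) == 1;
        }
        const long previous = count_.load(std::memory_order_relaxed);
        count_.store(previous - 1, std::memory_order_relaxed);
        return previous == 1;
    }

    long use_count() const {
        return count_.load(std::memory_order_relaxed);
    }

private:
    std::atomic<long> count_{1};  // starts at one owner
};
```

The branch costs a predictable comparison, while the atomic path costs a synchronized instruction on every copy. That difference would explain why a single-threaded benchmark could see copies run faster than an always-atomic counterpart such as Arc::clone.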
Tracking the code

Simplified C++ PoC:

const auto tree = make_shared<Tree<int>>(10); for(auto i = 0; i ...
