Can We Know Whether a Profiler Is Accurate?

Summary

If you have been following the adventures of our hero over the last couple of years, you might remember that we can’t really trust sampling profilers for Java, and it’s even worse for Java’s instrumentation-based profilers. For sampling profilers, the so-called observer effect gets in the way: when we profile a program, the profiling itself can change the program’s performance behavior. This means we can’t simply increase the sampling frequency to get a more accurate profile, because the sampling itself causes inaccuracies.

So, how could we possibly know whether a profile correctly reflects an execution? We could try to look at the code and estimate how long each bit takes, and then painstakingly compute what an accurate profile would be. Unfortunately, with the complexity of today’s processors and language runtimes, this would require a cycle-accurate simulator that models everything, from the processor’s pipeline, through the cache hierarchy, to memory and storage. While there are simulators that do this kind of thing, they are generally too slow to simulate a full JVM with JIT compilation for any interesting program within a practical amount of time. Simulation is therefore currently impractical, and with it any attempt to determine a ground truth.

So, what other approaches might there be to determine whether a profile is accurate? In 2010, Mytkowicz et al. already checked whether Java profilers were actionable by inserting computations at the Java bytecode level. On today’s VMs, that approach unfortunately changes performance in fairly unpredictable ways, because it interacts with the compiler optimizations. However, the idea of checking whether a profiler accurately reflects the slowdown of a program is sound. For example, an inaccurate profiler is less likely to correctly identify a change in the distribution of where a program spends its time.
Similarly, if we change the overall amount of time a program takes, without changing the distr...
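To make the slowdown idea concrete, here is a minimal sketch of the arithmetic behind such a check. It is not the method from the post or from Mytkowicz et al.; the function names, the method names, and the numbers are all hypothetical. It also assumes the baseline shares are the true time distribution, which is exactly the thing we cannot know in practice, so it only illustrates what "reflecting a slowdown" would mean for a profile.

```python
def expected_profile(baseline, method, factor):
    """Return the time shares we would expect after `method` alone
    becomes `factor` times slower (hypothetical helper).

    `baseline` maps method names to their share of total run time
    and is assumed, for illustration only, to be the true distribution.
    """
    scaled = {
        name: share * (factor if name == method else 1.0)
        for name, share in baseline.items()
    }
    total = sum(scaled.values())
    # Renormalize so the shares sum to 1 again.
    return {name: time / total for name, time in scaled.items()}


def max_share_error(expected, reported):
    """Largest absolute difference in time share over all methods."""
    methods = set(expected) | set(reported)
    return max(
        abs(expected.get(name, 0.0) - reported.get(name, 0.0))
        for name in methods
    )


# Made-up baseline: shares of run time per method.
baseline = {"parse": 0.5, "eval": 0.3, "gc": 0.2}

# Suppose we slow down "eval" by a factor of 2.
expected = expected_profile(baseline, "eval", 2.0)

# Made-up profile that a profiler might report after the slowdown.
reported = {"parse": 0.45, "eval": 0.35, "gc": 0.20}

error = max_share_error(expected, reported)
```

A large `error` would suggest that the profiler did not track the shift in where time is spent, although in a real setting the baseline itself is only another profile, so the comparison can show inconsistency but cannot prove accuracy.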

First seen: 2025-10-15 03:40

Last seen: 2025-10-15 12:42