Everything Is Correlated

Summary

Statistical folklore asserts that “everything is correlated”: in any real-world dataset, most or all measured variables will have non-zero correlations, even between variables which appear to be completely independent of each other, and that these correlations are not merely sampling error flukes but will appear in large-scale datasets to arbitrarily designated levels of statistical-significance or posterior probability. This raises serious questions for null-hypothesis statistical-significance testing, as it implies the null hypothesis of 0 will always be rejected with sufficient data, meaning that a failure to reject only implies insufficient data, and provides no actual test or confirmation of a theory. Even a directional prediction is minimally confirmatory since there is a 50% chance of picking the right direction at random. It also has implications for conceptualizations of theories & causal models, interpretations of structural models, and other statistical principles such as the “sparsity principle”.

Knowing one variable tells you (a little) about everything else. In statistics & psychology folklore, this idea circulates under many names: “everything is correlated”, “everything is related to everything else”, “crud factor”, “the null hypothesis is always false”, “coefficients are never zero”, “ambient correlational noise”, Thorndike’s dictum (“in human nature good traits go together”), etc. Closely related are the “bet on sparsity principle”, Anna Karenina principle, Barry Commoner’s “first law of ecology” (“Everything is connected to everything else”) & Waldo R. Tobler’s “first law of geography” (“everything is related to everything else, but near things are more related than distant things”).

The core idea here is that in any real-world dataset, it is exceptionally unlikely that any particular relationship will be exactly 0 for reasons of arithmetic (eg. it may be impossible for a binary variable to be an equal percentage in 2 unbalanced groups); prior pro...
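
The claim that a null hypothesis of 0 will always be rejected with sufficient data can be made concrete with a small calculation. The Python sketch below is an illustration only, not something from the summary: it assumes a fixed, tiny sample correlation (r = 0.02, an arbitrary value) and uses the Fisher z approximation to compute the two-sided p-value at increasing sample sizes. The function names `correlation_p_value` and `n_needed` are hypothetical.

```python
import math
from statistics import NormalDist


def correlation_p_value(r: float, n: int) -> float:
    """Approximate two-sided p-value for H0: rho = 0 (Fisher z approximation)."""
    if n <= 3 or not -1.0 < r < 1.0:
        raise ValueError("need n > 3 and -1 < r < 1")
    # Under H0, atanh(r) is roughly normal with standard error 1 / sqrt(n - 3).
    z = math.atanh(r) * math.sqrt(n - 3)
    # erfc(|z| / sqrt(2)) equals 2 * (1 - Phi(|z|)) and stays accurate far in the tail.
    return math.erfc(abs(z) / math.sqrt(2))


def n_needed(r: float, alpha: float = 0.05) -> int:
    """Approximate smallest n at which a sample correlation r gives p <= alpha."""
    if r == 0 or not -1.0 < r < 1.0:
        raise ValueError("need a non-zero r with -1 < r < 1")
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    return math.ceil((z_crit / math.atanh(abs(r))) ** 2 + 3)


if __name__ == "__main__":
    r = 0.02  # assumed illustrative "crud"-sized correlation
    for n in (100, 1_000, 10_000, 100_000, 1_000_000):
        print(f"n={n:>9,}  p={correlation_p_value(r, n):.3g}")
    print("approximate n needed for p <= 0.05:", n_needed(r))
```

Under these assumptions, the same correlation is nowhere near significance at n = 100 but passes any conventional threshold once n is large enough, so the outcome of the test depends on sample size and not on whether the relationship is exactly 0.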

First seen: 2025-08-22 05:56

Last seen: 2025-08-22 17:20