Anscombe's Quartet

https://news.ycombinator.com/rss Hits: 14
Summary

From Wikipedia, the free encyclopedia

Four data sets with the same descriptive statistics, yet very different distributions

[Figure: The four datasets composing Anscombe's quartet. All four sets have identical statistical parameters, but the graphs show them to be considerably different.]

Anscombe's quartet comprises four datasets that have nearly identical simple descriptive statistics, yet have very different distributions and appear very different when graphed. Each dataset consists of eleven (x, y) points. They were constructed in 1973 by the statistician Francis Anscombe to demonstrate both the importance of graphing data when analyzing it, and the effect of outliers and other influential observations on statistical properties. He described the article as being intended to counter the impression among statisticians that "numerical calculations are exact, but graphs are rough".[1]

For all four datasets:

Property                                                      Value               Accuracy
Mean of x                                                     9                   exact
Sample variance of x (s_x^2)                                  11                  exact
Mean of y                                                     7.50                to 2 decimal places
Sample variance of y (s_y^2)                                  4.125               ±0.003
Correlation between x and y                                   0.816               to 3 decimal places
Linear regression line                                        y = 3.00 + 0.500x   to 2 and 3 decimal places, respectively
Coefficient of determination of the linear regression (R^2)   0.67                to 2 decimal places

The first scatter plot (top left) appears to be a simple linear relationship, corresponding to two correlated variables, where y could be modelled as Gaussian with mean linearly dependent on x.

For the second graph (top right), while a relationship between the two variables is obvious, it is not linear, and the Pearson correlation coefficient is not relevant. A more general regression and the corresponding coefficient of determination would be more appropriate.

In the third graph (bottom left), the modelled relationship is linear, but should have a different regression line (a robust regression would have been called for).
The calculated regression is offset by the one outlier, which exert...
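
The summary statistics listed for all four datasets can be checked numerically. The following is a minimal Python sketch, not part of the original article. The data values are the commonly reproduced ones from Anscombe's 1973 paper and do not appear in the text above, and the names QUARTET and summarize are hypothetical, chosen for illustration only.

```python
# Anscombe's quartet as (x values, y values) pairs.
# Assumption: these are the commonly reproduced values from Anscombe (1973);
# they are not listed in the summary above.
X123 = [10, 8, 13, 9, 11, 14, 6, 4, 12, 7, 5]  # shared by datasets I to III
QUARTET = {
    "I": (X123, [8.04, 6.95, 7.58, 8.81, 8.33, 9.96,
                 7.24, 4.26, 10.84, 4.82, 5.68]),
    "II": (X123, [9.14, 8.14, 8.74, 8.77, 9.26, 8.10,
                  6.13, 3.10, 9.13, 7.26, 4.74]),
    "III": (X123, [7.46, 6.77, 12.74, 7.11, 7.81, 8.84,
                   6.08, 5.39, 8.15, 6.42, 5.73]),
    "IV": ([8, 8, 8, 8, 8, 8, 8, 19, 8, 8, 8],
           [6.58, 5.76, 7.71, 8.84, 8.47, 7.04,
            5.25, 12.50, 5.56, 7.91, 6.89]),
}


def summarize(xs, ys):
    """Compute the descriptive statistics from the table for one dataset."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of squared deviations and of cross products about the means.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx                    # least-squares slope
    intercept = mean_y - slope * mean_x  # least-squares intercept
    r = sxy / (sxx * syy) ** 0.5         # Pearson correlation
    return {
        "mean_x": mean_x,
        "var_x": sxx / (n - 1),          # sample variance (n - 1 denominator)
        "mean_y": mean_y,
        "var_y": syy / (n - 1),
        "r": r,
        "intercept": intercept,
        "slope": slope,
        "r2": r ** 2,                    # equals R^2 for simple linear regression
    }


if __name__ == "__main__":
    for name, (xs, ys) in QUARTET.items():
        stats = summarize(xs, ys)
        print(name, {key: round(value, 3) for key, value in stats.items()})
```

Running the script prints one line of rounded statistics per dataset, which should agree with the table to the stated accuracy. The numbers alone give no hint of how different the four scatter plots look, which is the point of the quartet.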

First seen: 2025-09-09 12:56

Last seen: 2025-09-10 07:07