Anscombe's Quartet is a set of four datasets, each of which produces the same summary statistics (mean, standard deviation, and correlation), which could lead one to believe the datasets are quite similar. [...]
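To make the claim about matching statistics concrete, here is a minimal sketch that computes the mean, sample standard deviation, and Pearson correlation for each of the four datasets. The values are the quartet as published by Anscombe (1973); the names `QUARTET`, `X123`, and `describe` are made up for this illustration.

```python
import math

# Anscombe's quartet. Datasets I-III share the same x values.
X123 = [10, 8, 13, 9, 11, 14, 6, 4, 12, 7, 5]
QUARTET = {
    "I": (X123, [8.04, 6.95, 7.58, 8.81, 8.33, 9.96, 7.24, 4.26, 10.84, 4.82, 5.68]),
    "II": (X123, [9.14, 8.14, 8.74, 8.77, 9.26, 8.10, 6.13, 3.10, 9.13, 7.26, 4.74]),
    "III": (X123, [7.46, 6.77, 12.74, 7.11, 7.81, 8.84, 6.08, 5.39, 8.15, 6.42, 5.73]),
    "IV": ([8, 8, 8, 8, 8, 8, 8, 19, 8, 8, 8],
           [6.58, 5.76, 7.71, 8.84, 8.47, 7.04, 5.25, 12.50, 5.56, 7.91, 6.89]),
}


def describe(xs, ys):
    """Return (mean_x, mean_y, sd_x, sd_y, pearson_r) using sample statistics."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sd_x = math.sqrt(sxx / (n - 1))
    sd_y = math.sqrt(syy / (n - 1))
    return mean_x, mean_y, sd_x, sd_y, sxy / math.sqrt(sxx * syy)


for name, (xs, ys) in QUARTET.items():
    # Round to two decimals, the precision at which the four datasets agree.
    print(name, [round(v, 2) for v in describe(xs, ys)])
```

All four rows come out essentially the same (x mean of 9, y mean of about 7.50, correlation of about 0.82), even though a scatter plot of each dataset looks completely different.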
Recently, Alberto Cairo created the Datasaurus dataset, which urges people to "never trust summary statistics alone; always visualize your data": while the data exhibit normal-seeming statistics, plotting them reveals a picture of a dinosaur. Inspired by Anscombe's Quartet and the Datasaurus, we present the Datasaurus Dozen:
Datasaurus Dozen
Generating Datasets with Varied Appearance and Identical Statistics through Simulated Annealing
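The subtitle names simulated annealing as the technique. The following is a rough sketch of how such a procedure could work, not the authors' implementation: nudge one point at a time, keep the nudge if it moves the point closer to a target shape (or occasionally at random while the temperature is high), and undo it if any summary statistic changes at two decimal places. The circular target, the step size, the cooling schedule, and every function name here are assumptions chosen for illustration.

```python
import math
import random


def summary(points):
    """Return (mean_x, mean_y, sd_x, sd_y, pearson_r) for a list of (x, y)."""
    n = len(points)
    mx = sum(x for x, _ in points) / n
    my = sum(y for _, y in points) / n
    sxx = sum((x - mx) ** 2 for x, _ in points)
    syy = sum((y - my) ** 2 for _, y in points)
    sxy = sum((x - mx) * (y - my) for x, y in points)
    return (mx, my, math.sqrt(sxx / (n - 1)), math.sqrt(syy / (n - 1)),
            sxy / math.sqrt(sxx * syy))


def same_stats(a, b, decimals=2):
    """True if every statistic agrees after rounding to `decimals` places."""
    return all(round(u, decimals) == round(v, decimals) for u, v in zip(a, b))


def circle_distance(p, cx=50.0, cy=50.0, r=30.0):
    """Distance from point p to a hypothetical target shape: a circle."""
    return abs(math.hypot(p[0] - cx, p[1] - cy) - r)


def anneal(points, steps=20000, seed=0):
    """Move points toward the circle while holding the summary statistics."""
    rng = random.Random(seed)
    pts = list(points)
    target = summary(pts)  # statistics of the starting data, held fixed
    for step in range(steps):
        temp = 0.4 * (1 - step / steps)  # assumed linear cooling schedule
        i = rng.randrange(len(pts))
        old = pts[i]
        new = (old[0] + rng.gauss(0, 0.1), old[1] + rng.gauss(0, 0.1))
        improves = circle_distance(new) < circle_distance(old)
        # Accept improving moves, or worsening ones with probability `temp`.
        if not (improves or rng.random() < temp):
            continue
        pts[i] = new
        # Undo the move if it changes any statistic at two decimal places.
        if not same_stats(summary(pts), target):
            pts[i] = old
    return pts
```

Calling `anneal` on a cloud of random points returns a new list whose `summary` matches the original at two decimal places, because every accepted move is checked against the starting statistics. Swapping `circle_distance` for a distance to some other shape would steer the points toward that shape instead.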
2 Responses:
This kind of technique looks like a great way to generate falsified data that fits your hypothesis but still looks just random enough when visualised.
Today my customer's purportedly reliable Gaussian normal data is all bifurcated and I don't know why.


At least if it looked like a dinosaur I would know that someone was fucking with me.
Shakes fist at data.