Classic Data Sets
Several classic data sets have been used extensively in the statistical literature:
- Iris flower data set - multivariate data set introduced by Ronald Fisher (1936).
- Categorical data analysis - Data sets used in the book, An Introduction to Categorical Data Analysis, by Agresti are provided on-line by StatLib.
- Robust statistics - Data sets used in Robust Regression and Outlier Detection (Rousseeuw and Leroy, 1986). Provided on-line at the University of Cologne.
- Time series - Data used in Chatfield's book, The Analysis of Time Series, are provided on-line by StatLib.
- Extreme values - Data used in the book An Introduction to the Statistical Modeling of Extreme Values are provided on-line by Stuart Coles, the book's author.
- Bayesian Data Analysis - Data used in the book are provided on-line by Andrew Gelman, one of the book's authors.
- BUPA liver data - used in several papers in the machine learning (data mining) literature.
- Anscombe's quartet - small data set illustrating the importance of graphing the data to avoid statistical fallacies.
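As a brief illustration of the point made by Anscombe's quartet, the sketch below computes the mean of y and the Pearson correlation of x and y for each of the four sets. The values are the ones commonly reproduced from Anscombe (1973); the helper name `pearson` and the dictionary layout are choices made here for illustration only. All four sets give nearly the same summaries (mean of y about 7.50, correlation about 0.816), even though their scatter plots look entirely different.

```python
from math import sqrt
from statistics import mean

# Anscombe's quartet: sets I-III share the same x values.
x123 = [10, 8, 13, 9, 11, 14, 6, 4, 12, 7, 5]
quartet = {
    "I":   (x123, [8.04, 6.95, 7.58, 8.81, 8.33, 9.96, 7.24, 4.26, 10.84, 4.82, 5.68]),
    "II":  (x123, [9.14, 8.14, 8.74, 8.77, 9.26, 8.10, 6.13, 3.10, 9.13, 7.26, 4.74]),
    "III": (x123, [7.46, 6.77, 12.74, 7.11, 7.81, 8.84, 6.08, 5.39, 8.15, 6.42, 5.73]),
    "IV":  ([8, 8, 8, 8, 8, 8, 8, 19, 8, 8, 8],
            [6.58, 5.76, 7.71, 8.84, 8.47, 7.04, 5.25, 12.50, 5.56, 7.91, 6.89]),
}


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Map each set name to (mean of x, mean of y, correlation).
summaries = {
    name: (mean(xs), mean(ys), pearson(xs, ys))
    for name, (xs, ys) in quartet.items()
}

for name, (mx, my, r) in summaries.items():
    print(f"Set {name}: mean x = {mx:.2f}, mean y = {my:.2f}, r = {r:.3f}")
```

Summary statistics alone cannot tell these four sets apart; plotting each set is what reveals the curved pattern in set II and the single influential points in sets III and IV.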