Data Cleaning
Data cleaning is an important procedure during which the data are inspected and erroneous data are corrected, where this is necessary, preferable, and possible. Data cleaning can be done during the stage of data entry. If this is done, it is important that no subjective decisions are made. The guiding principle provided by Adèr (ref) is: during subsequent manipulations of the data, information should always be cumulatively retrievable. In other words, it should always be possible to undo any alterations to the data set. Therefore, it is important not to throw information away at any stage of the data cleaning phase. All information should be saved (i.e., when altering variables, both the original values and the new values should be kept, either in a duplicate data set or under a different variable name), and all alterations to the data set should be carefully and clearly documented, for instance in a syntax file or a log.
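The principle of keeping both original and new values, together with a log of every alteration, can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration and does not come from Adèr: the function names clean_with_audit and undo, the "_original" suffix for the preserved values, the sample records, and the rule that ages above 120 are treated as entry errors.

```python
import copy


def clean_with_audit(records, column, fix, log):
    """Apply `fix` to `column`, keeping each original value and logging changes.

    Hypothetical helper: the raw records are copied, never modified in place.
    """
    cleaned = copy.deepcopy(records)
    original_column = f"{column}_original"  # assumed naming convention
    for row_index, row in enumerate(cleaned):
        old_value = row[column]
        new_value = fix(old_value)
        row[original_column] = old_value  # original kept under a different name
        row[column] = new_value
        if new_value != old_value:
            # One log entry per alteration, so every change is documented.
            log.append({"row": row_index, "column": column,
                        "old": old_value, "new": new_value})
    return cleaned


def undo(records, column):
    """Restore the original values saved by clean_with_audit."""
    restored = copy.deepcopy(records)
    for row in restored:
        row[column] = row.pop(f"{column}_original")
    return restored


def implausible_age_to_missing(age):
    """Illustrative rule fixed in advance: ages above 120 become missing."""
    if age is not None and age > 120:
        return None
    return age


# Made-up sample data; the second age looks like a data entry error.
raw = [{"id": 1, "age": 34}, {"id": 2, "age": 340}, {"id": 3, "age": 29}]
log = []
cleaned = clean_with_audit(raw, "age", implausible_age_to_missing, log)
restored = undo(cleaned, "age")
```

Because the rule is written down before it is applied and is applied uniformly, no case-by-case subjective decision enters the cleaning step, and the log plus the preserved column make the alteration fully reversible.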