Clean your data
Screening process
• Detect errors
~ Missing data
~ Outliers
• Make sure the data meet the assumptions of the analysis
~ Normality
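As a minimal sketch of the error-detection part of screening, the Python snippet below counts missing values in a single variable and flags outliers with the 1.5 × IQR rule. The function names, the choice of None/NaN as the missing-value marker, the IQR rule, and the sample data are all assumptions for illustration; the slides do not prescribe a particular outlier criterion.

```python
import math
import statistics


def is_missing(value):
    """Treat None and NaN as missing (an assumption about how gaps are coded)."""
    return value is None or (isinstance(value, float) and math.isnan(value))


def screen_variable(values, k=1.5):
    """Count missing values and flag observations outside the k * IQR fences."""
    observed = [v for v in values if not is_missing(v)]
    n_missing = len(values) - len(observed)

    # Quartiles of the observed values (statistics.quantiles default method).
    q1, _median, q3 = statistics.quantiles(observed, n=4)
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr

    outliers = [v for v in observed if v < lower or v > upper]
    return {
        "n": len(values),
        "n_missing": n_missing,
        "lower_fence": lower,
        "upper_fence": upper,
        "outliers": outliers,
    }


# Hypothetical reaction times in seconds; 9.80 looks like a data-entry error.
reaction_times = [0.42, 0.51, None, 0.47, 0.39, 9.80, 0.55, float("nan"), 0.44, 0.50]
report = screen_variable(reaction_times)
print("missing:", report["n_missing"], "outliers:", report["outliers"])
```

A flagged value is a candidate for inspection, not automatically an error; whether to correct, keep, or drop it is a judgment call that depends on the study.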
Two Types of Screening
1. Preliminary data screening
~ Screen one variable at a time on the entire data set before any analysis
~ Today’s focus
2. In conjunction with statistical analysis
~ Dependent on analysis being performed
Steps
1. Check for missing data
2. Check for normality
3. Remove outliers
4. Check for normality again
5. Transform data
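The five steps can be sketched for one variable as follows. This is only an illustration under stated assumptions: skewness stands in for a fuller normality check (histograms, Q-Q plots, or a formal test), outliers are defined as values more than 3 standard deviations from the mean, and the transformation is a natural log, which requires strictly positive values. The function names and the data are hypothetical.

```python
import math
import statistics


def skewness(values):
    """Population skewness (third standardized moment); near 0 for symmetric data."""
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)
    return sum(((v - mean) / sd) ** 3 for v in values) / len(values)


def screen(values, z_cutoff=3.0):
    # Step 1: check for missing data (None marks a missing value here).
    observed = [v for v in values if v is not None]
    n_missing = len(values) - len(observed)

    # Step 2: check for normality (skewness as a rough numeric indicator).
    skew_before = skewness(observed)

    # Step 3: remove outliers beyond z_cutoff sample standard deviations.
    mean, sd = statistics.fmean(observed), statistics.stdev(observed)
    kept = [v for v in observed if abs(v - mean) / sd <= z_cutoff]

    # Step 4: check for normality again.
    skew_after = skewness(kept)

    # Step 5: transform the data (natural log; values must be > 0).
    transformed = [math.log(v) for v in kept]

    return {
        "n_missing": n_missing,
        "n_removed": len(observed) - len(kept),
        "skew_before": skew_before,
        "skew_after": skew_after,
        "skew_transformed": skewness(transformed),
        "transformed": transformed,
    }


# Hypothetical right-skewed measurements with two missing values and one extreme score.
scores = [1.2, 1.5, 1.7, 1.9, 2.0, None, 2.2, 2.4, 2.6, 2.9, 3.1, 3.4,
          3.8, 4.3, None, 4.9, 5.6, 6.5, 7.8, 9.5, 12.0, 95.0]
result = screen(scores)
for key in ("n_missing", "n_removed", "skew_before", "skew_after", "skew_transformed"):
    print(key, round(result[key], 2))
```

Note that a z-score cutoff of 3 cannot flag anything in very small samples, because the largest possible z-score is bounded by the sample size; with few observations a different criterion is needed.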
Keep in mind:
• Run these screening steps on each dependent variable before analyzing the data
• Keep transformations consistent across all dependent variables
• Although transformed data may look better behaved, results on the transformed scale can be difficult to interpret
• Run your analysis both with and without the transformation and compare the results
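One way to act on the last point is to run the same analysis twice and put the results side by side. The sketch below uses Welch's t statistic for two groups as a stand-in for "your analysis" and a natural log as the transformation; both choices, the function name `welch_t`, and the data are assumptions made for illustration.

```python
import math
import statistics


def welch_t(group_1, group_2):
    """Welch's t statistic for a difference in means (unequal variances allowed)."""
    mean_diff = statistics.fmean(group_1) - statistics.fmean(group_2)
    standard_error = math.sqrt(
        statistics.variance(group_1) / len(group_1)
        + statistics.variance(group_2) / len(group_2)
    )
    return mean_diff / standard_error


# Hypothetical right-skewed measurements for two groups (all values > 0).
group_a = [1.1, 1.4, 1.8, 2.3, 2.9, 3.7, 4.8, 21.0]
group_b = [2.0, 2.6, 3.3, 4.1, 5.2, 6.6, 8.4, 10.5]

# Same analysis on the raw scale and on the log scale.
t_raw = welch_t(group_a, group_b)
t_log = welch_t([math.log(v) for v in group_a], [math.log(v) for v in group_b])

print(f"t (raw data):        {t_raw:.2f}")
print(f"t (log-transformed): {t_log:.2f}")
```

If the two versions lead to the same conclusion, reporting the untransformed analysis is usually easier to interpret; if they disagree, the difference itself is worth reporting and explaining.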
Great resource:
Mickey, R. M., Dunn, O. J., and Clark, V. A. (2004). Applied statistics: analysis of variance and regression, 3rd Edition. John Wiley & Sons, Inc.
Chapter 1: Data Screening