Cleaning and Organizing Environmental Data
Lead image: Clean Data by Gene Stroman from the Noun Project, CC BY
After you’ve collected environmental data from a sensor, monitor, or other piece of equipment, one of the next steps is to organize and “clean” it!
Cleaning includes making sure the dataset is complete and consistent. Organizing the data into a table in a meaningful way gets it ready for making charts, graphs, and other visualizations. Below are some resources on cleaning data, including making tables of tidy data.
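As a rough illustration of what checking for "complete and consistent" data can mean in practice, here is a minimal Python sketch that uses only the standard library. The sample data, the column names (sensor_id, date, time, pm25_ug_m3), and the check_rows helper are all made up for this example; a real sensor export will look different, and other checks may matter more for your dataset.

```python
import csv
import io
import re

# Hypothetical sensor export: column names and readings are invented for illustration.
RAW_CSV = """sensor_id,date,time,pm25_ug_m3
A1,2021-06-01,09:00,12.4
A1,2021-06-01,09:01,
A1,2021-06-01,09:02,11.9
A1,2021-06-01,09:02,11.9
A1,06/01/2021,09:03,12.1
"""

ISO_DATE = re.compile(r"\d{4}-\d{2}-\d{2}")


def check_rows(csv_text):
    """Return three lists of 1-based data row numbers: missing, duplicates, bad_dates."""
    missing, duplicates, bad_dates = [], [], []
    seen = set()
    reader = csv.DictReader(io.StringIO(csv_text))
    for number, row in enumerate(reader, start=1):
        # Completeness: every named column should hold a value.
        if any(not (row.get(name) or "").strip() for name in reader.fieldnames):
            missing.append(number)
        # Consistency: all dates should share one format (YYYY-MM-DD here).
        if not ISO_DATE.fullmatch(row.get("date") or ""):
            bad_dates.append(number)
        # Consistency: one sensor should report only once per timestamp.
        key = (row.get("sensor_id"), row.get("date"), row.get("time"))
        if key in seen:
            duplicates.append(number)
        seen.add(key)
    return missing, duplicates, bad_dates


missing, duplicates, bad_dates = check_rows(RAW_CSV)
print("rows with empty cells:", missing)
print("repeated sensor/date/time rows:", duplicates)
print("rows with an unexpected date format:", bad_dates)
```

With the sample data above, the sketch flags data row 2 (empty measurement), row 4 (a repeat of row 3), and row 5 (a date written in a different format). It only reports problems; deciding whether to fix, fill, or drop a flagged row is left to you.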
Making tables of tidy data
Images: Illustrations from the Openscapes blog “Tidy Data for reproducibility, efficiency, and collaboration” by Julia Lowndes and Allison Horst, CC BY
An example of “tidy data” from an air quality sensor might look like this:
Each variable forms a column: sensor ID number, date, time, and the air quality measurement of particulate matter are individual variables. Each variable gets its own column in the table. The column header at the top lists the variable name and its units of measurement.
Each observation forms a row: this sensor took an air quality measurement every minute. Each measurement gets its own row in the table.
Each cell is a single measurement: each block in the table shows one piece of data, such as one time or one PM measurement.
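The three rules above can be sketched in a few lines of Python. This example assumes a hypothetical "wide" layout, where each minute's reading sits in its own column, and reshapes it so that each variable is a column and each measurement is a row. The sensor IDs, readings, the to_tidy helper, and the pm25_ug_m3 column name (PM2.5 in micrograms per cubic meter) are assumptions for illustration, not values from a real sensor.

```python
# Hypothetical "wide" layout: one row per sensor, one column per minute.
# All IDs and readings below are invented for illustration.
wide = [
    {"sensor_id": "A1", "date": "2021-06-01", "09:00": 12.4, "09:01": 12.0},
    {"sensor_id": "B2", "date": "2021-06-01", "09:00": 8.7, "09:01": 9.1},
]


def to_tidy(wide_rows, id_columns=("sensor_id", "date")):
    """Turn each time column into its own row: one observation per row."""
    tidy = []
    for row in wide_rows:
        for column, value in row.items():
            if column in id_columns:
                continue  # identifying columns are copied, not reshaped
            record = {name: row[name] for name in id_columns}
            record["time"] = column        # the old column header becomes a value
            record["pm25_ug_m3"] = value   # one measurement per cell, units in the header
            tidy.append(record)
    return tidy


tidy_rows = to_tidy(wide)

# Print the tidy table with a header row naming each variable.
header = ["sensor_id", "date", "time", "pm25_ug_m3"]
print(", ".join(header))
for record in tidy_rows:
    print(", ".join(str(record[name]) for name in header))
```

Two wide rows with two time columns each become four tidy rows. Putting the units in the column header follows the convention described above, so the cells themselves hold only numbers.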
Questions on organizing and cleaning data
Questions tagged with question:data-cleaning will appear here
| What are ways to make dense CSV data more readable? | @warren | over 3 years ago | 1 | 7 |
| What are best practices and tools to help clean up data sets? | @stevie | over 3 years ago | 0 | 24 |
Activity posts tagged with activity:data-cleaning will appear here
| Data Cleaning with OpenRefine | - | - | @fongvania | - | - | 0 replications |
More resources on organizing and cleaning data
- Formatting Data Tables in Spreadsheets and OpenRefine for Data Cleaning: guidance and exercises from a workshop session by Data Carpentry.
- “Clean Up Messy Data,” chapter 4 from the open-access web edition of Hands-On Data Visualization: Interactive Storytelling from Spreadsheets to Code, by Jack Dougherty and Ilya Ilyankou.
- Wickham, H. 2014. Tidy Data. Journal of Statistical Software, 59(10), 1–23. LINK to paper.