On Thursday, December 3rd at 7 pm UTC, as part of the Why R? Webinar series, we have the honour to host Claus Ekstrøm and Anne Helby Petersen from the Department of Public Health at the University of Copenhagen. They will present a talk about reporteR (formerly known as dataMaid), an R package that generates friendly data overview reports for less R-savvy collaborators.
Join us!
Webinar
Check out the other events in this webinar series. To watch previous episodes, visit the WhyR YouTube channel and make sure to subscribe!
Looking for more R news? Subscribe to our newsletter and stay updated on our events and the most relevant news about our beloved open source programming language.
And if you enjoy the content and would like to offer support, donate to the WhyR foundation. We are a volunteer-run non-profit organisation and appreciate every contribution that helps us continue to fulfil our mission.
Speakers
- Claus Ekstrøm is a professor of biostatistics at the University of Copenhagen, Denmark. He is the creator of, or a contributor to, a number of R packages (reporteR, MESS, MethComp, SuperRanker) and is the author of the book “The R Primer”. He has previously given R tutorials at useR 2016, eRum 2018, and the ASA's Conference on Statistical Practice 2018, and won the C. Oswald George prize from Teaching Statistics in 2014.
- Anne Helby Petersen is a PhD student in biostatistics at the University of Copenhagen, Denmark. She is the primary author of several R packages, including reporteR. She has taught statistics and R in numerous courses at the University of Copenhagen with students coming from a wide range of backgrounds, including science, medicine and mathematics.
Talk description
Clean up your data screening process with reporteR
Data cleaning and data validation are the first steps in practically any data analysis, as the validity of the conclusions from the analysis hinges on the quality of the input data.
Mistakes in the data can arise for any number of reasons, including erroneous codings, malfunctioning measurement equipment, and inconsistent data generation manuals. Consequently, it is essential to enable topic experts who are knowledgeable about the context and data collection procedure to partake in the data quality assessment since they will be better at identifying potential problems in the data. However, they may not have the technical skills to work with the data themselves.
The reporteR package (formerly known as dataMaid) makes it easy to produce a document that less R-savvy collaborators can read, understand and use to decide “do these data look right?”, and that also records which potential errors were considered. Both help ensure that subsequent data science is reproducible and that the data are documented at all stages of the quality assessment process.
The package includes both very user-friendly one-liner commands that auto-generate data overview reports and a highly customizable suite of data validation and documentation tools that can be moulded to fit most data validation needs. And, perhaps most importantly, it was specifically built to make sure that documentation and validation go hand in hand, so we can clean up any unstructured, messy data cleaning process.
Sponsor
This event is part of a series sponsored by Jumping Rivers. For more information, check out the JR and WhyR partnership announcement.
Jumping Rivers is an advanced analytics company whose passion is data and machine learning. Our mission is to help clients move from data storage to intelligent data insights leveraging training and setup for data operations with world-leading experts in R and Python.
We offer courses in analytics, data visualisation and programming languages. From individuals to teams, we have what is needed to improve your skills.
Our courses range from introductions to R and Python to advanced statistical modelling.
Check out the course calendar for 2021.
Questions? Contact us directly.