ID529: Data Management and Analytic Workflows in R

Welcome!

Details:

  • Find the course on my.harvard.edu
  • Course Hours: 1:30-5:30 PM
  • Classroom: Our main classroom will be Kresge G2, but Friday 1:30-2:30, we’ll be in FXB G12.
  • Course Dates:
    • Monday January 8th - Friday January 12th,
    • Tuesday January 16th - Friday January 19th, 2024
  • Office Hours: 11:30 AM – 12:30 PM on Wednesday January 10th, Tuesday January 16th, Thursday January 18th
  • Limit 60 students, priority for Population Health Science (PHS) students

Course Description

Data Management and Analytic Workflows in R will introduce students to R programming and modern data management and analysis workflows applied to examples from population health science. Throughout, we will emphasize reproducibility, open science, data visualization, and dynamic document generation. Specific skills learned will include the use of the RStudio integrated development environment, tidy data management practices/workflows, how to get help in programming, and how to use GitHub to track changes in code, disseminate professional work, and integrate feedback. Coursework will consist of lectures, in-class group work, homework, peer assessment, and time for discussion. This course complements graduate-level courses in statistics and quantitative research methods by helping students develop practical skills for conducting independent research incorporating modern data science principles. Students completing this course will have a solid foundation in data management and data communication, enabling them to handle complex tasks in research and professional work.

Student Testimonials

Students were very happy with how the class went last winter! Here are some student testimonials, shared with the students’ permission:

“I really enjoyed the whole learning experience in this course.”

“Very informative and useful. As a someone who has his first exposure to R, I learned a lot.”

“The teaching team were very supportive and very promptly acted on feedback.”

“It was wonderful! Totally friendly to R beginners. And got a lot positive feedback and encouragement from the teaching team! Shout out to their efforts!”

“Slides that are managed so well! Unparellel instructional team! You are so friendly and patient! I really love that homeworks are managed through Github!”

“I loved this class!! So much was covered but it didn’t feel overwhelming at the same time because the expectation was that we all came in with different levels of experience with R and that these are resources we are introduced to and can always come back to.”

“This course has been excellent! It was exactly what I was looking for - I wanted to kind of catch up to my peers who have had experience in R and learn best practices. R feels a lot less intimidating now, and I know where to look for help. Thank you!”

“I think this course was great. I am happy that all levels of R were welcome in the course. I felt like I could just do beginner level work and still get a good grade.”

“Extremely well. I think it will be the most recommended course for whoever wants to gain skills in data management and analysis”

And lots more 🙂

Instructional Team

Christian Testa
1st Year PhD Student
Department of Biostatistics

GitHub
Website
Mastodon
Google Scholar

Dean Marengi
3rd Year PhD Student
Department of Environmental Health

Google Scholar

Jarvis Chen
Senior Lecturer
Department of Social and Behavioral Sciences

https://www.hsph.harvard.edu/profile/jarvischen/
Google Scholar

Teaching Alumnus

  • Amanda Hernandez was an amazing master's student in Environmental Health who helped us develop a lot of material and helped teach the course in Winter Session 2023.
