R for reproducible scientific analysis

Introduction to R for social scientists using the Gapminder dataset

This two-day intensive workshop introduces modern computational techniques for data management, analysis, and visualization, with an emphasis on the R programming language. It assumes no prior programming knowledge. By the end of the workshop, participants will be able to organize and clean data efficiently, manipulate data frames, estimate and work with statistical models, produce a variety of publication-quality plots, and compose dynamic documents that integrate writing, code, and code output.


Data Management

  1. Spreadsheets
  2. OpenRefine


  1. Introduction to R and RStudio
  2. Project Management
  3. Data Types and Subsetting
  4. Plotting
  5. Data.frame Manipulation
  6. Tidy Data
  7. Statistical Modeling
  8. Dynamic Documents
  9. Capstone Project

Bonus Topics

These topics are not taught during the workshop but can be worked through independently.

  1. Multi-table Joins
  2. Writing Functions