OpenRefine Lessons for Digital Humanities: Reference

Key Points

Introduction to OpenRefine
  • OpenRefine is 'a tool for working with messy data'
  • OpenRefine works best with data in a simple tabular format
  • OpenRefine can help you split data up into more granular parts
  • OpenRefine can help you match local data up to other data sets
  • OpenRefine can help you enhance a data set with data from other sources
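As a rough illustration of what "splitting data into more granular parts" means, the sketch below turns one row with a multi-valued cell into one row per value. It is written in Python rather than inside OpenRefine, and the function name, the column name "authors" and the ";" separator are assumptions made up for this example.

```python
def split_multivalued(rows, column, separator=";"):
    """Return one row per value found in a multi-valued cell.

    `rows` is a list of dicts (one dict per table row). The names used
    here are hypothetical and only serve the illustration.
    """
    result = []
    for row in rows:
        for part in row[column].split(separator):
            part = part.strip()
            if part:  # skip empty fragments such as a trailing separator
                result.append({**row, column: part})
    return result


rows = [{"id": 1, "authors": "Dickens, Charles; Collins, Wilkie"}]
for row in split_multivalued(rows, "authors"):
    print(row)
```

Each output row keeps the other columns unchanged, so the split values stay linked to the record they came from.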
Importing data into OpenRefine
  • Use the 'Create Project' option to import data
  • You can control how data is imported using the options on the import screen
Basic OpenRefine functions I: Working with columns, sorting, faceting, filtering and clustering
  • You can reorder, rename and remove columns in OpenRefine
  • You can use facets and filters to explore your data
  • You can use facets and filters to work with a subset of data in OpenRefine
  • You can easily correct common data issues using facets and clustering
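To give a feel for how clustering can find variant spellings of the same value, here is a simplified Python sketch of a key-collision approach in the spirit of OpenRefine's "fingerprint" method. It is an approximation written for illustration, not OpenRefine's own code, and the function names are hypothetical.

```python
import string
import unicodedata
from collections import defaultdict


def fingerprint(value: str) -> str:
    """Reduce a value to a normalised key so that variants collide."""
    # Fold accented characters to their closest ASCII form.
    text = unicodedata.normalize("NFKD", value)
    text = text.encode("ascii", "ignore").decode("ascii")
    # Trim, lowercase, and drop punctuation.
    text = text.strip().lower()
    text = text.translate(str.maketrans("", "", string.punctuation))
    # Split on whitespace, remove duplicate tokens, sort, and rejoin.
    return " ".join(sorted(set(text.split())))


def cluster(values):
    """Group values that share a fingerprint; keep groups of two or more."""
    groups = defaultdict(set)
    for value in values:
        groups[fingerprint(value)].add(value)
    return [sorted(group) for group in groups.values() if len(group) > 1]


names = ["Dickens, Charles", "Charles Dickens", "charles  dickens ", "Austen, Jane"]
print(cluster(names))
```

Values in the same group are candidates for merging; a person still decides which spelling to keep, as in OpenRefine's clustering dialog.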
Basic OpenRefine functions II:
  • You can alter data in OpenRefine based on specific instructions
  • You can expand the data editing functions that are built into OpenRefine by building your own
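The idea of altering data "based on specific instructions" can be sketched as applying one small function to every cell in a column. Inside OpenRefine this is done with an expression in the transform dialog (typically written in GREL); the Python below only illustrates the principle, and the function and column names are invented for the example.

```python
def normalize_title(value: str) -> str:
    """Collapse runs of whitespace, trim the ends, and title-case the text."""
    collapsed = " ".join(value.split())
    return collapsed.title()


def transform_column(rows, column, func):
    """Return new rows with `func` applied to the given column of each row."""
    return [{**row, column: func(row[column])} for row in rows]


rows = [{"id": 1, "title": "  on the   origin of species "}]
print(transform_column(rows, "title", normalize_title))
```

Because the transformation is an ordinary function, several simple steps can be combined into one custom instruction and reused across columns.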
Advanced OpenRefine functions
  • OpenRefine can look up custom URLs to fetch data based on what's in an OpenRefine project
  • Such API calls can be custom-built, or you can use existing reconciliation services to enrich data
  • OpenRefine can be further enhanced by installing extensions
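Fetching data by URL boils down to two steps: build a URL from a cell value, then pull the wanted field out of the response. The Python sketch below shows both without making a network call. The endpoint https://example.org/api/lookup, its "q" and "format" parameters, and the response shape are all hypothetical; a real service will define its own.

```python
import json
from urllib.parse import urlencode

# Hypothetical endpoint, used only for illustration.
BASE_URL = "https://example.org/api/lookup"


def build_lookup_url(cell_value: str) -> str:
    """Build a query URL from a cell value, escaping special characters."""
    query = urlencode({"q": cell_value.strip(), "format": "json"})
    return BASE_URL + "?" + query


def extract_first_id(response_text: str):
    """Return the id of the first result, or None if there are no results.

    Assumes a made-up response shape: {"results": [{"id": ...}, ...]}.
    """
    data = json.loads(response_text)
    results = data.get("results", [])
    return results[0]["id"] if results else None


print(build_lookup_url(" Charles Dickens "))
print(extract_first_id('{"results": [{"id": "abc123"}]}'))
```

Escaping the cell value matters because names often contain spaces, commas or accented characters that are not valid in a raw URL.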