Library Carpentry OpenRefine

Key Points

Introduction to OpenRefine
  • OpenRefine is ‘a tool for working with messy data’

  • OpenRefine works best with data in a simple tabular format

  • OpenRefine can help you split data up into more granular parts

  • OpenRefine can help you match local data up to other data sets

  • OpenRefine can help you enhance a data set with data from other sources
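
As a rough illustration of what splitting data into more granular parts means, the following Python sketch breaks a multi-valued cell into one row per value. It is not OpenRefine's own code: the function name `split_multivalued`, the `|` separator, and the sample column names are assumptions made for this example, and unlike OpenRefine (which leaves the other columns blank on the new rows of a record) it copies the other columns onto every new row.

```python
def split_multivalued(rows, column, separator="|"):
    """Return new rows with one value of `column` per row.

    Hypothetical helper for illustration; `rows` is a list of dicts.
    """
    result = []
    for row in rows:
        # Split the cell and drop empty fragments and surrounding spaces.
        parts = [p.strip() for p in str(row[column]).split(separator) if p.strip()]
        # Keep the row even when the cell held no usable value.
        for part in parts or [""]:
            new_row = dict(row)  # copy so the input rows are not modified
            new_row[column] = part
            result.append(new_row)
    return result


if __name__ == "__main__":
    sample = [{"title": "Article A", "authors": "Smith, J|Jones, K"}]
    for r in split_multivalued(sample, "authors"):
        print(r)
```

Each author then sits in its own row, which makes later steps such as faceting by author straightforward.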

Importing data into OpenRefine
  • Use the ‘Create Project’ option to import data

  • You can control how data imports using options on the import screen

Basic OpenRefine Functions I: Working with columns, sorting, faceting, filtering and clustering
  • You can reorder, rename and remove columns in OpenRefine

  • You can use facets and filters to explore your data

  • You can use facets and filters to work with a subset of data in OpenRefine

  • You can easily correct common data issues using facets and clustering
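
To give a feel for how clustering can group variant spellings, here is a minimal Python sketch of a key-collision approach: each value is reduced to a normalised "fingerprint", and values that share a fingerprint are proposed as one cluster. It is a simplified approximation written for this page, not OpenRefine's implementation; the function names `fingerprint` and `cluster` and the sample publisher names are assumptions.

```python
import string
import unicodedata
from collections import defaultdict


def fingerprint(value):
    """Reduce a string to a normalised key (simplified sketch)."""
    # Fold accented characters to their closest ASCII form.
    folded = unicodedata.normalize("NFKD", value).encode("ascii", "ignore").decode("ascii")
    # Trim, lowercase and remove ASCII punctuation.
    cleaned = folded.strip().lower().translate(str.maketrans("", "", string.punctuation))
    # Sort the unique tokens so word order does not matter.
    return " ".join(sorted(set(cleaned.split())))


def cluster(values):
    """Group values whose fingerprints collide; singletons are dropped."""
    groups = defaultdict(set)
    for value in values:
        groups[fingerprint(value)].add(value)
    return [sorted(group) for group in groups.values() if len(group) > 1]


if __name__ == "__main__":
    names = [
        "Oxford University Press",
        "oxford university press.",
        "Press, Oxford University",
        "Wiley",
    ]
    print(cluster(names))
```

The three Oxford variants collide on the same key and are returned together, while "Wiley" has no partner and is left out. A person still decides whether to merge each proposed cluster, as in OpenRefine.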

Basic OpenRefine functions II:
  • You can alter data in OpenRefine based on specific instructions

  • You can expand the data editing functions that are built into OpenRefine by building your own

Advanced OpenRefine functions
  • OpenRefine can look up custom URLs to fetch data based on what’s in an OpenRefine project

  • Such API calls can be custom built, or you can use existing reconciliation services to enrich data

  • OpenRefine can be further enhanced by installing extensions
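
The idea of looking up custom URLs based on project data can be sketched in Python as two steps: build a URL from a cell value, then pull one field out of the JSON that comes back. This is only an outline under stated assumptions: the endpoint `https://api.example.org/lookup`, the `issn` parameter, and the `title` field are placeholders invented for the example, and no network request is made here.

```python
import json
from urllib.parse import urlencode


def build_lookup_url(base_url, cell_value, param="q"):
    """Build a query URL from a cell value (hypothetical helper)."""
    # urlencode escapes spaces and other reserved characters safely.
    return f"{base_url}?{urlencode({param: cell_value})}"


def extract_field(response_text, field):
    """Return one top-level field from a JSON response, or None if absent."""
    data = json.loads(response_text)
    return data.get(field)


if __name__ == "__main__":
    # Placeholder endpoint and parameter name, not a real service.
    url = build_lookup_url("https://api.example.org/lookup", "1234-5678", param="issn")
    print(url)
    # A made-up response body standing in for what a service might return.
    fake_response = '{"title": "Example Journal", "issn": "1234-5678"}'
    print(extract_field(fake_response, "title"))
```

In a real project the response would come from the service itself, and the field to extract depends on that service's documented response format.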