Thing 1: Ready Set Data

Overview

Teaching: 10 min
Exercises: 20 min
Questions
  • What is research data?

  • Where can you find research data in your discipline?

Objectives
  • Learn what research data is.

  • Explore and examine discipline-based research data.

  • Explore the technical aspects of data management and assess your own skills.

Data is the central currency of science, but the nature of scientific data has changed dramatically with the rapid pace of technology.

– Ted Hart (@emhart), et al., Ten Simple Rules for Digital Data Storage. PLoS Comput Biol 12(10): e1005097. doi:10.1371/journal.pcbi.1005097

Access to the computational steps taken to process data and generate findings is as important as access to data themselves.

– Victoria Stodden, et al., Enhancing reproducibility for computational methods. Science, 09 Dec 2016: 1240-1241. doi:10.1126/science.aah6168

Getting started: What is research data?

The concept of research data is complex and fluid. Virtually all types of digital information have the potential to be research data if they are being used as a primary resource for research.

– Research Data Strategy Working Group, Mapping the Data Landscape: Report of the 2011 Canadian Research Data Summit

What “research data” are we talking about?

Research data can be

Research data come in many formats

Let’s take a look at research data

Note: We’ll be using the UCSD Library Digital Collections for this exercise. Feel free to substitute other institutional or discipline-based repositories and collections.

  1. Open up one of these collections:
  2. Write down the specific data formats, the software used, and whether the collection has a code book or data dictionary. If you aren’t familiar with the data formats or software, try a web search to figure out what they are. Capture your findings.
  3. Explore the metadata representing the collection. Besides the title and description, what other elements are described?
  4. Share with the class what you’ve found.
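
If you have downloaded a collection to your own computer, the format inventory in step 2 can be partly automated. The following is a minimal sketch, not part of the original exercise: the function names `summarize_formats` and `find_documentation` and the keyword list are hypothetical, and it assumes the collection sits in a local folder. It counts files by extension and flags file names that hint at a code book or data dictionary. It will not identify the software used, which still takes a manual look.

```python
from collections import Counter
from pathlib import Path


def summarize_formats(folder):
    """Count files in `folder` (recursively) by lowercase file extension."""
    counts = Counter()
    for path in Path(folder).rglob("*"):
        if path.is_file():
            # Files without an extension are grouped under "(none)".
            ext = path.suffix.lower() or "(none)"
            counts[ext] += 1
    return counts


def find_documentation(
    folder, keywords=("codebook", "code_book", "dictionary", "readme")
):
    """Return sorted names of files whose names hint at documentation.

    The keyword list is an assumption; adjust it to the collection at hand.
    """
    hits = []
    for path in Path(folder).rglob("*"):
        if path.is_file() and any(k in path.name.lower() for k in keywords):
            hits.append(path.name)
    return sorted(hits)


if __name__ == "__main__":
    import sys

    # Use the folder given on the command line, or the current directory.
    folder = sys.argv[1] if len(sys.argv) > 1 else "."
    for ext, n in summarize_formats(folder).most_common():
        print(f"{ext}\t{n}")
    print("Possible documentation files:", find_documentation(folder))
```

An extension is only a hint about a file's format, so treat the counts as a starting point for the web searches suggested in step 2.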

Heavy Metals

One interesting data collection in the UCSD Digital Collections is the Heavy Metals in the Ocean Insect, Halobates collection. It is a unique example of roughly 40-year-old computer-printed data that was digitized for sharing and preservation.

Effect of complexity and formats on re-use

How do the complexity and range of data formats affect access and re-use possibilities?

Learn More: Data across research disciplines

  1. Choose one of the 4 specialized data repositories below, or find another data repository of interest, particularly one in a discipline you are unfamiliar with. Spend some time browsing your chosen repository to get a feel for the data available.
  2. Think about how the data here differ from data you are familiar with. Consider, for example, format, size, and access method.
  3. Record your reflections in the class etherpad or through discussion.

Discipline-based data conventions

How could cross-disciplinary research be affected by discipline-specific data conventions? Also, think of one way that cross-disciplinary data access could be facilitated.

Challenge me: Let’s talk tech!

Get the ball rolling by expanding your awareness of the technical aspects of data management and of the rapidly growing community of tech-aware data enthusiasts. What is….

Personal technical data audit

Conduct a personal audit of the technical data skills you have and the skills you want to learn.

Key Points

  • Research data is heterogeneous in form.

  • Research data can be categorized into many types, such as observational, experimental, simulation, derived or compiled, and reference.

  • Research data are often contextualized within communities.

  • Technical data skills are increasingly needed to create, work with, understand and make sense of research data.