Thing 2: Issues in research data management

Overview

Teaching: 0 min
Exercises: 0 min
Questions
  • What are some of the issues we face in managing research data?

Objectives
  • Getting started: Watch a cartoon about what happens when a researcher hasn’t managed their data (at all…) What could possibly go wrong!?

Getting started: Managing data for reuse

Research data is for everyone. Governments and universities all around the US and the world are now encouraging and requiring researchers to better manage their data so others can use it.

Research data might be critical to solving the big questions of our time, but much of it is being lost or poorly managed.

Research data issues

  1. The following cartoon (4:40 min), put together by the New York University Health Sciences Library, is about what happens when a researcher hasn’t managed their data (at all…) What could possibly go wrong!?
    Data Sharing and Management Snafu in 3 Short Acts (YouTube)
  2. As you watch the cartoon, jot down the data management mistakes that interest or appall you.
  3. Now, scan through the dot points in the Data best practices section from the Stanford University Library, which advises researchers on how to manage their data and make it more (re)usable.

Avoiding data disasters

How do you think just ONE of the data disasters depicted in the cartoon could have been avoided?

Learn more: How do you manage “Big Data”?

“Big Data” is a term we’re hearing with increasing frequency. Data management for Big Data brings much complexity: citing dynamic data, software, high-volume compute, storage costs, transfer of petabytes of data, preservation, provenance, and more.

Big Data

Read this post and presentation titled “Big Data: The 5Vs Everyone Must Know”.

Is the concept of the 5Vs useful to support better management and reuse of “Big Data”? If you don’t think the 5Vs are of value, is there another framework or conceptual model which could be useful for exploring data management for big data?

Challenge me: Digital data in eLab Notebooks

Laboratory Notebooks are used by researchers to formally record their research activities. As research has become increasingly digital and collaborative, the utility of traditional hard copy Lab Notebooks has been challenged. Not surprisingly then, eLab Notebooks (ELNs) have emerged as an alternative.

Effective data management for constantly updated data, such as that within ELNs, is a real challenge for projects that wish to publish their data during the project.

eLab Notebooks

  1. Read this short definition of ELNs
  2. Then read this article, “International team of scientists open sources search for malaria cure”, about how an international team of scientists and citizen scientists is using open source ELNs to speed up the search for a cure for malaria.
  3. You can see their open ELNs here. Click on Matthew Todd’s ELN to see what’s in it. Discuss your view on a data management issue, and possible solutions, where data is generated, stored and shared via an open ELN.

File formats and names

A researcher friend in your department gave you her research on a thumb drive and asked you for feedback on making it sharable. She also asked you to be careful because it’s her only copy. When you open the research folder, called research project b, you find the following files.

├── FINAL_rev.18.comments7.corrections9.MORE.05.doc
├── FINAL_rev.6.COMMENTS.doc
├── FINAL_rev.8.comments5.CORRECTIONS.doc
├── data.xls
├── data2.xls
├── data3.dta
├── fig1.jpg
├── fig2.jpg
├── fig3.jpg
├── myresearch-3.do
└── myresearchB.do
  1. What would you suggest to your colleague to help improve the filenames she’s chosen?
  2. What about the file formats? Should she consider different formats?
  3. What data backup strategies would you recommend?
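
As one possible starting point for the filename question, here is a minimal Python sketch of a consistent naming scheme. The pattern used (date, project, short description, zero-padded version) is only one common convention and is not prescribed by this lesson. The function name build_filename, the description "manuscript draft", and the date are hypothetical, chosen purely for illustration.

```python
import re
from datetime import date


def build_filename(project: str, description: str, version: int,
                   ext: str, on: date) -> str:
    """Build a name like 20240305_research-project-b_manuscript-draft_v18.doc.

    Assumed convention (not from the lesson): YYYYMMDD date first so files
    sort chronologically, no spaces, and a zero-padded version number.
    """
    def slug(text: str) -> str:
        # Lowercase, then collapse every run of non-alphanumerics into one hyphen.
        return re.sub(r"[^a-z0-9]+", "-", text.lower()).strip("-")

    clean_ext = ext.lstrip(".").lower()
    return f"{on:%Y%m%d}_{slug(project)}_{slug(description)}_v{version:02d}.{clean_ext}"


# Hypothetical rename of one of the manuscript drafts in the folder above.
print(build_filename("research project b", "manuscript draft", 18, ".doc",
                     date(2024, 3, 5)))
# prints: 20240305_research-project-b_manuscript-draft_v18.doc
```

Names built this way sort by date, avoid spaces, and make the version obvious without words like FINAL. Whether a date or a version number should lead is a judgment call for each project.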

Formatting your data quiz

Take the quiz on formatting your data. Use the Room number QTKGZXF6G to take the quiz. This quiz is adapted from Do-It-Yourself Research Data Management Training Kit for Librarians. It is under the Creative Commons Attribution 4.0 International License.

Organizing data

Work through the Data Carpentry Spreadsheets for Ecology on your own. This lesson will teach you some best practices on organizing and formatting ecology data.

Key Points

  • Research data is critical to solving the big questions of our time.