Thing 10: Sharing sensitive data

Overview

Teaching: min
Exercises: min
Questions
  • How can you share and publish sensitive data?

Objectives
  • Objective 1.

Sharing sensitive data requires careful consideration, but it can be done. Find out how.

  • Getting started: If it’s so sensitive - how can it possibly be shared and published?!
  • Learn more: Who are the “data gatekeepers”?
  • Challenge me: Make me anonymous

Getting started: Sensitive data can be shared!

Two major, familiar categories of sensitive data are human data (e.g. health and personal data, secret or sacred practices) and ecological data (whose release may place vulnerable species at risk).

Given the nature of this type of data, you might expect that it can’t be shared and reused. But in many cases, it can be.

Explore published sensitive data

Explore one of these examples of published sensitive data:

  1. The Pregnancy and Lifestyle dataset shows how sensitive, de-identified data can be safely and openly shared. Click on “Go to Data Provider” to see the actual data.
  2. This one-page story tells how sensitive data from the Australian Longitudinal Study on Women’s Health has been successfully published for almost 20 years.

Share and publish sensitive data

How do you share and publish sensitive data?

  • Scan the ANDS sensitive data webpage.
  • Click on the Sensitive Data Decision Tree to get an overview of issues and solutions.
  • If you have time: follow a couple of the links on the sensitive data page which are of particular interest to you.

Health data survey role play

Imagine you are either a researcher or a participant in a health data survey:

  • Participant: What questions might you first ask the researcher about intended sharing and reuse of the survey data?
  • Researcher: What responses would you need to prepare to anticipate participants’ questions about publishing “their data for all the world to see”?

Learn more: The ethics of sensitive data

Two issues are critical to making sensitive data shareable: how the data is managed throughout its lifecycle, and who is responsible for ensuring it is appropriately managed and shared.

Ethical use of digital data

Open the Guidelines for the Ethical Use of Digital Data in Human Research (produced by Melbourne University and the Carlton Connect Initiative), then:

  1. Start by reading the introduction
  2. Take a closer look at Data governance and custodianship on pp. 15-16
  3. Consider your responses to the questions on p. 16
  4. If you have time: scan through the rest of the comprehensive document.

Consider: What are your thoughts on the role of data gatekeepers, especially for sensitive data? What skills and knowledge do we need to protect sensitive data?

Challenge me: Anonymizing sensitive data

Anonymization is a process that balances making data safe against keeping it useful. When anonymization is done well, the risk of disclosing information that refers to individuals should be negligible.

What to anonymize

  1. Names of people: describe according to significance to the respondent: ‘female / male friend’, ‘mother’, ‘father’, ‘teacher’ etc.
  2. Names of towns/cities/villages: describe according to the significance of the place to the respondent’s life: ‘city she grew up in’, ‘town he moved to’, ‘small neighboring village’.
  3. Name of country: describe according to part of world or perhaps ‘country regularly visited by backpackers’, if relevant.
  4. Name of school/college: ‘high school she attended’ etc.
  5. Name of workplace: describe more generally e.g. ‘fast food restaurant’, ‘pub’, ‘shop’, ‘factory’ etc. When it comes to descriptions of departments in companies or particular sections of a workplace, use ‘department’ or ‘section’.
  6. Nationalities: in close-knit communities, talk of a different and specific nationality may easily disclose identity, e.g. ‘the Americans who took over the pub’. Suggest these be changed to ‘people’.
  7. School subjects: to be changed at the discretion of the anonymizer, possibly in consultation with respondents, especially young persons.
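For interview transcripts and other qualitative text, the replacements in the list above can be applied consistently with a small script. The following is a minimal sketch only, not a complete anonymization tool: the function name `anonymize_text`, the mapping, and all names and places in it are invented for illustration. A person who knows the data still has to decide which terms are identifying and review the result.

```python
import re

# Hypothetical mapping from identifying terms to generic descriptions,
# following the guidance in the list above. All names and places are
# invented for illustration.
REPLACEMENTS = {
    "Maria": "[female friend]",
    "Springfield": "[city she grew up in]",
    "Burger Barn": "[fast food restaurant]",
    "the Americans": "[people]",
}


def anonymize_text(text, replacements):
    """Replace each identifying term with its generic description."""
    # Try longer terms first so a long phrase is never partly replaced
    # by a shorter term it happens to contain.
    terms = sorted(replacements, key=len, reverse=True)
    # One combined pattern means the text is scanned once, so inserted
    # descriptions are never themselves replaced.
    pattern = re.compile("|".join(r"\b" + re.escape(t) + r"\b" for t in terms))
    return pattern.sub(lambda match: replacements[match.group(0)], text)


transcript = "Maria left Springfield to work at Burger Barn."
anonymized = anonymize_text(transcript, REPLACEMENTS)
print(anonymized)
```

Matching is case-sensitive and limited to the exact terms in the mapping, so misspellings, nicknames, and indirect references will be missed and need a manual read-through.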

Techniques and tools for anonymization

  1. Consider the different techniques required to de-identify quantitative and qualitative data. The UK Data Service has information on anonymization of both.
  2. Explore some tools and resources for practical advice on how to de-identify data.
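For quantitative data, two common de-identification steps are removing direct identifiers and coarsening values that could identify someone in combination, such as exact age or postcode. The sketch below illustrates both under stated assumptions: the records, field names, band width, and helper names (`age_band`, `deidentify`) are all hypothetical and are not taken from any of the resources mentioned here.

```python
# Hypothetical survey records; every value here is invented.
records = [
    {"name": "A. Example", "age": 34, "postcode": "3052", "score": 7},
    {"name": "B. Example", "age": 61, "postcode": "3121", "score": 4},
]

# Assumed for this sketch: fields that identify a person on their own.
DIRECT_IDENTIFIERS = {"name"}


def age_band(age, width=10):
    """Generalize an exact age into a band such as '30-39'."""
    low = (age // width) * width
    return f"{low}-{low + width - 1}"


def deidentify(record):
    """Return a copy with identifiers dropped and other fields coarsened."""
    out = {k: v for k, v in record.items() if k not in DIRECT_IDENTIFIERS}
    out["age"] = age_band(out["age"])
    # Keep only the first two characters of the postcode.
    out["postcode"] = out["postcode"][:2] + "**"
    return out


deidentified = [deidentify(r) for r in records]
print(deidentified[0])
```

Steps like these reduce risk but do not guarantee anonymity on their own: a rare combination of the remaining values can still single out an individual, so the output should be checked before publication.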

Tools to help anonymize

What other tools or resources would you recommend to help a researcher de-identify or anonymize their data?

Key Points

  • First key point.