Minimising risk - practical approaches
Several approaches can help you maximise the use of data about people while protecting their rights and protecting individuals from harm:
1. Data minimisation
Do you really need to collect the data? If you don’t need personal details or commercial information, don’t collect them. This can help you avoid the need to navigate data protection laws and make it more likely that you will be able to share the data with others. Data minimisation can also help reduce ethical issues with the collection, use or sharing of data.
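As a minimal sketch of the idea above: ideally unneeded fields are never collected at all, but where a source system already supplies them, they can be dropped at the point of intake. The field names and the `minimise` helper below are hypothetical, chosen only for illustration.

```python
# Hypothetical list of the only fields this project actually needs.
REQUIRED_FIELDS = {"visit_date", "clinic_id", "treatment_code"}


def minimise(record):
    """Return a copy of the record containing only the required fields."""
    return {key: value for key, value in record.items() if key in REQUIRED_FIELDS}


# Made-up example record; "name" and "phone" are not needed, so they are dropped.
raw = {
    "name": "Example Person",
    "phone": "00000 000000",
    "visit_date": "2024-03-01",
    "clinic_id": "C-17",
    "treatment_code": "T42",
}
minimal = minimise(raw)
```

Using an explicit list of fields to keep, rather than a list of fields to remove, means that any new field added upstream is excluded by default.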
2. Validate data input wherever possible
A key challenge with data collection is that organisations often input data using different approaches or formats, making it difficult to aggregate the data in a way that allows it to be reused. This is especially the case with open or 'free text' fields, that is, input fields where those entering the data can write long notes. Free text can also increase the risk of exposing personal data when these fields are later reused or shared. Some use of free text fields is inevitable: for example, doctors need to be able to take notes on patient visits and add them to the patient's electronic health record. However, wherever possible:
Replace free text fields with validated input fields, such as date fields or drop-down lists. For example, data inputters can select a postcode or zipcode from a list, rather than typing it manually. This helps ensure that only the necessary data is collected and that it is immediately formatted in a standardised way.
Encourage data inputters to use a template approach when completing free text fields, so that there is some consistency. For example, a doctor completing a free text field could be encouraged to follow a specific format by describing the patient's current treatment regime, followed by any concerns/questions the patient raised, and finishing with a summary of the treatment provided and next steps.
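The validated-field approach described above can be sketched in code. This is an illustration only: the `validate_entry` function, the field names and the list of allowed postcode areas are all hypothetical, and a real list would come from an authoritative source.

```python
from datetime import date

# Hypothetical drop-down options, for illustration only.
ALLOWED_POSTCODE_AREAS = {"AB", "CD", "EF"}


def validate_entry(visit_date, postcode_area):
    """Validate one form submission and return it in a standard format.

    Raises ValueError if either field is not acceptable.
    """
    # Accept dates only in ISO format (YYYY-MM-DD); "01/03/24" raises ValueError.
    parsed_date = date.fromisoformat(visit_date)

    # Normalise spacing and case, then check against the fixed list.
    area = postcode_area.strip().upper()
    if area not in ALLOWED_POSTCODE_AREAS:
        raise ValueError(f"Unknown postcode area: {postcode_area!r}")

    return {"visit_date": parsed_date.isoformat(), "postcode_area": area}


clean = validate_entry("2024-03-01", " ab ")
```

Rejecting invalid input at the point of entry, instead of cleaning it later, is what keeps data from different organisations in a single consistent format.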
3. Anonymisation and suppression
It is possible to process data into a modified form that can be shared or made open while significantly reducing the possibility of anyone recovering sensitive or personal information from it. For sensitive data in general, this process is called suppression; for personal data it is called anonymisation. The UK Information Commissioner's Office recognises the benefits of anonymisation in its code of practice, stating that: ‘The anonymisation of personal data is possible and can help service society’s information needs in a privacy-friendly way’.
More detail, including a worked example, is included in the ODI’s An introduction to managing the risk of re-identification.
Also refer to the UK Anonymisation Network’s Anonymisation decision making framework (ADF), which provides step-by-step guidance, as well as the step-by-step interactive guide to the ADF and the Risk, harms and benefits checklist tool.
Finally, if you need expert input, there is a register of actors that can help with anonymisation.
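To give a flavour of what processing data into a modified form can involve, the sketch below combines two basic techniques: generalising exact ages into bands, and suppressing counts that are too small to publish safely. It is not taken from the resources listed above, and the threshold of 5, the ten-year bands and the function names are assumptions made for illustration. Steps like these do not make a dataset anonymous on their own; the re-identification risk still needs to be assessed.

```python
from collections import Counter

# Hypothetical threshold; choose a value based on your own risk assessment.
SUPPRESSION_THRESHOLD = 5


def age_band(age):
    """Generalise an exact age into a ten-year band, e.g. 34 becomes '30-39'."""
    lower = (age // 10) * 10
    return f"{lower}-{lower + 9}"


def banded_counts(ages):
    """Count people per age band, replacing small counts with None (suppressed)."""
    counts = Counter(age_band(age) for age in ages)
    return {
        band: (count if count >= SUPPRESSION_THRESHOLD else None)
        for band, count in counts.items()
    }


# Made-up ages: six people in their thirties, two in their seventies.
table = banded_counts([31, 32, 33, 34, 35, 36, 71, 72])
```

In this example the band with only two people is reported as suppressed, because a very small group is easier to link back to individuals.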
4. Use synthetic data
Synthetic data is created by an automated process and contains many of the statistical patterns of an original dataset. Synthetic data is sometimes used as a way to release data that has no personal information in it, even if it originally contained lots of information that could identify people. While there are some challenges in using synthetic data in healthcare settings, there is growing recognition of its potential.
This hands-on Python tutorial demonstrates how to create a synthetic dataset.
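For a sense of the basic idea, here is a deliberately simple sketch that is not the method used in the tutorial mentioned above. It builds synthetic records by sampling each column independently from the values seen in the original data. This keeps the per-column distributions roughly intact but discards relationships between columns, and it is not a privacy guarantee: rare values in the original can still appear in the output. The `synthesise` function and the example records are hypothetical.

```python
import random


def synthesise(records, n, seed=0):
    """Create n synthetic records by sampling each column independently.

    Assumes every record is a dict with the same keys.
    """
    rng = random.Random(seed)  # fixed seed so the output is repeatable
    columns = list(records[0].keys())
    # For each column, collect the pool of values observed in the original data.
    pools = {col: [record[col] for record in records] for col in columns}
    return [{col: rng.choice(pools[col]) for col in columns} for _ in range(n)]


# Made-up original records, for illustration only.
original = [
    {"age_band": "30-39", "treatment_code": "T42"},
    {"age_band": "40-49", "treatment_code": "T17"},
    {"age_band": "30-39", "treatment_code": "T17"},
]
synthetic = synthesise(original, n=10)
```

Dedicated synthetic data tools model the relationships between columns as well, which is what makes their output more useful for analysis than this sketch.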
Specialist guidance
Assessing and mitigating risks when sharing personal data may require specialist input. You may need to consult colleagues, partners or legal specialists, including:
Scientific or policy specialists, or ethicists, who can help you consider how the data is to be collected, used or shared, as well as the validity of the data exchanged.
Data and information specialists, who understand the technical aspects of the data to be shared and how the data may be integrated with other data sources in ways that could raise data protection or legal issues not inherent in the data alone.
Lawyers for legal support.
In all cases, this guidance is not legal advice. If you are uncertain, you should seek support from legal professionals.
Last updated