Emerging technologies to support health data management
Digital technologies can support a range of data management activities, including data storage, standardisation, interoperability and privacy. The five technologies described below are increasingly being used to help manage health data.
Data repositories are large online data stores that enable stakeholders to upload, share and access data. They preserve data, help track its provenance, and help standardise the way data is described (known as metadata). Repositories can provide flexible access permissions, enabling data to be shared across a spectrum: from internal use only, to sharing with partners and specific groups, to public or open access. Examples of data repositories used in the health sector include:
: A catalogue of available health data repositories used for general data uploads or for specific subjects and geographies.
: A widely accepted data repository for all types of data including qualitative social science data, as well as biomedical data. It is recognised by PLOS One journals.
: Another widely accepted data repository, commonly used by scientific journals to hold the data attached to published research studies.
: A global research data sharing platform specialising in clinical trials data. It allows the results of multiple studies to be combined into datasets large enough to support analyses with meaningful conclusions.
A growing number of software systems help enterprises, health departments and others to govern data. These technologies help implement the policies and processes agreed in a data governance framework, using methods such as labelling and tagging to indicate permissions for reuse (for example, consents for personal data), standards that capture key information about each dataset to inform how it is managed, and data dictionaries that ensure common terms. Examples of these tools include:
: This blog post describes some open source projects released by leading tech organisations that are sharing their internal data governance infrastructure and tooling. Select the GitHub repository link provided for each example.
: A non-profit organisation specialising in working with the pharmaceutical industry to manage and share research data.
: The European Commission's experimental tooling platform, with privacy and data governance pilot tools available.
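The tagging and data dictionary methods described above can be sketched in a few lines of Python. This is a minimal illustration only, not the approach of any tool listed here: the `DATA_DICTIONARY` terms, the `consent:<purpose>` tag format and the `DatasetRecord` class are all hypothetical names chosen for the example.

```python
from dataclasses import dataclass, field
from typing import List, Set

# Hypothetical data dictionary: agreed column name -> plain-language meaning.
DATA_DICTIONARY = {
    "patient_id": "Pseudonymised patient identifier",
    "birth_year": "Four-digit year of birth",
    "diagnosis_code": "Coded diagnosis",
}


@dataclass
class DatasetRecord:
    """Governance metadata held about one dataset (assumed structure)."""
    name: str
    columns: List[str]
    tags: Set[str] = field(default_factory=set)


def undefined_terms(record: DatasetRecord) -> List[str]:
    """Return the columns that are not defined in the shared data dictionary."""
    return [c for c in record.columns if c not in DATA_DICTIONARY]


def reuse_allowed(record: DatasetRecord, purpose: str) -> bool:
    """Check whether the dataset carries a consent tag for the given purpose."""
    return f"consent:{purpose}" in record.tags


if __name__ == "__main__":
    visits = DatasetRecord(
        name="clinic_visits",
        columns=["patient_id", "dob"],
        tags={"consent:research"},
    )
    print(undefined_terms(visits))            # columns missing from the dictionary
    print(reuse_allowed(visits, "research"))  # consent tag lookup
```

A check like `undefined_terms` could run when a dataset is registered, so that columns using non-agreed terms are flagged before the data is shared.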
and data models detail agreed ways to collect, use and share data, for example language, concepts, rules and guidance. These might be embedded in software, or described in documents. There are a number of standards for data that help ensure health data is interoperable, enabling use by a variety of stakeholders in a consistent, comparable manner. These include:
Data can be made accessible to others via APIs. APIs enable datasets to be integrated directly into systems so that users can access data that is relevant for their purpose, rather than having to download and upload entire datasets. APIs give users access to up-to-date data, mitigating the risk that they might use an outdated version. They can also contain built-in contract conditions to manage access permissions. These controls can be set at a granular level so that different users can be given different levels of access, as determined by a unique identifier (also referred to as an API key).
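As a rough sketch of how an API key can control access at a granular level, the Python below maps each key to the scopes it may use and checks the scope on every request. The key values, scope names (`aggregate`, `record`) and sample records are invented for illustration; a real service would sit behind a web framework and store keys securely.

```python
# Hypothetical registry: API key -> scopes that key is allowed to use.
API_KEYS = {
    "key-analyst-123": {"aggregate"},
    "key-clinician-456": {"aggregate", "record"},
}

# Hypothetical dataset held behind the API.
RECORDS = [
    {"id": 1, "age": 34, "region": "north"},
    {"id": 2, "age": 51, "region": "north"},
    {"id": 3, "age": 47, "region": "south"},
]


def handle_request(api_key, scope, region):
    """Return (HTTP-style status, body) for a request against the dataset."""
    scopes = API_KEYS.get(api_key)
    if scopes is None:
        return 401, {"error": "unknown API key"}
    if scope not in scopes:
        return 403, {"error": "scope not permitted for this key"}

    # Only the slice relevant to the request is read, not the whole dataset.
    matching = [r for r in RECORDS if r["region"] == region]
    if scope == "aggregate":
        return 200, {"region": region, "count": len(matching)}
    return 200, {"region": region, "records": matching}


if __name__ == "__main__":
    print(handle_request("key-analyst-123", "aggregate", "north"))
    print(handle_request("key-analyst-123", "record", "north"))
```

In this sketch the analyst key can only obtain counts, while the clinician key can also retrieve individual records, which mirrors the idea of giving different users different levels of access.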
Health-specific API standards include:
creates a digital object identifier (DOI) for each dataset which enables it to be shared with a consistent, globally standardised internet link.
Metadata standards help ensure datasets are described with standardised metadata descriptions. Models like and , and dataset descriptions like , enable standardised descriptions of datasets.
There are also standardised data models for how health data should be organised, including , , , , , and others.
Industry-based solutions like the seek to transform datasets, as they are ingested, into a format readable by the Fast Healthcare Interoperability Resources (FHIR) application programming interface (API) standard, in order to make them interoperable.
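To give a feel for what transforming data into a FHIR-readable format can involve, the sketch below maps one row of a tabular dataset onto a FHIR `Patient` resource expressed as JSON. The input column names (`patient_id`, `family_name`, `given_name`, `sex`, `date_of_birth`) and the `M`/`F` coding are assumptions made for the example; the output field names follow the FHIR `Patient` resource. It is not a description of how any particular product performs the conversion.

```python
import json

# Assumed source coding for sex -> FHIR administrative gender codes.
GENDER_MAP = {"M": "male", "F": "female", "O": "other"}


def row_to_fhir_patient(row):
    """Map one source row (a dict) to a FHIR Patient resource (a dict)."""
    return {
        "resourceType": "Patient",
        "id": str(row["patient_id"]),
        "name": [
            {"family": row["family_name"], "given": [row["given_name"]]}
        ],
        # Anything not recognised is recorded as "unknown" rather than guessed.
        "gender": GENDER_MAP.get(row["sex"], "unknown"),
        # FHIR expects dates in YYYY-MM-DD form; the source is assumed to match.
        "birthDate": row["date_of_birth"],
    }


if __name__ == "__main__":
    source_row = {
        "patient_id": 42,
        "family_name": "Example",
        "given_name": "Alex",
        "sex": "F",
        "date_of_birth": "1980-05-17",
    }
    print(json.dumps(row_to_fhir_patient(source_row), indent=2))
```

Performing this mapping at ingestion means downstream systems only ever see the standardised structure, whatever shape the original dataset had.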
APIs can be developed as needed by individuals or organisations, but there are also a number of open API standards such as the . API standards make sure that different systems using the same API can access, modify or create data items in a way that is consistent between each system.
Privacy-enhancing technologies (PETs) enable the use of data while protecting the identity of the individuals reflected in the data. There are emerging good practices . PETs include:
Platforms that enable data to be accessed in a secure way that does not allow removal from the platform, often referred to as "trusted execution environments". For example, EUCANConnect's aims to enable analysis of research data without removal from the data storage source.
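One way to picture analysis without removal of data is a store that answers aggregate questions but never hands out rows. The Python sketch below is a simplified illustration under stated assumptions: the `SecureStore` class and the minimum group size of 5 are hypothetical, and real platforms apply far more extensive disclosure controls than this.

```python
MIN_CELL_COUNT = 5  # Assumed threshold below which no result is released.


class SecureStore:
    """Holds rows privately and only exposes aggregate results."""

    def __init__(self, rows):
        self._rows = list(rows)  # No method returns these rows to the caller.

    def mean(self, column, **filters):
        """Mean of `column` over rows matching all keyword filters."""
        values = [
            r[column]
            for r in self._rows
            if all(r.get(k) == v for k, v in filters.items())
        ]
        # Refuse to answer for small groups, where a mean could reveal
        # information about identifiable individuals.
        if len(values) < MIN_CELL_COUNT:
            raise PermissionError("too few records to release a result")
        return sum(values) / len(values)


if __name__ == "__main__":
    rows = [{"region": "north", "age": a} for a in (30, 40, 50, 60, 70, 80)]
    rows += [{"region": "south", "age": a} for a in (20, 30)]
    store = SecureStore(rows)
    print(store.mean("age", region="north"))
```

The analyst receives a summary statistic, while the underlying records stay inside the store.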
Personal data stores allow individuals to store their data on a platform and set agreements on who can access it. These can include proprietary offerings built on blockchain technologies, like , or platforms designed by health authorities, like in Finland.
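The idea of an individual setting agreements on who can access their data can be sketched as a store that checks a grant before every read. Everything in this example is hypothetical (the `PersonalDataStore` class, the recipient names and the data categories); it does not reflect the design of any platform mentioned above.

```python
from datetime import date


class PersonalDataStore:
    """A single person's data plus the access grants they have agreed to."""

    def __init__(self, owner):
        self.owner = owner
        self._data = {}    # category -> value
        self._grants = {}  # (recipient, category) -> expiry date

    def put(self, category, value):
        self._data[category] = value

    def grant(self, recipient, category, expires):
        """Owner agrees that `recipient` may read `category` until `expires`."""
        self._grants[(recipient, category)] = expires

    def revoke(self, recipient, category):
        """Owner withdraws a grant; revoking a missing grant is a no-op."""
        self._grants.pop((recipient, category), None)

    def read(self, recipient, category, today):
        """Return the data only if a grant exists and has not expired."""
        expiry = self._grants.get((recipient, category))
        if expiry is None or today > expiry:
            raise PermissionError(f"{recipient} may not read {category}")
        return self._data[category]


if __name__ == "__main__":
    store = PersonalDataStore(owner="alex")
    store.put("allergies", ["penicillin"])
    store.grant("clinic-a", "allergies", expires=date(2030, 1, 1))
    print(store.read("clinic-a", "allergies", today=date(2025, 6, 1)))
```

Because grants are stored per recipient and per category, the owner can share one kind of data with one party without exposing anything else, and can withdraw access later.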