Data Infrastructure for Common Challenges

Make a data inventory


Last updated 4 years ago


A data inventory exercise can help you identify the range of datasets that could be used to tackle a problem, and decide which type of data infrastructure needs to be designed or strengthened to improve these data assets and access to them.

A data inventory is a list of datasets annotated with important information (known as metadata) that can help people understand why data has been collected, what it contains, how it is managed and the ways it is made available for others to use. For detailed guidance on how to create a data inventory, you can use this checklist on how to create a data inventory, developed by the Centre for Agricultural Bioscience International (CABI) and the ODI with the support of the Bill and Melinda Gates Foundation.

Data inventories can be used to:

  • assess the quality of the data available

  • identify data assets that might contain personal or commercially-sensitive data

  • undertake a robust analysis of the sector data

  • find relevant data easily

  • assess the range of data owners in the initiative’s relevant sector

  • make recommendations to improve access to data.

To decide on which datasets to include, you might want to consider:

  • what kind of data you are looking for based on your problem statement

  • the geographical area you want to work in

  • how recent the publications and datasets are

  • the credibility and reliability of the data source

  • the methodological soundness of the data collection approach

  • if and how the data is licensed.
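Some of the criteria above can be checked mechanically before a closer manual review. The following is a minimal sketch, assuming each candidate dataset has been noted down with a topic, region, year of last update and stated licence. The `Candidate` class, the `screen` function and their field names are illustrative and not part of the playbook. Credibility of the source and soundness of the collection method still need human judgement and are deliberately left out.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Candidate:
    """A dataset being considered for the inventory (hypothetical structure)."""
    name: str
    topic: str
    region: str
    last_updated: int        # year of the most recent release
    licence: Optional[str]   # None when no licence is stated


def in_scope(c, topics, regions, earliest_year):
    """True when the candidate matches the problem statement, area and recency."""
    return (
        c.topic in topics
        and c.region in regions
        and c.last_updated >= earliest_year
    )


def screen(candidates, topics, regions, earliest_year):
    """Return the shortlisted candidates and the names that lack a stated licence."""
    shortlisted = [
        c for c in candidates if in_scope(c, topics, regions, earliest_year)
    ]
    needs_licence_check = [c.name for c in shortlisted if c.licence is None]
    return shortlisted, needs_licence_check
```

Datasets flagged in `needs_licence_check` are not excluded. They are marked for follow-up with the publisher, since the absence of a licence statement is itself worth recording in the inventory.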

Example Data Inventory

| Field | Data asset | Description | Location | License |
|---|---|---|---|---|
| Land Use | Land use dataset | Datasets describing cultivated areas around the world | EarthStat | CC-BY 4.0 License |
| Soil | Soil maps | Soil data describing characteristics and classes provided by ISRIC’s world soil information | SoilGrids | CC-BY 4.0 License |
| Access to water | Water sources map and location datasets | EU’s open data portal datasets related to water sources to support agricultural projects | Waterbase - Water Quality (EU) | CC-BY 4.0 License |
| ... | ... | ... | ... | ... |
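An inventory like the example above can also be kept in a machine-readable form, which makes it easier to share and update. The following is a minimal sketch that stores the three example rows as records and serialises them to CSV. The `InventoryEntry` class, the `to_csv` function and the column names are assumptions made for illustration; the playbook does not prescribe a schema or file format.

```python
import csv
import io
from dataclasses import asdict, dataclass, fields


@dataclass
class InventoryEntry:
    """One row of the inventory; attributes mirror the example columns."""
    field: str
    data_asset: str
    description: str
    location: str
    licence: str


# The three rows of the example inventory.
INVENTORY = [
    InventoryEntry(
        "Land Use",
        "Land use dataset",
        "Datasets describing cultivated areas around the world",
        "EarthStat",
        "CC-BY 4.0",
    ),
    InventoryEntry(
        "Soil",
        "Soil maps",
        "Soil data describing characteristics and classes provided by "
        "ISRIC's world soil information",
        "SoilGrids",
        "CC-BY 4.0",
    ),
    InventoryEntry(
        "Access to water",
        "Water sources map and location datasets",
        "EU's open data portal datasets related to water sources to "
        "support agricultural projects",
        "Waterbase - Water Quality (EU)",
        "CC-BY 4.0",
    ),
]


def to_csv(entries):
    """Serialise inventory entries to CSV text with a header row."""
    buffer = io.StringIO()
    columns = [f.name for f in fields(InventoryEntry)]
    writer = csv.DictWriter(buffer, fieldnames=columns, lineterminator="\n")
    writer.writeheader()
    for entry in entries:
        writer.writerow(asdict(entry))
    return buffer.getvalue()
```

Calling `to_csv(INVENTORY)` returns text that can be saved to a file or opened in a spreadsheet. Further metadata columns, such as the date each entry was last checked, could be added as extra attributes.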

Data inventories are very useful for understanding available datasets. However, they will not necessarily capture everything, especially how the data has been used. Data infrastructure is not neutral; it needs to be understood in context, to better understand its limitations.

Special focus: Open data licensing

How data is licensed will affect the ability of other people and organisations to use the data. Open data licensing allows the maximum number of people to use the data with few or no restrictions. There are three standard Creative Commons licences which can be used to open data:

  • CC0 – No Rights Reserved. There are no restrictions on how re-users can use the data.

  • CC-BY – People who use the data must credit whoever is publishing it (this is called attribution).

  • CC-BY-SA – People who mix the data with other data have to also release the results as open data (this is called share-alike), and must attribute it.

It is often useful to publish data inventories as open data, but the information they contain can go stale rapidly, as the availability of different datasets changes. We recommend adding information to give context on how and when the data inventory was created and how the information it contains was collected. When publishing a data inventory, you might also explain what data is in scope, what is missing, how the inventory will be maintained, and any particular legal and ethical considerations around its reuse. You should consider maintaining data inventories collaboratively with other stakeholders, or treating their creation as a one-off research exercise.
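The three Creative Commons licences covered in this section (CC0, CC-BY and CC-BY-SA) differ in the obligations they place on re-users. The following is a minimal sketch of how those obligations could be recorded alongside an inventory so they can be looked up per dataset. The `LicenceTerms` class and the `obligations` function are illustrative names, and the mapping summarises only the two conditions described here, not the full licence texts.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LicenceTerms:
    """The two conditions discussed in this section (simplified)."""
    attribution_required: bool
    share_alike_required: bool


# Simplified summary of the three licences, keyed by their short names.
CC_LICENCES = {
    "CC0": LicenceTerms(attribution_required=False, share_alike_required=False),
    "CC-BY": LicenceTerms(attribution_required=True, share_alike_required=False),
    "CC-BY-SA": LicenceTerms(attribution_required=True, share_alike_required=True),
}


def obligations(licence):
    """List what a re-user must do under the given licence."""
    terms = CC_LICENCES[licence]
    duties = []
    if terms.attribution_required:
        duties.append("credit the publisher (attribution)")
    if terms.share_alike_required:
        duties.append("release mixed data as open data (share-alike)")
    return duties
```

For example, `obligations("CC0")` returns an empty list, while `obligations("CC-BY-SA")` returns both duties. A licence that is not in the mapping raises a `KeyError`, which is a useful prompt to read its terms directly.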
