Making your Data Findable

It is not enough to make research data openly available; it also has to be findable. The same goes for the metadata record of data that cannot be made open but can be shared in some way.

To optimise the ‘findability’ of your data, there are a number of things that you need to consider.

DOI®

The DOI (Digital Object Identifier) is used to identify content in the digital environment. It is a persistent identifier that is unambiguous and permanently assigned. Citing datasets by DOI makes their provenance trackable and allows interoperability with existing reference services.

The University’s Research Data Catalogue creates a DOI when you upload your data files. This DOI can be used whenever you mention or want to link to your data and associated articles.
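As an illustration of linking with a DOI, the short Python sketch below turns a DOI written in any of its common forms into a link through the public resolver at https://doi.org/. The function name `doi_to_url` and the DOI `10.1234/example.5678` are made up for this example and do not refer to a real dataset.

```python
from urllib.parse import quote

DOI_RESOLVER = "https://doi.org/"

# Forms in which a DOI is commonly written in text or reference lists.
KNOWN_PREFIXES = (
    "https://doi.org/",
    "http://doi.org/",
    "https://dx.doi.org/",
    "http://dx.doi.org/",
    "doi:",
)


def doi_to_url(doi: str) -> str:
    """Return a resolvable link for a DOI (hypothetical helper for illustration)."""
    cleaned = doi.strip()
    for prefix in KNOWN_PREFIXES:
        if cleaned.lower().startswith(prefix):
            cleaned = cleaned[len(prefix):].strip()
            break
    # Every DOI starts with "10." and has a "/" between prefix and suffix.
    if not cleaned.startswith("10.") or "/" not in cleaned:
        raise ValueError(f"not a DOI: {doi!r}")
    # Percent-encode characters that are not safe in a URL path.
    return DOI_RESOLVER + quote(cleaned, safe="/")


# Made-up DOI, used only to show the output format.
print(doi_to_url("doi:10.1234/example.5678"))
```

Using the full `https://doi.org/` form in articles, data statements and web pages means readers can follow the link directly to the dataset record.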

There are other types of persistent identifiers that can be used for digital records and for associated information that adds provenance to the record, such as ROR for institutional affiliation. However, at present these are not as widely used as the DOI.

When choosing a repository, you should make sure that your dataset is given a DOI.

There is evidence that publications with associated datasets are more highly cited. Citations are made easier by having a DOI.

Further information on the citing of datasets is available in the guide to citing datasets.

ORCID® iDs

An ORCID iD is a unique researcher identifier which can be added to your outputs to ensure your work is easily distinguished from that of other researchers. ORCID is an international scheme with over 1.5 million iDs registered by researchers, and is now used by an increasing number of publishers, funders and universities.

You manage your ORCID profile yourself. Your ORCID iD can be linked to a range of information sources such as publications, grants, education and employment history.

Keywords and metadata

To make sure that your data is findable (even if access to the data itself is restricted) you should use a research data repository.

Trusted repositories use the appropriate metadata standards, and discipline-specific data repositories are usually supported by funders or the community (so have tailored metadata fields and descriptions). Always complete as many fields as you can and use keywords associated with your data, project and discipline to aid discoverability.

If you are coming to the end of a large project and there are no protocols on how to share data and link to other outputs, you could add the project name and any reference number as additional information in all related items (often deposited by different sub-groups, projects and collaborators) and use the same keywords, so that the records can be connected by search terms.
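The idea of tagging every related item with the same project details can be sketched in Python as follows. This is only an illustration: the field names (`project_name`, `project_reference`, `keywords`), the project details and the record titles are all invented, and real repositories will have their own metadata fields.

```python
def tag_record(record: dict, project: dict) -> dict:
    """Return a copy of a deposit record carrying the shared project details."""
    tagged = dict(record)
    tagged["project_name"] = project["name"]
    tagged["project_reference"] = project["reference"]
    # Shared keywords first, then the record's own, without duplicates.
    combined = project["keywords"] + record.get("keywords", [])
    tagged["keywords"] = list(dict.fromkeys(combined))
    return tagged


# Invented project details and records, for illustration only.
project = {
    "name": "Example Project",
    "reference": "REF-0000",
    "keywords": ["coastal erosion", "sediment transport"],
}
records = [
    {"title": "Survey measurements", "keywords": ["lidar"]},
    {"title": "Model outputs", "keywords": ["sediment transport", "simulation"]},
]

tagged_records = [tag_record(r, project) for r in records]
for r in tagged_records:
    print(r["title"], "|", r["project_reference"], "|", ", ".join(r["keywords"]))
```

Because every record ends up with the same project name, reference number and core keywords, a search on any of those terms returns all of the related items, whichever sub-group deposited them.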

Data from large projects may also be deposited in several different research data repositories. In this case, creating a metadata-only record in the Liverpool Data Catalogue, with the appropriate keywords, and linking to those other records, will also ensure that all related items are discoverable.

Finding the right data repository

It is worth thinking about where you are going to store your data when you prepare a data management plan. This will tell you which standards, formats and metadata will be required when the data is deposited at the end of the project. This is especially useful in larger, collaborative projects.

There may already be an established repository in your discipline, or one that your funder recommends and supports, such as the UK Data Archive or the British Oceanographic Data Centre (BODC).

The University of Liverpool has its own research data repository, the Research Data Catalogue. You can use this repository to create a record of your finalised research data or to create a metadata record showing where your data is held.

Recently, some journal publishers have started to specify repositories in which code and supplementary materials may be deposited. However, care should be taken to check the conditions under which this happens. Some publishers do not publish supplementary material under a Creative Commons licence but rather claim the copyright for themselves.

Check your specific funder requirements or the Registry of Research Data Repositories (re3data) for repositories in your subject area.

If there is no suitable subject repository there are a number of good multi-disciplinary repositories: