Making Sense of Your Data

Writing a good README file

Metadata helps other researchers discover your data, but additional information is often needed to explain it. That’s where README files come in: they provide information, both to yourself and to other researchers, that helps the data be understood in the future. To ensure maximum utility, README files should be in plain text rather than a proprietary or software-dependent format such as MS Word or PDF. On the Liverpool Data Catalogue you can mark a file as a README in the options during upload, and give it a licence determining how it can be used.

It’s not necessary to have a separate README file for every data file, but the README should cover every data file in the collection.

You should consider including the following:

  • Description of the methodology used to create the data, especially if the methods are novel or differ from normal procedures in the field.
  • Explanation of the data processing steps used to create any derived data. Be prepared to expand upon this if it’s not adequately covered in any linked publications.
  • The hardware and software used to create the data, including operating system details, together with links to any software downloads, especially if they are free.
  • Quality assurance steps taken to ensure accuracy of the data, if any.
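As a rough illustration, the checklist above could be turned into a plain-text README skeleton that also lists every data file in the collection. This is only a sketch: the section headings, the function names build_readme and list_data_files, and the file name README.txt are assumptions made for the example, not requirements of the Liverpool Data Catalogue.

```python
from pathlib import Path

# Section headings mirror the checklist above; the exact wording is illustrative.
SECTIONS = [
    "METHODOLOGY",
    "DATA PROCESSING",
    "HARDWARE AND SOFTWARE",
    "QUALITY ASSURANCE",
]


def build_readme(title, data_files):
    """Return a plain-text README skeleton that names every data file."""
    lines = [title, "=" * len(title), ""]
    lines.append("FILES IN THIS COLLECTION")
    for name in sorted(data_files):
        lines.append(f"  {name}: [describe contents]")
    lines.append("")
    for section in SECTIONS:
        lines.append(section)
        lines.append("  [to be completed]")
        lines.append("")
    return "\n".join(lines)


def list_data_files(folder):
    """List all files under folder, skipping the README itself (assumed name)."""
    root = Path(folder)
    return [
        p.relative_to(root).as_posix()
        for p in root.rglob("*")
        if p.is_file() and p.name.lower() != "readme.txt"
    ]


if __name__ == "__main__":
    # Hypothetical file names, used only to show the shape of the output.
    print(build_readme("Example dataset", ["trial_2.csv", "trial_1.csv"]))
```

The skeleton still needs filling in by hand; generating it only helps make sure no data file or checklist item is left out.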

The data files themselves should always contain clearly marked field names (if the data is tabular) and units of measurement. Acronyms that are particular to the research project should be explained.
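One possible way to review field names and units in a tabular file is sketched below. It assumes the data is CSV and that units are written in parentheses at the end of each column name, for example "temperature (degC)". That naming convention and the function name review_header are assumptions for illustration only. Columns without a stated unit are reported for review rather than treated as errors, since some fields, such as identifiers, have no unit.

```python
import csv
import io
import re

# Assumed convention: a unit appears in parentheses at the end of a field name.
UNIT_PATTERN = re.compile(r"\(([^()]+)\)\s*$")


def review_header(csv_text):
    """Report blank field names and named fields with no unit stated."""
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader, [])
    # Zero-based positions of columns whose header cell is empty.
    unnamed = [i for i, name in enumerate(header) if not name.strip()]
    # Named columns that do not end with a parenthesised unit.
    no_unit = [
        name for name in header
        if name.strip() and not UNIT_PATTERN.search(name)
    ]
    return {"unnamed_columns": unnamed, "no_unit_stated": no_unit}


if __name__ == "__main__":
    sample = "sample_id,temperature (degC),,mass\n1,20.5,x,3\n"
    print(review_header(sample))
```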

Cornell University's Research Data Management Service Group has created an excellent README template.