Monthly Archives: March 2016

FAIRDOM User Meeting

We’re excited to announce the first combined user meeting for FAIRDOM. The meeting is for current users and for anyone interested in data and model management or in using the FAIRDOM software and platforms (SEEK, FAIRDOMHub, openBIS, JWSOnline, RightField). It will be held as a satellite meeting of ICSB 2016 on 15th September 2016 in Barcelona, from 10am to 4pm. The best news is that the agenda and coverage will be largely set by YOU!
At the meeting there will be keynotes, short presentations on real-case use of FAIRDOM software, flash presentations on ‘FAIRDOM in the wild’, and discussion forums where users can discuss the suite of FAIRDOM software.

At the meeting you will:

  1. Learn how other research groups formulate their data and model management using FAIRDOM software.
  2. Have direct contact with the FAIRDOM Community and Tech Team, who can give you personalised advice on your data and model management, and using FAIRDOM.
  3. Learn how users have extended the platforms to include e.g. electronic lab notebooks.
  4. Be an early audience for our ‘FAIRDOM for Publication’ extension.

To join please register here

New Blog: Getting a handle on the future of life-science data

The data you collect tomorrow will not have the same characteristics as the data you are collecting today. Being able to predict how the characteristics of your data will change over the coming years is vital if you are preparing data management plans. We’ve written a blog post describing which characteristics of data are important to analyse and predict, with a link to our 2014 paper on the key challenges we expect to arise in life-science data collection and management.

To read it please visit:

Getting a handle on the future of life-science data.

“Before data-stewarding practices can be expertly developed we need to understand the nature of the data to begin with, and how it is expected to change in the coming years.”

The idea of the “data deluge” has been looming over the life sciences for the last 10 years. Advancing technologies are steadily improving the speed and quality with which vast quantities of large data sets can be collected. Far from being something to fear, this ramping up of data collection can, if handled correctly, really drive the speed of discovery in systems research.

Handling this change is a key challenge for the life sciences, and requires forward thinking to develop the right platforms and techniques for collecting, annotating and storing this data during its lifecycle. There are 5 characteristics that should be understood for this process [1]:

Volume – data size – the amount of data that can be produced in a given amount of time varies between methods and tends to increase over the years. The highest volume of data tends to come directly from machines in a raw format. The volume often reduces during post-processing to a final form. Understanding these changes ensures that appropriate storage, preservation, and access solutions can be used. It is also important to understand how much raw data to preserve.

Velocity – speed of change – how fast the data is replaced. With improving technologies, some types of datasets can become obsolete quickly. Knowing the timescale for obsolescence allows stored datasets to be managed appropriately over the long term, dictating if and when to scrap the old.

Variety – different forms of data storage – data types can be collected using different methods and technologies. These usually allow a specialised way of collecting the most valid data for any given study. Understanding the range of these techniques, and how the final data is used and, over the long term, re-used and/or repurposed, is very important for valid downstream use.

Veracity – uncertainty of data – describes how messy the data is, and therefore how much it can be trusted for study. High-throughput data often has to compromise on qualities that may be useful (e.g. full quantification for metabolomics data).

Value – how useful is it for the investigation at hand?

As a community, we held a joint meeting in March 2014, bringing together biomedical sciences research infrastructures (BMS RIs) covering genomics, proteomics, imaging, metabolomics, and clinical data. In a two-day meeting we brainstormed current data characteristics, and those predicted for the future, with teams of experts for all data types. As a result, we produced a report for both infrastructures and researchers to use as a basis for developing our data-management plans. The findings are still useful two years on.


[1] Bernard Marr (2015) Big Data. John Wiley & Sons; 1st edition.

Written by Natalie Stanford of FAIRDOM, and edited by Steffi Suhr from the BioMedBridges Infrastructure and EBI.