Community Standards

You can view our FAIRDOM Standards collection in the FAIRsharing catalogue.

What are Community Standards?

Standards are, in essence, an agreed way of organising and describing things. Systems biology has standards for formatting and for annotating data and models. They are designed by experts who understand what key information makes up the outcome of an experiment, and how best to structure it in a written format. Standards for systems and synthetic biology are often agreed and developed at a grass-roots level through communities such as COMBINE. There are also some relevant top-down standards developed by the International Organization for Standardization (ISO).

Standard Formatting

Standardised formats are used to structure knowledge from experiments and models in a consistent way. This allows the key information to be found easily by both researchers and software. Standardised formats have been developed to support systems biologists in structuring models (e.g. SBML, CellML), simulating models (e.g. SBRML, SED-ML), visualising models (e.g. SBGN), structuring data (e.g. mzTab for proteomics), and structuring investigations (e.g. ISA-Tab).
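
As a rough illustration of why a standard format makes information easy for software to find, the sketch below reads the species from a tiny SBML document using only the Python standard library. The model itself (toy_model, glucose, g6p) and the helper list_species are hypothetical and exist only for this example; real projects would normally use a dedicated SBML library rather than raw XML parsing.

```python
import xml.etree.ElementTree as ET

# XML namespace of SBML Level 3 Version 1 core.
SBML_NS = {"sbml": "http://www.sbml.org/sbml/level3/version1/core"}

# A hypothetical, deliberately minimal model used only for illustration.
MINIMAL_SBML = """<sbml xmlns="http://www.sbml.org/sbml/level3/version1/core" level="3" version="1">
  <model id="toy_model">
    <listOfCompartments>
      <compartment id="cell" constant="true"/>
    </listOfCompartments>
    <listOfSpecies>
      <species id="glucose" compartment="cell" hasOnlySubstanceUnits="false"
               boundaryCondition="false" constant="false"/>
      <species id="g6p" compartment="cell" hasOnlySubstanceUnits="false"
               boundaryCondition="false" constant="false"/>
    </listOfSpecies>
  </model>
</sbml>"""


def list_species(sbml_text):
    """Return the ids of all species declared in an SBML document string."""
    root = ET.fromstring(sbml_text)  # root is the <sbml> element
    species = root.findall("sbml:model/sbml:listOfSpecies/sbml:species", SBML_NS)
    return [s.get("id") for s in species]


species_ids = list_species(MINIMAL_SBML)
```

Because the location of species within the document is fixed by the standard, the same few lines would work on any model that follows it, regardless of which tool produced the file.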

A FAIRDOM survey of standards usage has shown that more than two thirds of the community use standard formats. However, systems biology standards are not suitable for all types of data and models produced within the field: physiology models and certain cellular processes, such as transcription, do not yet have suitable, mature standards available.

Metadata Descriptions

Metadata is data about data, or descriptions of data. It is used to describe knowledge from experiments and models, allowing them to be easily interpreted, reused, and reproduced. Descriptions can include strains of an organism, media used for growth, pH, temperature, sample collection times, or even sample identifiers (e.g. metabolite or protein names), among many others. Minimum information checklists are used to ensure data and models contain enough description to be understandable. FAIRsharing makes a wide range of minimum information checklists available for researchers to use. Metadata descriptions are also best used in conjunction with naming conventions, such as ontologies or controlled vocabularies. Naming conventions ensure that the semantics of an entity, such as a metabolite or gene, are unambiguous.
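
A minimum information checklist can be thought of as a list of fields that every record must fill in. The sketch below shows one way such a check might look. The field names in REQUIRED_FIELDS and the sample record are assumptions made up for this example and are not taken from any published checklist.

```python
# Hypothetical checklist: these field names are illustrative only.
REQUIRED_FIELDS = (
    "organism_strain",
    "growth_medium",
    "ph",
    "temperature_celsius",
    "sample_time_minutes",
)


def missing_fields(record, required=REQUIRED_FIELDS):
    """Return the required fields that are absent or empty in a metadata record."""
    return [field for field in required if record.get(field) in (None, "")]


# An example record that omits the sample collection time.
sample = {
    "organism_strain": "Escherichia coli K-12 MG1655",
    "growth_medium": "M9 minimal medium",
    "ph": 7.0,
    "temperature_celsius": 37.0,
}

problems = missing_fields(sample)
```

Here problems lists the single field that still needs to be supplied before the record would satisfy this illustrative checklist.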

There are a number of specialised ontologies for systems biology, such as the Systems Biology Ontology (SBO) and the Just Enough Results Model (JERM). Metadata descriptions are also best used in conjunction with real-world identifiers, which can be linked unambiguously to databases detailing the entity, for example ChEBI for metabolites. According to a research survey, metadata standards and controlled vocabularies are used by only about half of the community. As a result, many researchers find it difficult to reuse data and models produced by others, and understanding the origin of the data used within a model is cited as particularly difficult.
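
To make the idea of a real-world identifier concrete, the sketch below attaches ChEBI identifiers to two metabolite names and turns each into a resolvable link. The helper chebi_url and the annotations table are hypothetical. The use of identifiers.org as the resolver is an assumption of this example and not something prescribed above.

```python
import re

# A ChEBI compact identifier: the prefix "CHEBI:" followed by digits.
CHEBI_PATTERN = re.compile(r"CHEBI:\d+")


def chebi_url(identifier):
    """Return a link for a ChEBI compact identifier (resolver is an assumption)."""
    if not CHEBI_PATTERN.fullmatch(identifier):
        raise ValueError(f"not a ChEBI identifier: {identifier!r}")
    return f"https://identifiers.org/{identifier}"


# Free-text names mapped to ChEBI identifiers (illustrative annotation table).
annotations = {"water": "CHEBI:15377", "ATP": "CHEBI:15422"}

links = {name: chebi_url(cid) for name, cid in annotations.items()}
```

A free-text name such as "ATP" can be spelled or abbreviated in many ways, whereas the identifier points to exactly one database entry.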