Webinars | Niall Beard
Scientific Web Technologist
Niall Beard works with the eScience Lab (University of Manchester), developing software tools and platforms for academia. He currently manages the implementation of ELIXIR TeSS, a training portal for life scientists that aggregates metadata about training materials and events from around the web. The difficulties of harvesting unstructured data led to his involvement in Bioschemas, a group that aims to make schema.org vocabularies more useful to life scientists and to encourage data providers to structure their data with them, making inclusion in portals significantly easier.
Other notable activities include developing the BioCatalogue, an online catalogue for describing web services, and advocating the adoption of the OpenAPI Specification (formerly Swagger).
19th January 2017 (14:00 – 14:45 GMT): “Building a training material portal: BioSchemas and metadata scraping”
For any discipline, identifying high-quality training materials is difficult. They are often spread across different institutional websites and personal blogs, or stored in archives, so you have to know exactly what you are looking for and where to find it. However, people often seek training precisely when they know little about the subject, which makes it hard for them to know what material to look for. Even once material is identified, it is difficult to know how highly others within the community rate it.
To solve this problem for European bioinformatics, we have developed the TeSS Portal. TeSS aggregates links to disparate training materials and events scattered around the institutional websites of ELIXIR Nodes and other content providers (GOBLET, Software and Data Carpentry, EBI TrainOnline, Genome3D, on-course, and more), making them centrally discoverable and searchable. Training resources within TeSS can be collected and arranged into packages and/or training workflows, which form graphical representations of scientific pipelines and organise resources into easily navigable views.
Aggregation of training content happens automatically through a set of custom-made scraper scripts that run nightly. The scrapers use a number of techniques to extract information. HTML scraping and APIs were previously the predominant methods, but the former is temperamental and the latter are rarely available. More recently we have focused on parsing structured schema.org mark-up. The TeSS team, through the Bioschemas group, have been heavily involved in developing schema.org specifications for life science training resources and promoting their adoption.
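To illustrate why structured mark-up is easier to harvest than free-form HTML, here is a minimal sketch of reading schema.org data that a provider has embedded as JSON-LD. It is not the TeSS scraper code, which is not shown here: the names JsonLdExtractor, extract_events and SAMPLE_PAGE are hypothetical, the sample page is invented for illustration, and only the Python standard library is used. schema.org mark-up may also be published as Microdata or RDFa, which this sketch does not handle.

```python
import json
from html.parser import HTMLParser


class JsonLdExtractor(HTMLParser):
    """Collects the contents of <script type="application/ld+json"> elements."""

    def __init__(self):
        super().__init__()
        self._in_jsonld = False
        self._buffer = []
        self.items = []

    def handle_starttag(self, tag, attrs):
        if tag == "script" and dict(attrs).get("type") == "application/ld+json":
            self._in_jsonld = True
            self._buffer = []

    def handle_data(self, data):
        # Script content may arrive in several chunks, so buffer it.
        if self._in_jsonld:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_jsonld:
            self._in_jsonld = False
            try:
                parsed = json.loads("".join(self._buffer))
            except json.JSONDecodeError:
                return  # skip malformed mark-up rather than abort the whole run
            # A block may hold a single object or a list of objects.
            self.items.extend(parsed if isinstance(parsed, list) else [parsed])


def extract_events(html):
    """Return every JSON-LD object in the page whose @type is "Event"."""
    parser = JsonLdExtractor()
    parser.feed(html)
    parser.close()
    return [
        item for item in parser.items
        if isinstance(item, dict) and item.get("@type") == "Event"
    ]


# Hypothetical provider page, invented for this example.
SAMPLE_PAGE = """
<html><head>
<script type="application/ld+json">
{"@context": "http://schema.org", "@type": "Event",
 "name": "Example workshop", "startDate": "2017-01-19"}
</script>
</head><body><p>Human-readable page content.</p></body></html>
"""

if __name__ == "__main__":
    for event in extract_events(SAMPLE_PAGE):
        print(event["name"], event["startDate"])
```

Because the fields arrive as named properties rather than as positions in a page layout, a harvester written this way does not break when a provider redesigns its site, which is the main weakness of HTML scraping noted above.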