A hands-on introduction to querying evolutionary relationships across multiple data sources using SPARQL [version 2; peer review: 1 approved, 2 approved with reservations]

The increasing use of Semantic Web technologies in the life sciences, in particular the use of the Resource Description Framework (RDF) and the RDF query language SPARQL, opens the path for novel integrative analyses, combining information from multiple data sources. However, analyzing evolutionary...


Bibliographic Details
Main Authors: Ana Claudia Sima, Christophe Dessimoz, Kurt Stockinger, Monique Zahn-Zabal, Tarcisio Mendes de Farias
Format: Article
Language: English
Published: F1000 Research Ltd 2020-07-01
Series: F1000Research
Online Access: https://f1000research.com/articles/8-1822/v2
collection DOAJ
description The increasing use of Semantic Web technologies in the life sciences, in particular the use of the Resource Description Framework (RDF) and the RDF query language SPARQL, opens the path for novel integrative analyses, combining information from multiple data sources. However, analyzing evolutionary data in RDF is not trivial, due to the steep learning curve involved in understanding both the data models adopted by different RDF data sources and the corresponding SPARQL constructs required to benefit from this data, in particular recursive property paths. In this article, we provide a hands-on introduction to querying evolutionary data across several data sources that publish orthology information in RDF, namely the Orthologous MAtrix (OMA), the European Bioinformatics Institute (EBI) RDF platform, the Database of Orthologous Groups (OrthoDB) and the Microbial Genome Database (MBGD). We present four protocols in increasing order of complexity. In these protocols, we demonstrate through SPARQL queries how to retrieve pairwise orthologs, homologous groups, and hierarchical orthologous groups. Finally, we show how orthology information in different data sources can be compared through the use of federated SPARQL queries.
first_indexed 2024-04-13T11:02:07Z
id doaj.art-7f453f24685b48fba6af40688d93dfb7
institution Directory Open Access Journal
issn 2046-1402
last_indexed 2024-04-13T11:02:07Z
spelling doaj.art-7f453f24685b48fba6af40688d93dfb7 2022-12-22T02:49:22Z eng F1000 Research Ltd F1000Research 2046-1402 2020-07-01 8 10.12688/f1000research.21027.2 27892
Author affiliations:
Ana Claudia Sima: Department of Computational Biology, University of Lausanne, Lausanne, Vaud, Switzerland
Christophe Dessimoz: Department of Computational Biology, University of Lausanne, Lausanne, Vaud, Switzerland
Kurt Stockinger: ZHAW Zurich University of Applied Sciences, Winterthur, Zurich, Switzerland
Monique Zahn-Zabal: Department of Computational Biology, University of Lausanne, Lausanne, Vaud, Switzerland
Tarcisio Mendes de Farias: Department of Computational Biology, University of Lausanne, Lausanne, Vaud, Switzerland