AlignScape, displaying sequence similarity using self-organizing maps
The current richness of sequence data needs efficient methodologies to display and analyze the complexity of the information in a compact and readable manner. Traditionally, phylogenetic trees and sequence similarity networks have been used to display and analyze sequences of protein families. These...
Main Authors: | Isaac Filella-Merce, Vincent Mallet, Eric Durand, Michael Nilges, Guillaume Bouvier, Riccardo Pellarin |
---|---|
Format: | Article |
Language: | English |
Published: | Frontiers Media S.A., 2024-01-01 |
Series: | Frontiers in Bioinformatics |
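Subjects: | self-organizing maps (SOM), sequence similarity landscape, protein sequence analysis, protein sequence visualization, human kinome, human GPCRs |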
Online Access: | https://www.frontiersin.org/articles/10.3389/fbinf.2024.1321508/full |
_version_ | 1797346064544563200 |
---|---|
author | Isaac Filella-Merce, Vincent Mallet, Eric Durand, Michael Nilges, Guillaume Bouvier, Riccardo Pellarin |
author_facet | Isaac Filella-Merce, Vincent Mallet, Eric Durand, Michael Nilges, Guillaume Bouvier, Riccardo Pellarin |
author_sort | Isaac Filella-Merce |
collection | DOAJ |
description | The current richness of sequence data needs efficient methodologies to display and analyze the complexity of the information in a compact and readable manner. Traditionally, phylogenetic trees and sequence similarity networks have been used to display and analyze sequences of protein families. These methods aim to shed light on key computational biology problems such as sequence classification and functional inference. Here, we present a new methodology, AlignScape, based on self-organizing maps. AlignScape is applied to three large families of proteins: the kinases and GPCRs from human, and bacterial T6SS proteins. AlignScape provides a map of the similarity landscape and a tree representation of multiple sequence alignments. These representations are useful to display, cluster, and classify sequences as well as identify functional trends. The efficient GPU implementation of AlignScape allows the analysis of large MSAs in a few minutes. Furthermore, we show how the AlignScape analysis of proteins belonging to the T6SS complex can be used to predict coevolving partners. |
first_indexed | 2024-03-08T11:27:28Z |
format | Article |
id | doaj.art-0384b44ff500435caecb9c3e1a275366 |
institution | Directory Open Access Journal |
issn | 2673-7647 |
language | English |
last_indexed | 2024-03-08T11:27:28Z |
publishDate | 2024-01-01 |
publisher | Frontiers Media S.A. |
record_format | Article |
series | Frontiers in Bioinformatics |
spelling | doaj.art-0384b44ff500435caecb9c3e1a275366; 2024-01-26T04:48:45Z; eng; Frontiers Media S.A.; Frontiers in Bioinformatics; 2673-7647; 2024-01-01; 4; 10.3389/fbinf.2024.1321508; 1321508; AlignScape, displaying sequence similarity using self-organizing maps; Isaac Filella-Merce (0); Isaac Filella-Merce (1); Vincent Mallet (2); Eric Durand (3); Michael Nilges (4); Guillaume Bouvier (5); Riccardo Pellarin (6); Riccardo Pellarin (7); Life Sciences Department, Electronic and Atomic Protein Modeling Group (EAPM), Barcelona Supercomputing Center (BSC), Barcelona, Spain; Institut Pasteur, Université Paris Cité, CNRS UMR 3528, Structural Bioinformatics Unit, Paris, France; Institut Pasteur, Université Paris Cité, CNRS UMR 3528, Structural Bioinformatics Unit, Paris, France; Laboratoire d'Ingénierie des Systèmes Macromoléculaires (LISM), Institut de Microbiologie de La Méditerranée (IM2B), Aix-Marseille Université, Centre National de La Recherche Scientifique (CNRS)-UMR 7255, Marseille, France; Institut Pasteur, Université Paris Cité, CNRS UMR 3528, Structural Bioinformatics Unit, Paris, France; Institut Pasteur, Université Paris Cité, CNRS UMR 3528, Structural Bioinformatics Unit, Paris, France; Molecular Microbiology and Structural Biochemistry (MMSB), University of Lyon, Centre National de La Recherche Scientifique (CNRS)-UMR 5086, Lyon, France; Laboratoire de Biologie et Modélisation de La Cellule, École Normale Supérieure de Lyon, CNRS, UMR 5239, Inserm U1293, Université Claude Bernard Lyon 1, Lyon, France; The current richness of sequence data needs efficient methodologies to display and analyze the complexity of the information in a compact and readable manner. Traditionally, phylogenetic trees and sequence similarity networks have been used to display and analyze sequences of protein families. These methods aim to shed light on key computational biology problems such as sequence classification and functional inference. Here, we present a new methodology, AlignScape, based on self-organizing maps. AlignScape is applied to three large families of proteins: the kinases and GPCRs from human, and bacterial T6SS proteins. AlignScape provides a map of the similarity landscape and a tree representation of multiple sequence alignments. These representations are useful to display, cluster, and classify sequences as well as identify functional trends. The efficient GPU implementation of AlignScape allows the analysis of large MSAs in a few minutes. Furthermore, we show how the AlignScape analysis of proteins belonging to the T6SS complex can be used to predict coevolving partners.; https://www.frontiersin.org/articles/10.3389/fbinf.2024.1321508/full; self-organizing maps (SOM); sequence similarity landscape; protein sequence analysis; protein sequence visualization; human kinome; human GPCRs |
spellingShingle | Isaac Filella-Merce Isaac Filella-Merce Vincent Mallet Eric Durand Michael Nilges Guillaume Bouvier Riccardo Pellarin Riccardo Pellarin AlignScape, displaying sequence similarity using self-organizing maps Frontiers in Bioinformatics self-organizing maps (SOM) sequence similarity landscape protein sequence analysis protein sequence visualization human kinome human GPCRs |
title | AlignScape, displaying sequence similarity using self-organizing maps |
title_sort | alignscape displaying sequence similarity using self organizing maps |
topic | self-organizing maps (SOM); sequence similarity landscape; protein sequence analysis; protein sequence visualization; human kinome; human GPCRs |
url | https://www.frontiersin.org/articles/10.3389/fbinf.2024.1321508/full |