AlignScape, displaying sequence similarity using self-organizing maps

The current richness of sequence data needs efficient methodologies to display and analyze the complexity of the information in a compact and readable manner. Traditionally, phylogenetic trees and sequence similarity networks have been used to display and analyze sequences of protein families. These...

Bibliographic Details
Main Authors: Isaac Filella-Merce, Vincent Mallet, Eric Durand, Michael Nilges, Guillaume Bouvier, Riccardo Pellarin
Format: Article
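Language: English
Published: Frontiers Media S.A. 2024-01-01
Series: Frontiers in Bioinformatics
Subjects: self-organizing maps (SOM); sequence similarity landscape; protein sequence analysis; protein sequence visualization; human kinome; human GPCRs
Online Access: https://www.frontiersin.org/articles/10.3389/fbinf.2024.1321508/full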
author Isaac Filella-Merce
Isaac Filella-Merce
Vincent Mallet
Eric Durand
Michael Nilges
Guillaume Bouvier
Riccardo Pellarin
Riccardo Pellarin
collection DOAJ
description The current richness of sequence data needs efficient methodologies to display and analyze the complexity of the information in a compact and readable manner. Traditionally, phylogenetic trees and sequence similarity networks have been used to display and analyze sequences of protein families. These methods aim to shed light on key computational biology problems such as sequence classification and functional inference. Here, we present a new methodology, AlignScape, based on self-organizing maps. AlignScape is applied to three large families of proteins: the kinases and GPCRs from human, and bacterial T6SS proteins. AlignScape provides a map of the similarity landscape and a tree representation of multiple sequence alignments. These representations are useful to display, cluster, and classify sequences as well as identify functional trends. The efficient GPU implementation of AlignScape allows the analysis of large MSAs in a few minutes. Furthermore, we show how the AlignScape analysis of proteins belonging to the T6SS complex can be used to predict coevolving partners.
first_indexed 2024-03-08T11:27:28Z
format Article
id doaj.art-0384b44ff500435caecb9c3e1a275366
institution Directory Open Access Journal
issn 2673-7647
language English
last_indexed 2024-03-08T11:27:28Z
publishDate 2024-01-01
publisher Frontiers Media S.A.
record_format Article
series Frontiers in Bioinformatics
spelling doaj.art-0384b44ff500435caecb9c3e1a275366 | 2024-01-26T04:48:45Z | eng | Frontiers Media S.A. | Frontiers in Bioinformatics | 2673-7647 | 2024-01-01 | 4 | 10.3389/fbinf.2024.1321508 | 1321508
AlignScape, displaying sequence similarity using self-organizing maps
Isaac Filella-Merce (0, 1); Vincent Mallet (2); Eric Durand (3); Michael Nilges (4); Guillaume Bouvier (5); Riccardo Pellarin (6, 7)
0: Life Sciences Department, Electronic and Atomic Protein Modeling Group (EAPM), Barcelona Supercomputing Center (BSC), Barcelona, Spain
1: Institut Pasteur, Université Paris Cité, CNRS UMR 3528, Structural Bioinformatics Unit, Paris, France
2: Institut Pasteur, Université Paris Cité, CNRS UMR 3528, Structural Bioinformatics Unit, Paris, France
3: Laboratoire d'Ingénierie des Systèmes Macromoléculaires (LISM), Institut de Microbiologie de La Méditerranée (IM2B), Aix-Marseille Université, Centre National de La Recherche Scientifique (CNRS)-UMR 7255, Marseille, France
4: Institut Pasteur, Université Paris Cité, CNRS UMR 3528, Structural Bioinformatics Unit, Paris, France
5: Institut Pasteur, Université Paris Cité, CNRS UMR 3528, Structural Bioinformatics Unit, Paris, France
6: Molecular Microbiology and Structural Biochemistry (MMSB), University of Lyon, Centre National de La Recherche Scientifique (CNRS)-UMR 5086, Lyon, France
7: Laboratoire de Biologie et Modélisation de La Cellule, École Normale Supérieure de Lyon, CNRS, UMR 5239, Inserm U1293, Université Claude Bernard Lyon 1, Lyon, France
title AlignScape, displaying sequence similarity using self-organizing maps
topic self-organizing maps (SOM)
sequence similarity landscape
protein sequence analysis
protein sequence visualization
human kinome
human GPCRs
url https://www.frontiersin.org/articles/10.3389/fbinf.2024.1321508/full