Greedy de novo motif discovery to construct motif repositories for bacterial proteomes

Abstract Background: Bacterial surfaces are complex systems, constructed from membranes, peptidoglycan and, importantly, proteins. The proteins are critical regulators of how the bacterium interacts with and survives in its environment. A full catalog of the motifs in protein familie...

Bibliographic Details
Main Authors: Hamed Khakzad, Johan Malmström, Lars Malmström
Format: Article
Language: English
Published: BMC 2019-04-01
Series: BMC Bioinformatics
Subjects: De novo motif discovery; Infectious diseases; Group A streptococcus
Online Access: http://link.springer.com/article/10.1186/s12859-019-2686-8
collection DOAJ
description Abstract Background: Bacterial surfaces are complex systems, constructed from membranes, peptidoglycan and, importantly, proteins. The proteins are critical regulators of how the bacterium interacts with and survives in its environment. A full catalog of the motifs in protein families and their relative conservation grade is a prerequisite for targeting the protein-protein interactions that bacterial surface proteins make with host proteins. Results: In this paper, we propose a greedy approach that iteratively identifies conserved motifs in large sequence families. Each iteration discovers a motif de novo and masks all occurrences of that motif. The remaining unmasked sequences are subjected to the next round of motif detection until no more significant motifs can be found. We demonstrate the utility of the method through the construction of a proteome-wide motif repository for Group A Streptococcus (GAS), a significant human pathogen. GAS produces numerous surface proteins that interact with over 100 human plasma proteins, helping the bacteria to evade the host immune response. We used the repository to find that proteins that are part of the bacterial surface have motif architectures that differ from those of intracellular proteins. Conclusions: We show that the M protein, a coiled-coil homodimer that extends over 500 Å from the cell wall, has a motif architecture that differs between GAS strains. As the M protein is known to bind a variety of plasma proteins, the results indicate that the different motif architectures are responsible for the quantitative differences in the plasma proteins that various strains bind. The speed and applicability of the method enable its application to all major human pathogens.
first_indexed 2024-12-11T03:40:49Z
id doaj.art-aa09d15c51a74dd98d92f94166d2e176
institution Directory of Open Access Journals
issn 1471-2105
last_indexed 2024-12-11T03:40:49Z
spelling doaj.art-aa09d15c51a74dd98d92f94166d2e176
2022-12-22T01:22:08Z
eng
BMC
BMC Bioinformatics
1471-2105
2019-04-01
20 (S4), 1-10
10.1186/s12859-019-2686-8
Greedy de novo motif discovery to construct motif repositories for bacterial proteomes
Hamed Khakzad (Faculty of Science, Institute for Computational Science, University of Zurich)
Johan Malmström (Division of Infection Medicine, Department of Clinical Sciences, Lund University)
Lars Malmström (Faculty of Science, Institute for Computational Science, University of Zurich)
http://link.springer.com/article/10.1186/s12859-019-2686-8
De novo motif discovery
Infectious diseases
Group A streptococcus
topic De novo motif discovery
Infectious diseases
Group A streptococcus
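
The greedy procedure summarized in the description (discover one motif, mask every occurrence, repeat until nothing significant remains) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the record does not say which motif finder or significance test the paper uses, so the sketch swaps in a toy stand-in that picks the fixed-length k-mer found in the most sequences and treats a minimum sequence count as the significance cut-off. The names find_best_kmer, greedy_motifs, MASK, k and min_support are all hypothetical, and the three input sequences are made up for illustration.

```python
from collections import Counter

# Hypothetical mask symbol (assumption); any k-mer containing it is ignored.
MASK = "X"


def find_best_kmer(seqs, k):
    """Toy stand-in for a de novo motif finder (assumption, not the paper's tool).

    Returns the unmasked k-mer present in the most sequences and its support,
    or (None, 0) when no unmasked k-mer is left.
    """
    support = Counter()
    for seq in seqs:
        # Count each k-mer at most once per sequence.
        kmers = {seq[i:i + k] for i in range(len(seq) - k + 1)}
        support.update(km for km in kmers if MASK not in km)
    if not support:
        return None, 0
    # Highest support first; ties broken alphabetically for determinism.
    best = min(support, key=lambda km: (-support[km], km))
    return best, support[best]


def greedy_motifs(seqs, k=4, min_support=2):
    """Greedy loop: find a motif, mask its occurrences, repeat until none qualifies."""
    seqs = list(seqs)
    motifs = []
    while True:
        motif, count = find_best_kmer(seqs, k)
        # Stop when the best remaining candidate is not "significant" under
        # the toy criterion (present in at least min_support sequences).
        if motif is None or count < min_support:
            break
        motifs.append((motif, count))
        # str.replace masks all non-overlapping occurrences, left to right.
        seqs = [s.replace(motif, MASK * k) for s in seqs]
    return motifs, seqs


if __name__ == "__main__":
    # Made-up sequences, for illustration only.
    found, masked = greedy_motifs(["MKLAAAPQRS", "TTLAAAGGPQRS", "LAAAWW"])
    print(found)
    print(masked)
```

Masking after each round is what keeps the loop greedy: a region already claimed by one motif cannot contribute to a later one, so each iteration works only on the still-unexplained parts of the sequences.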