Fingerprinting cities: differentiating subway microbiome functionality

Bibliographic Details
Main Authors: Chengsheng Zhu, Maximilian Miller, Nick Lusskin, Yannick Mahlich, Yanran Wang, Zishuo Zeng, Yana Bromberg
Format: Article
Language: English
Published: BMC 2019-10-01
Series: Biology Direct
Subjects: Microbiome; Function analysis; Machine learning; mi-faser; MetaSUB
Online Access: http://link.springer.com/article/10.1186/s13062-019-0252-y
collection DOAJ
description Abstract
Background: Accumulating evidence suggests that the human microbiome impacts individual and public health. City subway systems are human-dense environments where passengers often exchange microbes. The MetaSUB project participants collected samples from subway surfaces in different cities and performed metagenomic sequencing. Previous studies focused on the taxonomic composition of these microbiomes, and no explicit functional analysis had been done until now.
Results: As part of the 2018 CAMDA challenge, we functionally profiled the available ~400 subway metagenomes and built a predictor of city origin. In cross-validation, our model reached 81% accuracy when only the top-ranked city assignment was considered and 95% accuracy if the second-ranked city was taken into account as well. Notably, this performance was only achievable if the distribution of cities in the training and testing sets was similar. To ensure that our methods are applicable without such biased assumptions, we balanced our training data to account for all represented cities equally well. After balancing, the performance of our method was slightly lower (76/94%, respectively, for one or two top-ranked cities) but still consistently high, with the added benefit of independence from the training set city representation. In testing, our unbalanced model thus reached an (over-estimated) performance of 90/97%, while our balanced model was at a more reliable 63/90% accuracy. While, by definition of our model, we were not able to predict previously unseen microbiome origins, our balanced model correctly judged them to be NOT-from-training-cities over 80% of the time. Our function-based outlook on microbiomes also allowed us to note similarities between both regionally close and far-away cities. Curiously, we identified the depletion of mycobacterial functions as a signature of cities in New Zealand, while photosynthesis-related functions fingerprinted New York, Porto and Tokyo.
Conclusions: We demonstrated the power of our high-speed function annotation method, mi-faser, by analysing ~400 shotgun metagenomes in 2 days, with the results recapitulating functional signals of different city subway microbiomes. We also showed the importance of balanced data in avoiding over-estimated performance. Our results revealed similarities between both geographically close (Ofa and Ilorin) and distant (Boston and Porto, Lisbon and New York) city subway microbiomes. The photosynthesis-related functional signatures of NYC were previously unseen in taxonomy studies, highlighting the strength of functional analysis.
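The two evaluation ideas in the abstract, accuracy over the one or two top-ranked cities and balancing the training data across cities, can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names, the toy data, and the choice of downsampling every city to the size of the smallest one are assumptions made only for illustration, since the record does not say how the balancing was done.

```python
import random
from collections import defaultdict


def top_k_accuracy(ranked_predictions, true_labels, k):
    """Fraction of samples whose true city is among the k highest-ranked guesses."""
    hits = sum(
        1
        for ranked, truth in zip(ranked_predictions, true_labels)
        if truth in ranked[:k]
    )
    return hits / len(true_labels)


def balance_by_downsampling(samples, labels, seed=0):
    """Keep the same number of samples per city (assumed strategy: downsample
    each city to the size of the least represented one)."""
    by_city = defaultdict(list)
    for sample, city in zip(samples, labels):
        by_city[city].append(sample)
    smallest = min(len(group) for group in by_city.values())
    rng = random.Random(seed)
    balanced = []
    for city in sorted(by_city):
        for sample in rng.sample(by_city[city], smallest):
            balanced.append((sample, city))
    return balanced


# Toy data (hypothetical): each inner list ranks cities from most to least likely.
ranked = [
    ["NYC", "Porto", "Tokyo"],
    ["Porto", "NYC", "Tokyo"],
    ["Tokyo", "Porto", "NYC"],
    ["NYC", "Tokyo", "Porto"],
]
truth = ["NYC", "NYC", "NYC", "Porto"]

top1 = top_k_accuracy(ranked, truth, 1)
top2 = top_k_accuracy(ranked, truth, 2)
balanced = balance_by_downsampling(["s1", "s2", "s3", "s4"], truth)
print(top1, top2, balanced)
```

In this toy set NYC contributes three of the four samples, so a model that favours NYC looks better on a test set with the same skew than it would on evenly represented cities, which is the over-estimation the abstract warns about.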
first_indexed 2024-12-12T00:30:20Z
format Article
id doaj.art-6551297c138f4958b10bc00cf5d85d61
institution Directory Open Access Journal
issn 1745-6150
language English
last_indexed 2024-12-12T00:30:20Z
spelling doaj.art-6551297c138f4958b10bc00cf5d85d61; 2022-12-22T00:44:30Z; eng; BMC; Biology Direct; ISSN 1745-6150; 2019-10-01; volume 14, issue 1, pages 1-10; DOI 10.1186/s13062-019-0252-y
Author affiliation (all seven authors): Department of Biochemistry and Microbiology, Rutgers University
topic Microbiome
Function analysis
Machine learning
mi-faser
MetaSUB