Using genomic sequencing for classical genetics in E. coli K12.

We here develop computational methods to facilitate use of 454 whole genome shotgun sequencing to identify mutations in Escherichia coli K12. We had Roche sequence eight related strains derived as spontaneous mutants in a background without a whole genome sequence. They provided difference tables ba...

Bibliographic Details
Main Authors: Eric Lyons, Michael Freeling, Sydney Kustu, William Inwood
Format: Article
Language: English
Published: Public Library of Science (PLoS) 2011-02-01
Series: PLoS ONE
Online Access: http://europepmc.org/articles/PMC3045373?pdf=render
author Eric Lyons
Michael Freeling
Sydney Kustu
William Inwood
collection DOAJ
description We here develop computational methods to facilitate use of 454 whole genome shotgun sequencing to identify mutations in Escherichia coli K12. We had Roche sequence eight related strains derived as spontaneous mutants in a background without a whole genome sequence. They provided difference tables based on assembling each genome to reference strain E. coli MG1655 (NC_000913). Due to the evolutionary distance to MG1655, these contained a large number of both false negatives and positives. By manual analysis of the dataset, we detected all the known mutations (24 at nine locations) and identified and genetically confirmed new mutations necessary and sufficient for the phenotypes we had selected in four strains. We then had Roche assemble contigs de novo, which we further assembled to full-length pseudomolecules based on synteny with MG1655. This hybrid method facilitated detection of insertion mutations and allowed annotation from MG1655. After removing one genome with less than the optimal 20- to 30-fold sequence coverage, we identified 544 putative polymorphisms that included all of the known and selected mutations apart from insertions. Finally, we detected seven new mutations in a total of only 41 candidates by comparing single genomes to composite data for the remaining six and using a ranking system to penalize homopolymer sequencing and misassembly errors. An additional benefit of the analysis is a table of differences between MG1655 and a physiologically robust E. coli wild-type strain NCM3722. Both projects were greatly facilitated by use of comparative genomics tools in the CoGe software package (http://genomevolution.org/).
first_indexed 2024-12-13T07:40:39Z
format Article
id doaj.art-823fa0e3def84f71b0c25f5eb666cda2
institution Directory of Open Access Journals
issn 1932-6203
language English
last_indexed 2024-12-13T07:40:39Z
publishDate 2011-02-01
publisher Public Library of Science (PLoS)
record_format Article
series PLoS ONE
spelling doaj.art-823fa0e3def84f71b0c25f5eb666cda2 | 2022-12-21T23:54:57Z | eng | Public Library of Science (PLoS) | PLoS ONE | 1932-6203 | 2011-02-01 | 6 | 2 | e16717 | 10.1371/journal.pone.0016717 | Using genomic sequencing for classical genetics in E. coli K12. | Eric Lyons | Michael Freeling | Sydney Kustu | William Inwood | http://europepmc.org/articles/PMC3045373?pdf=render
title Using genomic sequencing for classical genetics in E. coli K12.
url http://europepmc.org/articles/PMC3045373?pdf=render