Comparison of mouse and human genomes followed by experimental verification yields an estimated 1,019 additional genes.

A primary motivation for sequencing the mouse genome was to accelerate the discovery of mammalian genes by using sequence conservation between mouse and human to identify coding exons. Achieving this goal proved challenging because of the large proportion of the mouse and human genomes that is appar...

Bibliographic Details
Main Authors: Guigo, R, Dermitzakis, E, Agarwal, P, Ponting, C, Parra, G, Reymond, A, Abril, J, Keibler, E, Lyle, R, Ucla, C, Antonarakis, SE, Brent, MR
Format: Journal article
Language: English
Published: 2003
description A primary motivation for sequencing the mouse genome was to accelerate the discovery of mammalian genes by using sequence conservation between mouse and human to identify coding exons. Achieving this goal proved challenging because of the large proportion of the mouse and human genomes that is apparently conserved but apparently does not code for protein. We developed a two-stage procedure that exploits the mouse and human genome sequences to produce a set of genes with a much higher rate of experimental verification than previously reported prediction methods. RT-PCR amplification and direct sequencing applied to an initial sample of mouse predictions that do not overlap previously known genes verified the regions flanking one intron in 139 predictions, with verification rates reaching 76%. On average, the confirmed predictions show more restricted expression patterns than the mouse orthologs of known human genes, and two-thirds lack homologs in fish genomes, demonstrating the sensitivity of this dual-genome approach to hard-to-find genes. We verified 112 previously unknown homologs of known proteins, including two homeobox proteins relevant to developmental biology, an aquaporin, and a homolog of dystrophin. We estimate that transcription and splicing can be verified for >1,000 gene predictions identified by this method that do not overlap known genes. This is likely to constitute a significant fraction of the previously unknown, multiexon mammalian genes.
first_indexed 2024-03-06T21:18:24Z
format Journal article
id oxford-uuid:40958650-fb35-458e-af98-8c33c553cccb
institution University of Oxford
language English
last_indexed 2024-03-06T21:18:24Z
publishDate 2003
record_format dspace