Tandem-genotypes: robust detection of tandem repeat expansions from long DNA reads

Abstract: Tandemly repeated DNA is highly mutable and causes at least 31 diseases, but it is hard to detect pathogenic repeat expansions genome-wide. Here, we report robust detection of human repeat expansions from careful alignments of long but error-prone (PacBio and nanopore) reads to a reference...

Bibliographic Details
Main Authors: Satomi Mitsuhashi, Martin C. Frith, Takeshi Mizuguchi, Satoko Miyatake, Tomoko Toyota, Hiroaki Adachi, Yoko Oma, Yoshihiro Kino, Hiroaki Mitsuhashi, Naomichi Matsumoto
Format: Article
Language: English
Published: BMC 2019-03-01
Series: Genome Biology
Subjects: Tandem repeat, Repeat diseases, Long-read sequencing, Nanopore, PacBio
Online Access: http://link.springer.com/article/10.1186/s13059-019-1667-6
author Satomi Mitsuhashi
Martin C. Frith
Takeshi Mizuguchi
Satoko Miyatake
Tomoko Toyota
Hiroaki Adachi
Yoko Oma
Yoshihiro Kino
Hiroaki Mitsuhashi
Naomichi Matsumoto
collection DOAJ
description Abstract: Tandemly repeated DNA is highly mutable and causes at least 31 diseases, but it is hard to detect pathogenic repeat expansions genome-wide. Here, we report robust detection of human repeat expansions from careful alignments of long but error-prone (PacBio and nanopore) reads to a reference genome. Our method is robust to systematic sequencing errors, inexact repeats with fuzzy boundaries, and low sequencing coverage. By comparing to healthy controls, we prioritize pathogenic expansions within the top 10 out of 700,000 tandem repeats in whole genome sequencing data. This may help to elucidate the many genetic diseases whose causes remain unknown.
first_indexed 2024-12-21T20:28:16Z
format Article
id doaj.art-375652e5b6a34ae1afdfc702a64030a5
institution Directory of Open Access Journals
issn 1474-760X
language English
last_indexed 2024-12-21T20:28:16Z
publishDate 2019-03-01
publisher BMC
record_format Article
series Genome Biology
spelling doaj.art-375652e5b6a34ae1afdfc702a64030a5; 2022-12-21T18:51:19Z; eng; BMC; Genome Biology; 1474-760X; 2019-03-01; vol. 20, no. 1, pp. 1-17; doi:10.1186/s13059-019-1667-6
Satomi Mitsuhashi: Department of Human Genetics, Yokohama City University Graduate School of Medicine
Martin C. Frith: Artificial Intelligence Research Center, National Institute of Advanced Industrial Science and Technology (AIST)
Takeshi Mizuguchi: Department of Human Genetics, Yokohama City University Graduate School of Medicine
Satoko Miyatake: Department of Human Genetics, Yokohama City University Graduate School of Medicine
Tomoko Toyota: Department of Neurology, University of Occupational and Environmental Health School of Medicine
Hiroaki Adachi: Department of Neurology, University of Occupational and Environmental Health School of Medicine
Yoko Oma: Department of Liberal Arts, Faculty of Medicine, Saitama Medical University
Yoshihiro Kino: Department of Bioinformatics and Molecular Neuropathology, Meiji Pharmaceutical University
Hiroaki Mitsuhashi: Department of Applied Biochemistry, School of Engineering, Tokai University
Naomichi Matsumoto: Department of Human Genetics, Yokohama City University Graduate School of Medicine
title Tandem-genotypes: robust detection of tandem repeat expansions from long DNA reads
topic Tandem repeat
Repeat diseases
Long-read sequencing
Nanopore
PacBio
url http://link.springer.com/article/10.1186/s13059-019-1667-6