LUSTR: a new customizable tool for calling genome-wide germline and somatic short tandem repeat variants

Bibliographic Details
Main Authors: Jinfeng Lu, Camilo Toro, David R. Adams, Undiagnosed Diseases Network, Cristiane Araujo Martins Moreno, Wan-Ping Lee, Yuk Yee Leung, Mathew B. Harms, Badri Vardarajan, Erin L. Heinzen
Format: Article
Language: English
Published: BMC 2024-01-01
Series: BMC Genomics
Subjects: Short tandem repeats, Bioinformatics, Variant calling tool kit, Somatic, LUSTR
Online Access: https://doi.org/10.1186/s12864-023-09935-9
author Jinfeng Lu
Camilo Toro
David R. Adams
Undiagnosed Diseases Network
Cristiane Araujo Martins Moreno
Wan-Ping Lee
Yuk Yee Leung
Mathew B. Harms
Badri Vardarajan
Erin L. Heinzen
collection DOAJ
description Abstract. Background: Short tandem repeats (STRs) are widely distributed across the human genome and are associated with numerous neurological disorders. However, the extent to which STRs contribute to disease is likely underestimated because of the challenges of calling these variants in short-read next-generation sequencing data. Several computational tools have been developed for STR variant calling, but none fully addresses all of the complexities associated with this variant class. Results: Here we introduce LUSTR, which is designed to address some of the challenges associated with STR variant calling by enabling more flexibility in defining STR loci, allowing for customizable modules to tailor analyses, and expanding the capability to call somatic and multiallelic STR variants. LUSTR is a user-friendly and easily customizable tool for targeted or unbiased genome-wide STR variant screening that can use either predefined or novel genome builds. Using both simulated and real data sets, we demonstrated that LUSTR accurately infers germline and somatic STR expansions in individuals with and without diseases. Conclusions: LUSTR offers a powerful and user-friendly approach that allows for the identification of STR variants and can facilitate more comprehensive studies evaluating the role of pathogenic STR variants across human diseases.
first_indexed 2024-03-08T10:00:29Z
format Article
id doaj.art-10f83745c1f7435fb5ac3dd6aa04ee9f
institution Directory of Open Access Journals
issn 1471-2164
language English
last_indexed 2024-03-08T10:00:29Z
publishDate 2024-01-01
publisher BMC
record_format Article
series BMC Genomics
spelling doaj.art-10f83745c1f7435fb5ac3dd6aa04ee9f
2024-01-29T10:59:47Z
eng
BMC
BMC Genomics, ISSN 1471-2164
2024-01-01, vol. 25, no. 1, pp. 1-22
doi: 10.1186/s12864-023-09935-9
Jinfeng Lu (Division of Pharmacotherapy and Experimental Therapeutics, Eshelman School of Pharmacy, University of North Carolina at Chapel Hill)
Camilo Toro (NIH Undiagnosed Diseases Program, National Human Genome Research Institute (NHGRI), National Institutes of Health)
David R. Adams (NIH Undiagnosed Diseases Program, National Human Genome Research Institute (NHGRI), National Institutes of Health)
Undiagnosed Diseases Network (NIH Undiagnosed Diseases Program, National Human Genome Research Institute (NHGRI), National Institutes of Health)
Cristiane Araujo Martins Moreno (Neurology Department, Universidade de São Paulo)
Wan-Ping Lee (Penn Neurodegeneration Genomics Center, Department of Pathology and Laboratory Medicine, Perelman School of Medicine, University of Pennsylvania)
Yuk Yee Leung (Penn Neurodegeneration Genomics Center, Department of Pathology and Laboratory Medicine, Perelman School of Medicine, University of Pennsylvania)
Mathew B. Harms (Department of Neurology, Division of Neuromuscular Medicine, Columbia University Irving Medical Center)
Badri Vardarajan (The Taub Institute for Research On Alzheimer’s Disease and the Aging Brain, Gertrude H. Sergievsky Center, Department of Neurology, College of Physicians and Surgeons, Columbia University, The New York Presbyterian Hospital)
Erin L. Heinzen (Division of Pharmacotherapy and Experimental Therapeutics, Eshelman School of Pharmacy, University of North Carolina at Chapel Hill)
title LUSTR: a new customizable tool for calling genome-wide germline and somatic short tandem repeat variants
topic Short tandem repeats
Bioinformatics
Variant calling tool kit
Somatic
LUSTR
url https://doi.org/10.1186/s12864-023-09935-9