PyPop: a mature open-source software pipeline for population genomics

Python for Population Genomics (PyPop) is a software package that processes genotype and allele data and performs large-scale population genetic analyses on highly polymorphic multi-locus genotype data. In particular, PyPop tests data conformity to Hardy-Weinberg equilibrium expectations, performs E...


Bibliographic Details
Main Authors: Alexander K. Lancaster, Richard M. Single, Steven J. Mack, Vanessa Sochat, Michael P. Mariani, Gordon D. Webster
Format: Article
Language: English
Published: Frontiers Media S.A. 2024-04-01
Series: Frontiers in Immunology
Subjects: HLA, MHC, population genomics, software, bioinformatics
Online Access: https://www.frontiersin.org/articles/10.3389/fimmu.2024.1378512/full
collection DOAJ
description Python for Population Genomics (PyPop) is a software package that processes genotype and allele data and performs large-scale population genetic analyses on highly polymorphic multi-locus genotype data. In particular, PyPop tests data conformity to Hardy-Weinberg equilibrium expectations, performs Ewens-Watterson tests for selection, estimates haplotype frequencies, measures linkage disequilibrium, and tests significance. Standardized means of performing these tests is key for contemporary studies of evolutionary biology and population genetics, and these tests are central to genetic studies of disease association as well. Here, we present PyPop 1.0.0, a new major release of the package, which implements new features using the more robust infrastructure of GitHub, and is distributed via the industry-standard Python Package Index. New features include implementation of the asymmetric linkage disequilibrium measures and, of particular interest to the immunogenetics research communities, support for modern nomenclature, including colon-delimited allele names, and improvements to meta-analysis features for aggregating outputs for multiple populations. Code available at: https://zenodo.org/records/10080668 and https://github.com/alexlancaster/pypop
first_indexed 2024-04-24T15:10:58Z
format Article
id doaj.art-be39c84a44ef4727a564110bf1bb8b67
institution Directory Open Access Journal
issn 1664-3224
language English
last_indexed 2024-04-24T15:10:58Z
publishDate 2024-04-01
publisher Frontiers Media S.A.
record_format Article
series Frontiers in Immunology
spelling doaj.art-be39c84a44ef4727a564110bf1bb8b67, 2024-04-02T11:01:25Z, eng, Frontiers Media S.A., Frontiers in Immunology, ISSN 1664-3224, 2024-04-01, volume 15, DOI 10.3389/fimmu.2024.1378512, article 1378512
Author affiliations:
Alexander K. Lancaster: Amber Biology LLC, Cambridge, MA, United States; Ronin Institute, Montclair, NJ, United States; Institute for Globally Distributed Open Research and Education (IGDORE), Cambridge, MA, United States
Richard M. Single: Department of Mathematics and Statistics, University of Vermont, Burlington, VT, United States
Steven J. Mack: Department of Pediatrics, University of California, San Francisco, Oakland, CA, United States
Vanessa Sochat: Livermore Computing, Lawrence Livermore National Laboratory, Livermore, CA, United States
Michael P. Mariani: Department of Mathematics and Statistics, University of Vermont, Burlington, VT, United States; Mariani Systems LLC, Hanover, NH, United States
Gordon D. Webster: Amber Biology LLC, Cambridge, MA, United States; Ronin Institute, Montclair, NJ, United States
topic HLA
MHC
population genomics
software
bioinformatics
url https://www.frontiersin.org/articles/10.3389/fimmu.2024.1378512/full