PyPop: Python for Population Genomics
- PyPop 0.7.0 (latest release), used in the Solberg et al. Human Immunology meta-analysis paper, is available.
- Also available as an official Fedora package
PyPop (Python for Population Genomics) is an environment developed by the Thomson lab for doing large-scale population genetic analyses, including: (1) conformity to Hardy-Weinberg expectations; (2) tests for balancing or directional selection; and (3) estimates of haplotype frequencies (and their distributions), together with measures and tests of significance for linkage disequilibrium (LD).
It is an object-oriented framework implemented in Python, a language with powerful features for interfacing with other languages, such as C (in which we have already implemented many routines and which is particularly suited to computationally intensive tasks).
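To make the first analysis listed above concrete, here is a minimal sketch of a chi-square goodness-of-fit check against Hardy-Weinberg expectations for a single locus. It is an illustration only: the function name `hardy_weinberg_chi_square` and the input layout are assumptions made for this example, not PyPop's API, and PyPop's own tests may use different methods.

```python
from collections import Counter
from itertools import combinations_with_replacement


def hardy_weinberg_chi_square(genotypes):
    """Return (chi_square, degrees_of_freedom) for one locus.

    `genotypes` is a list of (allele, allele) pairs, one per individual.
    Hypothetical helper for illustration; not part of PyPop.
    """
    n = len(genotypes)
    # Count each unordered genotype and each allele copy.
    observed = Counter(tuple(sorted(g)) for g in genotypes)
    allele_counts = Counter(a for g in genotypes for a in g)
    freqs = {a: c / (2 * n) for a, c in allele_counts.items()}

    chi_square = 0.0
    alleles = sorted(freqs)
    for a, b in combinations_with_replacement(alleles, 2):
        # Expected count: n*p^2 for homozygotes, 2*n*p*q for heterozygotes.
        expected = n * freqs[a] * freqs[b] * (1 if a == b else 2)
        chi_square += (observed[(a, b)] - expected) ** 2 / expected

    k = len(alleles)
    # Genotype classes, minus 1, minus the (k - 1) estimated allele frequencies.
    dof = k * (k + 1) // 2 - 1 - (k - 1)
    return chi_square, dof


# Made-up sample that matches Hardy-Weinberg proportions exactly.
sample = [("A", "A")] * 25 + [("A", "B")] * 50 + [("B", "B")] * 25
chi_square, dof = hardy_weinberg_chi_square(sample)
print(chi_square, dof)
```

A large statistic relative to the chi-square distribution with `dof` degrees of freedom suggests deviation from Hardy-Weinberg proportions. With many rare alleles the expected counts become small and this approximation is unreliable.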
The output of the analyses is stored in XML. These output files can then be transformed using standard tools into many other data formats: formats suitable for machine input (such as PHYLIP, input for spreadsheet programs such as Excel, or input for statistical packages such as R), plain text, or HTML for human-readable output. Storing the output in XML allows the final viewable output format to be redesigned at will, without requiring the (often time-consuming) re-running of the analyses themselves.
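As a rough illustration of transforming stored XML without re-running any analysis, the sketch below flattens a small XML document into tab-separated text that a spreadsheet or R could read. The element and attribute names (`results`, `locus`, `allele`, `frequency`) are hypothetical and do not reflect PyPop's actual output schema, and Python's standard `xml.etree.ElementTree` module stands in here for whatever transformation tools are actually used.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Hypothetical XML layout, for illustration only; not PyPop's real schema.
EXAMPLE_XML = """\
<results>
  <locus name="A">
    <allele name="01" frequency="0.60"/>
    <allele name="02" frequency="0.40"/>
  </locus>
  <locus name="B">
    <allele name="07" frequency="1.00"/>
  </locus>
</results>
"""


def xml_to_tsv(xml_text):
    """Flatten the locus/allele tree into tab-separated text."""
    root = ET.fromstring(xml_text)
    buffer = io.StringIO()
    writer = csv.writer(buffer, delimiter="\t", lineterminator="\n")
    writer.writerow(["locus", "allele", "frequency"])
    for locus in root.iter("locus"):
        for allele in locus.iter("allele"):
            writer.writerow(
                [locus.get("name"), allele.get("name"), allele.get("frequency")]
            )
    return buffer.getvalue()


tsv_text = xml_to_tsv(EXAMPLE_XML)
print(tsv_text)
```

Because the analysis results live in the XML, a different presentation (for example HTML instead of tab-separated text) only needs a different transformation function, not a new analysis run.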
An outline of PyPop can be found in our 2007 Tissue Antigens and 2003 PSB papers. PyPop is available for download.
- 2017: all new development is now on GitHub; no official release yet
- 2008-09-09: 0.7.0 release (many new features and bug fixes) ← Most recent
- 2005-04-13: 0.6.0 released (new features and bug fixes)
- 2004-03-09: 0.5.2 released (bug fix release, fixes Windows 98 .bat file problems)
- 2004-02-26: 0.5.1 released (mainly a bug fix and maintenance release)
- 2003-12-31: 0.5 released (first public beta)
(Please note that links and documentation below always refer to the most recent release.)
Beta binary versions of PyPop are now publicly available (see "Getting and installing PyPop" link below for links to the current binaries) for both:
- GNU/Linux (the Linux binary needs a recent distribution that has a recent glibc (2.8 has been tested): Fedora Core 9 is known to work; earlier versions of Red Hat, such as the 7.x series and 8.0, are known not to work)
- Official Fedora package available! As of early 2009, pypop is also an official Fedora package which works with Fedora 9 and later. (It is also available for Red Hat Enterprise Linux 5 (RHEL 5) if you install the optional EPEL add-on repository). You can install via yum using:
yum install pypop
- Windows (the binary has been tested on Windows 98, 2000 and XP)
Both binary packages are approximately 5.5 MB downloads.
Source
PyPop is free software (sometimes referred to as open source software) and the source code is released under the terms of the "copyleft" GNU General Public License, or GPL (http://www.gnu.org/licenses/gpl.html). This means even if we haven't compiled a binary for your platform, it is possible for you to download the source code and compile it yourself.
Documentation for PyPop is contained in the PyPop User Guide.
- PyPop User Guide includes:
- PyPop User Guide is also available as a PDF [220 kB]
Please be aware that this is a beta release, so there are likely to be bugs and wrinkles to iron out. Please direct all questions to Alex Lancaster at the address below.
How to cite PyPop
When citing PyPop, please cite the (2007) paper from Tissue Antigens:
- A. K. Lancaster, R. M. Single, O. D. Solberg, M. P. Nelson and G. Thomson (2007) "PyPop update - a software pipeline for large-scale multilocus population genomics" Tissue Antigens 69 (s1), 192-197. [journal page, preprint PDF (112 kB)].
In addition, you can also cite our 2003 Pacific Symposium on Biocomputing paper:
- Alex Lancaster, Mark P. Nelson, Richard M. Single, Diogo Meyer, and Glenys Thomson (2003) "PyPop: a software framework for population genomics: analyzing large-scale multi-locus genotype data", in Pacific Symposium on Biocomputing vol. 8:514-525 (edited by R. B. Altman et al., World Scientific, Singapore, 2003) [PubMed Central, PDF (344 kB)].
Population data files
Population data files and online supporting materials for published studies listed in the Solberg et al. meta-analysis paper may be found here
PyPop is affiliated with ImmPort.org, the Immunology Database and Analysis Portal. The ImmPort system provides advanced information technology support in the production, analysis, archiving, and exchange of scientific data for the diverse community of life science researchers supported by NIAID/DAIT. The development of the ImmPort system was supported by the NIH/NIAID Bioinformatics Integration Support Contract (BISC), Phase II.
This work has benefited from the support of NIH grant AI49213 (13th IHW) and NIH/NIAID Contract number HHSN266200400076C N01-AI-40076. Thanks to Steven J. Mack, Kristie A. Mather, Steve Marsh, Mark Grote and Leslie Louie for helpful comments and testing.