PyPop.HardyWeinberg#

Module for calculating Hardy-Weinberg statistics.

Attributes#

`use_scipy`
`chi2`

Classes#

`HardyWeinberg`	Calculate Hardy-Weinberg statistics.
`HardyWeinbergGuoThompson`	Wrapper class for 'gthwe'.
`HardyWeinbergEnumeration`	Uses Hazael Maldonado Torres' exact enumeration test.
`HardyWeinbergGuoThompsonArlequin`	Wrapper class for 'Arlequin'.

Functions#

pval(chisq, dof)

Module Contents#

use_scipy = False#

chi2 = None#

pval(chisq, dof)#
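
The reference gives no description for `pval`, but its signature suggests that it converts a chi-square statistic and its degrees of freedom into a p-value. The block below is a minimal sketch of that calculation, assuming the upper-tail (survival) probability is what is wanted. The name `chi_square_upper_tail` is hypothetical and is not part of PyPop. It uses only the standard library, via the recurrence Q(x; k+2) = Q(x; k) + (x/2)^(k/2) e^(-x/2) / Γ(k/2 + 1).

```python
import math


def chi_square_upper_tail(chisq, dof):
    """Upper-tail probability P(X >= chisq) for X ~ chi-square(dof).

    Hypothetical helper for illustration; not PyPop's own pval().
    """
    if dof < 1 or chisq < 0:
        raise ValueError("need dof >= 1 and chisq >= 0")
    if chisq == 0:
        return 1.0

    half = chisq / 2.0
    # Closed forms for the two base cases of the recurrence.
    if dof % 2 == 1:
        q, k = math.erfc(math.sqrt(half)), 1   # dof = 1
    else:
        q, k = math.exp(-half), 2              # dof = 2

    # Step up two degrees of freedom at a time until dof is reached.
    while k < dof:
        log_term = (k / 2.0) * math.log(half) - half - math.lgamma(k / 2.0 + 1)
        q += math.exp(log_term)
        k += 2
    return q


# 3.841... is the 95th percentile of chi-square with 1 degree of freedom.
p = chi_square_upper_tail(3.841458820694124, 1)
```

Where SciPy is installed, `scipy.stats.chi2.sf(chisq, dof)` computes the same quantity.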

class HardyWeinberg(locusData=None, alleleCount=None, lumpBelow=5, flagChenTest=0, debug=0)#

Calculate Hardy-Weinberg statistics.

Given the observed genotypes for a locus, calculate the expected genotype counts based on Hardy Weinberg proportions for individual genotype values, and test for fit.
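
As an illustration of the calculation described above, the following sketch derives allele frequencies from observed genotype counts, computes the expected counts under Hardy-Weinberg proportions (n·p_i² for homozygotes, 2·n·p_i·p_j for heterozygotes), and sums the chi-square goodness-of-fit statistic. The function names and the dictionary-based input format are assumptions made for this example; they are not the class's actual interface.

```python
from collections import Counter
from itertools import combinations_with_replacement


def hw_expected_counts(observed):
    """Expected genotype counts under Hardy-Weinberg proportions.

    `observed` maps sorted allele pairs, e.g. ("A", "B"), to genotype counts.
    (Hypothetical input format chosen for this sketch.)
    """
    n = sum(observed.values())            # number of individuals
    allele_counts = Counter()
    for (a, b), count in observed.items():
        allele_counts[a] += count         # a homozygote contributes twice
        allele_counts[b] += count
    freqs = {a: c / (2 * n) for a, c in allele_counts.items()}

    expected = {}
    for a, b in combinations_with_replacement(sorted(freqs), 2):
        if a == b:
            expected[(a, b)] = n * freqs[a] ** 2
        else:
            expected[(a, b)] = 2 * n * freqs[a] * freqs[b]
    return expected


def chi_square_statistic(observed, expected):
    """Sum of (observed - expected)^2 / expected over all genotype classes."""
    return sum((observed.get(g, 0) - e) ** 2 / e for g, e in expected.items())


observed = {("A", "A"): 30, ("A", "B"): 40, ("B", "B"): 30}
expected = hw_expected_counts(observed)
chisq = chi_square_statistic(observed, expected)
```

With two alleles at frequency 0.5 each and 100 individuals, the expected counts are 25, 50, and 25, so the statistic is 1 + 2 + 1 = 4.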

Constructor.

locusData and alleleCount are supplied by the driver script via a call to ParseFile.getLocusData(locus).
lumpBelow: treat alleles with frequency less than this as if they were in the same class (default: 5).
flagChenTest: if enabled, compute Chen’s chi-square-based “corrected” p-value (default: 0, i.e. False).

serializeTo(stream, allelelump=0)#

serializeXMLTableTo(stream)#

class HardyWeinbergGuoThompson(locusData=None, alleleCount=None, runMCMCTest=0, runPlainMCTest=0, dememorizationSteps=2000, samplingNum=1000, samplingSize=1000, maxMatrixSize=250, monteCarloSteps=1000000, testing=False, **kw)#

Bases: HardyWeinberg

Inheritance diagram of PyPop.HardyWeinberg.HardyWeinbergGuoThompson

Wrapper class for ‘gthwe’

A wrapper for the Guo & Thompson program ‘gthwe’.

‘locusData’, ‘alleleCount’: As per base class.

Beyond the base-class arguments, this class accepts the following keywords:

‘runMCMCTest’: if enabled, run the Markov chain Monte Carlo (MCMC) version of the test (what is normally referred to as “Guo & Thompson”).
‘runPlainMCTest’: if enabled, run a plain Monte Carlo (randomization) version of the test, without the Markov chain (this is also described in the original “Guo & Thompson” Biometrics paper, but was not in their original program).
‘dememorizationSteps’: number of initial ‘dememorization’ steps for the random number generator (default 2000).
‘samplingNum’: number of chunks for the random number generator (default 1000).
‘samplingSize’: size of each chunk (default 1000).
‘maxMatrixSize’: maximum size of the ‘flattened’ lower-triangular matrix of observed alleles (default 250).
‘monteCarloSteps’: number of steps for the plain Monte Carlo randomization test, without the Markov chain (default 1000000).
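
To make the idea of a plain Monte Carlo randomization test concrete, here is a minimal sketch. It is not the ‘gthwe’ implementation. It shuffles the pooled alleles, re-pairs them into genotypes, and reports the fraction of resampled genotype tables that are no more probable than the observed one. Because allele counts are fixed by the shuffle, only the table-dependent part of the conditional probability is compared (H·ln 2 − Σ ln n_ij!, where H is the number of heterozygotes and n_ij are genotype counts). All function names and the input format are assumptions made for this example.

```python
import math
import random
from collections import Counter


def log_table_weight(genotypes):
    """Table-dependent part of the log probability of a genotype table,
    given fixed allele counts: H*ln(2) - sum(ln(n_ij!))."""
    counts = Counter(tuple(sorted(g)) for g in genotypes)
    n_het = sum(c for (a, b), c in counts.items() if a != b)
    return n_het * math.log(2) - sum(math.lgamma(c + 1) for c in counts.values())


def plain_mc_hw_pvalue(genotypes, steps=1000, seed=None):
    """Estimate a Hardy-Weinberg p-value by random re-pairing of alleles.

    Hypothetical helper: `genotypes` is a list of (allele, allele) tuples.
    """
    rng = random.Random(seed)
    observed = log_table_weight(genotypes)
    alleles = [a for g in genotypes for a in g]

    as_extreme = 0
    for _ in range(steps):
        rng.shuffle(alleles)
        # Pair consecutive alleles to form one resampled genotype table.
        resampled = list(zip(alleles[0::2], alleles[1::2]))
        # Small tolerance so that tables of equal probability count as ties.
        if log_table_weight(resampled) <= observed + 1e-9:
            as_extreme += 1
    return as_extreme / steps


# A sample with no heterozygotes at all departs strongly from expectation.
no_hets = [("A", "A")] * 10 + [("B", "B")] * 10
p_no_hets = plain_mc_hw_pvalue(no_hets, steps=500, seed=1)

# A sample in exact Hardy-Weinberg proportions (5 : 10 : 5).
balanced = [("A", "A")] * 5 + [("A", "B")] * 10 + [("B", "B")] * 5
p_balanced = plain_mc_hw_pvalue(balanced, steps=500, seed=1)
```

In this sketch the `steps` argument plays the role that ‘monteCarloSteps’ plays in the class: more steps give a more precise estimate at proportionally higher cost.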

Constructor.

locusData and alleleCount are supplied by the driver script via a call to ParseFile.getLocusData(locus).
lumpBelow: treat alleles with frequency less than this as if they were in the same class (default: 5).
flagChenTest: if enabled, compute Chen’s chi-square-based “corrected” p-value (default: 0, i.e. False).

generateFlattenedMatrix()#
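
The reference does not document how `generateFlattenedMatrix()` lays out its result. As a rough illustration of what a ‘flattened’ lower-triangular matrix can look like, the sketch below stores the rows of a triangular genotype-count matrix one after another in a single list. The row-by-row ordering and both function names are assumptions for this example, not the method's documented behaviour.

```python
def flatten_lower_triangle(rows):
    """Concatenate the rows of a lower-triangular matrix into one list.

    `rows[i]` must hold the i + 1 entries of row i (columns 0..i).
    """
    flat = []
    for i, row in enumerate(rows):
        if len(row) != i + 1:
            raise ValueError(f"row {i} should have {i + 1} entries")
        flat.extend(row)
    return flat


def flat_index(i, j):
    """Position of genotype (i, j) in the flattened list (order-insensitive)."""
    if j > i:
        i, j = j, i
    return i * (i + 1) // 2 + j


# Genotype counts for three alleles: row i, column j <= i.
rows = [
    [3],
    [2, 5],
    [1, 0, 4],
]
flat = flatten_lower_triangle(rows)
```

A matrix for k alleles flattens to k(k+1)/2 entries, so if ‘maxMatrixSize’ bounds that length, the default of 250 would accommodate 21 alleles (231 entries) but not 22 (253 entries).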

dumpTable(locusName, stream, allelelump=0)#

class HardyWeinbergEnumeration(locusData=None, alleleCount=None, doOverall=0, **kw)#

Bases: HardyWeinbergGuoThompson

Inheritance diagram of PyPop.HardyWeinberg.HardyWeinbergEnumeration

Uses Hazael Maldonado Torres’ exact enumeration test

‘doOverall’: if set to true (‘1’), run the overall p-value test; the default is false (‘0’).

Constructor.

locusData and alleleCount are supplied by the driver script via a call to ParseFile.getLocusData(locus).
lumpBelow: treat alleles with frequency less than this as if they were in the same class (default: 5).
flagChenTest: if enabled, compute Chen’s chi-square-based “corrected” p-value (default: 0, i.e. False).

serializeTo(stream, allelelump=0)#

class HardyWeinbergGuoThompsonArlequin(matrix=None, locusName=None, arlequinExec='arlecore.exe', markovChainStepsHW=100000, markovChainDememorisationStepsHW=1000, untypedAllele='****', debug=None)#

Wrapper class for ‘Arlequin’.

This class extracts the Hardy-Weinberg (HW) statistics using the Arlequin implementation of the HW exact test, in the following steps:

a subdirectory ‘arlequinRuns’ is created, in which all the Arlequin-specific files are generated;
the specified Arlequin executable is run, generating the Arlequin output HTML files (*.htm);
the Arlequin output is parsed for the relevant statistics;
lastly, the ‘arlequinRuns’ directory is removed.

Because the directory name ‘arlequinRuns’ is currently hardcoded, this class cannot be invoked concurrently.
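
The create, use, and remove lifecycle described above can be sketched with a context manager. This is an illustration only and not how the class is implemented: it uses a uniquely named temporary directory instead of the hardcoded name, which is one way the concurrency limitation could be avoided. The helper name `scratch_dir` and its prefix are hypothetical.

```python
import os
import shutil
import tempfile
from contextlib import contextmanager


@contextmanager
def scratch_dir(prefix="arlequinRuns-"):
    """Create a uniquely named working directory and remove it on exit."""
    path = tempfile.mkdtemp(prefix=prefix)
    try:
        yield path
    finally:
        # Clean up even if the body raised an exception.
        shutil.rmtree(path, ignore_errors=True)


# Two overlapping runs get distinct directories, so they cannot collide.
with scratch_dir() as first, scratch_dir() as second:
    distinct = first != second
    existed = os.path.isdir(first) and os.path.isdir(second)
    # Input files would be written here, the external program run,
    # and its output parsed before the directories are removed.

removed = not os.path.exists(first) and not os.path.exists(second)
```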

Parameters:

‘markovChainStepsHW’: number of steps to use in the Markov chain (default: 100000).
‘markovChainDememorisationStepsHW’: “burn-in” time for the Markov chain (default: 1000).

serializeTo(stream)#