PyPop.DataTypes#

Module for storing genotype and allele count data.

Classes#

Genotypes

Base class that stores and caches basic genotype statistics.

AlleleCounts

WARNING: this class is now obsolete; the Genotypes class now holds allele count data as a pseudo-genotype matrix.

Functions#

checkIfSequenceData(matrix)

getMetaLocus(locus, isSequenceData)

getLocusPairs(matrix, sequenceData)

Returns a list of all pairs of loci from a given StringMatrix.

getLumpedDataLevels(genotypeData, locus, lumpLevels)

Returns a dictionary of tuples with alleleCount and locusData lumped by different levels specified as a list of integers.

Module Contents#

class Genotypes(matrix=None, untypedAllele='****', unsequencedSite=None, allowSemiTyped=0, debug=0)#

Base class that stores and caches basic genotype statistics.

getLocusList()#

Returns the list of loci.

Note: loci at which every individual is untyped have been filtered out of this list.

Note 2: the order of this list is now fixed for the lifetime of the object.
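The filtering described in the first note can be illustrated with a plain-Python sketch. It does not use PyPop itself: `typed_loci` and the dictionary layout are hypothetical, and treating a locus as typed when at least one allele differs from the untyped marker is an assumption about the rule.

```python
UNTYPED = "****"  # matches the documented default for untypedAllele


def typed_loci(genotypes_by_locus, untyped=UNTYPED):
    """Return locus names, in input order, that have at least one typed allele.

    Hypothetical helper: 'genotypes_by_locus' maps a locus name to a list of
    (allele1, allele2) pairs, one pair per individual.
    """
    return [
        locus
        for locus, pairs in genotypes_by_locus.items()
        if any(allele != untyped for pair in pairs for allele in pair)
    ]


sample = {
    "A": [("01:01", "02:01"), ("****", "****")],
    "B": [("****", "****"), ("****", "****")],  # every individual untyped
    "C": [("07:02", "****"), ("08:01", "08:01")],
}
loci = typed_loci(sample)
```

Here locus `B` is dropped because no individual is typed at it, while the order of the remaining loci follows the input.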

getAlleleCount()#

Return allele count statistics for all loci.

Return a map of tuples where the key is the locus name. Each tuple is a triple, consisting of a map keyed by alleles containing counts, the total count at that locus and the number of untyped individuals.
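The shape of this return value can be sketched in plain Python, without PyPop. The helper names below are hypothetical, and two details are assumptions rather than documented behavior: the total is taken to be the number of typed alleles, and an individual is counted as untyped if either allele equals the untyped marker.

```python
from collections import Counter

UNTYPED = "****"  # matches the documented default for untypedAllele


def allele_count_at(pairs, untyped=UNTYPED):
    """Build an (allele_counts, total, n_untyped) triple for one locus."""
    counts = Counter()
    n_untyped = 0
    for allele1, allele2 in pairs:
        if untyped in (allele1, allele2):
            # Assumption: one untyped allele makes the individual untyped.
            n_untyped += 1
            continue
        counts.update((allele1, allele2))
    return dict(counts), sum(counts.values()), n_untyped


def allele_count(genotypes_by_locus):
    """Map each locus name to its triple, mirroring the documented layout."""
    return {
        locus: allele_count_at(pairs)
        for locus, pairs in genotypes_by_locus.items()
    }


sample = {
    "A": [("01:01", "02:01"), ("01:01", "01:01"), ("****", "****")],
    "B": [("07:02", "08:01"), ("07:02", "****"), ("08:01", "08:01")],
}
result = allele_count(sample)
counts_a, total_a, untyped_a = result["A"]
```

Unpacking the triple, as in the last line, is the natural way to consume one entry of the map.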

getAlleleCountAt(locus, lumpValue=0)#

Return allele count for given locus.

  • ‘lumpValue’: the specified amount of lumping (Default: 0)

Given a locus name, return a tuple consisting of a map keyed by alleles containing counts, the total count at that locus, and the number of untyped individuals.

serializeSubclassMetadataTo(stream)#

Serialize subclass-specific metadata.

Specifically, total number of individuals and loci and population name.

serializeAlleleCountDataAt(stream, locus)#

serializeAlleleCountDataTo(stream)#

getLocusDataAt(locus, lumpValue=0)#

Returns the genotyped data for specified locus.

Given a ‘locus’, return a list of genotypes, one per individual, each a 2-tuple containing that individual's two alleles.

  • ‘lumpValue’: the specified amount of lumping (Default: 0)

Note: this list has filtered out all individuals that are untyped at either chromosome.

Note 2: each genotype is sorted so that allele1 < allele2, alphabetically.
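The two notes above can be shown on a small example. This is a plain-Python sketch rather than PyPop's code: `locus_data_at` is a hypothetical stand-in, lumping is left out, and preserving the order of individuals is an assumption.

```python
UNTYPED = "****"  # matches the documented default for untypedAllele


def locus_data_at(pairs, untyped=UNTYPED):
    """Drop individuals with an untyped allele and sort each remaining pair."""
    return [
        tuple(sorted(pair))      # Note 2: allele1 <= allele2 alphabetically
        for pair in pairs
        if untyped not in pair   # Note 1: untyped at either chromosome
    ]


raw = [
    ("02:01", "01:01"),
    ("****", "03:01"),   # removed: one allele is untyped
    ("03:01", "03:01"),
    ("24:02", "11:01"),
]
cleaned = locus_data_at(raw)
```

Sorting within each pair means a heterozygote is represented by a single form however the input listed its alleles.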

getLocusData()#

Returns the genotyped data for all loci.

Returns a dictionary, keyed by locus name, whose values are lists of 2-tuples as defined by ‘getLocusDataAt()’.

getIndividualsData()#

Returns the individual data.

Returns a ‘StringMatrix’.

checkIfSequenceData(matrix)#

getMetaLocus(locus, isSequenceData)#

getLocusPairs(matrix, sequenceData)#

Returns a list of all pairs of loci from a given StringMatrix.
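As a rough illustration of "all pairs of loci", the sketch below enumerates unordered pairs from a list of locus names. It is not the PyPop implementation: `locus_pairs` is a hypothetical helper that takes names directly instead of a StringMatrix, and the tuple representation and ordering are assumptions, since the actual return format is not specified here.

```python
from itertools import combinations


def locus_pairs(loci):
    """Return every unordered pair of distinct loci, in input order."""
    return list(combinations(loci, 2))


pairs = locus_pairs(["A", "B", "C"])
```

For n loci this yields n(n-1)/2 pairs, so three loci give three pairs.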

getLumpedDataLevels(genotypeData, locus, lumpLevels)#

Returns a dictionary of tuples with alleleCount and locusData lumped by different levels specified as a list of integers.
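The description leaves the meaning of a lump level open. One plausible reading, assumed in the sketch below, is that alleles whose count is at most the level are pooled into a single category. Everything here is hypothetical and independent of PyPop: the `"lump"` label, the helper names, keying the result by level, and using a plain count map for the first tuple element.

```python
from collections import Counter

LUMP = "lump"  # hypothetical label for the pooled rare alleles


def lump_locus(pairs, lump_value):
    """Return (allele_counts, locus_data) with rare alleles pooled.

    Assumption: an allele is 'rare' when its count is <= lump_value.
    """
    counts = Counter(allele for pair in pairs for allele in pair)
    rare = {allele for allele, n in counts.items() if n <= lump_value}
    lumped_pairs = [
        tuple(sorted(LUMP if allele in rare else allele for allele in pair))
        for pair in pairs
    ]
    lumped_counts = Counter(allele for pair in lumped_pairs for allele in pair)
    return dict(lumped_counts), lumped_pairs


def lumped_levels(pairs, lump_levels):
    """Map each requested level to its (allele_counts, locus_data) tuple."""
    return {level: lump_locus(pairs, level) for level in lump_levels}


pairs = [
    ("01:01", "02:01"),
    ("01:01", "01:01"),
    ("01:01", "03:01"),
    ("02:01", "24:02"),
]
levels = lumped_levels(pairs, [0, 1, 2])
```

Under this reading, level 0 leaves the data unchanged, and higher levels progressively merge low-count alleles while keeping the total allele count constant.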

class AlleleCounts(alleleTable=None, locusName=None, debug=0)#

WARNING: this class is now obsolete; the Genotypes class now holds allele count data as a pseudo-genotype matrix.

Class to store information in allele count form.

serializeSubclassMetadataTo(stream)#

Serialize subclass-specific metadata.

Specifically, total number of alleles and loci.

serializeAlleleCountDataAt(stream, locus)#

getAlleleCount()#

getLocusName()#