PyPop.Filter

Module for filtering data files.

Filters and cleans data before it is accepted as input to PyPop analysis routines.

Exceptions

SubclassError

Common base class for all non-exit exceptions.

Classes

Filter

Abstract base class for Filters

PassThroughFilter

A filter that doesn't change input data.

AnthonyNolanFilter

Filters data via Anthony Nolan's allele call data.

BinningFilter

Filters data through rules defined in one file for each locus.

AlleleCountAnthonyNolanFilter

Filters data with an allele count less than a threshold.

Module Contents

exception SubclassError

Bases: Exception

Inheritance diagram of PyPop.Filter.SubclassError

Common base class for all non-exit exceptions.

class Filter

Bases: abc.ABC

Inheritance diagram of PyPop.Filter.Filter

Abstract base class for Filters

abstractmethod doFiltering(matrix=None)
abstractmethod startFirstPass(locus)
abstractmethod checkAlleleName(alleleName)
abstractmethod addAllele(alleleName)
abstractmethod endFirstPass()
abstractmethod startFiltering()
abstractmethod filterAllele(alleleName)
abstractmethod endFiltering()
abstractmethod writeToLog(logstring=None)
abstractmethod cleanup()
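
Because Filter derives from abc.ABC and every method is marked abstract, a subclass can only be instantiated once it overrides all of them. The sketch below illustrates that mechanism with a reduced stand-in class, so it runs without PyPop installed. SketchFilter, IncompleteFilter, UpperCaseFilter, and can_instantiate are hypothetical names, only three of the ten methods are mirrored, and the upper-casing behaviour is purely illustrative.

```python
from abc import ABC, abstractmethod


class SketchFilter(ABC):
    """Hypothetical stand-in for PyPop.Filter.Filter (not the real class)."""

    @abstractmethod
    def startFiltering(self): ...

    @abstractmethod
    def filterAllele(self, alleleName): ...

    @abstractmethod
    def endFiltering(self): ...


class IncompleteFilter(SketchFilter):
    # Overrides only one abstract method, so the class remains abstract.
    def filterAllele(self, alleleName):
        return alleleName


class UpperCaseFilter(SketchFilter):
    """Illustrative concrete filter: upper-cases every allele name."""

    def startFiltering(self):
        self.seen = 0

    def filterAllele(self, alleleName):
        self.seen += 1
        return alleleName.upper()

    def endFiltering(self):
        return self.seen


def can_instantiate(cls):
    # abc raises TypeError when abstract methods are left unimplemented.
    try:
        cls()
    except TypeError:
        return False
    return True
```

Here can_instantiate(UpperCaseFilter) succeeds while can_instantiate(IncompleteFilter) does not, which is the same check a real Filter subclass would have to pass.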

class PassThroughFilter

Bases: Filter

Inheritance diagram of PyPop.Filter.PassThroughFilter

A filter that doesn’t change input data.

doFiltering(matrix=None)
startFirstPass(locus)
checkAlleleName(alleleName)
addAllele(alleleName)
endFirstPass()
startFiltering()
filterAllele(alleleName)
endFiltering()
writeToLog(logstring=None)
cleanup()

class AnthonyNolanFilter(directoryName=None, remoteMSF=None, alleleFileFormat='msf', preserveAmbiguousFlag=0, preserveUnknownFlag=0, preserveLowresFlag=0, alleleDesignator='*', logFile=None, untypedAllele='****', unsequencedSite='#', sequenceFileSuffix='_prot', filename=None, numDigits=4, verboseFlag=1, debug=0, sequenceFilterMethod='strict')

Bases: Filter

Inheritance diagram of PyPop.Filter.AnthonyNolanFilter

Filters data via Anthony Nolan’s allele call data.

Allele call data files can be in either txt or msf format. txt files are available at http://www.anthonynolan.com and msf files are available at ftp://ftp.ebi.ac.uk/pub/databases/imgt/mhc/hla/. msf files are required in order to translate allele codes into polymorphic sequence data.

doFiltering(matrix=None)

Do filtering on a StringMatrix.

Given a StringMatrix, filters the matrix and returns it for further downstream processing.

startFirstPass(locus)
checkAlleleName(alleleName)

Checks allele name against the database.

Returns the allele truncated to the appropriate number of digits. If the allele can’t be found using any of the heuristics, it is returned as an untyped allele (normally four asterisks).
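
The truncate-or-fall-back behaviour described above can be sketched as follows. This is a minimal illustration under stated assumptions, not PyPop's implementation: check_allele_name and known_alleles are hypothetical, the lookup is a plain set membership test standing in for the database, and the real method applies several heuristics that are not reproduced here. The defaults mirror the numDigits=4 and untypedAllele='****' constructor arguments.

```python
def check_allele_name(allele, known_alleles, num_digits=4, untyped="****"):
    """Hypothetical helper, not PyPop's checkAlleleName.

    Truncate `allele` to `num_digits` characters and return the truncated
    form if it is known; otherwise return the untyped-allele marker.
    """
    truncated = allele[:num_digits]
    if truncated in known_alleles:
        return truncated
    # No match: report the allele as untyped.
    return untyped
```

With known_alleles = {"0101", "0201"}, the name "010101" is truncated to "0101" and accepted, while an unknown name such as "9999" comes back as "****".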

addAllele(alleleName)
endFirstPass()
startFiltering()
filterAllele(alleleName)
endFiltering()
writeToLog(logstring='\n')
cleanup()
makeSeqDictionaries(matrix=None, locus=None)
translateMatrix(matrix=None)

class BinningFilter(customBinningDict=None, logFile=None, untypedAllele='****', filename=None, binningDigits=4, debug=0)

Filters data through rules defined in one file for each locus.

doDigitBinning(matrix=None)
doCustomBinning(matrix=None)
lookupCustomBinning(testAllele, locus)
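
The page does not say how doDigitBinning works. One plausible reading of the binningDigits and untypedAllele parameters is that each allele name is truncated to a fixed number of digits while untyped alleles are left alone. The sketch below shows that reading only; digit_bin is a hypothetical function operating on a plain list rather than a StringMatrix, and the actual behaviour may differ.

```python
def digit_bin(alleles, binning_digits=4, untyped="****"):
    """Hypothetical sketch of digit binning (not PyPop's implementation)."""
    binned = []
    for allele in alleles:
        if allele == untyped:
            # Assumption: untyped alleles pass through unchanged.
            binned.append(allele)
        else:
            binned.append(allele[:binning_digits])
    return binned
```

Under this assumption, binning ["010101", "0102", "****"] to two digits merges the first two alleles into the same "01" bin and leaves the untyped entry as it was.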

class AlleleCountAnthonyNolanFilter(lumpThreshold=None, **kw)

Bases: AnthonyNolanFilter

Inheritance diagram of PyPop.Filter.AlleleCountAnthonyNolanFilter

Filters data with an allele count less than a threshold.

endFirstPass()

Performs the regular AnthonyNolanFilter processing, then translates alleles whose count is less than lumpThreshold to ‘lump’.
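
The lumping step can be illustrated with a short standalone sketch. It assumes alleles arrive as a flat list of names and that the replacement label is the literal string 'lump'. lump_rare_alleles is a hypothetical function, not part of PyPop, and the preceding AnthonyNolanFilter processing is omitted.

```python
from collections import Counter


def lump_rare_alleles(alleles, lump_threshold, lump_label="lump"):
    """Hypothetical sketch: relabel alleles seen fewer than lump_threshold times."""
    counts = Counter(alleles)
    # Keep an allele when its count reaches the threshold, otherwise lump it.
    return [a if counts[a] >= lump_threshold else lump_label for a in alleles]
```

For the input ["0101", "0101", "0201", "0301", "0101"] with a threshold of 2, only "0101" occurs often enough to be kept; "0201" and "0301" each occur once and are relabelled.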