PyPop.Filter#
Module for filtering data files.
Filters and cleans data before it is accepted as input to PyPop analysis routines.
Exceptions#
SubclassError | Common base class for all non-exit exceptions.
Classes#
Filter | Abstract base class for Filters.
PassThroughFilter | A filter that doesn't change input data.
AnthonyNolanFilter | Filters data via Anthony Nolan's allele call data.
BinningFilter | Filters data through rules defined in one file for each locus.
AlleleCountAnthonyNolanFilter | Filters data with an allele count less than a threshold.
Module Contents#
- exception SubclassError#
Bases:
Exception
Common base class for all non-exit exceptions.
Initialize self. See help(type(self)) for accurate signature.
- class Filter#
Bases:
abc.ABC
Abstract base class for Filters
- abstractmethod doFiltering(matrix=None)#
- abstractmethod startFirstPass(locus)#
- abstractmethod checkAlleleName(alleleName)#
- abstractmethod addAllele(alleleName)#
- abstractmethod endFirstPass()#
- abstractmethod startFiltering()#
- abstractmethod filterAllele(alleleName)#
- abstractmethod endFiltering()#
- abstractmethod writeToLog(logstring=None)#
- abstractmethod cleanup()#
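Because every method of Filter is abstract, a concrete filter has to override all of them before it can be instantiated. The sketch below illustrates that mechanism with a hypothetical stand-in class, `SketchFilter`, which mirrors only four of the method names listed above and does not import PyPop. The two-pass call order and the return value of `filterAllele` are assumptions inferred from the method names, not behaviour documented on this page.

```python
from abc import ABC, abstractmethod


class SketchFilter(ABC):
    """Hypothetical stand-in mirroring a subset of the Filter interface."""

    @abstractmethod
    def startFirstPass(self, locus): ...

    @abstractmethod
    def addAllele(self, alleleName): ...

    @abstractmethod
    def endFirstPass(self): ...

    @abstractmethod
    def filterAllele(self, alleleName): ...


class UppercaseFilter(SketchFilter):
    """Hypothetical subclass: records alleles in pass one, upper-cases in pass two."""

    def __init__(self):
        self.locus = None
        self.seen = []

    def startFirstPass(self, locus):
        # Reset per-locus state (assumed to be the purpose of this hook).
        self.locus = locus
        self.seen = []

    def addAllele(self, alleleName):
        self.seen.append(alleleName)

    def endFirstPass(self):
        pass

    def filterAllele(self, alleleName):
        # Assumption: the filtered allele name is returned to the caller.
        return alleleName.upper()


def can_instantiate(cls):
    """Return True if cls() succeeds, False if abstract methods block it."""
    try:
        cls()
        return True
    except TypeError:
        return False
```

Here `can_instantiate(SketchFilter)` is false because abstract methods remain, while `can_instantiate(UppercaseFilter)` is true because the subclass overrides all of them.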
- class PassThroughFilter#
Bases:
Filter
A filter that doesn’t change input data.
- doFiltering(matrix=None)#
- startFirstPass(locus)#
- checkAlleleName(alleleName)#
- addAllele(alleleName)#
- endFirstPass()#
- startFiltering()#
- filterAllele(alleleName)#
- endFiltering()#
- writeToLog(logstring=None)#
- cleanup()#
- class AnthonyNolanFilter(directoryName=None, remoteMSF=None, alleleFileFormat='msf', preserveAmbiguousFlag=0, preserveUnknownFlag=0, preserveLowresFlag=0, alleleDesignator='*', logFile=None, untypedAllele='****', unsequencedSite='#', sequenceFileSuffix='_prot', filename=None, numDigits=4, verboseFlag=1, debug=0, sequenceFilterMethod='strict')#
Bases:
Filter
Filters data via Anthony Nolan's allele call data.
Allele call data files can be in either txt or msf format. txt files are available at http://www.anthonynolan.com and msf files are available at ftp://ftp.ebi.ac.uk/pub/databases/imgt/mhc/hla/ . Use of msf files is required in order to translate allele codes into polymorphic sequence data.
- doFiltering(matrix=None)#
Does the filtering on a StringMatrix.
Given a StringMatrix, filters the matrix and returns it for further downstream processing.
- startFirstPass(locus)#
- checkAlleleName(alleleName)#
Checks allele name against the database.
Returns the allele truncated to the appropriate number of digits. If the allele can’t be found using any of the heuristics, it is returned as an untyped allele (normally four asterisks).
- addAllele(alleleName)#
- endFirstPass()#
- startFiltering()#
- filterAllele(alleleName)#
- endFiltering()#
- writeToLog(logstring='\n')#
- cleanup()#
- makeSeqDictionaries(matrix=None, locus=None)#
- translateMatrix(matrix=None)#
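The behaviour described for checkAlleleName (truncate to a fixed number of digits, fall back to the untyped allele when no match is found) can be sketched as follows. This is a minimal illustration and not PyPop's implementation: the function `check_allele_name`, its `known_alleles` argument, and the two matching heuristics are all hypothetical. Only the defaults `num_digits=4` and `untyped_allele="****"` are taken from the constructor signature above.

```python
def check_allele_name(allele_name, known_alleles, num_digits=4,
                      untyped_allele="****"):
    """Hypothetical helper: truncate a known allele, else return the untyped marker."""
    truncated = allele_name[:num_digits]

    # Assumed heuristic 1: the full name is present in the database.
    if allele_name in known_alleles:
        return truncated

    # Assumed heuristic 2: some database entry starts with the truncated name.
    if any(known.startswith(truncated) for known in known_alleles):
        return truncated

    # No heuristic matched: report the allele as untyped.
    return untyped_allele
```

With `known_alleles = {"0101", "0201", "030101"}`, the name `"03010102"` is truncated to `"0301"` via the prefix heuristic, while `"9999"` matches nothing and comes back as `"****"`.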
- class BinningFilter(customBinningDict=None, logFile=None, untypedAllele='****', filename=None, binningDigits=4, debug=0)#
Filters data through rules defined in one file for each locus.
- doDigitBinning(matrix=None)#
- doCustomBinning(matrix=None)#
- lookupCustomBinning(testAllele, locus)#
- class AlleleCountAnthonyNolanFilter(lumpThreshold=None, **kw)#
Bases:
AnthonyNolanFilter
Filters data with an allele count less than a threshold.
- endFirstPass()#
Performs the regular AnthonyNolanFilter processing, then translates alleles with count < lumpThreshold to ‘lump’.
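The lumping step can be illustrated in isolation: count each allele, then relabel any allele whose count falls below the threshold. The function `lump_rare_alleles` and its `lump_label` parameter are hypothetical names for this sketch. It works on a plain list of strings rather than on PyPop's own data structures, and it omits the preceding AnthonyNolanFilter step.

```python
from collections import Counter


def lump_rare_alleles(alleles, lump_threshold, lump_label="lump"):
    """Hypothetical helper: relabel alleles seen fewer than lump_threshold times."""
    counts = Counter(alleles)
    # Keep alleles at or above the threshold; relabel the rest.
    return [a if counts[a] >= lump_threshold else lump_label for a in alleles]
```

For example, with the input `["0101", "0101", "0201", "0301", "0101", "0201"]` and a threshold of 2, only `"0301"` (seen once) is replaced by `"lump"`.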