PyPop.Utils#

Module for common utility classes and functions.

Contains convenience classes for output of text and XML files.

Attributes#

GENOTYPE_SEPARATOR

GENOTYPE_TERMINATOR

Classes#

TextOutputStream

Output stream for writing text files.

XMLOutputStream

Output stream for writing XML files.

OrderedDict

Allows dict to have _ORDERED_ pairs

Index

Returns an Index object for OrderedDict

StringMatrix

StringMatrix is a subclass of the NumPy (Numeric Python) UserArray class.

Group

Functions#

glob_with_pathlib(pattern)

getStreamType(stream)

Return the type of stream.

natural_sort_key(s[, _nsre])

unique_elements(li)

Gets the unique elements in a list

appendTo2dList(aList[, appendStr])

convertLineEndings(file, mode)

fixForPlatform(filename[, txt_ext])

copyfileCustomPlatform(src, dest[, txt_ext])

copyCustomPlatform(file, dist_dir[, txt_ext])

checkXSLFile(xslFilename[, path, subdir, abort, ...])

getUserFilenameInput(prompt, filename)

Read user input for a filename, check its existence, continue requesting input until a valid filename is entered.

splitIntoNGroups(alist[, n])

Divides a list up into n parcels (plus whatever is left over)

Module Contents#

GENOTYPE_SEPARATOR = '~'#
GENOTYPE_TERMINATOR = '~'#
glob_with_pathlib(pattern)#
class TextOutputStream(file)#

Output stream for writing text files.

write(str)#
writeln(str='\n')#
close()#
flush()#
class XMLOutputStream(file)#

Bases: TextOutputStream

Output stream for writing XML files.

opentag(tagname, **kw)#

Generate an open XML tag.

Attributes are passed as optional named arguments, e.g. opentag(‘tagname’, role=something, id=else) produces ‘<tagname role="something" id="else">’. The attributes and their values are optional; if omitted, the result is ‘<tagname>’.

emptytag(tagname, **kw)#

Generate an empty XML tag.

As per ‘opentag()’ but without content, i.e.:

‘<tagname attr=”val”/>’.

closetag(tagname)#

Generate a closing XML tag.

Generate a tag in the form: ‘</tagname>’.

tagContents(tagname, content, **kw)#

Generate open and closing XML tags around contents.

Generates tags in the form ‘<tagname>content</tagname>’. ‘content’ must be a string. Converts ‘&’, ‘<’ and ‘>’ into their valid XML equivalents.
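
The tag methods above can be illustrated with a minimal, self-contained sketch. It does not import PyPop: `SketchXMLStream` is a hypothetical stand-in whose method names mirror the documented ones, and the use of `xml.sax.saxutils` for escaping is an assumption rather than a description of how XMLOutputStream is implemented.

```python
import io
from xml.sax.saxutils import escape, quoteattr


class SketchXMLStream:
    """Hypothetical stand-in for XMLOutputStream; not PyPop's actual code."""

    def __init__(self, file):
        self.f = file

    def _attrs(self, kw):
        # Render keyword arguments as ' name="value"' pairs.
        return "".join(
            f" {name}={quoteattr(str(value))}" for name, value in kw.items()
        )

    def opentag(self, tagname, **kw):
        self.f.write(f"<{tagname}{self._attrs(kw)}>")

    def emptytag(self, tagname, **kw):
        self.f.write(f"<{tagname}{self._attrs(kw)}/>")

    def closetag(self, tagname):
        self.f.write(f"</{tagname}>")

    def tagContents(self, tagname, content, **kw):
        # escape() converts '&', '<' and '>' to '&amp;', '&lt;' and '&gt;'.
        self.opentag(tagname, **kw)
        self.f.write(escape(content))
        self.closetag(tagname)


buffer = io.StringIO()
xml = SketchXMLStream(buffer)
xml.opentag("locus", name="A")
xml.emptytag("allele", id="01")
xml.tagContents("note", "a < b & c")
xml.closetag("locus")
result = buffer.getvalue()
```

Writing to an in-memory `io.StringIO` keeps the sketch free of file handling; the real class takes a file object in the same position.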

getStreamType(stream)#

Return the type of stream.

Returns either ‘xml’ or ‘text’.

class OrderedDict(hash=None)#

Allows dict to have _ORDERED_ pairs

Creates an ordered dict

index(key)#

Returns position of key in dict

keys()#

Returns list of keys in dict

values()#

Returns list of values in dict

items()#

Returns list of tuples of keys and values

insert(i, key, value)#

Inserts a key-value pair at a given index

remove(i)#

Removes a key-value pair from the dict

reverse()#

Reverses the order of the key-value pairs

sort(cmp=0)#

Sorts the dict (allows for sort algorithm)

clear()#

Clears all the entries in the dict

copy()#

Makes a copy of the dict, also of the OrderedDict class

get(key)#

Returns the value of a key

has_key(key)#

Looks for existence of key in dict

update(dict)#

Updates entries in a dict based on another

count(key)#

Finds occurrences of a key in a dict (0/1)
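
To show what positional access on an ordered mapping looks like, here is a small sketch of the `index`, `insert`, `keys` and `items` behaviour described above. `SketchOrderedDict` is a hypothetical class written for illustration; keeping a separate key list is an assumption, not a description of PyPop's internals.

```python
class SketchOrderedDict(dict):
    """Hypothetical ordered mapping with positional access."""

    def __init__(self):
        super().__init__()
        self._keys = []  # keys in their current order

    def __setitem__(self, key, value):
        # New keys are appended; existing keys keep their position.
        if key not in self:
            self._keys.append(key)
        super().__setitem__(key, value)

    def index(self, key):
        # Position of key in the ordering.
        return self._keys.index(key)

    def insert(self, i, key, value):
        # Place the pair at position i, moving the key if it already exists.
        if key in self:
            self._keys.remove(key)
        self._keys.insert(i, key)
        super().__setitem__(key, value)

    def keys(self):
        return list(self._keys)

    def items(self):
        return [(k, self[k]) for k in self._keys]


loci = SketchOrderedDict()
loci["A"] = 1
loci["B"] = 2
loci.insert(1, "C", 3)
```

After these calls, `loci.keys()` gives the keys in the order A, C, B, and `loci.index("B")` is 2.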

class Index(i=0)#

Returns an Index object for OrderedDict

Creates an Index object for use with OrderedDict

class StringMatrix(rowCount=None, colList=None, extraList=None, colSep='\t', headerLines=None)#

Bases: numpy.lib.user_array.container

StringMatrix is a subclass of NumPy (Numeric Python) UserArray class, and uses NumPy to store the data in an efficient array format, rather than internal Python lists.

Constructor for StringMatrix.

colList is a mutable type, so we freeze the list of locus keys in the original file order by making a clone of the list of keys.

The order of loci in the array will correspond to the original file order, and we do not want this tampered with by the ‘callee’ function; cloning effectively overrides Python's default ‘pass by reference’ behaviour with ‘pass by value’.

dump(locus=None, stream=sys.stdout)#
copy()#

Make a (deep) copy of the StringMatrix

Currently this goes via the constructor; it is not clear whether there is a better way of doing this.

getNewStringMatrix(key)#

Create an entirely new StringMatrix using only the columns supplied in the keys.

The format of the keys is identical to __getitem__, except that in this case it returns a full StringMatrix instance which includes all metadata.

getUniqueAlleles(key)#

Return a list of unique integers for given key sorted by allele name using natural sort

convertToInts()#

Convert matrix to integers: needed for haplo-stats. Note that integers start at 1 for compatibility with the haplo-stats module.

FIXME: check whether we need to release memory

countPairs()#

Given a matrix of genotypes (pairs of columns for each locus), compute number of possible pairs of haplotypes for each subject (the rows of the geno matrix)

FIXME: this does not do any involved handling of missing data as per geno.count.pairs from haplo.stats

FIXME: should these methods eventually be moved to Genotype class?
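
As a rough illustration of the counting described above, the sketch below works on plain lists rather than a StringMatrix. It assumes the standard result that a subject heterozygous at h loci has 2^(h-1) possible unordered haplotype pairs (and exactly one when h is 0), and, like the method, it ignores missing data. The function name `count_pairs` is hypothetical.

```python
def count_pairs(genotypes):
    """Count unordered haplotype pairs consistent with each row."""
    counts = []
    for row in genotypes:
        # Columns come in pairs: (row[0], row[1]) is the first locus, etc.
        het = sum(1 for a, b in zip(row[0::2], row[1::2]) if a != b)
        counts.append(1 if het == 0 else 2 ** (het - 1))
    return counts


geno = [
    [1, 1, 2, 2],  # homozygous at both loci
    [1, 2, 3, 3],  # heterozygous at one locus
    [1, 2, 3, 4],  # heterozygous at both loci
]
pair_counts = count_pairs(geno)
```

For the third subject the two possibilities are the haplotype pairs (1-3, 2-4) and (1-4, 2-3).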

flattenCols()#

Flatten columns into a single list.

FIXME: assumes entries are integers

filterOut(key, blankDesignator)#

Returns a filtered matrix.

When passed a designator, this method returns only those rows of the matrix that do not contain that designator in any of the selected columns.
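
The row filtering can be sketched on plain lists as follows. The function `filter_out` and the `'****'` designator are hypothetical choices for illustration; the real method also takes a `key` selecting the columns to inspect, which this sketch leaves out.

```python
def filter_out(rows, blank_designator):
    """Keep only the rows in which the designator never appears."""
    return [row for row in rows if blank_designator not in row]


rows = [
    ["A01", "A02"],
    ["****", "A03"],  # contains the designator, so it is dropped
    ["A04", "A05"],
]
kept = filter_out(rows, "****")
```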

getSuperType(key)#

Returns a matrix grouped by columns.

For example, if the matrix is [[A01, A02, B01, B02], [A11, A12, B11, B12]]

then getSuperType(‘A:B’) will return the matrix with the grouped columns:

[[A01:B01, A02:B02], [A11:B11, A12:B12]]
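
The grouping in this example can be reproduced with a short sketch on plain lists. `super_type` and its `columns` argument (the locus name of each column) are hypothetical; the sketch assumes every locus occupies the same number of columns and that the key and the joined values share the ‘:’ separator.

```python
def super_type(rows, columns, key, sep=":"):
    """Join the allele columns of the loci named in key, position by position."""
    loci = key.split(sep)
    result = []
    for row in rows:
        # Gather the allele columns of each locus, e.g. A -> [A01, A02].
        per_locus = [
            [row[i] for i, name in enumerate(columns) if name == locus]
            for locus in loci
        ]
        # Pair first alleles with first alleles, second with second.
        result.append([sep.join(alleles) for alleles in zip(*per_locus)])
    return result


columns = ["A", "A", "B", "B"]
rows = [
    ["A01", "A02", "B01", "B02"],
    ["A11", "A12", "B11", "B12"],
]
grouped = super_type(rows, columns, "A:B")
```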

class Group(li, size)#
natural_sort_key(s, _nsre=re.compile('([0-9]+)'))#
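
The signature above matches a widely used natural-sort recipe, sketched below. Only the signature comes from this page; the function body and the name `sketch_natural_sort_key` are assumptions, so the actual implementation may differ (for instance in how it treats letter case).

```python
import re


def sketch_natural_sort_key(s, _nsre=re.compile("([0-9]+)")):
    """Split s into text and digit runs so that digit runs compare as numbers."""
    # Splitting on a capturing group keeps the digit runs in the result,
    # alternating text, number, text, ... at fixed positions.
    return [
        int(part) if part.isdigit() else part.lower()
        for part in _nsre.split(s)
    ]


alleles = ["A10", "A2", "A1"]
ordered = sorted(alleles, key=sketch_natural_sort_key)
```

A plain `sorted(alleles)` would place "A10" before "A2"; with the key, the numeric parts are compared as integers.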
unique_elements(li)#

Gets the unique elements in a list

appendTo2dList(aList, appendStr=':')#
convertLineEndings(file, mode)#
fixForPlatform(filename, txt_ext=0)#
copyfileCustomPlatform(src, dest, txt_ext=0)#
copyCustomPlatform(file, dist_dir, txt_ext=0)#
checkXSLFile(xslFilename, path='', subdir='', abort=False, debug=None, msg='')#
getUserFilenameInput(prompt, filename)#

Read user input for a filename, check its existence, continue requesting input until a valid filename is entered.

splitIntoNGroups(alist, n=1)#

Divides a list up into n parcels (plus whatever is left over)

This currently works with Python 2.2, but will eventually use iterators, so ultimately it will need at least Python 2.3!