Information relevant to individual loci is reported. Sample
size and allele counts will differ among loci if not all
individuals were typed at each locus. Untyped individuals are
those for which one or two alleles were not reported. The alleles
are listed in descending frequency (and count) in the left hand
column, and are sorted numerically in the right column. The
number of distinct alleles * k* is
reported.

**Example 3.2. Basic locus information sample output**

I. Single Locus Analyses ======================== 1. Locus: A ___________ 1.1. Allele Counts [A] ---------------------- Untyped individuals: 2 Sample Size (n): 45 Allele Count (2n): 90 Distinct alleles (k): 10 Counts ordered by frequency | Counts ordered by name Name Frequency (Count) | Name Frequency (Count) 0201 0.21111 19 | 0101 0.13333 12 0301 0.15556 14 | 0201 0.21111 19 0101 0.13333 12 | 0210 0.10000 9 2501 0.12222 11 | 0218 0.10000 9 0210 0.10000 9 | 0301 0.15556 14 0218 0.10000 9 | 2501 0.12222 11 3204 0.08889 8 | 3204 0.08889 8 6901 0.04444 4 | 6814 0.03333 3 6814 0.03333 3 | 6901 0.04444 4 7403 0.01111 1 | 7403 0.01111 1 Total 1.00000 90 | Total 1.00000 90

In the cases where there is no information for a locus, a message is displayed indicating lack of data.

Sample output:

4. Locus: DRA _____________ No data for this locus!

For each locus, the observed genotype counts are compared to
those expected under Hardy Weinberg proportions (HWP). A
triangular matrix reports observed and expected genotype counts.
If the matrix is more than 80 characters, the output is split into
different sections. Each cell contains the observed and expected
number for a given genotype in the format
`observed/expected`

.

**Example 3.3. Sample output of Hardy-Weinberg genotype table**

6.2. HardyWeinberg [DQA1] ------------------------- Table of genotypes, format of each cell is: observed/expected. 0201 8/5.1 0301 4/4.0 1/0.8 0401 3/6.9 1/2.7 6/2.3 0501 8/9.9 5/3.8 5/6.7 6/4.8 0201 0301 0401 0501 [Cols: 1 to 4]

The values in this matrix are used to test hypotheses of
deviation from HWP. The output also includes the chi-square
statistic, the number of degrees of freedom and associated
* p*-value for a number of classes of
genotypes and is summarized in the following table:

**Example 3.4. Sample output of HW genotype classes**

Observed Expected Chi-square DoF p-value ------------------------------------------------------------------------------ Common N/A N/A 4.65 1 0.0310* ------------------------------------------------------------------------------ Lumped genotypes N/A N/A 1.17 1 0.2797 ------------------------------------------------------------------------------ Common + lumped N/A N/A 5.82 1 0.0158* ------------------------------------------------------------------------------ All homozygotes 21 13.01 4.91 1 0.0268* ------------------------------------------------------------------------------ All heterozygotes 26 33.99 1.88 1 0.1706 ------------------------------------------------------------------------------ Common heterozygotes by allele 0201 15 20.78 1.61 0.2050 0301 10 10.47 0.02 0.8850 0401 9 16.31 3.28 0.0703 0501 18 20.43 0.29 0.5915 ------------------------------------------------------------------------------ Common genotypes 0201:0201 8 5.11 1.63 0.2014 0201:0401 3 6.93 2.23 0.1358 0201:0501 8 9.89 0.36 0.5472 0401:0501 5 6.70 0.43 0.5109 Total 24 28.63 ------------------------------------------------------------------------------

**Explanation of each genotype class**

If the dataset contains no genotypes with expected counts
equal or greater than No common genotypes; chi-square cannot be calculated The analysis of common genotypes may lead to a situtation where there are fewer classes (genotypes) than allele frequencies to estimate. This means that the analysis cannot be performed (degrees of freedom < 1). In such a case the following message is reported, explaining why the analysis could not be performed: Too many parameters for chi-square test. To obviate this as much as possible, only alleles which occur in common genotypes are used in the calculation of degrees of freedom. | |

The pooling procedure is designed to avoid carrying out the chi-square goodness of fit test in cases where there are low expected counts, which could lead to spurious rejection of HWP. However, in certain cases it may not be possible to carry out this pooling approach. The interpretation of results based on lumped genotypes will depend on the particular genotypes that are combined in this class. If the sum of expected counts in the lumped class does not
add up to The total number of expected genotypes is less than 5 This may by remedied by combining rare alleles and
recalculating overall chi-square value and degrees of freedom.
(This would require appropriate manipulation of the data set by
hand and is not a feature of
| |

| |

| |

| |

| |

`lumpBelow` . |

If enabled in the configuration file, the exact test for
deviations from HWP will be output. The exact test uses the method
of Guo & Thompson (1992). The
* p*-value provided describes how probable the
observed set of genotypes is, with respect to a large sample of
other genotypic configurations (conditioned on the same allele
frequencies and

`2n`

`p`

`p`

There are two implementations for this test, the first using
the `gthwe`

implementation originally due
to Guo & Thompson, but modified by John Chen, the second being
`Arlequin`

's
(Schneider et al., 2000) implementation.

**Example 3.5. Sample output for exact test using
gthwe**

6.3. Guo and Thompson HardyWeinberg output [DQA1] ------------------------------------------------- Total steps in MCMC: 1000000 Dememorization steps: 2000 Number of Markov chain samples: 1000 Markov chain sample size: 1000 Std. error: 0.0009431 p-value (overall): 0.0537

**Example 3.6. Sample output for exact test using the
Arlequin implementation**

6.4. Guo and Thompson HardyWeinberg output(Arlequin's implementation) [DQA1] ----------------------------------------------------------------------------- Observed heterozygosity: 0.553190 Expected heterozygosity: 0.763900 Std. deviation: 0.000630 Dememorization steps: 100172 p-value: 0.0518

Note that in the `Arlequin`

implementation, the output is slightly different, and the
only directly comparable value between the two implementation is
the * p*-value. These

`p`

For each locus, we implement the Ewens-Watterson homozygosity
test of neutrality (Ewens 1972; Watterson 1978). We use the term
*observed homozygosity* to denote the
homozygosity statistic (* F*), computed as the sum
of the squared allele frequencies. This value is compared to the

`2n`

`k`

`F`

`F`

The *normalized deviate of the
homozygosity*
(`F`

_{nd}) is the difference
between the *observed homozygosity* and
*expected homozygosity*, divided by the square
root of the variance of the expected homozygosity (also obtained by
simulations; Salamon et al. (1999)). Significant negative
normalized deviates imply *observed
homozygosity* values lower than *expected
homozygosity*, in the direction of balancing
selection. Significant positive values are in the direction of
directional selection.

The * p*-value in the last row of the output
is the probability of obtaining a homozygosity

`F`

`F`

`F`

`2n`

`k`

`p`

`p`

**Example 3.7. Sample output of homozygosity test from Monte-Carlo implementation**

The standard implementation of the test uses a Monte-Carlo
implementation of the exact test written by Slatkin (Slatkin 1994; Slatkin 1996). A Markov-chain Monte
Carlo method is used to obtain the null distribution of the
homozygosity statistic under neutrality. The reported
* p*-values are one-tailed (against the
alternative of balancing selection), but can be interpreted for a
two-tailed test by considering either extreme of the distribution
(< 0.025 or > 0.975) at the 0.05 level.

1.6. Slatkin's implementation of EW homozygosity test of neutrality [A] ----------------------------------------------------------------------- Observed F: 0.1326, Expected F: 0.2654, Variance in F: 0.0083 Normalized deviate of F (Fnd): -1.4603, p-value of F: 0.0029**