Posters 
Abstract
Comparison of Genetic Structure of Mouse Populations by Combinatorial Analysis of Long-Range Linkage Disequilibrium Networks
 
Roumyana Kirova1, Yun Zhang2, Gary A. Churchill3, Michael A. Langston2, Elissa J. Chesler1

Genetic reference populations are panels of related mouse strains with fixed genotypes. Their population structure is determined by breeding history, in particular the number of generations of outcrossing, the randomization of genotype segregation, and progenitor diversity. Linkage disequilibrium (LD) is a measure of statistical dependence between genetic markers; it depends on recombination frequency, the genealogy of the population, natural selection, and other factors. Using large publicly available SNP sets, we applied graph analysis to compare the structure of multiple populations and sub-populations of mice. LD was evaluated by three metrics: correlation coefficients, mutual information coefficients, and p-values for Lewontin’s D′. A high-pass filter was applied to each metric to construct an unweighted graph of genotype associations. Maximal complete subgraphs (cliques) were extracted at several thresholds, and the resulting graphs were analyzed for number of cliques, clique size, and chromosomal representation of clique members. These analytic approaches provide a quantitative comparison of populations and can be used to optimize the genetic equidistance of sub-populations. Results indicate that the genotype structure of standard inbred strains, which have had longer periods of outcrossing, consists of smaller blocks of linked loci than that of recombinant inbred strains. Moreover, it appears that the non-random breeding history of standard inbred strains has resulted in non-syntenic linkage at high LD thresholds. These results are observed consistently across all three LD metrics. Long-range LD and the presence of LD blocks in mouse inbred strains can confound SNP haplotype association analysis, despite the large size of the existing standard inbred strain set. Large syntenic LD blocks in the BXD recombinant inbred strains, though relatively uncorrelated with other genome regions, limit the power and precision of this population for genetic analysis. The same limitations apply to other correlation-based methods, including systems genetic analysis of high-throughput phenotypes such as gene expression. The 8-way Collaborative Cross is designed to have both smaller LD blocks than existing recombinant inbred panels and less long-range (non-syntenic) association of genotypes.
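The thresholded-graph and clique steps described above can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes biallelic genotypes coded 0/1 per strain, uses only the absolute Pearson correlation (one of the three metrics named above) as the LD measure, and enumerates maximal cliques with the Bron-Kerbosch algorithm with pivoting, which the abstract does not name. The marker names, the `chrN_x` naming scheme, the genotype vectors, and all function names are hypothetical.

```python
from itertools import combinations
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation between two equal-length genotype vectors."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:  # monomorphic marker: no defined correlation
        return 0.0
    return sxy / sqrt(sxx * syy)


def ld_graph(genotypes, threshold):
    """High-pass filter: keep an unweighted edge where |r| >= threshold."""
    adj = {marker: set() for marker in genotypes}
    for a, b in combinations(genotypes, 2):
        if abs(pearson_r(genotypes[a], genotypes[b])) >= threshold:
            adj[a].add(b)
            adj[b].add(a)
    return adj


def maximal_cliques(adj):
    """Enumerate maximal cliques (Bron-Kerbosch with pivoting)."""
    cliques = []

    def expand(r, p, x):
        if not p and not x:
            cliques.append(sorted(r))
            return
        # Pivot on the vertex with the most neighbours among the candidates.
        pivot = max(p | x, key=lambda v: len(adj[v] & p))
        for v in list(p - adj[pivot]):
            expand(r | {v}, p & adj[v], x & adj[v])
            p = p - {v}
            x = x | {v}

    expand(set(), set(adj), set())
    return cliques


def chromosomes_in(clique):
    """Chromosomal representation, assuming names of the form 'chrN_x'."""
    return {marker.split("_")[0] for marker in clique}


# Hypothetical toy data: five markers genotyped in eight strains.
genotypes = {
    "chr1_a": [0, 0, 0, 0, 1, 1, 1, 1],
    "chr1_b": [0, 0, 0, 0, 1, 1, 1, 1],
    "chr1_c": [0, 0, 0, 1, 1, 1, 1, 1],
    "chr2_a": [0, 1, 0, 1, 0, 1, 0, 1],
    "chr2_b": [0, 1, 0, 1, 0, 1, 0, 1],
    "chr5_a": [0, 0, 0, 0, 1, 1, 1, 1],  # non-syntenic with chr1 markers
}

cliques = sorted(maximal_cliques(ld_graph(genotypes, threshold=0.7)))
for clique in cliques:
    print(clique, sorted(chromosomes_in(clique)))
```

In this toy example, a clique whose members span more than one chromosome corresponds to the non-syntenic association discussed above. Rerunning with a higher threshold shows how the number and size of cliques change as the filter tightens.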

1 Bioscience Division, Oak Ridge National Laboratory, Oak Ridge, TN
2 Computer Science Department, University of Tennessee, Knoxville, TN
3 The Jackson Laboratory, Bar Harbor, ME