Founder effects and related issues in Host-viral association studies
Reeves, Karyn (2013) Founder effects and related issues in Host-viral association studies. PhD thesis, Murdoch University.
Viruses such as HIV, which replicate rapidly and with high transcription error rates, may evade immune detection by mutating at key positions within the viral amino acid sequence. Large-scale host-viral association studies are conducted to identify positions of possible escape mutation in response to host immune pressure, with this pressure predominantly governed by genes within the human leukocyte antigen (HLA) complex. When transmission of the virus is HLA-associated, however, standard tests of association can be confounded by the relatedness of contemporarily circulating viral sequences, as sequences descended from a common ancestor may share inherited patterns of polymorphisms, termed "founder effects". A number of model-based methods utilizing phylogenetic trees inferred from the observed viral sequences have been proposed to correct for this confounding, although such methods are typically computationally intensive and require specialist software for their implementation. In this thesis we propose an alternative empirical approach based on principal components analysis (PCA) which can be implemented using widely available software, and which adapts and extends methods currently used to control for population stratification in case-control genome-wide association studies. To accommodate data with small proportions, as commonly observed in host-viral studies, we implement the PCA-based controlling procedure within a logistic regression framework using novel formulations motivated by the Frisch-Waugh-Lovell Theorem, and demonstrate via simulation their utility in detecting true associations whilst minimizing confounding generated by founder effects. The approach is then extended to the multivariate setting through the adaptation of well-known techniques which expand the scope of host-viral analyses by accommodating possible linkages within the HLA and viral data.
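As background to the Frisch-Waugh-Lovell Theorem mentioned above, the following is a minimal sketch of the classical result for ordinary least squares only: the coefficient of a predictor in a full regression equals the coefficient obtained after first regressing both the predictor and the response on the controlling covariates and then regressing residuals on residuals. It is not the thesis's logistic formulation, which this abstract does not specify. All data values and the names `pc`, `x`, `y`, `solve`, `ols` and `residuals` are hypothetical; `pc` merely stands in for a principal component score used as a control.

```python
# Numerical check of the Frisch-Waugh-Lovell theorem for ordinary least squares.
# Illustrative only: the data are made up, and pc stands in for a principal
# component score used to control for relatedness among sequences.

def solve(a, b):
    """Solve the linear system a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [v - f * w for v, w in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]

def ols(columns, y):
    """Least-squares coefficients for y regressed on the given columns (normal equations)."""
    xtx = [[sum(a * b for a, b in zip(ci, cj)) for cj in columns] for ci in columns]
    xty = [sum(a * b for a, b in zip(ci, y)) for ci in columns]
    return solve(xtx, xty)

def residuals(columns, y):
    """Residuals of y after regression on the given columns."""
    beta = ols(columns, y)
    return [yi - sum(b * c[i] for b, c in zip(beta, columns)) for i, yi in enumerate(y)]

ones = [1.0] * 8                                   # intercept column
pc = [-1.5, -1.0, -0.5, 0.0, 0.2, 0.8, 1.1, 0.9]   # hypothetical control covariate
x = [0.0, 1.0, 0.0, 1.0, 1.0, 0.0, 1.0, 1.0]       # hypothetical predictor of interest
y = [0.1, 0.9, 0.4, 1.2, 1.5, 0.7, 2.0, 1.6]       # hypothetical response

# Coefficient of x in the full regression of y on (1, pc, x).
full_beta_x = ols([ones, pc, x], y)[2]

# Same coefficient via FWL: partial (1, pc) out of both x and y, then regress residuals.
x_res = residuals([ones, pc], x)
y_res = residuals([ones, pc], y)
fwl_beta_x = ols([x_res], y_res)[0]

print(full_beta_x, fwl_beta_x)  # the two estimates agree up to rounding error
```

The appeal of this identity in a large-scale setting is that the controlling covariates can be partialled out once and the residualized quantities reused; how the thesis carries that idea over to logistic regression is not described here.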
The thesis concludes with a discussion of issues arising from the application of tail-based rejection regions and false discovery rates in large-scale analyses based on pooled contingency tables with varying margins. We argue that constraints imposed by the margins have implications overlooked in the rigid application of techniques developed for tests based on statistics with continuous distributions, but by leveraging the scale of such analyses it may be possible to consider local deviations between observed and expected p-value distributions to better identify hypotheses of interest.
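To illustrate how table margins constrain the p-values a discrete test can produce, the sketch below enumerates every two-sided Fisher exact p-value attainable for a 2x2 table with fixed margins. Fisher's exact test is used here only as a familiar example of a margin-conditioned test; the abstract does not state which tests the thesis considers. The function name `attainable_p_values` and the example margins are hypothetical.

```python
from math import comb

def attainable_p_values(row1, row2, col1):
    """Two-sided Fisher exact p-values attainable for a 2x2 table with fixed margins.

    row1, row2: row totals; col1: total of the first column.
    """
    n = row1 + row2
    lo, hi = max(0, col1 - row2), min(row1, col1)  # feasible range of the top-left cell
    denom = comb(n, col1)
    # Hypergeometric probability of each feasible top-left cell count.
    probs = {a: comb(row1, a) * comb(row2, col1 - a) / denom for a in range(lo, hi + 1)}
    pvals = set()
    for pa in probs.values():
        # Sum the probabilities of all tables no more likely than the observed one.
        p = sum(q for q in probs.values() if q <= pa * (1 + 1e-9))
        pvals.add(round(min(p, 1.0), 12))
    return sorted(pvals)

# Sparse margins: one row total is 1 and one column total is 2 out of 20 observations.
sparse = attainable_p_values(row1=1, row2=19, col1=2)

# Balanced margins with the same total of 20 observations.
balanced = attainable_p_values(row1=10, row2=10, col1=10)

print(sparse)
print(len(balanced), min(balanced))
```

With the sparse margins only two p-values are possible, 0.1 and 1.0, so such a table can never fall in a 0.05 rejection region, whereas the balanced margins admit several values, some very small. Pooling tables with such different margins therefore yields a p-value distribution far from the continuous uniform reference assumed by standard false discovery rate procedures.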
Publication Type: Thesis (PhD)
Murdoch Affiliation: Institute for Immunology and Infectious Diseases
Supervisor: James, Ian and McKinnon, Elizabeth