Assembly of a Complex Genome: Defining Elements of Structure and Function
Breen, James (2009) Assembly of a Complex Genome: Defining Elements of Structure and Function. PhD thesis, Murdoch University.
The era following the human genome sequencing project has seen an influx of genome sequencing projects established to investigate the structure, composition and characteristics of plant genomes. While the sequences of smaller plant genomes (e.g. rice) are currently available, there has been a lack of progress on the study of large, complex genomes such as barley (Hordeum vulgare) and wheat (Triticum aestivum), owing to the difficulties in their sequencing and assembly. The aim of this study was to assemble and annotate targeted regions of chromosome 3B from Triticum aestivum cv. Chinese Spring (CS) and Hope. The study also aimed to complete a comprehensive inter- and intra-species comparative analysis using bioinformatics tools and strategies, in order to define structural and functional elements within the genome.
Genome sequences totalling 2.7 Mb from two different loci of chromosome 3B in two different cultivars (ctg11 from the short arm of CS, ctg1034 from the long arm of CS and three assembled sequences over the equivalent ctg11 region of Hope) were assembled using a novel ‘two-phase’ process that integrated information from a genome sequence assembler and a Triticeae-specific transposable element database. Comparative genomics analysis identified a gene island within a highly repetitive, heterochromatic region on 3BL that was highly conserved across four other cereal genomes (Brachypodium distachyon, Oryza sativa, Sorghum bicolor and Zea mays). Chromodomain-containing long terminal repeats from the gypsy family of retrotransposons were identified adjacent to the gene island, which may suggest an involvement in the targeted insertion of transposable elements at the locus, protecting the gene island from dynamic evolutionary change. Characterisation of the ctg11 (Sr2 region) genome sequence on 3BS identified a large ~60 kb mitochondrial genome insert and three members of the multi-gene beta-expansin family, with sequence analysis indicating local duplication within the sequence and rearrangements when compared to the equivalent region in a different wheat cultivar. Transcription of the individual genes was also confirmed through in silico and real-time analysis. Within the equivalent ctg11 region in Hope, a germin-like protein (GLP) cluster that distinguishes between the two wheat cultivars was identified and characterised. The genes in this GLP cluster were found to belong to a sub-gene family that conferred broad-level basal resistance in transient over-expression systems in rice and barley.
The main outcome of this study was the development of a novel strategy for genome sequence assembly that exploits the very component of the wheat genome that makes assembly difficult: transposable elements. The complex genome sequence assembly methodology outlined in this thesis can serve as a model for future sequence assembly studies. The assembly of large pseudomolecule sequences (among the largest and most complete ever assembled in the wheat genome) enabled the bioinformatics analysis of a representative sample of wheat chromosome 3B, providing valuable in silico outputs for future functional analyses and allowing an in-depth intra- and inter-species comparative analysis with related genomes.
Publication Type: Thesis (PhD)
Murdoch Affiliation: School of Information Technology
Supervisor: Appels, Rudi and Bellgard, Matthew