Murdoch University Research Repository

The Murdoch University Research Repository is an open access digital collection of research
created by Murdoch University staff, researchers and postgraduate students.

A novel high-accuracy genome assembly method utilizing a high-throughput workflow

Zeng, Q., Cao, W., Xing, L., Qin, G., Wu, J., Nagle, M.F., Xiong, Q., Chen, J., Yang, L., Bajaj, P., Chitikineni, A., Zhou, Y., Yu, Y., Xu, J., Nie, X., Huang, L., Liu, S., Šafář, J., Šimková, H., Song, W., Guo, B., Chen, S., Doležel, J., Hao, Z., Cheng, Q., Liang, J., Tang, J., Cao, A., Wang, Q., Lu, X., Yang, S., Ma, H., Liu, J., Wang, X., Zhang, H., Wang, Z., Ji, W., Wang, C., Yuan, F., Shi, J., Varshney, R.K. (ORCID: 0000-0002-4562-9131), Kang, Z., Han, D. and Xu, H. (2020) A novel high-accuracy genome assembly method utilizing a high-throughput workflow. bioRxiv.

Across domains of biological research using genome sequence data, high-quality reference genome sequences are essential for characterizing genetic variation and understanding the genetic basis of phenotypes. However, the construction of genome assemblies for various species is often hampered by complexities of genome organization, especially repetitive and complex sequences, leading to mis-assembly and missing regions. Here, we describe a high-throughput gold standard genome assembly workflow using a large-scale bacterial artificial chromosome (BAC) library with a refined two-step pooling strategy and the Lamp assembler algorithm. This strategy minimizes the laborious processes of physical map construction and clone-by-clone sequencing, enabling inexpensive sequencing of several thousand BAC clones. By applying this strategy with a minimum tiling path BAC clone library for the short arm of chromosome 2D (2DS) of bread wheat, 98% of BAC sequences, covering 92.7% of the 2DS chromosome, were assembled correctly for this species with a highly complex and repetitive genome. We also identified 48 large mis-assemblies in the reference wheat genome assembly (IWGSC RefSeq v1.0) and corrected these large mis-assemblies in addition to filling 92.2% of the gaps in RefSeq v1.0. Our 2DS assembly represents a new benchmark for the assembly of complex genomes with both high accuracy and efficiency.

Item Type: Non-refereed Article
Publisher: Cold Spring Harbor Laboratory