Haplotype-resolved Chinese male genome assembly based on high-fidelity sequencing

Bibliographic Details
Main Authors: Xiaofei Yang, Xixi Zhao, Shoufang Qu, Peng Jia, Bo Wang, Shenghan Gao, Tun Xu, Wenxin Zhang, Jie Huang, Kai Ye
Format: Article
Language: English
Published: KeAi Communications Co. Ltd. 2022-11-01
Series: Fundamental Research
Subjects: Genome assembly, HiFi reads, Human genome, Haplotype-resolved, Chinese Han, Mutation landscape
Online Access: http://www.sciencedirect.com/science/article/pii/S2667325822001145
_version_ 1797976324587913216
author Xiaofei Yang
Xixi Zhao
Shoufang Qu
Peng Jia
Bo Wang
Shenghan Gao
Tun Xu
Wenxin Zhang
Jie Huang
Kai Ye
collection DOAJ
description The length and accuracy of high-fidelity (HiFi) reads together enable chromosome-scale haplotype-resolved genome assembly. In this study, we used HiFi and Hi-C to sequence a cell line named HJ, established from a Chinese Han male individual. We assembled two high-quality haplotypes of the HJ genome (haplotype 1 (H1): 3.1 Gb, haplotype 2 (H2): 2.9 Gb). The continuity (H1: contig N50 = 28.2 Mb, H2: contig N50 = 25.9 Mb) and completeness (BUSCO: H1 = 94.9%, H2 = 93.5%) are substantially better than those of other Chinese genomes, for example, HX1, NH1.0, and YH2.0. By comparing the HJ genome with GRCh38, we reported the mutation landscape of HJ and found that 176 and 213 N-gaps were filled in H1 and H2, respectively. In addition, we detected 12.9 Mb and 13.4 Mb of novel sequence containing 246 and 135 protein-coding genes in H1 and H2, respectively. Our results demonstrate the advantages of HiFi reads in haplotype-resolved genome assembly and provide two high-quality haplotypes of a Chinese genome as a potential reference for the Chinese Han population.
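The contig N50 values quoted in the description can be read as follows: sort the contigs from longest to shortest, and N50 is the length of the contig at which the running total first reaches half of the assembly size. The sketch below illustrates that standard definition only. It is not the authors' pipeline, the function name `contig_n50` is hypothetical, and the toy lengths are made up for illustration rather than taken from the HJ assembly.

```python
def contig_n50(lengths):
    """Return the N50 of a collection of contig lengths (in bp).

    N50 is the length of the contig at which the cumulative sum of
    lengths, taken from longest to shortest, first reaches at least
    half of the total assembly length.
    """
    if not lengths:
        raise ValueError("need at least one contig length")
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        # Compare 2 * running with total to stay in integer arithmetic.
        if 2 * running >= total:
            return length


# Toy, made-up contig lengths (not data from the HJ assembly).
toy_lengths = [80, 70, 50, 40, 30, 20, 10]
print(contig_n50(toy_lengths))  # total is 300; 80 + 70 = 150 reaches half, so N50 = 70
```

A larger N50 means that half of the assembly sits in longer contigs, which is why the description uses it as a continuity measure.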
first_indexed 2024-04-11T04:49:08Z
format Article
id doaj.art-b8ae0c4e73e84c55b8804e590dd9d82a
institution Directory Open Access Journal
issn 2667-3258
language English
last_indexed 2024-04-11T04:49:08Z
publishDate 2022-11-01
publisher KeAi Communications Co. Ltd.
record_format Article
series Fundamental Research
spelling doaj.art-b8ae0c4e73e84c55b8804e590dd9d82a (2022-12-27T04:43:18Z)
Language: eng
Publisher: KeAi Communications Co. Ltd.
Series: Fundamental Research (ISSN 2667-3258)
Published: 2022-11-01, volume 2, issue 6, pages 946-953
Title: Haplotype-resolved Chinese male genome assembly based on high-fidelity sequencing
Authors: Xiaofei Yang (a, b, c); Xixi Zhao (a); Shoufang Qu (d); Peng Jia (c, e); Bo Wang (c, e); Shenghan Gao (c, e); Tun Xu (c, e); Wenxin Zhang (d); Jie Huang (d; corresponding author); Kai Ye (a, c, e, f, g; corresponding author)
Affiliations:
a. Genome Institute, the First Affiliated Hospital of Xi'an Jiaotong University, Xi'an 710061, Shaanxi, China
b. School of Computer Science and Technology, Faculty of Electronic and Information Engineering, Xi'an Jiaotong University, Xi'an 710049, Shaanxi, China
c. MOE Key Lab for Intelligent Networks & Networks Security, Faculty of Electronic and Information Engineering, Xi'an Jiaotong University, Xi'an 710049, Shaanxi, China
d. National Institutes for Food and Drug Control (NIFDC), No.2, Tiantan Xili, Dongcheng District, Beijing 102629, China
e. School of Automation Science and Engineering, Faculty of Electronic and Information Engineering, Xi'an Jiaotong University, Xi'an 710049, Shaanxi, China
f. School of Life Science and Technology, Xi'an Jiaotong University, Xi'an 710049, Shaanxi, China
g. Faculty of Science, Leiden University, Leiden, The Netherlands
Online Access: http://www.sciencedirect.com/science/article/pii/S2667325822001145
Keywords: Genome assembly; HiFi reads; Human genome; Haplotype-resolved; Chinese Han; Mutation landscape
title Haplotype-resolved Chinese male genome assembly based on high-fidelity sequencing
topic Genome assembly
HiFi reads
Human genome
Haplotype-resolved
Chinese Han
Mutation landscape
url http://www.sciencedirect.com/science/article/pii/S2667325822001145