Deeper Genomic Insights with DNBSEQ Complete WGS

More coverage, more insight, and more information.

Better Coverage

Uncover unknown areas of the genome.

Deeper Insights

Greater insights without parental testing.

Actionable Information

Valuable data for genetic disease research.

Complete WGS Overview

The NGS Limitations

Next-generation sequencing (NGS) technologies have enabled comprehensive whole-genome sequencing (WGS) by fragmenting DNA into short pieces, sequencing them with sequencing-by-synthesis (SBS) technology, and aligning the reads to a reference genome. This method generates a single consensus sequence and does not distinguish between variants on homologous chromosomes, which limits what can be concluded from the data.

For instance, when two mutations are detected in a gene, it is unclear whether they occur on the same copy of the gene or are distributed across the two copies on homologous chromosomes.

Phased Sequencing

Phased sequencing addresses this limitation by separating the consensus sequence into two haplotypes, each containing the variants that occur together (in phase) on one chromosome, thereby identifying alleles on both the maternal and paternal chromosomes.

It can provide valuable information for genetic disease research, such as analyzing structural variants, measuring allele-specific expression, identifying variant linkages, and more.
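As a minimal sketch of what phase information makes possible, the snippet below classifies two heterozygous variants as lying on the same haplotype (cis) or on opposite haplotypes (trans). It assumes the variants are described with standard VCF conventions: a genotype (GT) string in which `|` marks a phased call, and a phase set (PS) identifier naming the phase block. The function name `phase_relation` is hypothetical, and whether a given pipeline reports phase blocks through PS is an assumption here.

```python
def phase_relation(gt_a: str, gt_b: str, ps_a: str, ps_b: str) -> str:
    """Classify two heterozygous variants as 'cis', 'trans', or 'unknown'."""
    # In VCF, '|' separates phased alleles; '/' means the phase is unknown.
    if "|" not in gt_a or "|" not in gt_b:
        return "unknown"
    # Phase can only be compared inside a single phase set (phase block).
    if ps_a != ps_b:
        return "unknown"
    hap_a = gt_a.split("|")
    hap_b = gt_b.split("|")
    # This sketch handles only simple biallelic heterozygous calls.
    if sorted(hap_a) != ["0", "1"] or sorted(hap_b) != ["0", "1"]:
        return "unknown"
    # Same allele order means both alternate alleles share a haplotype.
    return "cis" if hap_a == hap_b else "trans"


print(phase_relation("0|1", "0|1", "1000", "1000"))  # same haplotype
print(phase_relation("0|1", "1|0", "1000", "1000"))  # opposite haplotypes
```

Unphased calls, or calls in different phase blocks, are reported as `unknown` rather than guessed.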

Affordable Phased Sequencing

Acquiring genomic phasing information has been challenging and costly, requiring specialized long-read sequencing technology. However, with the DNBSEQ™ Complete WGS (cWGS) solution, researchers can now generate highly accurate data while reducing costs.

The solution combines WGS data generated through a PCR-free method with data from the DNBSEQ Complete WGS Kit, which labels long DNA fragments so that they can essentially be reconstructed after sequencing. Both outputs are analyzed with the DNBSEQ Complete WGS Analysis Software to generate the phasing information.

Visualizing Long Phased Haplotypes

Ideograms highlighting the long phased haplotype contigs for a) HG001, b) HG002, and c) HG005. DNA was isolated from approximately 20 million freshly harvested cells from GM12878 (HG001), GM24385 (HG002), and GM24631 (HG005) using the MGIEasy Magnetic Beads Genomic DNA Extraction Kit (1000010524). Isolated DNA was processed using the DNBSEQ Complete WGS Kit with a single sample per reaction, followed by sequencing on the DNBSEQ-G400 sequencing instrument (PE100) to approximately 30X depth per sample. PCR-free libraries were generated using the MGIEasy FS PCR-Free DNA Library Prep Kit (1000013459) and sequenced to approximately 40X depth on the DNBSEQ-T7 sequencing instrument (PE150). FASTQ files were processed through the DNBSEQ Complete WGS Analysis Software.

| Metric | HG001 | HG002 | HG005 |
| --- | --- | --- | --- |
| SNPs | 4,554,079 | 4,587,575 | 4,526,465 |
| Het SNPs | 2,298,495 | 2,309,109 | 2,100,359 |
| Indels | 1,281,296 | 1,305,204 | 1,298,396 |
| Het Indels | 621,814 | 622,817 | 579,681 |
| Barcode Split (%) | 81 | 83 | 82 |
| Long Fragments | 12,843,708 | 14,305,515 | 13,306,713 |
| Average Length of LFs | 82,290 | 71,055 | 86,510 |
| Phased het SNPs | 2,276,880 | 2,285,149 | 2,080,796 |
| Phased het Indels | 465,566 | 464,543 | 427,588 |
| Phaseblocks | 1,180 | 1,757 | 1,307 |
| Phaseblock N50 (Mb) | 34.2 | 32.0 | 38.8 |
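
The phased and total heterozygous counts reported for HG001, HG002, and HG005 can be turned into a phasing rate per sample. The short sketch below does that arithmetic with the published figures; the helper name `phased_fraction` is illustrative only and is not part of the analysis software.

```python
# Counts copied from the HG001 / HG002 / HG005 results above.
het_snps = {"HG001": 2_298_495, "HG002": 2_309_109, "HG005": 2_100_359}
phased_het_snps = {"HG001": 2_276_880, "HG002": 2_285_149, "HG005": 2_080_796}
het_indels = {"HG001": 621_814, "HG002": 622_817, "HG005": 579_681}
phased_het_indels = {"HG001": 465_566, "HG002": 464_543, "HG005": 427_588}


def phased_fraction(phased: int, total: int) -> float:
    """Fraction of heterozygous variants that were assigned a phase."""
    return phased / total


snp_rates = {s: phased_fraction(phased_het_snps[s], het_snps[s]) for s in het_snps}
indel_rates = {s: phased_fraction(phased_het_indels[s], het_indels[s]) for s in het_indels}

for sample in het_snps:
    print(f"{sample}: SNPs {snp_rates[sample]:.1%}, indels {indel_rates[sample]:.1%}")
```

With these numbers, roughly 99% of heterozygous SNPs and about three quarters of heterozygous indels are phased in each sample.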

Multiplexed Sample Sequencing

DNBSEQ Complete WGS PanGenome data generated across a wide range of samples. DNA was isolated as previously described and processed in a similar manner, except that a multiplex of 10 samples per DNBSEQ Complete WGS reaction was performed. All sequencing was done on DNBSEQ-T7 using PE100 and PE150 for cWGS and PCR-free libraries, respectively. FASTQ files were processed through the DNBSEQ Complete WGS Analysis Software.

| Metric | GM19240 | GM20129 | HG00438 | HG00735 | HG01358 | HG01928 | HG02630 | HG03453 | HG03492 |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| SNPs | 5,500,774 | 5,388,152 | 4,568,514 | 4,721,535 | 4,610,268 | 4,418,976 | 5,531,638 | 5,551,691 | 4,663,145 |
| Het SNPs | 3,099,775 | 3,107,351 | 2,184,486 | 2,522,639 | 2,319,823 | 1,896,402 | 3,107,115 | 3,146,205 | 2,372,570 |
| Indels | 1,588,814 | 1,564,934 | 1,374,318 | 1,389,518 | 1,391,288 | 1,375,203 | 1,646,903 | 1,600,805 | 1,450,998 |
| Het Indels | 809,848 | 807,535 | 593,246 | 668,840 | 621,973 | 523,835 | 810,262 | 820,508 | 635,577 |
| Barcode Split (%) | 85.62 | 73.94 | 82.87 | 80.09 | 85.68 | 79.22 | 79.75 | 84.70 | 81.35 |
| Long Fragments | 7,401,473 | 9,997,627 | 9,235,782 | 19,731,876 | 13,132,049 | 13,094,321 | 14,740,072 | 14,732,970 | 11,991,831 |
| Average Length of LFs | 111,019 | 94,982 | 93,695 | 52,037 | 67,054 | 79,207 | 79,052 | 58,517 | 88,550 |
| Phased het SNPs | 3,043,478 | 3,079,889 | 2,162,276 | 2,496,544 | 2,296,837 | 1,851,883 | 3,079,165 | 3,083,929 | 2,349,608 |
| Phased het Indels | 511,004 | 603,604 | 436,647 | 495,179 | 463,455 | 315,385 | 605,143 | 499,608 | 470,432 |
| Phaseblocks | 913 | 1,082 | 1,185 | 3,432 | 2,069 | 2,837 | 1,211 | 1,717 | 1,503 |
| Phaseblock N50 (Mb) | 58.1 | 43.8 | 35.9 | 4.4 | 17.7 | 5.1 | 32.8 | 18.5 | 43.8 |
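
The phase block N50 values reported for these samples summarize how long the phase blocks are. The sketch below shows the conventional N50 calculation: sort the block lengths from longest to shortest and report the length at which the running total first reaches half of the summed length. It uses made-up block lengths, and it is an assumption that the analysis software follows exactly this definition (some tools normalize against genome size instead).

```python
def n50(lengths):
    """Return the N50 of a collection of block lengths (0 if empty)."""
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        # The first block that pushes the running sum to half is the N50.
        if running >= half_total:
            return length
    return 0


# Hypothetical phase block lengths in Mb, for illustration only.
toy_blocks_mb = [40, 30, 20, 5, 3, 2]
print(n50(toy_blocks_mb))  # prints 30
```

In the toy data the blocks sum to 100 Mb, and the two longest blocks (40 and 30) together pass the 50 Mb midpoint, so the N50 is 30 Mb.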

Complete WGS Workflow

Library Prep

Sequencing

Analysis

  • DNBSEQ Complete WGS Analysis Software

Complete WGS Data Analysis

The DNBSEQ Complete WGS (cWGS) analysis pipeline enables the mapping, variant calling, and phasing of input FASTQ files from a PCR-free (PF) library and a DNBSEQ Complete WGS (cWGS) library of the same sample. Running this pipeline results in a highly accurate and complete phased variant call format (VCF) file. We recommend at least 40X coverage depth for the PCR-free library and 30X coverage depth for the cWGS library. Below is a visual summary of the pipeline.
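
Before starting a run, it can be useful to estimate whether each library is likely to reach the recommended depth. The sketch below applies the usual approximation, depth = sequenced bases / genome size, to paired-end data. The genome size of 3.1 Gb and the read-pair counts are assumptions chosen for illustration, and the estimate ignores duplicates and unmapped reads, so real depth will be somewhat lower.

```python
GENOME_SIZE_BP = 3_100_000_000  # assumption: approximate human genome size


def mean_depth(read_pairs: int, read_length: int,
               genome_size: int = GENOME_SIZE_BP) -> float:
    """Approximate mean depth for paired-end reads (two reads per pair)."""
    return read_pairs * 2 * read_length / genome_size


def meets_recommendation(pf_depth: float, cwgs_depth: float,
                         pf_min: float = 40.0, cwgs_min: float = 30.0) -> bool:
    """Check depths against the recommended 40X (PCR-free) and 30X (cWGS)."""
    return pf_depth >= pf_min and cwgs_depth >= cwgs_min


# Hypothetical read-pair counts for one sample.
pf_depth = mean_depth(read_pairs=450_000_000, read_length=150)    # PE150
cwgs_depth = mean_depth(read_pairs=500_000_000, read_length=100)  # PE100
print(f"PCR-free: {pf_depth:.1f}X, cWGS: {cwgs_depth:.1f}X")
print("Meets recommendation:", meets_recommendation(pf_depth, cwgs_depth))
```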

DNBSEQ Complete WGS (cWGS) Analysis Pipeline


System Requirements

| Hardware | Software |
| --- | --- |
| Multicore computer (48 CPUs or more) | Linux CentOS 7 or later |
| Minimum 72 GB RAM | Root access may be necessary to install Singularity on the system |
| Storage varies with sample count and coverage; expect approximately 1 TB per sample | |

Analysis Run Time

The expected analysis run time is 14 hours per sample on a system that meets the minimum requirements. Per-sample analysis run time can be greatly reduced by batching samples and utilizing additional CPUs.
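
The stated requirements and run time can be turned into a rough planning check. The sketch below is not part of the DNBSEQ Complete WGS Analysis Software; it assumes a Linux host and uses only the figures given here (48 CPUs, 72 GB RAM, about 1 TB of storage and 14 hours per sample). The function names `plan` and `check_host` are hypothetical. Because no figure is given for the speed-up from batching, the run time is reported only as an unbatched upper bound.

```python
import os
import shutil

MIN_CPUS = 48
MIN_RAM_GB = 72
STORAGE_TB_PER_SAMPLE = 1.0
HOURS_PER_SAMPLE = 14


def plan(samples: int) -> dict:
    """Estimate storage and an unbatched upper bound on run time."""
    return {
        "storage_tb": samples * STORAGE_TB_PER_SAMPLE,
        "serial_hours": samples * HOURS_PER_SAMPLE,
    }


def check_host(samples: int, workdir: str = ".") -> dict:
    """Compare this Linux host with the minimum requirements."""
    cpus = os.cpu_count() or 0
    # Physical memory in decimal GB, read through POSIX sysconf.
    ram_gb = os.sysconf("SC_PHYS_PAGES") * os.sysconf("SC_PAGE_SIZE") / 1e9
    free_tb = shutil.disk_usage(workdir).free / 1e12
    return {
        "cpus_ok": cpus >= MIN_CPUS,
        "ram_ok": ram_gb >= MIN_RAM_GB,
        "storage_ok": free_tb >= samples * STORAGE_TB_PER_SAMPLE,
    }


print(plan(3))
print(check_host(3))
```

For three samples, `plan` reports 3 TB of storage and 42 hours if the samples are processed one after another on a minimum-specification system.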

Specifications

Applications: Haplotype identification, structural variant identification, de novo
Compatible Species: Human, simple plants (rice and lettuce), animals (dogs, moths, fish)
Sample Type: DNA
Barcode: 48 unique barcodes
Sequencer: DNBSEQ-T7

| Methods | PCR-Free | Complete WGS |
| --- | --- | --- |
| Input Quantity | 250–400 ng | 10 ng |
| Manual Assay Time | 3.2 hrs | 18 hrs |
| Run Time on SP-100* | ~2 hrs | ~9 hrs |
| Read Length | PE150 | PE100 |

Ordering Information

| Category | Product | Cat. No. |
| --- | --- | --- |
| Instrument | DNBSEQ-T7 Genetic Sequencer | 900-000698-00 |
| Automation | SP-100 Automated Library Prep System | 900-000206-00 |
| Automation | SP-Smart 8 Sample Preparation System | 900-000495-00 |
| Sample Prep | MGIEasy Magnetic Beads Genomic DNA Extraction Kit | 1000010524 |
| Library Prep | DNBSEQ Fast PCR-FREE FS Library Prep Set V2.0 | 940-001314-00 |
| Library Prep | DNBSEQ Complete WGS (96 Samples) | 940-002564-00 |
| Sequencing Reagents | DNBSEQ-T7 High-throughput Sequencing Set (DNBSEQ Complete WGS FCL PE150) | 940-002496-00 |
| Sequencing Reagents | DNBSEQ-T7 High-throughput Sequencing Set (FCL PE100) V1.0 | 940-000836-00 |

A Complete Solution for Every Step of Your Workflow