<?xml version="1.0" encoding="UTF-8"?><xml><records><record><source-app name="Biblio" version="7.x">Drupal-Biblio</source-app><ref-type>17</ref-type><contributors><authors><author><style face="normal" font="default" size="100%">Yun, Taedong</style></author><author><style face="normal" font="default" size="100%">Li, Helen</style></author><author><style face="normal" font="default" size="100%">Chang, Pi-Chuan</style></author><author><style face="normal" font="default" size="100%">Lin, Michael F</style></author><author><style face="normal" font="default" size="100%">Carroll, Andrew</style></author><author><style face="normal" font="default" size="100%">McLean, Cory Y</style></author></authors></contributors><titles><title><style face="normal" font="default" size="100%">Accurate, scalable cohort variant calls using DeepVariant and GLnexus.</style></title><secondary-title><style face="normal" font="default" size="100%">Bioinformatics</style></secondary-title><alt-title><style face="normal" font="default" size="100%">Bioinformatics</style></alt-title></titles><dates><year><style  face="normal" font="default" size="100%">2021</style></year><pub-dates><date><style  face="normal" font="default" size="100%">2021 Jan 05</style></date></pub-dates></dates><language><style face="normal" font="default" size="100%">eng</style></language><abstract><style face="normal" font="default" size="100%">&lt;p&gt;&lt;b&gt;MOTIVATION: &lt;/b&gt;Population-scale sequenced cohorts are foundational resources for genetic analyses, but processing raw reads into analysis-ready cohort-level variants remains challenging.&lt;/p&gt;&lt;p&gt;&lt;b&gt;RESULTS: &lt;/b&gt;We introduce an open-source cohort-calling method that uses the highly-accurate caller DeepVariant and scalable merging tool GLnexus. Using callset quality metrics based on variant recall and precision in benchmark samples and Mendelian consistency in father-mother-child trios, we optimized the method across a range of cohort sizes, sequencing methods, and sequencing depths. The resulting callsets show consistent quality improvements over those generated using existing best practices with reduced cost. We further evaluate our pipeline in the deeply sequenced 1000 Genomes Project (1KGP) samples and show superior callset quality metrics and imputation reference panel performance compared to an independently-generated GATK Best Practices pipeline.&lt;/p&gt;&lt;p&gt;&lt;b&gt;AVAILABILITY AND IMPLEMENTATION: &lt;/b&gt;We publicly release the 1KGP individual-level variant calls and cohort callset (https://console.cloud.google.com/storage/browser/brain-genomics-public/research/cohort/1KGP) to foster additional development and evaluation of cohort merging methods as well as broad studies of genetic variation. Both DeepVariant (https://github.com/google/deepvariant) and GLnexus (https://github.com/dnanexus-rnd/GLnexus) are open-sourced, and the optimized GLnexus setup discovered in this study is also integrated into GLnexus public releases v1.2.2 and later.&lt;/p&gt;&lt;p&gt;&lt;b&gt;SUPPLEMENTARY INFORMATION: &lt;/b&gt;Supplementary data are available at Bioinformatics online.&lt;/p&gt;</style></abstract><custom1><style face="normal" font="default" size="100%">https://www.ncbi.nlm.nih.gov/pubmed/33399819?dopt=Abstract</style></custom1></record><record><source-app name="Biblio" version="7.x">Drupal-Biblio</source-app><ref-type>17</ref-type><contributors><authors><author><style face="normal" font="default" size="100%">Garg, Shilpa</style></author><author><style face="normal" font="default" size="100%">Fungtammasan, Arkarachai</style></author><author><style face="normal" font="default" size="100%">Carroll, Andrew</style></author><author><style face="normal" font="default" size="100%">Chou, Mike</style></author><author><style face="normal" font="default" size="100%">Schmitt, Anthony</style></author><author><style face="normal" font="default" size="100%">Zhou, Xiang</style></author><author><style face="normal" font="default" size="100%">Mac, Stephen</style></author><author><style face="normal" font="default" size="100%">Peluso, Paul</style></author><author><style face="normal" font="default" size="100%">Hatas, Emily</style></author><author><style face="normal" font="default" size="100%">Ghurye, Jay</style></author><author><style face="normal" font="default" size="100%">Maguire, Jared</style></author><author><style face="normal" font="default" size="100%">Mahmoud, Medhat</style></author><author><style face="normal" font="default" size="100%">Cheng, Haoyu</style></author><author><style face="normal" font="default" size="100%">Heller, David</style></author><author><style face="normal" font="default" size="100%">Zook, Justin M</style></author><author><style face="normal" font="default" size="100%">Moemke, Tobias</style></author><author><style face="normal" font="default" size="100%">Marschall, Tobias</style></author><author><style face="normal" font="default" size="100%">Sedlazeck, Fritz J</style></author><author><style face="normal" font="default" size="100%">Aach, John</style></author><author><style face="normal" font="default" size="100%">Chin, Chen-Shan</style></author><author><style face="normal" font="default" size="100%">Church, George M</style></author><author><style face="normal" font="default" size="100%">Li, Heng</style></author></authors></contributors><titles><title><style face="normal" font="default" size="100%">Chromosome-scale, haplotype-resolved assembly of human genomes.</style></title><secondary-title><style face="normal" font="default" size="100%">Nat Biotechnol</style></secondary-title><alt-title><style face="normal" font="default" size="100%">Nat Biotechnol</style></alt-title></titles><keywords><keyword><style  face="normal" font="default" size="100%">Algorithms</style></keyword><keyword><style  face="normal" font="default" size="100%">Chromosomes, Human</style></keyword><keyword><style  face="normal" font="default" size="100%">Genome, Human</style></keyword><keyword><style  face="normal" font="default" size="100%">Haplotypes</style></keyword><keyword><style  face="normal" font="default" size="100%">Heterozygote</style></keyword><keyword><style  face="normal" font="default" size="100%">Humans</style></keyword><keyword><style  face="normal" font="default" size="100%">Polymorphism, Single Nucleotide</style></keyword></keywords><dates><year><style  face="normal" font="default" size="100%">2021</style></year><pub-dates><date><style  face="normal" font="default" size="100%">2021 03</style></date></pub-dates></dates><volume><style face="normal" font="default" size="100%">39</style></volume><pages><style face="normal" font="default" size="100%">309-312</style></pages><language><style face="normal" font="default" size="100%">eng</style></language><abstract><style face="normal" font="default" size="100%">&lt;p&gt;Haplotype-resolved or phased genome assembly provides a complete picture of genomes and their complex genetic variations. However, current algorithms for phased assembly either do not generate chromosome-scale phasing or require pedigree information, which limits their application. We present a method named diploid assembly (DipAsm) that uses long, accurate reads and long-range conformation data for single individuals to generate a chromosome-scale phased assembly within 1 day. Applied to four public human genomes, PGP1, HG002, NA12878 and HG00733, DipAsm produced haplotype-resolved assemblies with minimum contig length needed to cover 50% of the known genome (NG50) up to 25 Mb and phased ~99.5% of heterozygous sites at 98-99% accuracy, outperforming other approaches in terms of both contiguity and phasing completeness. We demonstrate the importance of chromosome-scale phased assemblies for the discovery of structural variants (SVs), including thousands of new transposon insertions, and of highly polymorphic and medically important regions such as the human leukocyte antigen (HLA) and killer cell immunoglobulin-like receptor (KIR) regions. DipAsm will facilitate high-quality precision medicine and studies of individual haplotype variation and population diversity.&lt;/p&gt;</style></abstract><issue><style face="normal" font="default" size="100%">3</style></issue><custom1><style face="normal" font="default" size="100%">https://www.ncbi.nlm.nih.gov/pubmed/33288905?dopt=Abstract</style></custom1></record><record><source-app name="Biblio" version="7.x">Drupal-Biblio</source-app><ref-type>17</ref-type><contributors><authors><author><style face="normal" font="default" size="100%">Zarate, Samantha</style></author><author><style face="normal" font="default" size="100%">Carroll, Andrew</style></author><author><style face="normal" font="default" size="100%">Mahmoud, Medhat</style></author><author><style face="normal" font="default" size="100%">Krasheninina, Olga</style></author><author><style face="normal" font="default" size="100%">Jun, Goo</style></author><author><style face="normal" font="default" size="100%">Salerno, William J</style></author><author><style face="normal" font="default" size="100%">Schatz, Michael C</style></author><author><style face="normal" font="default" size="100%">Boerwinkle, Eric</style></author><author><style face="normal" font="default" size="100%">Gibbs, Richard A</style></author><author><style face="normal" font="default" size="100%">Sedlazeck, Fritz J</style></author></authors></contributors><titles><title><style face="normal" font="default" size="100%">Parliament2: Accurate structural variant calling at scale.</style></title><secondary-title><style face="normal" font="default" size="100%">Gigascience</style></secondary-title><alt-title><style face="normal" font="default" size="100%">Gigascience</style></alt-title></titles><dates><year><style  face="normal" font="default" size="100%">2020</style></year><pub-dates><date><style  face="normal" font="default" size="100%">2020 12 21</style></date></pub-dates></dates><volume><style face="normal" font="default" size="100%">9</style></volume><language><style face="normal" font="default" size="100%">eng</style></language><abstract><style face="normal" font="default" size="100%">&lt;p&gt;&lt;b&gt;BACKGROUND: &lt;/b&gt;Structural variants (SVs) are critical contributors to genetic diversity and genomic disease. To predict the phenotypic impact of SVs, there is a need for better estimates of both the occurrence and frequency of SVs, preferably from large, ethnically diverse cohorts. Thus, the current standard approach requires the use of short paired-end reads, which remain challenging to detect, especially at the scale of hundreds to thousands of samples.&lt;/p&gt;&lt;p&gt;&lt;b&gt;FINDINGS: &lt;/b&gt;We present Parliament2, a consensus SV framework that leverages multiple best-in-class methods to identify high-quality SVs from short-read DNA sequence data at scale. Parliament2 incorporates pre-installed SV callers that are optimized for efficient execution in parallel to reduce the overall runtime and costs. We demonstrate the accuracy of Parliament2 when applied to data from NovaSeq and HiSeq X platforms with the Genome in a Bottle (GIAB) SV call set across all size classes. The reported quality score per SV is calibrated across different SV types and size classes. Parliament2 has the highest F1 score (74.27%) measured across the independent gold standard from GIAB. We illustrate the compute performance by processing all 1000 Genomes samples (2,691 samples) in &lt;1 day on GRCH38. Parliament2 improves the runtime performance of individual methods and is open source (https://github.com/slzarate/parliament2), and a Docker image, as well as a WDL implementation, is available.&lt;/p&gt;&lt;p&gt;&lt;b&gt;CONCLUSION: &lt;/b&gt;Parliament2 provides both a highly accurate single-sample SV call set from short-read DNA sequence data and enables cost-efficient application over cloud or cluster environments, processing thousands of samples.&lt;/p&gt;</style></abstract><issue><style face="normal" font="default" size="100%">12</style></issue><custom1><style face="normal" font="default" size="100%">https://www.ncbi.nlm.nih.gov/pubmed/33347570?dopt=Abstract</style></custom1></record><record><source-app name="Biblio" version="7.x">Drupal-Biblio</source-app><ref-type>17</ref-type><contributors><authors><author><style face="normal" font="default" size="100%">Wenger, Aaron M</style></author><author><style face="normal" font="default" size="100%">Peluso, Paul</style></author><author><style face="normal" font="default" size="100%">Rowell, William J</style></author><author><style face="normal" font="default" size="100%">Chang, Pi-Chuan</style></author><author><style face="normal" font="default" size="100%">Hall, Richard J</style></author><author><style face="normal" font="default" size="100%">Concepcion, Gregory T</style></author><author><style face="normal" font="default" size="100%">Ebler, Jana</style></author><author><style face="normal" font="default" size="100%">Fungtammasan, Arkarachai</style></author><author><style face="normal" font="default" size="100%">Kolesnikov, Alexey</style></author><author><style face="normal" font="default" size="100%">Olson, Nathan D</style></author><author><style face="normal" font="default" size="100%">Töpfer, Armin</style></author><author><style face="normal" font="default" size="100%">Alonge, Michael</style></author><author><style face="normal" font="default" size="100%">Mahmoud, Medhat</style></author><author><style face="normal" font="default" size="100%">Qian, Yufeng</style></author><author><style face="normal" font="default" size="100%">Chin, Chen-Shan</style></author><author><style face="normal" font="default" size="100%">Phillippy, Adam M</style></author><author><style face="normal" font="default" size="100%">Schatz, Michael C</style></author><author><style face="normal" font="default" size="100%">Myers, Gene</style></author><author><style face="normal" font="default" size="100%">DePristo, Mark A</style></author><author><style face="normal" font="default" size="100%">Ruan, Jue</style></author><author><style face="normal" font="default" size="100%">Marschall, Tobias</style></author><author><style face="normal" font="default" size="100%">Sedlazeck, Fritz J</style></author><author><style face="normal" font="default" size="100%">Zook, Justin M</style></author><author><style face="normal" font="default" size="100%">Li, Heng</style></author><author><style face="normal" font="default" size="100%">Koren, Sergey</style></author><author><style face="normal" font="default" size="100%">Carroll, Andrew</style></author><author><style face="normal" font="default" size="100%">Rank, David R</style></author><author><style face="normal" font="default" size="100%">Hunkapiller, Michael W</style></author></authors></contributors><titles><title><style face="normal" font="default" size="100%">Accurate circular consensus long-read sequencing improves variant detection and assembly of a human genome.</style></title><secondary-title><style face="normal" font="default" size="100%">Nat Biotechnol</style></secondary-title><alt-title><style face="normal" font="default" size="100%">Nat. Biotechnol.</style></alt-title></titles><dates><year><style  face="normal" font="default" size="100%">2019</style></year><pub-dates><date><style  face="normal" font="default" size="100%">2019 Oct</style></date></pub-dates></dates><volume><style face="normal" font="default" size="100%">37</style></volume><pages><style face="normal" font="default" size="100%">1155-1162</style></pages><language><style face="normal" font="default" size="100%">eng</style></language><abstract><style face="normal" font="default" size="100%">&lt;p&gt;The DNA sequencing technologies in use today produce either highly accurate short reads or less-accurate long reads. We report the optimization of circular consensus sequencing (CCS) to improve the accuracy of single-molecule real-time (SMRT) sequencing (PacBio) and generate highly accurate (99.8%) long high-fidelity (HiFi) reads with an average length of 13.5 kilobases (kb). We applied our approach to sequence the well-characterized human HG002/NA24385 genome and obtained precision and recall rates of at least 99.91% for single-nucleotide variants (SNVs), 95.98% for insertions and deletions &lt;50 bp (indels) and 95.99% for structural variants. Our CCS method matches or exceeds the ability of short-read sequencing to detect small variants and structural variants. We estimate that 2,434 discordances are correctable mistakes in the 'genome in a bottle' (GIAB) benchmark set. Nearly all (99.64%) variants can be phased into haplotypes, further improving variant detection. De novo genome assembly using CCS reads alone produced a contiguous and accurate genome with a contig N50 of &gt;15 megabases (Mb) and concordance of 99.997%, substantially outperforming assembly with less-accurate long reads.&lt;/p&gt;</style></abstract><issue><style face="normal" font="default" size="100%">10</style></issue><custom1><style face="normal" font="default" size="100%">https://www.ncbi.nlm.nih.gov/pubmed/31406327?dopt=Abstract</style></custom1></record></records></xml>