<?xml version="1.0" encoding="UTF-8"?><xml><records><record><source-app name="Biblio" version="7.x">Drupal-Biblio</source-app><ref-type>17</ref-type><contributors><authors><author><style face="normal" font="default" size="100%">Ranallo-Benavidez, T Rhyker</style></author><author><style face="normal" font="default" size="100%">Lemmon, Zachary</style></author><author><style face="normal" font="default" size="100%">Soyk, Sebastian</style></author><author><style face="normal" font="default" size="100%">Aganezov, Sergey</style></author><author><style face="normal" font="default" size="100%">Salerno, William J</style></author><author><style face="normal" font="default" size="100%">McCoy, Rajiv C</style></author><author><style face="normal" font="default" size="100%">Lippman, Zachary B</style></author><author><style face="normal" font="default" size="100%">Schatz, Michael C</style></author><author><style face="normal" font="default" size="100%">Sedlazeck, Fritz J</style></author></authors></contributors><titles><title><style face="normal" font="default" size="100%">Optimized sample selection for cost-efficient long-read population sequencing.</style></title><secondary-title><style face="normal" font="default" size="100%">Genome Res</style></secondary-title><alt-title><style face="normal" font="default" size="100%">Genome Res</style></alt-title></titles><dates><year><style  face="normal" font="default" size="100%">2021</style></year><pub-dates><date><style  face="normal" font="default" size="100%">2021 May</style></date></pub-dates></dates><volume><style face="normal" font="default" size="100%">31</style></volume><pages><style face="normal" font="default" size="100%">910-918</style></pages><language><style face="normal" font="default" size="100%">eng</style></language><abstract><style face="normal" font="default" size="100%">&lt;p&gt;An increasingly important scenario in population genetics is when a large cohort has been genotyped using a low-resolution approach (e.g., 
microarrays, exome capture, short-read WGS), from which a few individuals are resequenced using a more comprehensive approach, especially long-read sequencing. The subset of individuals selected should ensure that the captured genetic diversity is fully representative and includes variants across all subpopulations. For example, human variation has historically focused on individuals with European ancestry, but this represents a small fraction of the overall diversity. Addressing this, SVCollector identifies the optimal subset of individuals for resequencing by analyzing population-level VCF files from low-resolution genotyping studies. It then computes a ranked list of samples that maximizes the total number of variants present within a subset of a given size. To solve this optimization problem, SVCollector implements a fast, greedy heuristic and an exact algorithm using integer linear programming. We apply SVCollector on simulated data, 2504 human genomes from the 1000 Genomes Project, and 3024 genomes from the 3000 Rice Genomes Project and show the rankings it computes are more representative than alternative naive strategies. When selecting an optimal subset of 100 samples in these cohorts, SVCollector identifies individuals from every subpopulation, whereas naive methods yield an unbalanced selection. 
Finally, we show the number of variants present in cohorts selected using this approach follows a power-law distribution that is naturally related to the population genetic concept of the allele frequency spectrum, allowing us to estimate the diversity present with increasing numbers of samples.&lt;/p&gt;</style></abstract><issue><style face="normal" font="default" size="100%">5</style></issue><custom1><style face="normal" font="default" size="100%">https://www.ncbi.nlm.nih.gov/pubmed/33811084?dopt=Abstract</style></custom1></record><record><source-app name="Biblio" version="7.x">Drupal-Biblio</source-app><ref-type>17</ref-type><contributors><authors><author><style face="normal" font="default" size="100%">Alonge, Michael</style></author><author><style face="normal" font="default" size="100%">Wang, Xingang</style></author><author><style face="normal" font="default" size="100%">Benoit, Matthias</style></author><author><style face="normal" font="default" size="100%">Soyk, Sebastian</style></author><author><style face="normal" font="default" size="100%">Pereira, Lara</style></author><author><style face="normal" font="default" size="100%">Zhang, Lei</style></author><author><style face="normal" font="default" size="100%">Suresh, Hamsini</style></author><author><style face="normal" font="default" size="100%">Ramakrishnan, Srividya</style></author><author><style face="normal" font="default" size="100%">Maumus, Florian</style></author><author><style face="normal" font="default" size="100%">Ciren, Danielle</style></author><author><style face="normal" font="default" size="100%">Levy, Yuval</style></author><author><style face="normal" font="default" size="100%">Harel, Tom Hai</style></author><author><style face="normal" font="default" size="100%">Shalev-Schlosser, Gili</style></author><author><style face="normal" font="default" size="100%">Amsellem, Ziva</style></author><author><style face="normal" font="default" size="100%">Razifard, Hamid</style></author><author><style 
face="normal" font="default" size="100%">Caicedo, Ana L</style></author><author><style face="normal" font="default" size="100%">Tieman, Denise M</style></author><author><style face="normal" font="default" size="100%">Klee, Harry</style></author><author><style face="normal" font="default" size="100%">Kirsche, Melanie</style></author><author><style face="normal" font="default" size="100%">Aganezov, Sergey</style></author><author><style face="normal" font="default" size="100%">Ranallo-Benavidez, T Rhyker</style></author><author><style face="normal" font="default" size="100%">Lemmon, Zachary H</style></author><author><style face="normal" font="default" size="100%">Kim, Jennifer</style></author><author><style face="normal" font="default" size="100%">Robitaille, Gina</style></author><author><style face="normal" font="default" size="100%">Kramer, Melissa</style></author><author><style face="normal" font="default" size="100%">Goodwin, Sara</style></author><author><style face="normal" font="default" size="100%">McCombie, W Richard</style></author><author><style face="normal" font="default" size="100%">Hutton, Samuel</style></author><author><style face="normal" font="default" size="100%">Van Eck, Joyce</style></author><author><style face="normal" font="default" size="100%">Gillis, Jesse</style></author><author><style face="normal" font="default" size="100%">Eshed, Yuval</style></author><author><style face="normal" font="default" size="100%">Sedlazeck, Fritz J</style></author><author><style face="normal" font="default" size="100%">van der Knaap, Esther</style></author><author><style face="normal" font="default" size="100%">Schatz, Michael C</style></author><author><style face="normal" font="default" size="100%">Lippman, Zachary B</style></author></authors></contributors><titles><title><style face="normal" font="default" size="100%">Major Impacts of Widespread Structural Variation on Gene Expression and Crop Improvement in Tomato.</style></title><secondary-title><style 
face="normal" font="default" size="100%">Cell</style></secondary-title><alt-title><style face="normal" font="default" size="100%">Cell</style></alt-title></titles><keywords><keyword><style  face="normal" font="default" size="100%">Alleles</style></keyword><keyword><style  face="normal" font="default" size="100%">Crops, Agricultural</style></keyword><keyword><style  face="normal" font="default" size="100%">Cytochrome P-450 Enzyme System</style></keyword><keyword><style  face="normal" font="default" size="100%">Ecotype</style></keyword><keyword><style  face="normal" font="default" size="100%">Epistasis, Genetic</style></keyword><keyword><style  face="normal" font="default" size="100%">Fruit</style></keyword><keyword><style  face="normal" font="default" size="100%">Gene Duplication</style></keyword><keyword><style  face="normal" font="default" size="100%">Gene Expression Regulation, Plant</style></keyword><keyword><style  face="normal" font="default" size="100%">Genome, Plant</style></keyword><keyword><style  face="normal" font="default" size="100%">Genomic Structural Variation</style></keyword><keyword><style  face="normal" font="default" size="100%">Genotype</style></keyword><keyword><style  face="normal" font="default" size="100%">Inbreeding</style></keyword><keyword><style  face="normal" font="default" size="100%">Lycopersicon esculentum</style></keyword><keyword><style  face="normal" font="default" size="100%">Molecular Sequence Annotation</style></keyword><keyword><style  face="normal" font="default" size="100%">Phenotype</style></keyword><keyword><style  face="normal" font="default" size="100%">Plant Breeding</style></keyword><keyword><style  face="normal" font="default" size="100%">Quantitative Trait Loci</style></keyword></keywords><dates><year><style  face="normal" font="default" size="100%">2020</style></year><pub-dates><date><style  face="normal" font="default" size="100%">2020 07 09</style></date></pub-dates></dates><volume><style face="normal" 
font="default" size="100%">182</style></volume><pages><style face="normal" font="default" size="100%">145-161.e23</style></pages><language><style face="normal" font="default" size="100%">eng</style></language><abstract><style face="normal" font="default" size="100%">&lt;p&gt;Structural variants (SVs) underlie important crop improvement and domestication traits. However, resolving the extent, diversity, and quantitative impact of SVs has been challenging. We used long-read nanopore sequencing to capture 238,490 SVs in 100 diverse tomato lines. This panSV genome, along with 14 new reference assemblies, revealed large-scale intermixing of diverse genotypes, as well as thousands of SVs intersecting genes and cis-regulatory regions. Hundreds of SV-gene pairs exhibit subtle and significant expression changes, which could broadly influence quantitative trait variation. By combining quantitative genetics with genome editing, we show how multiple SVs that changed gene dosage and expression levels modified fruit flavor, size, and production. In the last example, higher order epistasis among four SVs affecting three related transcription factors allowed introduction of an important harvesting trait in modern tomato. 
Our findings highlight the underexplored role of SVs in genotype-to-phenotype relationships and their widespread importance and utility in crop improvement.&lt;/p&gt;</style></abstract><issue><style face="normal" font="default" size="100%">1</style></issue><custom1><style face="normal" font="default" size="100%">https://www.ncbi.nlm.nih.gov/pubmed/32553272?dopt=Abstract</style></custom1></record><record><source-app name="Biblio" version="7.x">Drupal-Biblio</source-app><ref-type>17</ref-type><contributors><authors><author><style face="normal" font="default" size="100%">Zarate, Samantha</style></author><author><style face="normal" font="default" size="100%">Carroll, Andrew</style></author><author><style face="normal" font="default" size="100%">Mahmoud, Medhat</style></author><author><style face="normal" font="default" size="100%">Krasheninina, Olga</style></author><author><style face="normal" font="default" size="100%">Jun, Goo</style></author><author><style face="normal" font="default" size="100%">Salerno, William J</style></author><author><style face="normal" font="default" size="100%">Schatz, Michael C</style></author><author><style face="normal" font="default" size="100%">Boerwinkle, Eric</style></author><author><style face="normal" font="default" size="100%">Gibbs, Richard A</style></author><author><style face="normal" font="default" size="100%">Sedlazeck, Fritz J</style></author></authors></contributors><titles><title><style face="normal" font="default" size="100%">Parliament2: Accurate structural variant calling at scale.</style></title><secondary-title><style face="normal" font="default" size="100%">Gigascience</style></secondary-title><alt-title><style face="normal" font="default" size="100%">Gigascience</style></alt-title></titles><dates><year><style  face="normal" font="default" size="100%">2020</style></year><pub-dates><date><style  face="normal" font="default" size="100%">2020 12 21</style></date></pub-dates></dates><volume><style face="normal" 
font="default" size="100%">9</style></volume><language><style face="normal" font="default" size="100%">eng</style></language><abstract><style face="normal" font="default" size="100%">&lt;p&gt;&lt;b&gt;BACKGROUND: &lt;/b&gt;Structural variants (SVs) are critical contributors to genetic diversity and genomic disease. To predict the phenotypic impact of SVs, there is a need for better estimates of both the occurrence and frequency of SVs, preferably from large, ethnically diverse cohorts. Thus, the current standard approach requires the use of short paired-end reads, which remain challenging to detect, especially at the scale of hundreds to thousands of samples.&lt;/p&gt;&lt;p&gt;&lt;b&gt;FINDINGS: &lt;/b&gt;We present Parliament2, a consensus SV framework that leverages multiple best-in-class methods to identify high-quality SVs from short-read DNA sequence data at scale. Parliament2 incorporates pre-installed SV callers that are optimized for efficient execution in parallel to reduce the overall runtime and costs. We demonstrate the accuracy of Parliament2 when applied to data from NovaSeq and HiSeq X platforms with the Genome in a Bottle (GIAB) SV call set across all size classes. The reported quality score per SV is calibrated across different SV types and size classes. Parliament2 has the highest F1 score (74.27%) measured across the independent gold standard from GIAB. We illustrate the compute performance by processing all 1000 Genomes samples (2,691 samples) in &lt;1 day on GRCh38. 
Parliament2 improves the runtime performance of individual methods and is open source (https://github.com/slzarate/parliament2), and a Docker image, as well as a WDL implementation, is available.&lt;/p&gt;&lt;p&gt;&lt;b&gt;CONCLUSION: &lt;/b&gt;Parliament2 provides both a highly accurate single-sample SV call set from short-read DNA sequence data and enables cost-efficient application over cloud or cluster environments, processing thousands of samples.&lt;/p&gt;</style></abstract><issue><style face="normal" font="default" size="100%">12</style></issue><custom1><style face="normal" font="default" size="100%">https://www.ncbi.nlm.nih.gov/pubmed/33347570?dopt=Abstract</style></custom1></record><record><source-app name="Biblio" version="7.x">Drupal-Biblio</source-app><ref-type>17</ref-type><contributors><authors><author><style face="normal" font="default" size="100%">Wenger, Aaron M</style></author><author><style face="normal" font="default" size="100%">Peluso, Paul</style></author><author><style face="normal" font="default" size="100%">Rowell, William J</style></author><author><style face="normal" font="default" size="100%">Chang, Pi-Chuan</style></author><author><style face="normal" font="default" size="100%">Hall, Richard J</style></author><author><style face="normal" font="default" size="100%">Concepcion, Gregory T</style></author><author><style face="normal" font="default" size="100%">Ebler, Jana</style></author><author><style face="normal" font="default" size="100%">Fungtammasan, Arkarachai</style></author><author><style face="normal" font="default" size="100%">Kolesnikov, Alexey</style></author><author><style face="normal" font="default" size="100%">Olson, Nathan D</style></author><author><style face="normal" font="default" size="100%">Töpfer, Armin</style></author><author><style face="normal" font="default" size="100%">Alonge, Michael</style></author><author><style face="normal" font="default" size="100%">Mahmoud, Medhat</style></author><author><style 
face="normal" font="default" size="100%">Qian, Yufeng</style></author><author><style face="normal" font="default" size="100%">Chin, Chen-Shan</style></author><author><style face="normal" font="default" size="100%">Phillippy, Adam M</style></author><author><style face="normal" font="default" size="100%">Schatz, Michael C</style></author><author><style face="normal" font="default" size="100%">Myers, Gene</style></author><author><style face="normal" font="default" size="100%">DePristo, Mark A</style></author><author><style face="normal" font="default" size="100%">Ruan, Jue</style></author><author><style face="normal" font="default" size="100%">Marschall, Tobias</style></author><author><style face="normal" font="default" size="100%">Sedlazeck, Fritz J</style></author><author><style face="normal" font="default" size="100%">Zook, Justin M</style></author><author><style face="normal" font="default" size="100%">Li, Heng</style></author><author><style face="normal" font="default" size="100%">Koren, Sergey</style></author><author><style face="normal" font="default" size="100%">Carroll, Andrew</style></author><author><style face="normal" font="default" size="100%">Rank, David R</style></author><author><style face="normal" font="default" size="100%">Hunkapiller, Michael W</style></author></authors></contributors><titles><title><style face="normal" font="default" size="100%">Accurate circular consensus long-read sequencing improves variant detection and assembly of a human genome.</style></title><secondary-title><style face="normal" font="default" size="100%">Nat Biotechnol</style></secondary-title><alt-title><style face="normal" font="default" size="100%">Nat. 
Biotechnol.</style></alt-title></titles><dates><year><style  face="normal" font="default" size="100%">2019</style></year><pub-dates><date><style  face="normal" font="default" size="100%">2019 Oct</style></date></pub-dates></dates><volume><style face="normal" font="default" size="100%">37</style></volume><pages><style face="normal" font="default" size="100%">1155-1162</style></pages><language><style face="normal" font="default" size="100%">eng</style></language><abstract><style face="normal" font="default" size="100%">&lt;p&gt;The DNA sequencing technologies in use today produce either highly accurate short reads or less-accurate long reads. We report the optimization of circular consensus sequencing (CCS) to improve the accuracy of single-molecule real-time (SMRT) sequencing (PacBio) and generate highly accurate (99.8%) long high-fidelity (HiFi) reads with an average length of 13.5 kilobases (kb). We applied our approach to sequence the well-characterized human HG002/NA24385 genome and obtained precision and recall rates of at least 99.91% for single-nucleotide variants (SNVs), 95.98% for insertions and deletions &lt;50 bp (indels) and 95.99% for structural variants. Our CCS method matches or exceeds the ability of short-read sequencing to detect small variants and structural variants. We estimate that 2,434 discordances are correctable mistakes in the 'genome in a bottle' (GIAB) benchmark set. Nearly all (99.64%) variants can be phased into haplotypes, further improving variant detection. 
De novo genome assembly using CCS reads alone produced a contiguous and accurate genome with a contig N50 of &gt;15 megabases (Mb) and concordance of 99.997%, substantially outperforming assembly with less-accurate long reads.&lt;/p&gt;</style></abstract><issue><style face="normal" font="default" size="100%">10</style></issue><custom1><style face="normal" font="default" size="100%">https://www.ncbi.nlm.nih.gov/pubmed/31406327?dopt=Abstract</style></custom1></record><record><source-app name="Biblio" version="7.x">Drupal-Biblio</source-app><ref-type>17</ref-type><contributors><authors><author><style face="normal" font="default" size="100%">Dennenmoser, Stefan</style></author><author><style face="normal" font="default" size="100%">Sedlazeck, Fritz J</style></author><author><style face="normal" font="default" size="100%">Schatz, Michael C</style></author><author><style face="normal" font="default" size="100%">Altmüller, Janine</style></author><author><style face="normal" font="default" size="100%">Zytnicki, Matthias</style></author><author><style face="normal" font="default" size="100%">Nolte, Arne W</style></author></authors></contributors><titles><title><style face="normal" font="default" size="100%">Genome-wide patterns of transposon proliferation in an evolutionary young hybrid fish.</style></title><secondary-title><style face="normal" font="default" size="100%">Mol Ecol</style></secondary-title><alt-title><style face="normal" font="default" size="100%">Mol. 
Ecol.</style></alt-title></titles><dates><year><style  face="normal" font="default" size="100%">2019</style></year><pub-dates><date><style  face="normal" font="default" size="100%">2019 Mar</style></date></pub-dates></dates><volume><style face="normal" font="default" size="100%">28</style></volume><pages><style face="normal" font="default" size="100%">1491-1505</style></pages><language><style face="normal" font="default" size="100%">eng</style></language><abstract><style face="normal" font="default" size="100%">&lt;p&gt;Hybridization can induce transposons to jump into new genomic positions, which may result in their accumulation across the genome. Alternatively, transposon copy numbers may increase through nonallelic (ectopic) homologous recombination in highly repetitive regions of the genome. The relative contribution of transposition bursts versus recombination-based mechanisms to evolutionary processes remains unclear because studies on transposon dynamics in natural systems are rare. We assessed the genomewide distribution of transposon insertions in a young hybrid lineage (&quot;invasive Cottus&quot;, n = 11) and its parental species Cottus rhenanus (n = 17) and Cottus perifretum (n = 9) using a reference genome assembled from long single molecule PacBio reads. An inventory of transposable elements was reconstructed from the same data and annotated. Transposon copy numbers in the hybrid lineage increased in 120 (15.9%) out of 757 transposons studied here. The copy number increased on average by 69% (range: 10%-197%). Given the age of the hybrid lineage, this suggests that they have proliferated within a few hundred generations since admixture began. However, frequency spectra of transposon insertions revealed no increase in novel and rare insertions across assembled parts of the genome. This implies that transposons were added to repetitive regions of the genome that remain difficult to assemble. 
Future studies will need to evaluate whether recombination-based mechanisms rather than genomewide transposition may explain the majority of the recent transposon proliferation in the hybrid lineage. Irrespectively of the underlying mechanism, the observed overabundance in repetitive parts of the genome suggests that gene-rich regions are unlikely to be directly affected.&lt;/p&gt;</style></abstract><issue><style face="normal" font="default" size="100%">6</style></issue><custom1><style face="normal" font="default" size="100%">https://www.ncbi.nlm.nih.gov/pubmed/30520198?dopt=Abstract</style></custom1></record><record><source-app name="Biblio" version="7.x">Drupal-Biblio</source-app><ref-type>17</ref-type><contributors><authors><author><style face="normal" font="default" size="100%">Luo, Ruibang</style></author><author><style face="normal" font="default" size="100%">Sedlazeck, Fritz J</style></author><author><style face="normal" font="default" size="100%">Lam, Tak-Wah</style></author><author><style face="normal" font="default" size="100%">Schatz, Michael C</style></author></authors></contributors><titles><title><style face="normal" font="default" size="100%">A multi-task convolutional deep neural network for variant calling in single molecule sequencing.</style></title><secondary-title><style face="normal" font="default" size="100%">Nat Commun</style></secondary-title><alt-title><style face="normal" font="default" size="100%">Nat Commun</style></alt-title></titles><keywords><keyword><style  face="normal" font="default" size="100%">Base Sequence</style></keyword><keyword><style  face="normal" font="default" size="100%">Computational Biology</style></keyword><keyword><style  face="normal" font="default" size="100%">DNA Mutational Analysis</style></keyword><keyword><style  face="normal" font="default" size="100%">Genome, Human</style></keyword><keyword><style  face="normal" font="default" size="100%">Genome-Wide Association Study</style></keyword><keyword><style  
face="normal" font="default" size="100%">Genomics</style></keyword><keyword><style  face="normal" font="default" size="100%">Genotype</style></keyword><keyword><style  face="normal" font="default" size="100%">Genotyping Techniques</style></keyword><keyword><style  face="normal" font="default" size="100%">Humans</style></keyword><keyword><style  face="normal" font="default" size="100%">INDEL Mutation</style></keyword><keyword><style  face="normal" font="default" size="100%">Nanopores</style></keyword><keyword><style  face="normal" font="default" size="100%">Neural Networks (Computer)</style></keyword><keyword><style  face="normal" font="default" size="100%">Polymorphism, Single Nucleotide</style></keyword><keyword><style  face="normal" font="default" size="100%">Sequence Analysis, DNA</style></keyword><keyword><style  face="normal" font="default" size="100%">Software</style></keyword></keywords><dates><year><style  face="normal" font="default" size="100%">2019</style></year><pub-dates><date><style  face="normal" font="default" size="100%">2019 03 01</style></date></pub-dates></dates><volume><style face="normal" font="default" size="100%">10</style></volume><pages><style face="normal" font="default" size="100%">998</style></pages><language><style face="normal" font="default" size="100%">eng</style></language><abstract><style face="normal" font="default" size="100%">&lt;p&gt;The accurate identification of DNA sequence variants is an important, but challenging task in genomics. It is particularly difficult for single molecule sequencing, which has a per-nucleotide error rate of ~5-15%. Meeting this demand, we developed Clairvoyante, a multi-task five-layer convolutional neural network model for predicting variant type (SNP or indel), zygosity, alternative allele and indel length from aligned reads. 
For the well-characterized NA12878 human sample, Clairvoyante achieves 99.67, 95.78, 90.53% F1-score on 1KP common variants, and 98.65, 92.57, 87.26% F1-score for whole-genome analysis, using Illumina, PacBio, and Oxford Nanopore data, respectively. Training on a second human sample shows Clairvoyante is sample agnostic and finds variants in less than 2 h on a standard server. Furthermore, we present 3,135 variants that are missed using Illumina but supported independently by both PacBio and Oxford Nanopore reads. Clairvoyante is available open-source ( https://github.com/aquaskyline/Clairvoyante ), with modules to train, utilize and visualize the model.&lt;/p&gt;</style></abstract><issue><style face="normal" font="default" size="100%">1</style></issue><custom1><style face="normal" font="default" size="100%">https://www.ncbi.nlm.nih.gov/pubmed/30824707?dopt=Abstract</style></custom1></record><record><source-app name="Biblio" version="7.x">Drupal-Biblio</source-app><ref-type>17</ref-type><contributors><authors><author><style face="normal" font="default" size="100%">Alonge, Michael</style></author><author><style face="normal" font="default" size="100%">Soyk, Sebastian</style></author><author><style face="normal" font="default" size="100%">Ramakrishnan, Srividya</style></author><author><style face="normal" font="default" size="100%">Wang, Xingang</style></author><author><style face="normal" font="default" size="100%">Goodwin, Sara</style></author><author><style face="normal" font="default" size="100%">Sedlazeck, Fritz J</style></author><author><style face="normal" font="default" size="100%">Lippman, Zachary B</style></author><author><style face="normal" font="default" size="100%">Schatz, Michael C</style></author></authors></contributors><titles><title><style face="normal" font="default" size="100%">RaGOO: fast and accurate reference-guided scaffolding of draft genomes.</style></title><secondary-title><style face="normal" font="default" size="100%">Genome 
Biol</style></secondary-title><alt-title><style face="normal" font="default" size="100%">Genome Biol.</style></alt-title></titles><dates><year><style  face="normal" font="default" size="100%">2019</style></year><pub-dates><date><style  face="normal" font="default" size="100%">2019 Oct 28</style></date></pub-dates></dates><volume><style face="normal" font="default" size="100%">20</style></volume><pages><style face="normal" font="default" size="100%">224</style></pages><language><style face="normal" font="default" size="100%">eng</style></language><abstract><style face="normal" font="default" size="100%">&lt;p&gt;We present RaGOO, a reference-guided contig ordering and orienting tool that leverages the speed and sensitivity of Minimap2 to accurately achieve chromosome-scale assemblies in minutes. After the pseudomolecules are constructed, RaGOO identifies structural variants, including those spanning sequencing gaps. We show that RaGOO accurately orders and orients 3 de novo tomato genome assemblies, including the widely used M82 reference cultivar. We then demonstrate the scalability and utility of RaGOO with a pan-genome analysis of 103 Arabidopsis thaliana accessions by examining the structural variants detected in the newly assembled pseudomolecules. 
RaGOO is available open source at https://github.com/malonge/RaGOO .&lt;/p&gt;</style></abstract><issue><style face="normal" font="default" size="100%">1</style></issue><custom1><style face="normal" font="default" size="100%">https://www.ncbi.nlm.nih.gov/pubmed/31661016?dopt=Abstract</style></custom1></record><record><source-app name="Biblio" version="7.x">Drupal-Biblio</source-app><ref-type>17</ref-type><contributors><authors><author><style face="normal" font="default" size="100%">Sedlazeck, Fritz J</style></author><author><style face="normal" font="default" size="100%">Rescheneder, Philipp</style></author><author><style face="normal" font="default" size="100%">Smolka, Moritz</style></author><author><style face="normal" font="default" size="100%">Fang, Han</style></author><author><style face="normal" font="default" size="100%">Nattestad, Maria</style></author><author><style face="normal" font="default" size="100%">von Haeseler, Arndt</style></author><author><style face="normal" font="default" size="100%">Schatz, Michael C</style></author></authors></contributors><titles><title><style face="normal" font="default" size="100%">Accurate detection of complex structural variations using single-molecule sequencing.</style></title><secondary-title><style face="normal" font="default" size="100%">Nat Methods</style></secondary-title><alt-title><style face="normal" font="default" size="100%">Nat Methods</style></alt-title></titles><keywords><keyword><style  face="normal" font="default" size="100%">DNA Mutational Analysis</style></keyword><keyword><style  face="normal" font="default" size="100%">Genome, Human</style></keyword><keyword><style  face="normal" font="default" size="100%">Genomics</style></keyword><keyword><style  face="normal" font="default" size="100%">High-Throughput Nucleotide Sequencing</style></keyword><keyword><style  face="normal" font="default" size="100%">Humans</style></keyword><keyword><style  face="normal" font="default" size="100%">Sequence Analysis, 
DNA</style></keyword></keywords><dates><year><style  face="normal" font="default" size="100%">2018</style></year><pub-dates><date><style  face="normal" font="default" size="100%">2018 Jun</style></date></pub-dates></dates><volume><style face="normal" font="default" size="100%">15</style></volume><pages><style face="normal" font="default" size="100%">461-468</style></pages><language><style face="normal" font="default" size="100%">eng</style></language><abstract><style face="normal" font="default" size="100%">&lt;p&gt;Structural variations are the greatest source of genetic variation, but they remain poorly understood because of technological limitations. Single-molecule long-read sequencing has the potential to dramatically advance the field, although high error rates are a challenge with existing methods. Addressing this need, we introduce open-source methods for long-read alignment (NGMLR; https://github.com/philres/ngmlr ) and structural variant identification (Sniffles; https://github.com/fritzsedlazeck/Sniffles ) that provide unprecedented sensitivity and precision for variant detection, even in repeat-rich regions and for complex nested events that can have substantial effects on human health. In several long-read datasets, including healthy and cancerous human genomes, we discovered thousands of novel variants and categorized systematic errors in short-read approaches. 
NGMLR and Sniffles can automatically filter false events and operate on low-coverage data, thereby reducing the high costs that have hindered the application of long reads in clinical and research settings.&lt;/p&gt;</style></abstract><issue><style face="normal" font="default" size="100%">6</style></issue><custom1><style face="normal" font="default" size="100%">https://www.ncbi.nlm.nih.gov/pubmed/29713083?dopt=Abstract</style></custom1></record><record><source-app name="Biblio" version="7.x">Drupal-Biblio</source-app><ref-type>17</ref-type><contributors><authors><author><style face="normal" font="default" size="100%">Nattestad, Maria</style></author><author><style face="normal" font="default" size="100%">Goodwin, Sara</style></author><author><style face="normal" font="default" size="100%">Ng, Karen</style></author><author><style face="normal" font="default" size="100%">Baslan, Timour</style></author><author><style face="normal" font="default" size="100%">Sedlazeck, Fritz J</style></author><author><style face="normal" font="default" size="100%">Rescheneder, Philipp</style></author><author><style face="normal" font="default" size="100%">Garvin, Tyler</style></author><author><style face="normal" font="default" size="100%">Fang, Han</style></author><author><style face="normal" font="default" size="100%">Gurtowski, James</style></author><author><style face="normal" font="default" size="100%">Hutton, Elizabeth</style></author><author><style face="normal" font="default" size="100%">Tseng, Elizabeth</style></author><author><style face="normal" font="default" size="100%">Chin, Chen-Shan</style></author><author><style face="normal" font="default" size="100%">Beck, Timothy</style></author><author><style face="normal" font="default" size="100%">Sundaravadanam, Yogi</style></author><author><style face="normal" font="default" size="100%">Kramer, Melissa</style></author><author><style face="normal" font="default" size="100%">Antoniou, Eric</style></author><author><style 
face="normal" font="default" size="100%">McPherson, John D</style></author><author><style face="normal" font="default" size="100%">Hicks, James</style></author><author><style face="normal" font="default" size="100%">McCombie, W Richard</style></author><author><style face="normal" font="default" size="100%">Schatz, Michael C</style></author></authors></contributors><titles><title><style face="normal" font="default" size="100%">Complex rearrangements and oncogene amplifications revealed by long-read DNA and RNA sequencing of a breast cancer cell line.</style></title><secondary-title><style face="normal" font="default" size="100%">Genome Res</style></secondary-title><alt-title><style face="normal" font="default" size="100%">Genome Res.</style></alt-title></titles><keywords><keyword><style  face="normal" font="default" size="100%">Breast Neoplasms</style></keyword><keyword><style  face="normal" font="default" size="100%">Female</style></keyword><keyword><style  face="normal" font="default" size="100%">Gene Amplification</style></keyword><keyword><style  face="normal" font="default" size="100%">Gene Rearrangement</style></keyword><keyword><style  face="normal" font="default" size="100%">Genome, Human</style></keyword><keyword><style  face="normal" font="default" size="100%">Genomic Structural Variation</style></keyword><keyword><style  face="normal" font="default" size="100%">High-Throughput Nucleotide Sequencing</style></keyword><keyword><style  face="normal" font="default" size="100%">Humans</style></keyword><keyword><style  face="normal" font="default" size="100%">MCF-7 Cells</style></keyword><keyword><style  face="normal" font="default" size="100%">Oncogenes</style></keyword><keyword><style  face="normal" font="default" size="100%">Receptor, ErbB-2</style></keyword><keyword><style  face="normal" font="default" size="100%">Repetitive Sequences, Nucleic Acid</style></keyword><keyword><style  face="normal" font="default" 
size="100%">Transcriptome</style></keyword></keywords><dates><year><style  face="normal" font="default" size="100%">2018</style></year><pub-dates><date><style  face="normal" font="default" size="100%">2018 08</style></date></pub-dates></dates><volume><style face="normal" font="default" size="100%">28</style></volume><pages><style face="normal" font="default" size="100%">1126-1135</style></pages><language><style face="normal" font="default" size="100%">eng</style></language><abstract><style face="normal" font="default" size="100%">&lt;p&gt;The SK-BR-3 cell line is one of the most important models for HER2+ breast cancers, which affect one in five breast cancer patients. SK-BR-3 is known to be highly rearranged, although much of the variation is in complex and repetitive regions that may be underreported. Addressing this, we sequenced SK-BR-3 using long-read single molecule sequencing from Pacific Biosciences and develop one of the most detailed maps of structural variations (SVs) in a cancer genome available, with nearly 20,000 variants present, most of which were missed by short-read sequencing. Surrounding the important ERBB2 oncogene (also known as HER2), we discover a complex sequence of nested duplications and translocations, suggesting a punctuated progression. Full-length transcriptome sequencing further revealed several novel gene fusions within the nested genomic variants. Combining long-read genome and transcriptome sequencing enables an in-depth analysis of how SVs disrupt the genome and sheds new light on the complex mechanisms involved in cancer genome evolution.&lt;/p&gt;</style></abstract><issue><style face="normal" font="default" size="100%">8</style></issue><custom1><style face="normal" font="default" size="100%">http://www.ncbi.nlm.nih.gov/pubmed/29954844?dopt=Abstract</style></custom1></record></records></xml>