<?xml version="1.0" encoding="UTF-8"?><xml><records><record><source-app name="Biblio" version="7.x">Drupal-Biblio</source-app><ref-type>17</ref-type><contributors><authors><author><style face="normal" font="default" size="100%">De Coster, Wouter</style></author><author><style face="normal" font="default" size="100%">Weissensteiner, Matthias H</style></author><author><style face="normal" font="default" size="100%">Sedlazeck, Fritz J</style></author></authors></contributors><titles><title><style face="normal" font="default" size="100%">Towards population-scale long-read sequencing.</style></title><secondary-title><style face="normal" font="default" size="100%">Nat Rev Genet</style></secondary-title><alt-title><style face="normal" font="default" size="100%">Nat Rev Genet</style></alt-title></titles><keywords><keyword><style  face="normal" font="default" size="100%">Computational Biology</style></keyword><keyword><style  face="normal" font="default" size="100%">Genome, Human</style></keyword><keyword><style  face="normal" font="default" size="100%">Genomics</style></keyword><keyword><style  face="normal" font="default" size="100%">High-Throughput Nucleotide Sequencing</style></keyword><keyword><style  face="normal" font="default" size="100%">Humans</style></keyword><keyword><style  face="normal" font="default" size="100%">Industrial Development</style></keyword><keyword><style  face="normal" font="default" size="100%">Sequence Analysis, DNA</style></keyword></keywords><dates><year><style  face="normal" font="default" size="100%">2021</style></year><pub-dates><date><style  face="normal" font="default" size="100%">2021 09</style></date></pub-dates></dates><volume><style face="normal" font="default" size="100%">22</style></volume><pages><style face="normal" font="default" size="100%">572-587</style></pages><language><style face="normal" font="default" size="100%">eng</style></language><abstract><style face="normal" font="default" size="100%">&lt;p&gt;Long-read sequencing 
technologies have now reached a level of accuracy and yield that allows their application to variant detection at a scale of tens to thousands of samples. Concomitant with the development of new computational tools, the first population-scale studies involving long-read sequencing have emerged over the past 2 years and, given the continuous advancement of the field, many more are likely to follow. In this Review, we survey recent developments in population-scale long-read sequencing, highlight potential challenges of a scaled-up approach and provide guidance regarding experimental design. We provide an overview of current long-read sequencing platforms, variant calling methodologies and approaches for de novo assemblies and reference-based mapping approaches. Furthermore, we summarize strategies for variant validation, genotyping and predicting functional impact and emphasize challenges remaining in achieving long-read sequencing at a population scale.&lt;/p&gt;</style></abstract><issue><style face="normal" font="default" size="100%">9</style></issue><custom1><style face="normal" font="default" size="100%">https://www.ncbi.nlm.nih.gov/pubmed/34050336?dopt=Abstract</style></custom1></record><record><source-app name="Biblio" version="7.x">Drupal-Biblio</source-app><ref-type>17</ref-type><contributors><authors><author><style face="normal" font="default" size="100%">Shen, Feichen</style></author><author><style face="normal" font="default" size="100%">Kidd, Jeffrey M</style></author></authors></contributors><titles><title><style face="normal" font="default" size="100%">Rapid, Paralog-Sensitive CNV Analysis of 2457 Human Genomes Using QuicK-mer2.</style></title><secondary-title><style face="normal" font="default" size="100%">Genes (Basel)</style></secondary-title><alt-title><style face="normal" font="default" size="100%">Genes (Basel)</style></alt-title></titles><keywords><keyword><style  face="normal" font="default" 
size="100%">Algorithms</style></keyword><keyword><style  face="normal" font="default" size="100%">Computational Biology</style></keyword><keyword><style  face="normal" font="default" size="100%">DNA Copy Number Variations</style></keyword><keyword><style  face="normal" font="default" size="100%">Evolution, Molecular</style></keyword><keyword><style  face="normal" font="default" size="100%">Gene Duplication</style></keyword><keyword><style  face="normal" font="default" size="100%">Genome, Human</style></keyword><keyword><style  face="normal" font="default" size="100%">Humans</style></keyword><keyword><style  face="normal" font="default" size="100%">Sequence Analysis, DNA</style></keyword></keywords><dates><year><style  face="normal" font="default" size="100%">2020</style></year><pub-dates><date><style  face="normal" font="default" size="100%">2020 01 29</style></date></pub-dates></dates><volume><style face="normal" font="default" size="100%">11</style></volume><language><style face="normal" font="default" size="100%">eng</style></language><abstract><style face="normal" font="default" size="100%">&lt;p&gt;Gene duplication is a major mechanism for the evolution of gene novelty, and copy-number variation makes a major contribution to inter-individual genetic diversity. However, most approaches for studying copy-number variation rely upon uniquely mapping reads to a genome reference and are unable to distinguish among duplicated sequences. Specialized approaches to interrogate specific paralogs are comparatively slow and have a high degree of computational complexity, limiting their effective application to emerging population-scale data sets. We present QuicK-mer2, a self-contained, mapping-free approach that enables the rapid construction of paralog-specific copy-number maps from short-read sequence data. 
This approach is based on the tabulation of unique k-mer sequences from short-read data sets, and is able to analyze a 20X coverage human genome in approximately 20 min. We applied our approach to newly released sequence data from the 1000 Genomes Project, constructed paralog-specific copy-number maps from 2457 unrelated individuals, and uncovered copy-number variation of paralogous genes. We identify nine genes where none of the analyzed samples have a copy number of two, 92 genes where the majority of samples have a copy number other than two, and describe rare copy number variation affecting multiple genes at the APOBEC3 locus.&lt;/p&gt;</style></abstract><issue><style face="normal" font="default" size="100%">2</style></issue><custom1><style face="normal" font="default" size="100%">https://www.ncbi.nlm.nih.gov/pubmed/32013076?dopt=Abstract</style></custom1></record><record><source-app name="Biblio" version="7.x">Drupal-Biblio</source-app><ref-type>17</ref-type><contributors><authors><author><style face="normal" font="default" size="100%">Luo, Ruibang</style></author><author><style face="normal" font="default" size="100%">Sedlazeck, Fritz J</style></author><author><style face="normal" font="default" size="100%">Lam, Tak-Wah</style></author><author><style face="normal" font="default" size="100%">Schatz, Michael C</style></author></authors></contributors><titles><title><style face="normal" font="default" size="100%">A multi-task convolutional deep neural network for variant calling in single molecule sequencing.</style></title><secondary-title><style face="normal" font="default" size="100%">Nat Commun</style></secondary-title><alt-title><style face="normal" font="default" size="100%">Nat Commun</style></alt-title></titles><keywords><keyword><style  face="normal" font="default" size="100%">Base Sequence</style></keyword><keyword><style  face="normal" font="default" size="100%">Computational Biology</style></keyword><keyword><style  face="normal" font="default" 
size="100%">DNA Mutational Analysis</style></keyword><keyword><style  face="normal" font="default" size="100%">Genome, Human</style></keyword><keyword><style  face="normal" font="default" size="100%">Genome-Wide Association Study</style></keyword><keyword><style  face="normal" font="default" size="100%">Genomics</style></keyword><keyword><style  face="normal" font="default" size="100%">Genotype</style></keyword><keyword><style  face="normal" font="default" size="100%">Genotyping Techniques</style></keyword><keyword><style  face="normal" font="default" size="100%">Humans</style></keyword><keyword><style  face="normal" font="default" size="100%">INDEL Mutation</style></keyword><keyword><style  face="normal" font="default" size="100%">Nanopores</style></keyword><keyword><style  face="normal" font="default" size="100%">Neural Networks (Computer)</style></keyword><keyword><style  face="normal" font="default" size="100%">Polymorphism, Single Nucleotide</style></keyword><keyword><style  face="normal" font="default" size="100%">Sequence Analysis, DNA</style></keyword><keyword><style  face="normal" font="default" size="100%">Software</style></keyword></keywords><dates><year><style  face="normal" font="default" size="100%">2019</style></year><pub-dates><date><style  face="normal" font="default" size="100%">2019 03 01</style></date></pub-dates></dates><volume><style face="normal" font="default" size="100%">10</style></volume><pages><style face="normal" font="default" size="100%">998</style></pages><language><style face="normal" font="default" size="100%">eng</style></language><abstract><style face="normal" font="default" size="100%">&lt;p&gt;The accurate identification of DNA sequence variants is an important, but challenging task in genomics. It is particularly difficult for single molecule sequencing, which has a per-nucleotide error rate of ~5-15%. 
Meeting this demand, we developed Clairvoyante, a multi-task five-layer convolutional neural network model for predicting variant type (SNP or indel), zygosity, alternative allele and indel length from aligned reads. For the well-characterized NA12878 human sample, Clairvoyante achieves 99.67, 95.78, 90.53% F1-score on 1KP common variants, and 98.65, 92.57, 87.26% F1-score for whole-genome analysis, using Illumina, PacBio, and Oxford Nanopore data, respectively. Training on a second human sample shows Clairvoyante is sample agnostic and finds variants in less than 2 h on a standard server. Furthermore, we present 3,135 variants that are missed using Illumina but supported independently by both PacBio and Oxford Nanopore reads. Clairvoyante is available open-source ( https://github.com/aquaskyline/Clairvoyante ), with modules to train, utilize and visualize the model.&lt;/p&gt;</style></abstract><issue><style face="normal" font="default" size="100%">1</style></issue><custom1><style face="normal" font="default" size="100%">https://www.ncbi.nlm.nih.gov/pubmed/30824707?dopt=Abstract</style></custom1></record><record><source-app name="Biblio" version="7.x">Drupal-Biblio</source-app><ref-type>17</ref-type><contributors><authors><author><style face="normal" font="default" size="100%">Huang, Yi-Fei</style></author><author><style face="normal" font="default" size="100%">Gulko, Brad</style></author><author><style face="normal" font="default" size="100%">Siepel, Adam</style></author></authors></contributors><titles><title><style face="normal" font="default" size="100%">Fast, scalable prediction of deleterious noncoding variants from functional and population genomic data.</style></title><secondary-title><style face="normal" font="default" size="100%">Nat Genet</style></secondary-title><alt-title><style face="normal" font="default" size="100%">Nat Genet</style></alt-title></titles><keywords><keyword><style  face="normal" font="default" 
size="100%">Animals</style></keyword><keyword><style  face="normal" font="default" size="100%">Base Sequence</style></keyword><keyword><style  face="normal" font="default" size="100%">Computational Biology</style></keyword><keyword><style  face="normal" font="default" size="100%">Evolution, Molecular</style></keyword><keyword><style  face="normal" font="default" size="100%">Genetic Variation</style></keyword><keyword><style  face="normal" font="default" size="100%">Genome</style></keyword><keyword><style  face="normal" font="default" size="100%">Humans</style></keyword><keyword><style  face="normal" font="default" size="100%">Mammals</style></keyword><keyword><style  face="normal" font="default" size="100%">Metagenomics</style></keyword><keyword><style  face="normal" font="default" size="100%">Phenotype</style></keyword><keyword><style  face="normal" font="default" size="100%">Primates</style></keyword><keyword><style  face="normal" font="default" size="100%">Vertebrates</style></keyword></keywords><dates><year><style  face="normal" font="default" size="100%">2017</style></year><pub-dates><date><style  face="normal" font="default" size="100%">2017 Apr</style></date></pub-dates></dates><volume><style face="normal" font="default" size="100%">49</style></volume><pages><style face="normal" font="default" size="100%">618-624</style></pages><language><style face="normal" font="default" size="100%">eng</style></language><abstract><style face="normal" font="default" size="100%">&lt;p&gt;Many genetic variants that influence phenotypes of interest are located outside of protein-coding genes, yet existing methods for identifying such variants have poor predictive power. Here we introduce a new computational method, called LINSIGHT, that substantially improves the prediction of noncoding nucleotide sites at which mutations are likely to have deleterious fitness consequences, and which, therefore, are likely to be phenotypically important. 
LINSIGHT combines a generalized linear model for functional genomic data with a probabilistic model of molecular evolution. The method is fast and highly scalable, enabling it to exploit the 'big data' available in modern genomics. We show that LINSIGHT outperforms the best available methods in identifying human noncoding variants associated with inherited diseases. In addition, we apply LINSIGHT to an atlas of human enhancers and show that the fitness consequences at enhancers depend on cell type, tissue specificity, and constraints at associated promoters.&lt;/p&gt;</style></abstract><issue><style face="normal" font="default" size="100%">4</style></issue><custom1><style face="normal" font="default" size="100%">https://www.ncbi.nlm.nih.gov/pubmed/28288115?dopt=Abstract</style></custom1></record></records></xml>