Chinese Culture University Institutional Repository (CCUR): Item 987654321/36133


    Please use this identifier to cite or link to this item: https://irlib.pccu.edu.tw/handle/987654321/36133


    Title: De novo assembly of highly polymorphic metagenomic data using in situ generated reference sequences and a novel BLAST-based assembly pipeline
    Authors: Lin, YY (Lin, You-Yu)
    Hsieh, CH (Hsieh, Chia-Hung)
    Chen, JH (Chen, Jiun-Hong)
    Lu, XM (Lu, Xuemei)
    Kao, JH (Kao, Jia-Horng)
    Chen, PJ (Chen, Pei-Jer)
    Chen, DS (Chen, Ding-Shinn)
    Wang, HY (Wang, Hurng-Yi)
    Contributors: Department of Forestry and Nature Conservation
    Keywords: Next generation sequencing
    Metagenomics
    Hepatitis B virus
    Sequence assembly
    Assembly pipeline
    Date: 2017-04
    Issue Date: 2017-06-08 10:50:25 (UTC+8)
    Abstract: Background: The accuracy of metagenomic assembly is usually compromised by high levels of polymorphism, because divergent reads from the same genomic region are recognized as different loci when sequenced and assembled together. A viral quasispecies is a group of abundant, diversified, genetically related viruses found in a single carrier. Current mainstream assembly methods, such as Velvet and SOAPdenovo, were not originally intended for the assembly of such metagenomic data, so new methods are needed to provide accurate and informative assembly results for metagenomic data.

    Results: In this study, we present a hybrid method for assembling highly polymorphic data that combines the partial de novo-reference assembly (PDR) strategy with the BLAST-based assembly pipeline (BBAP). The PDR strategy generates in situ reference sequences through de novo assembly of a randomly extracted partial data set; these sequences are subsequently used for the reference assembly of the full data set. BBAP employs a greedy algorithm to assemble polymorphic reads. We used 12 hepatitis B virus (HBV) quasispecies NGS data sets from a previous study to assess and compare the performance of both PDR and BBAP. Analyses suggest that the high polymorphism of a full metagenomic data set leads to fragmented de novo assembly results, whereas the biased or limited representation of external reference sequences incorporated fewer reads into the assembly, with lower assembly accuracy and variation sensitivity. In comparison, the PDR-generated in situ reference sequence incorporated more reads into the final PDR assembly of the full metagenomic data set, with greater accuracy and higher variation sensitivity. BBAP assembly results also suggest higher assembly efficiency and accuracy compared with other assembly methods. Additionally, BBAP assembly recovered HBV structural variants that were not observed among the assembly results of other methods. Together, PDR/BBAP assembly results were significantly better than those of the other methods compared.
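The PDR idea described above (draw a random subset of reads, assemble it de novo into an in situ reference, then reference-assemble the full data set against it) can be sketched as follows. This is only an illustrative outline, not the authors' implementation: the function names, the `fraction` and `seed` parameters, and the two assembler callables are all hypothetical placeholders for whatever tools are actually used.

```python
import random
from typing import Callable, List, Sequence


def extract_partial_reads(reads: Sequence[str], fraction: float, seed: int = 0) -> List[str]:
    """Randomly draw a fraction of the reads without replacement (hypothetical helper)."""
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be in (0, 1]")
    if not reads:
        return []
    # Keep at least one read so the downstream assembler has some input.
    k = max(1, round(len(reads) * fraction))
    return random.Random(seed).sample(list(reads), k)


def pdr_assemble(
    reads: Sequence[str],
    fraction: float,
    de_novo: Callable[[List[str]], List[str]],
    reference_assemble: Callable[[Sequence[str], List[str]], object],
    seed: int = 0,
):
    """Outline of the PDR flow; both assembler callables are assumptions."""
    partial = extract_partial_reads(reads, fraction, seed)
    # Step 1: de novo assembly of the partial data set yields in situ references.
    in_situ_reference = de_novo(partial)
    # Step 2: reference assembly of the FULL data set against those references.
    return reference_assemble(reads, in_situ_reference)
```

Passing the assemblers in as callables keeps the sketch independent of any particular tool; the abstract does not state which subsampling fraction was used, so `fraction` is left to the caller.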

    Conclusions: Both PDR and BBAP independently increased the assembly efficiency and accuracy of highly polymorphic data, and assembly performance improved further when the two were used together. BBAP also provides nucleotide frequency information. Together, PDR and BBAP provide powerful tools for metagenomic data studies.
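As a rough illustration of what per-position nucleotide frequency information looks like, the sketch below tallies base frequencies from reads that are assumed to be already aligned, gap-free, to a common coordinate system. The input format (a 0-based start offset plus the read sequence) and the function name are assumptions made for this example; the abstract does not describe how BBAP computes or reports these values.

```python
from collections import Counter
from typing import Dict, Iterable, List, Tuple


def nucleotide_frequencies(
    aligned_reads: Iterable[Tuple[int, str]], length: int
) -> List[Dict[str, float]]:
    """Return, for each position, the fraction of covering reads carrying each base.

    aligned_reads: (start, sequence) pairs, start being a 0-based offset
    on the reference (an assumed, simplified gap-free alignment format).
    """
    counts = [Counter() for _ in range(length)]
    for start, seq in aligned_reads:
        for offset, base in enumerate(seq):
            pos = start + offset
            if 0 <= pos < length:  # ignore bases that fall outside the reference
                counts[pos][base] += 1
    freqs: List[Dict[str, float]] = []
    for c in counts:
        total = sum(c.values())
        # Positions with no coverage get an empty mapping.
        freqs.append({b: n / total for b, n in c.items()} if total else {})
    return freqs
```

For example, with the reads `(0, "ACGT")`, `(0, "ACCT")` and `(2, "GT")` on a reference of length 4, position 2 is covered by G, C and G, so G has frequency 2/3 and C has 1/3.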
    Relation: BMC Bioinformatics, Volume 18, Article number 223
    Appears in Collections: [Department of Forestry and Nature Conservation] journal articles

    Files in This Item:

    File          Description    Size    Format
    index.html                   0Kb     HTML      415


    All items in CCUR are protected by copyright, with all rights reserved.

