Please use this permanent URL to cite or link to this document:
https://irlib.pccu.edu.tw/handle/987654321/36133
|
Title: | De novo assembly of highly polymorphic metagenomic data using in situ generated reference sequences and a novel BLAST-based assembly pipeline |
Authors: | Lin, YY (Lin, You-Yu); Hsieh, CH (Hsieh, Chia-Hung); Chen, JH (Chen, Jiun-Hong); Lu, XM (Lu, Xuemei); Kao, JH (Kao, Jia-Horng); Chen, PJ (Chen, Pei-Jer); Chen, DS (Chen, Ding-Shinn); Wang, HY (Wang, Hurng-Yi) |
Contributor: | Department of Forestry and Nature Conservation |
Keywords: | Next generation sequencing; Metagenomics; Hepatitis B virus; Sequence assembly; Assembly pipeline |
Date: | 2017-04 |
Upload time: | 2017-06-08 10:50:25 (UTC+8) |
Abstract: | Background: The accuracy of metagenomic assembly is usually compromised by high levels of polymorphism, because divergent reads from the same genomic region are recognized as different loci when sequenced and assembled together. A viral quasispecies is a group of abundant, diverse, genetically related viruses found in a single carrier. Current mainstream assembly methods, such as Velvet and SOAPdenovo, were not originally intended for the assembly of such metagenomic data; new methods are therefore needed to provide accurate and informative assembly results for metagenomic data.
Results: In this study, we present a hybrid method for assembling highly polymorphic data that combines the partial de novo-reference assembly (PDR) strategy with the BLAST-based assembly pipeline (BBAP). The PDR strategy generates in situ reference sequences through de novo assembly of a randomly extracted partial data set; these sequences are subsequently used as the reference for assembling the full data set. BBAP employs a greedy algorithm to assemble polymorphic reads. We used 12 hepatitis B virus (HBV) quasispecies NGS data sets from a previous study to assess and compare the performance of both PDR and BBAP. Analyses suggest that the high polymorphism of a full metagenomic data set leads to fragmented de novo assembly results, whereas external reference sequences, with their biased or limited representation, incorporated fewer reads into the assembly and gave lower assembly accuracy and variation sensitivity. In comparison, the in situ reference sequence generated by PDR incorporated more reads into the final PDR assembly of the full metagenomic data set, with greater accuracy and higher variation sensitivity. BBAP assembly results also suggest higher assembly efficiency and accuracy compared to other assembly methods. Additionally, BBAP assembly recovered HBV structural variants that were not observed among the assembly results of other methods. Together, PDR/BBAP assembly results were significantly better than those of the other methods compared.
Conclusions: Both PDR and BBAP independently increased the assembly efficiency and accuracy of highly polymorphic data, and assembly performance improved further when the two were used together. BBAP also provides nucleotide frequency information. Together, PDR and BBAP provide powerful tools for metagenomic data studies. |
Relation: | BMC Bioinformatics, Volume: 18, Article Number: 223 |
Appears in Collections: | [Department of Forestry and Nature Conservation] Journal Articles
|
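The abstract describes two ideas that a short example may help make concrete: PDR starts from a randomly extracted partial data set, and BBAP reports nucleotide frequency information. The following Python sketch illustrates both under stated assumptions. It is not the authors' implementation: the function names `subsample_reads` and `nucleotide_frequencies`, the `fraction` and `seed` parameters, and the gap-free `(start, sequence)` alignment format are all hypothetical, and the de novo and reference assembly steps themselves (which would be run with external tools) are not shown.

```python
import random
from collections import Counter


def subsample_reads(reads, fraction, seed=0):
    """Randomly draw a fraction of the reads without replacement.

    Hypothetical helper illustrating the 'randomly extracted partial
    data set' step; the fraction used in the paper is not assumed here.
    """
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be in (0, 1]")
    if not reads:
        return []
    rng = random.Random(seed)  # fixed seed so the draw is reproducible
    k = max(1, round(len(reads) * fraction))
    return rng.sample(reads, k)


def nucleotide_frequencies(aligned_reads):
    """Per-position base frequencies from gap-free aligned reads.

    aligned_reads: iterable of (start, sequence) pairs, where start is
    the 0-based reference position of the first base (assumed format).
    """
    columns = {}
    for start, seq in aligned_reads:
        for offset, base in enumerate(seq):
            columns.setdefault(start + offset, Counter())[base] += 1
    return {
        pos: {base: n / sum(counts.values()) for base, n in counts.items()}
        for pos, counts in sorted(columns.items())
    }


if __name__ == "__main__":
    reads = [f"read{i}" for i in range(100)]
    partial = subsample_reads(reads, fraction=0.1, seed=1)
    print(len(partial), "reads selected for the partial de novo step")

    toy_alignment = [(0, "ACG"), (1, "CGT"), (1, "TGT")]
    for pos, freqs in nucleotide_frequencies(toy_alignment).items():
        print(pos, freqs)
```

In a workflow of this shape, the subsampled reads would be passed to a de novo assembler, the resulting contigs would serve as the in situ reference for the full data set, and per-position frequencies such as these would summarize the polymorphism among the mapped reads.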
Files in this item:
File | Description | Size | Format | Views |
index.html | | 0Kb | HTML | 414 |
|
All items in CCUR are protected by original copyright.
|