TY - JOUR
ID - 50305
TI - Clustering of Short Read Sequences for de novo Transcriptome Assembly
JO - Progress in Biological Sciences
JA - PBS
LA - en
SN - 1016-1058
AU - Saadat, Samaneh
AU - Safikhani, Zhaleh
AU - Badie, Kambiz
AU - Sadeghi, Mehdi
AD - Department of Algorithms and Computation, University of Tehran, Tehran, Iran
AD - Institute of Biochemistry and Biophysics, University of Tehran, Tehran, Iran
AD - National Telecom Research Center, Tehran, Iran
AD - National Institute of Genetic Engineering and Biotechnology (NIGEB), Tehran 14155-6346, Iran.
Y1 - 2014
PY - 2014
VL - 4
IS - 1
SP - 43
EP - 52
KW - De novo
KW - Next generation sequencing
KW - RNA-Seq
KW - transcriptome assembly
DO - 10.22059/pbs.2014.50305
N2 - Given the importance of transcriptome analysis in various biological studies and considering the vast amount of whole transcriptome sequencing data, it seems necessary to develop an algorithm to assemble transcriptome data. In this study we propose an algorithm for transcriptome assembly in the absence of a reference genome. First, the contiguous sequences are generated using de Bruijn graph with different k-mer lengths. Then, the eclectic mixtures of sequences are gathered in order to form the final sequences. Lastly, the contiguous sequences are clustered and the isoform groups are provided. This proposed algorithm is capable of generating long contiguous sequences and accurately clustering them into isoform groups. To evaluate our algorithm, we applied it to a simulated RNA-seq dataset of rat transcriptome and a real RNA-seq experiment of the Loricaria gr. cataphracta transcriptome. The correctness of the assembled contigs was more than 95%, and our algorithm was able to reconstruct over 70% of the transcripts at more than 80% of the transcripts’ lengths. This study demonstrates that applying a sophisticated merging method improves transcriptome assembly. The source code is available upon request by contacting the corresponding author by email.
UR - https://pbiosci.ut.ac.ir/article_50305.html
L1 - https://pbiosci.ut.ac.ir/article_50305_0020397eff8270749dd28d95054cf66e.pdf
ER -