Description
This track shows the transcripts assembled by StringTie2 using Full-Length Non-Chimeric (FLNC) PacBio Iso-Seq reads, Illumina RNA-Seq reads from adult females, and the RefSeq gene annotations. Coding regions within each transcript are identified by TransDecoder.
The alignments of the FLNC PacBio Iso-Seq reads were produced by minimap2. The alignments of the Illumina RNA-Seq reads from adult females were produced by HISAT2. The D. ananassae RefSeq gene annotations were obtained from NCBI under the Assembly accession number GCF_017639315.
TransDecoder was run with default parameters against the transcripts assembled by StringTie2. The open reading frames predicted by TransDecoder were searched against the Swiss-Prot database using NCBI blastp, and against the Pfam database using HMMER. An open reading frame is kept if it satisfies the TransDecoder.Predict filtering criteria, or if it shows a significant match to Swiss-Prot or Pfam.
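The retention rule for open reading frames is a logical OR over three lines of evidence. The following is a minimal Python sketch of that rule only, not the TransDecoder implementation. The `OrfEvidence` class, its field names, and the example identifiers are hypothetical and introduced for illustration; what counts as a "significant" match is assumed to have been decided upstream by the search tools.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class OrfEvidence:
    """Hypothetical per-ORF summary; field names are illustrative assumptions."""
    orf_id: str
    passes_predict_filter: bool  # satisfies the TransDecoder.Predict criteria
    has_swissprot_hit: bool      # significant blastp match to Swiss-Prot
    has_pfam_hit: bool           # significant HMMER match to Pfam


def keep_orf(orf: OrfEvidence) -> bool:
    """Keep an ORF if any one of the three lines of evidence supports it."""
    return (
        orf.passes_predict_filter
        or orf.has_swissprot_hit
        or orf.has_pfam_hit
    )


def retained_orf_ids(orfs: Iterable[OrfEvidence]) -> List[str]:
    """Return the IDs of the ORFs that would be kept, in input order."""
    return [orf.orf_id for orf in orfs if keep_orf(orf)]


# Made-up example records, not real data from this track.
example_orfs = [
    OrfEvidence("orf1", True, False, False),   # kept: passes the Predict filter
    OrfEvidence("orf2", False, True, False),   # kept: Swiss-Prot match only
    OrfEvidence("orf3", False, False, True),   # kept: Pfam match only
    OrfEvidence("orf4", False, False, False),  # dropped: no supporting evidence
]

if __name__ == "__main__":
    print(retained_orf_ids(example_orfs))
```

Under this rule, a homology match can rescue an open reading frame that would otherwise fail the TransDecoder.Predict criteria.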
Data Sources
PacBio Iso-Seq Data
- The PacBio Iso-Seq data were produced by Chinmay P. Rele from the Reed Lab at The University of Alabama.
Illumina Adult Females RNA-Seq Data
References
Chen ZX, et al. Comparative validation of the D. melanogaster modENCODE transcriptome annotation. Genome Res. 2014 Jul;24(7):1209-23.
Haas BJ, et al. De novo transcript sequence reconstruction from RNA-Seq using the Trinity platform for reference generation and analysis. Nat Protoc. 2013 Aug;8(8):1494-512.
Kim D, Langmead B, Salzberg SL. HISAT: a fast spliced aligner with low memory requirements. Nat Methods. 2015 Apr;12(4):357-60.
Li H. New strategies to improve minimap2 alignment accuracy. Bioinformatics. 2021 Oct 8;37(23):4572-4.
Shumate A, Wong B, Pertea G, Pertea M. Improved transcriptome assembly using a hybrid of long and short reads with StringTie. PLoS Comput Biol. 2022 Jun 1;18(6):e1009730.