Worksheet - Menu Tool: Transcriptome post-processing

This tab lets you perform downstream curation of transcriptome sequence data in two different ways.

Filter best isoform

By selecting the option “Filter best isoform” you can read the annotation CSV file of the whole transcriptome under analysis and then define one or more filters to select the most representative sequences among the distinct cDNAs annotated for each transcribed gene. The selection uses a scoring algorithm that is a normalized combination of the most relevant BLAST statistics, such as the high-scoring segment pairs (HSPs) of both the query and the hit, the similarity, the inverse of the E-value, and the sequencing depth. You can base the filter on any one of these statistics or on all of them.
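As an illustration of what a normalized combination of statistics could look like, here is a minimal Python sketch. It is not the tool's actual formula: the field names (`gene`, `id`, `query_hsp`, `hit_hsp`, `similarity`, `evalue`, `depth`), the min-max normalization within each gene, the equal weighting, and the E-value floor are all assumptions made for the example.

```python
from collections import defaultdict

# Hypothetical statistic names; the real CSV columns may differ.
STATS = ("query_hsp", "hit_hsp", "similarity", "inv_evalue", "depth")

def add_inverse_evalue(row, floor=1e-180):
    """Return a copy of row with inv_evalue = 1 / max(evalue, floor)."""
    out = dict(row)
    out["inv_evalue"] = 1.0 / max(row["evalue"], floor)
    return out

def best_isoforms(rows, stats=STATS):
    """Pick the highest-scoring row per gene.

    Each statistic is min-max normalized within the gene, and the score
    is the unweighted mean of the normalized values (an assumption).
    """
    by_gene = defaultdict(list)
    for row in rows:
        by_gene[row["gene"]].append(add_inverse_evalue(row))

    best = {}
    for gene, group in by_gene.items():
        scores = [0.0] * len(group)
        for stat in stats:
            values = [r[stat] for r in group]
            lo, hi = min(values), max(values)
            span = hi - lo
            for i, v in enumerate(values):
                # When all isoforms tie on a statistic, it contributes equally.
                scores[i] += ((v - lo) / span if span else 1.0) / len(stats)
        winner = max(range(len(group)), key=scores.__getitem__)
        best[gene] = group[winner]["id"]
    return best
```

Passing a shorter `stats` tuple, for example `("similarity",)`, mimics filtering on a single statistic instead of all of them.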

You can also filter the clusters by positional redundancy. This detects all non-overlapping sets of isotigs/contigs belonging to a partially characterized gene and then selects the best isoform within each of these non-overlapping sets (marked with a red circle in the figure).
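The grouping step can be pictured as merging overlapping intervals. The following Python sketch assumes that each isotig/contig carries start and end coordinates of its alignment on the matched reference (the manual does not state which coordinates are used), and the function name is hypothetical.

```python
def non_overlapping_sets(isotigs):
    """Group isotigs whose alignment intervals overlap.

    isotigs: iterable of (id, start, end) with inclusive coordinates.
    Returns a list of groups; groups do not overlap one another.
    """
    groups = []
    current_end = None
    for iso_id, start, end in sorted(isotigs, key=lambda t: (t[1], t[2])):
        if current_end is None or start > current_end:
            # No overlap with the running group: open a new set.
            groups.append([iso_id])
            current_end = end
        else:
            # Overlaps the running group: extend it.
            groups[-1].append(iso_id)
            current_end = max(current_end, end)
    return groups
```

Under these assumptions, the best isoform would then be chosen separately inside each returned group.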

Sequence trimming

Using this utility, you can upload both the FASTA file with your cDNA sequences and its associated annotation CSV file, and then trim the FASTA sequences according to two options:

Finally, the tool lets you label the sequences as “full-length cDNAs”, “Partial Sequence” or “Related Domain”, depending on the shared core, and trim each sequence to remove the regions upstream of the start codon and downstream of the stop codon, or upstream and downstream of the defined core.
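The trimming itself amounts to keeping only the stretch between two coordinates. The sketch below assumes that the annotation supplies, for each sequence, a 1-based inclusive start and end position (start codon to stop codon, or the boundaries of the core); the function names and the dictionary layout are hypothetical.

```python
def trim_sequence(seq, keep_start, keep_end):
    """Keep positions keep_start..keep_end of seq (1-based, inclusive)."""
    if not 1 <= keep_start <= keep_end <= len(seq):
        raise ValueError("coordinates fall outside the sequence")
    return seq[keep_start - 1:keep_end]

def trim_records(records, coords):
    """Trim every record that has coordinates.

    records: dict mapping sequence id -> sequence string.
    coords:  dict mapping sequence id -> (keep_start, keep_end).
    Records without coordinates are left out of the result.
    """
    trimmed = {}
    for rec_id, seq in records.items():
        if rec_id in coords:
            trimmed[rec_id] = trim_sequence(seq, *coords[rec_id])
    return trimmed
```

For example, a sequence `GGATGAAATAGCC` with coordinates (3, 11) would be reduced to the segment running from `ATG` to `TAG`.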


