Basic output flags

These output flags do not require any fasta or EnTAP files as input.


With --statistics, statistics are calculated from the gene table and printed to statistics.txt. This command is performed by task_scripts/. If a prefix is used, the statistics file will be named accordingly.

These are all the potential statistics in the reported format:

Number of genes:
Number of monoexonic genes:
Number of multiexonic genes:

Number of positive strand genes:
Number of negative strand genes:

Average overall gene size:
Median overall gene size:
Average overall CDS size:
Median overall CDS size:
Average overall exon size:
Median overall exon size:

Average size of monoexonic genes:
Median size of monoexonic genes:
Largest monoexonic gene:
Smallest monoexonic gene:

Average size of multiexonic genes:
Median size of multiexonic genes:
Largest multiexonic gene:
Smallest multiexonic gene:

Average size of multiexonic CDS:
Median size of multiexonic CDS:
Largest multiexonic CDS:
Smallest multiexonic CDS:

Average size of multiexonic exons:
Median size of multiexonic exons:
Average size of multiexonic introns:
Median size of multiexonic introns:

Average number of exons per multiexonic gene:
Median number of exons per multiexonic gene:
Largest multiexonic exon:
Smallest multiexonic exon:
Most exons in one gene:

Average number of introns per multiexonic gene:
Median number of introns per multiexonic gene:
Largest intron:
Smallest intron:

The following columns do not involve codons:
Number of complete models:
Number of 5’ only incomplete models:
Number of 3’ only incomplete models:
Number of 5’ and 3’ incomplete models:

If your set contains only monoexonic genes, a smaller version of the statistics will be printed that contains only the categories in which monoexonic genes are evaluated.
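As an illustration of how a few of the categories above could be derived, here is a minimal Python sketch. It is not the tool's own implementation: the in-memory `genes` structure, the helper names, and the choice to measure gene size as the genomic span from the first exon start to the last exon end are all assumptions made for this example. Multiexonic categories are added only when multiexonic genes are present, mirroring the reduced report described above.

```python
from statistics import mean, median

# Hypothetical gene models for illustration only; this is NOT the real gene table format.
# Each gene has a strand and a list of (start, end) exon coordinates (1-based, inclusive).
genes = [
    {"strand": "+", "exons": [(100, 399)]},
    {"strand": "-", "exons": [(1000, 1199), (1500, 1799)]},
    {"strand": "+", "exons": [(5000, 5099), (5300, 5399), (5900, 6099)]},
]

def gene_size(gene):
    """Genomic span of a gene (assumption: first exon start to last exon end)."""
    starts = [start for start, _ in gene["exons"]]
    ends = [end for _, end in gene["exons"]]
    return max(ends) - min(starts) + 1

def intron_sizes(gene):
    """Sizes of the gaps between consecutive exons, sorted by position."""
    exons = sorted(gene["exons"])
    return [nxt[0] - prev[1] - 1 for prev, nxt in zip(exons, exons[1:])]

def summarize(genes):
    """Return a dict of selected statistics for a non-empty list of genes."""
    mono = [g for g in genes if len(g["exons"]) == 1]
    multi = [g for g in genes if len(g["exons"]) > 1]
    sizes = [gene_size(g) for g in genes]
    stats = {
        "Number of genes": len(genes),
        "Number of monoexonic genes": len(mono),
        "Number of multiexonic genes": len(multi),
        "Number of positive strand genes": sum(g["strand"] == "+" for g in genes),
        "Number of negative strand genes": sum(g["strand"] == "-" for g in genes),
        "Average overall gene size": mean(sizes),
        "Median overall gene size": median(sizes),
    }
    if multi:  # skip multiexonic categories for a monoexonic-only set
        introns = [size for g in multi for size in intron_sizes(g)]
        exon_counts = [len(g["exons"]) for g in multi]
        stats.update({
            "Average number of exons per multiexonic gene": mean(exon_counts),
            "Most exons in one gene": max(exon_counts),
            "Largest intron": max(introns),
            "Smallest intron": min(introns),
        })
    return stats

stats = summarize(genes)
for label, value in stats.items():
    print(f"{label}: {value}")
```

Calling `summarize` on a list that holds only single-exon genes returns just the first seven entries, since there are no introns or exon counts to report.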


A statistical analysis of the gene table is run after every filtering step. The output uses the same format as regular --statistics, but it is printed to the log after the information line for each flag. To ensure statistics.txt is also created at the end, include --statistics in your command.


Identical to --create-gtf, but lacks start and stop codon information. This option is significantly faster.


An Ensembl v3 gff3 file will be created that contains mRNA, exon, and intron information. ID, Name, and Parent information will be shown.
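To make the role of the ID, Name, and Parent attributes concrete, the following Python sketch builds generic GFF3 lines for one gene model. It follows the general nine-column, tab-separated GFF3 layout, but it is only an illustration: the function names, the `example` source column, the identifier scheme (`gene1.exon1`, `gene1.intron1`), and the feature ordering are assumptions, and the file the tool actually writes may differ in these details.

```python
def gff3_line(seqid, ftype, start, end, strand, attributes, source="example"):
    """Build one tab-separated GFF3 feature line (score and phase left as '.')."""
    attr_text = ";".join(f"{key}={value}" for key, value in attributes.items())
    return "\t".join([seqid, source, ftype, str(start), str(end), ".", strand, ".", attr_text])

def model_to_gff3(seqid, model_id, strand, exons):
    """Return GFF3 lines for an mRNA, its exons, and the introns between them.

    `exons` is a list of (start, end) tuples, 1-based and inclusive.
    The ID naming scheme here is hypothetical.
    """
    exons = sorted(exons)
    lines = ["##gff-version 3"]
    # The mRNA spans from the first exon start to the last exon end.
    lines.append(gff3_line(seqid, "mRNA", exons[0][0], exons[-1][1], strand,
                           {"ID": model_id, "Name": model_id}))
    # Each exon points back to the mRNA through its Parent attribute.
    for i, (start, end) in enumerate(exons, start=1):
        feature_id = f"{model_id}.exon{i}"
        lines.append(gff3_line(seqid, "exon", start, end, strand,
                               {"ID": feature_id, "Name": feature_id, "Parent": model_id}))
    # Introns are the gaps between consecutive exons.
    for i, (prev, nxt) in enumerate(zip(exons, exons[1:]), start=1):
        feature_id = f"{model_id}.intron{i}"
        lines.append(gff3_line(seqid, "intron", prev[1] + 1, nxt[0] - 1, strand,
                               {"ID": feature_id, "Name": feature_id, "Parent": model_id}))
    return lines

lines = model_to_gff3("chr1", "gene1", "+", [(100, 199), (300, 399)])
print("\n".join(lines))
```

The Parent attribute is what ties each exon and intron line back to its mRNA, so downstream tools can rebuild the model from a flat file.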