faqs

Jon Palmer · Jon Palmer · commit 671e7fe8701d · 2016-12-19T11:53:41.000-06:00
diff --git a/docs/faqs.md b/docs/faqs.md
@@ -11,10 +11,10 @@ Prokaryotes are not supported -> use [Prokka](https://github.com/tseemann/prokka
 
 ###4) How does funannotate train Augustus?
 Training Augustus is not very easy.  There are several ways to do it in funannotate and the script will automatically pick a training path based on your input data.  For all of these training steps, the more evidence you can provide the better your training will be (`--protein_evidence` and `--transcript_evidence`). This is how the "logic" in the script is setup.  
-1) If you pass a valid pre-trained species to `--augustus_species` or there is already one trained (`--species "Aspergillus nidulans"` will essentially be turned into `--augustus_species aspergillus_nidulans`) then the scripts will NOT train Augustus and will use the pre-trained parameters.  Note you can check which species have been pretrained with `funannotate species`.
-2) If you provide a coordinate sorted BAM file via `--rna_bam`, Augustus and GeneMark will be trained using BRAKER1.  
-3) If you provide a PASA GFF file via `--pasa_gff` then Augustus will be trained using these PASA gene models. 
-4) If you don't have PASA or a RNAseq BAM file, then Augustus will be trained using BUSCO2.  The `--busco_seed_species` option is for passing the most closely related pre-trained Augustus species parameter to BUSCO2 to improve its de novo prediction.  Funannotate uses a modified training regime where it takes BUSCO2 'Complete' models, de novo GeneMarkES models, and evidence in those regions and runs EvidenceModeler to predict gene models.  The models are then confirmed using BUSCO2 and a subset are used for training Augustus.
+* If you pass a valid pre-trained species to `--augustus_species`, or one has already been trained (`--species "Aspergillus nidulans"` is essentially converted to `--augustus_species aspergillus_nidulans`), then the script will NOT train Augustus and will use the pre-trained parameters.  You can check which species have been pre-trained with `funannotate species`.
+* If you provide a coordinate-sorted BAM file via `--rna_bam`, Augustus and GeneMark will be trained using BRAKER1.
+* If you provide a PASA GFF file via `--pasa_gff`, then Augustus will be trained using these PASA gene models.
+* If you don't have PASA or an RNA-seq BAM file, then Augustus will be trained using BUSCO2.  The `--busco_seed_species` option passes the most closely related pre-trained Augustus species parameters to BUSCO2 to improve its de novo prediction.  Funannotate uses a modified training regime in which it takes BUSCO2 'Complete' models, de novo GeneMark-ES models, and evidence in those regions, and runs EvidenceModeler to predict gene models.  The models are then confirmed using BUSCO2 and a subset is used for training Augustus.
 
 ###5) Funannotate said I should manually fix problematic gene models, how???
 In the 'predict_results' folder you will find the output from `funannotate predict` which is composed of a GenBank flatfile, feature table file, GFF3, proteins, transcripts, as well as 3 error reports from tbl2asn.  Gene models that show up as ERROR in the error.summary.txt file MUST be fixed prior to submission to NCBI.  All errors listed as FATAL in the discrepency.report.txt must also be fixed (with the exception of FATAL: DISC_BACTERIAL_PARTIAL_NONEXTENDABLE_PROBLEMS).  I try to parse the errors where I can automatically provide fixes or removing the gene models, however there are lots of tbl2asn errors I've either never seen before or don't know how to fix automatically.  Here is how you can fix those problematic gene models: