Skip to content

Latest commit

 

History

History
149 lines (124 loc) · 5.23 KB

File metadata and controls

149 lines (124 loc) · 5.23 KB

Variables to set

Set your variable variable and export them for your environment

THREADS=3
DATADIR=~/Share/GenomeAnnWorkshopDATA
SWISSPROTDB=$DATADIR/uniprot_sprot

- These are the variables that needed to be exported to annotate the Arabidopsis taliana genome, I suggest to not change them.

lax_06100_elife-06100-fig1-v1 tif

SPECIES=Atal
BUSCODIR=$DATADIR
DIR=Tutorial.Arabidopsis
srSRAS='SRR5956435 SRR5956436'
longREADS='SRR23291376 SRR23291378'
LONGTYPE=pacbio
ASSEMBLY=$DATADIR/GCF_000001735.4_TAIR10.1_genomic.fna
TARGETCHR='NC_003070.9'
CHRNAME=chr1
CHR=$DATADIR/$SPECIES.$CHRNAME.fasta
ANNREF=$DATADIR/$SPECIES.$CHRNAME.gene.ncbi.gff
BRAKERGTF=$DATADIR/$SPECIES.braker_utr.$CHRNAME.gff3
STRAND='--rf'
LIBTYPE='first'
BUSCODB=embryophyta_odb12
MIKADOSCORE=$DATADIR/plants.yaml

These are the BUSCO profiles for the chromosome you're going to annotate:

Results from dataset embryophyta_odb10: proteome
C:27.1%[S:14.4%,D:12.7%],F:0.3%,M:72.6%,n:1614
437 Complete BUSCOs (C)
232 Complete and single-copy BUSCOs (S)
205 Complete and duplicated BUSCOs (D)
5 Fragmented BUSCOs (F)
1172 Missing BUSCOs (M)
1614 Total BUSCO groups searched

And this is for the reference annotation:

Results from dataset embryophyta_odb10: Chr1
S:27.01%,D:0.19%,F:0.93%,I:0.00%,M:71.87%,n:1614
436 Complete and single-copy BUSCOs (S)
3 Complete and duplicated BUSCOs (D)
0 Incomplete (I)
15 Fragmented BUSCOs (F)
1160 Missing BUSCOs (M)
1614 Total BUSCO groups searched

You can use these stats as a comparison to evaluate the annotations you're going to generate.

- These are the variables that needed to be exported to annotate the Entomobrya proxima genome not to change.

Entomobrya_lanuginosa_(11539169933)

SPECIES=Epro
BUSCODIR=$DATADIR
DIR=Tutorial.Springtail
srSRAS='SRR15910090'
longREADS='SRR15910089'
LONGTYPE=pacbio
ASSEMBLY=$DATADIR/GCA_029691765.1_ASM2969176v1_genomic.fna
TARGETCHR='CM056168.1'
CHRNAME=chr1
CHR=$DATADIR/$SPECIES.$CHRNAME.fasta
ANNREF=$DATADIR/$SPECIES.$CHRNAME.gene.maker.gff
BRAKERGTF=$DATADIR/$SPECIES.braker_utr.$CHRNAME.gff3
STRAND='--rf'
LIBTYPE='first'
BUSCODB=arthropoda_odb12
MIKADOSCORE=$DATADIR/insects.yaml
Results from dataset arthropoda_odb10: proteome
C:32.1%[S:28.9%,D:3.2%],F:1.9%,M:66.0%,n:1013
325 Complete BUSCOs (C)
293 Complete and single-copy BUSCOs (S)
32 Complete and duplicated BUSCOs (D)
19 Fragmented BUSCOs (F)
669 Missing BUSCOs (M)
1013 Total BUSCO groups searched
Results from dataset arthropoda_odb10: Chr1
S:30.40%,D:0.10%,F:1.09%,I:0.00%,M:68.41%,n:1013
308 Complete and single-copy BUSCOs (S)
1 Complete and duplicated BUSCOs (D)
0 Incomplete (I)
11 Fragmented BUSCOs (F)
693 Missing BUSCOs (M)
1013 Total BUSCO groups searched

- These are the variables that needed to be exported to annotate the Schistocerca gregaria genome not to change.

Schistocerca gregaria, Tenerife - Axel Hochkirch

SPECIES=Sgre
BUSCODIR=$DATADIR
DIR=Tutorial.DesertLocust
srSRAS='SRR15423966 SRR15423964'
longREADS='SRR17978912'
LONGTYPE=pacbio
ASSEMBLY=$DATADIR/GCF_023897955.1_iqSchGreg1.2_genomic.fna
TARGETCHR='NC_064930.1'
CHRNAME=chr11
CHR=$DATADIR/$SPECIES.$CHRNAME.fasta
ANNREF=$DATADIR/$SPECIES.$CHRNAME.gene.ncbi.gff
BRAKERGTF=$DATADIR/$SPECIES.braker_utr.$CHRNAME.gff3
STRAND='--rf'
LIBTYPE='first'
BUSCODB=insecta_odb12
MIKADOSCORE=$DATADIR/insects.yaml
Results from dataset insecta_odb10: proteome
C:1.7%[S:1.0%,D:0.7%],F:0.2%,M:98.1%,n:1367
22 Complete BUSCOs (C)
13 Complete and single-copy BUSCOs (S)
9 Complete and duplicated BUSCOs (D)
3 Fragmented BUSCOs (F)
1342 Missing BUSCOs (M)
1367 Total BUSCO groups searched
Results from dataset insecta_odb10: Chr11
S:1.10%,D:0.00%,F:0.29%,I:0.00%,M:98.61%,n:1367
15 Complete and single-copy BUSCOs (S)
0 Complete and duplicated BUSCOs (D)
0 Incomplete (I)
4 Fragmented BUSCOs (F)
1348 Missing BUSCOs (M)
1367 Total BUSCO groups searched