[2024-01-24 11:13:04,599] [INFO] DFAST_QC pipeline started.
[2024-01-24 11:13:04,600] [INFO] DFAST_QC version: 0.5.7
[2024-01-24 11:13:04,601] [INFO] DQC Reference Directory: /var/lib/cwl/stg1ece6799-1658-43c0-a655-bf28dc849430/dqc_reference
[2024-01-24 11:13:05,830] [INFO] ===== Start taxonomy check using ANI =====
[2024-01-24 11:13:05,831] [INFO] Task started: Prodigal
[2024-01-24 11:13:05,831] [INFO] Running command: gunzip -c /var/lib/cwl/stg65e50f2a-ad85-42e2-b4ba-b7efd23daef2/GCF_014384885.1_ASM1438488v1_genomic.fna.gz | prodigal -d GCF_014384885.1_ASM1438488v1_genomic.fna/cds.fna -a GCF_014384885.1_ASM1438488v1_genomic.fna/protein.faa -g 11 -q > /dev/null
[2024-01-24 11:13:14,318] [INFO] Task succeeded: Prodigal
[2024-01-24 11:13:14,319] [INFO] Task started: HMMsearch
[2024-01-24 11:13:14,319] [INFO] Running command: hmmsearch --tblout GCF_014384885.1_ASM1438488v1_genomic.fna/hmmer_result.tsv -E 1E-50 /var/lib/cwl/stg1ece6799-1658-43c0-a655-bf28dc849430/dqc_reference/reference_markers.hmm GCF_014384885.1_ASM1438488v1_genomic.fna/protein.faa > /dev/null
[2024-01-24 11:13:14,633] [INFO] Task succeeded: HMMsearch
[2024-01-24 11:13:14,634] [INFO] Found 6/6 markers.
[2024-01-24 11:13:14,662] [INFO] Query marker FASTA was written to GCF_014384885.1_ASM1438488v1_genomic.fna/markers.fasta
[2024-01-24 11:13:14,662] [INFO] Task started: Blastn
[2024-01-24 11:13:14,662] [INFO] Running command: blastn -query GCF_014384885.1_ASM1438488v1_genomic.fna/markers.fasta -db /var/lib/cwl/stg1ece6799-1658-43c0-a655-bf28dc849430/dqc_reference/reference_markers.fasta -out GCF_014384885.1_ASM1438488v1_genomic.fna/blast.markers.tsv -outfmt 6 -max_hsps 1 -num_alignments 5
[2024-01-24 11:13:15,429] [INFO] Task succeeded: Blastn
[2024-01-24 11:13:15,433] [INFO] Selected 19 target genomes.
[2024-01-24 11:13:15,433] [INFO] Target genome list was writen to GCF_014384885.1_ASM1438488v1_genomic.fna/target_genomes.txt
[2024-01-24 11:13:15,452] [INFO] Task started: fastANI
[2024-01-24 11:13:15,453] [INFO] Running command: fastANI --query /var/lib/cwl/stg65e50f2a-ad85-42e2-b4ba-b7efd23daef2/GCF_014384885.1_ASM1438488v1_genomic.fna.gz --refList GCF_014384885.1_ASM1438488v1_genomic.fna/target_genomes.txt --output GCF_014384885.1_ASM1438488v1_genomic.fna/fastani_result.tsv --threads 1
[2024-01-24 11:13:26,751] [INFO] Task succeeded: fastANI
[2024-01-24 11:13:26,752] [INFO] Loading species specific ANI threshold from /var/lib/cwl/stg1ece6799-1658-43c0-a655-bf28dc849430/dqc_reference/prokaryote_ANI_species_specific_threshold.txt
[2024-01-24 11:13:26,753] [WARNING] Species-specific ANI threshold file not found. Will use the default threshold for all species. [/var/lib/cwl/stg1ece6799-1658-43c0-a655-bf28dc849430/dqc_reference/prokaryote_ANI_species_specific_threshold.txt]
[2024-01-24 11:13:26,768] [INFO] Found 14 fastANI hits (1 hits with ANI > threshold)
[2024-01-24 11:13:26,768] [INFO] The taxonomy check result is classified as 'conclusive'.
[2024-01-24 11:13:26,768] [INFO] DFAST Taxonomy check final result
--------------------------------------------------------------------------------
organism_name	strain	accession	taxid	species_taxid	relation_to_type	validated	ani	matched_fragments	total_fragments	ani_threshold	status
Ligaoa zhengdingensis	strain=NSJ-31	GCA_014384885.1	2763658	2763658	type	True	100.0	926	932	95	conclusive
Vescimonas coprocola	strain=MM50	GCA_018408575.1	2714355	2714355	type	True	83.0469	66	932	95	below_threshold
Faecalibacterium duncaniae	strain=JCM 31915	GCA_010509575.1	411483	411483	type	True	81.2383	89	932	95	below_threshold
Faecalibacterium duncaniae	strain=A2-165	GCA_000162015.1	411483	411483	type	True	80.9898	87	932	95	below_threshold
Angelakisella massiliensis	strain=Marseille-P3217	GCA_900104675.1	1871018	1871018	type	True	80.8824	113	932	95	below_threshold
Faecalibacterium gallinarum	strain=JCM 17207	GCA_022180365.1	2903556	2903556	type	True	78.1411	88	932	95	below_threshold
Provencibacterium massiliense	strain=Marseille-P2780	GCA_900169495.1	1841868	1841868	type	True	77.9282	173	932	95	below_threshold
Anaerotruncus massiliensis	strain=AT3	GCA_900199635.1	1673720	1673720	type	True	77.615	178	932	95	below_threshold
Hydrogenoanaerobacterium saccharovorans	strain=DSM 24774	GCA_003814745.1	474960	474960	type	True	77.6125	85	932	95	below_threshold
Hydrogenoanaerobacterium saccharovorans	strain=CGMCC 1.5070	GCA_900110045.1	474960	474960	type	True	77.5132	87	932	95	below_threshold
Anaerotruncus rubiinfantis	strain=MT15	GCA_900078395.1	1720200	1720200	type	True	77.4688	108	932	95	below_threshold
Phocea massiliensis	strain=Marseille-P2769	GCA_900104615.1	1841867	1841867	type	True	77.0942	98	932	95	below_threshold
Marasmitruncus massiliensis	strain=Marseille-P3646	GCA_900186535.1	1944642	1944642	type	True	76.7875	56	932	95	below_threshold
Acetanaerobacterium elongatum	strain=CGMCC 1.5012	GCA_900103835.1	258515	258515	type	True	76.6075	64	932	95	below_threshold
--------------------------------------------------------------------------------
[2024-01-24 11:13:26,770] [INFO] DFAST Taxonomy check result was written to GCF_014384885.1_ASM1438488v1_genomic.fna/tc_result.tsv
[2024-01-24 11:13:26,771] [INFO] ===== Taxonomy check completed =====
[2024-01-24 11:13:26,771] [INFO] ===== Start completeness check using CheckM =====
[2024-01-24 11:13:26,772] [INFO] Setting CHECKM_DATA_PATH to /var/lib/cwl/stg1ece6799-1658-43c0-a655-bf28dc849430/dqc_reference/checkm_data
[2024-01-24 11:13:26,773] [INFO] Selected 'Prokaryote' markers (life, taxid=0) for CheckM
[2024-01-24 11:13:26,804] [INFO] Task started: CheckM
[2024-01-24 11:13:26,804] [INFO] Running command: checkm taxonomy_wf --tab_table -f GCF_014384885.1_ASM1438488v1_genomic.fna/cc_result.tsv -t 1 life "Prokaryote" GCF_014384885.1_ASM1438488v1_genomic.fna/checkm_input GCF_014384885.1_ASM1438488v1_genomic.fna/checkm_result
[2024-01-24 11:13:56,734] [INFO] Task succeeded: CheckM
[2024-01-24 11:13:56,736] [INFO] Completeness check finished.
--------------------------------------------------------------------------------
Completeness: 100.00%
Contamintation: 0.00%
Strain heterogeneity: 0.00%
--------------------------------------------------------------------------------
[2024-01-24 11:13:56,757] [INFO] ===== Completeness check finished =====
[2024-01-24 11:13:56,757] [INFO] ===== Start GTDB Search =====
[2024-01-24 11:13:56,758] [INFO] Query marker FASTA already exists. Will reuse it. (GCF_014384885.1_ASM1438488v1_genomic.fna/markers.fasta)
[2024-01-24 11:13:56,758] [INFO] Task started: Blastn
[2024-01-24 11:13:56,758] [INFO] Running command: blastn -query GCF_014384885.1_ASM1438488v1_genomic.fna/markers.fasta -db /var/lib/cwl/stg1ece6799-1658-43c0-a655-bf28dc849430/dqc_reference/reference_markers_gtdb.fasta -out GCF_014384885.1_ASM1438488v1_genomic.fna/blast.markers.gtdb.tsv -outfmt 6 -max_hsps 1 -num_alignments 5
[2024-01-24 11:13:58,183] [INFO] Task succeeded: Blastn
[2024-01-24 11:13:58,188] [INFO] Selected 25 target genomes.
[2024-01-24 11:13:58,188] [INFO] Target genome list was writen to GCF_014384885.1_ASM1438488v1_genomic.fna/target_genomes_gtdb.txt
[2024-01-24 11:13:58,223] [INFO] Task started: fastANI
[2024-01-24 11:13:58,224] [INFO] Running command: fastANI --query /var/lib/cwl/stg65e50f2a-ad85-42e2-b4ba-b7efd23daef2/GCF_014384885.1_ASM1438488v1_genomic.fna.gz --refList GCF_014384885.1_ASM1438488v1_genomic.fna/target_genomes_gtdb.txt --output GCF_014384885.1_ASM1438488v1_genomic.fna/fastani_result_gtdb.tsv --threads 1
[2024-01-24 11:14:08,984] [INFO] Task succeeded: fastANI
[2024-01-24 11:14:09,003] [INFO] Found 20 fastANI hits (1 hits with ANI > circumscription radius)
[2024-01-24 11:14:09,003] [INFO] GTDB search result
--------------------------------------------------------------------------------
accession	gtdb_species	ani	matched_fragments	total_fragments	gtdb_taxonomy	ani_circumscription_radius	mean_intra_species_ani	min_intra_species_ani	mean_intra_species_af	min_intra_species_af	num_clustered_genomes	status
GCA_014384885.1	s__Hydrogenoanaerobacterium sp014384885	100.0	926	932	d__Bacteria;p__Firmicutes_A;c__Clostridia;o__Oscillospirales;f__Ruminococcaceae;g__Hydrogenoanaerobacterium	95.0	98.59	98.59	0.91	0.91	2	conclusive
GCF_900199635.1	s__Anaerotruncus massiliensis	77.615	178	932	d__Bacteria;p__Firmicutes_A;c__Clostridia;o__Oscillospirales;f__Ruminococcaceae;g__Anaerotruncus	95.0	98.56	98.01	0.95	0.93	6	-
GCF_003814745.1	s__Hydrogenoanaerobacterium saccharovorans	77.6125	85	932	d__Bacteria;p__Firmicutes_A;c__Clostridia;o__Oscillospirales;f__Ruminococcaceae;g__Hydrogenoanaerobacterium	95.0	100.00	100.00	1.00	1.00	2	-
GCF_900078395.1	s__Anaerotruncus rubiinfantis	77.4688	108	932	d__Bacteria;p__Firmicutes_A;c__Clostridia;o__Oscillospirales;f__Ruminococcaceae;g__Anaerotruncus	95.0	99.23	99.21	0.89	0.86	5	-
GCA_004340125.1	s__Harryflintia acetispora	77.3788	158	932	d__Bacteria;p__Firmicutes_A;c__Clostridia;o__Oscillospirales;f__Ruminococcaceae;g__Harryflintia	95.0	98.73	98.54	0.92	0.90	6	-
GCF_904419105.1	s__Avimicrobium faecavium	77.1257	69	932	d__Bacteria;p__Firmicutes_A;c__Clostridia;o__Oscillospirales;f__Ruminococcaceae;g__Avimicrobium	95.0	N/A	N/A	N/A	N/A	1	-
GCF_904398325.1	s__Neoruminococcus faecicola	77.0926	97	932	d__Bacteria;p__Firmicutes_A;c__Clostridia;o__Oscillospirales;f__Ruminococcaceae;g__Neoruminococcus	95.0	N/A	N/A	N/A	N/A	1	-
GCA_004558145.1	s__Fournierella excrementavium	76.9981	90	932	d__Bacteria;p__Firmicutes_A;c__Clostridia;o__Oscillospirales;f__Ruminococcaceae;g__Fournierella	95.0	97.60	97.45	0.91	0.87	5	-
GCA_019114825.1	s__Anaerotruncus excrementipullorum	76.9896	98	932	d__Bacteria;p__Firmicutes_A;c__Clostridia;o__Oscillospirales;f__Ruminococcaceae;g__Anaerotruncus	95.0	99.88	99.88	0.94	0.94	2	-
GCA_900760305.1	s__UMGS856 sp900760305	76.8723	61	932	d__Bacteria;p__Firmicutes_A;c__Clostridia;o__Oscillospirales;f__Acutalibacteraceae;g__UMGS856	95.0	N/A	N/A	N/A	N/A	1	-
GCA_910585525.1	s__Angelakisella sp910585525	76.8452	81	932	d__Bacteria;p__Firmicutes_A;c__Clostridia;o__Oscillospirales;f__Ruminococcaceae;g__Angelakisella	95.0	N/A	N/A	N/A	N/A	1	-
GCA_905233755.1	s__Ruminococcus_E sp905233755	76.8161	56	932	d__Bacteria;p__Firmicutes_A;c__Clostridia;o__Oscillospirales;f__Acutalibacteraceae;g__Ruminococcus_E	95.0	N/A	N/A	N/A	N/A	1	-
GCF_904387055.1	s__Heteroruminococcus faecigallinarum	76.7906	89	932	d__Bacteria;p__Firmicutes_A;c__Clostridia;o__Oscillospirales;f__Ruminococcaceae;g__Heteroruminococcus	95.0	99.97	99.97	0.96	0.96	2	-
GCA_905192745.1	s__Avimonas sp900551425	76.7878	57	932	d__Bacteria;p__Firmicutes_A;c__Clostridia;o__Oscillospirales;f__Acutalibacteraceae;g__Avimonas	95.0	99.85	99.85	0.95	0.95	2	-
GCA_018715025.1	s__Caccousia avistercoris	76.7763	61	932	d__Bacteria;p__Firmicutes_A;c__Clostridia;o__Oscillospirales;f__Acutalibacteraceae;g__Caccousia	95.0	98.39	98.39	0.86	0.86	2	-
GCA_904420255.1	s__Angelakisella sp904420255	76.7568	76	932	d__Bacteria;p__Firmicutes_A;c__Clostridia;o__Oscillospirales;f__Ruminococcaceae;g__Angelakisella	95.0	N/A	N/A	N/A	N/A	1	-
GCF_015667585.1	s__Anaeromassilibacillus sp015667585	76.7532	78	932	d__Bacteria;p__Firmicutes_A;c__Clostridia;o__Oscillospirales;f__Acutalibacteraceae;g__Anaeromassilibacillus	95.0	N/A	N/A	N/A	N/A	1	-
GCA_900761965.1	s__Avimicrobium sp900761965	76.6954	50	932	d__Bacteria;p__Firmicutes_A;c__Clostridia;o__Oscillospirales;f__Ruminococcaceae;g__Avimicrobium	95.0	98.03	98.03	0.92	0.92	2	-
GCF_016901815.1	s__Avimicrobium caecorum	76.3172	55	932	d__Bacteria;p__Firmicutes_A;c__Clostridia;o__Oscillospirales;f__Ruminococcaceae;g__Avimicrobium	95.0	98.00	97.74	0.94	0.92	7	-
GCA_910585535.1	s__Angelakisella sp910585535	76.1406	60	932	d__Bacteria;p__Firmicutes_A;c__Clostridia;o__Oscillospirales;f__Ruminococcaceae;g__Angelakisella	95.0	N/A	N/A	N/A	N/A	1	-
--------------------------------------------------------------------------------
[2024-01-24 11:14:09,005] [INFO] GTDB search result was written to GCF_014384885.1_ASM1438488v1_genomic.fna/result_gtdb.tsv
[2024-01-24 11:14:09,005] [INFO] ===== GTDB Search completed =====
[2024-01-24 11:14:09,009] [INFO] DFAST_QC result json was written to GCF_014384885.1_ASM1438488v1_genomic.fna/dqc_result.json
[2024-01-24 11:14:09,009] [INFO] DFAST_QC completed!
[2024-01-24 11:14:09,009] [INFO] Total running time: 0h1m4s