[2023-06-13 02:32:38,181] [INFO] DFAST_QC pipeline started.
[2023-06-13 02:32:38,186] [INFO] DFAST_QC version: 0.5.7
[2023-06-13 02:32:38,187] [INFO] DQC Reference Directory: /var/lib/cwl/stg803c0877-4a67-4066-94a7-5bf9646ebbf6/dqc_reference
[2023-06-13 02:32:39,750] [INFO] ===== Start taxonomy check using ANI =====
[2023-06-13 02:32:39,751] [INFO] Task started: Prodigal
[2023-06-13 02:32:39,751] [INFO] Running command: gunzip -c /var/lib/cwl/stg3a5e56a7-3a27-4549-a074-c5e9328b5d91/GCA_022773665.1_ASM2277366v1_genomic.fna.gz | prodigal -d GCA_022773665.1_ASM2277366v1_genomic.fna/cds.fna -a GCA_022773665.1_ASM2277366v1_genomic.fna/protein.faa -g 11 -q > /dev/null
[2023-06-13 02:32:44,029] [INFO] Task succeeded: Prodigal
[2023-06-13 02:32:44,029] [INFO] Task started: HMMsearch
[2023-06-13 02:32:44,030] [INFO] Running command: hmmsearch --tblout GCA_022773665.1_ASM2277366v1_genomic.fna/hmmer_result.tsv -E 1E-50 /var/lib/cwl/stg803c0877-4a67-4066-94a7-5bf9646ebbf6/dqc_reference/reference_markers.hmm GCA_022773665.1_ASM2277366v1_genomic.fna/protein.faa > /dev/null
[2023-06-13 02:32:44,364] [INFO] Task succeeded: HMMsearch
[2023-06-13 02:32:44,365] [WARNING] Found 5/6 markers. [/var/lib/cwl/stg3a5e56a7-3a27-4549-a074-c5e9328b5d91/GCA_022773665.1_ASM2277366v1_genomic.fna.gz]
[2023-06-13 02:32:44,405] [INFO] Query marker FASTA was written to GCA_022773665.1_ASM2277366v1_genomic.fna/markers.fasta
[2023-06-13 02:32:44,405] [INFO] Task started: Blastn
[2023-06-13 02:32:44,405] [INFO] Running command: blastn -query GCA_022773665.1_ASM2277366v1_genomic.fna/markers.fasta -db /var/lib/cwl/stg803c0877-4a67-4066-94a7-5bf9646ebbf6/dqc_reference/reference_markers.fasta -out GCA_022773665.1_ASM2277366v1_genomic.fna/blast.markers.tsv -outfmt 6 -max_hsps 1 -num_alignments 5
[2023-06-13 02:32:45,027] [INFO] Task succeeded: Blastn
[2023-06-13 02:32:45,036] [INFO] Selected 25 target genomes.
[2023-06-13 02:32:45,036] [INFO] Target genome list was writen to GCA_022773665.1_ASM2277366v1_genomic.fna/target_genomes.txt
[2023-06-13 02:32:45,040] [INFO] Task started: fastANI
[2023-06-13 02:32:45,040] [INFO] Running command: fastANI --query /var/lib/cwl/stg3a5e56a7-3a27-4549-a074-c5e9328b5d91/GCA_022773665.1_ASM2277366v1_genomic.fna.gz --refList GCA_022773665.1_ASM2277366v1_genomic.fna/target_genomes.txt --output GCA_022773665.1_ASM2277366v1_genomic.fna/fastani_result.tsv --threads 1
[2023-06-13 02:33:01,493] [INFO] Task succeeded: fastANI
[2023-06-13 02:33:01,494] [INFO] Loading species specific ANI threshold from /var/lib/cwl/stg803c0877-4a67-4066-94a7-5bf9646ebbf6/dqc_reference/prokaryote_ANI_species_specific_threshold.txt
[2023-06-13 02:33:01,494] [WARNING] Species-specific ANI threshold file not found. Will use the default threshold for all species. [/var/lib/cwl/stg803c0877-4a67-4066-94a7-5bf9646ebbf6/dqc_reference/prokaryote_ANI_species_specific_threshold.txt]
[2023-06-13 02:33:01,514] [INFO] Found 25 fastANI hits (0 hits with ANI > threshold)
[2023-06-13 02:33:01,514] [INFO] The taxonomy check result is classified as 'below_threshold'.
[2023-06-13 02:33:01,515] [INFO] DFAST Taxonomy check final result
--------------------------------------------------------------------------------
organism_name	strain	accession	taxid	species_taxid	relation_to_type	validated	ani	matched_fragments	total_fragments	ani_threshold	status
Clostridium paraputrificum	strain=NCTC11833	GCA_900447045.1	29363	29363	type	True	78.6287	225	861	95	below_threshold
Clostridium vincentii	strain=DSM 10228	GCA_002995745.1	52704	52704	type	True	77.9443	183	861	95	below_threshold
Clostridium cibarium	strain=Sa3CVN1	GCA_014836335.1	2762247	2762247	type	True	77.941	228	861	95	below_threshold
Clostridium uliginosum	strain=DSM 12992	GCA_900112485.1	119641	119641	type	True	77.7846	211	861	95	below_threshold
Clostridium septicum	strain=DSM 7534	GCA_003606265.1	1504	1504	type	True	77.6002	193	861	95	below_threshold
Clostridium septicum	strain=FDAARGOS_1551	GCA_020736665.1	1504	1504	type	True	77.583	193	861	95	below_threshold
Clostridium cagae	strain=Marseille-P4344	GCA_900290265.1	2080751	2080751	type	True	77.5498	218	861	95	below_threshold
Clostridium chauvoei	strain=DSM 7528	GCA_002327185.1	46867	46867	type	True	77.4202	203	861	95	below_threshold
Clostridium celatum	strain=DSM 1785	GCA_000320405.1	36834	36834	type	True	77.4196	207	861	95	below_threshold
Clostridium nigeriense	strain=Marseille-P2414	GCA_900086595.1	1805470	1805470	type	True	77.4181	210	861	95	below_threshold
Clostridium saudiense	strain=JCC	GCA_000577815.1	1414720	1414720	type	True	77.4122	201	861	95	below_threshold
Clostridium weizhouense	strain=YB-6	GCA_019431045.1	2859781	2859781	type	True	77.3749	250	861	95	below_threshold
Clostridium gallinarum	strain=Sa3CUN1	GCA_014836325.1	2762246	2762246	type	True	77.1161	223	861	95	below_threshold
Clostridium saccharobutylicum	strain=DSM 13864	GCA_000473995.1	169679	169679	type	True	77.1132	230	861	95	below_threshold
Clostridium chrysemydis	strain=PT	GCA_015234215.1	2665504	2665504	type	True	77.1125	174	861	95	below_threshold
Clostridium fallax	strain=NCTC8380	GCA_900461065.1	1533	1533	type	True	77.079	183	861	95	below_threshold
Clostridium saccharobutylicum	strain=DSM 13864	GCA_001657435.1	169679	169679	type	True	77.075	222	861	95	below_threshold
Clostridium neonatale	strain=LCDC 99A005	GCA_002553615.1	137838	137838	type	True	77.0747	240	861	95	below_threshold
Clostridium neonatale	strain=LCDC no.99-A-005	GCA_001458595.1	137838	137838	type	True	77.0673	245	861	95	below_threshold
Clostridium gelidum	strain=C5S11	GCA_019977655.1	704125	704125	type	True	77.0213	222	861	95	below_threshold
Clostridium fallax	strain=DSM 2631	GCA_900129365.1	1533	1533	type	True	76.9814	182	861	95	below_threshold
Sarcina ventriculi	strain=NCTC12966	GCA_900456775.1	1267	1267	type	True	76.9745	170	861	95	below_threshold
Clostridium simiarum	strain=MSJ-4	GCA_018919175.1	2841506	2841506	type	True	76.1392	112	861	95	below_threshold
Clostridium muellerianum	strain=P21	GCA_012926525.1	2716538	2716538	type	True	75.9906	150	861	95	below_threshold
Clostridium autoethanogenum	strain=DSM 10061	GCA_000484505.2	84023	84023	suspected-type	True	75.8683	82	861	95	below_threshold
--------------------------------------------------------------------------------
[2023-06-13 02:33:01,520] [INFO] DFAST Taxonomy check result was written to GCA_022773665.1_ASM2277366v1_genomic.fna/tc_result.tsv
[2023-06-13 02:33:01,520] [INFO] ===== Taxonomy check completed =====
[2023-06-13 02:33:01,521] [INFO] ===== Start completeness check using CheckM =====
[2023-06-13 02:33:01,521] [INFO] Setting CHECKM_DATA_PATH to /var/lib/cwl/stg803c0877-4a67-4066-94a7-5bf9646ebbf6/dqc_reference/checkm_data
[2023-06-13 02:33:01,524] [INFO] Selected 'Prokaryote' markers (life, taxid=0) for CheckM
[2023-06-13 02:33:01,572] [INFO] Task started: CheckM
[2023-06-13 02:33:01,572] [INFO] Running command: checkm taxonomy_wf --tab_table -f GCA_022773665.1_ASM2277366v1_genomic.fna/cc_result.tsv -t 1 life "Prokaryote" GCA_022773665.1_ASM2277366v1_genomic.fna/checkm_input GCA_022773665.1_ASM2277366v1_genomic.fna/checkm_result
[2023-06-13 02:33:22,971] [INFO] Task succeeded: CheckM
[2023-06-13 02:33:22,972] [INFO] Completeness check finished.
--------------------------------------------------------------------------------
Completeness: 95.83%
Contamintation: 0.00%
Strain heterogeneity: 0.00%
--------------------------------------------------------------------------------
[2023-06-13 02:33:22,991] [INFO] ===== Completeness check finished =====
[2023-06-13 02:33:22,991] [INFO] ===== Start GTDB Search =====
[2023-06-13 02:33:22,991] [INFO] Query marker FASTA already exists. Will reuse it. (GCA_022773665.1_ASM2277366v1_genomic.fna/markers.fasta)
[2023-06-13 02:33:22,992] [INFO] Task started: Blastn
[2023-06-13 02:33:22,992] [INFO] Running command: blastn -query GCA_022773665.1_ASM2277366v1_genomic.fna/markers.fasta -db /var/lib/cwl/stg803c0877-4a67-4066-94a7-5bf9646ebbf6/dqc_reference/reference_markers_gtdb.fasta -out GCA_022773665.1_ASM2277366v1_genomic.fna/blast.markers.gtdb.tsv -outfmt 6 -max_hsps 1 -num_alignments 5
[2023-06-13 02:33:23,781] [INFO] Task succeeded: Blastn
[2023-06-13 02:33:23,786] [INFO] Selected 16 target genomes.
[2023-06-13 02:33:23,787] [INFO] Target genome list was writen to GCA_022773665.1_ASM2277366v1_genomic.fna/target_genomes_gtdb.txt
[2023-06-13 02:33:23,836] [INFO] Task started: fastANI
[2023-06-13 02:33:23,837] [INFO] Running command: fastANI --query /var/lib/cwl/stg3a5e56a7-3a27-4549-a074-c5e9328b5d91/GCA_022773665.1_ASM2277366v1_genomic.fna.gz --refList GCA_022773665.1_ASM2277366v1_genomic.fna/target_genomes_gtdb.txt --output GCA_022773665.1_ASM2277366v1_genomic.fna/fastani_result_gtdb.tsv --threads 1
[2023-06-13 02:33:33,289] [INFO] Task succeeded: fastANI
[2023-06-13 02:33:33,309] [INFO] Found 16 fastANI hits (1 hits with ANI > circumscription radius)
[2023-06-13 02:33:33,310] [INFO] GTDB search result
--------------------------------------------------------------------------------
accession	gtdb_species	ani	matched_fragments	total_fragments	gtdb_taxonomy	ani_circumscription_radius	mean_intra_species_ani	min_intra_species_ani	mean_intra_species_af	min_intra_species_af	num_clustered_genomes	status
GCA_902785295.1	s__Clostridium sp902785295	95.1172	620	861	d__Bacteria;p__Firmicutes_A;c__Clostridia;o__Clostridiales;f__Clostridiaceae;g__Clostridium	95.0	N/A	N/A	N/A	N/A	1	conclusive
GCA_900317445.1	s__Clostridium sp900317445	90.9647	481	861	d__Bacteria;p__Firmicutes_A;c__Clostridia;o__Clostridiales;f__Clostridiaceae;g__Clostridium	95.0	100.00	100.00	0.99	0.99	2	-
GCA_900539375.1	s__Clostridium sp900539375	89.5052	592	861	d__Bacteria;p__Firmicutes_A;c__Clostridia;o__Clostridiales;f__Clostridiaceae;g__Clostridium	95.0	99.09	99.02	0.89	0.86	4	-
GCF_900447045.1	s__Clostridium paraputrificum	78.6645	223	861	d__Bacteria;p__Firmicutes_A;c__Clostridia;o__Clostridiales;f__Clostridiaceae;g__Clostridium	95.0	97.58	95.42	0.89	0.83	29	-
GCA_018372875.1	s__Clostridium sp018372875	78.6201	220	861	d__Bacteria;p__Firmicutes_A;c__Clostridia;o__Clostridiales;f__Clostridiaceae;g__Clostridium	95.0	98.91	98.91	0.87	0.87	2	-
GCA_012519155.1	s__Clostridium sp012519155	78.3405	223	861	d__Bacteria;p__Firmicutes_A;c__Clostridia;o__Clostridiales;f__Clostridiaceae;g__Clostridium	95.0	N/A	N/A	N/A	N/A	1	-
GCF_900116755.1	s__Clostridium sp900116755	78.3346	211	861	d__Bacteria;p__Firmicutes_A;c__Clostridia;o__Clostridiales;f__Clostridiaceae;g__Clostridium	95.0	N/A	N/A	N/A	N/A	1	-
GCF_002995745.1	s__Clostridium vincentii	77.9443	183	861	d__Bacteria;p__Firmicutes_A;c__Clostridia;o__Clostridiales;f__Clostridiaceae;g__Clostridium	95.0	N/A	N/A	N/A	N/A	1	-
GCA_015058405.1	s__Clostridium sp015058405	77.8972	242	861	d__Bacteria;p__Firmicutes_A;c__Clostridia;o__Clostridiales;f__Clostridiaceae;g__Clostridium	95.0	N/A	N/A	N/A	N/A	1	-
GCA_012519845.1	s__Clostridium sp012519845	77.7998	231	861	d__Bacteria;p__Firmicutes_A;c__Clostridia;o__Clostridiales;f__Clostridiaceae;g__Clostridium	95.0	N/A	N/A	N/A	N/A	1	-
GCF_001735765.2	s__Clostridium taeniosporum	77.6651	232	861	d__Bacteria;p__Firmicutes_A;c__Clostridia;o__Clostridiales;f__Clostridiaceae;g__Clostridium	95.0	N/A	N/A	N/A	N/A	1	-
GCF_000789395.1	s__Clostridium baratii	77.6581	222	861	d__Bacteria;p__Firmicutes_A;c__Clostridia;o__Clostridiales;f__Clostridiaceae;g__Clostridium	95.0	96.53	95.37	0.92	0.89	15	-
GCA_910588715.1	s__Clostridium sp910588715	77.6381	208	861	d__Bacteria;p__Firmicutes_A;c__Clostridia;o__Clostridiales;f__Clostridiaceae;g__Clostridium	95.0	N/A	N/A	N/A	N/A	1	-
GCF_000498355.1	s__Clostridium sp000498355	77.5741	225	861	d__Bacteria;p__Firmicutes_A;c__Clostridia;o__Clostridiales;f__Clostridiaceae;g__Clostridium	95.0	N/A	N/A	N/A	N/A	1	-
GCF_002327185.1	s__Clostridium chauvoei	77.4052	204	861	d__Bacteria;p__Firmicutes_A;c__Clostridia;o__Clostridiales;f__Clostridiaceae;g__Clostridium	95.0	99.96	99.88	0.99	0.99	5	-
GCF_000473995.1	s__Clostridium saccharobutylicum	77.1079	232	861	d__Bacteria;p__Firmicutes_A;c__Clostridia;o__Clostridiales;f__Clostridiaceae;g__Clostridium	95.0	99.89	95.82	0.99	0.83	65	-
--------------------------------------------------------------------------------
[2023-06-13 02:33:33,312] [INFO] GTDB search result was written to GCA_022773665.1_ASM2277366v1_genomic.fna/result_gtdb.tsv
[2023-06-13 02:33:33,313] [INFO] ===== GTDB Search completed =====
[2023-06-13 02:33:33,317] [INFO] DFAST_QC result json was written to GCA_022773665.1_ASM2277366v1_genomic.fna/dqc_result.json
[2023-06-13 02:33:33,318] [INFO] DFAST_QC completed!
[2023-06-13 02:33:33,318] [INFO] Total running time: 0h0m55s