[2023-06-12 23:38:00,666] [INFO] DFAST_QC pipeline started.
[2023-06-12 23:38:00,668] [INFO] DFAST_QC version: 0.5.7
[2023-06-12 23:38:00,669] [INFO] DQC Reference Directory: /var/lib/cwl/stg25f090b4-bea0-40df-89ed-ced401988ad6/dqc_reference
[2023-06-12 23:38:02,923] [INFO] ===== Start taxonomy check using ANI =====
[2023-06-12 23:38:02,924] [INFO] Task started: Prodigal
[2023-06-12 23:38:02,925] [INFO] Running command: gunzip -c /var/lib/cwl/stgd0231eec-4fd4-490b-97ee-7be7fcbc1144/GCA_022773025.1_ASM2277302v1_genomic.fna.gz | prodigal -d GCA_022773025.1_ASM2277302v1_genomic.fna/cds.fna -a GCA_022773025.1_ASM2277302v1_genomic.fna/protein.faa -g 11 -q > /dev/null
[2023-06-12 23:38:07,927] [INFO] Task succeeded: Prodigal
[2023-06-12 23:38:07,927] [INFO] Task started: HMMsearch
[2023-06-12 23:38:07,927] [INFO] Running command: hmmsearch --tblout GCA_022773025.1_ASM2277302v1_genomic.fna/hmmer_result.tsv -E 1E-50 /var/lib/cwl/stg25f090b4-bea0-40df-89ed-ced401988ad6/dqc_reference/reference_markers.hmm GCA_022773025.1_ASM2277302v1_genomic.fna/protein.faa > /dev/null
[2023-06-12 23:38:08,238] [INFO] Task succeeded: HMMsearch
[2023-06-12 23:38:08,240] [INFO] Found 6/6 markers.
[2023-06-12 23:38:08,275] [INFO] Query marker FASTA was written to GCA_022773025.1_ASM2277302v1_genomic.fna/markers.fasta
[2023-06-12 23:38:08,275] [INFO] Task started: Blastn
[2023-06-12 23:38:08,275] [INFO] Running command: blastn -query GCA_022773025.1_ASM2277302v1_genomic.fna/markers.fasta -db /var/lib/cwl/stg25f090b4-bea0-40df-89ed-ced401988ad6/dqc_reference/reference_markers.fasta -out GCA_022773025.1_ASM2277302v1_genomic.fna/blast.markers.tsv -outfmt 6 -max_hsps 1 -num_alignments 5
[2023-06-12 23:38:08,865] [INFO] Task succeeded: Blastn
[2023-06-12 23:38:08,870] [INFO] Selected 30 target genomes.
[2023-06-12 23:38:08,870] [INFO] Target genome list was writen to GCA_022773025.1_ASM2277302v1_genomic.fna/target_genomes.txt
[2023-06-12 23:38:08,881] [INFO] Task started: fastANI
[2023-06-12 23:38:08,881] [INFO] Running command: fastANI --query /var/lib/cwl/stgd0231eec-4fd4-490b-97ee-7be7fcbc1144/GCA_022773025.1_ASM2277302v1_genomic.fna.gz --refList GCA_022773025.1_ASM2277302v1_genomic.fna/target_genomes.txt --output GCA_022773025.1_ASM2277302v1_genomic.fna/fastani_result.tsv --threads 1
[2023-06-12 23:38:27,702] [INFO] Task succeeded: fastANI
[2023-06-12 23:38:27,702] [INFO] Loading species specific ANI threshold from /var/lib/cwl/stg25f090b4-bea0-40df-89ed-ced401988ad6/dqc_reference/prokaryote_ANI_species_specific_threshold.txt
[2023-06-12 23:38:27,703] [WARNING] Species-specific ANI threshold file not found. Will use the default threshold for all species. [/var/lib/cwl/stg25f090b4-bea0-40df-89ed-ced401988ad6/dqc_reference/prokaryote_ANI_species_specific_threshold.txt]
[2023-06-12 23:38:27,722] [INFO] Found 25 fastANI hits (0 hits with ANI > threshold)
[2023-06-12 23:38:27,722] [INFO] The taxonomy check result is classified as 'below_threshold'.
[2023-06-12 23:38:27,722] [INFO] DFAST Taxonomy check final result
--------------------------------------------------------------------------------
organism_name	strain	accession	taxid	species_taxid	relation_to_type	validated	ani	matched_fragments	total_fragments	ani_threshold	status
Clostridium frigidicarnis	strain=DSM 12271	GCA_900111985.1	84698	84698	type	True	77.5262	151	997	95	below_threshold
Clostridium fallax	strain=NCTC8380	GCA_900461065.1	1533	1533	type	True	77.1438	127	997	95	below_threshold
Clostridium fallax	strain=DSM 2631	GCA_900129365.1	1533	1533	type	True	77.1387	125	997	95	below_threshold
Clostridium nigeriense	strain=Marseille-P2414	GCA_900086595.1	1805470	1805470	type	True	77.1227	125	997	95	below_threshold
Clostridium amylolyticum	strain=DSM 21864	GCA_900142075.1	1121298	1121298	type	True	77.0867	91	997	95	below_threshold
Clostridium simiarum	strain=MSJ-4	GCA_018919175.1	2841506	2841506	type	True	76.973	117	997	95	below_threshold
Clostridium hydrogeniformans	strain=DSM 21757	GCA_000686705.1	349933	349933	type	True	76.9443	159	997	95	below_threshold
Clostridium uliginosum	strain=DSM 12992	GCA_900112485.1	119641	119641	type	True	76.9156	103	997	95	below_threshold
Clostridium gallinarum	strain=Sa3CUN1	GCA_014836325.1	2762246	2762246	type	True	76.8686	112	997	95	below_threshold
Clostridium perfringens	strain=FDAARGOS_903	GCA_016027375.1	1502	1502	type	True	76.8255	99	997	95	below_threshold
Clostridium weizhouense	strain=YB-6	GCA_019431045.1	2859781	2859781	type	True	76.7866	103	997	95	below_threshold
Clostridium cibarium	strain=Sa3CVN1	GCA_014836335.1	2762247	2762247	type	True	76.6308	98	997	95	below_threshold
Clostridium massiliodielmoense	strain=MT26	GCA_900176615.1	1776385	1776385	type	True	76.5897	84	997	95	below_threshold
Clostridium tarantellae	strain=DSM 3997	GCA_009295725.1	39493	39493	type	True	76.5859	111	997	95	below_threshold
Clostridium aciditolerans	strain=DSM 17425	GCA_016316925.1	339861	339861	type	True	76.5753	87	997	95	below_threshold
Haloimpatiens massiliensis	strain=Mt13	GCA_900184255.1	1658110	1658110	type	True	76.4914	103	997	95	below_threshold
Clostridium beijerinckii	strain=DSM 791	GCA_002006445.1	1520	1520	suspected-type	True	76.3929	99	997	95	below_threshold
Clostridium beijerinckii	strain=DSM 791	GCA_018223745.1	1520	1520	suspected-type	True	76.377	104	997	95	below_threshold
Clostridium felsineum	strain=DSM 7320	GCA_002006215.2	36839	36839	type	True	76.3503	80	997	95	below_threshold
Clostridium gelidum	strain=C5S11	GCA_019977655.1	704125	704125	type	True	76.1637	109	997	95	below_threshold
Clostridium acidisoli	strain=DSM 12555	GCA_900176305.1	91624	91624	type	True	76.1624	65	997	95	below_threshold
Clostridium felsineum	strain=DSM 793	GCA_002006235.2	36839	36839	type	True	76.1038	90	997	95	below_threshold
Clostridium felsineum	strain=DSM 794	GCA_002006355.2	36839	36839	type	True	76.0815	89	997	95	below_threshold
Clostridium sporogenes	strain=DSM 795	GCA_001020205.1	1509	1509	type	True	76.0531	115	997	95	below_threshold
Clostridium sporogenes	strain=NCTC13020	GCA_900461305.1	1509	1509	type	True	76.0529	115	997	95	below_threshold
--------------------------------------------------------------------------------
[2023-06-12 23:38:27,724] [INFO] DFAST Taxonomy check result was written to GCA_022773025.1_ASM2277302v1_genomic.fna/tc_result.tsv
[2023-06-12 23:38:27,726] [INFO] ===== Taxonomy check completed =====
[2023-06-12 23:38:27,726] [INFO] ===== Start completeness check using CheckM =====
[2023-06-12 23:38:27,726] [INFO] Setting CHECKM_DATA_PATH to /var/lib/cwl/stg25f090b4-bea0-40df-89ed-ced401988ad6/dqc_reference/checkm_data
[2023-06-12 23:38:27,728] [INFO] Selected 'Prokaryote' markers (life, taxid=0) for CheckM
[2023-06-12 23:38:27,762] [INFO] Task started: CheckM
[2023-06-12 23:38:27,762] [INFO] Running command: checkm taxonomy_wf --tab_table -f GCA_022773025.1_ASM2277302v1_genomic.fna/cc_result.tsv -t 1 life "Prokaryote" GCA_022773025.1_ASM2277302v1_genomic.fna/checkm_input GCA_022773025.1_ASM2277302v1_genomic.fna/checkm_result
[2023-06-12 23:38:49,878] [INFO] Task succeeded: CheckM
[2023-06-12 23:38:49,881] [INFO] Completeness check finished.
--------------------------------------------------------------------------------
Completeness: 95.45%
Contamintation: 0.00%
Strain heterogeneity: 0.00%
--------------------------------------------------------------------------------
[2023-06-12 23:38:49,912] [INFO] ===== Completeness check finished =====
[2023-06-12 23:38:49,914] [INFO] ===== Start GTDB Search =====
[2023-06-12 23:38:49,914] [INFO] Query marker FASTA already exists. Will reuse it. (GCA_022773025.1_ASM2277302v1_genomic.fna/markers.fasta)
[2023-06-12 23:38:49,915] [INFO] Task started: Blastn
[2023-06-12 23:38:49,915] [INFO] Running command: blastn -query GCA_022773025.1_ASM2277302v1_genomic.fna/markers.fasta -db /var/lib/cwl/stg25f090b4-bea0-40df-89ed-ced401988ad6/dqc_reference/reference_markers_gtdb.fasta -out GCA_022773025.1_ASM2277302v1_genomic.fna/blast.markers.gtdb.tsv -outfmt 6 -max_hsps 1 -num_alignments 5
[2023-06-12 23:38:50,676] [INFO] Task succeeded: Blastn
[2023-06-12 23:38:50,682] [INFO] Selected 24 target genomes.
[2023-06-12 23:38:50,682] [INFO] Target genome list was writen to GCA_022773025.1_ASM2277302v1_genomic.fna/target_genomes_gtdb.txt
[2023-06-12 23:38:50,696] [INFO] Task started: fastANI
[2023-06-12 23:38:50,696] [INFO] Running command: fastANI --query /var/lib/cwl/stgd0231eec-4fd4-490b-97ee-7be7fcbc1144/GCA_022773025.1_ASM2277302v1_genomic.fna.gz --refList GCA_022773025.1_ASM2277302v1_genomic.fna/target_genomes_gtdb.txt --output GCA_022773025.1_ASM2277302v1_genomic.fna/fastani_result_gtdb.tsv --threads 1
[2023-06-12 23:39:05,623] [INFO] Task succeeded: fastANI
[2023-06-12 23:39:05,643] [INFO] Found 21 fastANI hits (0 hits with ANI > circumscription radius)
[2023-06-12 23:39:05,643] [INFO] GTDB search result
--------------------------------------------------------------------------------
accession	gtdb_species	ani	matched_fragments	total_fragments	gtdb_taxonomy	ani_circumscription_radius	mean_intra_species_ani	min_intra_species_ani	mean_intra_species_af	min_intra_species_af	num_clustered_genomes	status
GCF_000424205.1	s__Clostridium_X cadaveris	86.8897	802	997	d__Bacteria;p__Firmicutes_A;c__Clostridia;o__Clostridiales;f__Clostridiaceae;g__Clostridium_X	95.0	99.52	99.40	0.92	0.82	7	-
GCA_002423705.1	s__Clostridium_X sp002423705	78.3041	272	997	d__Bacteria;p__Firmicutes_A;c__Clostridia;o__Clostridiales;f__Clostridiaceae;g__Clostridium_X	95.0	N/A	N/A	N/A	N/A	1	-
GCF_900111985.1	s__Clostridium_Z frigidicarnis	77.5309	150	997	d__Bacteria;p__Firmicutes_A;c__Clostridia;o__Clostridiales;f__Clostridiaceae;g__Clostridium_Z	95.0	N/A	N/A	N/A	N/A	1	-
GCA_900539375.1	s__Clostridium sp900539375	77.2335	90	997	d__Bacteria;p__Firmicutes_A;c__Clostridia;o__Clostridiales;f__Clostridiaceae;g__Clostridium	95.0	99.09	99.02	0.89	0.86	4	-
GCF_900129365.1	s__Clostridium_AH fallax	77.1194	126	997	d__Bacteria;p__Firmicutes_A;c__Clostridia;o__Clostridiales;f__Clostridiaceae;g__Clostridium_AH	95.0	100.00	100.00	1.00	1.00	2	-
GCF_900086595.1	s__Clostridium nigeriense	77.0995	126	997	d__Bacteria;p__Firmicutes_A;c__Clostridia;o__Clostridiales;f__Clostridiaceae;g__Clostridium	95.0	98.98	98.98	0.91	0.91	2	-
GCF_004794105.1	s__Clostridium sartagoforme_B	76.9921	120	997	d__Bacteria;p__Firmicutes_A;c__Clostridia;o__Clostridiales;f__Clostridiaceae;g__Clostridium	95.0	98.58	98.58	0.90	0.90	2	-
GCF_018918055.1	s__Clostridium sp018918055	76.9491	79	997	d__Bacteria;p__Firmicutes_A;c__Clostridia;o__Clostridiales;f__Clostridiaceae;g__Clostridium	95.0	N/A	N/A	N/A	N/A	1	-
GCF_000686705.1	s__Clostridium_Z hydrogeniformans	76.9443	159	997	d__Bacteria;p__Firmicutes_A;c__Clostridia;o__Clostridiales;f__Clostridiaceae;g__Clostridium_Z	95.0	N/A	N/A	N/A	N/A	1	-
GCF_009928445.1	s__Clostridium_AR sp009928445	76.9061	101	997	d__Bacteria;p__Firmicutes_A;c__Clostridia;o__Clostridiales;f__Clostridiaceae;g__Clostridium_AR	95.0	N/A	N/A	N/A	N/A	1	-
GCA_000753435.1	s__Clostridium_L amazonitimonense	76.9046	120	997	d__Bacteria;p__Firmicutes_A;c__Clostridia;o__Clostridiales;f__Clostridiaceae;g__Clostridium_L	95.0	99.33	97.98	0.97	0.93	4	-
GCF_014836325.1	s__Clostridium sp014836325	76.8908	112	997	d__Bacteria;p__Firmicutes_A;c__Clostridia;o__Clostridiales;f__Clostridiaceae;g__Clostridium	95.0	N/A	N/A	N/A	N/A	1	-
GCF_009295725.1	s__Clostridium_P tarantellae	76.5859	111	997	d__Bacteria;p__Firmicutes_A;c__Clostridia;o__Clostridiales;f__Clostridiaceae;g__Clostridium_P	95.0	N/A	N/A	N/A	N/A	1	-
GCF_016316925.1	s__Clostridium_AM aciditolerans	76.5507	88	997	d__Bacteria;p__Firmicutes_A;c__Clostridia;o__Clostridiales;f__Clostridiaceae;g__Clostridium_AM	95.0	95.27	95.27	0.80	0.80	2	-
GCA_009738435.1	s__Clostridium_H bovifaecis	76.5347	69	997	d__Bacteria;p__Firmicutes_A;c__Clostridia;o__Clostridiales;f__Clostridiaceae;g__Clostridium_H	95.0	N/A	N/A	N/A	N/A	1	-
GCF_900184255.1	s__Haloimpatiens massiliensis	76.4914	103	997	d__Bacteria;p__Firmicutes_A;c__Clostridia;o__Clostridiales;f__Clostridiaceae;g__Haloimpatiens	95.0	N/A	N/A	N/A	N/A	1	-
GCF_018223745.1	s__Clostridium beijerinckii	76.377	104	997	d__Bacteria;p__Firmicutes_A;c__Clostridia;o__Clostridiales;f__Clostridiaceae;g__Clostridium	95.0	97.01	95.18	0.85	0.79	244	-
GCF_002008345.1	s__Clostridium_F tepidum	76.1998	117	997	d__Bacteria;p__Firmicutes_A;c__Clostridia;o__Clostridiales;f__Clostridiaceae;g__Clostridium_F	95.0	99.70	99.54	0.93	0.92	5	-
GCF_002006355.1	s__Clostridium_S felsineum	76.0698	87	997	d__Bacteria;p__Firmicutes_A;c__Clostridia;o__Clostridiales;f__Clostridiaceae;g__Clostridium_S	95.0	98.29	98.24	0.88	0.87	4	-
GCF_001276215.1	s__Clostridium_F sp001276215	76.047	98	997	d__Bacteria;p__Firmicutes_A;c__Clostridia;o__Clostridiales;f__Clostridiaceae;g__Clostridium_F	95.0	97.32	95.44	0.88	0.82	10	-
GCF_001276985.1	s__Clostridium_F botulinum	75.9885	111	997	d__Bacteria;p__Firmicutes_A;c__Clostridia;o__Clostridiales;f__Clostridiaceae;g__Clostridium_F	95.0	97.69	95.97	0.92	0.83	213	-
--------------------------------------------------------------------------------
[2023-06-12 23:39:05,645] [INFO] GTDB search result was written to GCA_022773025.1_ASM2277302v1_genomic.fna/result_gtdb.tsv
[2023-06-12 23:39:05,646] [INFO] ===== GTDB Search completed =====
[2023-06-12 23:39:05,651] [INFO] DFAST_QC result json was written to GCA_022773025.1_ASM2277302v1_genomic.fna/dqc_result.json
[2023-06-12 23:39:05,651] [INFO] DFAST_QC completed!
[2023-06-12 23:39:05,651] [INFO] Total running time: 0h1m5s