[2023-06-08 03:31:18,044] [INFO] DFAST_QC pipeline started.
[2023-06-08 03:31:18,046] [INFO] DFAST_QC version: 0.5.7
[2023-06-08 03:31:18,046] [INFO] DQC Reference Directory: /var/lib/cwl/stgf33909a2-8e12-4621-959d-9ac28e14987f/dqc_reference
[2023-06-08 03:31:19,256] [INFO] ===== Start taxonomy check using ANI =====
[2023-06-08 03:31:19,257] [INFO] Task started: Prodigal
[2023-06-08 03:31:19,257] [INFO] Running command: gunzip -c /var/lib/cwl/stg416ef35b-14df-477b-bf00-76776a7caf5a/GCA_947003205.1_SRR19715300_bin.8_metawrap_v1.3_MAG_genomic.fna.gz | prodigal -d GCA_947003205.1_SRR19715300_bin.8_metawrap_v1.3_MAG_genomic.fna/cds.fna -a GCA_947003205.1_SRR19715300_bin.8_metawrap_v1.3_MAG_genomic.fna/protein.faa -g 11 -q > /dev/null
[2023-06-08 03:31:27,513] [INFO] Task succeeded: Prodigal
[2023-06-08 03:31:27,514] [INFO] Task started: HMMsearch
[2023-06-08 03:31:27,514] [INFO] Running command: hmmsearch --tblout GCA_947003205.1_SRR19715300_bin.8_metawrap_v1.3_MAG_genomic.fna/hmmer_result.tsv -E 1E-50 /var/lib/cwl/stgf33909a2-8e12-4621-959d-9ac28e14987f/dqc_reference/reference_markers.hmm GCA_947003205.1_SRR19715300_bin.8_metawrap_v1.3_MAG_genomic.fna/protein.faa > /dev/null
[2023-06-08 03:31:27,791] [INFO] Task succeeded: HMMsearch
[2023-06-08 03:31:27,792] [INFO] Found 6/6 markers.
[2023-06-08 03:31:27,823] [INFO] Query marker FASTA was written to GCA_947003205.1_SRR19715300_bin.8_metawrap_v1.3_MAG_genomic.fna/markers.fasta
[2023-06-08 03:31:27,824] [INFO] Task started: Blastn
[2023-06-08 03:31:27,824] [INFO] Running command: blastn -query GCA_947003205.1_SRR19715300_bin.8_metawrap_v1.3_MAG_genomic.fna/markers.fasta -db /var/lib/cwl/stgf33909a2-8e12-4621-959d-9ac28e14987f/dqc_reference/reference_markers.fasta -out GCA_947003205.1_SRR19715300_bin.8_metawrap_v1.3_MAG_genomic.fna/blast.markers.tsv -outfmt 6 -max_hsps 1 -num_alignments 5
[2023-06-08 03:31:28,449] [INFO] Task succeeded: Blastn
[2023-06-08 03:31:28,454] [INFO] Selected 23 target genomes.
[2023-06-08 03:31:28,454] [INFO] Target genome list was writen to GCA_947003205.1_SRR19715300_bin.8_metawrap_v1.3_MAG_genomic.fna/target_genomes.txt
[2023-06-08 03:31:28,460] [INFO] Task started: fastANI
[2023-06-08 03:31:28,460] [INFO] Running command: fastANI --query /var/lib/cwl/stg416ef35b-14df-477b-bf00-76776a7caf5a/GCA_947003205.1_SRR19715300_bin.8_metawrap_v1.3_MAG_genomic.fna.gz --refList GCA_947003205.1_SRR19715300_bin.8_metawrap_v1.3_MAG_genomic.fna/target_genomes.txt --output GCA_947003205.1_SRR19715300_bin.8_metawrap_v1.3_MAG_genomic.fna/fastani_result.tsv --threads 1
[2023-06-08 03:31:43,699] [INFO] Task succeeded: fastANI
[2023-06-08 03:31:43,700] [INFO] Loading species specific ANI threshold from /var/lib/cwl/stgf33909a2-8e12-4621-959d-9ac28e14987f/dqc_reference/prokaryote_ANI_species_specific_threshold.txt
[2023-06-08 03:31:43,701] [WARNING] Species-specific ANI threshold file not found. Will use the default threshold for all species. [/var/lib/cwl/stgf33909a2-8e12-4621-959d-9ac28e14987f/dqc_reference/prokaryote_ANI_species_specific_threshold.txt]
[2023-06-08 03:31:43,717] [INFO] Found 13 fastANI hits (0 hits with ANI > threshold)
[2023-06-08 03:31:43,717] [INFO] The taxonomy check result is classified as 'below_threshold'.
[2023-06-08 03:31:43,717] [INFO] DFAST Taxonomy check final result
--------------------------------------------------------------------------------
organism_name	strain	accession	taxid	species_taxid	relation_to_type	validated	ani	matched_fragments	total_fragments	ani_threshold	status
Enterocloster bolteae	strain=ATCC BAA-613	GCA_002959675.1	208479	208479	type	True	76.6182	116	1135	95	below_threshold
Enterocloster bolteae	strain=ATCC BAA-613	GCA_002234575.2	208479	208479	type	True	76.5723	115	1135	95	below_threshold
Enterocloster bolteae	strain=ATCC BAA-613	GCA_000154365.1	208479	208479	type	True	76.5602	118	1135	95	below_threshold
Enterocloster clostridioformis	strain=ATCC 25537	GCA_900113155.1	1531	1531	type	True	76.5512	110	1135	95	below_threshold
Enterocloster clostridioformis	strain=NCTC11224	GCA_900447015.1	1531	1531	suspected-type	True	76.4151	119	1135	95	below_threshold
Clostridium porci	strain=WCA-389-WT-23D1	GCA_009696375.1	2605778	2605778	type	True	76.4104	73	1135	95	below_threshold
Hungatella hathewayi	strain=DSM 13479	GCA_000160095.1	154046	154046	suspected-type	True	76.4051	101	1135	95	below_threshold
Hungatella hathewayi	strain=DSM 13479	GCA_025149285.1	154046	154046	suspected-type	True	76.3831	107	1135	95	below_threshold
Hungatella effluvii	strain=DSM 24995	GCA_003201875.1	1096246	1096246	type	True	76.2601	103	1135	95	below_threshold
Enterocloster asparagiformis	strain=DSM 15981	GCA_025149125.1	333367	333367	type	True	76.1997	159	1135	95	below_threshold
Enterocloster asparagiformis	strain=DSM 15981	GCA_000158075.1	333367	333367	type	True	76.1781	158	1135	95	below_threshold
Lacrimispora sphenoides	strain=NCTC507	GCA_900461315.1	29370	29370	type	True	76.0232	70	1135	95	below_threshold
Lacrimispora celerecrescens	strain=18A	GCA_002797975.1	29354	29354	type	True	76.0029	73	1135	95	below_threshold
--------------------------------------------------------------------------------
[2023-06-08 03:31:43,719] [INFO] DFAST Taxonomy check result was written to GCA_947003205.1_SRR19715300_bin.8_metawrap_v1.3_MAG_genomic.fna/tc_result.tsv
[2023-06-08 03:31:43,719] [INFO] ===== Taxonomy check completed =====
[2023-06-08 03:31:43,719] [INFO] ===== Start completeness check using CheckM =====
[2023-06-08 03:31:43,720] [INFO] Setting CHECKM_DATA_PATH to /var/lib/cwl/stgf33909a2-8e12-4621-959d-9ac28e14987f/dqc_reference/checkm_data
[2023-06-08 03:31:43,720] [INFO] Selected 'Prokaryote' markers (life, taxid=0) for CheckM
[2023-06-08 03:31:43,756] [INFO] Task started: CheckM
[2023-06-08 03:31:43,756] [INFO] Running command: checkm taxonomy_wf --tab_table -f GCA_947003205.1_SRR19715300_bin.8_metawrap_v1.3_MAG_genomic.fna/cc_result.tsv -t 1 life "Prokaryote" GCA_947003205.1_SRR19715300_bin.8_metawrap_v1.3_MAG_genomic.fna/checkm_input GCA_947003205.1_SRR19715300_bin.8_metawrap_v1.3_MAG_genomic.fna/checkm_result
[2023-06-08 03:32:13,378] [INFO] Task succeeded: CheckM
[2023-06-08 03:32:13,380] [INFO] Completeness check finished.
--------------------------------------------------------------------------------
Completeness: 94.79%
Contamintation: 0.00%
Strain heterogeneity: 0.00%
--------------------------------------------------------------------------------
[2023-06-08 03:32:13,402] [INFO] ===== Completeness check finished =====
[2023-06-08 03:32:13,402] [INFO] ===== Start GTDB Search =====
[2023-06-08 03:32:13,403] [INFO] Query marker FASTA already exists. Will reuse it. (GCA_947003205.1_SRR19715300_bin.8_metawrap_v1.3_MAG_genomic.fna/markers.fasta)
[2023-06-08 03:32:13,403] [INFO] Task started: Blastn
[2023-06-08 03:32:13,403] [INFO] Running command: blastn -query GCA_947003205.1_SRR19715300_bin.8_metawrap_v1.3_MAG_genomic.fna/markers.fasta -db /var/lib/cwl/stgf33909a2-8e12-4621-959d-9ac28e14987f/dqc_reference/reference_markers_gtdb.fasta -out GCA_947003205.1_SRR19715300_bin.8_metawrap_v1.3_MAG_genomic.fna/blast.markers.gtdb.tsv -outfmt 6 -max_hsps 1 -num_alignments 5
[2023-06-08 03:32:14,469] [INFO] Task succeeded: Blastn
[2023-06-08 03:32:14,475] [INFO] Selected 22 target genomes.
[2023-06-08 03:32:14,475] [INFO] Target genome list was writen to GCA_947003205.1_SRR19715300_bin.8_metawrap_v1.3_MAG_genomic.fna/target_genomes_gtdb.txt
[2023-06-08 03:32:14,483] [INFO] Task started: fastANI
[2023-06-08 03:32:14,484] [INFO] Running command: fastANI --query /var/lib/cwl/stg416ef35b-14df-477b-bf00-76776a7caf5a/GCA_947003205.1_SRR19715300_bin.8_metawrap_v1.3_MAG_genomic.fna.gz --refList GCA_947003205.1_SRR19715300_bin.8_metawrap_v1.3_MAG_genomic.fna/target_genomes_gtdb.txt --output GCA_947003205.1_SRR19715300_bin.8_metawrap_v1.3_MAG_genomic.fna/fastani_result_gtdb.tsv --threads 1
[2023-06-08 03:32:29,138] [INFO] Task succeeded: fastANI
[2023-06-08 03:32:29,163] [INFO] Found 22 fastANI hits (1 hits with ANI > circumscription radius)
[2023-06-08 03:32:29,163] [INFO] GTDB search result
--------------------------------------------------------------------------------
accession	gtdb_species	ani	matched_fragments	total_fragments	gtdb_taxonomy	ani_circumscription_radius	mean_intra_species_ani	min_intra_species_ani	mean_intra_species_af	min_intra_species_af	num_clustered_genomes	status
GCA_910575565.1	s__Caccovicinus sp910575565	99.2482	1082	1135	d__Bacteria;p__Firmicutes_A;c__Clostridia;o__Lachnospirales;f__Lachnospiraceae;g__Caccovicinus	95.0	99.12	99.12	0.93	0.93	2	conclusive
GCA_910589195.1	s__Caccovicinus sp910589195	78.6433	393	1135	d__Bacteria;p__Firmicutes_A;c__Clostridia;o__Lachnospirales;f__Lachnospiraceae;g__Caccovicinus	95.0	N/A	N/A	N/A	N/A	1	-
GCA_009774615.1	s__Caccovicinus sp009774615	78.0816	287	1135	d__Bacteria;p__Firmicutes_A;c__Clostridia;o__Lachnospirales;f__Lachnospiraceae;g__Caccovicinus	95.0	99.13	99.13	0.87	0.87	2	-
GCA_910585635.1	s__Caccovicinus sp910585635	77.8438	289	1135	d__Bacteria;p__Firmicutes_A;c__Clostridia;o__Lachnospirales;f__Lachnospiraceae;g__Caccovicinus	95.0	N/A	N/A	N/A	N/A	1	-
GCA_017889125.1	s__Caccovicinus sp017889125	77.5934	246	1135	d__Bacteria;p__Firmicutes_A;c__Clostridia;o__Lachnospirales;f__Lachnospiraceae;g__Caccovicinus	95.0	N/A	N/A	N/A	N/A	1	-
GCA_910588895.1	s__Caccovicinus sp910588895	77.4226	249	1135	d__Bacteria;p__Firmicutes_A;c__Clostridia;o__Lachnospirales;f__Lachnospiraceae;g__Caccovicinus	95.0	N/A	N/A	N/A	N/A	1	-
GCA_018715905.1	s__Caccovicinus merdipullorum	77.4154	214	1135	d__Bacteria;p__Firmicutes_A;c__Clostridia;o__Lachnospirales;f__Lachnospiraceae;g__Caccovicinus	95.0	98.94	98.69	0.87	0.79	3	-
GCA_003611875.1	s__Ventrimonas sp003611875	77.3385	150	1135	d__Bacteria;p__Firmicutes_A;c__Clostridia;o__Lachnospirales;f__Lachnospiraceae;g__Ventrimonas	95.0	99.50	99.24	0.91	0.86	4	-
GCA_018713255.1	s__Caccovicinus excrementipullorum	77.3059	195	1135	d__Bacteria;p__Firmicutes_A;c__Clostridia;o__Lachnospirales;f__Lachnospiraceae;g__Caccovicinus	95.0	99.99	99.99	0.96	0.96	2	-
GCA_910586255.1	s__Ventrimonas sp910586255	76.7721	161	1135	d__Bacteria;p__Firmicutes_A;c__Clostridia;o__Lachnospirales;f__Lachnospiraceae;g__Ventrimonas	95.0	N/A	N/A	N/A	N/A	1	-
GCA_018380885.1	s__Enterocloster sp900555905	76.6347	148	1135	d__Bacteria;p__Firmicutes_A;c__Clostridia;o__Lachnospirales;f__Lachnospiraceae;g__Enterocloster	95.0	98.89	98.89	0.95	0.95	2	-
GCA_019114465.1	s__Lachnoclostridium_A avicola	76.5697	83	1135	d__Bacteria;p__Firmicutes_A;c__Clostridia;o__Lachnospirales;f__Lachnospiraceae;g__Lachnoclostridium_A	95.0	99.23	98.89	0.92	0.90	3	-
GCF_002160755.1	s__Lachnoclostridium_A sp002160755	76.5433	134	1135	d__Bacteria;p__Firmicutes_A;c__Clostridia;o__Lachnospirales;f__Lachnospiraceae;g__Lachnoclostridium_A	95.0	98.32	98.32	0.92	0.92	2	-
GCA_002160535.1	s__Clostridium_Q saccharolyticum_A	76.5387	81	1135	d__Bacteria;p__Firmicutes_A;c__Clostridia;o__Lachnospirales;f__Lachnospiraceae;g__Clostridium_Q	95.0	98.86	98.46	0.90	0.84	10	-
GCA_904395885.1	s__UMGS1370 sp904395885	76.4846	79	1135	d__Bacteria;p__Firmicutes_A;c__Clostridia;o__Lachnospirales;f__Lachnospiraceae;g__UMGS1370	95.0	99.37	99.37	0.90	0.90	2	-
GCA_018223415.1	s__Copromonas sp900541255	76.4422	74	1135	d__Bacteria;p__Firmicutes_A;c__Clostridia;o__Lachnospirales;f__Lachnospiraceae;g__Copromonas	95.0	99.81	99.68	0.96	0.93	4	-
GCA_001304875.1	s__Ventrisoma faecale	76.4164	89	1135	d__Bacteria;p__Firmicutes_A;c__Clostridia;o__Lachnospirales;f__Lachnospiraceae;g__Ventrisoma	95.0	98.23	97.74	0.91	0.86	4	-
GCA_905215775.1	s__Copromonas sp905215775	76.3577	82	1135	d__Bacteria;p__Firmicutes_A;c__Clostridia;o__Lachnospirales;f__Lachnospiraceae;g__Copromonas	95.0	N/A	N/A	N/A	N/A	1	-
GCF_003434055.1	s__Enterocloster aldenensis	76.2878	141	1135	d__Bacteria;p__Firmicutes_A;c__Clostridia;o__Lachnospirales;f__Lachnospiraceae;g__Enterocloster	95.0	99.16	97.51	0.89	0.78	13	-
GCA_019118585.1	s__Enterocloster faecavium	76.0855	97	1135	d__Bacteria;p__Firmicutes_A;c__Clostridia;o__Lachnospirales;f__Lachnospiraceae;g__Enterocloster	95.0	99.87	99.87	0.86	0.86	2	-
GCF_900155545.1	s__Lacrimispora sp900155545	76.0748	84	1135	d__Bacteria;p__Firmicutes_A;c__Clostridia;o__Lachnospirales;f__Lachnospiraceae;g__Lacrimispora	95.0	N/A	N/A	N/A	N/A	1	-
GCA_019119775.1	s__Lachnoclostridium_A pullistercoris	75.9549	92	1135	d__Bacteria;p__Firmicutes_A;c__Clostridia;o__Lachnospirales;f__Lachnospiraceae;g__Lachnoclostridium_A	95.0	99.96	99.96	0.95	0.95	2	-
--------------------------------------------------------------------------------
[2023-06-08 03:32:29,165] [INFO] GTDB search result was written to GCA_947003205.1_SRR19715300_bin.8_metawrap_v1.3_MAG_genomic.fna/result_gtdb.tsv
[2023-06-08 03:32:29,165] [INFO] ===== GTDB Search completed =====
[2023-06-08 03:32:29,169] [INFO] DFAST_QC result json was written to GCA_947003205.1_SRR19715300_bin.8_metawrap_v1.3_MAG_genomic.fna/dqc_result.json
[2023-06-08 03:32:29,169] [INFO] DFAST_QC completed!
[2023-06-08 03:32:29,169] [INFO] Total running time: 0h1m11s