[2023-06-19 02:25:10,971] [INFO] DFAST_QC pipeline started.
[2023-06-19 02:25:10,973] [INFO] DFAST_QC version: 0.5.7
[2023-06-19 02:25:10,973] [INFO] DQC Reference Directory: /var/lib/cwl/stg7f6da5eb-9df5-4b7e-9091-1d2b76f4cd07/dqc_reference
[2023-06-19 02:25:12,146] [INFO] ===== Start taxonomy check using ANI =====
[2023-06-19 02:25:12,146] [INFO] Task started: Prodigal
[2023-06-19 02:25:12,147] [INFO] Running command: gunzip -c /var/lib/cwl/stg3c22a65d-4c74-4e8e-bb25-f94a988e9599/GCA_018402585.1_ASM1840258v1_genomic.fna.gz | prodigal -d GCA_018402585.1_ASM1840258v1_genomic.fna/cds.fna -a GCA_018402585.1_ASM1840258v1_genomic.fna/protein.faa -g 11 -q > /dev/null
[2023-06-19 02:25:28,824] [INFO] Task succeeded: Prodigal
[2023-06-19 02:25:28,824] [INFO] Task started: HMMsearch
[2023-06-19 02:25:28,825] [INFO] Running command: hmmsearch --tblout GCA_018402585.1_ASM1840258v1_genomic.fna/hmmer_result.tsv -E 1E-50 /var/lib/cwl/stg7f6da5eb-9df5-4b7e-9091-1d2b76f4cd07/dqc_reference/reference_markers.hmm GCA_018402585.1_ASM1840258v1_genomic.fna/protein.faa > /dev/null
[2023-06-19 02:25:29,074] [INFO] Task succeeded: HMMsearch
[2023-06-19 02:25:29,076] [INFO] Found 6/6 markers.
[2023-06-19 02:25:29,114] [INFO] Query marker FASTA was written to GCA_018402585.1_ASM1840258v1_genomic.fna/markers.fasta
[2023-06-19 02:25:29,115] [INFO] Task started: Blastn
[2023-06-19 02:25:29,115] [INFO] Running command: blastn -query GCA_018402585.1_ASM1840258v1_genomic.fna/markers.fasta -db /var/lib/cwl/stg7f6da5eb-9df5-4b7e-9091-1d2b76f4cd07/dqc_reference/reference_markers.fasta -out GCA_018402585.1_ASM1840258v1_genomic.fna/blast.markers.tsv -outfmt 6 -max_hsps 1 -num_alignments 5
[2023-06-19 02:25:29,759] [INFO] Task succeeded: Blastn
[2023-06-19 02:25:29,763] [INFO] Selected 15 target genomes.
[2023-06-19 02:25:29,764] [INFO] Target genome list was writen to GCA_018402585.1_ASM1840258v1_genomic.fna/target_genomes.txt
[2023-06-19 02:25:29,767] [INFO] Task started: fastANI
[2023-06-19 02:25:29,767] [INFO] Running command: fastANI --query /var/lib/cwl/stg3c22a65d-4c74-4e8e-bb25-f94a988e9599/GCA_018402585.1_ASM1840258v1_genomic.fna.gz --refList GCA_018402585.1_ASM1840258v1_genomic.fna/target_genomes.txt --output GCA_018402585.1_ASM1840258v1_genomic.fna/fastani_result.tsv --threads 1
[2023-06-19 02:25:42,806] [INFO] Task succeeded: fastANI
[2023-06-19 02:25:42,807] [INFO] Loading species specific ANI threshold from /var/lib/cwl/stg7f6da5eb-9df5-4b7e-9091-1d2b76f4cd07/dqc_reference/prokaryote_ANI_species_specific_threshold.txt
[2023-06-19 02:25:42,807] [WARNING] Species-specific ANI threshold file not found. Will use the default threshold for all species. [/var/lib/cwl/stg7f6da5eb-9df5-4b7e-9091-1d2b76f4cd07/dqc_reference/prokaryote_ANI_species_specific_threshold.txt]
[2023-06-19 02:25:42,816] [INFO] Found 3 fastANI hits (0 hits with ANI > threshold)
[2023-06-19 02:25:42,817] [INFO] The taxonomy check result is classified as 'below_threshold'.
[2023-06-19 02:25:42,817] [INFO] DFAST Taxonomy check final result
--------------------------------------------------------------------------------
organism_name	strain	accession	taxid	species_taxid	relation_to_type	validated	ani	matched_fragments	total_fragments	ani_threshold	status
Chthoniobacter flavus	strain=Ellin428	GCA_000173075.1	191863	191863	type	True	76.5287	64	803	95	below_threshold
Terrimicrobium sacchariphilum	strain=NM-5	GCA_001613545.1	690879	690879	type	True	76.4237	62	803	95	below_threshold
Chthoniobacter flavus	strain=DSM 22515	GCA_004341915.1	191863	191863	type	True	76.209	63	803	95	below_threshold
--------------------------------------------------------------------------------
[2023-06-19 02:25:42,848] [INFO] DFAST Taxonomy check result was written to GCA_018402585.1_ASM1840258v1_genomic.fna/tc_result.tsv
[2023-06-19 02:25:42,849] [INFO] ===== Taxonomy check completed =====
[2023-06-19 02:25:42,849] [INFO] ===== Start completeness check using CheckM =====
[2023-06-19 02:25:42,849] [INFO] Setting CHECKM_DATA_PATH to /var/lib/cwl/stg7f6da5eb-9df5-4b7e-9091-1d2b76f4cd07/dqc_reference/checkm_data
[2023-06-19 02:25:42,852] [INFO] Selected 'Prokaryote' markers (life, taxid=0) for CheckM
[2023-06-19 02:25:42,880] [INFO] Task started: CheckM
[2023-06-19 02:25:42,881] [INFO] Running command: checkm taxonomy_wf --tab_table -f GCA_018402585.1_ASM1840258v1_genomic.fna/cc_result.tsv -t 1 life "Prokaryote" GCA_018402585.1_ASM1840258v1_genomic.fna/checkm_input GCA_018402585.1_ASM1840258v1_genomic.fna/checkm_result
[2023-06-19 02:26:30,806] [INFO] Task succeeded: CheckM
[2023-06-19 02:26:30,808] [INFO] Completeness check finished.
--------------------------------------------------------------------------------
Completeness: 91.67%
Contamintation: 0.00%
Strain heterogeneity: 0.00%
--------------------------------------------------------------------------------
[2023-06-19 02:26:30,828] [INFO] ===== Completeness check finished =====
[2023-06-19 02:26:30,828] [INFO] ===== Start GTDB Search =====
[2023-06-19 02:26:30,829] [INFO] Query marker FASTA already exists. Will reuse it. (GCA_018402585.1_ASM1840258v1_genomic.fna/markers.fasta)
[2023-06-19 02:26:30,829] [INFO] Task started: Blastn
[2023-06-19 02:26:30,829] [INFO] Running command: blastn -query GCA_018402585.1_ASM1840258v1_genomic.fna/markers.fasta -db /var/lib/cwl/stg7f6da5eb-9df5-4b7e-9091-1d2b76f4cd07/dqc_reference/reference_markers_gtdb.fasta -out GCA_018402585.1_ASM1840258v1_genomic.fna/blast.markers.gtdb.tsv -outfmt 6 -max_hsps 1 -num_alignments 5
[2023-06-19 02:26:31,657] [INFO] Task succeeded: Blastn
[2023-06-19 02:26:31,661] [INFO] Selected 10 target genomes.
[2023-06-19 02:26:31,662] [INFO] Target genome list was writen to GCA_018402585.1_ASM1840258v1_genomic.fna/target_genomes_gtdb.txt
[2023-06-19 02:26:31,666] [INFO] Task started: fastANI
[2023-06-19 02:26:31,666] [INFO] Running command: fastANI --query /var/lib/cwl/stg3c22a65d-4c74-4e8e-bb25-f94a988e9599/GCA_018402585.1_ASM1840258v1_genomic.fna.gz --refList GCA_018402585.1_ASM1840258v1_genomic.fna/target_genomes_gtdb.txt --output GCA_018402585.1_ASM1840258v1_genomic.fna/fastani_result_gtdb.tsv --threads 1
[2023-06-19 02:26:38,721] [INFO] Task succeeded: fastANI
[2023-06-19 02:26:38,732] [INFO] Found 7 fastANI hits (1 hits with ANI > circumscription radius)
[2023-06-19 02:26:38,732] [INFO] GTDB search result
--------------------------------------------------------------------------------
accession	gtdb_species	ani	matched_fragments	total_fragments	gtdb_taxonomy	ani_circumscription_radius	mean_intra_species_ani	min_intra_species_ani	mean_intra_species_af	min_intra_species_af	num_clustered_genomes	status
GCA_018402585.1	s__JACTMZ01 sp018402585	100.0	802	803	d__Bacteria;p__Verrucomicrobiota;c__Verrucomicrobiae;o__Chthoniobacterales;f__JACTMZ01;g__JACTMZ01	95.0	N/A	N/A	N/A	N/A	1	conclusive
GCA_903934135.1	s__JACTMZ01 sp903934135	86.8802	499	803	d__Bacteria;p__Verrucomicrobiota;c__Verrucomicrobiae;o__Chthoniobacterales;f__JACTMZ01;g__JACTMZ01	95.0	99.34	99.34	0.86	0.86	2	-
GCA_903953565.1	s__JACTMZ01 sp903953565	80.383	428	803	d__Bacteria;p__Verrucomicrobiota;c__Verrucomicrobiae;o__Chthoniobacterales;f__JACTMZ01;g__JACTMZ01	95.0	99.90	99.90	0.96	0.96	2	-
GCA_017853515.1	s__JACTMZ01 sp017853515	79.2769	266	803	d__Bacteria;p__Verrucomicrobiota;c__Verrucomicrobiae;o__Chthoniobacterales;f__JACTMZ01;g__JACTMZ01	95.0	N/A	N/A	N/A	N/A	1	-
GCA_018883245.1	s__JACTMZ01 sp018883245	78.4424	259	803	d__Bacteria;p__Verrucomicrobiota;c__Verrucomicrobiae;o__Chthoniobacterales;f__JACTMZ01;g__JACTMZ01	95.0	N/A	N/A	N/A	N/A	1	-
GCA_014879535.1	s__JACTMZ01 sp014879535	78.2018	265	803	d__Bacteria;p__Verrucomicrobiota;c__Verrucomicrobiae;o__Chthoniobacterales;f__JACTMZ01;g__JACTMZ01	95.0	N/A	N/A	N/A	N/A	1	-
GCA_018402345.1	s__UBA967 sp018402345	77.1845	70	803	d__Bacteria;p__Verrucomicrobiota;c__Verrucomicrobiae;o__Chthoniobacterales;f__Terrimicrobiaceae;g__UBA967	95.0	N/A	N/A	N/A	N/A	1	-
--------------------------------------------------------------------------------
[2023-06-19 02:26:38,734] [INFO] GTDB search result was written to GCA_018402585.1_ASM1840258v1_genomic.fna/result_gtdb.tsv
[2023-06-19 02:26:38,735] [INFO] ===== GTDB Search completed =====
[2023-06-19 02:26:38,738] [INFO] DFAST_QC result json was written to GCA_018402585.1_ASM1840258v1_genomic.fna/dqc_result.json
[2023-06-19 02:26:38,738] [INFO] DFAST_QC completed!
[2023-06-19 02:26:38,738] [INFO] Total running time: 0h1m28s