[2024-01-24 11:30:49,466] [INFO] DFAST_QC pipeline started.
[2024-01-24 11:30:49,468] [INFO] DFAST_QC version: 0.5.7
[2024-01-24 11:30:49,469] [INFO] DQC Reference Directory: /var/lib/cwl/stgcfaef0c5-b17b-4e38-9bad-cb70a0e07e53/dqc_reference
[2024-01-24 11:30:50,694] [INFO] ===== Start taxonomy check using ANI =====
[2024-01-24 11:30:50,694] [INFO] Task started: Prodigal
[2024-01-24 11:30:50,695] [INFO] Running command: gunzip -c /var/lib/cwl/stg55f1288a-7493-4e52-baf0-aed3e94d720f/GCF_012396255.1_ASM1239625v1_genomic.fna.gz | prodigal -d GCF_012396255.1_ASM1239625v1_genomic.fna/cds.fna -a GCF_012396255.1_ASM1239625v1_genomic.fna/protein.faa -g 11 -q > /dev/null
[2024-01-24 11:31:06,445] [INFO] Task succeeded: Prodigal
[2024-01-24 11:31:06,445] [INFO] Task started: HMMsearch
[2024-01-24 11:31:06,445] [INFO] Running command: hmmsearch --tblout GCF_012396255.1_ASM1239625v1_genomic.fna/hmmer_result.tsv -E 1E-50 /var/lib/cwl/stgcfaef0c5-b17b-4e38-9bad-cb70a0e07e53/dqc_reference/reference_markers.hmm GCF_012396255.1_ASM1239625v1_genomic.fna/protein.faa > /dev/null
[2024-01-24 11:31:06,725] [INFO] Task succeeded: HMMsearch
[2024-01-24 11:31:06,727] [INFO] Found 6/6 markers.
[2024-01-24 11:31:06,771] [INFO] Query marker FASTA was written to GCF_012396255.1_ASM1239625v1_genomic.fna/markers.fasta
[2024-01-24 11:31:06,771] [INFO] Task started: Blastn
[2024-01-24 11:31:06,771] [INFO] Running command: blastn -query GCF_012396255.1_ASM1239625v1_genomic.fna/markers.fasta -db /var/lib/cwl/stgcfaef0c5-b17b-4e38-9bad-cb70a0e07e53/dqc_reference/reference_markers.fasta -out GCF_012396255.1_ASM1239625v1_genomic.fna/blast.markers.tsv -outfmt 6 -max_hsps 1 -num_alignments 5
[2024-01-24 11:31:07,966] [INFO] Task succeeded: Blastn
[2024-01-24 11:31:07,971] [INFO] Selected 12 target genomes.
[2024-01-24 11:31:07,971] [INFO] Target genome list was writen to GCF_012396255.1_ASM1239625v1_genomic.fna/target_genomes.txt
[2024-01-24 11:31:08,014] [INFO] Task started: fastANI
[2024-01-24 11:31:08,015] [INFO] Running command: fastANI --query /var/lib/cwl/stg55f1288a-7493-4e52-baf0-aed3e94d720f/GCF_012396255.1_ASM1239625v1_genomic.fna.gz --refList GCF_012396255.1_ASM1239625v1_genomic.fna/target_genomes.txt --output GCF_012396255.1_ASM1239625v1_genomic.fna/fastani_result.tsv --threads 1
[2024-01-24 11:31:21,442] [INFO] Task succeeded: fastANI
[2024-01-24 11:31:21,443] [INFO] Loading species specific ANI threshold from /var/lib/cwl/stgcfaef0c5-b17b-4e38-9bad-cb70a0e07e53/dqc_reference/prokaryote_ANI_species_specific_threshold.txt
[2024-01-24 11:31:21,444] [WARNING] Species-specific ANI threshold file not found. Will use the default threshold for all species. [/var/lib/cwl/stgcfaef0c5-b17b-4e38-9bad-cb70a0e07e53/dqc_reference/prokaryote_ANI_species_specific_threshold.txt]
[2024-01-24 11:31:21,456] [INFO] Found 12 fastANI hits (2 hits with ANI > threshold)
[2024-01-24 11:31:21,456] [INFO] The taxonomy check result is classified as 'conclusive'.
[2024-01-24 11:31:21,456] [INFO] DFAST Taxonomy check final result
--------------------------------------------------------------------------------
organism_name	strain	accession	taxid	species_taxid	relation_to_type	validated	ani	matched_fragments	total_fragments	ani_threshold	status
Gordonia sputi	strain=ATCC 29627	GCA_012396255.1	36823	36823	type	True	100.0	1623	1625	95	conclusive
Gordonia sputi	strain=NBRC 100414	GCA_000248055.2	36823	36823	type	True	99.9732	1563	1625	95	conclusive
Gordonia aichiensis	strain=NBRC 108223	GCA_000332975.1	36820	36820	type	True	84.672	1244	1625	95	below_threshold
Gordonia otitidis	strain=NBRC 100426	GCA_000248075.2	249058	249058	type	True	83.9314	1157	1625	95	below_threshold
Gordonia rubripertincta	strain=ATCC 14352	GCA_012396225.1	36822	36822	type	True	79.5202	687	1625	95	below_threshold
Gordonia rubripertincta	strain=NBRC 101908	GCA_000327325.1	36822	36822	type	True	79.4972	671	1625	95	below_threshold
Gordonia namibiensis	strain=NBRC 108229	GCA_000298235.1	168480	168480	type	True	79.4862	676	1625	95	below_threshold
Gordonia polyisoprenivorans	strain=ATCC BAA-14	GCA_012396285.1	84595	84595	type	True	79.4571	810	1625	95	below_threshold
Gordonia lacunae	strain=BS2	GCA_002149015.1	417102	417102	type	True	79.2117	677	1625	95	below_threshold
Gordonia rhizosphera	strain=NBRC 16068	GCA_000298195.1	83341	83341	type	True	79.0903	642	1625	95	below_threshold
Gordonia insulae	strain=MMS17-SY073	GCA_003855095.1	2420509	2420509	type	True	79.0569	723	1625	95	below_threshold
Gordonia hankookensis	strain=ON-33	GCA_014673215.1	589403	589403	type	True	78.994	673	1625	95	below_threshold
--------------------------------------------------------------------------------
[2024-01-24 11:31:21,458] [INFO] DFAST Taxonomy check result was written to GCF_012396255.1_ASM1239625v1_genomic.fna/tc_result.tsv
[2024-01-24 11:31:21,458] [INFO] ===== Taxonomy check completed =====
[2024-01-24 11:31:21,458] [INFO] ===== Start completeness check using CheckM =====
[2024-01-24 11:31:21,459] [INFO] Setting CHECKM_DATA_PATH to /var/lib/cwl/stgcfaef0c5-b17b-4e38-9bad-cb70a0e07e53/dqc_reference/checkm_data
[2024-01-24 11:31:21,459] [INFO] Selected 'Prokaryote' markers (life, taxid=0) for CheckM
[2024-01-24 11:31:21,505] [INFO] Task started: CheckM
[2024-01-24 11:31:21,505] [INFO] Running command: checkm taxonomy_wf --tab_table -f GCF_012396255.1_ASM1239625v1_genomic.fna/cc_result.tsv -t 1 life "Prokaryote" GCF_012396255.1_ASM1239625v1_genomic.fna/checkm_input GCF_012396255.1_ASM1239625v1_genomic.fna/checkm_result
[2024-01-24 11:32:10,650] [INFO] Task succeeded: CheckM
[2024-01-24 11:32:10,652] [INFO] Completeness check finished.
--------------------------------------------------------------------------------
Completeness: 100.00%
Contamintation: 0.00%
Strain heterogeneity: 0.00%
--------------------------------------------------------------------------------
[2024-01-24 11:32:10,674] [INFO] ===== Completeness check finished =====
[2024-01-24 11:32:10,674] [INFO] ===== Start GTDB Search =====
[2024-01-24 11:32:10,674] [INFO] Query marker FASTA already exists. Will reuse it. (GCF_012396255.1_ASM1239625v1_genomic.fna/markers.fasta)
[2024-01-24 11:32:10,675] [INFO] Task started: Blastn
[2024-01-24 11:32:10,675] [INFO] Running command: blastn -query GCF_012396255.1_ASM1239625v1_genomic.fna/markers.fasta -db /var/lib/cwl/stgcfaef0c5-b17b-4e38-9bad-cb70a0e07e53/dqc_reference/reference_markers_gtdb.fasta -out GCF_012396255.1_ASM1239625v1_genomic.fna/blast.markers.gtdb.tsv -outfmt 6 -max_hsps 1 -num_alignments 5
[2024-01-24 11:32:12,488] [INFO] Task succeeded: Blastn
[2024-01-24 11:32:12,493] [INFO] Selected 11 target genomes.
[2024-01-24 11:32:12,493] [INFO] Target genome list was writen to GCF_012396255.1_ASM1239625v1_genomic.fna/target_genomes_gtdb.txt
[2024-01-24 11:32:13,053] [INFO] Task started: fastANI
[2024-01-24 11:32:13,054] [INFO] Running command: fastANI --query /var/lib/cwl/stg55f1288a-7493-4e52-baf0-aed3e94d720f/GCF_012396255.1_ASM1239625v1_genomic.fna.gz --refList GCF_012396255.1_ASM1239625v1_genomic.fna/target_genomes_gtdb.txt --output GCF_012396255.1_ASM1239625v1_genomic.fna/fastani_result_gtdb.tsv --threads 1
[2024-01-24 11:32:25,386] [INFO] Task succeeded: fastANI
[2024-01-24 11:32:25,399] [INFO] Found 11 fastANI hits (1 hits with ANI > circumscription radius)
[2024-01-24 11:32:25,400] [INFO] GTDB search result
--------------------------------------------------------------------------------
accession	gtdb_species	ani	matched_fragments	total_fragments	gtdb_taxonomy	ani_circumscription_radius	mean_intra_species_ani	min_intra_species_ani	mean_intra_species_af	min_intra_species_af	num_clustered_genomes	status
GCF_000248055.1	s__Gordonia sputi	99.9732	1563	1625	d__Bacteria;p__Actinobacteriota;c__Actinomycetia;o__Mycobacteriales;f__Mycobacteriaceae;g__Gordonia	95.0	97.92	96.52	0.90	0.84	5	conclusive
GCF_001186365.1	s__Gordonia jacobaea	92.6405	1439	1625	d__Bacteria;p__Actinobacteriota;c__Actinomycetia;o__Mycobacteriales;f__Mycobacteriaceae;g__Gordonia	95.0	97.30	95.68	0.92	0.90	5	-
GCF_000332975.1	s__Gordonia aichiensis	84.672	1244	1625	d__Bacteria;p__Actinobacteriota;c__Actinomycetia;o__Mycobacteriales;f__Mycobacteriaceae;g__Gordonia	95.0	N/A	N/A	N/A	N/A	1	-
GCF_000248075.1	s__Gordonia otitidis	83.9295	1157	1625	d__Bacteria;p__Actinobacteriota;c__Actinomycetia;o__Mycobacteriales;f__Mycobacteriaceae;g__Gordonia	95.0	N/A	N/A	N/A	N/A	1	-
GCF_000241325.1	s__Gordonia polyisoprenivorans	79.5152	796	1625	d__Bacteria;p__Actinobacteriota;c__Actinomycetia;o__Mycobacteriales;f__Mycobacteriaceae;g__Gordonia	95.0	98.60	97.62	0.90	0.79	11	-
GCF_000298235.1	s__Gordonia namibiensis	79.4867	676	1625	d__Bacteria;p__Actinobacteriota;c__Actinomycetia;o__Mycobacteriales;f__Mycobacteriaceae;g__Gordonia	95.0	97.13	97.13	0.90	0.90	2	-
GCF_000327325.1	s__Gordonia rubripertincta	79.4867	672	1625	d__Bacteria;p__Actinobacteriota;c__Actinomycetia;o__Mycobacteriales;f__Mycobacteriaceae;g__Gordonia	95.0	98.55	98.07	0.91	0.87	6	-
GCF_000298195.1	s__Gordonia rhizosphera	79.0909	642	1625	d__Bacteria;p__Actinobacteriota;c__Actinomycetia;o__Mycobacteriales;f__Mycobacteriaceae;g__Gordonia	95.0	N/A	N/A	N/A	N/A	1	-
GCF_003855095.1	s__Gordonia insulae	79.0507	723	1625	d__Bacteria;p__Actinobacteriota;c__Actinomycetia;o__Mycobacteriales;f__Mycobacteriaceae;g__Gordonia	95.0	N/A	N/A	N/A	N/A	1	-
GCA_002700145.1	s__Gordonia sp002700145	79.019	642	1625	d__Bacteria;p__Actinobacteriota;c__Actinomycetia;o__Mycobacteriales;f__Mycobacteriaceae;g__Gordonia	95.0	N/A	N/A	N/A	N/A	1	-
GCF_014673215.1	s__Gordonia hankookensis	79.0071	671	1625	d__Bacteria;p__Actinobacteriota;c__Actinomycetia;o__Mycobacteriales;f__Mycobacteriaceae;g__Gordonia	95.0	98.33	98.32	0.95	0.94	3	-
--------------------------------------------------------------------------------
[2024-01-24 11:32:25,401] [INFO] GTDB search result was written to GCF_012396255.1_ASM1239625v1_genomic.fna/result_gtdb.tsv
[2024-01-24 11:32:25,402] [INFO] ===== GTDB Search completed =====
[2024-01-24 11:32:25,406] [INFO] DFAST_QC result json was written to GCF_012396255.1_ASM1239625v1_genomic.fna/dqc_result.json
[2024-01-24 11:32:25,406] [INFO] DFAST_QC completed!
[2024-01-24 11:32:25,406] [INFO] Total running time: 0h1m36s