[2024-01-24 12:31:51,253] [INFO] DFAST_QC pipeline started.
[2024-01-24 12:31:51,255] [INFO] DFAST_QC version: 0.5.7
[2024-01-24 12:31:51,255] [INFO] DQC Reference Directory: /var/lib/cwl/stge1187590-08ea-498d-b532-8aef63b30170/dqc_reference
[2024-01-24 12:31:52,512] [INFO] ===== Start taxonomy check using ANI =====
[2024-01-24 12:31:52,513] [INFO] Task started: Prodigal
[2024-01-24 12:31:52,513] [INFO] Running command: gunzip -c /var/lib/cwl/stg1bca2383-4dc8-4bc1-8f53-69a3a9083c53/GCF_000334455.1_ASM33445v1_genomic.fna.gz | prodigal -d GCF_000334455.1_ASM33445v1_genomic.fna/cds.fna -a GCF_000334455.1_ASM33445v1_genomic.fna/protein.faa -g 11 -q > /dev/null
[2024-01-24 12:32:09,714] [INFO] Task succeeded: Prodigal
[2024-01-24 12:32:09,714] [INFO] Task started: HMMsearch
[2024-01-24 12:32:09,714] [INFO] Running command: hmmsearch --tblout GCF_000334455.1_ASM33445v1_genomic.fna/hmmer_result.tsv -E 1E-50 /var/lib/cwl/stge1187590-08ea-498d-b532-8aef63b30170/dqc_reference/reference_markers.hmm GCF_000334455.1_ASM33445v1_genomic.fna/protein.faa > /dev/null
[2024-01-24 12:32:10,016] [INFO] Task succeeded: HMMsearch
[2024-01-24 12:32:10,017] [INFO] Found 6/6 markers.
[2024-01-24 12:32:10,066] [INFO] Query marker FASTA was written to GCF_000334455.1_ASM33445v1_genomic.fna/markers.fasta
[2024-01-24 12:32:10,067] [INFO] Task started: Blastn
[2024-01-24 12:32:10,067] [INFO] Running command: blastn -query GCF_000334455.1_ASM33445v1_genomic.fna/markers.fasta -db /var/lib/cwl/stge1187590-08ea-498d-b532-8aef63b30170/dqc_reference/reference_markers.fasta -out GCF_000334455.1_ASM33445v1_genomic.fna/blast.markers.tsv -outfmt 6 -max_hsps 1 -num_alignments 5
[2024-01-24 12:32:11,269] [INFO] Task succeeded: Blastn
[2024-01-24 12:32:11,275] [INFO] Selected 18 target genomes.
[2024-01-24 12:32:11,275] [INFO] Target genome list was writen to GCF_000334455.1_ASM33445v1_genomic.fna/target_genomes.txt
[2024-01-24 12:32:11,283] [INFO] Task started: fastANI
[2024-01-24 12:32:11,283] [INFO] Running command: fastANI --query /var/lib/cwl/stg1bca2383-4dc8-4bc1-8f53-69a3a9083c53/GCF_000334455.1_ASM33445v1_genomic.fna.gz --refList GCF_000334455.1_ASM33445v1_genomic.fna/target_genomes.txt --output GCF_000334455.1_ASM33445v1_genomic.fna/fastani_result.tsv --threads 1
[2024-01-24 12:32:28,567] [INFO] Task succeeded: fastANI
[2024-01-24 12:32:28,567] [INFO] Loading species specific ANI threshold from /var/lib/cwl/stge1187590-08ea-498d-b532-8aef63b30170/dqc_reference/prokaryote_ANI_species_specific_threshold.txt
[2024-01-24 12:32:28,568] [WARNING] Species-specific ANI threshold file not found. Will use the default threshold for all species. [/var/lib/cwl/stge1187590-08ea-498d-b532-8aef63b30170/dqc_reference/prokaryote_ANI_species_specific_threshold.txt]
[2024-01-24 12:32:28,581] [INFO] Found 18 fastANI hits (1 hits with ANI > threshold)
[2024-01-24 12:32:28,581] [INFO] The taxonomy check result is classified as 'conclusive'.
[2024-01-24 12:32:28,581] [INFO] DFAST Taxonomy check final result
--------------------------------------------------------------------------------
organism_name	strain	accession	taxid	species_taxid	relation_to_type	validated	ani	matched_fragments	total_fragments	ani_threshold	status
Gordonia soli	strain=NBRC 108243	GCA_000334455.1	320799	320799	type	True	100.0	1760	1762	95	conclusive
Gordonia rubripertincta	strain=NBRC 101908	GCA_000327325.1	36822	36822	type	True	79.9978	736	1762	95	below_threshold
Gordonia rubripertincta	strain=ATCC 14352	GCA_012396225.1	36822	36822	type	True	79.939	762	1762	95	below_threshold
Gordonia namibiensis	strain=NBRC 108229	GCA_000298235.1	168480	168480	type	True	79.9303	762	1762	95	below_threshold
Gordonia amicalis	strain=NBRC 100051	GCA_000332995.1	89053	89053	type	True	79.9131	728	1762	95	below_threshold
Gordonia amicalis	strain=DSM 44461	GCA_012395955.1	89053	89053	type	True	79.9097	738	1762	95	below_threshold
Gordonia insulae	strain=MMS17-SY073	GCA_003855095.1	2420509	2420509	type	True	79.7932	846	1762	95	below_threshold
Gordonia desulfuricans	strain=NBRC 100010	GCA_001485495.1	89051	89051	type	True	79.754	739	1762	95	below_threshold
Gordonia terrae	strain=3612	GCA_001698225.1	2055	2055	suspected-type	True	79.7225	796	1762	95	below_threshold
Gordonia terrae	strain=NRRL B-16283	GCA_000716975.1	2055	2055	suspected-type	True	79.7191	800	1762	95	below_threshold
Gordonia desulfuricans	strain=213E	GCA_010119475.1	89051	89051	type	True	79.7069	756	1762	95	below_threshold
Gordonia terrae	strain=NBRC 100016	GCA_000248035.2	2055	2055	suspected-type	True	79.6905	797	1762	95	below_threshold
Gordonia terrae	strain=NRRL B-16283	GCA_003183825.1	2055	2055	suspected-type	True	79.6724	808	1762	95	below_threshold
Gordonia bronchialis	strain=NCTC10667	GCA_900450805.1	2054	2054	type	True	79.5702	750	1762	95	below_threshold
Gordonia rhizosphera	strain=NBRC 16068	GCA_000298195.1	83341	83341	type	True	79.4772	719	1762	95	below_threshold
Gordonia shandongensis	strain=DSM 45094	GCA_000423025.1	376351	376351	type	True	78.6187	437	1762	95	below_threshold
Gordonia spumicola	strain=NBRC 107696	GCA_009932475.1	589161	589161	type	True	78.6052	533	1762	95	below_threshold
Gordonia phthalatica	strain=QH-11	GCA_001305675.1	1136941	1136941	type	True	78.5057	545	1762	95	below_threshold
--------------------------------------------------------------------------------
[2024-01-24 12:32:28,583] [INFO] DFAST Taxonomy check result was written to GCF_000334455.1_ASM33445v1_genomic.fna/tc_result.tsv
[2024-01-24 12:32:28,584] [INFO] ===== Taxonomy check completed =====
[2024-01-24 12:32:28,584] [INFO] ===== Start completeness check using CheckM =====
[2024-01-24 12:32:28,584] [INFO] Setting CHECKM_DATA_PATH to /var/lib/cwl/stge1187590-08ea-498d-b532-8aef63b30170/dqc_reference/checkm_data
[2024-01-24 12:32:28,585] [INFO] Selected 'Prokaryote' markers (life, taxid=0) for CheckM
[2024-01-24 12:32:28,637] [INFO] Task started: CheckM
[2024-01-24 12:32:28,637] [INFO] Running command: checkm taxonomy_wf --tab_table -f GCF_000334455.1_ASM33445v1_genomic.fna/cc_result.tsv -t 1 life "Prokaryote" GCF_000334455.1_ASM33445v1_genomic.fna/checkm_input GCF_000334455.1_ASM33445v1_genomic.fna/checkm_result
[2024-01-24 12:33:32,085] [INFO] Task succeeded: CheckM
[2024-01-24 12:33:32,087] [INFO] Completeness check finished.
--------------------------------------------------------------------------------
Completeness: 95.83%
Contamintation: 0.00%
Strain heterogeneity: 0.00%
--------------------------------------------------------------------------------
[2024-01-24 12:33:32,112] [INFO] ===== Completeness check finished =====
[2024-01-24 12:33:32,112] [INFO] ===== Start GTDB Search =====
[2024-01-24 12:33:32,113] [INFO] Query marker FASTA already exists. Will reuse it. (GCF_000334455.1_ASM33445v1_genomic.fna/markers.fasta)
[2024-01-24 12:33:32,113] [INFO] Task started: Blastn
[2024-01-24 12:33:32,113] [INFO] Running command: blastn -query GCF_000334455.1_ASM33445v1_genomic.fna/markers.fasta -db /var/lib/cwl/stge1187590-08ea-498d-b532-8aef63b30170/dqc_reference/reference_markers_gtdb.fasta -out GCF_000334455.1_ASM33445v1_genomic.fna/blast.markers.gtdb.tsv -outfmt 6 -max_hsps 1 -num_alignments 5
[2024-01-24 12:33:33,824] [INFO] Task succeeded: Blastn
[2024-01-24 12:33:33,828] [INFO] Selected 17 target genomes.
[2024-01-24 12:33:33,828] [INFO] Target genome list was writen to GCF_000334455.1_ASM33445v1_genomic.fna/target_genomes_gtdb.txt
[2024-01-24 12:33:33,840] [INFO] Task started: fastANI
[2024-01-24 12:33:33,841] [INFO] Running command: fastANI --query /var/lib/cwl/stg1bca2383-4dc8-4bc1-8f53-69a3a9083c53/GCF_000334455.1_ASM33445v1_genomic.fna.gz --refList GCF_000334455.1_ASM33445v1_genomic.fna/target_genomes_gtdb.txt --output GCF_000334455.1_ASM33445v1_genomic.fna/fastani_result_gtdb.tsv --threads 1
[2024-01-24 12:33:51,965] [INFO] Task succeeded: fastANI
[2024-01-24 12:33:51,991] [INFO] Found 17 fastANI hits (1 hits with ANI > circumscription radius)
[2024-01-24 12:33:51,991] [INFO] GTDB search result
--------------------------------------------------------------------------------
accession	gtdb_species	ani	matched_fragments	total_fragments	gtdb_taxonomy	ani_circumscription_radius	mean_intra_species_ani	min_intra_species_ani	mean_intra_species_af	min_intra_species_af	num_clustered_genomes	status
GCF_000334455.1	s__Gordonia soli	100.0	1760	1762	d__Bacteria;p__Actinobacteriota;c__Actinomycetia;o__Mycobacteriales;f__Mycobacteriaceae;g__Gordonia	95.0	N/A	N/A	N/A	N/A	1	conclusive
GCF_000225505.1	s__Gordonia alkanivorans	80.0013	744	1762	d__Bacteria;p__Actinobacteriota;c__Actinomycetia;o__Mycobacteriales;f__Mycobacteriaceae;g__Gordonia	95.0	98.26	97.87	0.90	0.88	6	-
GCF_000327325.1	s__Gordonia rubripertincta	79.9876	737	1762	d__Bacteria;p__Actinobacteriota;c__Actinomycetia;o__Mycobacteriales;f__Mycobacteriaceae;g__Gordonia	95.0	98.55	98.07	0.91	0.87	6	-
GCF_000298235.1	s__Gordonia namibiensis	79.9652	757	1762	d__Bacteria;p__Actinobacteriota;c__Actinomycetia;o__Mycobacteriales;f__Mycobacteriaceae;g__Gordonia	95.0	97.13	97.13	0.90	0.90	2	-
GCF_900105725.1	s__Gordonia westfalica	79.9391	814	1762	d__Bacteria;p__Actinobacteriota;c__Actinomycetia;o__Mycobacteriales;f__Mycobacteriaceae;g__Gordonia	95.0	N/A	N/A	N/A	N/A	1	-
GCF_000332995.1	s__Gordonia amicalis	79.898	730	1762	d__Bacteria;p__Actinobacteriota;c__Actinomycetia;o__Mycobacteriales;f__Mycobacteriaceae;g__Gordonia	95.0	98.45	97.39	0.94	0.91	7	-
GCF_003855095.1	s__Gordonia insulae	79.8211	842	1762	d__Bacteria;p__Actinobacteriota;c__Actinomycetia;o__Mycobacteriales;f__Mycobacteriaceae;g__Gordonia	95.0	N/A	N/A	N/A	N/A	1	-
GCF_001485495.1	s__Gordonia desulfuricans	79.7568	739	1762	d__Bacteria;p__Actinobacteriota;c__Actinomycetia;o__Mycobacteriales;f__Mycobacteriaceae;g__Gordonia	95.0	99.37	98.77	0.95	0.92	3	-
GCF_001698225.1	s__Gordonia terrae	79.7458	792	1762	d__Bacteria;p__Actinobacteriota;c__Actinomycetia;o__Mycobacteriales;f__Mycobacteriaceae;g__Gordonia	95.0	99.37	97.67	0.96	0.87	7	-
GCF_002149015.1	s__Gordonia lacunae	79.6496	803	1762	d__Bacteria;p__Actinobacteriota;c__Actinomycetia;o__Mycobacteriales;f__Mycobacteriaceae;g__Gordonia	95.0	N/A	N/A	N/A	N/A	1	-
GCF_000024785.1	s__Gordonia bronchialis	79.595	747	1762	d__Bacteria;p__Actinobacteriota;c__Actinomycetia;o__Mycobacteriales;f__Mycobacteriaceae;g__Gordonia	95.0	100.00	100.00	1.00	1.00	3	-
GCA_002700145.1	s__Gordonia sp002700145	79.514	685	1762	d__Bacteria;p__Actinobacteriota;c__Actinomycetia;o__Mycobacteriales;f__Mycobacteriaceae;g__Gordonia	95.0	N/A	N/A	N/A	N/A	1	-
GCF_009862785.1	s__Gordonia sp009862785	79.465	758	1762	d__Bacteria;p__Actinobacteriota;c__Actinomycetia;o__Mycobacteriales;f__Mycobacteriaceae;g__Gordonia	95.0	N/A	N/A	N/A	N/A	1	-
GCF_000298195.1	s__Gordonia rhizosphera	79.4544	723	1762	d__Bacteria;p__Actinobacteriota;c__Actinomycetia;o__Mycobacteriales;f__Mycobacteriaceae;g__Gordonia	95.0	N/A	N/A	N/A	N/A	1	-
GCA_004193205.1	s__Gordonia sediminis	79.3652	772	1762	d__Bacteria;p__Actinobacteriota;c__Actinomycetia;o__Mycobacteriales;f__Mycobacteriaceae;g__Gordonia	95.0	N/A	N/A	N/A	N/A	1	-
GCF_000241325.1	s__Gordonia polyisoprenivorans	79.2834	751	1762	d__Bacteria;p__Actinobacteriota;c__Actinomycetia;o__Mycobacteriales;f__Mycobacteriaceae;g__Gordonia	95.0	98.60	97.62	0.90	0.79	11	-
GCF_009932475.1	s__Gordonia spumicola	78.5979	534	1762	d__Bacteria;p__Actinobacteriota;c__Actinomycetia;o__Mycobacteriales;f__Mycobacteriaceae;g__Gordonia	95.0	N/A	N/A	N/A	N/A	1	-
--------------------------------------------------------------------------------
[2024-01-24 12:33:51,993] [INFO] GTDB search result was written to GCF_000334455.1_ASM33445v1_genomic.fna/result_gtdb.tsv
[2024-01-24 12:33:51,993] [INFO] ===== GTDB Search completed =====
[2024-01-24 12:33:51,999] [INFO] DFAST_QC result json was written to GCF_000334455.1_ASM33445v1_genomic.fna/dqc_result.json
[2024-01-24 12:33:51,999] [INFO] DFAST_QC completed!
[2024-01-24 12:33:52,000] [INFO] Total running time: 0h2m1s