[2024-01-24 15:18:23,558] [INFO] DFAST_QC pipeline started.
[2024-01-24 15:18:23,560] [INFO] DFAST_QC version: 0.5.7
[2024-01-24 15:18:23,561] [INFO] DQC Reference Directory: /var/lib/cwl/stgf637841d-381f-4897-ad0b-cf0999c8d342/dqc_reference
[2024-01-24 15:18:25,003] [INFO] ===== Start taxonomy check using ANI =====
[2024-01-24 15:18:25,004] [INFO] Task started: Prodigal
[2024-01-24 15:18:25,004] [INFO] Running command: gunzip -c /var/lib/cwl/stgf0d5279d-5d1b-432d-b9e2-0d5e50dc7410/GCF_002632585.1_ASM263258v1_genomic.fna.gz | prodigal -d GCF_002632585.1_ASM263258v1_genomic.fna/cds.fna -a GCF_002632585.1_ASM263258v1_genomic.fna/protein.faa -g 11 -q > /dev/null
[2024-01-24 15:18:46,432] [INFO] Task succeeded: Prodigal
[2024-01-24 15:18:46,433] [INFO] Task started: HMMsearch
[2024-01-24 15:18:46,433] [INFO] Running command: hmmsearch --tblout GCF_002632585.1_ASM263258v1_genomic.fna/hmmer_result.tsv -E 1E-50 /var/lib/cwl/stgf637841d-381f-4897-ad0b-cf0999c8d342/dqc_reference/reference_markers.hmm GCF_002632585.1_ASM263258v1_genomic.fna/protein.faa > /dev/null
[2024-01-24 15:18:46,781] [INFO] Task succeeded: HMMsearch
[2024-01-24 15:18:46,783] [INFO] Found 6/6 markers.
[2024-01-24 15:18:46,828] [INFO] Query marker FASTA was written to GCF_002632585.1_ASM263258v1_genomic.fna/markers.fasta
[2024-01-24 15:18:46,829] [INFO] Task started: Blastn
[2024-01-24 15:18:46,829] [INFO] Running command: blastn -query GCF_002632585.1_ASM263258v1_genomic.fna/markers.fasta -db /var/lib/cwl/stgf637841d-381f-4897-ad0b-cf0999c8d342/dqc_reference/reference_markers.fasta -out GCF_002632585.1_ASM263258v1_genomic.fna/blast.markers.tsv -outfmt 6 -max_hsps 1 -num_alignments 5
[2024-01-24 15:18:47,688] [INFO] Task succeeded: Blastn
[2024-01-24 15:18:47,691] [INFO] Selected 16 target genomes.
[2024-01-24 15:18:47,692] [INFO] Target genome list was writen to GCF_002632585.1_ASM263258v1_genomic.fna/target_genomes.txt
[2024-01-24 15:18:47,698] [INFO] Task started: fastANI
[2024-01-24 15:18:47,698] [INFO] Running command: fastANI --query /var/lib/cwl/stgf0d5279d-5d1b-432d-b9e2-0d5e50dc7410/GCF_002632585.1_ASM263258v1_genomic.fna.gz --refList GCF_002632585.1_ASM263258v1_genomic.fna/target_genomes.txt --output GCF_002632585.1_ASM263258v1_genomic.fna/fastani_result.tsv --threads 1
[2024-01-24 15:19:06,091] [INFO] Task succeeded: fastANI
[2024-01-24 15:19:06,092] [INFO] Loading species specific ANI threshold from /var/lib/cwl/stgf637841d-381f-4897-ad0b-cf0999c8d342/dqc_reference/prokaryote_ANI_species_specific_threshold.txt
[2024-01-24 15:19:06,093] [WARNING] Species-specific ANI threshold file not found. Will use the default threshold for all species. [/var/lib/cwl/stgf637841d-381f-4897-ad0b-cf0999c8d342/dqc_reference/prokaryote_ANI_species_specific_threshold.txt]
[2024-01-24 15:19:06,110] [INFO] Found 16 fastANI hits (2 hits with ANI > threshold)
[2024-01-24 15:19:06,110] [INFO] The taxonomy check result is classified as 'conclusive'.
[2024-01-24 15:19:06,110] [INFO] DFAST Taxonomy check final result
--------------------------------------------------------------------------------
organism_name	strain	accession	taxid	species_taxid	relation_to_type	validated	ani	matched_fragments	total_fragments	ani_threshold	status
Xenorhabdus szentirmaii	strain=DSM 16338	GCA_002632585.1	290112	290112	type	True	99.9991	1633	1636	95	conclusive
Xenorhabdus szentirmaii	strain=DSM 16338	GCA_000531455.1	290112	290112	type	True	99.9365	1517	1636	95	conclusive
Xenorhabdus nematophila	strain=ATCC 19061	GCA_000252955.1	628	628	type	True	83.3662	895	1636	95	below_threshold
Xenorhabdus mauleonii	strain=DSM 17908	GCA_900113945.1	351675	351675	type	True	83.3276	973	1636	95	below_threshold
Xenorhabdus mauleonii	strain=DSM 17908	GCA_002632685.1	351675	351675	type	True	83.3224	983	1636	95	below_threshold
Xenorhabdus lircayensis	strain=VLS	GCA_016306625.1	2763499	2763499	type	True	83.0599	911	1636	95	below_threshold
Xenorhabdus japonica	strain=DSM 16522	GCA_900115195.1	53341	53341	type	True	82.2672	775	1636	95	below_threshold
Xenorhabdus miraniensis	strain=DSM 17902	GCA_002632615.1	351674	351674	type	True	82.138	912	1636	95	below_threshold
Xenorhabdus beddingii	strain=DSM 4764	GCA_002127545.1	40578	40578	type	True	82.123	812	1636	95	below_threshold
Xenorhabdus ishibashii	strain=DSM 22670	GCA_002632755.1	1034471	1034471	type	True	82.0051	770	1636	95	below_threshold
Xenorhabdus eapokensis	strain=DL20	GCA_001908105.1	1873482	1873482	type	True	81.9479	782	1636	95	below_threshold
Xenorhabdus thuongxuanensis	strain=30TX1	GCA_001908095.1	1873484	1873484	type	True	81.7451	769	1636	95	below_threshold
Xenorhabdus bovienii	strain=T228	GCA_024721015.1	40576	40576	type	True	81.1991	737	1636	95	below_threshold
Hafnia paralvei	strain=ATCC 29927	GCA_023698525.1	546367	546367	type	True	79.217	148	1636	95	below_threshold
Photorhabdus hindustanensis	strain=H1	GCA_002968995.1	2918802	2918802	type	True	78.662	355	1636	95	below_threshold
Photorhabdus temperata	strain=DSM 14550	GCA_025384845.1	574560	574560	type	True	78.6512	376	1636	95	below_threshold
--------------------------------------------------------------------------------
[2024-01-24 15:19:06,112] [INFO] DFAST Taxonomy check result was written to GCF_002632585.1_ASM263258v1_genomic.fna/tc_result.tsv
[2024-01-24 15:19:06,113] [INFO] ===== Taxonomy check completed =====
[2024-01-24 15:19:06,113] [INFO] ===== Start completeness check using CheckM =====
[2024-01-24 15:19:06,113] [INFO] Setting CHECKM_DATA_PATH to /var/lib/cwl/stgf637841d-381f-4897-ad0b-cf0999c8d342/dqc_reference/checkm_data
[2024-01-24 15:19:06,114] [INFO] Selected 'Prokaryote' markers (life, taxid=0) for CheckM
[2024-01-24 15:19:06,164] [INFO] Task started: CheckM
[2024-01-24 15:19:06,165] [INFO] Running command: checkm taxonomy_wf --tab_table -f GCF_002632585.1_ASM263258v1_genomic.fna/cc_result.tsv -t 1 life "Prokaryote" GCF_002632585.1_ASM263258v1_genomic.fna/checkm_input GCF_002632585.1_ASM263258v1_genomic.fna/checkm_result
[2024-01-24 15:20:10,458] [INFO] Task succeeded: CheckM
[2024-01-24 15:20:10,460] [INFO] Completeness check finished.
--------------------------------------------------------------------------------
Completeness: 100.00%
Contamintation: 0.00%
Strain heterogeneity: 0.00%
--------------------------------------------------------------------------------
[2024-01-24 15:20:10,480] [INFO] ===== Completeness check finished =====
[2024-01-24 15:20:10,480] [INFO] ===== Start GTDB Search =====
[2024-01-24 15:20:10,481] [INFO] Query marker FASTA already exists. Will reuse it. (GCF_002632585.1_ASM263258v1_genomic.fna/markers.fasta)
[2024-01-24 15:20:10,481] [INFO] Task started: Blastn
[2024-01-24 15:20:10,481] [INFO] Running command: blastn -query GCF_002632585.1_ASM263258v1_genomic.fna/markers.fasta -db /var/lib/cwl/stgf637841d-381f-4897-ad0b-cf0999c8d342/dqc_reference/reference_markers_gtdb.fasta -out GCF_002632585.1_ASM263258v1_genomic.fna/blast.markers.gtdb.tsv -outfmt 6 -max_hsps 1 -num_alignments 5
[2024-01-24 15:20:11,674] [INFO] Task succeeded: Blastn
[2024-01-24 15:20:11,678] [INFO] Selected 15 target genomes.
[2024-01-24 15:20:11,678] [INFO] Target genome list was writen to GCF_002632585.1_ASM263258v1_genomic.fna/target_genomes_gtdb.txt
[2024-01-24 15:20:11,689] [INFO] Task started: fastANI
[2024-01-24 15:20:11,689] [INFO] Running command: fastANI --query /var/lib/cwl/stgf0d5279d-5d1b-432d-b9e2-0d5e50dc7410/GCF_002632585.1_ASM263258v1_genomic.fna.gz --refList GCF_002632585.1_ASM263258v1_genomic.fna/target_genomes_gtdb.txt --output GCF_002632585.1_ASM263258v1_genomic.fna/fastani_result_gtdb.tsv --threads 1
[2024-01-24 15:20:27,121] [INFO] Task succeeded: fastANI
[2024-01-24 15:20:27,137] [INFO] Found 15 fastANI hits (1 hits with ANI > circumscription radius)
[2024-01-24 15:20:27,138] [INFO] GTDB search result
--------------------------------------------------------------------------------
accession	gtdb_species	ani	matched_fragments	total_fragments	gtdb_taxonomy	ani_circumscription_radius	mean_intra_species_ani	min_intra_species_ani	mean_intra_species_af	min_intra_species_af	num_clustered_genomes	status
GCF_002632585.1	s__Xenorhabdus szentirmaii	99.9991	1634	1636	d__Bacteria;p__Proteobacteria;c__Gammaproteobacteria;o__Enterobacterales;f__Enterobacteriaceae;g__Xenorhabdus	95.0	99.85	99.72	0.97	0.94	3	conclusive
GCF_000252955.1	s__Xenorhabdus nematophila	83.3779	897	1636	d__Bacteria;p__Proteobacteria;c__Gammaproteobacteria;o__Enterobacterales;f__Enterobacteriaceae;g__Xenorhabdus	95.0	99.26	98.94	0.93	0.90	7	-
GCF_900113945.1	s__Xenorhabdus mauleonii	83.3362	970	1636	d__Bacteria;p__Proteobacteria;c__Gammaproteobacteria;o__Enterobacterales;f__Enterobacteriaceae;g__Xenorhabdus	95.0	99.99	99.99	0.99	0.99	2	-
GCF_016306625.1	s__Xenorhabdus sp016306625	83.065	910	1636	d__Bacteria;p__Proteobacteria;c__Gammaproteobacteria;o__Enterobacterales;f__Enterobacteriaceae;g__Xenorhabdus	95.0	N/A	N/A	N/A	N/A	1	-
GCF_900115195.1	s__Xenorhabdus japonica	82.278	775	1636	d__Bacteria;p__Proteobacteria;c__Gammaproteobacteria;o__Enterobacterales;f__Enterobacteriaceae;g__Xenorhabdus	95.0	N/A	N/A	N/A	N/A	1	-
GCF_002632615.1	s__Xenorhabdus miraniensis	82.1662	909	1636	d__Bacteria;p__Proteobacteria;c__Gammaproteobacteria;o__Enterobacterales;f__Enterobacteriaceae;g__Xenorhabdus	95.0	95.51	95.51	0.84	0.84	2	-
GCF_003610465.1	s__Xenorhabdus ehlersii	82.118	811	1636	d__Bacteria;p__Proteobacteria;c__Gammaproteobacteria;o__Enterobacterales;f__Enterobacteriaceae;g__Xenorhabdus	95.0	99.74	99.51	0.89	0.84	3	-
GCF_002127545.1	s__Xenorhabdus beddingii	82.1022	813	1636	d__Bacteria;p__Proteobacteria;c__Gammaproteobacteria;o__Enterobacterales;f__Enterobacteriaceae;g__Xenorhabdus	95.0	N/A	N/A	N/A	N/A	1	-
GCF_001037465.1	s__Xenorhabdus khoisanae	82.0552	889	1636	d__Bacteria;p__Proteobacteria;c__Gammaproteobacteria;o__Enterobacterales;f__Enterobacteriaceae;g__Xenorhabdus	95.0	N/A	N/A	N/A	N/A	1	-
GCF_002632755.1	s__Xenorhabdus ishibashii	82.0162	770	1636	d__Bacteria;p__Proteobacteria;c__Gammaproteobacteria;o__Enterobacterales;f__Enterobacteriaceae;g__Xenorhabdus	95.0	N/A	N/A	N/A	N/A	1	-
GCF_001908105.1	s__Xenorhabdus eapokensis	81.8963	789	1636	d__Bacteria;p__Proteobacteria;c__Gammaproteobacteria;o__Enterobacterales;f__Enterobacteriaceae;g__Xenorhabdus	95.0	N/A	N/A	N/A	N/A	1	-
GCF_002632725.1	s__Xenorhabdus hominickii	81.7617	881	1636	d__Bacteria;p__Proteobacteria;c__Gammaproteobacteria;o__Enterobacterales;f__Enterobacteriaceae;g__Xenorhabdus	95.0	99.90	99.90	0.96	0.96	2	-
GCF_001908095.1	s__Xenorhabdus thuongxuanensis	81.7408	766	1636	d__Bacteria;p__Proteobacteria;c__Gammaproteobacteria;o__Enterobacterales;f__Enterobacteriaceae;g__Xenorhabdus	95.0	N/A	N/A	N/A	N/A	1	-
GCF_015163665.1	s__Xenorhabdus sp015163665	81.7381	779	1636	d__Bacteria;p__Proteobacteria;c__Gammaproteobacteria;o__Enterobacterales;f__Enterobacteriaceae;g__Xenorhabdus	95.0	N/A	N/A	N/A	N/A	1	-
GCF_002127535.1	s__Xenorhabdus vietnamensis	81.7287	886	1636	d__Bacteria;p__Proteobacteria;c__Gammaproteobacteria;o__Enterobacterales;f__Enterobacteriaceae;g__Xenorhabdus	95.0	N/A	N/A	N/A	N/A	1	-
--------------------------------------------------------------------------------
[2024-01-24 15:20:27,140] [INFO] GTDB search result was written to GCF_002632585.1_ASM263258v1_genomic.fna/result_gtdb.tsv
[2024-01-24 15:20:27,140] [INFO] ===== GTDB Search completed =====
[2024-01-24 15:20:27,147] [INFO] DFAST_QC result json was written to GCF_002632585.1_ASM263258v1_genomic.fna/dqc_result.json
[2024-01-24 15:20:27,147] [INFO] DFAST_QC completed!
[2024-01-24 15:20:27,147] [INFO] Total running time: 0h2m4s