[2024-01-25 18:09:50,728] [INFO] DFAST_QC pipeline started.
[2024-01-25 18:09:50,730] [INFO] DFAST_QC version: 0.5.7
[2024-01-25 18:09:50,730] [INFO] DQC Reference Directory: /var/lib/cwl/stg0bfef197-564b-48ef-9a50-571b3d00afab/dqc_reference
[2024-01-25 18:09:51,914] [INFO] ===== Start taxonomy check using ANI =====
[2024-01-25 18:09:51,915] [INFO] Task started: Prodigal
[2024-01-25 18:09:51,915] [INFO] Running command: gunzip -c /var/lib/cwl/stg07df93c7-f42e-4e84-ba2a-54a97e0b3e00/GCF_016632365.1_ASM1663236v1_genomic.fna.gz | prodigal -d GCF_016632365.1_ASM1663236v1_genomic.fna/cds.fna -a GCF_016632365.1_ASM1663236v1_genomic.fna/protein.faa -g 11 -q > /dev/null
[2024-01-25 18:10:00,569] [INFO] Task succeeded: Prodigal
[2024-01-25 18:10:00,570] [INFO] Task started: HMMsearch
[2024-01-25 18:10:00,570] [INFO] Running command: hmmsearch --tblout GCF_016632365.1_ASM1663236v1_genomic.fna/hmmer_result.tsv -E 1E-50 /var/lib/cwl/stg0bfef197-564b-48ef-9a50-571b3d00afab/dqc_reference/reference_markers.hmm GCF_016632365.1_ASM1663236v1_genomic.fna/protein.faa > /dev/null
[2024-01-25 18:10:00,922] [INFO] Task succeeded: HMMsearch
[2024-01-25 18:10:00,923] [INFO] Found 6/6 markers.
[2024-01-25 18:10:00,967] [INFO] Query marker FASTA was written to GCF_016632365.1_ASM1663236v1_genomic.fna/markers.fasta
[2024-01-25 18:10:00,967] [INFO] Task started: Blastn
[2024-01-25 18:10:00,967] [INFO] Running command: blastn -query GCF_016632365.1_ASM1663236v1_genomic.fna/markers.fasta -db /var/lib/cwl/stg0bfef197-564b-48ef-9a50-571b3d00afab/dqc_reference/reference_markers.fasta -out GCF_016632365.1_ASM1663236v1_genomic.fna/blast.markers.tsv -outfmt 6 -max_hsps 1 -num_alignments 5
[2024-01-25 18:10:01,834] [INFO] Task succeeded: Blastn
[2024-01-25 18:10:01,837] [INFO] Selected 31 target genomes.
[2024-01-25 18:10:01,837] [INFO] Target genome list was writen to GCF_016632365.1_ASM1663236v1_genomic.fna/target_genomes.txt
[2024-01-25 18:10:01,863] [INFO] Task started: fastANI
[2024-01-25 18:10:01,863] [INFO] Running command: fastANI --query /var/lib/cwl/stg07df93c7-f42e-4e84-ba2a-54a97e0b3e00/GCF_016632365.1_ASM1663236v1_genomic.fna.gz --refList GCF_016632365.1_ASM1663236v1_genomic.fna/target_genomes.txt --output GCF_016632365.1_ASM1663236v1_genomic.fna/fastani_result.tsv --threads 1
[2024-01-25 18:10:27,164] [INFO] Task succeeded: fastANI
[2024-01-25 18:10:27,164] [INFO] Loading species specific ANI threshold from /var/lib/cwl/stg0bfef197-564b-48ef-9a50-571b3d00afab/dqc_reference/prokaryote_ANI_species_specific_threshold.txt
[2024-01-25 18:10:27,165] [WARNING] Species-specific ANI threshold file not found. Will use the default threshold for all species. [/var/lib/cwl/stg0bfef197-564b-48ef-9a50-571b3d00afab/dqc_reference/prokaryote_ANI_species_specific_threshold.txt]
[2024-01-25 18:10:27,181] [INFO] Found 31 fastANI hits (0 hits with ANI > threshold)
[2024-01-25 18:10:27,182] [INFO] The taxonomy check result is classified as 'below_threshold'.
[2024-01-25 18:10:27,182] [INFO] DFAST Taxonomy check final result
--------------------------------------------------------------------------------
organism_name	strain	accession	taxid	species_taxid	relation_to_type	validated	ani	matched_fragments	total_fragments	ani_threshold	status
Cronobacter dublinensis subsp. lausannensis	strain=LMG 23824	GCA_000409365.1	413500	413497	type	True	79.4908	526	1244	95	below_threshold
Cronobacter dublinensis subsp. lactaridi	strain=LMG 23825	GCA_000409345.1	413499	413497	type	True	79.4697	528	1244	95	below_threshold
Cronobacter dublinensis subsp. dublinensis	strain=LMG 23823	GCA_000409225.1	413498	413497	type	True	79.4381	487	1244	95	below_threshold
Cronobacter muytjensii	strain=ATCC 51329	GCA_000409285.1	413501	413501	type	True	79.3959	496	1244	95	below_threshold
Cronobacter muytjensii	strain=ATCC 51329	GCA_001277195.1	413501	413501	type	True	79.3858	506	1244	95	below_threshold
Cronobacter malonaticus	strain=LMG 23826	GCA_000409305.1	413503	413503	type	True	79.284	514	1244	95	below_threshold
Cronobacter universalis	strain=NCTC 9529	GCA_000319325.1	535744	535744	type	True	79.2595	517	1244	95	below_threshold
Cronobacter malonaticus	strain=LMG 23826	GCA_001277215.2	413503	413503	type	True	79.1785	536	1244	95	below_threshold
Franconibacter helveticus	strain=513	GCA_000485945.1	357240	357240	type	True	79.0744	488	1244	95	below_threshold
Franconibacter helveticus	strain=LMG 23732	GCA_000463115.2	357240	357240	type	True	79.0596	514	1244	95	below_threshold
Atlantibacter hermannii	strain=FDAARGOS_888	GCA_016027855.1	565	565	type	True	79.0221	469	1244	95	below_threshold
Enterobacter sichuanensis	strain=WCHECL1597	GCA_025002605.1	2071710	2071710	type	True	79.0206	410	1244	95	below_threshold
Atlantibacter hermannii	strain=NBRC 105704	GCA_000248015.2	565	565	type	True	78.9174	464	1244	95	below_threshold
Raoultella electrica	strain=DSM 102253	GCA_006711645.1	1259973	1259973	type	True	78.8956	459	1244	95	below_threshold
Enterobacter roggenkampii	strain=DSM 16690	GCA_024390995.1	1812935	1812935	type	True	78.8642	448	1244	95	below_threshold
Klebsiella quasipneumoniae	strain=FDAARGOS_1503	GCA_020099175.1	1463165	1463165	type	True	78.757	500	1244	95	below_threshold
Enterobacter hormaechei	strain=FDAARGOS 1433	GCA_019048245.1	158836	158836	suspected-type	True	78.7502	468	1244	95	below_threshold
Raoultella planticola	strain=ATCC 33531	GCA_000735435.1	575	575	type	True	78.7392	453	1244	95	below_threshold
Klebsiella quasipneumoniae subsp. quasipneumoniae	strain=01A030T	GCA_020525925.1	1667327	1463165	type	True	78.7349	501	1244	95	below_threshold
Klebsiella quasipneumoniae subsp. quasipneumoniae	strain=01A030	GCA_000751755.1	1667327	1463165	type	True	78.7252	493	1244	95	below_threshold
Enterobacter wuhouensis	strain=WCHEW120002	GCA_004331265.1	2529381	2529381	type	True	78.7061	465	1244	95	below_threshold
Yokenella regensburgei	strain=NCTC11966	GCA_900460805.1	158877	158877	type	True	78.7052	437	1244	95	below_threshold
Klebsiella quasipneumoniae	strain=DSM 28211	GCA_020115515.1	1463165	1463165	type	True	78.671	494	1244	95	below_threshold
Pseudocitrobacter vendiensis	strain=type strain: CPO20170097	GCA_943590815.1	2488306	2488306	type	True	78.6584	435	1244	95	below_threshold
Klebsiella variicola	strain=type strain: F2R9	GCA_900978195.1	244366	244366	type	True	78.5835	484	1244	95	below_threshold
Yokenella regensburgei	strain=ATCC 49455	GCA_000735455.1	158877	158877	type	True	78.5774	435	1244	95	below_threshold
Kluyvera georgiana	strain=ATCC 51603	GCA_001654985.1	73098	73098	type	True	78.5742	450	1244	95	below_threshold
Citrobacter youngae	strain=NCTC13709	GCA_900638065.1	133448	133448	type	True	78.5346	400	1244	95	below_threshold
Klebsiella variicola	strain=DSM 15968	GCA_000828055.2	244366	244366	type	True	78.5318	500	1244	95	below_threshold
Mixta gaviniae	strain=DSM 22758	GCA_002953195.1	665914	665914	type	True	78.4465	385	1244	95	below_threshold
Enterobacter cloacae	strain=DSM 30054	GCA_021469225.1	550	550	type	True	78.4397	479	1244	95	below_threshold
--------------------------------------------------------------------------------
[2024-01-25 18:10:27,183] [INFO] DFAST Taxonomy check result was written to GCF_016632365.1_ASM1663236v1_genomic.fna/tc_result.tsv
[2024-01-25 18:10:27,184] [INFO] ===== Taxonomy check completed =====
[2024-01-25 18:10:27,184] [INFO] ===== Start completeness check using CheckM =====
[2024-01-25 18:10:27,184] [INFO] Setting CHECKM_DATA_PATH to /var/lib/cwl/stg0bfef197-564b-48ef-9a50-571b3d00afab/dqc_reference/checkm_data
[2024-01-25 18:10:27,185] [INFO] Selected 'Prokaryote' markers (life, taxid=0) for CheckM
[2024-01-25 18:10:27,226] [INFO] Task started: CheckM
[2024-01-25 18:10:27,226] [INFO] Running command: checkm taxonomy_wf --tab_table -f GCF_016632365.1_ASM1663236v1_genomic.fna/cc_result.tsv -t 1 life "Prokaryote" GCF_016632365.1_ASM1663236v1_genomic.fna/checkm_input GCF_016632365.1_ASM1663236v1_genomic.fna/checkm_result
[2024-01-25 18:10:55,312] [INFO] Task succeeded: CheckM
[2024-01-25 18:10:55,313] [INFO] Completeness check finished.
--------------------------------------------------------------------------------
Completeness: 100.00%
Contamintation: 0.00%
Strain heterogeneity: 0.00%
--------------------------------------------------------------------------------
[2024-01-25 18:10:55,332] [INFO] ===== Completeness check finished =====
[2024-01-25 18:10:55,332] [INFO] ===== Start GTDB Search =====
[2024-01-25 18:10:55,333] [INFO] Query marker FASTA already exists. Will reuse it. (GCF_016632365.1_ASM1663236v1_genomic.fna/markers.fasta)
[2024-01-25 18:10:55,333] [INFO] Task started: Blastn
[2024-01-25 18:10:55,333] [INFO] Running command: blastn -query GCF_016632365.1_ASM1663236v1_genomic.fna/markers.fasta -db /var/lib/cwl/stg0bfef197-564b-48ef-9a50-571b3d00afab/dqc_reference/reference_markers_gtdb.fasta -out GCF_016632365.1_ASM1663236v1_genomic.fna/blast.markers.gtdb.tsv -outfmt 6 -max_hsps 1 -num_alignments 5
[2024-01-25 18:10:56,705] [INFO] Task succeeded: Blastn
[2024-01-25 18:10:56,710] [INFO] Selected 14 target genomes.
[2024-01-25 18:10:56,710] [INFO] Target genome list was writen to GCF_016632365.1_ASM1663236v1_genomic.fna/target_genomes_gtdb.txt
[2024-01-25 18:10:56,727] [INFO] Task started: fastANI
[2024-01-25 18:10:56,727] [INFO] Running command: fastANI --query /var/lib/cwl/stg07df93c7-f42e-4e84-ba2a-54a97e0b3e00/GCF_016632365.1_ASM1663236v1_genomic.fna.gz --refList GCF_016632365.1_ASM1663236v1_genomic.fna/target_genomes_gtdb.txt --output GCF_016632365.1_ASM1663236v1_genomic.fna/fastani_result_gtdb.tsv --threads 1
[2024-01-25 18:11:09,110] [INFO] Task succeeded: fastANI
[2024-01-25 18:11:09,120] [INFO] Found 14 fastANI hits (1 hits with ANI > circumscription radius)
[2024-01-25 18:11:09,121] [INFO] GTDB search result
--------------------------------------------------------------------------------
accession	gtdb_species	ani	matched_fragments	total_fragments	gtdb_taxonomy	ani_circumscription_radius	mean_intra_species_ani	min_intra_species_ani	mean_intra_species_af	min_intra_species_af	num_clustered_genomes	status
GCA_019140855.1	s__Jejubacter sp019140855	99.1465	1134	1244	d__Bacteria;p__Proteobacteria;c__Gammaproteobacteria;o__Enterobacterales;f__Enterobacteriaceae;g__Jejubacter	95.0	99.14	99.12	0.91	0.91	3	conclusive
GCA_000302695.1	s__Jejubacter sp000302695	81.2542	779	1244	d__Bacteria;p__Proteobacteria;c__Gammaproteobacteria;o__Enterobacterales;f__Enterobacteriaceae;g__Jejubacter	95.0	N/A	N/A	N/A	N/A	1	-
GCA_010092695.1	s__Jejubacter sp010092695	81.2435	783	1244	d__Bacteria;p__Proteobacteria;c__Gammaproteobacteria;o__Enterobacterales;f__Enterobacteriaceae;g__Jejubacter	95.0	N/A	N/A	N/A	N/A	1	-
GCF_001277235.1	s__Cronobacter dublinensis	79.4141	501	1244	d__Bacteria;p__Proteobacteria;c__Gammaproteobacteria;o__Enterobacterales;f__Enterobacteriaceae;g__Cronobacter	95.0	97.74	96.92	0.93	0.89	42	-
GCF_001277175.1	s__Cronobacter universalis	79.2934	522	1244	d__Bacteria;p__Proteobacteria;c__Gammaproteobacteria;o__Enterobacterales;f__Enterobacteriaceae;g__Cronobacter	95.0	99.87	99.62	0.99	0.98	4	-
GCA_000621185.1	s__Franconibacter pulveris	79.2729	499	1244	d__Bacteria;p__Proteobacteria;c__Gammaproteobacteria;o__Enterobacterales;f__Enterobacteriaceae;g__Franconibacter	95.0	98.43	96.71	0.95	0.93	7	-
GCF_001277215.2	s__Cronobacter malonaticus	79.1963	534	1244	d__Bacteria;p__Proteobacteria;c__Gammaproteobacteria;o__Enterobacterales;f__Enterobacteriaceae;g__Cronobacter	95.0	98.86	98.11	0.94	0.89	62	-
GCF_005671395.1	s__Jejubacter calystegiae	79.0979	595	1244	d__Bacteria;p__Proteobacteria;c__Gammaproteobacteria;o__Enterobacterales;f__Enterobacteriaceae;g__Jejubacter	95.0	98.55	98.55	0.90	0.90	2	-
GCF_000463115.2	s__Franconibacter helveticus	79.0576	513	1244	d__Bacteria;p__Proteobacteria;c__Gammaproteobacteria;o__Enterobacterales;f__Enterobacteriaceae;g__Franconibacter	95.0	99.42	99.04	0.94	0.92	6	-
GCA_900635495.1	s__Atlantibacter hermannii	79.0516	473	1244	d__Bacteria;p__Proteobacteria;c__Gammaproteobacteria;o__Enterobacterales;f__Enterobacteriaceae;g__Atlantibacter	95.0	98.95	98.59	0.96	0.90	28	-
GCA_019237635.1	s__Enterobacter_B sp019237635	79.0442	448	1244	d__Bacteria;p__Proteobacteria;c__Gammaproteobacteria;o__Enterobacterales;f__Enterobacteriaceae;g__Enterobacter_B	95.0	N/A	N/A	N/A	N/A	1	-
GCF_000164865.1	s__Enterobacter_B lignolyticus	78.9247	526	1244	d__Bacteria;p__Proteobacteria;c__Gammaproteobacteria;o__Enterobacterales;f__Enterobacteriaceae;g__Enterobacter_B	95.0	98.80	98.80	0.94	0.94	2	-
GCF_900021175.1	s__Enterobacter_A timonensis	78.9188	476	1244	d__Bacteria;p__Proteobacteria;c__Gammaproteobacteria;o__Enterobacterales;f__Enterobacteriaceae;g__Enterobacter_A	95.0	100.00	100.00	1.00	1.00	2	-
GCF_002953215.1	s__Mixta calida	78.2279	398	1244	d__Bacteria;p__Proteobacteria;c__Gammaproteobacteria;o__Enterobacterales;f__Enterobacteriaceae;g__Mixta	95.0	99.65	99.42	0.96	0.92	22	-
--------------------------------------------------------------------------------
[2024-01-25 18:11:09,122] [INFO] GTDB search result was written to GCF_016632365.1_ASM1663236v1_genomic.fna/result_gtdb.tsv
[2024-01-25 18:11:09,122] [INFO] ===== GTDB Search completed =====
[2024-01-25 18:11:09,126] [INFO] DFAST_QC result json was written to GCF_016632365.1_ASM1663236v1_genomic.fna/dqc_result.json
[2024-01-25 18:11:09,126] [INFO] DFAST_QC completed!
[2024-01-25 18:11:09,126] [INFO] Total running time: 0h1m18s