[2023-06-27 05:50:21,322] [INFO] DFAST_QC pipeline started.
[2023-06-27 05:50:21,325] [INFO] DFAST_QC version: 0.5.7
[2023-06-27 05:50:21,325] [INFO] DQC Reference Directory: /var/lib/cwl/stgd03e9b52-4035-4b85-9fc9-f14841229e06/dqc_reference
[2023-06-27 05:50:22,538] [INFO] ===== Start taxonomy check using ANI =====
[2023-06-27 05:50:22,539] [INFO] Task started: Prodigal
[2023-06-27 05:50:22,539] [INFO] Running command: gunzip -c /var/lib/cwl/stg731ea11f-74da-4999-add2-d08739c78a82/GCA_026197035.1_ASM2619703v1_genomic.fna.gz | prodigal -d GCA_026197035.1_ASM2619703v1_genomic.fna/cds.fna -a GCA_026197035.1_ASM2619703v1_genomic.fna/protein.faa -g 11 -q > /dev/null
[2023-06-27 05:50:30,150] [INFO] Task succeeded: Prodigal
[2023-06-27 05:50:30,151] [INFO] Task started: HMMsearch
[2023-06-27 05:50:30,151] [INFO] Running command: hmmsearch --tblout GCA_026197035.1_ASM2619703v1_genomic.fna/hmmer_result.tsv -E 1E-50 /var/lib/cwl/stgd03e9b52-4035-4b85-9fc9-f14841229e06/dqc_reference/reference_markers.hmm GCA_026197035.1_ASM2619703v1_genomic.fna/protein.faa > /dev/null
[2023-06-27 05:50:30,402] [INFO] Task succeeded: HMMsearch
[2023-06-27 05:50:30,404] [INFO] Found 6/6 markers.
[2023-06-27 05:50:30,436] [INFO] Query marker FASTA was written to GCA_026197035.1_ASM2619703v1_genomic.fna/markers.fasta
[2023-06-27 05:50:30,436] [INFO] Task started: Blastn
[2023-06-27 05:50:30,436] [INFO] Running command: blastn -query GCA_026197035.1_ASM2619703v1_genomic.fna/markers.fasta -db /var/lib/cwl/stgd03e9b52-4035-4b85-9fc9-f14841229e06/dqc_reference/reference_markers.fasta -out GCA_026197035.1_ASM2619703v1_genomic.fna/blast.markers.tsv -outfmt 6 -max_hsps 1 -num_alignments 5
[2023-06-27 05:50:31,309] [INFO] Task succeeded: Blastn
[2023-06-27 05:50:31,314] [INFO] Selected 28 target genomes.
[2023-06-27 05:50:31,315] [INFO] Target genome list was writen to GCA_026197035.1_ASM2619703v1_genomic.fna/target_genomes.txt
[2023-06-27 05:50:31,320] [INFO] Task started: fastANI
[2023-06-27 05:50:31,320] [INFO] Running command: fastANI --query /var/lib/cwl/stg731ea11f-74da-4999-add2-d08739c78a82/GCA_026197035.1_ASM2619703v1_genomic.fna.gz --refList GCA_026197035.1_ASM2619703v1_genomic.fna/target_genomes.txt --output GCA_026197035.1_ASM2619703v1_genomic.fna/fastani_result.tsv --threads 1
[2023-06-27 05:50:48,394] [INFO] Task succeeded: fastANI
[2023-06-27 05:50:48,395] [INFO] Loading species specific ANI threshold from /var/lib/cwl/stgd03e9b52-4035-4b85-9fc9-f14841229e06/dqc_reference/prokaryote_ANI_species_specific_threshold.txt
[2023-06-27 05:50:48,395] [WARNING] Species-specific ANI threshold file not found. Will use the default threshold for all species. [/var/lib/cwl/stgd03e9b52-4035-4b85-9fc9-f14841229e06/dqc_reference/prokaryote_ANI_species_specific_threshold.txt]
[2023-06-27 05:50:48,421] [INFO] Found 28 fastANI hits (0 hits with ANI > threshold)
[2023-06-27 05:50:48,422] [INFO] The taxonomy check result is classified as 'below_threshold'.
[2023-06-27 05:50:48,422] [INFO] DFAST Taxonomy check final result
--------------------------------------------------------------------------------
organism_name	strain	accession	taxid	species_taxid	relation_to_type	validated	ani	matched_fragments	total_fragments	ani_threshold	status
Sulfurivermis fontis	strain=JG42	GCA_004001245.1	1972068	1972068	type	True	78.1437	280	864	95	below_threshold
Thiohalomonas denitrificans	strain=HLD2	GCA_900102855.1	415747	415747	type	True	77.3681	184	864	95	below_threshold
Thioalbus denitrificans	strain=DSM 26407	GCA_003337735.1	547122	547122	type	True	77.2819	214	864	95	below_threshold
Marichromatium gracile	strain=DSM 203	GCA_016583515.1	1048	1048	type	True	77.1273	156	864	95	below_threshold
Marichromatium gracile	strain=DSM 203	GCA_004343155.1	1048	1048	type	True	77.0664	160	864	95	below_threshold
Salinicola endophyticus	strain=CPA92	GCA_003206575.1	1949083	1949083	type	True	77.0107	91	864	95	below_threshold
Thiohalocapsa marina	strain=DSM 19078	GCA_008632335.1	424902	424902	type	True	76.9496	125	864	95	below_threshold
Ectothiorhodospira magna	strain=B7-7	GCA_900110965.1	867345	867345	type	True	76.7761	83	864	95	below_threshold
Pseudomonas thermotolerans	strain=DSM 14292	GCA_000364625.1	157784	157784	type	True	76.7445	100	864	95	below_threshold
Pseudomonas linyingensis	strain=LMG 25967	GCA_900109175.1	915471	915471	type	True	76.6807	121	864	95	below_threshold
Pseudomonas oryzae	strain=KCTC 32247	GCA_900104805.1	1392877	1392877	type	True	76.5947	143	864	95	below_threshold
Azotobacter chroococcum subsp. isscasi	strain=P205	GCA_004327895.1	2528971	353	type	True	76.544	117	864	95	below_threshold
Plasticicumulans lactativorans	strain=DSM 25287	GCA_004341245.1	1133106	1133106	type	True	76.4901	113	864	95	below_threshold
Methylonatrum kenyense	strain=AMT 1	GCA_023195885.1	455253	455253	type	True	76.4893	68	864	95	below_threshold
Pseudomonas flexibilis	strain=ATCC 29606	GCA_900155995.1	706570	706570	type	True	76.4819	101	864	95	below_threshold
Azotobacter chroococcum	strain=DSM 2286	GCA_004339665.1	353	353	type	True	76.439	116	864	95	below_threshold
Azotobacter chroococcum	strain=ATCC 9043	GCA_004327905.1	353	353	type	True	76.4379	115	864	95	below_threshold
Halomonas taeanensis	strain=BH539	GCA_900100755.1	284577	284577	type	True	76.3864	71	864	95	below_threshold
Halomonas gudaonensis	strain=CGMCC 1.6133	GCA_900100195.1	376427	376427	type	True	76.2913	86	864	95	below_threshold
Luteimonas salinisoli	strain=SJ-92	GCA_013425525.1	2752307	2752307	type	True	76.1602	76	864	95	below_threshold
Pseudoduganella namucuonensis	strain=CGMCC 1.11014	GCA_900116645.1	1035707	1035707	type	True	76.1006	83	864	95	below_threshold
Stutzerimonas kunmingensis	strain=DSM 25974	GCA_024397575.1	1211807	1211807	type	True	76.0853	66	864	95	below_threshold
Stutzerimonas kunmingensis	strain=DSM 25974	GCA_900114065.1	1211807	1211807	type	True	76.0558	69	864	95	below_threshold
Stutzerimonas chloritidismutans	strain=AW-1	GCA_000495915.1	203192	203192	type	True	76.0144	73	864	95	below_threshold
Halofilum ochraceum	strain=XJ16	GCA_001614315.2	1611323	1611323	type	True	76.0113	56	864	95	below_threshold
Pseudomonas songnenensis	strain=NEAU-ST5-5	GCA_003696315.1	1176259	1176259	type	True	75.9359	71	864	95	below_threshold
Pseudomonas songnenensis	strain=DSM 27560T	GCA_024448495.1	1176259	1176259	type	True	75.8981	73	864	95	below_threshold
Pseudomonas cavernicola	strain=K1S02-6	GCA_003596405.1	2320866	2320866	type	True	75.8205	59	864	95	below_threshold
--------------------------------------------------------------------------------
[2023-06-27 05:50:48,425] [INFO] DFAST Taxonomy check result was written to GCA_026197035.1_ASM2619703v1_genomic.fna/tc_result.tsv
[2023-06-27 05:50:48,426] [INFO] ===== Taxonomy check completed =====
[2023-06-27 05:50:48,426] [INFO] ===== Start completeness check using CheckM =====
[2023-06-27 05:50:48,426] [INFO] Setting CHECKM_DATA_PATH to /var/lib/cwl/stgd03e9b52-4035-4b85-9fc9-f14841229e06/dqc_reference/checkm_data
[2023-06-27 05:50:48,428] [INFO] Selected 'Prokaryote' markers (life, taxid=0) for CheckM
[2023-06-27 05:50:48,473] [INFO] Task started: CheckM
[2023-06-27 05:50:48,474] [INFO] Running command: checkm taxonomy_wf --tab_table -f GCA_026197035.1_ASM2619703v1_genomic.fna/cc_result.tsv -t 1 life "Prokaryote" GCA_026197035.1_ASM2619703v1_genomic.fna/checkm_input GCA_026197035.1_ASM2619703v1_genomic.fna/checkm_result
[2023-06-27 05:51:16,200] [INFO] Task succeeded: CheckM
[2023-06-27 05:51:16,201] [INFO] Completeness check finished.
--------------------------------------------------------------------------------
Completeness: 100.00%
Contamintation: 0.00%
Strain heterogeneity: 0.00%
--------------------------------------------------------------------------------
[2023-06-27 05:51:16,222] [INFO] ===== Completeness check finished =====
[2023-06-27 05:51:16,222] [INFO] ===== Start GTDB Search =====
[2023-06-27 05:51:16,222] [INFO] Query marker FASTA already exists. Will reuse it. (GCA_026197035.1_ASM2619703v1_genomic.fna/markers.fasta)
[2023-06-27 05:51:16,223] [INFO] Task started: Blastn
[2023-06-27 05:51:16,223] [INFO] Running command: blastn -query GCA_026197035.1_ASM2619703v1_genomic.fna/markers.fasta -db /var/lib/cwl/stgd03e9b52-4035-4b85-9fc9-f14841229e06/dqc_reference/reference_markers_gtdb.fasta -out GCA_026197035.1_ASM2619703v1_genomic.fna/blast.markers.gtdb.tsv -outfmt 6 -max_hsps 1 -num_alignments 5
[2023-06-27 05:51:17,776] [INFO] Task succeeded: Blastn
[2023-06-27 05:51:17,780] [INFO] Selected 19 target genomes.
[2023-06-27 05:51:17,780] [INFO] Target genome list was writen to GCA_026197035.1_ASM2619703v1_genomic.fna/target_genomes_gtdb.txt
[2023-06-27 05:51:17,785] [INFO] Task started: fastANI
[2023-06-27 05:51:17,785] [INFO] Running command: fastANI --query /var/lib/cwl/stg731ea11f-74da-4999-add2-d08739c78a82/GCA_026197035.1_ASM2619703v1_genomic.fna.gz --refList GCA_026197035.1_ASM2619703v1_genomic.fna/target_genomes_gtdb.txt --output GCA_026197035.1_ASM2619703v1_genomic.fna/fastani_result_gtdb.tsv --threads 1
[2023-06-27 05:51:28,045] [INFO] Task succeeded: fastANI
[2023-06-27 05:51:28,065] [INFO] Found 19 fastANI hits (0 hits with ANI > circumscription radius)
[2023-06-27 05:51:28,066] [INFO] GTDB search result
--------------------------------------------------------------------------------
accession	gtdb_species	ani	matched_fragments	total_fragments	gtdb_taxonomy	ani_circumscription_radius	mean_intra_species_ani	min_intra_species_ani	mean_intra_species_af	min_intra_species_af	num_clustered_genomes	status
GCA_007125455.1	s__SLDE01 sp007125455	78.7201	249	864	d__Bacteria;p__Proteobacteria;c__Gammaproteobacteria;o__Thiohalomonadales;f__Thiohalomonadaceae;g__SLDE01	95.0	99.17	99.17	0.92	0.92	2	-
GCA_002376325.1	s__Sulfurivermis sp002376325	78.1854	191	864	d__Bacteria;p__Proteobacteria;c__Gammaproteobacteria;o__Thiohalomonadales;f__Thiohalomonadaceae;g__Sulfurivermis	95.0	N/A	N/A	N/A	N/A	1	-
GCF_004001245.1	s__Sulfurivermis fontis	78.1309	281	864	d__Bacteria;p__Proteobacteria;c__Gammaproteobacteria;o__Thiohalomonadales;f__Thiohalomonadaceae;g__Sulfurivermis	95.0	N/A	N/A	N/A	N/A	1	-
GCA_014859545.1	s__SLDE01 sp014859545	78.0137	226	864	d__Bacteria;p__Proteobacteria;c__Gammaproteobacteria;o__Thiohalomonadales;f__Thiohalomonadaceae;g__SLDE01	95.0	N/A	N/A	N/A	N/A	1	-
GCA_002840095.1	s__Sulfurivermis sp002840095	77.9726	286	864	d__Bacteria;p__Proteobacteria;c__Gammaproteobacteria;o__Thiohalomonadales;f__Thiohalomonadaceae;g__Sulfurivermis	95.0	N/A	N/A	N/A	N/A	1	-
GCA_003062205.1	s__Sedimenticola_A endophacoides	77.6637	149	864	d__Bacteria;p__Proteobacteria;c__Gammaproteobacteria;o__Chromatiales;f__Sedimenticolaceae;g__Sedimenticola_A	95.0	99.88	99.76	0.97	0.95	6	-
GCF_900102855.1	s__Thiohalomonas denitrificans	77.3681	184	864	d__Bacteria;p__Proteobacteria;c__Gammaproteobacteria;o__Thiohalomonadales;f__Thiohalomonadaceae;g__Thiohalomonas	95.0	N/A	N/A	N/A	N/A	1	-
GCA_011371455.1	s__DRQN01 sp011371455	77.3676	134	864	d__Bacteria;p__Proteobacteria;c__Gammaproteobacteria;o__SZUA-152;f__SZUA-152;g__DRQN01	95.0	97.38	97.35	0.90	0.88	5	-
GCA_014762505.1	s__SpSt-1174 sp014762505	77.232	182	864	d__Bacteria;p__Proteobacteria;c__Gammaproteobacteria;o__SpSt-1174;f__SpSt-1174;g__SpSt-1174	95.0	N/A	N/A	N/A	N/A	1	-
GCA_014762495.1	s__JABURT01 sp014762495	77.1482	103	864	d__Bacteria;p__Proteobacteria;c__Gammaproteobacteria;o__JABURT01;f__JABURT01;g__JABURT01	95.0	N/A	N/A	N/A	N/A	1	-
GCF_004801855.1	s__Pseudomonas_K sp004801855	77.056	135	864	d__Bacteria;p__Proteobacteria;c__Gammaproteobacteria;o__Pseudomonadales;f__Pseudomonadaceae;g__Pseudomonas_K	95.0	98.38	98.38	0.89	0.89	2	-
GCA_003251535.1	s__SZUA-493 sp003251535	76.9746	171	864	d__Bacteria;p__Proteobacteria;c__Gammaproteobacteria;o__Thiohalomonadales;f__Thiohalomonadaceae;g__SZUA-493	95.0	N/A	N/A	N/A	N/A	1	-
GCF_000364625.1	s__Pseudomonas_E thermotolerans	76.7445	100	864	d__Bacteria;p__Proteobacteria;c__Gammaproteobacteria;o__Pseudomonadales;f__Pseudomonadaceae;g__Pseudomonas_E	95.0	99.25	99.24	0.93	0.93	3	-
GCF_900109175.1	s__Pseudomonas_K linyingensis	76.7175	121	864	d__Bacteria;p__Proteobacteria;c__Gammaproteobacteria;o__Pseudomonadales;f__Pseudomonadaceae;g__Pseudomonas_K	95.0	N/A	N/A	N/A	N/A	1	-
GCA_011046015.1	s__SpSt-1174 sp011046015	76.6933	117	864	d__Bacteria;p__Proteobacteria;c__Gammaproteobacteria;o__SpSt-1174;f__SpSt-1174;g__SpSt-1174	95.0	N/A	N/A	N/A	N/A	1	-
GCA_016712635.1	s__JADJWH01 sp016712635	76.6765	95	864	d__Bacteria;p__Proteobacteria;c__Gammaproteobacteria;o__JADJWH01;f__JADJWH01;g__JADJWH01	95.0	N/A	N/A	N/A	N/A	1	-
GCF_000017205.1	s__Pseudomonas aeruginosa_A	76.6389	115	864	d__Bacteria;p__Proteobacteria;c__Gammaproteobacteria;o__Pseudomonadales;f__Pseudomonadaceae;g__Pseudomonas	95.0	99.07	98.78	0.93	0.80	42	-
GCA_015487785.1	s__S012-40 sp015487785	76.4632	88	864	d__Bacteria;p__Proteobacteria;c__Gammaproteobacteria;o__S140-43;f__S140-43;g__S012-40	95.0	99.75	99.72	0.93	0.92	3	-
GCF_004339665.1	s__Azotobacter chroococcum	76.4214	117	864	d__Bacteria;p__Proteobacteria;c__Gammaproteobacteria;o__Pseudomonadales;f__Pseudomonadaceae;g__Azotobacter	95.0	97.27	96.18	0.89	0.83	10	-
--------------------------------------------------------------------------------
[2023-06-27 05:51:28,083] [INFO] GTDB search result was written to GCA_026197035.1_ASM2619703v1_genomic.fna/result_gtdb.tsv
[2023-06-27 05:51:28,084] [INFO] ===== GTDB Search completed =====
[2023-06-27 05:51:28,110] [INFO] DFAST_QC result json was written to GCA_026197035.1_ASM2619703v1_genomic.fna/dqc_result.json
[2023-06-27 05:51:28,110] [INFO] DFAST_QC completed!
[2023-06-27 05:51:28,111] [INFO] Total running time: 0h1m7s