[2023-06-19 09:16:55,786] [INFO] DFAST_QC pipeline started.
[2023-06-19 09:16:55,795] [INFO] DFAST_QC version: 0.5.7
[2023-06-19 09:16:55,796] [INFO] DQC Reference Directory: /var/lib/cwl/stgbbb00906-aaa8-4b4c-8df1-3d448d64e8cc/dqc_reference
[2023-06-19 09:16:57,441] [INFO] ===== Start taxonomy check using ANI =====
[2023-06-19 09:16:57,442] [INFO] Task started: Prodigal
[2023-06-19 09:16:57,442] [INFO] Running command: gunzip -c /var/lib/cwl/stgab62869e-7fe4-4102-8f76-fe7e55c72490/GCA_014378295.1_ASM1437829v1_genomic.fna.gz | prodigal -d GCA_014378295.1_ASM1437829v1_genomic.fna/cds.fna -a GCA_014378295.1_ASM1437829v1_genomic.fna/protein.faa -g 11 -q > /dev/null
[2023-06-19 09:17:03,595] [INFO] Task succeeded: Prodigal
[2023-06-19 09:17:03,596] [INFO] Task started: HMMsearch
[2023-06-19 09:17:03,596] [INFO] Running command: hmmsearch --tblout GCA_014378295.1_ASM1437829v1_genomic.fna/hmmer_result.tsv -E 1E-50 /var/lib/cwl/stgbbb00906-aaa8-4b4c-8df1-3d448d64e8cc/dqc_reference/reference_markers.hmm GCA_014378295.1_ASM1437829v1_genomic.fna/protein.faa > /dev/null
[2023-06-19 09:17:03,827] [INFO] Task succeeded: HMMsearch
[2023-06-19 09:17:03,828] [WARNING] Found 3/6 markers. [/var/lib/cwl/stgab62869e-7fe4-4102-8f76-fe7e55c72490/GCA_014378295.1_ASM1437829v1_genomic.fna.gz]
[2023-06-19 09:17:03,860] [INFO] Query marker FASTA was written to GCA_014378295.1_ASM1437829v1_genomic.fna/markers.fasta
[2023-06-19 09:17:03,860] [INFO] Task started: Blastn
[2023-06-19 09:17:03,861] [INFO] Running command: blastn -query GCA_014378295.1_ASM1437829v1_genomic.fna/markers.fasta -db /var/lib/cwl/stgbbb00906-aaa8-4b4c-8df1-3d448d64e8cc/dqc_reference/reference_markers.fasta -out GCA_014378295.1_ASM1437829v1_genomic.fna/blast.markers.tsv -outfmt 6 -max_hsps 1 -num_alignments 5
[2023-06-19 09:17:04,435] [INFO] Task succeeded: Blastn
[2023-06-19 09:17:04,442] [INFO] Selected 22 target genomes.
[2023-06-19 09:17:04,443] [INFO] Target genome list was writen to GCA_014378295.1_ASM1437829v1_genomic.fna/target_genomes.txt
[2023-06-19 09:17:04,448] [INFO] Task started: fastANI
[2023-06-19 09:17:04,449] [INFO] Running command: fastANI --query /var/lib/cwl/stgab62869e-7fe4-4102-8f76-fe7e55c72490/GCA_014378295.1_ASM1437829v1_genomic.fna.gz --refList GCA_014378295.1_ASM1437829v1_genomic.fna/target_genomes.txt --output GCA_014378295.1_ASM1437829v1_genomic.fna/fastani_result.tsv --threads 1
[2023-06-19 09:17:20,329] [INFO] Task succeeded: fastANI
[2023-06-19 09:17:20,330] [INFO] Loading species specific ANI threshold from /var/lib/cwl/stgbbb00906-aaa8-4b4c-8df1-3d448d64e8cc/dqc_reference/prokaryote_ANI_species_specific_threshold.txt
[2023-06-19 09:17:20,330] [WARNING] Species-specific ANI threshold file not found. Will use the default threshold for all species. [/var/lib/cwl/stgbbb00906-aaa8-4b4c-8df1-3d448d64e8cc/dqc_reference/prokaryote_ANI_species_specific_threshold.txt]
[2023-06-19 09:17:20,344] [INFO] Found 18 fastANI hits (0 hits with ANI > threshold)
[2023-06-19 09:17:20,345] [INFO] The taxonomy check result is classified as 'below_threshold'.
[2023-06-19 09:17:20,345] [INFO] DFAST Taxonomy check final result
--------------------------------------------------------------------------------
organism_name	strain	accession	taxid	species_taxid	relation_to_type	validated	ani	matched_fragments	total_fragments	ani_threshold	status
Arenimonas oryziterrae	strain=YC6267	GCA_000747135.1	498055	498055	type	True	79.2277	207	433	95	below_threshold
Arenimonas oryziterrae	strain=DSM 21050	GCA_000420545.1	498055	498055	type	True	79.2271	207	433	95	below_threshold
Arenimonas terrae	strain=R29	GCA_006265115.1	2546226	2546226	type	True	78.6864	197	433	95	below_threshold
Arenimonas soli	strain=CGMCC 1.15905	GCA_014643775.1	2269504	2269504	type	True	78.6625	187	433	95	below_threshold
Arenimonas caeni	strain=z29	GCA_003024235.1	2058085	2058085	type	True	78.6171	173	433	95	below_threshold
Arenimonas metalli	strain=CF5-1	GCA_000747155.1	948077	948077	type	True	78.5956	192	433	95	below_threshold
Arenimonas malthae	strain=CC-JY-1	GCA_000747075.1	354197	354197	type	True	78.5045	197	433	95	below_threshold
Lysobacter ruishenii	strain=CGMCC 1.10136	GCA_007830115.1	686800	686800	type	True	78.0668	121	433	95	below_threshold
Luteimonas gilva	strain=H23	GCA_005239095.1	2572684	2572684	type	True	77.8276	123	433	95	below_threshold
Luteimonas cucumeris	strain=CGMCC 1.10821	GCA_007830035.1	985012	985012	type	True	77.6393	124	433	95	below_threshold
Thermomonas fusca	strain=DSM 15424	GCA_000423885.1	215690	215690	type	True	77.5858	117	433	95	below_threshold
Luteimonas marina	strain=FR1330	GCA_007859325.1	488485	488485	type	True	77.5715	122	433	95	below_threshold
Luteimonas aquatica	strain=RIB1-20	GCA_022662575.1	450364	450364	type	True	77.5419	128	433	95	below_threshold
Luteimonas lumbrici	strain=1.1416	GCA_006476065.1	2559601	2559601	type	True	77.5307	103	433	95	below_threshold
Vulcaniibacterium thermophilum	strain=KCTC 32020	GCA_007923255.1	1169913	1169913	type	True	77.5001	124	433	95	below_threshold
Vulcaniibacterium thermophilum	strain=KCTC 32020	GCA_014656335.1	1169913	1169913	type	True	77.4901	126	433	95	below_threshold
Pseudoxanthomonas spadix	strain=DSM 18855	GCA_003703395.1	415229	415229	type	True	77.2273	101	433	95	below_threshold
Variovorax boronicumulans	strain=NBRC 103145	GCA_001591345.1	436515	436515	type	True	75.7905	51	433	95	below_threshold
--------------------------------------------------------------------------------
[2023-06-19 09:17:20,348] [INFO] DFAST Taxonomy check result was written to GCA_014378295.1_ASM1437829v1_genomic.fna/tc_result.tsv
[2023-06-19 09:17:20,349] [INFO] ===== Taxonomy check completed =====
[2023-06-19 09:17:20,349] [INFO] ===== Start completeness check using CheckM =====
[2023-06-19 09:17:20,349] [INFO] Setting CHECKM_DATA_PATH to /var/lib/cwl/stgbbb00906-aaa8-4b4c-8df1-3d448d64e8cc/dqc_reference/checkm_data
[2023-06-19 09:17:20,350] [INFO] Selected 'Prokaryote' markers (life, taxid=0) for CheckM
[2023-06-19 09:17:20,378] [INFO] Task started: CheckM
[2023-06-19 09:17:20,379] [INFO] Running command: checkm taxonomy_wf --tab_table -f GCA_014378295.1_ASM1437829v1_genomic.fna/cc_result.tsv -t 1 life "Prokaryote" GCA_014378295.1_ASM1437829v1_genomic.fna/checkm_input GCA_014378295.1_ASM1437829v1_genomic.fna/checkm_result
[2023-06-19 09:17:44,234] [INFO] Task succeeded: CheckM
[2023-06-19 09:17:44,236] [INFO] Completeness check finished.
--------------------------------------------------------------------------------
Completeness: 48.61%
Contamintation: 2.08%
Strain heterogeneity: 0.00%
--------------------------------------------------------------------------------
[2023-06-19 09:17:44,263] [INFO] ===== Completeness check finished =====
[2023-06-19 09:17:44,264] [INFO] ===== Start GTDB Search =====
[2023-06-19 09:17:44,264] [INFO] Query marker FASTA already exists. Will reuse it.
(GCA_014378295.1_ASM1437829v1_genomic.fna/markers.fasta)
[2023-06-19 09:17:44,264] [INFO] Task started: Blastn
[2023-06-19 09:17:44,265] [INFO] Running command: blastn -query GCA_014378295.1_ASM1437829v1_genomic.fna/markers.fasta -db /var/lib/cwl/stgbbb00906-aaa8-4b4c-8df1-3d448d64e8cc/dqc_reference/reference_markers_gtdb.fasta -out GCA_014378295.1_ASM1437829v1_genomic.fna/blast.markers.gtdb.tsv -outfmt 6 -max_hsps 1 -num_alignments 5
[2023-06-19 09:17:44,930] [INFO] Task succeeded: Blastn
[2023-06-19 09:17:44,936] [INFO] Selected 22 target genomes.
[2023-06-19 09:17:44,936] [INFO] Target genome list was writen to GCA_014378295.1_ASM1437829v1_genomic.fna/target_genomes_gtdb.txt
[2023-06-19 09:17:44,949] [INFO] Task started: fastANI
[2023-06-19 09:17:44,949] [INFO] Running command: fastANI --query /var/lib/cwl/stgab62869e-7fe4-4102-8f76-fe7e55c72490/GCA_014378295.1_ASM1437829v1_genomic.fna.gz --refList GCA_014378295.1_ASM1437829v1_genomic.fna/target_genomes_gtdb.txt --output GCA_014378295.1_ASM1437829v1_genomic.fna/fastani_result_gtdb.tsv --threads 1
[2023-06-19 09:17:58,093] [INFO] Task succeeded: fastANI
[2023-06-19 09:17:58,111] [INFO] Found 18 fastANI hits (0 hits with ANI > circumscription radius)
[2023-06-19 09:17:58,111] [INFO] GTDB search result
--------------------------------------------------------------------------------
accession	gtdb_species	ani	matched_fragments	total_fragments	gtdb_taxonomy	ani_circumscription_radius	mean_intra_species_ani	min_intra_species_ani	mean_intra_species_af	min_intra_species_af	num_clustered_genomes	status
GCA_001801685.1	s__Arenimonas sp001801685	83.8233	336	433	d__Bacteria;p__Proteobacteria;c__Gammaproteobacteria;o__Xanthomonadales;f__Xanthomonadaceae;g__Arenimonas	95.0	N/A	N/A	N/A	N/A	1	-
GCF_000420545.1	s__Arenimonas oryziterrae	79.2271	207	433	d__Bacteria;p__Proteobacteria;c__Gammaproteobacteria;o__Xanthomonadales;f__Xanthomonadaceae;g__Arenimonas	95.0	100.00	100.00	1.00	1.00	2	-
GCF_007993735.1	s__Arenimonas daejeonensis	78.9865	186	433	d__Bacteria;p__Proteobacteria;c__Gammaproteobacteria;o__Xanthomonadales;f__Xanthomonadaceae;g__Arenimonas	95.0	N/A	N/A	N/A	N/A	1	-
GCF_006265115.1	s__Arenimonas terrae	78.6864	197	433	d__Bacteria;p__Proteobacteria;c__Gammaproteobacteria;o__Xanthomonadales;f__Xanthomonadaceae;g__Arenimonas	95.0	N/A	N/A	N/A	N/A	1	-
GCF_003024235.1	s__Arenimonas caeni	78.6171	173	433	d__Bacteria;p__Proteobacteria;c__Gammaproteobacteria;o__Xanthomonadales;f__Xanthomonadaceae;g__Arenimonas	95.0	N/A	N/A	N/A	N/A	1	-
GCF_000747155.1	s__Arenimonas metalli	78.5956	192	433	d__Bacteria;p__Proteobacteria;c__Gammaproteobacteria;o__Xanthomonadales;f__Xanthomonadaceae;g__Arenimonas	95.0	N/A	N/A	N/A	N/A	1	-
GCF_014638745.1	s__Arenimonas maotaiensis	78.4814	129	433	d__Bacteria;p__Proteobacteria;c__Gammaproteobacteria;o__Xanthomonadales;f__Xanthomonadaceae;g__Arenimonas	95.0	N/A	N/A	N/A	N/A	1	-
GCF_000743535.1	s__Arenimonas donghaensis	78.2719	168	433	d__Bacteria;p__Proteobacteria;c__Gammaproteobacteria;o__Xanthomonadales;f__Xanthomonadaceae;g__Arenimonas	95.0	N/A	N/A	N/A	N/A	1	-
GCA_003497545.1	s__Arenimonas sp003497545	78.2458	134	433	d__Bacteria;p__Proteobacteria;c__Gammaproteobacteria;o__Xanthomonadales;f__Xanthomonadaceae;g__Arenimonas	95.0	99.43	99.43	0.81	0.81	2	-
GCF_007830115.1	s__Lysobacter ruishenii	78.0668	121	433	d__Bacteria;p__Proteobacteria;c__Gammaproteobacteria;o__Xanthomonadales;f__Xanthomonadaceae;g__Lysobacter	95.0	N/A	N/A	N/A	N/A	1	-
GCF_018122625.2	s__Coralloluteibacterium stylophorae	77.7106	122	433	d__Bacteria;p__Proteobacteria;c__Gammaproteobacteria;o__Xanthomonadales;f__Xanthomonadaceae;g__Coralloluteibacterium	95.0	N/A	N/A	N/A	N/A	1	-
GCF_014395425.1	s__Thermomonas brevis	77.5545	129	433	d__Bacteria;p__Proteobacteria;c__Gammaproteobacteria;o__Xanthomonadales;f__Xanthomonadaceae;g__Thermomonas	95.0	N/A	N/A	N/A	N/A	1	-
GCF_006476065.1	s__Luteimonas_B lumbrici	77.5307	103	433	d__Bacteria;p__Proteobacteria;c__Gammaproteobacteria;o__Xanthomonadales;f__Xanthomonadaceae;g__Luteimonas_B	95.0	N/A	N/A	N/A	N/A	1	-
GCF_014202935.1	s__Rehaibacterium terrae	77.4893	138	433	d__Bacteria;p__Proteobacteria;c__Gammaproteobacteria;o__Xanthomonadales;f__Xanthomonadaceae;g__Rehaibacterium	95.0	N/A	N/A	N/A	N/A	1	-
GCF_004284655.1	s__Pseudoxanthomonas_A spadix_A	77.4612	121	433	d__Bacteria;p__Proteobacteria;c__Gammaproteobacteria;o__Xanthomonadales;f__Xanthomonadaceae;g__Pseudoxanthomonas_A	95.0	98.46	98.41	0.91	0.91	3	-
GCA_018333835.1	s__Silanimonas sp018333835	77.3309	92	433	d__Bacteria;p__Proteobacteria;c__Gammaproteobacteria;o__Xanthomonadales;f__Xanthomonadaceae;g__Silanimonas	95.0	N/A	N/A	N/A	N/A	1	-
GCF_003703395.1	s__Pseudoxanthomonas_A spadix	77.2273	101	433	d__Bacteria;p__Proteobacteria;c__Gammaproteobacteria;o__Xanthomonadales;f__Xanthomonadaceae;g__Pseudoxanthomonas_A	95.0	99.25	98.51	0.95	0.90	3	-
GCF_900112425.1	s__Variovorax sp900112425	75.8729	59	433	d__Bacteria;p__Proteobacteria;c__Gammaproteobacteria;o__Burkholderiales;f__Burkholderiaceae;g__Variovorax	95.0	99.32	98.64	0.96	0.91	3	-
--------------------------------------------------------------------------------
[2023-06-19 09:17:58,113] [INFO] GTDB search result was written to GCA_014378295.1_ASM1437829v1_genomic.fna/result_gtdb.tsv
[2023-06-19 09:17:58,113] [INFO] ===== GTDB Search completed =====
[2023-06-19 09:17:58,118] [INFO] DFAST_QC result json was written to GCA_014378295.1_ASM1437829v1_genomic.fna/dqc_result.json
[2023-06-19 09:17:58,119] [INFO] DFAST_QC completed!
[2023-06-19 09:17:58,119] [INFO] Total running time: 0h1m2s