[2023-06-13 21:20:32,239] [INFO] DFAST_QC pipeline started.
[2023-06-13 21:20:32,244] [INFO] DFAST_QC version: 0.5.7
[2023-06-13 21:20:32,244] [INFO] DQC Reference Directory: /var/lib/cwl/stgdceca92e-e185-44df-896b-51f77f359f0f/dqc_reference
[2023-06-13 21:20:33,535] [INFO] ===== Start taxonomy check using ANI =====
[2023-06-13 21:20:33,536] [INFO] Task started: Prodigal
[2023-06-13 21:20:33,536] [INFO] Running command: gunzip -c /var/lib/cwl/stge6390157-783a-443b-8f4b-5fe1c8d48dcb/GCA_022841325.1_ASM2284132v1_genomic.fna.gz | prodigal -d GCA_022841325.1_ASM2284132v1_genomic.fna/cds.fna -a GCA_022841325.1_ASM2284132v1_genomic.fna/protein.faa -g 11 -q > /dev/null
[2023-06-13 21:20:50,519] [INFO] Task succeeded: Prodigal
[2023-06-13 21:20:50,520] [INFO] Task started: HMMsearch
[2023-06-13 21:20:50,520] [INFO] Running command: hmmsearch --tblout GCA_022841325.1_ASM2284132v1_genomic.fna/hmmer_result.tsv -E 1E-50 /var/lib/cwl/stgdceca92e-e185-44df-896b-51f77f359f0f/dqc_reference/reference_markers.hmm GCA_022841325.1_ASM2284132v1_genomic.fna/protein.faa > /dev/null
[2023-06-13 21:20:50,848] [INFO] Task succeeded: HMMsearch
[2023-06-13 21:20:50,849] [INFO] Found 6/6 markers.
[2023-06-13 21:20:50,894] [INFO] Query marker FASTA was written to GCA_022841325.1_ASM2284132v1_genomic.fna/markers.fasta
[2023-06-13 21:20:50,894] [INFO] Task started: Blastn
[2023-06-13 21:20:50,894] [INFO] Running command: blastn -query GCA_022841325.1_ASM2284132v1_genomic.fna/markers.fasta -db /var/lib/cwl/stgdceca92e-e185-44df-896b-51f77f359f0f/dqc_reference/reference_markers.fasta -out GCA_022841325.1_ASM2284132v1_genomic.fna/blast.markers.tsv -outfmt 6 -max_hsps 1 -num_alignments 5
[2023-06-13 21:20:51,726] [INFO] Task succeeded: Blastn
[2023-06-13 21:20:51,735] [INFO] Selected 31 target genomes.
[2023-06-13 21:20:51,735] [INFO] Target genome list was writen to GCA_022841325.1_ASM2284132v1_genomic.fna/target_genomes.txt
[2023-06-13 21:20:51,737] [INFO] Task started: fastANI
[2023-06-13 21:20:51,737] [INFO] Running command: fastANI --query /var/lib/cwl/stge6390157-783a-443b-8f4b-5fe1c8d48dcb/GCA_022841325.1_ASM2284132v1_genomic.fna.gz --refList GCA_022841325.1_ASM2284132v1_genomic.fna/target_genomes.txt --output GCA_022841325.1_ASM2284132v1_genomic.fna/fastani_result.tsv --threads 1
[2023-06-13 21:21:11,815] [INFO] Task succeeded: fastANI
[2023-06-13 21:21:11,815] [INFO] Loading species specific ANI threshold from /var/lib/cwl/stgdceca92e-e185-44df-896b-51f77f359f0f/dqc_reference/prokaryote_ANI_species_specific_threshold.txt
[2023-06-13 21:21:11,816] [WARNING] Species-specific ANI threshold file not found. Will use the default threshold for all species. [/var/lib/cwl/stgdceca92e-e185-44df-896b-51f77f359f0f/dqc_reference/prokaryote_ANI_species_specific_threshold.txt]
[2023-06-13 21:21:11,833] [INFO] Found 31 fastANI hits (0 hits with ANI > threshold)
[2023-06-13 21:21:11,833] [INFO] The taxonomy check result is classified as 'below_threshold'.
[2023-06-13 21:21:11,834] [INFO] DFAST Taxonomy check final result
--------------------------------------------------------------------------------
organism_name	strain	accession	taxid	species_taxid	relation_to_type	validated	ani	matched_fragments	total_fragments	ani_threshold	status
Arenimonas metalli	strain=CF5-1	GCA_000747155.1	948077	948077	type	True	77.0054	273	1856	95	below_threshold
Arenimonas donghaensis	strain=HO3-R19	GCA_000743535.1	375061	375061	type	True	76.9887	231	1856	95	below_threshold
Lysobacter alkalisoli	strain=SJ-36	GCA_006547045.1	2591633	2591633	type	True	76.8896	207	1856	95	below_threshold
Luteimonas weifangensis	strain=WF-2	GCA_003416885.1	2303539	2303539	type	True	76.8713	219	1856	95	below_threshold
Pseudoxanthomonas sangjuensis	strain=DSM 28345	GCA_010211755.1	1503750	1503750	type	True	76.8632	224	1856	95	below_threshold
Oleiagrimonas soli	strain=3.5X	GCA_000761445.1	1543381	1543381	type	True	76.8581	224	1856	95	below_threshold
Pseudoxanthomonas helianthi	strain=110414	GCA_017939625.1	1453541	1453541	type	True	76.8454	215	1856	95	below_threshold
Dyella solisilvae	strain=DHG54	GCA_003351225.1	1920168	1920168	type	True	76.8414	201	1856	95	below_threshold
Oleiagrimonas soli	strain=DSM 107085	GCA_014201555.1	1543381	1543381	type	True	76.8331	226	1856	95	below_threshold
Rhodanobacter spathiphylli	strain=B39	GCA_000264295.1	347483	347483	type	True	76.829	234	1856	95	below_threshold
Dokdonella immobilis	strain=CGMCC 1.7659	GCA_900115085.1	578942	578942	type	True	76.8216	241	1856	95	below_threshold
Rhodanobacter denitrificans	strain=2APBS1	GCA_000230695.3	666685	666685	type	True	76.8058	241	1856	95	below_threshold
Oleiagrimonas citrea	strain=MEBiC09124	GCA_012395255.1	1665687	1665687	type	True	76.7939	182	1856	95	below_threshold
Luteimonas aquatica	strain=RIB1-20	GCA_022662575.1	450364	450364	type	True	76.7258	256	1856	95	below_threshold
Rhodanobacter thiooxydans	strain=LCS2	GCA_000264375.1	416169	416169	type	True	76.5794	215	1856	95	below_threshold
Dokdonella koreensis	strain=DS-123	GCA_001632775.1	323415	323415	type	True	76.5413	254	1856	95	below_threshold
Metallibacterium scheffleri	strain=DSM 24874	GCA_004798955.1	993689	993689	type	True	76.5056	198	1856	95	below_threshold
Metallibacterium scheffleri	strain=DKE6	GCA_002077135.1	993689	993689	type	True	76.4968	196	1856	95	below_threshold
Tahibacter caeni	strain=BUT-6	GCA_024609805.1	1453545	1453545	type	True	76.4763	299	1856	95	below_threshold
Pseudoxanthomonas japonensis	strain=DSM 17109	GCA_010093445.1	69284	69284	type	True	76.4744	223	1856	95	below_threshold
Rhodanobacter fulvus	strain=Jip2	GCA_000264315.1	219571	219571	type	True	76.4232	223	1856	95	below_threshold
Dyella acidiphila	strain=7MK23	GCA_014863405.1	2775866	2775866	type	True	76.391	213	1856	95	below_threshold
Dyella japonica	strain=DSM 16301	GCA_001010355.1	231455	231455	type	True	76.329	176	1856	95	below_threshold
Dyella choica	strain=4 M-K27	GCA_003977665.1	1927959	1927959	type	True	76.255	161	1856	95	below_threshold
Marichromatium gracile	strain=DSM 203	GCA_016583515.1	1048	1048	type	True	75.8066	131	1856	95	below_threshold
Marichromatium gracile	strain=DSM 203	GCA_004343155.1	1048	1048	type	True	75.7789	136	1856	95	below_threshold
Pseudomonas nitroreducens	strain=DSM 14399	GCA_012986245.1	46680	46680	suspected-type	True	75.5585	164	1856	95	below_threshold
Salinicola salarius	strain=DSM 18044	GCA_003206135.1	430457	430457	type	True	75.5337	81	1856	95	below_threshold
Pseudomonas nitroreducens	strain=NBRC 12694	GCA_002091755.1	46680	46680	suspected-type	True	75.5113	166	1856	95	below_threshold
Salinicola peritrichatus	strain=JCM 18795	GCA_003206715.1	1267424	1267424	type	True	75.4548	87	1856	95	below_threshold
Parahaliea mediterranea	strain=DSM 21924	GCA_003402235.1	651086	651086	type	True	75.1974	102	1856	95	below_threshold
--------------------------------------------------------------------------------
[2023-06-13 21:21:11,836] [INFO] DFAST Taxonomy check result was written to GCA_022841325.1_ASM2284132v1_genomic.fna/tc_result.tsv
[2023-06-13 21:21:11,837] [INFO] ===== Taxonomy check completed =====
[2023-06-13 21:21:11,837] [INFO] ===== Start completeness check using CheckM =====
[2023-06-13 21:21:11,838] [INFO] Setting CHECKM_DATA_PATH to /var/lib/cwl/stgdceca92e-e185-44df-896b-51f77f359f0f/dqc_reference/checkm_data
[2023-06-13 21:21:11,838] [INFO] Selected 'Prokaryote' markers (life, taxid=0) for CheckM
[2023-06-13 21:21:11,895] [INFO] Task started: CheckM
[2023-06-13 21:21:11,896] [INFO] Running command: checkm taxonomy_wf --tab_table -f GCA_022841325.1_ASM2284132v1_genomic.fna/cc_result.tsv -t 1 life "Prokaryote" GCA_022841325.1_ASM2284132v1_genomic.fna/checkm_input GCA_022841325.1_ASM2284132v1_genomic.fna/checkm_result
[2023-06-13 21:22:07,320] [INFO] Task succeeded: CheckM
[2023-06-13 21:22:07,321] [INFO] Completeness check finished.
--------------------------------------------------------------------------------
Completeness: 88.89%
Contamintation: 12.50%
Strain heterogeneity: 0.00%
--------------------------------------------------------------------------------
[2023-06-13 21:22:07,342] [INFO] ===== Completeness check finished =====
[2023-06-13 21:22:07,342] [INFO] ===== Start GTDB Search =====
[2023-06-13 21:22:07,344] [INFO] Query marker FASTA already exists. Will reuse it. (GCA_022841325.1_ASM2284132v1_genomic.fna/markers.fasta)
[2023-06-13 21:22:07,344] [INFO] Task started: Blastn
[2023-06-13 21:22:07,344] [INFO] Running command: blastn -query GCA_022841325.1_ASM2284132v1_genomic.fna/markers.fasta -db /var/lib/cwl/stgdceca92e-e185-44df-896b-51f77f359f0f/dqc_reference/reference_markers_gtdb.fasta -out GCA_022841325.1_ASM2284132v1_genomic.fna/blast.markers.gtdb.tsv -outfmt 6 -max_hsps 1 -num_alignments 5
[2023-06-13 21:22:08,921] [INFO] Task succeeded: Blastn
[2023-06-13 21:22:08,924] [INFO] Selected 21 target genomes.
[2023-06-13 21:22:08,924] [INFO] Target genome list was writen to GCA_022841325.1_ASM2284132v1_genomic.fna/target_genomes_gtdb.txt
[2023-06-13 21:22:08,972] [INFO] Task started: fastANI
[2023-06-13 21:22:08,972] [INFO] Running command: fastANI --query /var/lib/cwl/stge6390157-783a-443b-8f4b-5fe1c8d48dcb/GCA_022841325.1_ASM2284132v1_genomic.fna.gz --refList GCA_022841325.1_ASM2284132v1_genomic.fna/target_genomes_gtdb.txt --output GCA_022841325.1_ASM2284132v1_genomic.fna/fastani_result_gtdb.tsv --threads 1
[2023-06-13 21:22:22,776] [INFO] Task succeeded: fastANI
[2023-06-13 21:22:22,789] [INFO] Found 20 fastANI hits (0 hits with ANI > circumscription radius)
[2023-06-13 21:22:22,789] [INFO] GTDB search result
--------------------------------------------------------------------------------
accession	gtdb_species	ani	matched_fragments	total_fragments	gtdb_taxonomy	ani_circumscription_radius	mean_intra_species_ani	min_intra_species_ani	mean_intra_species_af	min_intra_species_af	num_clustered_genomes	status
GCA_016717705.1	s__JADKFD01 sp016717705	80.2251	798	1856	d__Bacteria;p__Proteobacteria;c__Gammaproteobacteria;o__Xanthomonadales;f__SZUA-5;g__JADKFD01	95.0	N/A	N/A	N/A	N/A	1	-
GCA_016713135.1	s__JADKFD01 sp016713135	79.9242	836	1856	d__Bacteria;p__Proteobacteria;c__Gammaproteobacteria;o__Xanthomonadales;f__SZUA-5;g__JADKFD01	95.0	N/A	N/A	N/A	N/A	1	-
GCA_019187485.1	s__JADKFD01 sp019187485	79.6668	674	1856	d__Bacteria;p__Proteobacteria;c__Gammaproteobacteria;o__Xanthomonadales;f__SZUA-5;g__JADKFD01	95.0	N/A	N/A	N/A	N/A	1	-
GCF_900101825.1	s__Aquimonas voraii	77.2985	318	1856	d__Bacteria;p__Proteobacteria;c__Gammaproteobacteria;o__Xanthomonadales;f__Xanthomonadaceae;g__Aquimonas	95.0	N/A	N/A	N/A	N/A	1	-
GCF_014202935.1	s__Rehaibacterium terrae	77.214	254	1856	d__Bacteria;p__Proteobacteria;c__Gammaproteobacteria;o__Xanthomonadales;f__Xanthomonadaceae;g__Rehaibacterium	95.0	N/A	N/A	N/A	N/A	1	-
GCA_016182785.1	s__JADKHK01 sp016182785	76.9771	271	1856	d__Bacteria;p__Proteobacteria;c__Gammaproteobacteria;o__Xanthomonadales;f__Ahniellaceae;g__JADKHK01	95.0	N/A	N/A	N/A	N/A	1	-
GCA_018334015.1	s__Silanimonas sp018334015	76.9458	175	1856	d__Bacteria;p__Proteobacteria;c__Gammaproteobacteria;o__Xanthomonadales;f__Xanthomonadaceae;g__Silanimonas	95.0	99.95	99.95	0.95	0.95	2	-
GCF_000761445.1	s__Oleiagrimonas soli	76.8787	222	1856	d__Bacteria;p__Proteobacteria;c__Gammaproteobacteria;o__Xanthomonadales;f__Rhodanobacteraceae;g__Oleiagrimonas	95.0	99.99	99.99	1.00	1.00	2	-
GCF_000953855.2	s__Mizugakiibacter sediminis	76.8331	279	1856	d__Bacteria;p__Proteobacteria;c__Gammaproteobacteria;o__Xanthomonadales;f__Rhodanobacteraceae;g__Mizugakiibacter	95.0	N/A	N/A	N/A	N/A	1	-
GCF_012395255.1	s__Oleiagrimonas citrea	76.8049	181	1856	d__Bacteria;p__Proteobacteria;c__Gammaproteobacteria;o__Xanthomonadales;f__Rhodanobacteraceae;g__Oleiagrimonas	95.0	98.37	98.37	0.90	0.90	2	-
GCA_004297225.1	s__Luteimonas_B sp004297225	76.6843	168	1856	d__Bacteria;p__Proteobacteria;c__Gammaproteobacteria;o__Xanthomonadales;f__Xanthomonadaceae;g__Luteimonas_B	95.0	N/A	N/A	N/A	N/A	1	-
GCF_000185965.1	s__Pseudoxanthomonas suwonensis_A	76.6734	237	1856	d__Bacteria;p__Proteobacteria;c__Gammaproteobacteria;o__Xanthomonadales;f__Xanthomonadaceae;g__Pseudoxanthomonas	95.0	N/A	N/A	N/A	N/A	1	-
GCA_009996615.1	s__Dokdonella_A sp009996615	76.5757	241	1856	d__Bacteria;p__Proteobacteria;c__Gammaproteobacteria;o__Xanthomonadales;f__Rhodanobacteraceae;g__Dokdonella_A	95.0	N/A	N/A	N/A	N/A	1	-
GCF_002077135.1	s__Metallibacterium scheffleri	76.4573	200	1856	d__Bacteria;p__Proteobacteria;c__Gammaproteobacteria;o__Xanthomonadales;f__Rhodanobacteraceae;g__Metallibacterium	95.0	98.26	96.51	0.97	0.94	3	-
GCA_002483055.1	s__Dokdonella sp002483055	76.4143	243	1856	d__Bacteria;p__Proteobacteria;c__Gammaproteobacteria;o__Xanthomonadales;f__Rhodanobacteraceae;g__Dokdonella	95.0	N/A	N/A	N/A	N/A	1	-
GCA_014338185.1	s__Dokdonella_A sp014338185	76.3715	205	1856	d__Bacteria;p__Proteobacteria;c__Gammaproteobacteria;o__Xanthomonadales;f__Rhodanobacteraceae;g__Dokdonella_A	95.0	96.24	96.24	0.88	0.88	2	-
GCF_003148905.1	s__Fulvimonas soli	76.3124	278	1856	d__Bacteria;p__Proteobacteria;c__Gammaproteobacteria;o__Xanthomonadales;f__Rhodanobacteraceae;g__Fulvimonas	95.0	99.99	99.99	0.99	0.99	2	-
GCF_000801295.1	s__MONJU sp000801295	76.0301	86	1856	d__Bacteria;p__Proteobacteria;c__Gammaproteobacteria;o__Chromatiales;f__Sedimenticolaceae;g__MONJU	95.0	N/A	N/A	N/A	N/A	1	-
GCF_000377905.1	s__Thioalkalivibrio sp000377905	75.9842	82	1856	d__Bacteria;p__Proteobacteria;c__Gammaproteobacteria;o__Ectothiorhodospirales;f__Thioalkalivibrionaceae;g__Thioalkalivibrio	95.0	N/A	N/A	N/A	N/A	1	-
GCA_011051615.1	s__HyVt-448 sp011051615	75.7566	71	1856	d__Bacteria;p__Proteobacteria;c__Gammaproteobacteria;o__UBA6429;f__UBA6429;g__HyVt-448	95.0	N/A	N/A	N/A	N/A	1	-
--------------------------------------------------------------------------------
[2023-06-13 21:22:22,791] [INFO] GTDB search result was written to GCA_022841325.1_ASM2284132v1_genomic.fna/result_gtdb.tsv
[2023-06-13 21:22:22,792] [INFO] ===== GTDB Search completed =====
[2023-06-13 21:22:22,796] [INFO] DFAST_QC result json was written to GCA_022841325.1_ASM2284132v1_genomic.fna/dqc_result.json
[2023-06-13 21:22:22,796] [INFO] DFAST_QC completed!
[2023-06-13 21:22:22,796] [INFO] Total running time: 0h1m51s