[2023-06-28 23:25:44,839] [INFO] DFAST_QC pipeline started.
[2023-06-28 23:25:44,847] [INFO] DFAST_QC version: 0.5.7
[2023-06-28 23:25:44,847] [INFO] DQC Reference Directory: /var/lib/cwl/stg3caa2763-89da-4057-bbce-026c807110ef/dqc_reference
[2023-06-28 23:25:46,689] [INFO] ===== Start taxonomy check using ANI =====
[2023-06-28 23:25:46,690] [INFO] Task started: Prodigal
[2023-06-28 23:25:46,690] [INFO] Running command: gunzip -c /var/lib/cwl/stg8b8ce295-5ffb-4c0a-829b-8d990044efc8/GCA_017304895.1_ASM1730489v1_genomic.fna.gz | prodigal -d GCA_017304895.1_ASM1730489v1_genomic.fna/cds.fna -a GCA_017304895.1_ASM1730489v1_genomic.fna/protein.faa -g 11 -q > /dev/null
[2023-06-28 23:25:52,761] [INFO] Task succeeded: Prodigal
[2023-06-28 23:25:52,761] [INFO] Task started: HMMsearch
[2023-06-28 23:25:52,761] [INFO] Running command: hmmsearch --tblout GCA_017304895.1_ASM1730489v1_genomic.fna/hmmer_result.tsv -E 1E-50 /var/lib/cwl/stg3caa2763-89da-4057-bbce-026c807110ef/dqc_reference/reference_markers.hmm GCA_017304895.1_ASM1730489v1_genomic.fna/protein.faa > /dev/null
[2023-06-28 23:25:52,994] [INFO] Task succeeded: HMMsearch
[2023-06-28 23:25:52,995] [WARNING] Found 5/6 markers. [/var/lib/cwl/stg8b8ce295-5ffb-4c0a-829b-8d990044efc8/GCA_017304895.1_ASM1730489v1_genomic.fna.gz]
[2023-06-28 23:25:53,027] [INFO] Query marker FASTA was written to GCA_017304895.1_ASM1730489v1_genomic.fna/markers.fasta
[2023-06-28 23:25:53,028] [INFO] Task started: Blastn
[2023-06-28 23:25:53,028] [INFO] Running command: blastn -query GCA_017304895.1_ASM1730489v1_genomic.fna/markers.fasta -db /var/lib/cwl/stg3caa2763-89da-4057-bbce-026c807110ef/dqc_reference/reference_markers.fasta -out GCA_017304895.1_ASM1730489v1_genomic.fna/blast.markers.tsv -outfmt 6 -max_hsps 1 -num_alignments 5
[2023-06-28 23:25:54,212] [INFO] Task succeeded: Blastn
[2023-06-28 23:25:54,215] [INFO] Selected 30 target genomes.
[2023-06-28 23:25:54,215] [INFO] Target genome list was writen to GCA_017304895.1_ASM1730489v1_genomic.fna/target_genomes.txt
[2023-06-28 23:25:54,220] [INFO] Task started: fastANI
[2023-06-28 23:25:54,220] [INFO] Running command: fastANI --query /var/lib/cwl/stg8b8ce295-5ffb-4c0a-829b-8d990044efc8/GCA_017304895.1_ASM1730489v1_genomic.fna.gz --refList GCA_017304895.1_ASM1730489v1_genomic.fna/target_genomes.txt --output GCA_017304895.1_ASM1730489v1_genomic.fna/fastani_result.tsv --threads 1
[2023-06-28 23:26:14,239] [INFO] Task succeeded: fastANI
[2023-06-28 23:26:14,240] [INFO] Loading species specific ANI threshold from /var/lib/cwl/stg3caa2763-89da-4057-bbce-026c807110ef/dqc_reference/prokaryote_ANI_species_specific_threshold.txt
[2023-06-28 23:26:14,240] [WARNING] Species-specific ANI threshold file not found. Will use the default threshold for all species. [/var/lib/cwl/stg3caa2763-89da-4057-bbce-026c807110ef/dqc_reference/prokaryote_ANI_species_specific_threshold.txt]
[2023-06-28 23:26:14,256] [INFO] Found 30 fastANI hits (0 hits with ANI > threshold)
[2023-06-28 23:26:14,257] [INFO] The taxonomy check result is classified as 'below_threshold'.
[2023-06-28 23:26:14,257] [INFO] DFAST Taxonomy check final result
--------------------------------------------------------------------------------
organism_name	strain	accession	taxid	species_taxid	relation_to_type	validated	ani	matched_fragments	total_fragments	ani_threshold	status
Cellulomonas hominis	strain=DSM 9581	GCA_014201095.1	156981	156981	suspected-type	True	78.0451	261	641	95	below_threshold
Cellulomonas hominis	strain=NBRC 16055	GCA_007989225.1	156981	156981	suspected-type	True	78.0252	258	641	95	below_threshold
Cellulomonas rhizosphaerae	strain=NEAU-TCZ24	GCA_003470205.1	2293719	2293719	type	True	77.8531	211	641	95	below_threshold
Beutenbergia cavernae	strain=DSM 12333	GCA_000023105.1	84757	84757	type	True	77.8467	282	641	95	below_threshold
Cellulomonas palmilytica	strain=EW123	GCA_021590045.1	2608402	2608402	type	True	77.8313	249	641	95	below_threshold
Cellulosimicrobium marinum	strain=NBRC 110994	GCA_020551945.1	1638992	1638992	type	True	77.7661	250	641	95	below_threshold
Oerskovia rustica	strain=Sa4CUA1	GCA_014836555.1	2762237	2762237	type	True	77.7158	225	641	95	below_threshold
Oerskovia enterophila	strain=DSM 43852	GCA_001692445.1	43678	43678	type	True	77.7023	222	641	95	below_threshold
Cellulomonas carbonis	strain=CGMCC 1.10786	GCA_014636275.1	1386092	1386092	type	True	77.6944	243	641	95	below_threshold
Georgenia wutianyii	strain=Z294	GCA_006349365.1	2585135	2585135	type	True	77.693	242	641	95	below_threshold
Actinotalea subterranea	strain=HO-Ch2	GCA_008364845.1	2607497	2607497	type	True	77.6748	223	641	95	below_threshold
Cellulosimicrobium funkei	strain=NBRC 104118	GCA_001570825.1	264251	264251	suspected-type	True	77.6626	269	641	95	below_threshold
Cellulomonas carbonis	strain=T26	GCA_000767175.1	1386092	1386092	type	True	77.6554	238	641	95	below_threshold
Actinotalea solisilvae	strain=KACC 19191	GCA_016464425.1	2072922	2072922	type	True	77.604	264	641	95	below_threshold
Cellulomonas biazotea	strain=NBRC12680	GCA_004306155.1	1709	1709	type	True	77.5893	268	641	95	below_threshold
Georgenia faecalis	strain=ZLJ0423	GCA_003710105.1	2483799	2483799	type	True	77.5553	234	641	95	below_threshold
Cellulosimicrobium funkei	strain=JCM 14302	GCA_004519295.1	264251	264251	suspected-type	True	77.545	278	641	95	below_threshold
Promicromonospora citrea	strain=ATCC 15908	GCA_013004695.1	43677	43677	type	True	77.5117	240	641	95	below_threshold
Oerskovia merdavium	strain=Sa2CUA9	GCA_014836755.1	2762227	2762227	type	True	77.4717	231	641	95	below_threshold
Oerskovia douganii	strain=Sa1BUA8	GCA_015142735.1	2762210	2762210	type	True	77.4622	231	641	95	below_threshold
Promicromonospora citrea	strain=JCM 3051	GCA_014647735.1	43677	43677	type	True	77.4175	253	641	95	below_threshold
Cellulomonas humilata	strain=ATCC 25174	GCA_013359775.1	144055	144055	type	True	77.367	221	641	95	below_threshold
Demequina silvatica	strain=NBRC 109395	GCA_000971215.1	1638988	1638988	type	True	77.3177	159	641	95	below_threshold
Cellulomonas xylanilytica	strain=NBRC 101102	GCA_007989805.1	233583	233583	type	True	77.2875	234	641	95	below_threshold
Brevibacterium casei	strain=CIP 102111	GCA_900169275.1	33889	33889	type	True	76.7042	109	641	95	below_threshold
Brevibacterium casei	strain=FDAARGOS_936	GCA_016026595.1	33889	33889	type	True	76.6855	111	641	95	below_threshold
Nocardioides seonyuensis	strain=MMS17-SY207-3	GCA_004683965.1	2518371	2518371	type	True	76.5866	121	641	95	below_threshold
Nocardioides deserti	strain=SC8A-24	GCA_014266025.1	1588644	1588644	type	True	76.4529	169	641	95	below_threshold
Nocardioides deserti	strain=CGMCC 4.7183	GCA_014646035.1	1588644	1588644	type	True	76.4352	167	641	95	below_threshold
Sphaerisporangium rubeum	strain=DSM 44936	GCA_014207705.1	321317	321317	type	True	75.8471	149	641	95	below_threshold
--------------------------------------------------------------------------------
[2023-06-28 23:26:14,259] [INFO] DFAST Taxonomy check result was written to GCA_017304895.1_ASM1730489v1_genomic.fna/tc_result.tsv
[2023-06-28 23:26:14,259] [INFO] ===== Taxonomy check completed =====
[2023-06-28 23:26:14,259] [INFO] ===== Start completeness check using CheckM =====
[2023-06-28 23:26:14,259] [INFO] Setting CHECKM_DATA_PATH to /var/lib/cwl/stg3caa2763-89da-4057-bbce-026c807110ef/dqc_reference/checkm_data
[2023-06-28 23:26:14,260] [INFO] Selected 'Prokaryote' markers (life, taxid=0) for CheckM
[2023-06-28 23:26:14,297] [INFO] Task started: CheckM
[2023-06-28 23:26:14,297] [INFO] Running command: checkm taxonomy_wf --tab_table -f GCA_017304895.1_ASM1730489v1_genomic.fna/cc_result.tsv -t 1 life "Prokaryote" GCA_017304895.1_ASM1730489v1_genomic.fna/checkm_input GCA_017304895.1_ASM1730489v1_genomic.fna/checkm_result
[2023-06-28 23:26:35,631] [INFO] Task succeeded: CheckM
[2023-06-28 23:26:35,632] [INFO] Completeness check finished.
--------------------------------------------------------------------------------
Completeness: 78.49%
Contamintation: 0.00%
Strain heterogeneity: 0.00%
--------------------------------------------------------------------------------
[2023-06-28 23:26:35,649] [INFO] ===== Completeness check finished =====
[2023-06-28 23:26:35,650] [INFO] ===== Start GTDB Search =====
[2023-06-28 23:26:35,650] [INFO] Query marker FASTA already exists. Will reuse it. (GCA_017304895.1_ASM1730489v1_genomic.fna/markers.fasta)
[2023-06-28 23:26:35,650] [INFO] Task started: Blastn
[2023-06-28 23:26:35,650] [INFO] Running command: blastn -query GCA_017304895.1_ASM1730489v1_genomic.fna/markers.fasta -db /var/lib/cwl/stg3caa2763-89da-4057-bbce-026c807110ef/dqc_reference/reference_markers_gtdb.fasta -out GCA_017304895.1_ASM1730489v1_genomic.fna/blast.markers.gtdb.tsv -outfmt 6 -max_hsps 1 -num_alignments 5
[2023-06-28 23:26:37,641] [INFO] Task succeeded: Blastn
[2023-06-28 23:26:37,644] [INFO] Selected 17 target genomes.
[2023-06-28 23:26:37,645] [INFO] Target genome list was writen to GCA_017304895.1_ASM1730489v1_genomic.fna/target_genomes_gtdb.txt
[2023-06-28 23:26:37,708] [INFO] Task started: fastANI
[2023-06-28 23:26:37,708] [INFO] Running command: fastANI --query /var/lib/cwl/stg8b8ce295-5ffb-4c0a-829b-8d990044efc8/GCA_017304895.1_ASM1730489v1_genomic.fna.gz --refList GCA_017304895.1_ASM1730489v1_genomic.fna/target_genomes_gtdb.txt --output GCA_017304895.1_ASM1730489v1_genomic.fna/fastani_result_gtdb.tsv --threads 1
[2023-06-28 23:26:49,757] [INFO] Task succeeded: fastANI
[2023-06-28 23:26:49,768] [INFO] Found 17 fastANI hits (1 hits with ANI > circumscription radius)
[2023-06-28 23:26:49,768] [INFO] GTDB search result
--------------------------------------------------------------------------------
accession	gtdb_species	ani	matched_fragments	total_fragments	gtdb_taxonomy	ani_circumscription_radius	mean_intra_species_ani	min_intra_species_ani	mean_intra_species_af	min_intra_species_af	num_clustered_genomes	status
GCF_003751805.1	s__Salana multivorans	98.7184	585	641	d__Bacteria;p__Actinobacteriota;c__Actinomycetia;o__Actinomycetales;f__Beutenbergiaceae;g__Salana	95.0	99.05	99.05	0.93	0.93	2	conclusive
GCF_006351005.1	s__Miniimonas arenae	80.642	330	641	d__Bacteria;p__Actinobacteriota;c__Actinomycetia;o__Actinomycetales;f__Beutenbergiaceae;g__Miniimonas	95.0	98.18	97.40	0.91	0.83	3	-
GCF_003121705.1	s__Serinibacter_A sp003121705	79.6683	336	641	d__Bacteria;p__Actinobacteriota;c__Actinomycetia;o__Actinomycetales;f__Beutenbergiaceae;g__Serinibacter_A	95.0	N/A	N/A	N/A	N/A	1	-
GCF_004786095.1	s__Serinibacter_A arcticus	79.6072	307	641	d__Bacteria;p__Actinobacteriota;c__Actinomycetia;o__Actinomycetales;f__Beutenbergiaceae;g__Serinibacter_A	95.0	N/A	N/A	N/A	N/A	1	-
GCF_002563925.1	s__Serinibacter salmoneus	78.3506	201	641	d__Bacteria;p__Actinobacteriota;c__Actinomycetia;o__Actinomycetales;f__Beutenbergiaceae;g__Serinibacter	95.0	N/A	N/A	N/A	N/A	1	-
GCF_001722485.1	s__Cellulosimicrobium cellulans_A	77.9485	244	641	d__Bacteria;p__Actinobacteriota;c__Actinomycetia;o__Actinomycetales;f__Cellulomonadaceae;g__Cellulosimicrobium	95.0	N/A	N/A	N/A	N/A	1	-
GCF_000215105.1	s__Isoptericola variabilis_A	77.8586	256	641	d__Bacteria;p__Actinobacteriota;c__Actinomycetia;o__Actinomycetales;f__Cellulomonadaceae;g__Isoptericola	95.0	99.96	99.96	0.98	0.98	2	-
GCF_016907365.1	s__Oerskovia paurometabola	77.6434	230	641	d__Bacteria;p__Actinobacteriota;c__Actinomycetia;o__Actinomycetales;f__Cellulomonadaceae;g__Oerskovia	95.0	99.20	99.20	0.97	0.97	2	-
GCF_001942335.1	s__Cellulosimicrobium sp001942335	77.6393	256	641	d__Bacteria;p__Actinobacteriota;c__Actinomycetia;o__Actinomycetales;f__Cellulomonadaceae;g__Cellulosimicrobium	95.0	N/A	N/A	N/A	N/A	1	-
GCF_013004695.1	s__Promicromonospora citrea	77.5241	239	641	d__Bacteria;p__Actinobacteriota;c__Actinomycetia;o__Actinomycetales;f__Cellulomonadaceae;g__Promicromonospora	95.0	99.97	99.97	0.99	0.99	2	-
GCF_010287905.1	s__Cellulosimicrobium fucosivorans	77.4956	285	641	d__Bacteria;p__Actinobacteriota;c__Actinomycetia;o__Actinomycetales;f__Cellulomonadaceae;g__Cellulosimicrobium	95.0	98.84	98.55	0.94	0.90	6	-
GCF_018388465.1	s__Cellulomonas sp018388465	77.4415	223	641	d__Bacteria;p__Actinobacteriota;c__Actinomycetia;o__Actinomycetales;f__Cellulomonadaceae;g__Cellulomonas	95.0	99.65	99.29	0.98	0.97	3	-
GCF_900169275.1	s__Brevibacterium casei	76.6653	111	641	d__Bacteria;p__Actinobacteriota;c__Actinomycetia;o__Actinomycetales;f__Brevibacteriaceae;g__Brevibacterium	95.0	98.23	97.33	0.93	0.90	17	-
GCF_000960475.2	s__Nocardioides luteus_A	76.143	152	641	d__Bacteria;p__Actinobacteriota;c__Actinomycetia;o__Propionibacteriales;f__Nocardioidaceae;g__Nocardioides	95.0	N/A	N/A	N/A	N/A	1	-
GCF_006715725.1	s__Nocardioides sp006715725	76.1133	191	641	d__Bacteria;p__Actinobacteriota;c__Actinomycetia;o__Propionibacteriales;f__Nocardioidaceae;g__Nocardioides	95.0	N/A	N/A	N/A	N/A	1	-
GCF_016803235.1	s__Nocardioides simplex_A	76.0509	199	641	d__Bacteria;p__Actinobacteriota;c__Actinomycetia;o__Propionibacteriales;f__Nocardioidaceae;g__Nocardioides	95.0	N/A	N/A	N/A	N/A	1	-
GCF_004117935.1	s__Streptomyces roseicoloratus	76.0389	187	641	d__Bacteria;p__Actinobacteriota;c__Actinomycetia;o__Streptomycetales;f__Streptomycetaceae;g__Streptomyces	95.0	N/A	N/A	N/A	N/A	1	-
--------------------------------------------------------------------------------
[2023-06-28 23:26:49,770] [INFO] GTDB search result was written to GCA_017304895.1_ASM1730489v1_genomic.fna/result_gtdb.tsv
[2023-06-28 23:26:49,770] [INFO] ===== GTDB Search completed =====
[2023-06-28 23:26:49,774] [INFO] DFAST_QC result json was written to GCA_017304895.1_ASM1730489v1_genomic.fna/dqc_result.json
[2023-06-28 23:26:49,774] [INFO] DFAST_QC completed!
[2023-06-28 23:26:49,774] [INFO] Total running time: 0h1m5s