[2023-06-26 23:45:11,926] [INFO] DFAST_QC pipeline started.
[2023-06-26 23:45:11,928] [INFO] DFAST_QC version: 0.5.7
[2023-06-26 23:45:11,928] [INFO] DQC Reference Directory: /var/lib/cwl/stg805b47cc-f2b2-428e-84fb-d01fc2f72b84/dqc_reference
[2023-06-26 23:45:13,222] [INFO] ===== Start taxonomy check using ANI =====
[2023-06-26 23:45:13,223] [INFO] Task started: Prodigal
[2023-06-26 23:45:13,223] [INFO] Running command: gunzip -c /var/lib/cwl/stgc6c99fd1-3167-45e8-b8f0-e16f9b0d15a0/GCA_002430065.1_ASM243006v1_genomic.fna.gz | prodigal -d GCA_002430065.1_ASM243006v1_genomic.fna/cds.fna -a GCA_002430065.1_ASM243006v1_genomic.fna/protein.faa -g 11 -q > /dev/null
[2023-06-26 23:45:19,460] [INFO] Task succeeded: Prodigal
[2023-06-26 23:45:19,461] [INFO] Task started: HMMsearch
[2023-06-26 23:45:19,461] [INFO] Running command: hmmsearch --tblout GCA_002430065.1_ASM243006v1_genomic.fna/hmmer_result.tsv -E 1E-50 /var/lib/cwl/stg805b47cc-f2b2-428e-84fb-d01fc2f72b84/dqc_reference/reference_markers.hmm GCA_002430065.1_ASM243006v1_genomic.fna/protein.faa > /dev/null
[2023-06-26 23:45:19,667] [INFO] Task succeeded: HMMsearch
[2023-06-26 23:45:19,669] [INFO] Found 6/6 markers.
[2023-06-26 23:45:19,755] [INFO] Query marker FASTA was written to GCA_002430065.1_ASM243006v1_genomic.fna/markers.fasta
[2023-06-26 23:45:19,756] [INFO] Task started: Blastn
[2023-06-26 23:45:19,756] [INFO] Running command: blastn -query GCA_002430065.1_ASM243006v1_genomic.fna/markers.fasta -db /var/lib/cwl/stg805b47cc-f2b2-428e-84fb-d01fc2f72b84/dqc_reference/reference_markers.fasta -out GCA_002430065.1_ASM243006v1_genomic.fna/blast.markers.tsv -outfmt 6 -max_hsps 1 -num_alignments 5
[2023-06-26 23:45:20,371] [INFO] Task succeeded: Blastn
[2023-06-26 23:45:20,376] [INFO] Selected 25 target genomes.
[2023-06-26 23:45:20,377] [INFO] Target genome list was writen to GCA_002430065.1_ASM243006v1_genomic.fna/target_genomes.txt
[2023-06-26 23:45:20,391] [INFO] Task started: fastANI
[2023-06-26 23:45:20,392] [INFO] Running command: fastANI --query /var/lib/cwl/stgc6c99fd1-3167-45e8-b8f0-e16f9b0d15a0/GCA_002430065.1_ASM243006v1_genomic.fna.gz --refList GCA_002430065.1_ASM243006v1_genomic.fna/target_genomes.txt --output GCA_002430065.1_ASM243006v1_genomic.fna/fastani_result.tsv --threads 1
[2023-06-26 23:45:36,857] [INFO] Task succeeded: fastANI
[2023-06-26 23:45:36,858] [INFO] Loading species specific ANI threshold from /var/lib/cwl/stg805b47cc-f2b2-428e-84fb-d01fc2f72b84/dqc_reference/prokaryote_ANI_species_specific_threshold.txt
[2023-06-26 23:45:36,858] [WARNING] Species-specific ANI threshold file not found. Will use the default threshold for all species. [/var/lib/cwl/stg805b47cc-f2b2-428e-84fb-d01fc2f72b84/dqc_reference/prokaryote_ANI_species_specific_threshold.txt]
[2023-06-26 23:45:36,878] [INFO] Found 16 fastANI hits (0 hits with ANI > threshold)
[2023-06-26 23:45:36,878] [INFO] The taxonomy check result is classified as 'below_threshold'.
[2023-06-26 23:45:36,878] [INFO] DFAST Taxonomy check final result
--------------------------------------------------------------------------------
organism_name	strain	accession	taxid	species_taxid	relation_to_type	validated	ani	matched_fragments	total_fragments	ani_threshold	status
Hydrogenophaga flava	strain=NBRC 102514	GCA_001571145.1	65657	65657	type	True	76.5692	56	621	95	below_threshold
Rhodoferax saidenbachensis	strain=DSM 22694	GCA_001955715.1	1484693	1484693	type	True	76.5615	68	621	95	below_threshold
Comamonas aquatica	strain=NBRC 14918	GCA_000739875.1	225991	225991	type	True	76.5168	52	621	95	below_threshold
Limnohabitans curvus	strain=MWH-C5	GCA_003063475.1	323423	323423	type	True	76.4972	86	621	95	below_threshold
Rhodoferax saidenbachensis	strain=ED16	GCA_000498435.1	1484693	1484693	type	True	76.4962	69	621	95	below_threshold
Limnohabitans planktonicus	strain=II-D5	GCA_001270065.2	540060	540060	type	True	76.4367	92	621	95	below_threshold
Comamonas fluminis	strain=CJ34	GCA_019186805.1	2796366	2796366	type	True	76.3121	50	621	95	below_threshold
Delftia acidovorans	strain=NBRC 14950	GCA_001598795.1	80866	80866	type	True	76.2673	63	621	95	below_threshold
Delftia acidovorans	strain=FDAARGOS_997	GCA_016127415.1	80866	80866	type	True	76.2258	65	621	95	below_threshold
Comamonas testosteroni	strain=NCTC10698	GCA_900461225.1	285	285	suspected-type	True	75.93	51	621	95	below_threshold
Comamonas testosteroni	strain=ATCC 11996	GCA_000241525.2	285	285	suspected-type	True	75.8718	52	621	95	below_threshold
Rhodoferax fermentans	strain=JCM 7819	GCA_002017865.1	28066	28066	type	True	75.8639	62	621	95	below_threshold
Rhodoferax ferrireducens	strain=DSM 15236	GCA_000013605.1	192843	192843	type	True	75.8603	60	621	95	below_threshold
Comamonas kerstersii	strain=CCUG 15333	GCA_008801935.1	225992	225992	type	True	75.8388	51	621	95	below_threshold
Comamonas suwonensis	strain=EJ-4	GCA_012844455.2	2606214	2606214	type	True	75.7977	56	621	95	below_threshold
Variovorax paradoxus	strain=NBRC 15149	GCA_001591365.1	34073	34073	suspected-type	True	75.7129	74	621	95	below_threshold
--------------------------------------------------------------------------------
[2023-06-26 23:45:36,882] [INFO] DFAST Taxonomy check result was written to GCA_002430065.1_ASM243006v1_genomic.fna/tc_result.tsv
[2023-06-26 23:45:36,882] [INFO] ===== Taxonomy check completed =====
[2023-06-26 23:45:36,882] [INFO] ===== Start completeness check using CheckM =====
[2023-06-26 23:45:36,882] [INFO] Setting CHECKM_DATA_PATH to /var/lib/cwl/stg805b47cc-f2b2-428e-84fb-d01fc2f72b84/dqc_reference/checkm_data
[2023-06-26 23:45:36,883] [INFO] Selected 'Prokaryote' markers (life, taxid=0) for CheckM
[2023-06-26 23:45:36,914] [INFO] Task started: CheckM
[2023-06-26 23:45:36,914] [INFO] Running command: checkm taxonomy_wf --tab_table -f GCA_002430065.1_ASM243006v1_genomic.fna/cc_result.tsv -t 1 life "Prokaryote" GCA_002430065.1_ASM243006v1_genomic.fna/checkm_input GCA_002430065.1_ASM243006v1_genomic.fna/checkm_result
[2023-06-26 23:46:00,859] [INFO] Task succeeded: CheckM
[2023-06-26 23:46:00,861] [INFO] Completeness check finished.
--------------------------------------------------------------------------------
Completeness: 71.53%
Contamintation: 0.00%
Strain heterogeneity: 0.00%
--------------------------------------------------------------------------------
[2023-06-26 23:46:00,885] [INFO] ===== Completeness check finished =====
[2023-06-26 23:46:00,885] [INFO] ===== Start GTDB Search =====
[2023-06-26 23:46:00,886] [INFO] Query marker FASTA already exists. Will reuse it. (GCA_002430065.1_ASM243006v1_genomic.fna/markers.fasta)
[2023-06-26 23:46:00,886] [INFO] Task started: Blastn
[2023-06-26 23:46:00,886] [INFO] Running command: blastn -query GCA_002430065.1_ASM243006v1_genomic.fna/markers.fasta -db /var/lib/cwl/stg805b47cc-f2b2-428e-84fb-d01fc2f72b84/dqc_reference/reference_markers_gtdb.fasta -out GCA_002430065.1_ASM243006v1_genomic.fna/blast.markers.gtdb.tsv -outfmt 6 -max_hsps 1 -num_alignments 5
[2023-06-26 23:46:01,775] [INFO] Task succeeded: Blastn
[2023-06-26 23:46:01,780] [INFO] Selected 25 target genomes.
[2023-06-26 23:46:01,780] [INFO] Target genome list was writen to GCA_002430065.1_ASM243006v1_genomic.fna/target_genomes_gtdb.txt
[2023-06-26 23:46:01,825] [INFO] Task started: fastANI
[2023-06-26 23:46:01,825] [INFO] Running command: fastANI --query /var/lib/cwl/stgc6c99fd1-3167-45e8-b8f0-e16f9b0d15a0/GCA_002430065.1_ASM243006v1_genomic.fna.gz --refList GCA_002430065.1_ASM243006v1_genomic.fna/target_genomes_gtdb.txt --output GCA_002430065.1_ASM243006v1_genomic.fna/fastani_result_gtdb.tsv --threads 1
[2023-06-26 23:46:17,602] [INFO] Task succeeded: fastANI
[2023-06-26 23:46:17,624] [INFO] Found 22 fastANI hits (1 hits with ANI > circumscription radius)
[2023-06-26 23:46:17,624] [INFO] GTDB search result
--------------------------------------------------------------------------------
accession	gtdb_species	ani	matched_fragments	total_fragments	gtdb_taxonomy	ani_circumscription_radius	mean_intra_species_ani	min_intra_species_ani	mean_intra_species_af	min_intra_species_af	num_clustered_genomes	status
GCA_018607835.1	s__RS62 sp002340675	98.9646	544	621	d__Bacteria;p__Proteobacteria;c__Gammaproteobacteria;o__Burkholderiales;f__Burkholderiaceae;g__RS62	95.0	99.19	98.96	0.91	0.88	5	conclusive
GCA_000496475.1	s__RS62 sp000496475	78.4648	103	621	d__Bacteria;p__Proteobacteria;c__Gammaproteobacteria;o__Burkholderiales;f__Burkholderiaceae;g__RS62	95.0	98.53	98.53	0.95	0.95	2	-
GCF_010104355.1	s__Hydrogenophaga sp010104355	76.9339	82	621	d__Bacteria;p__Proteobacteria;c__Gammaproteobacteria;o__Burkholderiales;f__Burkholderiaceae;g__Hydrogenophaga	95.0	100.00	100.00	1.00	1.00	2	-
GCF_003063355.1	s__Limnohabitans_A sp003063355	76.8903	72	621	d__Bacteria;p__Proteobacteria;c__Gammaproteobacteria;o__Burkholderiales;f__Burkholderiaceae;g__Limnohabitans_A	95.0	N/A	N/A	N/A	N/A	1	-
GCF_001269345.1	s__Limnohabitans_A sp001269345	76.7518	79	621	d__Bacteria;p__Proteobacteria;c__Gammaproteobacteria;o__Burkholderiales;f__Burkholderiaceae;g__Limnohabitans_A	95.0	N/A	N/A	N/A	N/A	1	-
GCA_002256145.1	s__Limnohabitans_A sp002256145	76.7307	88	621	d__Bacteria;p__Proteobacteria;c__Gammaproteobacteria;o__Burkholderiales;f__Burkholderiaceae;g__Limnohabitans_A	95.0	N/A	N/A	N/A	N/A	1	-
GCA_903832625.1	s__Limnohabitans sp903832625	76.7227	63	621	d__Bacteria;p__Proteobacteria;c__Gammaproteobacteria;o__Burkholderiales;f__Burkholderiaceae;g__Limnohabitans	95.0	97.24	97.19	0.90	0.89	4	-
GCF_017798165.1	s__Rhodoferax sp017798165	76.6986	53	621	d__Bacteria;p__Proteobacteria;c__Gammaproteobacteria;o__Burkholderiales;f__Burkholderiaceae;g__Rhodoferax	95.0	N/A	N/A	N/A	N/A	1	-
GCF_001571145.1	s__Hydrogenophaga flava	76.5692	56	621	d__Bacteria;p__Proteobacteria;c__Gammaproteobacteria;o__Burkholderiales;f__Burkholderiaceae;g__Hydrogenophaga	95.0	N/A	N/A	N/A	N/A	1	-
GCF_003063475.1	s__Limnohabitans curvus	76.4972	86	621	d__Bacteria;p__Proteobacteria;c__Gammaproteobacteria;o__Burkholderiales;f__Burkholderiaceae;g__Limnohabitans	95.0	98.49	98.49	0.87	0.87	2	-
GCA_003241965.1	s__Hydrogenophaga sp003241965	76.455	75	621	d__Bacteria;p__Proteobacteria;c__Gammaproteobacteria;o__Burkholderiales;f__Burkholderiaceae;g__Hydrogenophaga	95.0	N/A	N/A	N/A	N/A	1	-
GCF_000175235.1	s__Acidovorax delafieldii_B	76.4256	61	621	d__Bacteria;p__Proteobacteria;c__Gammaproteobacteria;o__Burkholderiales;f__Burkholderiaceae;g__Acidovorax	95.0	N/A	N/A	N/A	N/A	1	-
GCF_002778325.1	s__Limnohabitans sp002778325	76.3405	91	621	d__Bacteria;p__Proteobacteria;c__Gammaproteobacteria;o__Burkholderiales;f__Burkholderiaceae;g__Limnohabitans	95.0	N/A	N/A	N/A	N/A	1	-
GCF_000745855.1	s__Xenophilus azovorans	76.3102	70	621	d__Bacteria;p__Proteobacteria;c__Gammaproteobacteria;o__Burkholderiales;f__Burkholderiaceae;g__Xenophilus	95.0	N/A	N/A	N/A	N/A	1	-
GCF_003096555.1	s__Giesbergeria sp003096555	76.0779	60	621	d__Bacteria;p__Proteobacteria;c__Gammaproteobacteria;o__Burkholderiales;f__Burkholderiaceae;g__Giesbergeria	95.0	99.19	98.39	0.95	0.90	3	-
GCF_016806145.1	s__Variovorax sp900115375	76.0484	64	621	d__Bacteria;p__Proteobacteria;c__Gammaproteobacteria;o__Burkholderiales;f__Burkholderiaceae;g__Variovorax	95.0	96.08	95.68	0.83	0.79	13	-
GCA_001795455.1	s__Rhodoferax sp001795455	76.0377	65	621	d__Bacteria;p__Proteobacteria;c__Gammaproteobacteria;o__Burkholderiales;f__Burkholderiaceae;g__Rhodoferax	95.0	99.98	99.98	0.98	0.98	3	-
GCF_003053545.1	s__Acidovorax sp003053545	76.0101	65	621	d__Bacteria;p__Proteobacteria;c__Gammaproteobacteria;o__Burkholderiales;f__Burkholderiaceae;g__Acidovorax	95.0	98.66	97.32	0.95	0.90	3	-
GCF_012932145.1	s__Giesbergeria sp012932145	75.9593	61	621	d__Bacteria;p__Proteobacteria;c__Gammaproteobacteria;o__Burkholderiales;f__Burkholderiaceae;g__Giesbergeria	95.0	98.72	98.72	0.89	0.89	2	-
GCA_016721925.1	s__JAABQG01 sp016721925	75.9204	60	621	d__Bacteria;p__Proteobacteria;c__Gammaproteobacteria;o__Burkholderiales;f__Burkholderiaceae;g__JAABQG01	95.0	98.62	98.32	0.93	0.90	9	-
GCA_903902035.1	s__Curvibacter sp903902035	75.8496	52	621	d__Bacteria;p__Proteobacteria;c__Gammaproteobacteria;o__Burkholderiales;f__Burkholderiaceae;g__Curvibacter	95.0	98.85	97.80	0.89	0.83	3	-
GCF_014170375.1	s__Variovorax sp014170375	75.8418	72	621	d__Bacteria;p__Proteobacteria;c__Gammaproteobacteria;o__Burkholderiales;f__Burkholderiaceae;g__Variovorax	95.0	N/A	N/A	N/A	N/A	1	-
--------------------------------------------------------------------------------
[2023-06-26 23:46:17,626] [INFO] GTDB search result was written to GCA_002430065.1_ASM243006v1_genomic.fna/result_gtdb.tsv
[2023-06-26 23:46:17,626] [INFO] ===== GTDB Search completed =====
[2023-06-26 23:46:17,631] [INFO] DFAST_QC result json was written to GCA_002430065.1_ASM243006v1_genomic.fna/dqc_result.json
[2023-06-26 23:46:17,631] [INFO] DFAST_QC completed!
[2023-06-26 23:46:17,631] [INFO] Total running time: 0h1m6s