[2023-06-28 07:21:17,735] [INFO] DFAST_QC pipeline started.
[2023-06-28 07:21:17,737] [INFO] DFAST_QC version: 0.5.7
[2023-06-28 07:21:17,738] [INFO] DQC Reference Directory: /var/lib/cwl/stg9c6bbe2a-205b-426c-8956-c54dc875b24f/dqc_reference
[2023-06-28 07:21:18,916] [INFO] ===== Start taxonomy check using ANI =====
[2023-06-28 07:21:18,917] [INFO] Task started: Prodigal
[2023-06-28 07:21:18,917] [INFO] Running command: gunzip -c /var/lib/cwl/stge249f466-ff6c-4a88-91b4-a61b76cb9aee/GCA_017993135.1_ASM1799313v1_genomic.fna.gz | prodigal -d GCA_017993135.1_ASM1799313v1_genomic.fna/cds.fna -a GCA_017993135.1_ASM1799313v1_genomic.fna/protein.faa -g 11 -q > /dev/null
[2023-06-28 07:21:24,779] [INFO] Task succeeded: Prodigal
[2023-06-28 07:21:24,780] [INFO] Task started: HMMsearch
[2023-06-28 07:21:24,780] [INFO] Running command: hmmsearch --tblout GCA_017993135.1_ASM1799313v1_genomic.fna/hmmer_result.tsv -E 1E-50 /var/lib/cwl/stg9c6bbe2a-205b-426c-8956-c54dc875b24f/dqc_reference/reference_markers.hmm GCA_017993135.1_ASM1799313v1_genomic.fna/protein.faa > /dev/null
[2023-06-28 07:21:25,023] [INFO] Task succeeded: HMMsearch
[2023-06-28 07:21:25,024] [INFO] Found 6/6 markers.
[2023-06-28 07:21:25,043] [INFO] Query marker FASTA was written to GCA_017993135.1_ASM1799313v1_genomic.fna/markers.fasta
[2023-06-28 07:21:25,044] [INFO] Task started: Blastn
[2023-06-28 07:21:25,044] [INFO] Running command: blastn -query GCA_017993135.1_ASM1799313v1_genomic.fna/markers.fasta -db /var/lib/cwl/stg9c6bbe2a-205b-426c-8956-c54dc875b24f/dqc_reference/reference_markers.fasta -out GCA_017993135.1_ASM1799313v1_genomic.fna/blast.markers.tsv -outfmt 6 -max_hsps 1 -num_alignments 5
[2023-06-28 07:21:25,761] [INFO] Task succeeded: Blastn
[2023-06-28 07:21:25,788] [INFO] Selected 28 target genomes.
[2023-06-28 07:21:25,788] [INFO] Target genome list was writen to GCA_017993135.1_ASM1799313v1_genomic.fna/target_genomes.txt
[2023-06-28 07:21:25,792] [INFO] Task started: fastANI
[2023-06-28 07:21:25,793] [INFO] Running command: fastANI --query /var/lib/cwl/stge249f466-ff6c-4a88-91b4-a61b76cb9aee/GCA_017993135.1_ASM1799313v1_genomic.fna.gz --refList GCA_017993135.1_ASM1799313v1_genomic.fna/target_genomes.txt --output GCA_017993135.1_ASM1799313v1_genomic.fna/fastani_result.tsv --threads 1
[2023-06-28 07:21:42,651] [INFO] Task succeeded: fastANI
[2023-06-28 07:21:42,652] [INFO] Loading species specific ANI threshold from /var/lib/cwl/stg9c6bbe2a-205b-426c-8956-c54dc875b24f/dqc_reference/prokaryote_ANI_species_specific_threshold.txt
[2023-06-28 07:21:42,652] [WARNING] Species-specific ANI threshold file not found. Will use the default threshold for all species. [/var/lib/cwl/stg9c6bbe2a-205b-426c-8956-c54dc875b24f/dqc_reference/prokaryote_ANI_species_specific_threshold.txt]
[2023-06-28 07:21:42,680] [INFO] Found 24 fastANI hits (0 hits with ANI > threshold)
[2023-06-28 07:21:42,680] [INFO] The taxonomy check result is classified as 'below_threshold'.
[2023-06-28 07:21:42,681] [INFO] DFAST Taxonomy check final result
--------------------------------------------------------------------------------
organism_name	strain	accession	taxid	species_taxid	relation_to_type	validated	ani	matched_fragments	total_fragments	ani_threshold	status
Serpentinimonas barnesii	strain=H1	GCA_000696225.1	1458427	1458427	type	True	76.9343	93	541	95	below_threshold
Serpentinimonas maccroryi	strain=B1	GCA_000828915.1	1458426	1458426	type	True	76.8212	87	541	95	below_threshold
Brachymonas chironomi	strain=DSM 19884	GCA_000374625.1	491919	491919	type	True	76.8157	99	541	95	below_threshold
Simplicispira metamorpha	strain=NBRC 13960	GCA_003568725.1	80881	80881	type	True	76.7566	128	541	95	below_threshold
Simplicispira metamorpha	strain=DSM 1837	GCA_004341365.1	80881	80881	type	True	76.7372	128	541	95	below_threshold
Brachymonas denitrificans	strain=DSM 15123	GCA_900110225.1	28220	28220	type	True	76.6641	121	541	95	below_threshold
Simplicispira hankyongi	strain=NY-02	GCA_003570885.1	2315688	2315688	type	True	76.6391	82	541	95	below_threshold
Rhodoferax antarcticus	strain=DSM 24876	GCA_001955735.1	81479	81479	type	True	76.6117	80	541	95	below_threshold
Acidovorax facilis	strain=DSM 649	GCA_023913775.1	12917	12917	type	True	76.5359	90	541	95	below_threshold
Ottowia testudinis	strain=27C	GCA_017498525.1	2816950	2816950	type	True	76.5229	88	541	95	below_threshold
Rhodoferax antarcticus	strain=ANT.BR	GCA_001938565.1	81479	81479	type	True	76.5169	82	541	95	below_threshold
Acidovorax kalamii	strain=KNDSW-TSA6	GCA_002245625.1	2004485	2004485	type	True	76.4188	103	541	95	below_threshold
Acidovorax temperans	strain=DSM 7270	GCA_006716905.1	80878	80878	type	True	76.4114	108	541	95	below_threshold
Rhodoferax lacus	strain=IMCC26218	GCA_003415675.1	2184758	2184758	type	True	76.3864	92	541	95	below_threshold
Limnohabitans curvus	strain=MWH-C5	GCA_003063475.1	323423	323423	type	True	76.3519	89	541	95	below_threshold
Comamonas terrigena	strain=NCTC1937	GCA_900461435.1	32013	32013	type	True	76.3325	65	541	95	below_threshold
Limnohabitans radicicola	strain=JUR4	GCA_014837235.1	2771427	2771427	type	True	76.2912	94	541	95	below_threshold
Simplicispira suum	strain=SC1-8	GCA_003008595.1	2109915	2109915	type	True	76.25	79	541	95	below_threshold
Comamonas koreensis	strain=KCTC 12005	GCA_021026195.1	160825	160825	type	True	76.1831	89	541	95	below_threshold
Verminephrobacter eiseniae	strain=EF01-2	GCA_000015565.1	364317	364317	type	True	75.9957	84	541	95	below_threshold
Comamonas suwonensis	strain=EJ-4	GCA_012844455.2	2606214	2606214	type	True	75.9865	77	541	95	below_threshold
Rhodoferax aquaticus	strain=Gr-4	GCA_006974105.1	2527691	2527691	type	True	75.9428	75	541	95	below_threshold
Comamonas avium	strain=Sa2CVA6	GCA_014836675.1	2762231	2762231	type	True	75.8826	74	541	95	below_threshold
Hydrogenophaga borbori	strain=LA-38	GCA_003417535.1	2294117	2294117	type	True	75.8798	84	541	95	below_threshold
--------------------------------------------------------------------------------
[2023-06-28 07:21:42,683] [INFO] DFAST Taxonomy check result was written to GCA_017993135.1_ASM1799313v1_genomic.fna/tc_result.tsv
[2023-06-28 07:21:42,684] [INFO] ===== Taxonomy check completed =====
[2023-06-28 07:21:42,684] [INFO] ===== Start completeness check using CheckM =====
[2023-06-28 07:21:42,684] [INFO] Setting CHECKM_DATA_PATH to /var/lib/cwl/stg9c6bbe2a-205b-426c-8956-c54dc875b24f/dqc_reference/checkm_data
[2023-06-28 07:21:42,686] [INFO] Selected 'Prokaryote' markers (life, taxid=0) for CheckM
[2023-06-28 07:21:42,709] [INFO] Task started: CheckM
[2023-06-28 07:21:42,710] [INFO] Running command: checkm taxonomy_wf --tab_table -f GCA_017993135.1_ASM1799313v1_genomic.fna/cc_result.tsv -t 1 life "Prokaryote" GCA_017993135.1_ASM1799313v1_genomic.fna/checkm_input GCA_017993135.1_ASM1799313v1_genomic.fna/checkm_result
[2023-06-28 07:22:05,235] [INFO] Task succeeded: CheckM
[2023-06-28 07:22:05,237] [INFO] Completeness check finished.
--------------------------------------------------------------------------------
Completeness: 81.02%
Contamintation: 0.00%
Strain heterogeneity: 0.00%
--------------------------------------------------------------------------------
[2023-06-28 07:22:05,257] [INFO] ===== Completeness check finished =====
[2023-06-28 07:22:05,257] [INFO] ===== Start GTDB Search =====
[2023-06-28 07:22:05,258] [INFO] Query marker FASTA already exists. Will reuse it. (GCA_017993135.1_ASM1799313v1_genomic.fna/markers.fasta)
[2023-06-28 07:22:05,258] [INFO] Task started: Blastn
[2023-06-28 07:22:05,258] [INFO] Running command: blastn -query GCA_017993135.1_ASM1799313v1_genomic.fna/markers.fasta -db /var/lib/cwl/stg9c6bbe2a-205b-426c-8956-c54dc875b24f/dqc_reference/reference_markers_gtdb.fasta -out GCA_017993135.1_ASM1799313v1_genomic.fna/blast.markers.gtdb.tsv -outfmt 6 -max_hsps 1 -num_alignments 5
[2023-06-28 07:22:06,503] [INFO] Task succeeded: Blastn
[2023-06-28 07:22:06,508] [INFO] Selected 25 target genomes.
[2023-06-28 07:22:06,509] [INFO] Target genome list was writen to GCA_017993135.1_ASM1799313v1_genomic.fna/target_genomes_gtdb.txt
[2023-06-28 07:22:06,520] [INFO] Task started: fastANI
[2023-06-28 07:22:06,520] [INFO] Running command: fastANI --query /var/lib/cwl/stge249f466-ff6c-4a88-91b4-a61b76cb9aee/GCA_017993135.1_ASM1799313v1_genomic.fna.gz --refList GCA_017993135.1_ASM1799313v1_genomic.fna/target_genomes_gtdb.txt --output GCA_017993135.1_ASM1799313v1_genomic.fna/fastani_result_gtdb.tsv --threads 1
[2023-06-28 07:22:18,731] [INFO] Task succeeded: fastANI
[2023-06-28 07:22:18,751] [INFO] Found 24 fastANI hits (1 hits with ANI > circumscription radius)
[2023-06-28 07:22:18,752] [INFO] GTDB search result
--------------------------------------------------------------------------------
accession	gtdb_species	ani	matched_fragments	total_fragments	gtdb_taxonomy	ani_circumscription_radius	mean_intra_species_ani	min_intra_species_ani	mean_intra_species_af	min_intra_species_af	num_clustered_genomes	status
GCA_017993135.1	s__Brachymonas sp017993135	100.0	537	541	d__Bacteria;p__Proteobacteria;c__Gammaproteobacteria;o__Burkholderiales;f__Burkholderiaceae;g__Brachymonas	95.0	99.65	99.42	0.91	0.88	7	conclusive
GCA_017996515.1	s__Brachymonas sp017996515	77.7568	127	541	d__Bacteria;p__Proteobacteria;c__Gammaproteobacteria;o__Burkholderiales;f__Burkholderiaceae;g__Brachymonas	95.0	N/A	N/A	N/A	N/A	1	-
GCA_018384835.1	s__Comamonas sp018384835	77.0153	91	541	d__Bacteria;p__Proteobacteria;c__Gammaproteobacteria;o__Burkholderiales;f__Burkholderiaceae;g__Comamonas	95.0	N/A	N/A	N/A	N/A	1	-
GCA_013297875.1	s__JACMQX01 sp013297875	76.9763	109	541	d__Bacteria;p__Proteobacteria;c__Gammaproteobacteria;o__Burkholderiales;f__Burkholderiaceae;g__JACMQX01	95.0	N/A	N/A	N/A	N/A	1	-
GCA_903832625.1	s__Limnohabitans sp903832625	76.9016	106	541	d__Bacteria;p__Proteobacteria;c__Gammaproteobacteria;o__Burkholderiales;f__Burkholderiaceae;g__Limnohabitans	95.0	97.24	97.19	0.90	0.89	4	-
GCA_017988165.1	s__Giesbergeria sp017988165	76.8483	82	541	d__Bacteria;p__Proteobacteria;c__Gammaproteobacteria;o__Burkholderiales;f__Burkholderiaceae;g__Giesbergeria	95.0	N/A	N/A	N/A	N/A	1	-
GCF_000374625.1	s__Brachymonas chironomi	76.8157	99	541	d__Bacteria;p__Proteobacteria;c__Gammaproteobacteria;o__Burkholderiales;f__Burkholderiaceae;g__Brachymonas	95.0	N/A	N/A	N/A	N/A	1	-
GCF_004341365.1	s__Giesbergeria metamorpha	76.7398	129	541	d__Bacteria;p__Proteobacteria;c__Gammaproteobacteria;o__Burkholderiales;f__Burkholderiaceae;g__Giesbergeria	95.0	99.27	98.54	0.92	0.85	3	-
GCF_900110225.1	s__Brachymonas denitrificans	76.6641	121	541	d__Bacteria;p__Proteobacteria;c__Gammaproteobacteria;o__Burkholderiales;f__Burkholderiaceae;g__Brachymonas	95.0	N/A	N/A	N/A	N/A	1	-
GCF_002837135.1	s__Macromonas bipunctata	76.6617	96	541	d__Bacteria;p__Proteobacteria;c__Gammaproteobacteria;o__Burkholderiales;f__Burkholderiaceae;g__Macromonas	95.0	N/A	N/A	N/A	N/A	1	-
GCA_013297865.1	s__JAAFIP01 sp013297865	76.585	106	541	d__Bacteria;p__Proteobacteria;c__Gammaproteobacteria;o__Burkholderiales;f__Burkholderiaceae;g__JAAFIP01	95.0	N/A	N/A	N/A	N/A	1	-
GCF_011303755.1	s__Acidovorax_C sp011303755	76.558	119	541	d__Bacteria;p__Proteobacteria;c__Gammaproteobacteria;o__Burkholderiales;f__Burkholderiaceae;g__Acidovorax_C	95.0	N/A	N/A	N/A	N/A	1	-
GCF_003096555.1	s__Giesbergeria sp003096555	76.5193	87	541	d__Bacteria;p__Proteobacteria;c__Gammaproteobacteria;o__Burkholderiales;f__Burkholderiaceae;g__Giesbergeria	95.0	99.19	98.39	0.95	0.90	3	-
GCA_903928495.1	s__UBA2334 sp903928495	76.4982	92	541	d__Bacteria;p__Proteobacteria;c__Gammaproteobacteria;o__Burkholderiales;f__Burkholderiaceae;g__UBA2334	95.0	99.91	99.91	0.98	0.98	2	-
GCA_018882835.1	s__Limnohabitans sp018882835	76.4872	83	541	d__Bacteria;p__Proteobacteria;c__Gammaproteobacteria;o__Burkholderiales;f__Burkholderiaceae;g__Limnohabitans	95.0	N/A	N/A	N/A	N/A	1	-
GCA_002842265.1	s__Rhodoferax sp002842265	76.4505	74	541	d__Bacteria;p__Proteobacteria;c__Gammaproteobacteria;o__Burkholderiales;f__Burkholderiaceae;g__Rhodoferax	95.0	N/A	N/A	N/A	N/A	1	-
GCF_003604195.1	s__Giesbergeria lacusdiani	76.4192	95	541	d__Bacteria;p__Proteobacteria;c__Gammaproteobacteria;o__Burkholderiales;f__Burkholderiaceae;g__Giesbergeria	95.0	N/A	N/A	N/A	N/A	1	-
GCA_903902035.1	s__Curvibacter sp903902035	76.3602	84	541	d__Bacteria;p__Proteobacteria;c__Gammaproteobacteria;o__Burkholderiales;f__Burkholderiaceae;g__Curvibacter	95.0	98.85	97.80	0.89	0.83	3	-
GCF_000484635.1	s__Comamonas_C badia	76.3573	75	541	d__Bacteria;p__Proteobacteria;c__Gammaproteobacteria;o__Burkholderiales;f__Burkholderiaceae;g__Comamonas_C	95.0	98.01	98.01	0.89	0.89	2	-
GCA_002282445.1	s__Limnohabitans sp002282445	76.2518	86	541	d__Bacteria;p__Proteobacteria;c__Gammaproteobacteria;o__Burkholderiales;f__Burkholderiaceae;g__Limnohabitans	95.0	100.00	100.00	0.99	0.99	3	-
GCF_000204645.1	s__Alicycliphilus denitrificans	76.2397	89	541	d__Bacteria;p__Proteobacteria;c__Gammaproteobacteria;o__Burkholderiales;f__Burkholderiaceae;g__Alicycliphilus	95.0	97.26	95.80	0.83	0.76	8	-
GCA_903823025.1	s__Rhodoferax sp903823025	76.1925	58	541	d__Bacteria;p__Proteobacteria;c__Gammaproteobacteria;o__Burkholderiales;f__Burkholderiaceae;g__Rhodoferax	95.0	99.95	99.83	0.97	0.95	6	-
GCF_001713375.1	s__Hydrogenophaga sp001713375	76.1775	84	541	d__Bacteria;p__Proteobacteria;c__Gammaproteobacteria;o__Burkholderiales;f__Burkholderiaceae;g__Hydrogenophaga	95.0	N/A	N/A	N/A	N/A	1	-
GCA_018883305.1	s__Limnohabitans sp018883305	76.1328	53	541	d__Bacteria;p__Proteobacteria;c__Gammaproteobacteria;o__Burkholderiales;f__Burkholderiaceae;g__Limnohabitans	95.0	N/A	N/A	N/A	N/A	1	-
--------------------------------------------------------------------------------
[2023-06-28 07:22:18,754] [INFO] GTDB search result was written to GCA_017993135.1_ASM1799313v1_genomic.fna/result_gtdb.tsv
[2023-06-28 07:22:18,754] [INFO] ===== GTDB Search completed =====
[2023-06-28 07:22:18,759] [INFO] DFAST_QC result json was written to GCA_017993135.1_ASM1799313v1_genomic.fna/dqc_result.json
[2023-06-28 07:22:18,759] [INFO] DFAST_QC completed!
[2023-06-28 07:22:18,759] [INFO] Total running time: 0h1m1s