[2023-06-07 18:56:58,084] [INFO] DFAST_QC pipeline started.
[2023-06-07 18:56:58,088] [INFO] DFAST_QC version: 0.5.7
[2023-06-07 18:56:58,088] [INFO] DQC Reference Directory: /var/lib/cwl/stg83122a49-c5b6-46ee-8e81-a7479494b00a/dqc_reference
[2023-06-07 18:56:59,418] [INFO] ===== Start taxonomy check using ANI =====
[2023-06-07 18:56:59,419] [INFO] Task started: Prodigal
[2023-06-07 18:56:59,419] [INFO] Running command: gunzip -c /var/lib/cwl/stg1a995ebb-0274-4e6e-b7ea-b4eae5e41ca6/GCA_902752425.1_P2236_101_bin9_mag_fasta_genomic.fna.gz | prodigal -d GCA_902752425.1_P2236_101_bin9_mag_fasta_genomic.fna/cds.fna -a GCA_902752425.1_P2236_101_bin9_mag_fasta_genomic.fna/protein.faa -g 11 -q > /dev/null
[2023-06-07 18:57:03,470] [INFO] Task succeeded: Prodigal
[2023-06-07 18:57:03,470] [INFO] Task started: HMMsearch
[2023-06-07 18:57:03,471] [INFO] Running command: hmmsearch --tblout GCA_902752425.1_P2236_101_bin9_mag_fasta_genomic.fna/hmmer_result.tsv -E 1E-50 /var/lib/cwl/stg83122a49-c5b6-46ee-8e81-a7479494b00a/dqc_reference/reference_markers.hmm GCA_902752425.1_P2236_101_bin9_mag_fasta_genomic.fna/protein.faa > /dev/null
[2023-06-07 18:57:03,718] [INFO] Task succeeded: HMMsearch
[2023-06-07 18:57:03,720] [INFO] Found 6/6 markers.
[2023-06-07 18:57:03,749] [INFO] Query marker FASTA was written to GCA_902752425.1_P2236_101_bin9_mag_fasta_genomic.fna/markers.fasta
[2023-06-07 18:57:03,750] [INFO] Task started: Blastn
[2023-06-07 18:57:03,750] [INFO] Running command: blastn -query GCA_902752425.1_P2236_101_bin9_mag_fasta_genomic.fna/markers.fasta -db /var/lib/cwl/stg83122a49-c5b6-46ee-8e81-a7479494b00a/dqc_reference/reference_markers.fasta -out GCA_902752425.1_P2236_101_bin9_mag_fasta_genomic.fna/blast.markers.tsv -outfmt 6 -max_hsps 1 -num_alignments 5
[2023-06-07 18:57:04,330] [INFO] Task succeeded: Blastn
[2023-06-07 18:57:04,334] [INFO] Selected 11 target genomes.
[2023-06-07 18:57:04,335] [INFO] Target genome list was writen to GCA_902752425.1_P2236_101_bin9_mag_fasta_genomic.fna/target_genomes.txt
[2023-06-07 18:57:04,337] [INFO] Task started: fastANI
[2023-06-07 18:57:04,337] [INFO] Running command: fastANI --query /var/lib/cwl/stg1a995ebb-0274-4e6e-b7ea-b4eae5e41ca6/GCA_902752425.1_P2236_101_bin9_mag_fasta_genomic.fna.gz --refList GCA_902752425.1_P2236_101_bin9_mag_fasta_genomic.fna/target_genomes.txt --output GCA_902752425.1_P2236_101_bin9_mag_fasta_genomic.fna/fastani_result.tsv --threads 1
[2023-06-07 18:57:08,700] [INFO] Task succeeded: fastANI
[2023-06-07 18:57:08,700] [INFO] Loading species specific ANI threshold from /var/lib/cwl/stg83122a49-c5b6-46ee-8e81-a7479494b00a/dqc_reference/prokaryote_ANI_species_specific_threshold.txt
[2023-06-07 18:57:08,700] [WARNING] Species-specific ANI threshold file not found. Will use the default threshold for all species. [/var/lib/cwl/stg83122a49-c5b6-46ee-8e81-a7479494b00a/dqc_reference/prokaryote_ANI_species_specific_threshold.txt]
[2023-06-07 18:57:08,710] [INFO] Found 11 fastANI hits (0 hits with ANI > threshold)
[2023-06-07 18:57:08,711] [INFO] The taxonomy check result is classified as 'below_threshold'.
[2023-06-07 18:57:08,711] [INFO] DFAST Taxonomy check final result
--------------------------------------------------------------------------------
organism_name	strain	accession	taxid	species_taxid	relation_to_type	validated	ani	matched_fragments	total_fragments	ani_threshold	status
Candidatus Sulfurimonas baltica	strain=GD2	GCA_015265455.1	2740404	2740404	type	True	78.3469	180	591	95	below_threshold
Sulfurimonas gotlandica	strain=GD1	GCA_000242915.2	1176482	1176482	type	True	78.2211	218	591	95	below_threshold
Sulfurimonas gotlandica	strain=GD 1	GCA_000156095.1	1176482	1176482	type	True	78.1483	218	591	95	below_threshold
Candidatus Sulfurimonas marisnigri	strain=SoZ1	GCA_015265475.1	2740405	2740405	type	True	78.1304	204	591	95	below_threshold
Sulfurimonas hongkongensis	strain=AST-10	GCA_000445475.1	1172190	1172190	type	True	77.744	138	591	95	below_threshold
Sulfurimonas autotrophica	strain=DSM 16294	GCA_000147355.1	202747	202747	type	True	77.4014	144	591	95	below_threshold
Sulfurimonas xiamenensis	strain=1-1N	GCA_009258045.1	2590021	2590021	type	True	77.3444	138	591	95	below_threshold
Sulfurimonas denitrificans	strain=DSM 1251	GCA_000012965.1	39766	39766	type	True	77.3432	137	591	95	below_threshold
Sulfurimonas lithotrophica	strain=GYSZ_1	GCA_009258225.1	2590022	2590022	type	True	77.2795	109	591	95	below_threshold
Sulfurimonas crateris	strain=SN118	GCA_005217605.1	2574727	2574727	type	True	77.1678	114	591	95	below_threshold
Sulfurimonas hydrogeniphila	strain=NW10	GCA_009068765.1	2509341	2509341	type	True	77.1137	98	591	95	below_threshold
--------------------------------------------------------------------------------
[2023-06-07 18:57:08,718] [INFO] DFAST Taxonomy check result was written to GCA_902752425.1_P2236_101_bin9_mag_fasta_genomic.fna/tc_result.tsv
[2023-06-07 18:57:08,719] [INFO] ===== Taxonomy check completed =====
[2023-06-07 18:57:08,719] [INFO] ===== Start completeness check using CheckM =====
[2023-06-07 18:57:08,719] [INFO] Setting CHECKM_DATA_PATH to /var/lib/cwl/stg83122a49-c5b6-46ee-8e81-a7479494b00a/dqc_reference/checkm_data
[2023-06-07 18:57:08,721] [INFO] Selected 'Prokaryote' markers (life, taxid=0) for CheckM
[2023-06-07 18:57:08,753] [INFO] Task started: CheckM
[2023-06-07 18:57:08,753] [INFO] Running command: checkm taxonomy_wf --tab_table -f GCA_902752425.1_P2236_101_bin9_mag_fasta_genomic.fna/cc_result.tsv -t 1 life "Prokaryote" GCA_902752425.1_P2236_101_bin9_mag_fasta_genomic.fna/checkm_input GCA_902752425.1_P2236_101_bin9_mag_fasta_genomic.fna/checkm_result
[2023-06-07 18:57:29,556] [INFO] Task succeeded: CheckM
[2023-06-07 18:57:29,557] [INFO] Completeness check finished.
--------------------------------------------------------------------------------
Completeness: 79.17%
Contamintation: 12.50%
Strain heterogeneity: 42.86%
--------------------------------------------------------------------------------
[2023-06-07 18:57:29,579] [INFO] ===== Completeness check finished =====
[2023-06-07 18:57:29,580] [INFO] ===== Start GTDB Search =====
[2023-06-07 18:57:29,580] [INFO] Query marker FASTA already exists. Will reuse it. (GCA_902752425.1_P2236_101_bin9_mag_fasta_genomic.fna/markers.fasta)
[2023-06-07 18:57:29,580] [INFO] Task started: Blastn
[2023-06-07 18:57:29,581] [INFO] Running command: blastn -query GCA_902752425.1_P2236_101_bin9_mag_fasta_genomic.fna/markers.fasta -db /var/lib/cwl/stg83122a49-c5b6-46ee-8e81-a7479494b00a/dqc_reference/reference_markers_gtdb.fasta -out GCA_902752425.1_P2236_101_bin9_mag_fasta_genomic.fna/blast.markers.gtdb.tsv -outfmt 6 -max_hsps 1 -num_alignments 5
[2023-06-07 18:57:30,367] [INFO] Task succeeded: Blastn
[2023-06-07 18:57:30,371] [INFO] Selected 15 target genomes.
[2023-06-07 18:57:30,371] [INFO] Target genome list was writen to GCA_902752425.1_P2236_101_bin9_mag_fasta_genomic.fna/target_genomes_gtdb.txt
[2023-06-07 18:57:30,379] [INFO] Task started: fastANI
[2023-06-07 18:57:30,380] [INFO] Running command: fastANI --query /var/lib/cwl/stg1a995ebb-0274-4e6e-b7ea-b4eae5e41ca6/GCA_902752425.1_P2236_101_bin9_mag_fasta_genomic.fna.gz --refList GCA_902752425.1_P2236_101_bin9_mag_fasta_genomic.fna/target_genomes_gtdb.txt --output GCA_902752425.1_P2236_101_bin9_mag_fasta_genomic.fna/fastani_result_gtdb.tsv --threads 1
[2023-06-07 18:57:36,504] [INFO] Task succeeded: fastANI
[2023-06-07 18:57:36,524] [INFO] Found 15 fastANI hits (0 hits with ANI > circumscription radius)
[2023-06-07 18:57:36,524] [INFO] GTDB search result
--------------------------------------------------------------------------------
accession	gtdb_species	ani	matched_fragments	total_fragments	gtdb_taxonomy	ani_circumscription_radius	mean_intra_species_ani	min_intra_species_ani	mean_intra_species_af	min_intra_species_af	num_clustered_genomes	status
GCA_002781365.1	s__Sulfurimonas sp002781365	78.6819	184	591	d__Bacteria;p__Campylobacterota;c__Campylobacteria;o__Campylobacterales;f__Sulfurimonadaceae;g__Sulfurimonas	95.0	99.99	99.89	0.98	0.96	19	-
GCA_002732645.1	s__Sulfurimonas sp002732645	78.5944	104	591	d__Bacteria;p__Campylobacterota;c__Campylobacteria;o__Campylobacterales;f__Sulfurimonadaceae;g__Sulfurimonas	95.0	N/A	N/A	N/A	N/A	1	-
GCF_015265455.1	s__Sulfurimonas baltica	78.3537	179	591	d__Bacteria;p__Campylobacterota;c__Campylobacteria;o__Campylobacterales;f__Sulfurimonadaceae;g__Sulfurimonas	95.0	N/A	N/A	N/A	N/A	1	-
GCA_001873135.1	s__Sulfurimonas sp001873135	78.2623	191	591	d__Bacteria;p__Campylobacterota;c__Campylobacteria;o__Campylobacterales;f__Sulfurimonadaceae;g__Sulfurimonas	95.0	N/A	N/A	N/A	N/A	1	-
GCF_000242915.1	s__Sulfurimonas gotlandica	78.2211	218	591	d__Bacteria;p__Campylobacterota;c__Campylobacteria;o__Campylobacterales;f__Sulfurimonadaceae;g__Sulfurimonas	95.0	99.99	99.99	1.00	1.00	2	-
GCA_018822665.1	s__Sulfurimonas sp018822665	78.1951	197	591	d__Bacteria;p__Campylobacterota;c__Campylobacteria;o__Campylobacterales;f__Sulfurimonadaceae;g__Sulfurimonas	95.0	99.95	99.95	0.96	0.96	2	-
GCA_016744025.1	s__Sulfurimonas sp016744025	78.121	164	591	d__Bacteria;p__Campylobacterota;c__Campylobacteria;o__Campylobacterales;f__Sulfurimonadaceae;g__Sulfurimonas	95.0	N/A	N/A	N/A	N/A	1	-
GCF_015265475.1	s__Sulfurimonas marisnigri	78.1132	205	591	d__Bacteria;p__Campylobacterota;c__Campylobacteria;o__Campylobacterales;f__Sulfurimonadaceae;g__Sulfurimonas	95.0	N/A	N/A	N/A	N/A	1	-
GCA_011391635.1	s__Sulfurimonas sp011391635	78.0923	171	591	d__Bacteria;p__Campylobacterota;c__Campylobacteria;o__Campylobacterales;f__Sulfurimonadaceae;g__Sulfurimonas	95.0	N/A	N/A	N/A	N/A	1	-
GCA_018825665.1	s__Sulfurimonas sp018825665	77.6701	170	591	d__Bacteria;p__Campylobacterota;c__Campylobacteria;o__Campylobacterales;f__Sulfurimonadaceae;g__Sulfurimonas	95.0	99.99	99.99	0.97	0.97	2	-
GCA_001829715.1	s__Sulfurimonas sp001829715	77.6698	133	591	d__Bacteria;p__Campylobacterota;c__Campylobacteria;o__Campylobacterales;f__Sulfurimonadaceae;g__Sulfurimonas	95.0	98.65	98.65	0.90	0.90	2	-
GCF_009883775.1	s__Sulfurimonas sp002452895	77.4357	140	591	d__Bacteria;p__Campylobacterota;c__Campylobacteria;o__Campylobacterales;f__Sulfurimonadaceae;g__Sulfurimonas	95.0	98.89	98.30	0.92	0.92	4	-
GCF_000147355.1	s__Sulfurimonas autotrophica	77.4014	144	591	d__Bacteria;p__Campylobacterota;c__Campylobacteria;o__Campylobacterales;f__Sulfurimonadaceae;g__Sulfurimonas	95.0	N/A	N/A	N/A	N/A	1	-
GCF_000012965.1	s__Sulfurimonas denitrificans	77.3432	137	591	d__Bacteria;p__Campylobacterota;c__Campylobacteria;o__Campylobacterales;f__Sulfurimonadaceae;g__Sulfurimonas	95.0	N/A	N/A	N/A	N/A	1	-
GCF_009068765.1	s__Sulfurimonas sp009068765	77.1165	97	591	d__Bacteria;p__Campylobacterota;c__Campylobacteria;o__Campylobacterales;f__Sulfurimonadaceae;g__Sulfurimonas	95.0	N/A	N/A	N/A	N/A	1	-
--------------------------------------------------------------------------------
[2023-06-07 18:57:36,526] [INFO] GTDB search result was written to GCA_902752425.1_P2236_101_bin9_mag_fasta_genomic.fna/result_gtdb.tsv
[2023-06-07 18:57:36,527] [INFO] ===== GTDB Search completed =====
[2023-06-07 18:57:36,530] [INFO] DFAST_QC result json was written to GCA_902752425.1_P2236_101_bin9_mag_fasta_genomic.fna/dqc_result.json
[2023-06-07 18:57:36,530] [INFO] DFAST_QC completed!
[2023-06-07 18:57:36,530] [INFO] Total running time: 0h0m38s