[2023-03-19 04:29:37,748] [INFO] DFAST_QC pipeline started.
[2023-03-19 04:29:37,748] [INFO] DFAST_QC version: 0.5.7
[2023-03-19 04:29:37,748] [INFO] DQC Reference Directory: /var/lib/cwl/stg533ef604-b6ea-4870-b03b-6e63e00d7c0f/dqc_reference
[2023-03-19 04:29:38,862] [INFO] ===== Start taxonomy check using ANI =====
[2023-03-19 04:29:38,863] [INFO] Task started: Prodigal
[2023-03-19 04:29:38,864] [INFO] Running command: cat /var/lib/cwl/stg6e342dcd-6926-4a55-be9a-9fbcc470d391/OceanDNA-b13479.fa | prodigal -d OceanDNA-b13479/cds.fna -a OceanDNA-b13479/protein.faa -g 11 -q > /dev/null
[2023-03-19 04:29:49,537] [INFO] Task succeeded: Prodigal
[2023-03-19 04:29:49,537] [INFO] Task started: HMMsearch
[2023-03-19 04:29:49,537] [INFO] Running command: hmmsearch --tblout OceanDNA-b13479/hmmer_result.tsv -E 1E-50 /var/lib/cwl/stg533ef604-b6ea-4870-b03b-6e63e00d7c0f/dqc_reference/reference_markers.hmm OceanDNA-b13479/protein.faa > /dev/null
[2023-03-19 04:29:49,700] [INFO] Task succeeded: HMMsearch
[2023-03-19 04:29:49,701] [INFO] Found 6/6 markers.
[2023-03-19 04:29:49,715] [INFO] Query marker FASTA was written to OceanDNA-b13479/markers.fasta
[2023-03-19 04:29:49,715] [INFO] Task started: Blastn
[2023-03-19 04:29:49,715] [INFO] Running command: blastn -query OceanDNA-b13479/markers.fasta -db /var/lib/cwl/stg533ef604-b6ea-4870-b03b-6e63e00d7c0f/dqc_reference/reference_markers.fasta -out OceanDNA-b13479/blast.markers.tsv -outfmt 6 -max_hsps 1 -num_alignments 5
[2023-03-19 04:29:50,228] [INFO] Task succeeded: Blastn
[2023-03-19 04:29:50,229] [INFO] Selected 15 target genomes.
[2023-03-19 04:29:50,229] [INFO] Target genome list was writen to OceanDNA-b13479/target_genomes.txt
[2023-03-19 04:29:50,238] [INFO] Task started: fastANI
[2023-03-19 04:29:50,238] [INFO] Running command: fastANI --query /var/lib/cwl/stg6e342dcd-6926-4a55-be9a-9fbcc470d391/OceanDNA-b13479.fa --refList OceanDNA-b13479/target_genomes.txt --output OceanDNA-b13479/fastani_result.tsv --threads 1
[2023-03-19 04:29:55,062] [INFO] Task succeeded: fastANI
[2023-03-19 04:29:55,063] [INFO] Loading species specific ANI threshold from /var/lib/cwl/stg533ef604-b6ea-4870-b03b-6e63e00d7c0f/dqc_reference/prokaryote_ANI_species_specific_threshold.txt
[2023-03-19 04:29:55,063] [WARNING] Species-specific ANI threshold file not found. Will use the default threshold for all species. [/var/lib/cwl/stg533ef604-b6ea-4870-b03b-6e63e00d7c0f/dqc_reference/prokaryote_ANI_species_specific_threshold.txt]
[2023-03-19 04:29:55,070] [INFO] Found 10 fastANI hits (0 hits with ANI > threshold)
[2023-03-19 04:29:55,070] [INFO] The taxonomy check result is classified as 'below_threshold'.
[2023-03-19 04:29:55,070] [INFO] DFAST Taxonomy check final result
--------------------------------------------------------------------------------
organism_name	strain	accession	taxid	species_taxid	relation_to_type	validated	ani	matched_fragments	total_fragments	ani_threshold	status
Candidatus Sulfurimonas baltica	strain=GD2	GCA_015265455.1	2740404	2740404	type	True	77.355	94	654	95	below_threshold
Sulfurimonas xiamenensis	strain=1-1N	GCA_009258045.1	2590021	2590021	type	True	77.3048	64	654	95	below_threshold
Candidatus Sulfurimonas marisnigri	strain=SoZ1	GCA_015265475.1	2740405	2740405	type	True	77.1082	96	654	95	below_threshold
Sulfurimonas gotlandica	strain=GD 1	GCA_000156095.1	1176482	1176482	type	True	76.955	99	654	95	below_threshold
Sulfurimonas gotlandica	strain=GD1	GCA_000242915.2	1176482	1176482	type	True	76.9478	102	654	95	below_threshold
Sulfurimonas sediminis	strain=S2-6	GCA_014905115.1	2590020	2590020	type	True	76.7902	67	654	95	below_threshold
Sulfurimonas autotrophica	strain=DSM 16294	GCA_000147355.1	202747	202747	type	True	76.7271	96	654	95	below_threshold
Sulfurimonas denitrificans	strain=DSM 1251	GCA_000012965.1	39766	39766	type	True	76.7248	66	654	95	below_threshold
Sulfurimonas lithotrophica	strain=GYSZ_1	GCA_009258225.1	2590022	2590022	type	True	76.6339	60	654	95	below_threshold
Sulfurimonas hydrogeniphila	strain=NW10	GCA_009068765.1	2509341	2509341	type	True	76.605	72	654	95	below_threshold
--------------------------------------------------------------------------------
[2023-03-19 04:29:55,070] [INFO] DFAST Taxonomy check result was written to OceanDNA-b13479/tc_result.tsv
[2023-03-19 04:29:55,070] [INFO] ===== Taxonomy check completed =====
[2023-03-19 04:29:55,070] [INFO] ===== Start completeness check using CheckM =====
[2023-03-19 04:29:55,070] [INFO] Setting CHECKM_DATA_PATH to /var/lib/cwl/stg533ef604-b6ea-4870-b03b-6e63e00d7c0f/dqc_reference/checkm_data
[2023-03-19 04:29:55,071] [INFO] Selected 'Prokaryote' markers (life, taxid=0) for CheckM
[2023-03-19 04:29:55,074] [INFO] Task started: CheckM
[2023-03-19 04:29:55,074] [INFO] Running command: checkm taxonomy_wf --tab_table -f OceanDNA-b13479/cc_result.tsv -t 1 life "Prokaryote" OceanDNA-b13479/checkm_input OceanDNA-b13479/checkm_result
[2023-03-19 04:30:25,914] [INFO] Task succeeded: CheckM
[2023-03-19 04:30:25,915] [INFO] Completeness check finished.
--------------------------------------------------------------------------------
Completeness: 79.92%
Contamintation: 4.17%
Strain heterogeneity: 0.00%
--------------------------------------------------------------------------------
[2023-03-19 04:30:25,918] [INFO] ===== Completeness check finished =====
[2023-03-19 04:30:25,918] [INFO] ===== Start GTDB Search =====
[2023-03-19 04:30:25,918] [INFO] Query marker FASTA already exists. Will reuse it. (OceanDNA-b13479/markers.fasta)
[2023-03-19 04:30:25,918] [INFO] Task started: Blastn
[2023-03-19 04:30:25,918] [INFO] Running command: blastn -query OceanDNA-b13479/markers.fasta -db /var/lib/cwl/stg533ef604-b6ea-4870-b03b-6e63e00d7c0f/dqc_reference/reference_markers_gtdb.fasta -out OceanDNA-b13479/blast.markers.gtdb.tsv -outfmt 6 -max_hsps 1 -num_alignments 5
[2023-03-19 04:30:26,621] [INFO] Task succeeded: Blastn
[2023-03-19 04:30:26,622] [INFO] Selected 22 target genomes.
[2023-03-19 04:30:26,622] [INFO] Target genome list was writen to OceanDNA-b13479/target_genomes_gtdb.txt
[2023-03-19 04:30:26,664] [INFO] Task started: fastANI
[2023-03-19 04:30:26,664] [INFO] Running command: fastANI --query /var/lib/cwl/stg6e342dcd-6926-4a55-be9a-9fbcc470d391/OceanDNA-b13479.fa --refList OceanDNA-b13479/target_genomes_gtdb.txt --output OceanDNA-b13479/fastani_result_gtdb.tsv --threads 1
[2023-03-19 04:30:34,335] [INFO] Task succeeded: fastANI
[2023-03-19 04:30:34,345] [INFO] Found 17 fastANI hits (0 hits with ANI > circumscription radius)
[2023-03-19 04:30:34,345] [INFO] GTDB search result
--------------------------------------------------------------------------------
accession	gtdb_species	ani	matched_fragments	total_fragments	gtdb_taxonomy	ani_circumscription_radius	mean_intra_species_ani	min_intra_species_ani	mean_intra_species_af	min_intra_species_af	num_clustered_genomes	status
GCA_001873135.1	s__Sulfurimonas sp001873135	77.6156	90	654	d__Bacteria;p__Campylobacterota;c__Campylobacteria;o__Campylobacterales;f__Sulfurimonadaceae;g__Sulfurimonas	95.0	N/A	N/A	N/A	N/A	1	-
GCA_011391635.1	s__Sulfurimonas sp011391635	77.4455	76	654	d__Bacteria;p__Campylobacterota;c__Campylobacteria;o__Campylobacterales;f__Sulfurimonadaceae;g__Sulfurimonas	95.0	N/A	N/A	N/A	N/A	1	-
GCF_015265455.1	s__Sulfurimonas baltica	77.3381	93	654	d__Bacteria;p__Campylobacterota;c__Campylobacteria;o__Campylobacterales;f__Sulfurimonadaceae;g__Sulfurimonas	95.0	N/A	N/A	N/A	N/A	1	-
GCA_015487265.1	s__Sulfurimonas sp015487265	77.3067	87	654	d__Bacteria;p__Campylobacterota;c__Campylobacteria;o__Campylobacterales;f__Sulfurimonadaceae;g__Sulfurimonas	95.0	N/A	N/A	N/A	N/A	1	-
GCF_017357825.1	s__Sulfurimonas sp017357825	77.2032	113	654	d__Bacteria;p__Campylobacterota;c__Campylobacteria;o__Campylobacterales;f__Sulfurimonadaceae;g__Sulfurimonas	95.0	N/A	N/A	N/A	N/A	1	-
GCA_001829625.1	s__Sulfurimonas sp001829625	77.1827	82	654	d__Bacteria;p__Campylobacterota;c__Campylobacteria;o__Campylobacterales;f__Sulfurimonadaceae;g__Sulfurimonas	95.0	99.76	99.76	0.87	0.87	2	-
GCA_016744025.1	s__Sulfurimonas sp016744025	77.1691	83	654	d__Bacteria;p__Campylobacterota;c__Campylobacteria;o__Campylobacterales;f__Sulfurimonadaceae;g__Sulfurimonas	95.0	N/A	N/A	N/A	N/A	1	-
GCA_018698455.1	s__Sulfurimonas sp018698455	77.0991	95	654	d__Bacteria;p__Campylobacterota;c__Campylobacteria;o__Campylobacterales;f__Sulfurimonadaceae;g__Sulfurimonas	95.0	N/A	N/A	N/A	N/A	1	-
GCA_002781365.1	s__Sulfurimonas sp002781365	77.0174	90	654	d__Bacteria;p__Campylobacterota;c__Campylobacteria;o__Campylobacterales;f__Sulfurimonadaceae;g__Sulfurimonas	95.0	99.99	99.89	0.98	0.96	19	-
GCF_000242915.1	s__Sulfurimonas gotlandica	76.9478	102	654	d__Bacteria;p__Campylobacterota;c__Campylobacteria;o__Campylobacterales;f__Sulfurimonadaceae;g__Sulfurimonas	95.0	99.99	99.99	1.00	1.00	2	-
GCA_018825665.1	s__Sulfurimonas sp018825665	76.823	63	654	d__Bacteria;p__Campylobacterota;c__Campylobacteria;o__Campylobacterales;f__Sulfurimonadaceae;g__Sulfurimonas	95.0	99.99	99.99	0.97	0.97	2	-
GCA_018822665.1	s__Sulfurimonas sp018822665	76.7274	92	654	d__Bacteria;p__Campylobacterota;c__Campylobacteria;o__Campylobacterales;f__Sulfurimonadaceae;g__Sulfurimonas	95.0	99.95	99.95	0.96	0.96	2	-
GCF_000147355.1	s__Sulfurimonas autotrophica	76.7271	96	654	d__Bacteria;p__Campylobacterota;c__Campylobacteria;o__Campylobacterales;f__Sulfurimonadaceae;g__Sulfurimonas	95.0	N/A	N/A	N/A	N/A	1	-
GCA_001829715.1	s__Sulfurimonas sp001829715	76.6827	68	654	d__Bacteria;p__Campylobacterota;c__Campylobacteria;o__Campylobacterales;f__Sulfurimonadaceae;g__Sulfurimonas	95.0	98.65	98.65	0.90	0.90	2	-
GCA_001829675.1	s__Sulfurimonas sp001829675	76.6391	71	654	d__Bacteria;p__Campylobacterota;c__Campylobacteria;o__Campylobacterales;f__Sulfurimonadaceae;g__Sulfurimonas	95.0	99.77	99.75	0.96	0.96	4	-
GCF_009258225.1	s__Sulfurimonas sp009258225	76.6339	60	654	d__Bacteria;p__Campylobacterota;c__Campylobacteria;o__Campylobacterales;f__Sulfurimonadaceae;g__Sulfurimonas	95.0	N/A	N/A	N/A	N/A	1	-
GCA_903892995.1	s__CAITKP01 sp903892995	76.3455	52	654	d__Bacteria;p__Campylobacterota;c__Campylobacteria;o__Campylobacterales;f__Sulfurimonadaceae;g__CAITKP01	95.0	99.47	98.97	0.95	0.95	3	-
--------------------------------------------------------------------------------
[2023-03-19 04:30:34,345] [INFO] GTDB search result was written to OceanDNA-b13479/result_gtdb.tsv
[2023-03-19 04:30:34,346] [INFO] ===== GTDB Search completed =====
[2023-03-19 04:30:34,347] [INFO] DFAST_QC result json was written to OceanDNA-b13479/dqc_result.json
[2023-03-19 04:30:34,347] [INFO] DFAST_QC completed!
[2023-03-19 04:30:34,347] [INFO] Total running time: 0h0m57s