[2023-03-18 06:30:24,814] [INFO] DFAST_QC pipeline started.
[2023-03-18 06:30:24,817] [INFO] DFAST_QC version: 0.5.7
[2023-03-18 06:30:24,817] [INFO] DQC Reference Directory: /var/lib/cwl/stgb0b0c464-b019-4da5-8537-f488565e6e8e/dqc_reference
[2023-03-18 06:30:25,962] [INFO] ===== Start taxonomy check using ANI =====
[2023-03-18 06:30:25,962] [INFO] Task started: Prodigal
[2023-03-18 06:30:25,962] [INFO] Running command: cat /var/lib/cwl/stg5cdad0ef-6b63-4cf5-9142-a60ed34e8949/OceanDNA-b520.fa | prodigal -d OceanDNA-b520/cds.fna -a OceanDNA-b520/protein.faa -g 11 -q > /dev/null
[2023-03-18 06:30:35,795] [INFO] Task succeeded: Prodigal
[2023-03-18 06:30:35,796] [INFO] Task started: HMMsearch
[2023-03-18 06:30:35,796] [INFO] Running command: hmmsearch --tblout OceanDNA-b520/hmmer_result.tsv -E 1E-50 /var/lib/cwl/stgb0b0c464-b019-4da5-8537-f488565e6e8e/dqc_reference/reference_markers.hmm OceanDNA-b520/protein.faa > /dev/null
[2023-03-18 06:30:35,943] [INFO] Task succeeded: HMMsearch
[2023-03-18 06:30:35,944] [WARNING] Found 5/6 markers. [/var/lib/cwl/stg5cdad0ef-6b63-4cf5-9142-a60ed34e8949/OceanDNA-b520.fa]
[2023-03-18 06:30:35,984] [INFO] Query marker FASTA was written to OceanDNA-b520/markers.fasta
[2023-03-18 06:30:35,985] [INFO] Task started: Blastn
[2023-03-18 06:30:35,985] [INFO] Running command: blastn -query OceanDNA-b520/markers.fasta -db /var/lib/cwl/stgb0b0c464-b019-4da5-8537-f488565e6e8e/dqc_reference/reference_markers.fasta -out OceanDNA-b520/blast.markers.tsv -outfmt 6 -max_hsps 1 -num_alignments 5
[2023-03-18 06:30:36,601] [INFO] Task succeeded: Blastn
[2023-03-18 06:30:36,608] [INFO] Selected 23 target genomes.
[2023-03-18 06:30:36,608] [INFO] Target genome list was writen to OceanDNA-b520/target_genomes.txt
[2023-03-18 06:30:36,672] [INFO] Task started: fastANI
[2023-03-18 06:30:36,673] [INFO] Running command: fastANI --query /var/lib/cwl/stg5cdad0ef-6b63-4cf5-9142-a60ed34e8949/OceanDNA-b520.fa --refList OceanDNA-b520/target_genomes.txt --output OceanDNA-b520/fastani_result.tsv --threads 1
[2023-03-18 06:30:59,163] [INFO] Task succeeded: fastANI
[2023-03-18 06:30:59,163] [INFO] Loading species specific ANI threshold from /var/lib/cwl/stgb0b0c464-b019-4da5-8537-f488565e6e8e/dqc_reference/prokaryote_ANI_species_specific_threshold.txt
[2023-03-18 06:30:59,163] [WARNING] Species-specific ANI threshold file not found. Will use the default threshold for all species. [/var/lib/cwl/stgb0b0c464-b019-4da5-8537-f488565e6e8e/dqc_reference/prokaryote_ANI_species_specific_threshold.txt]
[2023-03-18 06:30:59,174] [INFO] Found 16 fastANI hits (0 hits with ANI > threshold)
[2023-03-18 06:30:59,174] [INFO] The taxonomy check result is classified as 'below_threshold'.
[2023-03-18 06:30:59,174] [INFO] DFAST Taxonomy check final result
--------------------------------------------------------------------------------
organism_name	strain	accession	taxid	species_taxid	relation_to_type	validated	ani	matched_fragments	total_fragments	ani_threshold	status
Ilumatobacter fluminis	strain=DSM 18936	GCA_004364865.1	467091	467091	type	True	77.1326	138	497	95	below_threshold
Desertimonas flava	strain=SYSU D60003	GCA_003426815.1	2064846	2064846	type	True	77.0622	166	497	95	below_threshold
Ilumatobacter coccineus	strain=YM16-304	GCA_000348785.1	467094	467094	type	True	76.681	109	497	95	below_threshold
Actinomarinicola tropica	strain=SCSIO 58843	GCA_009650215.1	2789776	2789776	type	True	76.6475	123	497	95	below_threshold
Rhabdothermincola salaria	strain=EGI L10124	GCA_021246445.1	2903142	2903142	type	True	76.5914	104	497	95	below_threshold
Amycolatopsis eburnea	strain=GLM-1	GCA_003937945.1	2267691	2267691	type	True	75.4552	60	497	95	below_threshold
Saccharothrix syringae	strain=NRRL B-16468	GCA_009498035.1	103733	103733	type	True	75.4132	77	497	95	below_threshold
Saccharothrix syringae	strain=NRRL B-16468	GCA_000716755.1	103733	103733	type	True	75.3768	74	497	95	below_threshold
Actinomadura violacea	strain=LCR2-06	GCA_017573465.1	2819934	2819934	type	True	75.2736	85	497	95	below_threshold
Nocardia vulneris	strain=NBRC 108936	GCA_001613425.1	1141657	1141657	type	True	75.2245	52	497	95	below_threshold
Actinokineospora terrae	strain=DSM 44260	GCA_900111175.1	155974	155974	type	True	75.198	51	497	95	below_threshold
Nocardia vulneris	strain=W9851	GCA_000811985.1	1141657	1141657	type	True	75.1977	51	497	95	below_threshold
Actinomadura bangladeshensis	strain=DSM 45347	GCA_004348335.1	453573	453573	type	True	75.1777	69	497	95	below_threshold
Actinomadura cremea	strain=JCM 3308	GCA_014648495.1	1991	1991	type	True	75.131	71	497	95	below_threshold
Actinokineospora cianjurensis	strain=DSM 45657	GCA_003663795.1	585224	585224	type	True	75.0986	54	497	95	below_threshold
Actinomadura geliboluensis	strain=A8036	GCA_005889745.1	882440	882440	type	True	75.011	64	497	95	below_threshold
--------------------------------------------------------------------------------
[2023-03-18 06:30:59,176] [INFO] DFAST Taxonomy check result was written to OceanDNA-b520/tc_result.tsv
[2023-03-18 06:30:59,176] [INFO] ===== Taxonomy check completed =====
[2023-03-18 06:30:59,176] [INFO] ===== Start completeness check using CheckM =====
[2023-03-18 06:30:59,176] [INFO] Setting CHECKM_DATA_PATH to /var/lib/cwl/stgb0b0c464-b019-4da5-8537-f488565e6e8e/dqc_reference/checkm_data
[2023-03-18 06:30:59,177] [INFO] Selected 'Prokaryote' markers (life, taxid=0) for CheckM
[2023-03-18 06:30:59,204] [INFO] Task started: CheckM
[2023-03-18 06:30:59,204] [INFO] Running command: checkm taxonomy_wf --tab_table -f OceanDNA-b520/cc_result.tsv -t 1 life "Prokaryote" OceanDNA-b520/checkm_input OceanDNA-b520/checkm_result
[2023-03-18 06:31:35,804] [INFO] Task succeeded: CheckM
[2023-03-18 06:31:35,805] [INFO] Completeness check finished.
--------------------------------------------------------------------------------
Completeness: 54.17%
Contamintation: 4.17%
Strain heterogeneity: 100.00%
--------------------------------------------------------------------------------
[2023-03-18 06:31:35,905] [INFO] ===== Completeness check finished =====
[2023-03-18 06:31:35,905] [INFO] ===== Start GTDB Search =====
[2023-03-18 06:31:35,905] [INFO] Query marker FASTA already exists. Will reuse it. (OceanDNA-b520/markers.fasta)
[2023-03-18 06:31:35,906] [INFO] Task started: Blastn
[2023-03-18 06:31:35,906] [INFO] Running command: blastn -query OceanDNA-b520/markers.fasta -db /var/lib/cwl/stgb0b0c464-b019-4da5-8537-f488565e6e8e/dqc_reference/reference_markers_gtdb.fasta -out OceanDNA-b520/blast.markers.gtdb.tsv -outfmt 6 -max_hsps 1 -num_alignments 5
[2023-03-18 06:31:36,832] [INFO] Task succeeded: Blastn
[2023-03-18 06:31:36,852] [INFO] Selected 18 target genomes.
[2023-03-18 06:31:36,852] [INFO] Target genome list was writen to OceanDNA-b520/target_genomes_gtdb.txt
[2023-03-18 06:31:37,063] [INFO] Task started: fastANI
[2023-03-18 06:31:37,063] [INFO] Running command: fastANI --query /var/lib/cwl/stg5cdad0ef-6b63-4cf5-9142-a60ed34e8949/OceanDNA-b520.fa --refList OceanDNA-b520/target_genomes_gtdb.txt --output OceanDNA-b520/fastani_result_gtdb.tsv --threads 1
[2023-03-18 06:31:47,095] [INFO] Task succeeded: fastANI
[2023-03-18 06:31:47,105] [INFO] Found 16 fastANI hits (1 hits with ANI > circumscription radius)
[2023-03-18 06:31:47,105] [INFO] GTDB search result
--------------------------------------------------------------------------------
accession	gtdb_species	ani	matched_fragments	total_fragments	gtdb_taxonomy	ani_circumscription_radius	mean_intra_species_ani	min_intra_species_ani	mean_intra_species_af	min_intra_species_af	num_clustered_genomes	status
GCA_009920715.1	s__UBA2093 sp009920715	98.8455	400	497	d__Bacteria;p__Actinobacteriota;c__Acidimicrobiia;o__Acidimicrobiales;f__Ilumatobacteraceae;g__UBA2093	95.0	97.14	95.07	0.80	0.77	6	conclusive
GCA_005788205.1	s__UBA2093 sp005788205	83.2263	290	497	d__Bacteria;p__Actinobacteriota;c__Acidimicrobiia;o__Acidimicrobiales;f__Ilumatobacteraceae;g__UBA2093	95.0	N/A	N/A	N/A	N/A	1	-
GCA_016870775.1	s__UBA2093 sp016870775	78.7698	208	497	d__Bacteria;p__Actinobacteriota;c__Acidimicrobiia;o__Acidimicrobiales;f__Ilumatobacteraceae;g__UBA2093	95.0	N/A	N/A	N/A	N/A	1	-
GCA_018969685.1	s__UBA3006 sp018969685	77.8355	148	497	d__Bacteria;p__Actinobacteriota;c__Acidimicrobiia;o__Acidimicrobiales;f__Ilumatobacteraceae;g__UBA3006	95.0	N/A	N/A	N/A	N/A	1	-
GCA_009704985.1	s__VFMC01 sp009704985	77.7543	192	497	d__Bacteria;p__Actinobacteriota;c__Acidimicrobiia;o__Acidimicrobiales;f__Ilumatobacteraceae;g__VFMC01	95.0	N/A	N/A	N/A	N/A	1	-
GCA_016871355.1	s__Casp-actino8 sp016871355	77.5674	98	497	d__Bacteria;p__Actinobacteriota;c__Acidimicrobiia;o__Acidimicrobiales;f__Ilumatobacteraceae;g__Casp-actino8	95.0	N/A	N/A	N/A	N/A	1	-
GCA_016700055.1	s__Kalu-18 sp016700055	77.3575	162	497	d__Bacteria;p__Actinobacteriota;c__Acidimicrobiia;o__Acidimicrobiales;f__Ilumatobacteraceae;g__Kalu-18	95.0	N/A	N/A	N/A	N/A	1	-
GCF_004364865.1	s__Ilumatobacter fluminis	77.172	136	497	d__Bacteria;p__Actinobacteriota;c__Acidimicrobiia;o__Acidimicrobiales;f__Ilumatobacteraceae;g__Ilumatobacter	95.0	N/A	N/A	N/A	N/A	1	-
GCA_009919275.1	s__F1-20-MAGs119 sp009919275	77.1615	97	497	d__Bacteria;p__Actinobacteriota;c__Acidimicrobiia;o__Acidimicrobiales;f__Ilumatobacteraceae;g__F1-20-MAGs119	95.0	99.53	99.53	0.87	0.87	2	-
GCF_003426815.1	s__Desertimonas flava	77.0495	168	497	d__Bacteria;p__Actinobacteriota;c__Acidimicrobiia;o__Acidimicrobiales;f__Ilumatobacteraceae;g__Desertimonas	95.0	100.00	100.00	1.00	1.00	2	-
GCA_013815505.1	s__JACDHN01 sp013815505	76.8932	69	497	d__Bacteria;p__Actinobacteriota;c__Acidimicrobiia;o__Acidimicrobiales;f__Ilumatobacteraceae;g__JACDHN01	95.0	N/A	N/A	N/A	N/A	1	-
GCA_013002175.1	s__Ilumatobacter_A sp013002175	76.7948	104	497	d__Bacteria;p__Actinobacteriota;c__Acidimicrobiia;o__Acidimicrobiales;f__Ilumatobacteraceae;g__Ilumatobacter_A	95.0	N/A	N/A	N/A	N/A	1	-
GCA_016716005.1	s__JADJXE01 sp016716005	76.7085	112	497	d__Bacteria;p__Actinobacteriota;c__Acidimicrobiia;o__Acidimicrobiales;f__JADJXE01;g__JADJXE01	95.0	N/A	N/A	N/A	N/A	1	-
GCA_902805555.1	s__CADCSY01 sp902805555	76.5078	76	497	d__Bacteria;p__Actinobacteriota;c__Acidimicrobiia;o__Acidimicrobiales;f__CADCSY01;g__CADCSY01	95.0	N/A	N/A	N/A	N/A	1	-
GCA_902805655.1	s__CADCTF01 sp902805655	76.2826	59	497	d__Bacteria;p__Actinobacteriota;c__Acidimicrobiia;o__Acidimicrobiales;f__CADCTF01;g__CADCTF01	95.0	N/A	N/A	N/A	N/A	1	-
GCF_009498035.1	s__Actinosynnema syringae	75.4034	78	497	d__Bacteria;p__Actinobacteriota;c__Actinomycetia;o__Mycobacteriales;f__Pseudonocardiaceae;g__Actinosynnema	95.0	99.98	99.98	1.00	1.00	2	-
--------------------------------------------------------------------------------
[2023-03-18 06:31:47,113] [INFO] GTDB search result was written to OceanDNA-b520/result_gtdb.tsv
[2023-03-18 06:31:47,124] [INFO] ===== GTDB Search completed =====
[2023-03-18 06:31:47,137] [INFO] DFAST_QC result json was written to OceanDNA-b520/dqc_result.json
[2023-03-18 06:31:47,137] [INFO] DFAST_QC completed!
[2023-03-18 06:31:47,137] [INFO] Total running time: 0h1m22s