[2023-03-19 04:55:39,431] [INFO] DFAST_QC pipeline started.
[2023-03-19 04:55:39,433] [INFO] DFAST_QC version: 0.5.7
[2023-03-19 04:55:39,433] [INFO] DQC Reference Directory: /var/lib/cwl/stg577bd177-e737-476b-ba0d-7474a6db74ab/dqc_reference
[2023-03-19 04:55:40,560] [INFO] ===== Start taxonomy check using ANI =====
[2023-03-19 04:55:40,561] [INFO] Task started: Prodigal
[2023-03-19 04:55:40,561] [INFO] Running command: cat /var/lib/cwl/stg6e1c73af-1b8d-4096-842f-d28de425d113/OceanDNA-b31471.fa | prodigal -d OceanDNA-b31471/cds.fna -a OceanDNA-b31471/protein.faa -g 11 -q > /dev/null
[2023-03-19 04:55:55,090] [INFO] Task succeeded: Prodigal
[2023-03-19 04:55:55,091] [INFO] Task started: HMMsearch
[2023-03-19 04:55:55,091] [INFO] Running command: hmmsearch --tblout OceanDNA-b31471/hmmer_result.tsv -E 1E-50 /var/lib/cwl/stg577bd177-e737-476b-ba0d-7474a6db74ab/dqc_reference/reference_markers.hmm OceanDNA-b31471/protein.faa > /dev/null
[2023-03-19 04:55:55,258] [INFO] Task succeeded: HMMsearch
[2023-03-19 04:55:55,258] [WARNING] Found 5/6 markers. [/var/lib/cwl/stg6e1c73af-1b8d-4096-842f-d28de425d113/OceanDNA-b31471.fa]
[2023-03-19 04:55:55,285] [INFO] Query marker FASTA was written to OceanDNA-b31471/markers.fasta
[2023-03-19 04:55:55,286] [INFO] Task started: Blastn
[2023-03-19 04:55:55,286] [INFO] Running command: blastn -query OceanDNA-b31471/markers.fasta -db /var/lib/cwl/stg577bd177-e737-476b-ba0d-7474a6db74ab/dqc_reference/reference_markers.fasta -out OceanDNA-b31471/blast.markers.tsv -outfmt 6 -max_hsps 1 -num_alignments 5
[2023-03-19 04:55:56,138] [INFO] Task succeeded: Blastn
[2023-03-19 04:55:56,140] [INFO] Selected 24 target genomes.
[2023-03-19 04:55:56,140] [INFO] Target genome list was writen to OceanDNA-b31471/target_genomes.txt
[2023-03-19 04:55:56,157] [INFO] Task started: fastANI
[2023-03-19 04:55:56,158] [INFO] Running command: fastANI --query /var/lib/cwl/stg6e1c73af-1b8d-4096-842f-d28de425d113/OceanDNA-b31471.fa --refList OceanDNA-b31471/target_genomes.txt --output OceanDNA-b31471/fastani_result.tsv --threads 1
[2023-03-19 04:56:12,116] [INFO] Task succeeded: fastANI
[2023-03-19 04:56:12,117] [INFO] Loading species specific ANI threshold from /var/lib/cwl/stg577bd177-e737-476b-ba0d-7474a6db74ab/dqc_reference/prokaryote_ANI_species_specific_threshold.txt
[2023-03-19 04:56:12,117] [WARNING] Species-specific ANI threshold file not found. Will use the default threshold for all species. [/var/lib/cwl/stg577bd177-e737-476b-ba0d-7474a6db74ab/dqc_reference/prokaryote_ANI_species_specific_threshold.txt]
[2023-03-19 04:56:12,131] [INFO] Found 24 fastANI hits (1 hits with ANI > threshold)
[2023-03-19 04:56:12,131] [INFO] The taxonomy check result is classified as 'conclusive'.
[2023-03-19 04:56:12,131] [INFO] DFAST Taxonomy check final result
--------------------------------------------------------------------------------
organism_name strain accession taxid species_taxid relation_to_type validated ani matched_fragments total_fragments ani_threshold status
Hankyongella ginsenosidimutans strain=W1-2-3 GCA_005144965.1 1763828 1763828 type True 98.3097 727 793 95 conclusive
Pedomonas mirosovicensis strain=A1X5R2 GCA_022569295.1 2908641 2908641 type True 77.6936 220 793 95 below_threshold
Sphingomonas parva strain=17J27-24 GCA_004564275.1 2555898 2555898 type True 77.4668 160 793 95 below_threshold
Sphingomonas chungangi strain=CGMCC 1.13654 GCA_013778325.1 2683589 2683589 type True 77.2481 164 793 95 below_threshold
Sphingomonas sanxanigenens strain=NX02 GCA_000512205.2 397260 397260 type True 77.1955 180 793 95 below_threshold
Sphingomonas jejuensis strain=DSM 27651 GCA_011927695.1 904715 904715 type True 77.1861 142 793 95 below_threshold
Sphingomonas spermidinifaciens strain=9NM-10 GCA_002351485.1 1141889 1141889 type True 77.084 124 793 95 below_threshold
Sphingomonas profundi strain=LMO-1 GCA_009739515.1 2681549 2681549 type True 77.0259 186 793 95 below_threshold
Sphingomonas ginkgonis strain=HMF7854 GCA_003970925.1 2315330 2315330 type True 77.0168 144 793 95 below_threshold
Sphingomonas naasensis strain=KIS18-15 GCA_004792695.1 1344951 1344951 type True 76.9811 154 793 95 below_threshold
Sphingomonas jaspsi strain=DSM 18422 GCA_000585415.1 392409 392409 type True 76.9674 120 793 95 below_threshold
Sphingomonas naasensis strain=DSM 100060 GCA_011762145.1 1344951 1344951 type True 76.9634 155 793 95 below_threshold
Sphingomonas endophytica strain=DSM 101535 GCA_014199415.1 869719 869719 type True 76.9575 161 793 95 below_threshold
Sphingomonas adhaesiva strain=NBRC 15099 GCA_001592345.1 28212 28212 type True 76.8811 160 793 95 below_threshold
Sphingomonas adhaesiva strain=DSM 7418 GCA_002374855.1 28212 28212 type True 76.7822 176 793 95 below_threshold
Starkeya novella strain=DSM 506 GCA_000092925.1 921 921 type True 76.6521 132 793 95 below_threshold
Blastochloris sulfoviridis strain=DSM 729 GCA_008630065.1 50712 50712 type True 76.5666 138 793 95 below_threshold
Blastochloris tepida strain=GI GCA_003966715.1 2233851 2233851 type True 76.5436 134 793 95 below_threshold
Ancylobacter rudongensis strain=CGMCC 1.1761 GCA_900100155.1 177413 177413 type True 76.5282 112 793 95 below_threshold
Cucumibacter marinus strain=DSM 18995 GCA_000429865.1 1121252 1121252 type True 76.4716 53 793 95 below_threshold
Kaistia adipata strain=DSM 17808 GCA_000423225.1 166954 166954 type True 76.4554 130 793 95 below_threshold
Reyranella aquatilis strain=KCTC 52223 GCA_020880995.1 2035356 2035356 type True 76.3439 110 793 95 below_threshold
Starkeya koreensis strain=Jip08 GCA_023016525.1 266121 266121 type True 76.2889 121 793 95 below_threshold
Nisaea acidiphila strain=MEBiC11861 GCA_024662015.1 1862145 1862145 type True 76.1834 94 793 95 below_threshold
--------------------------------------------------------------------------------
[2023-03-19 04:56:12,133] [INFO] DFAST Taxonomy check result was written to OceanDNA-b31471/tc_result.tsv
[2023-03-19 04:56:12,135] [INFO] ===== Taxonomy check completed =====
[2023-03-19 04:56:12,135] [INFO] ===== Start completeness check using CheckM =====
[2023-03-19 04:56:12,136] [INFO] Setting CHECKM_DATA_PATH to /var/lib/cwl/stg577bd177-e737-476b-ba0d-7474a6db74ab/dqc_reference/checkm_data
[2023-03-19 04:56:12,136] [INFO] Selected 'Prokaryote' markers (life, taxid=0) for CheckM
[2023-03-19 04:56:12,142] [INFO] Task started: CheckM
[2023-03-19 04:56:12,142] [INFO] Running command: checkm taxonomy_wf --tab_table -f OceanDNA-b31471/cc_result.tsv -t 1 life "Prokaryote" OceanDNA-b31471/checkm_input OceanDNA-b31471/checkm_result
[2023-03-19 04:56:51,155] [INFO] Task succeeded: CheckM
[2023-03-19 04:56:51,155] [INFO] Completeness check finished.
--------------------------------------------------------------------------------
Completeness: 83.33%
Contamintation: 0.00%
Strain heterogeneity: 0.00%
--------------------------------------------------------------------------------
[2023-03-19 04:56:51,173] [INFO] ===== Completeness check finished =====
[2023-03-19 04:56:51,173] [INFO] ===== Start GTDB Search =====
[2023-03-19 04:56:51,173] [INFO] Query marker FASTA already exists. Will reuse it. (OceanDNA-b31471/markers.fasta)
[2023-03-19 04:56:51,175] [INFO] Task started: Blastn
[2023-03-19 04:56:51,175] [INFO] Running command: blastn -query OceanDNA-b31471/markers.fasta -db /var/lib/cwl/stg577bd177-e737-476b-ba0d-7474a6db74ab/dqc_reference/reference_markers_gtdb.fasta -out OceanDNA-b31471/blast.markers.gtdb.tsv -outfmt 6 -max_hsps 1 -num_alignments 5
[2023-03-19 04:56:52,849] [INFO] Task succeeded: Blastn
[2023-03-19 04:56:52,850] [INFO] Selected 26 target genomes.
[2023-03-19 04:56:52,850] [INFO] Target genome list was writen to OceanDNA-b31471/target_genomes_gtdb.txt
[2023-03-19 04:56:52,871] [INFO] Task started: fastANI
[2023-03-19 04:56:52,871] [INFO] Running command: fastANI --query /var/lib/cwl/stg6e1c73af-1b8d-4096-842f-d28de425d113/OceanDNA-b31471.fa --refList OceanDNA-b31471/target_genomes_gtdb.txt --output OceanDNA-b31471/fastani_result_gtdb.tsv --threads 1
[2023-03-19 04:57:09,897] [INFO] Task succeeded: fastANI
[2023-03-19 04:57:09,912] [INFO] Found 26 fastANI hits (1 hits with ANI > circumscription radius)
[2023-03-19 04:57:09,912] [INFO] GTDB search result
--------------------------------------------------------------------------------
accession gtdb_species ani matched_fragments total_fragments gtdb_taxonomy ani_circumscription_radius mean_intra_species_ani min_intra_species_ani mean_intra_species_af min_intra_species_af num_clustered_genomes status
GCA_005144965.1 s__Hankyongella ginsenosidimutans 98.3097 727 793 d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Sphingomonadales;f__Sphingomonadaceae;g__Hankyongella 95.0 98.05 98.05 0.80 0.80 2 conclusive
GCF_013778325.1 s__Sphingomonas_N chungangi 77.2472 165 793 d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Sphingomonadales;f__Sphingomonadaceae;g__Sphingomonas_N 95.0 100.00 100.00 1.00 1.00 2 -
GCA_018006725.1 s__Rhizorhabdus sp018006725 77.2336 152 793 d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Sphingomonadales;f__Sphingomonadaceae;g__Rhizorhabdus 95.0 N/A N/A N/A N/A 1 -
GCF_011927695.1 s__Sphingomonas_K jejuensis 77.2056 141 793 d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Sphingomonadales;f__Sphingomonadaceae;g__Sphingomonas_K 95.0 N/A N/A N/A N/A 1 -
GCF_004681125.1 s__Polymorphobacter_A arshaanensis 77.1618 143 793 d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Sphingomonadales;f__Sphingomonadaceae;g__Polymorphobacter_A 95.0 N/A N/A N/A N/A 1 -
GCA_016793845.1 s__Sphingosinicella sp016793845 77.111 150 793 d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Sphingomonadales;f__Sphingomonadaceae;g__Sphingosinicella 95.0 N/A N/A N/A N/A 1 -
GCF_000419605.1 s__Sphingomonas phyllosphaerae_B 77.0242 114 793 d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Sphingomonadales;f__Sphingomonadaceae;g__Sphingomonas 95.0 N/A N/A N/A N/A 1 -
GCF_014199415.1 s__Sphingomonas endophytica 76.9893 159 793 d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Sphingomonadales;f__Sphingomonadaceae;g__Sphingomonas 95.0 98.00 98.00 0.92 0.92 2 -
GCF_004792695.1 s__Sphingomonas naasensis 76.9811 154 793 d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Sphingomonadales;f__Sphingomonadaceae;g__Sphingomonas 95.0 100.00 100.00 1.00 1.00 2 -
GCF_016820445.1 s__Sphingomonas sp016820445 76.9674 143 793 d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Sphingomonadales;f__Sphingomonadaceae;g__Sphingomonas 95.0 97.21 97.21 0.90 0.90 2 -
GCF_000585415.1 s__Sphingomicrobium jaspsi 76.9476 121 793 d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Sphingomonadales;f__Sphingomonadaceae;g__Sphingomicrobium 95.0 N/A N/A N/A N/A 1 -
GCF_000787715.1 s__Sphingomonas parapaucimobilis 76.8503 149 793 d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Sphingomonadales;f__Sphingomonadaceae;g__Sphingomonas 95.0 97.90 97.90 0.87 0.87 2 -
GCF_014194975.1 s__Sphingomonas sp014194975 76.8248 184 793 d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Sphingomonadales;f__Sphingomonadaceae;g__Sphingomonas 95.0 N/A N/A N/A N/A 1 -
GCF_002374855.1 s__Sphingomonas adhaesiva 76.7958 175 793 d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Sphingomonadales;f__Sphingomonadaceae;g__Sphingomonas 95.0 99.76 99.56 0.89 0.87 3 -
GCF_000092925.1 s__Starkeya novella 76.6229 134 793 d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhizobiales;f__Xanthobacteraceae;g__Starkeya 95.0 N/A N/A N/A N/A 1 -
GCA_006438735.1 s__Sphingomonas koreensis_A 76.6067 132 793 d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Sphingomonadales;f__Sphingomonadaceae;g__Sphingomonas 95.0 96.05 96.05 0.80 0.80 2 -
GCF_014349105.1 s__17J80-11 sp014349105 76.5744 142 793 d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Caulobacterales;f__Caulobacteraceae;g__17J80-11 95.0 N/A N/A N/A N/A 1 -
GCF_008630065.1 s__Blastochloris sulfoviridis 76.5666 138 793 d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhizobiales;f__Xanthobacteraceae;g__Blastochloris 95.0 N/A N/A N/A N/A 1 -
GCF_001421535.1 s__Sphingomonas sp001421535 76.5295 129 793 d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Sphingomonadales;f__Sphingomonadaceae;g__Sphingomonas 95.0 N/A N/A N/A N/A 1 -
GCF_900100155.1 s__Ancylobacter rudongensis 76.5282 112 793 d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhizobiales;f__Xanthobacteraceae;g__Ancylobacter 95.0 96.36 96.36 0.88 0.88 2 -
GCF_000429865.1 s__Cucumibacter marinus 76.4716 53 793 d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhizobiales;f__Devosiaceae;g__Cucumibacter 95.0 N/A N/A N/A N/A 1 -
GCF_016629525.1 s__Kaistia sp016629525 76.2812 123 793 d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhizobiales;f__Kaistiaceae;g__Kaistia 95.0 N/A N/A N/A N/A 1 -
GCA_016780605.1 s__Reyranella sp016780605 76.1882 73 793 d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Reyranellales;f__Reyranellaceae;g__Reyranella 95.0 N/A N/A N/A N/A 1 -
GCF_018449475.1 s__Angulomicrobium sp018449475 76.163 129 793 d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhizobiales;f__Xanthobacteraceae;g__Angulomicrobium 95.0 N/A N/A N/A N/A 1 -
GCA_017304475.1 s__Phreatobacter sp017304475 76.151 149 793 d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhizobiales;f__Phreatobacteraceae;g__Phreatobacter 95.0 N/A N/A N/A N/A 1 -
GCF_002073975.1 s__Rhodovulum sp002073975 76.032 79 793 d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhodobacterales;f__Rhodobacteraceae;g__Rhodovulum 95.0 N/A N/A N/A N/A 1 -
--------------------------------------------------------------------------------
[2023-03-19 04:57:09,912] [INFO] GTDB search result was written to OceanDNA-b31471/result_gtdb.tsv
[2023-03-19 04:57:09,912] [INFO] ===== GTDB Search completed =====
[2023-03-19 04:57:09,915] [INFO] DFAST_QC result json was written to OceanDNA-b31471/dqc_result.json
[2023-03-19 04:57:09,915] [INFO] DFAST_QC completed!
[2023-03-19 04:57:09,915] [INFO] Total running time: 0h1m30s