[2023-03-18 09:58:36,586] [INFO] DFAST_QC pipeline started.
[2023-03-18 09:58:36,586] [INFO] DFAST_QC version: 0.5.7
[2023-03-18 09:58:36,586] [INFO] DQC Reference Directory: /var/lib/cwl/stgcdfbd02c-eaaf-46e7-96ce-77b1536e01aa/dqc_reference
[2023-03-18 09:58:38,403] [INFO] ===== Start taxonomy check using ANI =====
[2023-03-18 09:58:38,404] [INFO] Task started: Prodigal
[2023-03-18 09:58:38,404] [INFO] Running command: cat /var/lib/cwl/stgd35a4e38-a5aa-407c-a381-dd72ab2c78f9/OceanDNA-b32019.fa | prodigal -d OceanDNA-b32019/cds.fna -a OceanDNA-b32019/protein.faa -g 11 -q > /dev/null
[2023-03-18 09:58:54,837] [INFO] Task succeeded: Prodigal
[2023-03-18 09:58:54,837] [INFO] Task started: HMMsearch
[2023-03-18 09:58:54,837] [INFO] Running command: hmmsearch --tblout OceanDNA-b32019/hmmer_result.tsv -E 1E-50 /var/lib/cwl/stgcdfbd02c-eaaf-46e7-96ce-77b1536e01aa/dqc_reference/reference_markers.hmm OceanDNA-b32019/protein.faa > /dev/null
[2023-03-18 09:58:55,090] [INFO] Task succeeded: HMMsearch
[2023-03-18 09:58:55,091] [INFO] Found 6/6 markers.
[2023-03-18 09:58:55,110] [INFO] Query marker FASTA was written to OceanDNA-b32019/markers.fasta
[2023-03-18 09:58:55,110] [INFO] Task started: Blastn
[2023-03-18 09:58:55,110] [INFO] Running command: blastn -query OceanDNA-b32019/markers.fasta -db /var/lib/cwl/stgcdfbd02c-eaaf-46e7-96ce-77b1536e01aa/dqc_reference/reference_markers.fasta -out OceanDNA-b32019/blast.markers.tsv -outfmt 6 -max_hsps 1 -num_alignments 5
[2023-03-18 09:58:55,896] [INFO] Task succeeded: Blastn
[2023-03-18 09:58:55,897] [INFO] Selected 31 target genomes.
[2023-03-18 09:58:55,897] [INFO] Target genome list was writen to OceanDNA-b32019/target_genomes.txt
[2023-03-18 09:58:55,923] [INFO] Task started: fastANI
[2023-03-18 09:58:55,923] [INFO] Running command: fastANI --query /var/lib/cwl/stgd35a4e38-a5aa-407c-a381-dd72ab2c78f9/OceanDNA-b32019.fa --refList OceanDNA-b32019/target_genomes.txt --output OceanDNA-b32019/fastani_result.tsv --threads 1
[2023-03-18 09:59:14,423] [INFO] Task succeeded: fastANI
[2023-03-18 09:59:14,423] [INFO] Loading species specific ANI threshold from /var/lib/cwl/stgcdfbd02c-eaaf-46e7-96ce-77b1536e01aa/dqc_reference/prokaryote_ANI_species_specific_threshold.txt
[2023-03-18 09:59:14,423] [WARNING] Species-specific ANI threshold file not found. Will use the default threshold for all species. [/var/lib/cwl/stgcdfbd02c-eaaf-46e7-96ce-77b1536e01aa/dqc_reference/prokaryote_ANI_species_specific_threshold.txt]
[2023-03-18 09:59:14,440] [INFO] Found 31 fastANI hits (0 hits with ANI > threshold)
[2023-03-18 09:59:14,440] [INFO] The taxonomy check result is classified as 'below_threshold'.
[2023-03-18 09:59:14,440] [INFO] DFAST Taxonomy check final result
--------------------------------------------------------------------------------
organism_name	strain	accession	taxid	species_taxid	relation_to_type	validated	ani	matched_fragments	total_fragments	ani_threshold	status
Alteraurantiacibacter aestuarii	strain=JCM 16339	GCA_009827405.1	650004	650004	type	True	78.7806	202	867	95	below_threshold
Croceibacterium atlanticum	strain=26DY36	GCA_001008165.2	1267766	1267766	type	True	78.6469	310	867	95	below_threshold
Caenibius tardaugens	strain=NBRC 16725	GCA_003860345.1	169176	169176	type	True	78.639	327	867	95	below_threshold
Croceibacterium atlanticum	strain=DSM 100738	GCA_014199315.1	1267766	1267766	type	True	78.636	312	867	95	below_threshold
Caenibius tardaugens	strain=NBRC 16725	GCA_000466945.1	169176	169176	type	True	78.59	326	867	95	below_threshold
Novosphingobium arvoryzae	strain=KCTC 32422	GCA_014652615.1	1256514	1256514	type	True	78.5628	334	867	95	below_threshold
Novosphingobium percolationis	strain=c1	GCA_020179425.1	2871811	2871811	type	True	78.5	274	867	95	below_threshold
Novosphingobium lentum	strain=NBRC 107847	GCA_001590965.1	145287	145287	type	True	78.4162	329	867	95	below_threshold
Novosphingobium colocasiae	strain=KCTC 32255	GCA_014652555.1	1256513	1256513	type	True	78.4006	313	867	95	below_threshold
Novosphingobium huizhouense	strain=c7	GCA_020179475.1	2866625	2866625	type	True	78.3648	264	867	95	below_threshold
Qipengyuania pacifica	strain=NZ-96	GCA_019857205.1	2860199	2860199	type	True	78.3304	200	867	95	below_threshold
Qipengyuania qiaonensis	strain=6D47A	GCA_019711515.1	2867240	2867240	type	True	78.2251	216	867	95	below_threshold
Erythrobacter donghaensis	strain=DSM 16220	GCA_002155425.1	267135	267135	suspected-type	True	78.1913	245	867	95	below_threshold
Novosphingobium kunmingense	strain=CGMCC 1.12274	GCA_002813245.1	1211806	1211806	type	True	78.1553	273	867	95	below_threshold
Croceibacterium ferulae	strain=SX2RGS8	GCA_003660445.1	1854641	1854641	type	True	78.1202	254	867	95	below_threshold
Pelagerythrobacter marinus	strain=H32	GCA_009827515.1	538382	538382	type	True	78.1066	283	867	95	below_threshold
Novosphingobium ginsenosidimutans	strain=FW-6	GCA_007954425.1	1176536	1176536	type	True	78.0751	270	867	95	below_threshold
Aurantiacibacter rhizosphaerae	strain=GH3-10	GCA_009807005.1	2691582	2691582	type	True	78.044	243	867	95	below_threshold
Aurantiacibacter suaedae	strain=GH3-15	GCA_005434915.1	2545755	2545755	type	True	78.0135	228	867	95	below_threshold
Novosphingobium aromaticivorans	strain=DSM 12444	GCA_000013325.1	48935	48935	type	True	77.9526	241	867	95	below_threshold
Erythrobacter ramosus	strain=DSM 8510	GCA_014195675.1	35811	35811	type	True	77.8893	226	867	95	below_threshold
Erythrobacter ramosus	strain=JCM 10282	GCA_009828055.1	35811	35811	type	True	77.8763	224	867	95	below_threshold
Qipengyuania gaetbuli	strain=DSM 16225	GCA_009827315.1	266952	266952	type	True	77.8715	230	867	95	below_threshold
Novosphingobium taihuense	strain=DSM 17507	GCA_014199635.1	260085	260085	type	True	77.8664	211	867	95	below_threshold
Novosphingobium taihuense	strain=CGMCC 1.3432	GCA_007830315.1	260085	260085	type	True	77.8525	212	867	95	below_threshold
Croceicoccus hydrothermalis	strain=JLT1	GCA_022378335.1	2867964	2867964	type	True	77.8518	235	867	95	below_threshold
Pelagerythrobacter rhizovicinus	strain=AY-3R	GCA_004135625.1	2268576	2268576	type	True	77.8498	235	867	95	below_threshold
Qipengyuania aerophila	strain=GH25	GCA_019711555.1	2867242	2867242	type	True	77.761	192	867	95	below_threshold
Pelagerythrobacter aerophilus	strain=Ery1	GCA_003581645.1	2306995	2306995	type	True	77.7401	237	867	95	below_threshold
Tsuneonella rigui	strain=KCTC 42620	GCA_003958625.1	1708790	1708790	type	True	77.7322	225	867	95	below_threshold
Novosphingobium marinum	strain=CGMCC 1.12918	GCA_014640055.1	1514948	1514948	type	True	77.6291	220	867	95	below_threshold
--------------------------------------------------------------------------------
[2023-03-18 09:59:14,440] [INFO] DFAST Taxonomy check result was written to OceanDNA-b32019/tc_result.tsv
[2023-03-18 09:59:14,440] [INFO] ===== Taxonomy check completed =====
[2023-03-18 09:59:14,441] [INFO] ===== Start completeness check using CheckM =====
[2023-03-18 09:59:14,441] [INFO] Setting CHECKM_DATA_PATH to /var/lib/cwl/stgcdfbd02c-eaaf-46e7-96ce-77b1536e01aa/dqc_reference/checkm_data
[2023-03-18 09:59:14,441] [INFO] Selected 'Prokaryote' markers (life, taxid=0) for CheckM
[2023-03-18 09:59:14,446] [INFO] Task started: CheckM
[2023-03-18 09:59:14,446] [INFO] Running command: checkm taxonomy_wf --tab_table -f OceanDNA-b32019/cc_result.tsv -t 1 life "Prokaryote" OceanDNA-b32019/checkm_input OceanDNA-b32019/checkm_result
[2023-03-18 09:59:57,295] [INFO] Task succeeded: CheckM
[2023-03-18 09:59:57,296] [INFO] Completeness check finished.
--------------------------------------------------------------------------------
Completeness: 94.44%
Contamintation: 0.00%
Strain heterogeneity: 0.00%
--------------------------------------------------------------------------------
[2023-03-18 09:59:57,298] [INFO] ===== Completeness check finished =====
[2023-03-18 09:59:57,298] [INFO] ===== Start GTDB Search =====
[2023-03-18 09:59:57,298] [INFO] Query marker FASTA already exists. Will reuse it. (OceanDNA-b32019/markers.fasta)
[2023-03-18 09:59:57,299] [INFO] Task started: Blastn
[2023-03-18 09:59:57,300] [INFO] Running command: blastn -query OceanDNA-b32019/markers.fasta -db /var/lib/cwl/stgcdfbd02c-eaaf-46e7-96ce-77b1536e01aa/dqc_reference/reference_markers_gtdb.fasta -out OceanDNA-b32019/blast.markers.gtdb.tsv -outfmt 6 -max_hsps 1 -num_alignments 5
[2023-03-18 09:59:58,735] [INFO] Task succeeded: Blastn
[2023-03-18 09:59:58,736] [INFO] Selected 28 target genomes.
[2023-03-18 09:59:58,736] [INFO] Target genome list was writen to OceanDNA-b32019/target_genomes_gtdb.txt
[2023-03-18 09:59:58,837] [INFO] Task started: fastANI
[2023-03-18 09:59:58,837] [INFO] Running command: fastANI --query /var/lib/cwl/stgd35a4e38-a5aa-407c-a381-dd72ab2c78f9/OceanDNA-b32019.fa --refList OceanDNA-b32019/target_genomes_gtdb.txt --output OceanDNA-b32019/fastani_result_gtdb.tsv --threads 1
[2023-03-18 10:00:14,566] [INFO] Task succeeded: fastANI
[2023-03-18 10:00:14,581] [INFO] Found 28 fastANI hits (0 hits with ANI > circumscription radius)
[2023-03-18 10:00:14,581] [INFO] GTDB search result
--------------------------------------------------------------------------------
accession	gtdb_species	ani	matched_fragments	total_fragments	gtdb_taxonomy	ani_circumscription_radius	mean_intra_species_ani	min_intra_species_ani	mean_intra_species_af	min_intra_species_af	num_clustered_genomes	status
GCF_001305965.1	s__Caenibius sp001305965	79.085	348	867	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Sphingomonadales;f__Sphingomonadaceae;g__Caenibius	95.0	N/A	N/A	N/A	N/A	1	-
GCA_014763545.1	s__JACXVD01 sp014763545	78.9672	336	867	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Sphingomonadales;f__Sphingomonadaceae;g__JACXVD01	95.0	99.92	99.92	0.96	0.96	2	-
GCF_009827405.1	s__Alteraurantiacibacter aestuarii	78.7806	202	867	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Sphingomonadales;f__Sphingomonadaceae;g__Alteraurantiacibacter	95.0	N/A	N/A	N/A	N/A	1	-
GCA_017308255.1	s__Croceibacterium sp001897135	78.7325	276	867	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Sphingomonadales;f__Sphingomonadaceae;g__Croceibacterium	95.0	99.96	99.96	0.98	0.98	2	-
GCA_017302615.1	s__Novosphingobium sp017302615	78.6469	297	867	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Sphingomonadales;f__Sphingomonadaceae;g__Novosphingobium	95.0	N/A	N/A	N/A	N/A	1	-
GCF_003860345.1	s__Caenibius tardaugens	78.639	327	867	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Sphingomonadales;f__Sphingomonadaceae;g__Caenibius	95.0	100.00	100.00	1.00	1.00	2	-
GCF_001008165.2	s__Croceibacterium atlanticum	78.6208	312	867	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Sphingomonadales;f__Sphingomonadaceae;g__Croceibacterium	95.0	100.00	100.00	1.00	1.00	2	-
GCF_016461955.1	s__QFOP01 sp003248785	78.5774	261	867	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Sphingomonadales;f__Sphingomonadaceae;g__QFOP01	95.0	98.23	98.19	0.91	0.89	3	-
GCF_014652555.1	s__Novosphingobium colocasiae	78.413	312	867	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Sphingomonadales;f__Sphingomonadaceae;g__Novosphingobium	95.0	N/A	N/A	N/A	N/A	1	-
GCF_002198665.1	s__Novosphingobium sp002198665	78.2952	287	867	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Sphingomonadales;f__Sphingomonadaceae;g__Novosphingobium	95.0	N/A	N/A	N/A	N/A	1	-
GCF_002155425.1	s__Erythrobacter donghaensis	78.1913	245	867	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Sphingomonadales;f__Sphingomonadaceae;g__Erythrobacter	95.0	N/A	N/A	N/A	N/A	1	-
GCF_003454795.1	s__Novosphingobium sp003454795	78.1685	236	867	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Sphingomonadales;f__Sphingomonadaceae;g__Novosphingobium	95.0	98.02	97.88	0.92	0.91	3	-
GCF_002813245.1	s__Novosphingobium kunmingense	78.1553	273	867	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Sphingomonadales;f__Sphingomonadaceae;g__Novosphingobium	95.0	N/A	N/A	N/A	N/A	1	-
GCF_003660445.1	s__Croceibacterium ferulae	78.1215	255	867	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Sphingomonadales;f__Sphingomonadaceae;g__Croceibacterium	95.0	N/A	N/A	N/A	N/A	1	-
GCF_001028625.1	s__Pelagerythrobacter marensis	78.1211	276	867	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Sphingomonadales;f__Sphingomonadaceae;g__Pelagerythrobacter	95.0	100.00	100.00	1.00	1.00	2	-
GCA_002440635.1	s__Novosphingobium sp002440635	78.1072	258	867	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Sphingomonadales;f__Sphingomonadaceae;g__Novosphingobium	95.0	N/A	N/A	N/A	N/A	1	-
GCF_009827515.1	s__Pelagerythrobacter marinus	78.1066	283	867	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Sphingomonadales;f__Sphingomonadaceae;g__Pelagerythrobacter	95.0	N/A	N/A	N/A	N/A	1	-
GCA_002337385.1	s__Alteraurantiacibacter sp002337385	78.1039	231	867	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Sphingomonadales;f__Sphingomonadaceae;g__Alteraurantiacibacter	95.0	N/A	N/A	N/A	N/A	1	-
GCF_002269345.1	s__Tsuneonella mangrovi	78.0375	226	867	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Sphingomonadales;f__Sphingomonadaceae;g__Tsuneonella	95.0	N/A	N/A	N/A	N/A	1	-
GCF_005434915.1	s__Alteraurantiacibacter suaedae	78.0257	226	867	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Sphingomonadales;f__Sphingomonadaceae;g__Alteraurantiacibacter	95.0	N/A	N/A	N/A	N/A	1	-
GCF_003958635.1	s__Altericroceibacterium_A xinjiangense	77.8937	224	867	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Sphingomonadales;f__Sphingomonadaceae;g__Altericroceibacterium_A	95.0	N/A	N/A	N/A	N/A	1	-
GCF_009828055.1	s__Erythrobacter ramosus	77.8915	223	867	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Sphingomonadales;f__Sphingomonadaceae;g__Erythrobacter	95.0	99.99	99.99	1.00	1.00	2	-
GCA_903832325.1	s__Novosphingobium sp903832325	77.8622	281	867	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Sphingomonadales;f__Sphingomonadaceae;g__Novosphingobium	95.0	99.98	99.94	0.97	0.96	4	-
GCF_014644315.1	s__Tsuneonella deserti	77.8477	184	867	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Sphingomonadales;f__Sphingomonadaceae;g__Tsuneonella	95.0	N/A	N/A	N/A	N/A	1	-
GCF_004135625.1	s__Pelagerythrobacter rhizovicinus	77.8353	236	867	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Sphingomonadales;f__Sphingomonadaceae;g__Pelagerythrobacter	95.0	N/A	N/A	N/A	N/A	1	-
GCA_013141325.1	s__Novosphingobium sp013141325	77.7864	226	867	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Sphingomonadales;f__Sphingomonadaceae;g__Novosphingobium	95.0	N/A	N/A	N/A	N/A	1	-
GCF_009827495.1	s__Tsuneonella aeria	77.726	232	867	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Sphingomonadales;f__Sphingomonadaceae;g__Tsuneonella	95.0	N/A	N/A	N/A	N/A	1	-
GCA_009885425.1	s__Novosphingobium sp009885425	76.98	139	867	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Sphingomonadales;f__Sphingomonadaceae;g__Novosphingobium	95.0	N/A	N/A	N/A	N/A	1	-
--------------------------------------------------------------------------------
[2023-03-18 10:00:14,581] [INFO] GTDB search result was written to OceanDNA-b32019/result_gtdb.tsv
[2023-03-18 10:00:14,582] [INFO] ===== GTDB Search completed =====
[2023-03-18 10:00:14,584] [INFO] DFAST_QC result json was written to OceanDNA-b32019/dqc_result.json
[2023-03-18 10:00:14,585] [INFO] DFAST_QC completed!
[2023-03-18 10:00:14,585] [INFO] Total running time: 0h1m38s