[2023-03-14 15:02:46,517] [INFO] DFAST_QC pipeline started.
[2023-03-14 15:02:46,517] [INFO] DFAST_QC version: 0.5.7
[2023-03-14 15:02:46,517] [INFO] DQC Reference Directory: /var/lib/cwl/stgb590f553-48c7-42f9-81b5-211f2d2b7b94/dqc_reference
[2023-03-14 15:02:47,656] [INFO] ===== Start taxonomy check using ANI =====
[2023-03-14 15:02:47,657] [INFO] Task started: Prodigal
[2023-03-14 15:02:47,657] [INFO] Running command: cat /var/lib/cwl/stg8023d1fd-1ec7-445a-80b1-fb144344cc04/OceanDNA-b7449.fa | prodigal -d OceanDNA-b7449/cds.fna -a OceanDNA-b7449/protein.faa -g 11 -q > /dev/null
[2023-03-14 15:03:22,327] [INFO] Task succeeded: Prodigal
[2023-03-14 15:03:22,327] [INFO] Task started: HMMsearch
[2023-03-14 15:03:22,327] [INFO] Running command: hmmsearch --tblout OceanDNA-b7449/hmmer_result.tsv -E 1E-50 /var/lib/cwl/stgb590f553-48c7-42f9-81b5-211f2d2b7b94/dqc_reference/reference_markers.hmm OceanDNA-b7449/protein.faa > /dev/null
[2023-03-14 15:03:22,546] [INFO] Task succeeded: HMMsearch
[2023-03-14 15:03:22,547] [INFO] Found 6/6 markers.
[2023-03-14 15:03:22,573] [INFO] Query marker FASTA was written to OceanDNA-b7449/markers.fasta
[2023-03-14 15:03:22,574] [INFO] Task started: Blastn
[2023-03-14 15:03:22,574] [INFO] Running command: blastn -query OceanDNA-b7449/markers.fasta -db /var/lib/cwl/stgb590f553-48c7-42f9-81b5-211f2d2b7b94/dqc_reference/reference_markers.fasta -out OceanDNA-b7449/blast.markers.tsv -outfmt 6 -max_hsps 1 -num_alignments 5
[2023-03-14 15:03:23,118] [INFO] Task succeeded: Blastn
[2023-03-14 15:03:23,118] [INFO] Selected 22 target genomes.
[2023-03-14 15:03:23,119] [INFO] Target genome list was writen to OceanDNA-b7449/target_genomes.txt
[2023-03-14 15:03:23,128] [INFO] Task started: fastANI
[2023-03-14 15:03:23,128] [INFO] Running command: fastANI --query /var/lib/cwl/stg8023d1fd-1ec7-445a-80b1-fb144344cc04/OceanDNA-b7449.fa --refList OceanDNA-b7449/target_genomes.txt --output OceanDNA-b7449/fastani_result.tsv --threads 1
[2023-03-14 15:03:40,896] [INFO] Task succeeded: fastANI
[2023-03-14 15:03:40,897] [INFO] Loading species specific ANI threshold from /var/lib/cwl/stgb590f553-48c7-42f9-81b5-211f2d2b7b94/dqc_reference/prokaryote_ANI_species_specific_threshold.txt
[2023-03-14 15:03:40,897] [WARNING] Species-specific ANI threshold file not found. Will use the default threshold for all species. [/var/lib/cwl/stgb590f553-48c7-42f9-81b5-211f2d2b7b94/dqc_reference/prokaryote_ANI_species_specific_threshold.txt]
[2023-03-14 15:03:40,908] [INFO] Found 18 fastANI hits (0 hits with ANI > threshold)
[2023-03-14 15:03:40,909] [INFO] The taxonomy check result is classified as 'below_threshold'.
[2023-03-14 15:03:40,909] [INFO] DFAST Taxonomy check final result
--------------------------------------------------------------------------------
organism_name	strain	accession	taxid	species_taxid	relation_to_type	validated	ani	matched_fragments	total_fragments	ani_threshold	status
Kriegella aquimaris	strain=DSM 19886	GCA_900103215.1	192904	192904	type	True	84.0397	1157	1565	95	below_threshold
Zobellia roscoffensis	strain=Asnod1-F08	GCA_015330165.1	2779508	2779508	type	True	77.6308	182	1565	95	below_threshold
Zobellia amurskyensis	strain=KMM 3526	GCA_009725985.1	248905	248905	type	True	77.0952	175	1565	95	below_threshold
Maribacter polysiphoniae	strain=DSM 23514	GCA_003148665.1	429344	429344	type	True	77.0566	140	1565	95	below_threshold
Maribacter polysiphoniae	strain=KCTC 22021	GCA_014673435.1	429344	429344	type	True	77.0029	139	1565	95	below_threshold
Maribacter litoralis	strain=SDRB-Phe2	GCA_003075045.1	2059726	2059726	type	True	76.5114	137	1565	95	below_threshold
Maribacter luteus	strain=RZ05	GCA_009674825.1	2594478	2594478	type	True	76.4505	144	1565	95	below_threshold
Eudoraea adriatica	strain=DSM 19308	GCA_000382125.1	446681	446681	type	True	76.3938	74	1565	95	below_threshold
Maribacter orientalis	strain=DSM 16471	GCA_900109345.1	228957	228957	type	True	76.3649	128	1565	95	below_threshold
Maribacter arenosus	strain=CAU 1321	GCA_014610845.1	1854708	1854708	type	True	76.3294	131	1565	95	below_threshold
Muricauda hadalis	strain=MT-229	GCA_007785775.2	2597517	2597517	type	True	76.1427	55	1565	95	below_threshold
Muricauda onchidii	strain=XY-359	GCA_004804315.1	2562684	2562684	type	True	76.1126	85	1565	95	below_threshold
Maribacter arcticus	strain=DSM 23546	GCA_900167935.1	561365	561365	type	True	76.0778	125	1565	95	below_threshold
Muricauda hymeniacidonis	strain=176CP4-71	GCA_004296335.1	2517819	2517819	type	True	76.0771	73	1565	95	below_threshold
Cellulophaga algicola	strain=DSM 14237	GCA_000186265.1	59600	59600	type	True	76.0658	89	1565	95	below_threshold
Poritiphilus flavus	strain=R33	GCA_009901585.1	2697053	2697053	type	True	75.991	67	1565	95	below_threshold
Muricauda oceani	strain=501str8	GCA_019457985.1	2698672	2698672	type	True	75.9462	78	1565	95	below_threshold
Muricauda profundi	strain=BC31-3-A3	GCA_017313275.1	2915620	2915620	type	True	75.9245	73	1565	95	below_threshold
--------------------------------------------------------------------------------
[2023-03-14 15:03:40,909] [INFO] DFAST Taxonomy check result was written to OceanDNA-b7449/tc_result.tsv
[2023-03-14 15:03:40,909] [INFO] ===== Taxonomy check completed =====
[2023-03-14 15:03:40,909] [INFO] ===== Start completeness check using CheckM =====
[2023-03-14 15:03:40,909] [INFO] Setting CHECKM_DATA_PATH to /var/lib/cwl/stgb590f553-48c7-42f9-81b5-211f2d2b7b94/dqc_reference/checkm_data
[2023-03-14 15:03:40,910] [INFO] Selected 'Prokaryote' markers (life, taxid=0) for CheckM
[2023-03-14 15:03:40,917] [INFO] Task started: CheckM
[2023-03-14 15:03:40,917] [INFO] Running command: checkm taxonomy_wf --tab_table -f OceanDNA-b7449/cc_result.tsv -t 1 life "Prokaryote" OceanDNA-b7449/checkm_input OceanDNA-b7449/checkm_result
[2023-03-14 15:05:26,858] [INFO] Task succeeded: CheckM
[2023-03-14 15:05:26,859] [INFO] Completeness check finished.
--------------------------------------------------------------------------------
Completeness: 79.17%
Contamintation: 0.00%
Strain heterogeneity: 0.00%
--------------------------------------------------------------------------------
[2023-03-14 15:05:26,862] [INFO] ===== Completeness check finished =====
[2023-03-14 15:05:26,862] [INFO] ===== Start GTDB Search =====
[2023-03-14 15:05:26,862] [INFO] Query marker FASTA already exists. Will reuse it. (OceanDNA-b7449/markers.fasta)
[2023-03-14 15:05:26,862] [INFO] Task started: Blastn
[2023-03-14 15:05:26,862] [INFO] Running command: blastn -query OceanDNA-b7449/markers.fasta -db /var/lib/cwl/stgb590f553-48c7-42f9-81b5-211f2d2b7b94/dqc_reference/reference_markers_gtdb.fasta -out OceanDNA-b7449/blast.markers.gtdb.tsv -outfmt 6 -max_hsps 1 -num_alignments 5
[2023-03-14 15:05:28,700] [INFO] Task succeeded: Blastn
[2023-03-14 15:05:28,700] [INFO] Selected 24 target genomes.
[2023-03-14 15:05:28,701] [INFO] Target genome list was writen to OceanDNA-b7449/target_genomes_gtdb.txt
[2023-03-14 15:05:28,821] [INFO] Task started: fastANI
[2023-03-14 15:05:28,821] [INFO] Running command: fastANI --query /var/lib/cwl/stg8023d1fd-1ec7-445a-80b1-fb144344cc04/OceanDNA-b7449.fa --refList OceanDNA-b7449/target_genomes_gtdb.txt --output OceanDNA-b7449/fastani_result_gtdb.tsv --threads 1
[2023-03-14 15:05:49,580] [INFO] Task succeeded: fastANI
[2023-03-14 15:05:49,593] [INFO] Found 22 fastANI hits (0 hits with ANI > circumscription radius)
[2023-03-14 15:05:49,593] [INFO] GTDB search result
--------------------------------------------------------------------------------
accession	gtdb_species	ani	matched_fragments	total_fragments	gtdb_taxonomy	ani_circumscription_radius	mean_intra_species_ani	min_intra_species_ani	mean_intra_species_af	min_intra_species_af	num_clustered_genomes	status
GCF_900103215.1	s__Kriegella aquimaris	84.0397	1157	1565	d__Bacteria;p__Bacteroidota;c__Bacteroidia;o__Flavobacteriales;f__Flavobacteriaceae;g__Kriegella	95.0	N/A	N/A	N/A	N/A	1	-
GCF_015330165.1	s__Zobellia sp015330165	77.6305	182	1565	d__Bacteria;p__Bacteroidota;c__Bacteroidia;o__Flavobacteriales;f__Flavobacteriaceae;g__Zobellia	95.0	98.41	98.41	0.92	0.92	2	-
GCF_009725985.1	s__Zobellia amurskyensis	77.0819	176	1565	d__Bacteria;p__Bacteroidota;c__Bacteroidia;o__Flavobacteriales;f__Flavobacteriaceae;g__Zobellia	95.0	98.14	98.14	0.91	0.91	2	-
GCF_003148665.1	s__Maribacter_A polysiphoniae	77.0566	140	1565	d__Bacteria;p__Bacteroidota;c__Bacteroidia;o__Flavobacteriales;f__Flavobacteriaceae;g__Maribacter_A	95.0	99.27	98.55	0.95	0.91	3	-
GCF_014596745.1	s__Maribacter_A sp014596745	76.6837	151	1565	d__Bacteria;p__Bacteroidota;c__Bacteroidia;o__Flavobacteriales;f__Flavobacteriaceae;g__Maribacter_A	95.0	N/A	N/A	N/A	N/A	1	-
GCF_008367235.1	s__Pareuzebyella sediminis	76.583	132	1565	d__Bacteria;p__Bacteroidota;c__Bacteroidia;o__Flavobacteriales;f__Flavobacteriaceae;g__Pareuzebyella	95.0	99.39	99.39	0.97	0.97	2	-
GCF_013402795.1	s__Costertonia aggregata	76.5255	134	1565	d__Bacteria;p__Bacteroidota;c__Bacteroidia;o__Flavobacteriales;f__Flavobacteriaceae;g__Costertonia	95.0	N/A	N/A	N/A	N/A	1	-
GCF_003075045.1	s__Maribacter litoralis	76.5114	137	1565	d__Bacteria;p__Bacteroidota;c__Bacteroidia;o__Flavobacteriales;f__Flavobacteriaceae;g__Maribacter	95.0	97.43	97.43	0.90	0.90	2	-
GCF_001913155.1	s__Maribacter hydrothermalis	76.4934	127	1565	d__Bacteria;p__Bacteroidota;c__Bacteroidia;o__Flavobacteriales;f__Flavobacteriaceae;g__Maribacter	95.0	100.00	100.00	1.00	1.00	2	-
GCF_012272855.1	s__Arenibacter sp012272855	76.4929	99	1565	d__Bacteria;p__Bacteroidota;c__Bacteroidia;o__Flavobacteriales;f__Flavobacteriaceae;g__Arenibacter	95.0	N/A	N/A	N/A	N/A	1	-
GCF_009674825.1	s__Maribacter_A luteus	76.4751	142	1565	d__Bacteria;p__Bacteroidota;c__Bacteroidia;o__Flavobacteriales;f__Flavobacteriaceae;g__Maribacter_A	95.0	N/A	N/A	N/A	N/A	1	-
GCF_014855515.1	s__Euzebyella marina	76.4453	141	1565	d__Bacteria;p__Bacteroidota;c__Bacteroidia;o__Flavobacteriales;f__Flavobacteriaceae;g__Euzebyella	95.0	97.49	97.10	0.92	0.88	10	-
GCF_000382125.1	s__Eudoraea adriatica	76.3938	74	1565	d__Bacteria;p__Bacteroidota;c__Bacteroidia;o__Flavobacteriales;f__Flavobacteriaceae;g__Eudoraea	95.0	N/A	N/A	N/A	N/A	1	-
GCF_900109345.1	s__Maribacter orientalis	76.3833	128	1565	d__Bacteria;p__Bacteroidota;c__Bacteroidia;o__Flavobacteriales;f__Flavobacteriaceae;g__Maribacter	95.0	N/A	N/A	N/A	N/A	1	-
GCF_014610845.1	s__Maribacter_A arenosus	76.3294	131	1565	d__Bacteria;p__Bacteroidota;c__Bacteroidia;o__Flavobacteriales;f__Flavobacteriaceae;g__Maribacter_A	95.0	N/A	N/A	N/A	N/A	1	-
GCA_003232435.1	s__Aurantibacter_A sp003232435	76.2765	87	1565	d__Bacteria;p__Bacteroidota;c__Bacteroidia;o__Flavobacteriales;f__Flavobacteriaceae;g__Aurantibacter_A	95.0	N/A	N/A	N/A	N/A	1	-
GCA_013001935.1	s__Maribacter_A sp013001935	76.2441	69	1565	d__Bacteria;p__Bacteroidota;c__Bacteroidia;o__Flavobacteriales;f__Flavobacteriaceae;g__Maribacter_A	95.0	96.76	96.72	0.66	0.66	3	-
GCF_004804315.1	s__Muricauda sp004804315	76.1126	85	1565	d__Bacteria;p__Bacteroidota;c__Bacteroidia;o__Flavobacteriales;f__Flavobacteriaceae;g__Muricauda	95.0	N/A	N/A	N/A	N/A	1	-
GCF_900167935.1	s__Maribacter arcticus	76.0778	125	1565	d__Bacteria;p__Bacteroidota;c__Bacteroidia;o__Flavobacteriales;f__Flavobacteriaceae;g__Maribacter	95.0	N/A	N/A	N/A	N/A	1	-
GCA_009901585.1	s__R33 sp009901585	75.991	67	1565	d__Bacteria;p__Bacteroidota;c__Bacteroidia;o__Flavobacteriales;f__Flavobacteriaceae;g__R33	95.0	N/A	N/A	N/A	N/A	1	-
GCF_018861125.1	s__Cellulophaga baltica_A	75.889	108	1565	d__Bacteria;p__Bacteroidota;c__Bacteroidia;o__Flavobacteriales;f__Flavobacteriaceae;g__Cellulophaga	95.0	N/A	N/A	N/A	N/A	1	-
GCA_013042315.1	s__Muriicola sp013042315	75.8281	51	1565	d__Bacteria;p__Bacteroidota;c__Bacteroidia;o__Flavobacteriales;f__Flavobacteriaceae;g__Muriicola	95.0	99.31	99.31	0.85	0.85	2	-
--------------------------------------------------------------------------------
[2023-03-14 15:05:49,593] [INFO] GTDB search result was written to OceanDNA-b7449/result_gtdb.tsv
[2023-03-14 15:05:49,594] [INFO] ===== GTDB Search completed =====
[2023-03-14 15:05:49,596] [INFO] DFAST_QC result json was written to OceanDNA-b7449/dqc_result.json
[2023-03-14 15:05:49,596] [INFO] DFAST_QC completed!
[2023-03-14 15:05:49,596] [INFO] Total running time: 0h3m3s