[2023-03-19 00:48:24,537] [INFO] DFAST_QC pipeline started.
[2023-03-19 00:48:24,537] [INFO] DFAST_QC version: 0.5.7
[2023-03-19 00:48:24,537] [INFO] DQC Reference Directory: /var/lib/cwl/stgc952af98-82d0-4a37-b6d6-2cef26ac3095/dqc_reference
[2023-03-19 00:48:26,000] [INFO] ===== Start taxonomy check using ANI =====
[2023-03-19 00:48:26,001] [INFO] Task started: Prodigal
[2023-03-19 00:48:26,001] [INFO] Running command: cat /var/lib/cwl/stg19b455ea-a57d-43ad-934c-00460baedca4/OceanDNA-b9724.fa | prodigal -d OceanDNA-b9724/cds.fna -a OceanDNA-b9724/protein.faa -g 11 -q > /dev/null
[2023-03-19 00:48:35,526] [INFO] Task succeeded: Prodigal
[2023-03-19 00:48:35,527] [INFO] Task started: HMMsearch
[2023-03-19 00:48:35,527] [INFO] Running command: hmmsearch --tblout OceanDNA-b9724/hmmer_result.tsv -E 1E-50 /var/lib/cwl/stgc952af98-82d0-4a37-b6d6-2cef26ac3095/dqc_reference/reference_markers.hmm OceanDNA-b9724/protein.faa > /dev/null
[2023-03-19 00:48:35,701] [INFO] Task succeeded: HMMsearch
[2023-03-19 00:48:35,702] [WARNING] Found 5/6 markers. [/var/lib/cwl/stg19b455ea-a57d-43ad-934c-00460baedca4/OceanDNA-b9724.fa]
[2023-03-19 00:48:35,719] [INFO] Query marker FASTA was written to OceanDNA-b9724/markers.fasta
[2023-03-19 00:48:35,720] [INFO] Task started: Blastn
[2023-03-19 00:48:35,721] [INFO] Running command: blastn -query OceanDNA-b9724/markers.fasta -db /var/lib/cwl/stgc952af98-82d0-4a37-b6d6-2cef26ac3095/dqc_reference/reference_markers.fasta -out OceanDNA-b9724/blast.markers.tsv -outfmt 6 -max_hsps 1 -num_alignments 5
[2023-03-19 00:48:36,259] [INFO] Task succeeded: Blastn
[2023-03-19 00:48:36,260] [INFO] Selected 22 target genomes.
[2023-03-19 00:48:36,260] [INFO] Target genome list was writen to OceanDNA-b9724/target_genomes.txt
[2023-03-19 00:48:36,277] [INFO] Task started: fastANI
[2023-03-19 00:48:36,277] [INFO] Running command: fastANI --query /var/lib/cwl/stg19b455ea-a57d-43ad-934c-00460baedca4/OceanDNA-b9724.fa --refList OceanDNA-b9724/target_genomes.txt --output OceanDNA-b9724/fastani_result.tsv --threads 1
[2023-03-19 00:48:48,370] [INFO] Task succeeded: fastANI
[2023-03-19 00:48:48,370] [INFO] Loading species specific ANI threshold from /var/lib/cwl/stgc952af98-82d0-4a37-b6d6-2cef26ac3095/dqc_reference/prokaryote_ANI_species_specific_threshold.txt
[2023-03-19 00:48:48,371] [WARNING] Species-specific ANI threshold file not found. Will use the default threshold for all species. [/var/lib/cwl/stgc952af98-82d0-4a37-b6d6-2cef26ac3095/dqc_reference/prokaryote_ANI_species_specific_threshold.txt]
[2023-03-19 00:48:48,383] [INFO] Found 21 fastANI hits (1 hits with ANI > threshold)
[2023-03-19 00:48:48,383] [INFO] The taxonomy check result is classified as 'conclusive'.
[2023-03-19 00:48:48,383] [INFO] DFAST Taxonomy check final result
--------------------------------------------------------------------------------
organism_name	strain	accession	taxid	species_taxid	relation_to_type	validated	ani	matched_fragments	total_fragments	ani_threshold	status
Aurantibacter aestuarii	strain=KCTC 32269	GCA_003008425.1	1266046	1266046	type	True	96.7497	408	444	95	conclusive
Mesoflavibacter zeaxanthinifaciens	strain=DSM 18436	GCA_000422365.1	393060	393060	type	True	77.979	131	444	95	below_threshold
Olleya aquimaris	strain=DSM 24464	GCA_003259525.1	639310	639310	type	True	77.9281	133	444	95	below_threshold
Algibacter pacificus	strain=H164	GCA_008033385.1	2599389	2599389	type	True	77.8078	95	444	95	below_threshold
Algibacter marinivivus	strain=ZY111	GCA_003143755.1	2100723	2100723	type	True	77.6948	103	444	95	below_threshold
Lacinutrix jangbogonensis	strain=PAMC 27137	GCA_000797445.1	1469557	1469557	type	True	77.6884	115	444	95	below_threshold
Lacinutrix mariniflava	strain=AKS432	GCA_001418015.1	342955	342955	type	True	77.6866	138	444	95	below_threshold
Winogradskyella eckloniae	strain=EC29	GCA_013249045.1	1089306	1089306	type	True	77.6756	91	444	95	below_threshold
Olleya marilimosa	strain=CAM030	GCA_000518485.1	272164	272164	type	True	77.6717	134	444	95	below_threshold
Algibacter amylolyticus	strain=DSM 29199	GCA_014202225.1	1608400	1608400	type	True	77.5756	107	444	95	below_threshold
Lacinutrix himadriensis	strain=E4-9a	GCA_001418105.1	641549	641549	type	True	77.5747	132	444	95	below_threshold
Algibacter amylolyticus	strain=RU-4-M-4	GCA_008630605.1	1608400	1608400	type	True	77.5656	108	444	95	below_threshold
Algibacter amylolyticus	strain=RU-4-M-4	GCA_007559325.1	1608400	1608400	type	True	77.5656	108	444	95	below_threshold
Mesoflavibacter profundi	strain=YC1039	GCA_014764305.1	2708110	2708110	type	True	77.5202	154	444	95	below_threshold
Winogradskyella epiphytica	strain=KCTC 12220	GCA_014651315.1	262005	262005	type	True	77.5147	75	444	95	below_threshold
Winogradskyella epiphytica	strain=CECT 7945	GCA_003217215.1	262005	262005	type	True	77.4284	75	444	95	below_threshold
Pontimicrobium aquaticum	strain=CAU 1491	GCA_005047595.1	2565367	2565367	type	True	77.3678	78	444	95	below_threshold
Arenitalea lutea	strain=CGMCC 1.12213	GCA_900141715.1	1178825	1178825	type	True	77.307	105	444	95	below_threshold
Bizionia algoritergicola	strain=APA-1	GCA_008086165.1	291187	291187	type	True	76.9218	101	444	95	below_threshold
Aestuariivivens marinum	strain=MT3-5-12	GCA_022662175.1	2913555	2913555	type	True	76.9211	56	444	95	below_threshold
Hyunsoonleella ulvae	strain=HU1-3	GCA_016827605.1	2799948	2799948	type	True	76.8689	71	444	95	below_threshold
--------------------------------------------------------------------------------
[2023-03-19 00:48:48,384] [INFO] DFAST Taxonomy check result was written to OceanDNA-b9724/tc_result.tsv
[2023-03-19 00:48:48,384] [INFO] ===== Taxonomy check completed =====
[2023-03-19 00:48:48,384] [INFO] ===== Start completeness check using CheckM =====
[2023-03-19 00:48:48,384] [INFO] Setting CHECKM_DATA_PATH to /var/lib/cwl/stgc952af98-82d0-4a37-b6d6-2cef26ac3095/dqc_reference/checkm_data
[2023-03-19 00:48:48,385] [INFO] Selected 'Prokaryote' markers (life, taxid=0) for CheckM
[2023-03-19 00:48:48,389] [INFO] Task started: CheckM
[2023-03-19 00:48:48,389] [INFO] Running command: checkm taxonomy_wf --tab_table -f OceanDNA-b9724/cc_result.tsv -t 1 life "Prokaryote" OceanDNA-b9724/checkm_input OceanDNA-b9724/checkm_result
[2023-03-19 00:49:17,013] [INFO] Task succeeded: CheckM
[2023-03-19 00:49:17,013] [INFO] Completeness check finished.
--------------------------------------------------------------------------------
Completeness: 95.83%
Contamintation: 0.52%
Strain heterogeneity: 0.00%
--------------------------------------------------------------------------------
[2023-03-19 00:49:17,092] [INFO] ===== Completeness check finished =====
[2023-03-19 00:49:17,092] [INFO] ===== Start GTDB Search =====
[2023-03-19 00:49:17,092] [INFO] Query marker FASTA already exists. Will reuse it. (OceanDNA-b9724/markers.fasta)
[2023-03-19 00:49:17,096] [INFO] Task started: Blastn
[2023-03-19 00:49:17,096] [INFO] Running command: blastn -query OceanDNA-b9724/markers.fasta -db /var/lib/cwl/stgc952af98-82d0-4a37-b6d6-2cef26ac3095/dqc_reference/reference_markers_gtdb.fasta -out OceanDNA-b9724/blast.markers.gtdb.tsv -outfmt 6 -max_hsps 1 -num_alignments 5
[2023-03-19 00:49:17,854] [INFO] Task succeeded: Blastn
[2023-03-19 00:49:17,856] [INFO] Selected 21 target genomes.
[2023-03-19 00:49:17,856] [INFO] Target genome list was writen to OceanDNA-b9724/target_genomes_gtdb.txt
[2023-03-19 00:49:17,871] [INFO] Task started: fastANI
[2023-03-19 00:49:17,871] [INFO] Running command: fastANI --query /var/lib/cwl/stg19b455ea-a57d-43ad-934c-00460baedca4/OceanDNA-b9724.fa --refList OceanDNA-b9724/target_genomes_gtdb.txt --output OceanDNA-b9724/fastani_result_gtdb.tsv --threads 1
[2023-03-19 00:49:29,876] [INFO] Task succeeded: fastANI
[2023-03-19 00:49:29,888] [INFO] Found 21 fastANI hits (1 hits with ANI > circumscription radius)
[2023-03-19 00:49:29,888] [INFO] GTDB search result
--------------------------------------------------------------------------------
accession	gtdb_species	ani	matched_fragments	total_fragments	gtdb_taxonomy	ani_circumscription_radius	mean_intra_species_ani	min_intra_species_ani	mean_intra_species_af	min_intra_species_af	num_clustered_genomes	status
GCF_003008425.1	s__Aurantibacter aestuarii	96.7497	408	444	d__Bacteria;p__Bacteroidota;c__Bacteroidia;o__Flavobacteriales;f__Flavobacteriaceae;g__Aurantibacter	95.0	N/A	N/A	N/A	N/A	1	conclusive
GCF_003663945.1	s__Lacinutrix venerupis	77.9996	145	444	d__Bacteria;p__Bacteroidota;c__Bacteroidia;o__Flavobacteriales;f__Flavobacteriaceae;g__Lacinutrix	95.0	98.08	98.08	0.92	0.92	2	-
GCF_000422365.1	s__Mesoflavibacter zeaxanthinifaciens	77.9515	132	444	d__Bacteria;p__Bacteroidota;c__Bacteroidia;o__Flavobacteriales;f__Flavobacteriaceae;g__Mesoflavibacter	95.0	97.07	97.05	0.91	0.91	3	-
GCF_003259525.1	s__Olleya aquimaris	77.9281	133	444	d__Bacteria;p__Bacteroidota;c__Bacteroidia;o__Flavobacteriales;f__Flavobacteriaceae;g__Olleya	95.0	N/A	N/A	N/A	N/A	1	-
GCF_002831665.1	s__Lacinutrix sp002831665	77.8715	139	444	d__Bacteria;p__Bacteroidota;c__Bacteroidia;o__Flavobacteriales;f__Flavobacteriaceae;g__Lacinutrix	95.0	N/A	N/A	N/A	N/A	1	-
GCA_001874145.1	s__Lacinutrix sp001874145	77.862	150	444	d__Bacteria;p__Bacteroidota;c__Bacteroidia;o__Flavobacteriales;f__Flavobacteriaceae;g__Lacinutrix	95.0	N/A	N/A	N/A	N/A	1	-
GCF_016785945.1	s__Olleya sediminilitoris	77.7714	153	444	d__Bacteria;p__Bacteroidota;c__Bacteroidia;o__Flavobacteriales;f__Flavobacteriaceae;g__Olleya	95.0	98.16	97.66	0.93	0.91	5	-
GCF_001418085.1	s__Lacinutrix algicola	77.7601	149	444	d__Bacteria;p__Bacteroidota;c__Bacteroidia;o__Flavobacteriales;f__Flavobacteriaceae;g__Lacinutrix	95.0	N/A	N/A	N/A	N/A	1	-
GCF_001418015.1	s__Lacinutrix mariniflava	77.7342	136	444	d__Bacteria;p__Bacteroidota;c__Bacteroidia;o__Flavobacteriales;f__Flavobacteriaceae;g__Lacinutrix	95.0	N/A	N/A	N/A	N/A	1	-
GCF_008806375.1	s__Tamlana_A haliotis	77.7196	93	444	d__Bacteria;p__Bacteroidota;c__Bacteroidia;o__Flavobacteriales;f__Flavobacteriaceae;g__Tamlana_A	95.0	N/A	N/A	N/A	N/A	1	-
GCF_003143755.1	s__Algibacter_B sp003143755	77.6948	103	444	d__Bacteria;p__Bacteroidota;c__Bacteroidia;o__Flavobacteriales;f__Flavobacteriaceae;g__Algibacter_B	95.0	N/A	N/A	N/A	N/A	1	-
GCF_015355625.1	s__Tamlana_A sp015355625	77.6925	99	444	d__Bacteria;p__Bacteroidota;c__Bacteroidia;o__Flavobacteriales;f__Flavobacteriaceae;g__Tamlana_A	95.0	N/A	N/A	N/A	N/A	1	-
GCF_000797445.1	s__Lacinutrix jangbogonensis	77.6884	115	444	d__Bacteria;p__Bacteroidota;c__Bacteroidia;o__Flavobacteriales;f__Flavobacteriaceae;g__Lacinutrix	95.0	N/A	N/A	N/A	N/A	1	-
GCF_000518485.1	s__Olleya marilimosa	77.6717	134	444	d__Bacteria;p__Bacteroidota;c__Bacteroidia;o__Flavobacteriales;f__Flavobacteriaceae;g__Olleya	95.0	97.58	97.25	0.92	0.89	4	-
GCF_000211855.2	s__Lacinutrix sp000211855	77.6519	160	444	d__Bacteria;p__Bacteroidota;c__Bacteroidia;o__Flavobacteriales;f__Flavobacteriaceae;g__Lacinutrix	95.0	N/A	N/A	N/A	N/A	1	-
GCF_007827365.1	s__Olleya sp002323495	77.6367	140	444	d__Bacteria;p__Bacteroidota;c__Bacteroidia;o__Flavobacteriales;f__Flavobacteriaceae;g__Olleya	95.0	96.78	96.78	0.90	0.90	2	-
GCF_014202225.1	s__Algibacter_B amylolyticus	77.5756	107	444	d__Bacteria;p__Bacteroidota;c__Bacteroidia;o__Flavobacteriales;f__Flavobacteriaceae;g__Algibacter_B	95.0	100.00	100.00	1.00	1.00	3	-
GCF_001418105.1	s__Oceanihabitans himadriensis	77.5747	132	444	d__Bacteria;p__Bacteroidota;c__Bacteroidia;o__Flavobacteriales;f__Flavobacteriaceae;g__Oceanihabitans	95.0	N/A	N/A	N/A	N/A	1	-
GCF_014764305.1	s__Mesoflavibacter profundi	77.5202	154	444	d__Bacteria;p__Bacteroidota;c__Bacteroidia;o__Flavobacteriales;f__Flavobacteriaceae;g__Mesoflavibacter	95.0	99.53	99.53	0.98	0.98	3	-
GCF_003217215.1	s__Winogradskyella epiphytica	77.4284	75	444	d__Bacteria;p__Bacteroidota;c__Bacteroidia;o__Flavobacteriales;f__Flavobacteriaceae;g__Winogradskyella	95.0	99.99	99.99	1.00	1.00	2	-
GCF_014297365.1	s__Winogradskyella echinorum	77.4187	93	444	d__Bacteria;p__Bacteroidota;c__Bacteroidia;o__Flavobacteriales;f__Flavobacteriaceae;g__Winogradskyella	95.0	100.00	100.00	1.00	1.00	2	-
--------------------------------------------------------------------------------
[2023-03-19 00:49:29,889] [INFO] GTDB search result was written to OceanDNA-b9724/result_gtdb.tsv
[2023-03-19 00:49:29,890] [INFO] ===== GTDB Search completed =====
[2023-03-19 00:49:29,893] [INFO] DFAST_QC result json was written to OceanDNA-b9724/dqc_result.json
[2023-03-19 00:49:29,894] [INFO] DFAST_QC completed!
[2023-03-19 00:49:29,894] [INFO] Total running time: 0h1m5s