[2023-03-15 09:48:18,535] [INFO] DFAST_QC pipeline started.
[2023-03-15 09:48:18,536] [INFO] DFAST_QC version: 0.5.7
[2023-03-15 09:48:18,536] [INFO] DQC Reference Directory: /var/lib/cwl/stg0ae20af0-4261-46e0-b20d-9c4eb095e8ae/dqc_reference
[2023-03-15 09:48:19,626] [INFO] ===== Start taxonomy check using ANI =====
[2023-03-15 09:48:19,626] [INFO] Task started: Prodigal
[2023-03-15 09:48:19,627] [INFO] Running command: cat /var/lib/cwl/stg6e910076-29c6-4fba-a688-6fd2b6555fc1/OceanDNA-b10199.fa | prodigal -d OceanDNA-b10199/cds.fna -a OceanDNA-b10199/protein.faa -g 11 -q > /dev/null
[2023-03-15 09:48:29,482] [INFO] Task succeeded: Prodigal
[2023-03-15 09:48:29,482] [INFO] Task started: HMMsearch
[2023-03-15 09:48:29,482] [INFO] Running command: hmmsearch --tblout OceanDNA-b10199/hmmer_result.tsv -E 1E-50 /var/lib/cwl/stg0ae20af0-4261-46e0-b20d-9c4eb095e8ae/dqc_reference/reference_markers.hmm OceanDNA-b10199/protein.faa > /dev/null
[2023-03-15 09:48:29,706] [INFO] Task succeeded: HMMsearch
[2023-03-15 09:48:29,707] [INFO] Found 6/6 markers.
[2023-03-15 09:48:29,721] [INFO] Query marker FASTA was written to OceanDNA-b10199/markers.fasta
[2023-03-15 09:48:29,722] [INFO] Task started: Blastn
[2023-03-15 09:48:29,722] [INFO] Running command: blastn -query OceanDNA-b10199/markers.fasta -db /var/lib/cwl/stg0ae20af0-4261-46e0-b20d-9c4eb095e8ae/dqc_reference/reference_markers.fasta -out OceanDNA-b10199/blast.markers.tsv -outfmt 6 -max_hsps 1 -num_alignments 5
[2023-03-15 09:48:30,299] [INFO] Task succeeded: Blastn
[2023-03-15 09:48:30,300] [INFO] Selected 22 target genomes.
[2023-03-15 09:48:30,300] [INFO] Target genome list was writen to OceanDNA-b10199/target_genomes.txt
[2023-03-15 09:48:30,309] [INFO] Task started: fastANI
[2023-03-15 09:48:30,309] [INFO] Running command: fastANI --query /var/lib/cwl/stg6e910076-29c6-4fba-a688-6fd2b6555fc1/OceanDNA-b10199.fa --refList OceanDNA-b10199/target_genomes.txt --output OceanDNA-b10199/fastani_result.tsv --threads 1
[2023-03-15 09:48:42,339] [INFO] Task succeeded: fastANI
[2023-03-15 09:48:42,339] [INFO] Loading species specific ANI threshold from /var/lib/cwl/stg0ae20af0-4261-46e0-b20d-9c4eb095e8ae/dqc_reference/prokaryote_ANI_species_specific_threshold.txt
[2023-03-15 09:48:42,339] [WARNING] Species-specific ANI threshold file not found. Will use the default threshold for all species. [/var/lib/cwl/stg0ae20af0-4261-46e0-b20d-9c4eb095e8ae/dqc_reference/prokaryote_ANI_species_specific_threshold.txt]
[2023-03-15 09:48:42,351] [INFO] Found 22 fastANI hits (0 hits with ANI > threshold)
[2023-03-15 09:48:42,352] [INFO] The taxonomy check result is classified as 'below_threshold'.
[2023-03-15 09:48:42,352] [INFO] DFAST Taxonomy check final result
--------------------------------------------------------------------------------
organism_name	strain	accession	taxid	species_taxid	relation_to_type	validated	ani	matched_fragments	total_fragments	ani_threshold	status
Polaribacter pectinis	strain=L12M9	GCA_014352875.1	2738844	2738844	type	True	77.7757	181	503	95	below_threshold
Polaribacter aquimarinus	strain=ZY113	GCA_003129485.1	2100726	2100726	type	True	77.7481	152	503	95	below_threshold
Polaribacter reichenbachii	strain=KCTC 23969	GCA_002814055.1	996801	996801	type	True	77.733	162	503	95	below_threshold
Polaribacter reichenbachii	strain=6Alg 8T	GCA_001975665.1	996801	996801	type	True	77.6969	164	503	95	below_threshold
Polaribacter butkevichii	strain=KCTC 12100	GCA_002954605.1	218490	218490	type	True	77.6901	168	503	95	below_threshold
Polaribacter porphyrae	strain=NBRC 108759	GCA_002954685.1	1137780	1137780	type	True	77.6853	176	503	95	below_threshold
Polaribacter reichenbachii	strain=LMG 26443	GCA_020532845.1	996801	996801	type	True	77.6738	162	503	95	below_threshold
Polaribacter reichenbachii	strain=KCTC 23969	GCA_001680875.1	996801	996801	type	True	77.6534	164	503	95	below_threshold
Tenacibaculum aquimarinum	strain=K20-16	GCA_022478115.1	2910675	2910675	type	True	77.6475	161	503	95	below_threshold
Polaribacter porphyrae	strain=LMG 26671	GCA_020532745.1	1137780	1137780	type	True	77.6454	181	503	95	below_threshold
Tenacibaculum todarodis	strain=LPB0136	GCA_001889045.1	1850252	1850252	type	True	77.5884	152	503	95	below_threshold
Tenacibaculum haliotis	strain=KCTC 52419	GCA_025215075.1	1888914	1888914	type	True	77.5776	110	503	95	below_threshold
Polaribacter septentrionalilitoris	strain=ANORD1	GCA_009832745.1	2494657	2494657	type	True	77.556	164	503	95	below_threshold
Polaribacter atrinae	strain=KACC 17473	GCA_001640115.1	1333662	1333662	type	True	77.5145	158	503	95	below_threshold
Polaribacter aestuariivivens	strain=DBTF-3	GCA_005885675.1	2304626	2304626	type	True	77.5105	170	503	95	below_threshold
Polaribacter undariae	strain=KCTC 42175	GCA_024918935.1	1574269	1574269	type	True	77.4604	173	503	95	below_threshold
Polaribacter vadi	strain=LPB0003	GCA_001761365.1	1774273	1774273	type	True	77.4445	175	503	95	below_threshold
Polaribacter dokdonensis	strain=DSW-5	GCA_900106865.1	326329	326329	type	True	77.3464	177	503	95	below_threshold
Polaribacter cellanae	strain=SM13	GCA_017569185.1	2818493	2818493	type	True	77.2563	163	503	95	below_threshold
Polaribacter sejongensis	strain=KCTC 23670	GCA_002814075.1	985043	985043	type	True	77.2266	177	503	95	below_threshold
Tenacibaculum ovolyticum	strain=DSM 18103	GCA_000430545.1	104270	104270	type	True	77.03	140	503	95	below_threshold
Tenacibaculum adriaticum	strain=DSM 18961	GCA_008124875.1	413713	413713	type	True	76.908	113	503	95	below_threshold
--------------------------------------------------------------------------------
[2023-03-15 09:48:42,352] [INFO] DFAST Taxonomy check result was written to OceanDNA-b10199/tc_result.tsv
[2023-03-15 09:48:42,352] [INFO] ===== Taxonomy check completed =====
[2023-03-15 09:48:42,352] [INFO] ===== Start completeness check using CheckM =====
[2023-03-15 09:48:42,352] [INFO] Setting CHECKM_DATA_PATH to /var/lib/cwl/stg0ae20af0-4261-46e0-b20d-9c4eb095e8ae/dqc_reference/checkm_data
[2023-03-15 09:48:42,353] [INFO] Selected 'Prokaryote' markers (life, taxid=0) for CheckM
[2023-03-15 09:48:42,356] [INFO] Task started: CheckM
[2023-03-15 09:48:42,356] [INFO] Running command: checkm taxonomy_wf --tab_table -f OceanDNA-b10199/cc_result.tsv -t 1 life "Prokaryote" OceanDNA-b10199/checkm_input OceanDNA-b10199/checkm_result
[2023-03-15 09:49:11,603] [INFO] Task succeeded: CheckM
[2023-03-15 09:49:11,603] [INFO] Completeness check finished.
--------------------------------------------------------------------------------
Completeness: 100.00%
Contamintation: 0.00%
Strain heterogeneity: 0.00%
--------------------------------------------------------------------------------
[2023-03-15 09:49:11,605] [INFO] ===== Completeness check finished =====
[2023-03-15 09:49:11,605] [INFO] ===== Start GTDB Search =====
[2023-03-15 09:49:11,605] [INFO] Query marker FASTA already exists. Will reuse it. (OceanDNA-b10199/markers.fasta)
[2023-03-15 09:49:11,605] [INFO] Task started: Blastn
[2023-03-15 09:49:11,605] [INFO] Running command: blastn -query OceanDNA-b10199/markers.fasta -db /var/lib/cwl/stg0ae20af0-4261-46e0-b20d-9c4eb095e8ae/dqc_reference/reference_markers_gtdb.fasta -out OceanDNA-b10199/blast.markers.gtdb.tsv -outfmt 6 -max_hsps 1 -num_alignments 5
[2023-03-15 09:49:12,417] [INFO] Task succeeded: Blastn
[2023-03-15 09:49:12,418] [INFO] Selected 15 target genomes.
[2023-03-15 09:49:12,418] [INFO] Target genome list was writen to OceanDNA-b10199/target_genomes_gtdb.txt
[2023-03-15 09:49:12,677] [INFO] Task started: fastANI
[2023-03-15 09:49:12,677] [INFO] Running command: fastANI --query /var/lib/cwl/stg6e910076-29c6-4fba-a688-6fd2b6555fc1/OceanDNA-b10199.fa --refList OceanDNA-b10199/target_genomes_gtdb.txt --output OceanDNA-b10199/fastani_result_gtdb.tsv --threads 1
[2023-03-15 09:49:20,785] [INFO] Task succeeded: fastANI
[2023-03-15 09:49:20,795] [INFO] Found 15 fastANI hits (1 hits with ANI > circumscription radius)
[2023-03-15 09:49:20,795] [INFO] GTDB search result
--------------------------------------------------------------------------------
accession	gtdb_species	ani	matched_fragments	total_fragments	gtdb_taxonomy	ani_circumscription_radius	mean_intra_species_ani	min_intra_species_ani	mean_intra_species_af	min_intra_species_af	num_clustered_genomes	status
GCA_015665265.1	s__SCGC-AAA160-P02 sp015665265	96.4591	447	503	d__Bacteria;p__Bacteroidota;c__Bacteroidia;o__Flavobacteriales;f__Flavobacteriaceae;g__SCGC-AAA160-P02	95.0	96.43	96.43	0.88	0.88	2	conclusive
GCA_018608485.1	s__SCGC-AAA160-P02 sp018608485	92.4735	353	503	d__Bacteria;p__Bacteroidota;c__Bacteroidia;o__Flavobacteriales;f__Flavobacteriaceae;g__SCGC-AAA160-P02	95.0	99.40	99.40	0.78	0.78	2	-
GCA_905479435.1	s__SCGC-AAA160-P02 sp905479435	83.7227	291	503	d__Bacteria;p__Bacteroidota;c__Bacteroidia;o__Flavobacteriales;f__Flavobacteriaceae;g__SCGC-AAA160-P02	95.0	N/A	N/A	N/A	N/A	1	-
GCA_000383355.1	s__SCGC-AAA160-P02 sp000383355	82.2057	386	503	d__Bacteria;p__Bacteroidota;c__Bacteroidia;o__Flavobacteriales;f__Flavobacteriaceae;g__SCGC-AAA160-P02	95.0	96.64	95.28	0.86	0.84	6	-
GCF_002163675.1	s__Polaribacter sp002163675	77.8237	167	503	d__Bacteria;p__Bacteroidota;c__Bacteroidia;o__Flavobacteriales;f__Flavobacteriaceae;g__Polaribacter	95.0	N/A	N/A	N/A	N/A	1	-
GCA_000981625.1	s__Polaribacter sp000981625	77.6796	136	503	d__Bacteria;p__Bacteroidota;c__Bacteroidia;o__Flavobacteriales;f__Flavobacteriaceae;g__Polaribacter	95.0	N/A	N/A	N/A	N/A	1	-
GCF_002163835.1	s__Polaribacter sp002163835	77.6722	194	503	d__Bacteria;p__Bacteroidota;c__Bacteroidia;o__Flavobacteriales;f__Flavobacteriaceae;g__Polaribacter	95.0	N/A	N/A	N/A	N/A	1	-
GCF_002954685.1	s__Polaribacter porphyrae	77.6676	177	503	d__Bacteria;p__Bacteroidota;c__Bacteroidia;o__Flavobacteriales;f__Flavobacteriaceae;g__Polaribacter	95.0	N/A	N/A	N/A	N/A	1	-
GCF_007827455.1	s__VISM01 sp007827455	77.564	189	503	d__Bacteria;p__Bacteroidota;c__Bacteroidia;o__Flavobacteriales;f__Flavobacteriaceae;g__VISM01	95.0	N/A	N/A	N/A	N/A	1	-
GCF_001640115.1	s__Polaribacter atrinae	77.5145	158	503	d__Bacteria;p__Bacteroidota;c__Bacteroidia;o__Flavobacteriales;f__Flavobacteriaceae;g__Polaribacter	95.0	N/A	N/A	N/A	N/A	1	-
GCA_016763215.1	s__SCGC-AAA160-P02 sp016763215	77.4677	89	503	d__Bacteria;p__Bacteroidota;c__Bacteroidia;o__Flavobacteriales;f__Flavobacteriaceae;g__SCGC-AAA160-P02	95.0	N/A	N/A	N/A	N/A	1	-
GCF_900105145.1	s__Polaribacter sp900105145	77.4369	176	503	d__Bacteria;p__Bacteroidota;c__Bacteroidia;o__Flavobacteriales;f__Flavobacteriaceae;g__Polaribacter	95.0	N/A	N/A	N/A	N/A	1	-
GCF_000766795.1	s__Polaribacter sp000766795	77.3838	199	503	d__Bacteria;p__Bacteroidota;c__Bacteroidia;o__Flavobacteriales;f__Flavobacteriaceae;g__Polaribacter	95.0	N/A	N/A	N/A	N/A	1	-
GCF_900105985.1	s__Tenacibaculum sp900105985	77.0309	134	503	d__Bacteria;p__Bacteroidota;c__Bacteroidia;o__Flavobacteriales;f__Flavobacteriaceae;g__Tenacibaculum	95.0	98.69	98.69	0.93	0.93	2	-
GCF_002836595.1	s__Tenacibaculum sp002836595	76.8345	150	503	d__Bacteria;p__Bacteroidota;c__Bacteroidia;o__Flavobacteriales;f__Flavobacteriaceae;g__Tenacibaculum	95.0	N/A	N/A	N/A	N/A	1	-
--------------------------------------------------------------------------------
[2023-03-15 09:49:20,795] [INFO] GTDB search result was written to OceanDNA-b10199/result_gtdb.tsv
[2023-03-15 09:49:20,795] [INFO] ===== GTDB Search completed =====
[2023-03-15 09:49:20,797] [INFO] DFAST_QC result json was written to OceanDNA-b10199/dqc_result.json
[2023-03-15 09:49:20,797] [INFO] DFAST_QC completed!
[2023-03-15 09:49:20,797] [INFO] Total running time: 0h1m2s