[2023-03-17 07:01:49,598] [INFO] DFAST_QC pipeline started.
[2023-03-17 07:01:49,598] [INFO] DFAST_QC version: 0.5.7
[2023-03-17 07:01:49,598] [INFO] DQC Reference Directory: /var/lib/cwl/stg1ef090bd-14f6-4294-8b58-1a2fc14c7eca/dqc_reference
[2023-03-17 07:01:51,533] [INFO] ===== Start taxonomy check using ANI =====
[2023-03-17 07:01:51,533] [INFO] Task started: Prodigal
[2023-03-17 07:01:51,533] [INFO] Running command: cat /var/lib/cwl/stg8f5c8128-7fdf-4dcd-a5f2-b76877e54562/OceanDNA-b7344.fa | prodigal -d OceanDNA-b7344/cds.fna -a OceanDNA-b7344/protein.faa -g 11 -q > /dev/null
[2023-03-17 07:02:03,069] [INFO] Task succeeded: Prodigal
[2023-03-17 07:02:03,069] [INFO] Task started: HMMsearch
[2023-03-17 07:02:03,069] [INFO] Running command: hmmsearch --tblout OceanDNA-b7344/hmmer_result.tsv -E 1E-50 /var/lib/cwl/stg1ef090bd-14f6-4294-8b58-1a2fc14c7eca/dqc_reference/reference_markers.hmm OceanDNA-b7344/protein.faa > /dev/null
[2023-03-17 07:02:03,288] [INFO] Task succeeded: HMMsearch
[2023-03-17 07:02:03,289] [INFO] Found 6/6 markers.
[2023-03-17 07:02:03,301] [INFO] Query marker FASTA was written to OceanDNA-b7344/markers.fasta
[2023-03-17 07:02:03,302] [INFO] Task started: Blastn
[2023-03-17 07:02:03,302] [INFO] Running command: blastn -query OceanDNA-b7344/markers.fasta -db /var/lib/cwl/stg1ef090bd-14f6-4294-8b58-1a2fc14c7eca/dqc_reference/reference_markers.fasta -out OceanDNA-b7344/blast.markers.tsv -outfmt 6 -max_hsps 1 -num_alignments 5
[2023-03-17 07:02:03,902] [INFO] Task succeeded: Blastn
[2023-03-17 07:02:03,903] [INFO] Selected 31 target genomes.
[2023-03-17 07:02:03,903] [INFO] Target genome list was writen to OceanDNA-b7344/target_genomes.txt
[2023-03-17 07:02:03,919] [INFO] Task started: fastANI
[2023-03-17 07:02:03,920] [INFO] Running command: fastANI --query /var/lib/cwl/stg8f5c8128-7fdf-4dcd-a5f2-b76877e54562/OceanDNA-b7344.fa --refList OceanDNA-b7344/target_genomes.txt --output OceanDNA-b7344/fastani_result.tsv --threads 1
[2023-03-17 07:02:22,306] [INFO] Task succeeded: fastANI
[2023-03-17 07:02:22,306] [INFO] Loading species specific ANI threshold from /var/lib/cwl/stg1ef090bd-14f6-4294-8b58-1a2fc14c7eca/dqc_reference/prokaryote_ANI_species_specific_threshold.txt
[2023-03-17 07:02:22,307] [WARNING] Species-specific ANI threshold file not found. Will use the default threshold for all species. [/var/lib/cwl/stg1ef090bd-14f6-4294-8b58-1a2fc14c7eca/dqc_reference/prokaryote_ANI_species_specific_threshold.txt]
[2023-03-17 07:02:22,322] [INFO] Found 28 fastANI hits (0 hits with ANI > threshold)
[2023-03-17 07:02:22,322] [INFO] The taxonomy check result is classified as 'below_threshold'.
[2023-03-17 07:02:22,322] [INFO] DFAST Taxonomy check final result
--------------------------------------------------------------------------------
organism_name	strain	accession	taxid	species_taxid	relation_to_type	validated	ani	matched_fragments	total_fragments	ani_threshold	status
Algibacter pacificus	strain=H164	GCA_008033385.1	2599389	2599389	type	True	76.6188	73	572	95	below_threshold
Mesoflavibacter profundi	strain=YC1039	GCA_014764305.1	2708110	2708110	type	True	76.5631	72	572	95	below_threshold
Tamlana haliotis	strain=B1N29	GCA_008806375.1	2614804	2614804	type	True	76.5217	53	572	95	below_threshold
Mariniflexile fucanivorans	strain=DSM 18792	GCA_004341235.1	264023	264023	type	True	76.4356	77	572	95	below_threshold
Winogradskyella psychrotolerans	strain=RS-3	GCA_000427335.1	1344585	1344585	type	True	76.384	86	572	95	below_threshold
Flavivirga rizhaonensis	strain=RZ03	GCA_004791695.1	2559571	2559571	type	True	76.3496	79	572	95	below_threshold
Winogradskyella schleiferi	strain=Z215	GCA_013394655.1	2686078	2686078	type	True	76.3393	73	572	95	below_threshold
Bizionia paragorgiae	strain=DSM 23842	GCA_900107625.1	283786	283786	type	True	76.3221	65	572	95	below_threshold
Pontimicrobium aquaticum	strain=CAU 1491	GCA_005047595.1	2565367	2565367	type	True	76.3195	103	572	95	below_threshold
Tamlana agarivorans	strain=JW-26	GCA_001642835.1	481183	481183	type	True	76.2989	74	572	95	below_threshold
Algibacter amylolyticus	strain=DSM 29199	GCA_014202225.1	1608400	1608400	type	True	76.2715	72	572	95	below_threshold
Winogradskyella thalassocola	strain=DSM 15363	GCA_900099995.1	262004	262004	type	True	76.2709	83	572	95	below_threshold
Algibacter amylolyticus	strain=RU-4-M-4	GCA_007559325.1	1608400	1608400	type	True	76.2485	73	572	95	below_threshold
Algibacter marinivivus	strain=ZY111	GCA_003143755.1	2100723	2100723	type	True	76.2384	90	572	95	below_threshold
Hyunsoonleella pacifica	strain=SW033	GCA_004310335.1	1080224	1080224	type	True	76.2205	67	572	95	below_threshold
Hyunsoonleella pacifica	strain=CGMCC 1.11009	GCA_014636335.1	1080224	1080224	type	True	76.2205	67	572	95	below_threshold
Winogradskyella vidalii	strain=HL634	GCA_013403955.1	2615024	2615024	type	True	76.2199	86	572	95	below_threshold
Flavivirga algicola	strain=Y03	GCA_012910715.1	2729136	2729136	type	True	76.1539	78	572	95	below_threshold
Confluentibacter sediminis	strain=DSL-48	GCA_003258355.1	2219045	2219045	type	True	76.1345	93	572	95	below_threshold
Arenitalea lutea	strain=CGMCC 1.12213	GCA_900141715.1	1178825	1178825	type	True	76.1325	95	572	95	below_threshold
Yeosuana marina	strain=JLT21	GCA_011762485.1	1565536	1565536	type	True	76.1302	80	572	95	below_threshold
Lacinutrix himadriensis	strain=E4-9a	GCA_001418105.1	641549	641549	type	True	76.0627	81	572	95	below_threshold
Mariniflexile gromovii	strain=KCTC 12570	GCA_017814435.1	362523	362523	type	True	75.9787	75	572	95	below_threshold
Winogradskyella algicola	strain=IMCC33238	GCA_005869935.1	2575815	2575815	type	True	75.9634	57	572	95	below_threshold
Kordia jejudonensis	strain=SSK3-3	GCA_001005315.1	1348245	1348245	type	True	75.9552	76	572	95	below_threshold
Algibacter pectinivorans	strain=DSM 25730	GCA_900112595.1	870482	870482	type	True	75.9267	76	572	95	below_threshold
Winogradskyella flava	strain=KCTC 52348	GCA_014243395.1	1884876	1884876	type	True	75.8366	52	572	95	below_threshold
Aestuariivivens marinum	strain=MT3-5-12	GCA_022662175.1	2913555	2913555	type	True	75.6112	62	572	95	below_threshold
--------------------------------------------------------------------------------
[2023-03-17 07:02:22,323] [INFO] DFAST Taxonomy check result was written to OceanDNA-b7344/tc_result.tsv
[2023-03-17 07:02:22,324] [INFO] ===== Taxonomy check completed =====
[2023-03-17 07:02:22,324] [INFO] ===== Start completeness check using CheckM =====
[2023-03-17 07:02:22,324] [INFO] Setting CHECKM_DATA_PATH to /var/lib/cwl/stg1ef090bd-14f6-4294-8b58-1a2fc14c7eca/dqc_reference/checkm_data
[2023-03-17 07:02:22,325] [INFO] Selected 'Prokaryote' markers (life, taxid=0) for CheckM
[2023-03-17 07:02:22,328] [INFO] Task started: CheckM
[2023-03-17 07:02:22,328] [INFO] Running command: checkm taxonomy_wf --tab_table -f OceanDNA-b7344/cc_result.tsv -t 1 life "Prokaryote" OceanDNA-b7344/checkm_input OceanDNA-b7344/checkm_result
[2023-03-17 07:02:55,397] [INFO] Task succeeded: CheckM
[2023-03-17 07:02:55,397] [INFO] Completeness check finished.
--------------------------------------------------------------------------------
Completeness: 72.92%
Contamintation: 0.00%
Strain heterogeneity: 0.00%
--------------------------------------------------------------------------------
[2023-03-17 07:02:55,399] [INFO] ===== Completeness check finished =====
[2023-03-17 07:02:55,399] [INFO] ===== Start GTDB Search =====
[2023-03-17 07:02:55,399] [INFO] Query marker FASTA already exists. Will reuse it. (OceanDNA-b7344/markers.fasta)
[2023-03-17 07:02:55,399] [INFO] Task started: Blastn
[2023-03-17 07:02:55,399] [INFO] Running command: blastn -query OceanDNA-b7344/markers.fasta -db /var/lib/cwl/stg1ef090bd-14f6-4294-8b58-1a2fc14c7eca/dqc_reference/reference_markers_gtdb.fasta -out OceanDNA-b7344/blast.markers.gtdb.tsv -outfmt 6 -max_hsps 1 -num_alignments 5
[2023-03-17 07:02:56,278] [INFO] Task succeeded: Blastn
[2023-03-17 07:02:56,279] [INFO] Selected 8 target genomes.
[2023-03-17 07:02:56,279] [INFO] Target genome list was writen to OceanDNA-b7344/target_genomes_gtdb.txt
[2023-03-17 07:02:56,372] [INFO] Task started: fastANI
[2023-03-17 07:02:56,372] [INFO] Running command: fastANI --query /var/lib/cwl/stg8f5c8128-7fdf-4dcd-a5f2-b76877e54562/OceanDNA-b7344.fa --refList OceanDNA-b7344/target_genomes_gtdb.txt --output OceanDNA-b7344/fastani_result_gtdb.tsv --threads 1
[2023-03-17 07:02:59,709] [INFO] Task succeeded: fastANI
[2023-03-17 07:02:59,715] [INFO] Found 8 fastANI hits (0 hits with ANI > circumscription radius)
[2023-03-17 07:02:59,715] [INFO] GTDB search result
--------------------------------------------------------------------------------
accession	gtdb_species	ani	matched_fragments	total_fragments	gtdb_taxonomy	ani_circumscription_radius	mean_intra_species_ani	min_intra_species_ani	mean_intra_species_af	min_intra_species_af	num_clustered_genomes	status
GCA_018608765.1	s__Hel1-33-131 sp018608765	81.1438	399	572	d__Bacteria;p__Bacteroidota;c__Bacteroidia;o__Flavobacteriales;f__Flavobacteriaceae;g__Hel1-33-131	95.0	N/A	N/A	N/A	N/A	1	-
GCF_001735745.1	s__Hel1-33-131 sp001735745	80.9882	421	572	d__Bacteria;p__Bacteroidota;c__Bacteroidia;o__Flavobacteriales;f__Flavobacteriaceae;g__Hel1-33-131	95.0	N/A	N/A	N/A	N/A	1	-
GCA_018608735.1	s__Hel1-33-131 sp018608735	80.7775	310	572	d__Bacteria;p__Bacteroidota;c__Bacteroidia;o__Flavobacteriales;f__Flavobacteriaceae;g__Hel1-33-131	95.0	N/A	N/A	N/A	N/A	1	-
GCA_018608725.1	s__Hel1-33-131 sp018608725	80.5887	356	572	d__Bacteria;p__Bacteroidota;c__Bacteroidia;o__Flavobacteriales;f__Flavobacteriaceae;g__Hel1-33-131	95.0	N/A	N/A	N/A	N/A	1	-
GCA_018673215.1	s__Hel1-33-131 sp018673215	80.2449	389	572	d__Bacteria;p__Bacteroidota;c__Bacteroidia;o__Flavobacteriales;f__Flavobacteriaceae;g__Hel1-33-131	95.0	N/A	N/A	N/A	N/A	1	-
GCA_905479095.1	s__Hel1-33-131 sp905479095	79.9534	338	572	d__Bacteria;p__Bacteroidota;c__Bacteroidia;o__Flavobacteriales;f__Flavobacteriaceae;g__Hel1-33-131	95.0	N/A	N/A	N/A	N/A	1	-
GCA_905181825.1	s__Hel1-33-131 sp905181825	78.4017	212	572	d__Bacteria;p__Bacteroidota;c__Bacteroidia;o__Flavobacteriales;f__Flavobacteriaceae;g__Hel1-33-131	95.0	99.01	99.01	0.77	0.77	2	-
GCA_018608625.1	s__Hel1-33-131 sp018608625	78.1957	228	572	d__Bacteria;p__Bacteroidota;c__Bacteroidia;o__Flavobacteriales;f__Flavobacteriaceae;g__Hel1-33-131	95.0	98.93	97.98	0.88	0.82	3	-
--------------------------------------------------------------------------------
[2023-03-17 07:02:59,715] [INFO] GTDB search result was written to OceanDNA-b7344/result_gtdb.tsv
[2023-03-17 07:02:59,715] [INFO] ===== GTDB Search completed =====
[2023-03-17 07:02:59,717] [INFO] DFAST_QC result json was written to OceanDNA-b7344/dqc_result.json
[2023-03-17 07:02:59,718] [INFO] DFAST_QC completed!
[2023-03-17 07:02:59,718] [INFO] Total running time: 0h1m10s