[2023-03-17 03:11:36,329] [INFO] DFAST_QC pipeline started.
[2023-03-17 03:11:36,329] [INFO] DFAST_QC version: 0.5.7
[2023-03-17 03:11:36,329] [INFO] DQC Reference Directory: /var/lib/cwl/stg94c4caed-fe0d-4f1e-8285-ff198c07f5c0/dqc_reference
[2023-03-17 03:11:37,522] [INFO] ===== Start taxonomy check using ANI =====
[2023-03-17 03:11:37,522] [INFO] Task started: Prodigal
[2023-03-17 03:11:37,522] [INFO] Running command: cat /var/lib/cwl/stg90ef2885-9da7-40ce-a663-2457742363a4/OceanDNA-b6820.fa | prodigal -d OceanDNA-b6820/cds.fna -a OceanDNA-b6820/protein.faa -g 11 -q > /dev/null
[2023-03-17 03:11:48,054] [INFO] Task succeeded: Prodigal
[2023-03-17 03:11:48,054] [INFO] Task started: HMMsearch
[2023-03-17 03:11:48,054] [INFO] Running command: hmmsearch --tblout OceanDNA-b6820/hmmer_result.tsv -E 1E-50 /var/lib/cwl/stg94c4caed-fe0d-4f1e-8285-ff198c07f5c0/dqc_reference/reference_markers.hmm OceanDNA-b6820/protein.faa > /dev/null
[2023-03-17 03:11:48,264] [INFO] Task succeeded: HMMsearch
[2023-03-17 03:11:48,265] [WARNING] Found 5/6 markers. [/var/lib/cwl/stg90ef2885-9da7-40ce-a663-2457742363a4/OceanDNA-b6820.fa]
[2023-03-17 03:11:48,294] [INFO] Query marker FASTA was written to OceanDNA-b6820/markers.fasta
[2023-03-17 03:11:48,295] [INFO] Task started: Blastn
[2023-03-17 03:11:48,295] [INFO] Running command: blastn -query OceanDNA-b6820/markers.fasta -db /var/lib/cwl/stg94c4caed-fe0d-4f1e-8285-ff198c07f5c0/dqc_reference/reference_markers.fasta -out OceanDNA-b6820/blast.markers.tsv -outfmt 6 -max_hsps 1 -num_alignments 5
[2023-03-17 03:11:48,858] [INFO] Task succeeded: Blastn
[2023-03-17 03:11:48,859] [INFO] Selected 17 target genomes.
[2023-03-17 03:11:48,859] [INFO] Target genome list was writen to OceanDNA-b6820/target_genomes.txt
[2023-03-17 03:11:48,935] [INFO] Task started: fastANI
[2023-03-17 03:11:48,935] [INFO] Running command: fastANI --query /var/lib/cwl/stg90ef2885-9da7-40ce-a663-2457742363a4/OceanDNA-b6820.fa --refList OceanDNA-b6820/target_genomes.txt --output OceanDNA-b6820/fastani_result.tsv --threads 1
[2023-03-17 03:11:58,809] [INFO] Task succeeded: fastANI
[2023-03-17 03:11:58,809] [INFO] Loading species specific ANI threshold from /var/lib/cwl/stg94c4caed-fe0d-4f1e-8285-ff198c07f5c0/dqc_reference/prokaryote_ANI_species_specific_threshold.txt
[2023-03-17 03:11:58,809] [WARNING] Species-specific ANI threshold file not found. Will use the default threshold for all species. [/var/lib/cwl/stg94c4caed-fe0d-4f1e-8285-ff198c07f5c0/dqc_reference/prokaryote_ANI_species_specific_threshold.txt]
[2023-03-17 03:11:58,819] [INFO] Found 16 fastANI hits (0 hits with ANI > threshold)
[2023-03-17 03:11:58,819] [INFO] The taxonomy check result is classified as 'below_threshold'.
[2023-03-17 03:11:58,819] [INFO] DFAST Taxonomy check final result
--------------------------------------------------------------------------------
organism_name	strain	accession	taxid	species_taxid	relation_to_type	validated	ani	matched_fragments	total_fragments	ani_threshold	status
Flavobacterium gelidilacus	strain=DSM 15343	GCA_000422685.1	206041	206041	type	True	81.9855	303	453	95	below_threshold
Flavobacterium proteolyticum	strain=1Y8A	GCA_015223105.1	2911683	2911683	type	True	78.0792	168	453	95	below_threshold
Flavobacterium jejuense	strain=EC11	GCA_006491595.2	1544455	1544455	type	True	77.8796	192	453	95	below_threshold
Flavobacterium celericrescens	strain=TWA-26	GCA_011392075.1	2709780	2709780	type	True	77.8121	165	453	95	below_threshold
Flavobacterium channae	strain=KSM-R2A30	GCA_021172165.1	2897181	2897181	type	True	77.7892	183	453	95	below_threshold
Flavobacterium sediminilitoris	strain=YSM-43	GCA_023008245.1	2024526	2024526	type	True	77.7731	183	453	95	below_threshold
Flavobacterium cyclinae	strain=KSM-R2A25	GCA_021172145.1	2895947	2895947	type	True	77.766	170	453	95	below_threshold
Flavobacterium terrigena	strain=DSM 17934	GCA_900108955.1	402734	402734	type	True	77.5166	128	453	95	below_threshold
Flavobacterium cucumis	strain=DSM 18830	GCA_900148835.1	416016	416016	type	True	77.4703	148	453	95	below_threshold
Flavobacterium profundi	strain=TP390	GCA_009753805.1	1774945	1774945	type	True	77.4279	183	453	95	below_threshold
Flavobacterium urocaniciphilum	strain=DSM 27078	GCA_900110615.1	1299341	1299341	type	True	77.3752	128	453	95	below_threshold
Flavobacterium jumunjinense	strain=HME7102	GCA_021650975.2	998845	998845	type	True	77.3354	171	453	95	below_threshold
Flavobacterium bernardetii	strain=F-372	GCA_011305415.1	2813823	2813823	type	True	77.0103	136	453	95	below_threshold
Flavobacterium amnicola	strain=LLJ-11	GCA_004122165.1	2506422	2506422	type	True	76.9607	83	453	95	below_threshold
Flavobacterium terrae	strain=DSM 18829	GCA_900142035.1	415425	415425	type	True	76.5701	110	453	95	below_threshold
Flavobacterium ranwuense	strain=LB2P22	GCA_004349315.1	2541725	2541725	type	True	76.2226	101	453	95	below_threshold
--------------------------------------------------------------------------------
[2023-03-17 03:11:58,819] [INFO] DFAST Taxonomy check result was written to OceanDNA-b6820/tc_result.tsv
[2023-03-17 03:11:58,820] [INFO] ===== Taxonomy check completed =====
[2023-03-17 03:11:58,820] [INFO] ===== Start completeness check using CheckM =====
[2023-03-17 03:11:58,820] [INFO] Setting CHECKM_DATA_PATH to /var/lib/cwl/stg94c4caed-fe0d-4f1e-8285-ff198c07f5c0/dqc_reference/checkm_data
[2023-03-17 03:11:58,821] [INFO] Selected 'Prokaryote' markers (life, taxid=0) for CheckM
[2023-03-17 03:11:58,834] [INFO] Task started: CheckM
[2023-03-17 03:11:58,834] [INFO] Running command: checkm taxonomy_wf --tab_table -f OceanDNA-b6820/cc_result.tsv -t 1 life "Prokaryote" OceanDNA-b6820/checkm_input OceanDNA-b6820/checkm_result
[2023-03-17 03:12:30,065] [INFO] Task succeeded: CheckM
[2023-03-17 03:12:30,066] [INFO] Completeness check finished.
--------------------------------------------------------------------------------
Completeness: 45.83%
Contamintation: 0.00%
Strain heterogeneity: 0.00%
--------------------------------------------------------------------------------
[2023-03-17 03:12:30,068] [INFO] ===== Completeness check finished =====
[2023-03-17 03:12:30,068] [INFO] ===== Start GTDB Search =====
[2023-03-17 03:12:30,068] [INFO] Query marker FASTA already exists. Will reuse it. (OceanDNA-b6820/markers.fasta)
[2023-03-17 03:12:30,069] [INFO] Task started: Blastn
[2023-03-17 03:12:30,069] [INFO] Running command: blastn -query OceanDNA-b6820/markers.fasta -db /var/lib/cwl/stg94c4caed-fe0d-4f1e-8285-ff198c07f5c0/dqc_reference/reference_markers_gtdb.fasta -out OceanDNA-b6820/blast.markers.gtdb.tsv -outfmt 6 -max_hsps 1 -num_alignments 5
[2023-03-17 03:12:30,821] [INFO] Task succeeded: Blastn
[2023-03-17 03:12:30,822] [INFO] Selected 19 target genomes.
[2023-03-17 03:12:30,822] [INFO] Target genome list was writen to OceanDNA-b6820/target_genomes_gtdb.txt
[2023-03-17 03:12:31,008] [INFO] Task started: fastANI
[2023-03-17 03:12:31,009] [INFO] Running command: fastANI --query /var/lib/cwl/stg90ef2885-9da7-40ce-a663-2457742363a4/OceanDNA-b6820.fa --refList OceanDNA-b6820/target_genomes_gtdb.txt --output OceanDNA-b6820/fastani_result_gtdb.tsv --threads 1
[2023-03-17 03:12:39,867] [INFO] Task succeeded: fastANI
[2023-03-17 03:12:39,878] [INFO] Found 19 fastANI hits (0 hits with ANI > circumscription radius)
[2023-03-17 03:12:39,878] [INFO] GTDB search result
--------------------------------------------------------------------------------
accession	gtdb_species	ani	matched_fragments	total_fragments	gtdb_taxonomy	ani_circumscription_radius	mean_intra_species_ani	min_intra_species_ani	mean_intra_species_af	min_intra_species_af	num_clustered_genomes	status
GCF_000422685.1	s__Flavobacterium gelidilacus	81.9855	303	453	d__Bacteria;p__Bacteroidota;c__Bacteroidia;o__Flavobacteriales;f__Flavobacteriaceae;g__Flavobacterium	95.0	N/A	N/A	N/A	N/A	1	-
GCF_015223105.1	s__Flavobacterium aquaticum_A	78.0373	170	453	d__Bacteria;p__Bacteroidota;c__Bacteroidia;o__Flavobacteriales;f__Flavobacteriaceae;g__Flavobacterium	95.0	N/A	N/A	N/A	N/A	1	-
GCA_002862805.1	s__Flavobacterium sp002862805	78.0368	173	453	d__Bacteria;p__Bacteroidota;c__Bacteroidia;o__Flavobacteriales;f__Flavobacteriaceae;g__Flavobacterium	95.0	N/A	N/A	N/A	N/A	1	-
GCA_000169355.1	s__Flavobacterium sp000169355	77.9001	179	453	d__Bacteria;p__Bacteroidota;c__Bacteroidia;o__Flavobacteriales;f__Flavobacteriaceae;g__Flavobacterium	95.0	96.75	96.74	0.84	0.84	3	-
GCA_003497625.1	s__Flavobacterium sp003497625	77.8837	170	453	d__Bacteria;p__Bacteroidota;c__Bacteroidia;o__Flavobacteriales;f__Flavobacteriaceae;g__Flavobacterium	95.0	99.98	99.98	0.98	0.98	2	-
GCF_006491595.2	s__Flavobacterium jejuense	77.8796	192	453	d__Bacteria;p__Bacteroidota;c__Bacteroidia;o__Flavobacteriales;f__Flavobacteriaceae;g__Flavobacterium	95.0	N/A	N/A	N/A	N/A	1	-
GCF_000967805.1	s__Flavobacterium sp000967805	77.7131	180	453	d__Bacteria;p__Bacteroidota;c__Bacteroidia;o__Flavobacteriales;f__Flavobacteriaceae;g__Flavobacterium	95.0	N/A	N/A	N/A	N/A	1	-
GCA_012032075.1	s__Flavobacterium sp012032075	77.706	154	453	d__Bacteria;p__Bacteroidota;c__Bacteroidia;o__Flavobacteriales;f__Flavobacteriaceae;g__Flavobacterium	95.0	N/A	N/A	N/A	N/A	1	-
GCF_900108955.1	s__Flavobacterium terrigena	77.5137	128	453	d__Bacteria;p__Bacteroidota;c__Bacteroidia;o__Flavobacteriales;f__Flavobacteriaceae;g__Flavobacterium	95.0	N/A	N/A	N/A	N/A	1	-
GCF_006491645.1	s__Flavobacterium sp006491645	77.4279	183	453	d__Bacteria;p__Bacteroidota;c__Bacteroidia;o__Flavobacteriales;f__Flavobacteriaceae;g__Flavobacterium	95.0	100.00	100.00	1.00	1.00	2	-
GCF_900110615.1	s__Flavobacterium urocaniciphilum	77.3981	127	453	d__Bacteria;p__Bacteroidota;c__Bacteroidia;o__Flavobacteriales;f__Flavobacteriaceae;g__Flavobacterium	95.0	N/A	N/A	N/A	N/A	1	-
GCA_017985075.1	s__Flavobacterium sp017985075	77.2498	111	453	d__Bacteria;p__Bacteroidota;c__Bacteroidia;o__Flavobacteriales;f__Flavobacteriaceae;g__Flavobacterium	95.0	N/A	N/A	N/A	N/A	1	-
GCA_903889745.1	s__Flavobacterium sp903889745	77.1759	114	453	d__Bacteria;p__Bacteroidota;c__Bacteroidia;o__Flavobacteriales;f__Flavobacteriaceae;g__Flavobacterium	95.0	N/A	N/A	N/A	N/A	1	-
GCA_017983975.1	s__Flavobacterium sp017983975	77.1275	114	453	d__Bacteria;p__Bacteroidota;c__Bacteroidia;o__Flavobacteriales;f__Flavobacteriaceae;g__Flavobacterium	95.0	N/A	N/A	N/A	N/A	1	-
GCA_002737105.1	s__Flavobacterium sp002737105	77.0207	50	453	d__Bacteria;p__Bacteroidota;c__Bacteroidia;o__Flavobacteriales;f__Flavobacteriaceae;g__Flavobacterium	95.0	N/A	N/A	N/A	N/A	1	-
GCF_004122165.1	s__Flavobacterium sp004122165	77.0147	82	453	d__Bacteria;p__Bacteroidota;c__Bacteroidia;o__Flavobacteriales;f__Flavobacteriaceae;g__Flavobacterium	95.0	N/A	N/A	N/A	N/A	1	-
GCF_003634825.1	s__Flavobacterium sp003634825	76.574	80	453	d__Bacteria;p__Bacteroidota;c__Bacteroidia;o__Flavobacteriales;f__Flavobacteriaceae;g__Flavobacterium	95.0	N/A	N/A	N/A	N/A	1	-
GCF_900142035.1	s__Flavobacterium terrae	76.5701	110	453	d__Bacteria;p__Bacteroidota;c__Bacteroidia;o__Flavobacteriales;f__Flavobacteriaceae;g__Flavobacterium	95.0	N/A	N/A	N/A	N/A	1	-
GCF_004349315.1	s__Flavobacterium sp004349315	76.2074	102	453	d__Bacteria;p__Bacteroidota;c__Bacteroidia;o__Flavobacteriales;f__Flavobacteriaceae;g__Flavobacterium	95.0	98.48	98.48	0.90	0.90	2	-
--------------------------------------------------------------------------------
[2023-03-17 03:12:39,878] [INFO] GTDB search result was written to OceanDNA-b6820/result_gtdb.tsv
[2023-03-17 03:12:39,878] [INFO] ===== GTDB Search completed =====
[2023-03-17 03:12:39,880] [INFO] DFAST_QC result json was written to OceanDNA-b6820/dqc_result.json
[2023-03-17 03:12:39,880] [INFO] DFAST_QC completed!
[2023-03-17 03:12:39,880] [INFO] Total running time: 0h1m4s