[2023-03-19 02:51:48,254] [INFO] DFAST_QC pipeline started.
[2023-03-19 02:51:48,255] [INFO] DFAST_QC version: 0.5.7
[2023-03-19 02:51:48,255] [INFO] DQC Reference Directory: /var/lib/cwl/stg69ec5c3c-50c1-44e7-92dc-0033e8dcf7fa/dqc_reference
[2023-03-19 02:51:49,359] [INFO] ===== Start taxonomy check using ANI =====
[2023-03-19 02:51:49,359] [INFO] Task started: Prodigal
[2023-03-19 02:51:49,359] [INFO] Running command: cat /var/lib/cwl/stg7ccec734-c1ca-4208-b492-dfb8e5a12bad/OceanDNA-b23176.fa | prodigal -d OceanDNA-b23176/cds.fna -a OceanDNA-b23176/protein.faa -g 11 -q > /dev/null
[2023-03-19 02:52:33,384] [INFO] Task succeeded: Prodigal
[2023-03-19 02:52:33,384] [INFO] Task started: HMMsearch
[2023-03-19 02:52:33,384] [INFO] Running command: hmmsearch --tblout OceanDNA-b23176/hmmer_result.tsv -E 1E-50 /var/lib/cwl/stg69ec5c3c-50c1-44e7-92dc-0033e8dcf7fa/dqc_reference/reference_markers.hmm OceanDNA-b23176/protein.faa > /dev/null
[2023-03-19 02:52:33,703] [INFO] Task succeeded: HMMsearch
[2023-03-19 02:52:33,704] [INFO] Found 6/6 markers.
[2023-03-19 02:52:33,754] [INFO] Query marker FASTA was written to OceanDNA-b23176/markers.fasta
[2023-03-19 02:52:33,755] [INFO] Task started: Blastn
[2023-03-19 02:52:33,755] [INFO] Running command: blastn -query OceanDNA-b23176/markers.fasta -db /var/lib/cwl/stg69ec5c3c-50c1-44e7-92dc-0033e8dcf7fa/dqc_reference/reference_markers.fasta -out OceanDNA-b23176/blast.markers.tsv -outfmt 6 -max_hsps 1 -num_alignments 5
[2023-03-19 02:52:34,329] [INFO] Task succeeded: Blastn
[2023-03-19 02:52:34,348] [INFO] Selected 22 target genomes.
[2023-03-19 02:52:34,348] [INFO] Target genome list was writen to OceanDNA-b23176/target_genomes.txt
[2023-03-19 02:52:34,361] [INFO] Task started: fastANI
[2023-03-19 02:52:34,361] [INFO] Running command: fastANI --query /var/lib/cwl/stg7ccec734-c1ca-4208-b492-dfb8e5a12bad/OceanDNA-b23176.fa --refList OceanDNA-b23176/target_genomes.txt --output OceanDNA-b23176/fastani_result.tsv --threads 1
[2023-03-19 02:52:58,418] [INFO] Task succeeded: fastANI
[2023-03-19 02:52:58,419] [INFO] Loading species specific ANI threshold from /var/lib/cwl/stg69ec5c3c-50c1-44e7-92dc-0033e8dcf7fa/dqc_reference/prokaryote_ANI_species_specific_threshold.txt
[2023-03-19 02:52:58,419] [WARNING] Species-specific ANI threshold file not found. Will use the default threshold for all species. [/var/lib/cwl/stg69ec5c3c-50c1-44e7-92dc-0033e8dcf7fa/dqc_reference/prokaryote_ANI_species_specific_threshold.txt]
[2023-03-19 02:52:58,431] [INFO] Found 21 fastANI hits (0 hits with ANI > threshold)
[2023-03-19 02:52:58,431] [INFO] The taxonomy check result is classified as 'below_threshold'.
[2023-03-19 02:52:58,431] [INFO] DFAST Taxonomy check final result
--------------------------------------------------------------------------------
organism_name	strain	accession	taxid	species_taxid	relation_to_type	validated	ani	matched_fragments	total_fragments	ani_threshold	status
Pseudobythopirellula maris	strain=Mal64	GCA_007859945.1	2527991	2527991	type	True	74.9469	150	2626	95	below_threshold
Posidoniimonas corsicana	strain=KOR34	GCA_007859765.1	1938618	1938618	type	True	74.8736	191	2626	95	below_threshold
Paludisphaera soli	strain=JC670	GCA_011064595.1	2712865	2712865	type	True	74.8311	330	2626	95	below_threshold
Aquisphaera giovannonii	strain=OJF2	GCA_008087625.1	406548	406548	type	True	74.8295	406	2626	95	below_threshold
Actinomyces faecalis	strain=ZJ34	GCA_013184985.2	2722820	2722820	type	True	74.8012	61	2626	95	below_threshold
Nonomuraea muscovyensis	strain=DSM 45913	GCA_014207745.1	1124761	1124761	type	True	74.7612	374	2626	95	below_threshold
Nocardioides litoris	strain=DSM 103718	GCA_006346315.1	1926648	1926648	type	True	74.7601	391	2626	95	below_threshold
Agromyces marinus	strain=DSM 26151	GCA_021442325.1	1389020	1389020	type	True	74.7575	187	2626	95	below_threshold
Amycolatopsis vastitatis	strain=H5	GCA_002234595.1	1905142	1905142	type	True	74.7443	357	2626	95	below_threshold
Phycicoccus endophyticus	strain=CGMCC 4.7300	GCA_014646175.1	1690220	1690220	type	True	74.7366	232	2626	95	below_threshold
Amycolatopsis pretoriensis	strain=NRRL B-24133	GCA_002156025.1	218821	218821	type	True	74.7318	368	2626	95	below_threshold
Streptomyces gossypiisoli	strain=TRM 44567	GCA_013433285.1	2748864	2748864	type	True	74.7206	279	2626	95	below_threshold
Paludisphaera rhizosphaereae	strain=JC665	GCA_011065895.1	2711216	2711216	type	True	74.7199	188	2626	95	below_threshold
Phycicoccus endophyticus	strain=IP6SC6	GCA_011326735.1	1690220	1690220	type	True	74.7198	238	2626	95	below_threshold
Erythrobacter dokdonensis	strain=DSW-74	GCA_001677335.1	328225	328225	type	True	74.7181	53	2626	95	below_threshold
Actinomadura oligospora	strain=ATCC 43269	GCA_000518265.1	111804	111804	type	True	74.7111	361	2626	95	below_threshold
Streptacidiphilus anmyonensis	strain=NBRC 103185	GCA_000787855.1	405782	405782	type	True	74.7059	404	2626	95	below_threshold
Streptomyces lichenis	strain=LCR6-01	GCA_023218175.1	2306967	2306967	type	True	74.6996	300	2626	95	below_threshold
Nocardioides flavescens	strain=YIM 123512	GCA_009823805.1	2691959	2691959	type	True	74.6901	352	2626	95	below_threshold
Nonomuraea typhae	strain=p1410	GCA_009760925.1	2603600	2603600	type	True	74.6888	432	2626	95	below_threshold
Streptomyces azureus	strain=ATCC 14921	GCA_001270025.1	146537	146537	type	True	74.6823	266	2626	95	below_threshold
--------------------------------------------------------------------------------
[2023-03-19 02:52:58,442] [INFO] DFAST Taxonomy check result was written to OceanDNA-b23176/tc_result.tsv
[2023-03-19 02:52:58,464] [INFO] ===== Taxonomy check completed =====
[2023-03-19 02:52:58,464] [INFO] ===== Start completeness check using CheckM =====
[2023-03-19 02:52:58,464] [INFO] Setting CHECKM_DATA_PATH to /var/lib/cwl/stg69ec5c3c-50c1-44e7-92dc-0033e8dcf7fa/dqc_reference/checkm_data
[2023-03-19 02:52:58,465] [INFO] Selected 'Prokaryote' markers (life, taxid=0) for CheckM
[2023-03-19 02:52:58,518] [INFO] Task started: CheckM
[2023-03-19 02:52:58,518] [INFO] Running command: checkm taxonomy_wf --tab_table -f OceanDNA-b23176/cc_result.tsv -t 1 life "Prokaryote" OceanDNA-b23176/checkm_input OceanDNA-b23176/checkm_result
[2023-03-19 02:54:38,402] [INFO] Task succeeded: CheckM
[2023-03-19 02:54:38,403] [INFO] Completeness check finished.
--------------------------------------------------------------------------------
Completeness: 83.33%
Contamintation: 4.17%
Strain heterogeneity: 0.00%
--------------------------------------------------------------------------------
[2023-03-19 02:54:38,458] [INFO] ===== Completeness check finished =====
[2023-03-19 02:54:38,458] [INFO] ===== Start GTDB Search =====
[2023-03-19 02:54:38,459] [INFO] Query marker FASTA already exists. Will reuse it. (OceanDNA-b23176/markers.fasta)
[2023-03-19 02:54:38,459] [INFO] Task started: Blastn
[2023-03-19 02:54:38,459] [INFO] Running command: blastn -query OceanDNA-b23176/markers.fasta -db /var/lib/cwl/stg69ec5c3c-50c1-44e7-92dc-0033e8dcf7fa/dqc_reference/reference_markers_gtdb.fasta -out OceanDNA-b23176/blast.markers.gtdb.tsv -outfmt 6 -max_hsps 1 -num_alignments 5
[2023-03-19 02:54:39,284] [INFO] Task succeeded: Blastn
[2023-03-19 02:54:39,285] [INFO] Selected 25 target genomes.
[2023-03-19 02:54:39,285] [INFO] Target genome list was writen to OceanDNA-b23176/target_genomes_gtdb.txt
[2023-03-19 02:54:39,355] [INFO] Task started: fastANI
[2023-03-19 02:54:39,355] [INFO] Running command: fastANI --query /var/lib/cwl/stg7ccec734-c1ca-4208-b492-dfb8e5a12bad/OceanDNA-b23176.fa --refList OceanDNA-b23176/target_genomes_gtdb.txt --output OceanDNA-b23176/fastani_result_gtdb.tsv --threads 1
[2023-03-19 02:54:58,825] [INFO] Task succeeded: fastANI
[2023-03-19 02:54:58,837] [INFO] Found 21 fastANI hits (0 hits with ANI > circumscription radius)
[2023-03-19 02:54:58,838] [INFO] GTDB search result
--------------------------------------------------------------------------------
accession	gtdb_species	ani	matched_fragments	total_fragments	gtdb_taxonomy	ani_circumscription_radius	mean_intra_species_ani	min_intra_species_ani	mean_intra_species_af	min_intra_species_af	num_clustered_genomes	status
GCA_016765095.1	s__NORP233 sp016765095	77.1994	678	2626	d__Bacteria;p__Planctomycetota;c__J058;o__J058;f__J058;g__NORP233	95.0	N/A	N/A	N/A	N/A	1	-
GCA_003695705.1	s__J058 sp003695705	76.2884	604	2626	d__Bacteria;p__Planctomycetota;c__J058;o__J058;f__J058;g__J058	95.0	N/A	N/A	N/A	N/A	1	-
GCA_016200955.1	s__JACQDU01 sp016200955	75.2772	272	2626	d__Bacteria;p__Planctomycetota;c__J058;o__J058;f__JACQDU01;g__JACQDU01	95.0	N/A	N/A	N/A	N/A	1	-
GCA_003694875.1	s__J103 sp003694875	75.2388	166	2626	d__Bacteria;p__Planctomycetota;c__J058;o__J103;f__J103;g__J103	95.0	N/A	N/A	N/A	N/A	1	-
GCA_016872675.1	s__UBA2386 sp016872675	75.052	238	2626	d__Bacteria;p__Planctomycetota;c__UBA1135;o__UBA2386;f__UBA2386;g__UBA2386	95.0	N/A	N/A	N/A	N/A	1	-
GCA_016873195.1	s__VGXW01 sp016873195	75.0176	306	2626	d__Bacteria;p__Planctomycetota;c__UBA1135;o__UBA2386;f__UBA2386;g__VGXW01	95.0	N/A	N/A	N/A	N/A	1	-
GCA_900696455.1	s__UBA2376 sp900696455	74.9553	664	2626	d__Bacteria;p__Myxococcota;c__Polyangia;o__Haliangiales;f__Haliangiaceae;g__UBA2376	95.0	98.40	97.29	0.91	0.90	3	-
GCF_008087625.1	s__Aquisphaera giovannonii	74.8218	407	2626	d__Bacteria;p__Planctomycetota;c__Planctomycetia;o__Isosphaerales;f__Isosphaeraceae;g__Aquisphaera	95.0	N/A	N/A	N/A	N/A	1	-
GCA_016874795.1	s__VGTG01 sp016874795	74.784	250	2626	d__Bacteria;p__Desulfobacterota_B;c__Binatia;o__UBA12015;f__UBA12015;g__VGTG01	95.0	N/A	N/A	N/A	N/A	1	-
GCA_016794765.1	s__JAEUJG01 sp016794765	74.7803	136	2626	d__Bacteria;p__Spirochaetota;c__Spirochaetia;o__Treponematales;f__UBA8932;g__JAEUJG01	95.0	N/A	N/A	N/A	N/A	1	-
GCA_005888095.1	s__DP-3 sp005888095	74.7487	297	2626	d__Bacteria;p__Desulfobacterota_B;c__Binatia;o__UTPRO1;f__DP-6;g__DP-3	95.0	98.46	98.46	0.92	0.92	2	-
GCF_900119915.1	s__Thermophilibacter timonensis	74.7437	112	2626	d__Bacteria;p__Actinobacteriota;c__Coriobacteriia;o__Coriobacteriales;f__Atopobiaceae;g__Thermophilibacter	95.0	98.93	98.03	0.94	0.92	4	-
GCA_002402755.1	s__UBA8898 sp002402755	74.7399	235	2626	d__Bacteria;p__Planctomycetota;c__UBA8108;o__UBA8890;f__UBA8898;g__UBA8898	95.0	99.51	99.51	0.84	0.84	2	-
GCF_017592475.1	s__Glycomyces sp017592475	74.7385	242	2626	d__Bacteria;p__Actinobacteriota;c__Actinomycetia;o__Mycobacteriales;f__Micromonosporaceae;g__Glycomyces	95.0	N/A	N/A	N/A	N/A	1	-
GCA_009839925.1	s__VXMT01 sp009839925	74.7361	166	2626	d__Bacteria;p__Chloroflexota;c__Dehalococcoidia;o__UBA2979;f__UBA2979;g__VXMT01	95.0	N/A	N/A	N/A	N/A	1	-
GCF_014084125.1	s__Streptacidiphilus sp014084125	74.7326	346	2626	d__Bacteria;p__Actinobacteriota;c__Actinomycetia;o__Streptomycetales;f__Streptomycetaceae;g__Streptacidiphilus	95.0	N/A	N/A	N/A	N/A	1	-
GCF_011326735.1	s__Phycicoccus endophyticus	74.7204	240	2626	d__Bacteria;p__Actinobacteriota;c__Actinomycetia;o__Actinomycetales;f__Dermatophilaceae;g__Phycicoccus	95.0	99.99	99.99	1.00	1.00	3	-
GCF_002198675.1	s__R-H-3 sp002198675	74.6959	401	2626	d__Bacteria;p__Actinobacteriota;c__Actinomycetia;o__Actinomycetales;f__Kineosporiaceae;g__R-H-3	95.0	N/A	N/A	N/A	N/A	1	-
GCA_003244205.1	s__QHBW01 sp003244205	74.694	126	2626	d__Bacteria;p__Actinobacteriota;c__Thermoleophilia;o__Solirubrobacterales;f__Solirubrobacteraceae;g__QHBW01	95.0	N/A	N/A	N/A	N/A	1	-
GCA_016875695.1	s__SHYP01 sp016875695	74.6658	101	2626	d__Bacteria;p__Chloroflexota;c__Dehalococcoidia;o__UBA2979;f__UBA2979;g__SHYP01	95.0	N/A	N/A	N/A	N/A	1	-
GCA_003157945.1	s__Bog-756 sp003157945	74.6128	92	2626	d__Bacteria;p__Actinobacteriota;c__Acidimicrobiia;o__Acidimicrobiales;f__RAAP-2;g__Bog-756	95.0	N/A	N/A	N/A	N/A	1	-
--------------------------------------------------------------------------------
[2023-03-19 02:54:58,838] [INFO] GTDB search result was written to OceanDNA-b23176/result_gtdb.tsv
[2023-03-19 02:54:58,838] [INFO] ===== GTDB Search completed =====
[2023-03-19 02:54:58,840] [INFO] DFAST_QC result json was written to OceanDNA-b23176/dqc_result.json
[2023-03-19 02:54:58,840] [INFO] DFAST_QC completed!
[2023-03-19 02:54:58,840] [INFO] Total running time: 0h3m11s