[2023-03-14 13:46:33,108] [INFO] DFAST_QC pipeline started.
[2023-03-14 13:46:33,108] [INFO] DFAST_QC version: 0.5.7
[2023-03-14 13:46:33,108] [INFO] DQC Reference Directory: /var/lib/cwl/stg3d33d782-29ea-474b-84aa-07460ee6fa96/dqc_reference
[2023-03-14 13:46:34,234] [INFO] ===== Start taxonomy check using ANI =====
[2023-03-14 13:46:34,234] [INFO] Task started: Prodigal
[2023-03-14 13:46:34,234] [INFO] Running command: cat /var/lib/cwl/stgf7fb9e3c-7657-4f49-8979-7f21d21dafac/OceanDNA-b29571.fa | prodigal -d OceanDNA-b29571/cds.fna -a OceanDNA-b29571/protein.faa -g 11 -q > /dev/null
[2023-03-14 13:46:43,629] [INFO] Task succeeded: Prodigal
[2023-03-14 13:46:43,629] [INFO] Task started: HMMsearch
[2023-03-14 13:46:43,629] [INFO] Running command: hmmsearch --tblout OceanDNA-b29571/hmmer_result.tsv -E 1E-50 /var/lib/cwl/stg3d33d782-29ea-474b-84aa-07460ee6fa96/dqc_reference/reference_markers.hmm OceanDNA-b29571/protein.faa > /dev/null
[2023-03-14 13:46:43,789] [INFO] Task succeeded: HMMsearch
[2023-03-14 13:46:43,789] [WARNING] Found 3/6 markers. [/var/lib/cwl/stgf7fb9e3c-7657-4f49-8979-7f21d21dafac/OceanDNA-b29571.fa]
[2023-03-14 13:46:43,807] [INFO] Query marker FASTA was written to OceanDNA-b29571/markers.fasta
[2023-03-14 13:46:43,807] [INFO] Task started: Blastn
[2023-03-14 13:46:43,808] [INFO] Running command: blastn -query OceanDNA-b29571/markers.fasta -db /var/lib/cwl/stg3d33d782-29ea-474b-84aa-07460ee6fa96/dqc_reference/reference_markers.fasta -out OceanDNA-b29571/blast.markers.tsv -outfmt 6 -max_hsps 1 -num_alignments 5
[2023-03-14 13:46:44,276] [INFO] Task succeeded: Blastn
[2023-03-14 13:46:44,276] [INFO] Selected 19 target genomes.
[2023-03-14 13:46:44,277] [INFO] Target genome list was writen to OceanDNA-b29571/target_genomes.txt
[2023-03-14 13:46:44,285] [INFO] Task started: fastANI
[2023-03-14 13:46:44,285] [INFO] Running command: fastANI --query /var/lib/cwl/stgf7fb9e3c-7657-4f49-8979-7f21d21dafac/OceanDNA-b29571.fa --refList OceanDNA-b29571/target_genomes.txt --output OceanDNA-b29571/fastani_result.tsv --threads 1
[2023-03-14 13:46:54,660] [INFO] Task succeeded: fastANI
[2023-03-14 13:46:54,660] [INFO] Loading species specific ANI threshold from /var/lib/cwl/stg3d33d782-29ea-474b-84aa-07460ee6fa96/dqc_reference/prokaryote_ANI_species_specific_threshold.txt
[2023-03-14 13:46:54,660] [WARNING] Species-specific ANI threshold file not found. Will use the default threshold for all species. [/var/lib/cwl/stg3d33d782-29ea-474b-84aa-07460ee6fa96/dqc_reference/prokaryote_ANI_species_specific_threshold.txt]
[2023-03-14 13:46:54,670] [INFO] Found 16 fastANI hits (0 hits with ANI > threshold)
[2023-03-14 13:46:54,670] [INFO] The taxonomy check result is classified as 'below_threshold'.
[2023-03-14 13:46:54,670] [INFO] DFAST Taxonomy check final result
--------------------------------------------------------------------------------
organism_name	strain	accession	taxid	species_taxid	relation_to_type	validated	ani	matched_fragments	total_fragments	ani_threshold	status
Halovulum marinum	strain=2CG4	GCA_009697225.1	2662447	2662447	type	True	77.5281	135	392	95	below_threshold
Halovulum dunhuangense	strain=YYQ-30	GCA_013093415.1	1505036	1505036	type	True	77.3808	119	392	95	below_threshold
Rhodovulum visakhapatnamense	strain=JA181	GCA_004365965.1	364297	364297	type	True	76.9888	66	392	95	below_threshold
Rhodovulum adriaticum	strain=DSM 2781	GCA_016583705.1	35804	35804	type	True	76.9215	73	392	95	below_threshold
Rhodovulum sulfidophilum	strain=DSM 1374	GCA_001633165.1	35806	35806	type	True	76.7658	64	392	95	below_threshold
Rhodovulum sulfidophilum	strain=DSM 1374	GCA_000520135.2	35806	35806	type	True	76.7326	64	392	95	below_threshold
Oceanomicrobium pacificus	strain=KN286	GCA_009833495.1	2692916	2692916	type	True	76.7185	79	392	95	below_threshold
Brevirhabdus pacifica	strain=DSM 27767	GCA_002797755.1	1267768	1267768	type	True	76.6577	57	392	95	below_threshold
Rhodovulum tesquicola	strain=A-36s	GCA_024128855.1	540254	540254	type	True	76.6053	90	392	95	below_threshold
Brevirhabdus pacifica	strain=22DY15	GCA_002094875.1	1267768	1267768	type	True	76.5948	54	392	95	below_threshold
Frigidibacter mobilis	strain=cai42	GCA_001620265.1	1335048	1335048	type	True	76.536	65	392	95	below_threshold
Cereibacter ovatus	strain=JA234	GCA_900207575.1	439529	439529	type	True	76.5177	56	392	95	below_threshold
Salibaculum halophilum	strain=WDS1C4	GCA_002094885.1	1914408	1914408	type	True	76.5118	63	392	95	below_threshold
Mangrovicoccus ximenensis	strain=T1lg56	GCA_003056725.1	1911570	1911570	type	True	76.4466	61	392	95	below_threshold
Mangrovicoccus algicola	strain=HB182678	GCA_014903745.1	2771008	2771008	type	True	76.0909	65	392	95	below_threshold
Tabrizicola alkalilacus	strain=DJC	GCA_003443995.1	2305252	2305252	type	True	76.0784	54	392	95	below_threshold
--------------------------------------------------------------------------------
[2023-03-14 13:46:54,670] [INFO] DFAST Taxonomy check result was written to OceanDNA-b29571/tc_result.tsv
[2023-03-14 13:46:54,670] [INFO] ===== Taxonomy check completed =====
[2023-03-14 13:46:54,670] [INFO] ===== Start completeness check using CheckM =====
[2023-03-14 13:46:54,670] [INFO] Setting CHECKM_DATA_PATH to /var/lib/cwl/stg3d33d782-29ea-474b-84aa-07460ee6fa96/dqc_reference/checkm_data
[2023-03-14 13:46:54,671] [INFO] Selected 'Prokaryote' markers (life, taxid=0) for CheckM
[2023-03-14 13:46:54,675] [INFO] Task started: CheckM
[2023-03-14 13:46:54,675] [INFO] Running command: checkm taxonomy_wf --tab_table -f OceanDNA-b29571/cc_result.tsv -t 1 life "Prokaryote" OceanDNA-b29571/checkm_input OceanDNA-b29571/checkm_result
[2023-03-14 13:47:22,713] [INFO] Task succeeded: CheckM
[2023-03-14 13:47:22,713] [INFO] Completeness check finished.
--------------------------------------------------------------------------------
Completeness: 32.29%
Contamintation: 0.00%
Strain heterogeneity: 0.00%
--------------------------------------------------------------------------------
[2023-03-14 13:47:22,715] [INFO] ===== Completeness check finished =====
[2023-03-14 13:47:22,715] [INFO] ===== Start GTDB Search =====
[2023-03-14 13:47:22,715] [INFO] Query marker FASTA already exists. Will reuse it. (OceanDNA-b29571/markers.fasta)
[2023-03-14 13:47:22,716] [INFO] Task started: Blastn
[2023-03-14 13:47:22,716] [INFO] Running command: blastn -query OceanDNA-b29571/markers.fasta -db /var/lib/cwl/stg3d33d782-29ea-474b-84aa-07460ee6fa96/dqc_reference/reference_markers_gtdb.fasta -out OceanDNA-b29571/blast.markers.gtdb.tsv -outfmt 6 -max_hsps 1 -num_alignments 5
[2023-03-14 13:47:23,204] [INFO] Task succeeded: Blastn
[2023-03-14 13:47:23,205] [INFO] Selected 20 target genomes.
[2023-03-14 13:47:23,205] [INFO] Target genome list was writen to OceanDNA-b29571/target_genomes_gtdb.txt
[2023-03-14 13:47:23,218] [INFO] Task started: fastANI
[2023-03-14 13:47:23,218] [INFO] Running command: fastANI --query /var/lib/cwl/stgf7fb9e3c-7657-4f49-8979-7f21d21dafac/OceanDNA-b29571.fa --refList OceanDNA-b29571/target_genomes_gtdb.txt --output OceanDNA-b29571/fastani_result_gtdb.tsv --threads 1
[2023-03-14 13:47:35,169] [INFO] Task succeeded: fastANI
[2023-03-14 13:47:35,178] [INFO] Found 17 fastANI hits (0 hits with ANI > circumscription radius)
[2023-03-14 13:47:35,179] [INFO] GTDB search result
--------------------------------------------------------------------------------
accession	gtdb_species	ani	matched_fragments	total_fragments	gtdb_taxonomy	ani_circumscription_radius	mean_intra_species_ani	min_intra_species_ani	mean_intra_species_af	min_intra_species_af	num_clustered_genomes	status
GCA_009697225.1	s__Halovulum marinum	77.5281	135	392	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhodobacterales;f__Rhodobacteraceae;g__Halovulum	95.0	N/A	N/A	N/A	N/A	1	-
GCF_004365965.1	s__Rhodovulum visakhapatnamense	76.9515	67	392	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhodobacterales;f__Rhodobacteraceae;g__Rhodovulum	95.0	98.55	98.44	0.90	0.89	5	-
GCA_019061145.1	s__ASV31 sp019061145	76.8987	71	392	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhodobacterales;f__Rhodobacteraceae;g__ASV31	95.0	N/A	N/A	N/A	N/A	1	-
GCF_004345735.1	s__Rhodovulum adriaticum	76.8643	73	392	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhodobacterales;f__Rhodobacteraceae;g__Rhodovulum	95.0	99.99	99.99	0.99	0.99	2	-
GCF_001633165.1	s__Rhodovulum sulfidophilum	76.7658	64	392	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhodobacterales;f__Rhodobacteraceae;g__Rhodovulum	95.0	97.78	97.16	0.92	0.88	14	-
GCF_003335585.1	s__Sulfitobacter sp003335585	76.7231	58	392	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhodobacterales;f__Rhodobacteraceae;g__Sulfitobacter	95.0	N/A	N/A	N/A	N/A	1	-
GCF_009833495.1	s__Oceanomicrobium pacificus	76.7185	79	392	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhodobacterales;f__Rhodobacteraceae;g__Oceanomicrobium	95.0	N/A	N/A	N/A	N/A	1	-
GCF_002797755.1	s__Brevirhabdus pacifica	76.6577	57	392	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhodobacterales;f__Rhodobacteraceae;g__Brevirhabdus	95.0	99.99	99.97	0.99	0.98	4	-
GCF_900142935.1	s__Rhodovulum sp900142935	76.6511	74	392	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhodobacterales;f__Rhodobacteraceae;g__Rhodovulum	95.0	N/A	N/A	N/A	N/A	1	-
GCA_003500165.1	s__UBA10424 sp003500165	76.6388	67	392	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhodobacterales;f__Rhodobacteraceae;g__UBA10424	95.0	99.72	99.62	0.98	0.98	3	-
GCF_009363655.1	s__Roseovarius sp009363655	76.5785	57	392	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhodobacterales;f__Rhodobacteraceae;g__Roseovarius	95.0	98.13	98.13	0.93	0.93	2	-
GCF_018139985.1	s__JAGSOU01 sp018139985	76.5529	77	392	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhodobacterales;f__Rhodobacteraceae;g__JAGSOU01	95.0	N/A	N/A	N/A	N/A	1	-
GCF_002094885.1	s__Salibaculum halophilum	76.5118	63	392	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhodobacterales;f__Rhodobacteraceae;g__Salibaculum	95.0	N/A	N/A	N/A	N/A	1	-
GCA_014359745.1	s__JACIYW01 sp014359745	76.3419	53	392	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhodobacterales;f__Rhodobacteraceae;g__JACIYW01	95.0	N/A	N/A	N/A	N/A	1	-
GCF_005870025.1	s__Mangrovicoccus sp005870025	76.2575	72	392	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhodobacterales;f__Rhodobacteraceae;g__Mangrovicoccus	95.0	N/A	N/A	N/A	N/A	1	-
GCF_014903745.1	s__Mangrovicoccus sp014903745	76.0909	65	392	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhodobacterales;f__Rhodobacteraceae;g__Mangrovicoccus	95.0	N/A	N/A	N/A	N/A	1	-
GCA_018058035.1	s__JAAKGP01 sp018058035	75.8397	61	392	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhodobacterales;f__Rhodobacteraceae;g__JAAKGP01	95.0	N/A	N/A	N/A	N/A	1	-
--------------------------------------------------------------------------------
[2023-03-14 13:47:35,179] [INFO] GTDB search result was written to OceanDNA-b29571/result_gtdb.tsv
[2023-03-14 13:47:35,179] [INFO] ===== GTDB Search completed =====
[2023-03-14 13:47:35,181] [INFO] DFAST_QC result json was written to OceanDNA-b29571/dqc_result.json
[2023-03-14 13:47:35,181] [INFO] DFAST_QC completed!
[2023-03-14 13:47:35,181] [INFO] Total running time: 0h1m2s