[2023-06-27 11:41:29,733] [INFO] DFAST_QC pipeline started.
[2023-06-27 11:41:29,736] [INFO] DFAST_QC version: 0.5.7
[2023-06-27 11:41:29,736] [INFO] DQC Reference Directory: /var/lib/cwl/stg229a52d9-57fa-48bb-822c-2d76a32d462b/dqc_reference
[2023-06-27 11:41:30,909] [INFO] ===== Start taxonomy check using ANI =====
[2023-06-27 11:41:30,910] [INFO] Task started: Prodigal
[2023-06-27 11:41:30,910] [INFO] Running command: gunzip -c /var/lib/cwl/stg125ca3df-6531-4cd7-9982-a71c581f7f04/GCA_026982255.1_ASM2698225v1_genomic.fna.gz | prodigal -d GCA_026982255.1_ASM2698225v1_genomic.fna/cds.fna -a GCA_026982255.1_ASM2698225v1_genomic.fna/protein.faa -g 11 -q > /dev/null
[2023-06-27 11:41:37,261] [INFO] Task succeeded: Prodigal
[2023-06-27 11:41:37,261] [INFO] Task started: HMMsearch
[2023-06-27 11:41:37,261] [INFO] Running command: hmmsearch --tblout GCA_026982255.1_ASM2698225v1_genomic.fna/hmmer_result.tsv -E 1E-50 /var/lib/cwl/stg229a52d9-57fa-48bb-822c-2d76a32d462b/dqc_reference/reference_markers.hmm GCA_026982255.1_ASM2698225v1_genomic.fna/protein.faa > /dev/null
[2023-06-27 11:41:37,511] [INFO] Task succeeded: HMMsearch
[2023-06-27 11:41:37,512] [INFO] Found 6/6 markers.
[2023-06-27 11:41:37,535] [INFO] Query marker FASTA was written to GCA_026982255.1_ASM2698225v1_genomic.fna/markers.fasta
[2023-06-27 11:41:37,535] [INFO] Task started: Blastn
[2023-06-27 11:41:37,536] [INFO] Running command: blastn -query GCA_026982255.1_ASM2698225v1_genomic.fna/markers.fasta -db /var/lib/cwl/stg229a52d9-57fa-48bb-822c-2d76a32d462b/dqc_reference/reference_markers.fasta -out GCA_026982255.1_ASM2698225v1_genomic.fna/blast.markers.tsv -outfmt 6 -max_hsps 1 -num_alignments 5
[2023-06-27 11:41:38,354] [INFO] Task succeeded: Blastn
[2023-06-27 11:41:38,518] [INFO] Selected 29 target genomes.
[2023-06-27 11:41:38,518] [INFO] Target genome list was writen to GCA_026982255.1_ASM2698225v1_genomic.fna/target_genomes.txt
[2023-06-27 11:41:38,528] [INFO] Task started: fastANI
[2023-06-27 11:41:38,528] [INFO] Running command: fastANI --query /var/lib/cwl/stg125ca3df-6531-4cd7-9982-a71c581f7f04/GCA_026982255.1_ASM2698225v1_genomic.fna.gz --refList GCA_026982255.1_ASM2698225v1_genomic.fna/target_genomes.txt --output GCA_026982255.1_ASM2698225v1_genomic.fna/fastani_result.tsv --threads 1
[2023-06-27 11:41:56,486] [INFO] Task succeeded: fastANI
[2023-06-27 11:41:56,487] [INFO] Loading species specific ANI threshold from /var/lib/cwl/stg229a52d9-57fa-48bb-822c-2d76a32d462b/dqc_reference/prokaryote_ANI_species_specific_threshold.txt
[2023-06-27 11:41:56,487] [WARNING] Species-specific ANI threshold file not found. Will use the default threshold for all species. [/var/lib/cwl/stg229a52d9-57fa-48bb-822c-2d76a32d462b/dqc_reference/prokaryote_ANI_species_specific_threshold.txt]
[2023-06-27 11:41:56,508] [INFO] Found 29 fastANI hits (0 hits with ANI > threshold)
[2023-06-27 11:41:56,508] [INFO] The taxonomy check result is classified as 'below_threshold'.
[2023-06-27 11:41:56,508] [INFO] DFAST Taxonomy check final result
--------------------------------------------------------------------------------
organism_name	strain	accession	taxid	species_taxid	relation_to_type	validated	ani	matched_fragments	total_fragments	ani_threshold	status
Oceanomicrobium pacificus	strain=KN286	GCA_009833495.1	2692916	2692916	type	True	77.6562	220	665	95	below_threshold
Halovulum dunhuangense	strain=YYQ-30	GCA_013093415.1	1505036	1505036	type	True	77.5782	215	665	95	below_threshold
Halovulum marinum	strain=2CG4	GCA_009697225.1	2662447	2662447	type	True	77.5758	216	665	95	below_threshold
Profundibacter amoris	strain=BAR1	GCA_003544895.1	2171755	2171755	type	True	77.5209	214	665	95	below_threshold
Rhodovulum adriaticum	strain=DSM 2781	GCA_004345735.1	35804	35804	type	True	77.3235	181	665	95	below_threshold
Rhodovulum adriaticum	strain=DSM 2781	GCA_016583705.1	35804	35804	type	True	77.2752	181	665	95	below_threshold
Thalassobius aquimarinus	strain=KMM 8518	GCA_018219815.1	2785917	2785917	type	True	77.1871	195	665	95	below_threshold
Brevirhabdus pacifica	strain=DSM 27767	GCA_002797755.1	1267768	1267768	type	True	77.1621	181	665	95	below_threshold
Cribrihabitans marinus	strain=CGMCC 1.13219	GCA_014640375.1	1227549	1227549	type	True	77.1552	185	665	95	below_threshold
Brevirhabdus pacifica	strain=22DY15	GCA_002094875.1	1267768	1267768	type	True	77.1484	168	665	95	below_threshold
Cribrihabitans marinus	strain=DSM 29340	GCA_900109035.1	1227549	1227549	type	True	77.1427	186	665	95	below_threshold
Phaeovulum vinaykumarii	strain=JA123	GCA_900217755.1	407234	407234	type	True	77.0882	181	665	95	below_threshold
Phaeovulum vinaykumarii	strain=DSM 18714	GCA_900156695.1	407234	407234	type	True	77.0611	181	665	95	below_threshold
Paracoccus yeei	strain=ATCC BAA-599	GCA_000622145.1	147645	147645	type	True	77.0422	167	665	95	below_threshold
Pararhodobacter aggregans	strain=DSM 18938	GCA_003054005.1	404875	404875	type	True	76.9076	192	665	95	below_threshold
Pararhodobacter aggregans	strain=D1-19	GCA_003075525.1	404875	404875	type	True	76.8963	194	665	95	below_threshold
Litoreibacter arenae	strain=DSM 19593	GCA_000442275.2	491388	491388	type	True	76.8637	147	665	95	below_threshold
Mangrovicoccus algicola	strain=HB182678	GCA_014903745.1	2771008	2771008	type	True	76.8636	160	665	95	below_threshold
Paracoccus shandongensis	strain=wg2	GCA_017315735.1	2816048	2816048	type	True	76.8319	163	665	95	below_threshold
Dinoroseobacter shibae	strain=DFL 12	GCA_000018145.1	215813	215813	type	True	76.8148	176	665	95	below_threshold
Amaricoccus solimangrovi	strain=HB172011	GCA_006385685.1	2589815	2589815	type	True	76.8126	145	665	95	below_threshold
Actibacterium atlanticum	strain=22II-S11-z10	GCA_000671395.1	1461693	1461693	type	True	76.7785	119	665	95	below_threshold
Pararhodobacter zhoushanensis	strain=ZQ420	GCA_003990445.1	2479545	2479545	type	True	76.7456	167	665	95	below_threshold
Paracoccus salsus	strain=EGI L200073	GCA_021556615.1	2911061	2911061	type	True	76.7156	132	665	95	below_threshold
Roseovarius aestuariivivens	strain=GHTF-24	GCA_004761875.1	1888910	1888910	type	True	76.6713	117	665	95	below_threshold
Mangrovicoccus ximenensis	strain=T1lg56	GCA_003056725.1	1911570	1911570	type	True	76.5776	163	665	95	below_threshold
Aliiroseovarius halocynthiae	strain=MA1-10	GCA_007004645.1	985055	985055	type	True	76.3519	87	665	95	below_threshold
Palleronia rufa	strain=MOLA 401	GCA_000743715.1	1530186	1530186	type	True	76.3492	163	665	95	below_threshold
Mesorhizobium silamurunense	strain=CCBAU 01550	GCA_014843825.1	499528	499528	type	True	76.1789	65	665	95	below_threshold
--------------------------------------------------------------------------------
[2023-06-27 11:41:56,510] [INFO] DFAST Taxonomy check result was written to GCA_026982255.1_ASM2698225v1_genomic.fna/tc_result.tsv
[2023-06-27 11:41:56,511] [INFO] ===== Taxonomy check completed =====
[2023-06-27 11:41:56,511] [INFO] ===== Start completeness check using CheckM =====
[2023-06-27 11:41:56,511] [INFO] Setting CHECKM_DATA_PATH to /var/lib/cwl/stg229a52d9-57fa-48bb-822c-2d76a32d462b/dqc_reference/checkm_data
[2023-06-27 11:41:56,512] [INFO] Selected 'Prokaryote' markers (life, taxid=0) for CheckM
[2023-06-27 11:41:56,537] [INFO] Task started: CheckM
[2023-06-27 11:41:56,538] [INFO] Running command: checkm taxonomy_wf --tab_table -f GCA_026982255.1_ASM2698225v1_genomic.fna/cc_result.tsv -t 1 life "Prokaryote" GCA_026982255.1_ASM2698225v1_genomic.fna/checkm_input GCA_026982255.1_ASM2698225v1_genomic.fna/checkm_result
[2023-06-27 11:42:20,393] [INFO] Task succeeded: CheckM
[2023-06-27 11:42:20,394] [INFO] Completeness check finished.
--------------------------------------------------------------------------------
Completeness: 97.92%
Contamintation: 0.52%
Strain heterogeneity: 0.00%
--------------------------------------------------------------------------------
[2023-06-27 11:42:20,418] [INFO] ===== Completeness check finished =====
[2023-06-27 11:42:20,418] [INFO] ===== Start GTDB Search =====
[2023-06-27 11:42:20,418] [INFO] Query marker FASTA already exists. Will reuse it. (GCA_026982255.1_ASM2698225v1_genomic.fna/markers.fasta)
[2023-06-27 11:42:20,418] [INFO] Task started: Blastn
[2023-06-27 11:42:20,419] [INFO] Running command: blastn -query GCA_026982255.1_ASM2698225v1_genomic.fna/markers.fasta -db /var/lib/cwl/stg229a52d9-57fa-48bb-822c-2d76a32d462b/dqc_reference/reference_markers_gtdb.fasta -out GCA_026982255.1_ASM2698225v1_genomic.fna/blast.markers.gtdb.tsv -outfmt 6 -max_hsps 1 -num_alignments 5
[2023-06-27 11:42:21,912] [INFO] Task succeeded: Blastn
[2023-06-27 11:42:21,918] [INFO] Selected 22 target genomes.
[2023-06-27 11:42:21,918] [INFO] Target genome list was writen to GCA_026982255.1_ASM2698225v1_genomic.fna/target_genomes_gtdb.txt
[2023-06-27 11:42:21,928] [INFO] Task started: fastANI
[2023-06-27 11:42:21,928] [INFO] Running command: fastANI --query /var/lib/cwl/stg125ca3df-6531-4cd7-9982-a71c581f7f04/GCA_026982255.1_ASM2698225v1_genomic.fna.gz --refList GCA_026982255.1_ASM2698225v1_genomic.fna/target_genomes_gtdb.txt --output GCA_026982255.1_ASM2698225v1_genomic.fna/fastani_result_gtdb.tsv --threads 1
[2023-06-27 11:42:35,737] [INFO] Task succeeded: fastANI
[2023-06-27 11:42:35,754] [INFO] Found 22 fastANI hits (0 hits with ANI > circumscription radius)
[2023-06-27 11:42:35,755] [INFO] GTDB search result
--------------------------------------------------------------------------------
accession	gtdb_species	ani	matched_fragments	total_fragments	gtdb_taxonomy	ani_circumscription_radius	mean_intra_species_ani	min_intra_species_ani	mean_intra_species_af	min_intra_species_af	num_clustered_genomes	status
GCA_015485375.1	s__UBA3077 sp015485375	86.6626	445	665	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhodobacterales;f__Rhodobacteraceae;g__UBA3077	95.0	N/A	N/A	N/A	N/A	1	-
GCA_015487165.1	s__UBA5972 sp015487165	78.007	190	665	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhodobacterales;f__Rhodobacteraceae;g__UBA5972	95.0	N/A	N/A	N/A	N/A	1	-
GCF_009833495.1	s__Oceanomicrobium pacificus	77.6696	219	665	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhodobacterales;f__Rhodobacteraceae;g__Oceanomicrobium	95.0	N/A	N/A	N/A	N/A	1	-
GCF_013093415.1	s__Halovulum dunhuangense	77.5957	213	665	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhodobacterales;f__Rhodobacteraceae;g__Halovulum	95.0	N/A	N/A	N/A	N/A	1	-
GCA_009697225.1	s__Halovulum marinum	77.5617	217	665	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhodobacterales;f__Rhodobacteraceae;g__Halovulum	95.0	N/A	N/A	N/A	N/A	1	-
GCA_002366645.1	s__UBA3077 sp002366645	77.436	138	665	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhodobacterales;f__Rhodobacteraceae;g__UBA3077	95.0	N/A	N/A	N/A	N/A	1	-
GCF_001482405.1	s__Ponticoccus marisrubri	77.3989	177	665	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhodobacterales;f__Rhodobacteraceae;g__Ponticoccus	95.0	N/A	N/A	N/A	N/A	1	-
GCA_002746395.1	s__UBA3077 sp002746395	77.3806	168	665	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhodobacterales;f__Rhodobacteraceae;g__UBA3077	95.0	N/A	N/A	N/A	N/A	1	-
GCF_002285435.1	s__Actibacterium ureilyticum	77.291	217	665	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhodobacterales;f__Rhodobacteraceae;g__Actibacterium	95.0	N/A	N/A	N/A	N/A	1	-
GCF_018219815.1	s__Thalassobius aquimarinus	77.2015	194	665	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhodobacterales;f__Rhodobacteraceae;g__Thalassobius	95.0	N/A	N/A	N/A	N/A	1	-
GCF_002797755.1	s__Brevirhabdus pacifica	77.1775	180	665	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhodobacterales;f__Rhodobacteraceae;g__Brevirhabdus	95.0	99.99	99.97	0.99	0.98	4	-
GCF_000744955.1	s__Actibacterium sp000744955	77.1599	200	665	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhodobacterales;f__Rhodobacteraceae;g__Actibacterium	95.0	N/A	N/A	N/A	N/A	1	-
GCF_013031405.1	s__Ruegeria sp013031405	77.0745	175	665	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhodobacterales;f__Rhodobacteraceae;g__Ruegeria	95.0	N/A	N/A	N/A	N/A	1	-
GCF_001975705.1	s__Salipiger abyssi	77.0663	186	665	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhodobacterales;f__Rhodobacteraceae;g__Salipiger	95.0	N/A	N/A	N/A	N/A	1	-
GCA_001650895.1	s__EhC02 sp001650895	77.0488	177	665	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhodobacterales;f__Rhodobacteraceae;g__EhC02	95.0	N/A	N/A	N/A	N/A	1	-
GCF_016888515.1	s__Shimia_A biformata	77.0119	173	665	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhodobacterales;f__Rhodobacteraceae;g__Shimia_A	95.0	N/A	N/A	N/A	N/A	1	-
GCF_900110775.1	s__Litorimicrobium taeanense	76.9542	156	665	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhodobacterales;f__Rhodobacteraceae;g__Litorimicrobium	95.0	N/A	N/A	N/A	N/A	1	-
GCA_002162875.1	s__UBA3077 sp002162875	76.9318	146	665	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhodobacterales;f__Rhodobacteraceae;g__UBA3077	95.0	N/A	N/A	N/A	N/A	1	-
GCF_014903745.1	s__Mangrovicoccus sp014903745	76.8952	158	665	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhodobacterales;f__Rhodobacteraceae;g__Mangrovicoccus	95.0	N/A	N/A	N/A	N/A	1	-
GCF_014638275.1	s__Pseudooceanicola flagellatus	76.8932	171	665	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhodobacterales;f__Rhodobacteraceae;g__Pseudooceanicola	95.0	N/A	N/A	N/A	N/A	1	-
GCA_016765235.1	s__UBA3077 sp016765235	76.8809	161	665	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhodobacterales;f__Rhodobacteraceae;g__UBA3077	95.0	N/A	N/A	N/A	N/A	1	-
GCF_000743715.1	s__Palleronia rufa	76.3738	161	665	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhodobacterales;f__Rhodobacteraceae;g__Palleronia	95.0	N/A	N/A	N/A	N/A	1	-
--------------------------------------------------------------------------------
[2023-06-27 11:42:35,757] [INFO] GTDB search result was written to GCA_026982255.1_ASM2698225v1_genomic.fna/result_gtdb.tsv
[2023-06-27 11:42:35,760] [INFO] ===== GTDB Search completed =====
[2023-06-27 11:42:35,765] [INFO] DFAST_QC result json was written to GCA_026982255.1_ASM2698225v1_genomic.fna/dqc_result.json
[2023-06-27 11:42:35,766] [INFO] DFAST_QC completed!
[2023-06-27 11:42:35,766] [INFO] Total running time: 0h1m6s