[2023-06-29 03:47:05,346] [INFO] DFAST_QC pipeline started.
[2023-06-29 03:47:05,349] [INFO] DFAST_QC version: 0.5.7
[2023-06-29 03:47:05,349] [INFO] DQC Reference Directory: /var/lib/cwl/stg13be81c1-78b2-4ac9-93fc-9c41bcd9f101/dqc_reference
[2023-06-29 03:47:06,591] [INFO] ===== Start taxonomy check using ANI =====
[2023-06-29 03:47:06,592] [INFO] Task started: Prodigal
[2023-06-29 03:47:06,592] [INFO] Running command: gunzip -c /var/lib/cwl/stg475a5524-87d5-4e37-be94-30be8626b837/GCA_027013095.1_ASM2701309v1_genomic.fna.gz | prodigal -d GCA_027013095.1_ASM2701309v1_genomic.fna/cds.fna -a GCA_027013095.1_ASM2701309v1_genomic.fna/protein.faa -g 11 -q > /dev/null
[2023-06-29 03:47:15,527] [INFO] Task succeeded: Prodigal
[2023-06-29 03:47:15,527] [INFO] Task started: HMMsearch
[2023-06-29 03:47:15,527] [INFO] Running command: hmmsearch --tblout GCA_027013095.1_ASM2701309v1_genomic.fna/hmmer_result.tsv -E 1E-50 /var/lib/cwl/stg13be81c1-78b2-4ac9-93fc-9c41bcd9f101/dqc_reference/reference_markers.hmm GCA_027013095.1_ASM2701309v1_genomic.fna/protein.faa > /dev/null
[2023-06-29 03:47:15,821] [INFO] Task succeeded: HMMsearch
[2023-06-29 03:47:15,822] [INFO] Found 6/6 markers.
[2023-06-29 03:47:15,865] [INFO] Query marker FASTA was written to GCA_027013095.1_ASM2701309v1_genomic.fna/markers.fasta
[2023-06-29 03:47:15,866] [INFO] Task started: Blastn
[2023-06-29 03:47:15,866] [INFO] Running command: blastn -query GCA_027013095.1_ASM2701309v1_genomic.fna/markers.fasta -db /var/lib/cwl/stg13be81c1-78b2-4ac9-93fc-9c41bcd9f101/dqc_reference/reference_markers.fasta -out GCA_027013095.1_ASM2701309v1_genomic.fna/blast.markers.tsv -outfmt 6 -max_hsps 1 -num_alignments 5
[2023-06-29 03:47:16,707] [INFO] Task succeeded: Blastn
[2023-06-29 03:47:16,711] [INFO] Selected 31 target genomes.
[2023-06-29 03:47:16,711] [INFO] Target genome list was writen to GCA_027013095.1_ASM2701309v1_genomic.fna/target_genomes.txt
[2023-06-29 03:47:16,729] [INFO] Task started: fastANI
[2023-06-29 03:47:16,730] [INFO] Running command: fastANI --query /var/lib/cwl/stg475a5524-87d5-4e37-be94-30be8626b837/GCA_027013095.1_ASM2701309v1_genomic.fna.gz --refList GCA_027013095.1_ASM2701309v1_genomic.fna/target_genomes.txt --output GCA_027013095.1_ASM2701309v1_genomic.fna/fastani_result.tsv --threads 1
[2023-06-29 03:47:40,958] [INFO] Task succeeded: fastANI
[2023-06-29 03:47:40,958] [INFO] Loading species specific ANI threshold from /var/lib/cwl/stg13be81c1-78b2-4ac9-93fc-9c41bcd9f101/dqc_reference/prokaryote_ANI_species_specific_threshold.txt
[2023-06-29 03:47:40,959] [WARNING] Species-specific ANI threshold file not found. Will use the default threshold for all species. [/var/lib/cwl/stg13be81c1-78b2-4ac9-93fc-9c41bcd9f101/dqc_reference/prokaryote_ANI_species_specific_threshold.txt]
[2023-06-29 03:47:40,981] [INFO] Found 31 fastANI hits (0 hits with ANI > threshold)
[2023-06-29 03:47:40,981] [INFO] The taxonomy check result is classified as 'below_threshold'.
[2023-06-29 03:47:40,981] [INFO] DFAST Taxonomy check final result
--------------------------------------------------------------------------------
organism_name	strain	accession	taxid	species_taxid	relation_to_type	validated	ani	matched_fragments	total_fragments	ani_threshold	status
Brevirhabdus pacifica	strain=22DY15	GCA_002094875.1	1267768	1267768	type	True	77.013	163	939	95	below_threshold
Roseovarius faecimaris	strain=MME-070	GCA_009762325.1	2494550	2494550	type	True	76.9674	165	939	95	below_threshold
Brevirhabdus pacifica	strain=DSM 27767	GCA_002797755.1	1267768	1267768	type	True	76.939	175	939	95	below_threshold
Roseovarius mucosus	strain=DSM 17069	GCA_000768555.3	215743	215743	type	True	76.9156	161	939	95	below_threshold
Roseibacterium elongatum	strain=DFL-43	GCA_000590925.1	159346	159346	type	True	76.8812	177	939	95	below_threshold
Ruegeria pomeroyi	strain=DSS-3	GCA_000011965.2	89184	89184	suspected-type	True	76.8453	223	939	95	below_threshold
Rhodovulum euryhalinum	strain=DSM 4868	GCA_004342445.1	35805	35805	type	True	76.8408	187	939	95	below_threshold
Ruegeria intermedia	strain=DSM 29341	GCA_900129345.1	996115	996115	type	True	76.8269	199	939	95	below_threshold
Antarcticimicrobium luteum	strain=318-1	GCA_004358185.1	2547397	2547397	type	True	76.8219	244	939	95	below_threshold
Roseovarius bejariae	strain=A21	GCA_009669325.1	2576383	2576383	type	True	76.7781	164	939	95	below_threshold
Rhodovulum visakhapatnamense	strain=JA181	GCA_004365965.1	364297	364297	type	True	76.7673	195	939	95	below_threshold
Halovulum dunhuangense	strain=YYQ-30	GCA_013093415.1	1505036	1505036	type	True	76.7473	143	939	95	below_threshold
Frigidibacter albus	strain=CGMCC 1.13995	GCA_014640395.1	1465486	1465486	type	True	76.736	169	939	95	below_threshold
Alkalilacustris brevis	strain=34079	GCA_003350345.1	2026338	2026338	type	True	76.6766	160	939	95	below_threshold
Frigidibacter albus	strain=SP32	GCA_009881095.1	1465486	1465486	type	True	76.6718	165	939	95	below_threshold
Frigidibacter albus	strain=SP32	GCA_009908165.1	1465486	1465486	type	True	76.6493	169	939	95	below_threshold
Pacificitalea manganoxidans	strain=DY25	GCA_002504165.1	1411902	1411902	type	True	76.6437	142	939	95	below_threshold
Frigidibacter albus	strain=SP32	GCA_010993795.1	1465486	1465486	type	True	76.6337	169	939	95	below_threshold
Phaeovulum veldkampii	strain=DSM 11550	GCA_003034995.1	33049	33049	type	True	76.6153	181	939	95	below_threshold
Rhodovulum strictum	strain=DSM 11289	GCA_009649175.1	58314	58314	type	True	76.6016	207	939	95	below_threshold
Halovulum marinum	strain=2CG4	GCA_009697225.1	2662447	2662447	type	True	76.5843	171	939	95	below_threshold
Rhodovulum tesquicola	strain=A-36s	GCA_024128855.1	540254	540254	type	True	76.581	195	939	95	below_threshold
Oceanomicrobium pacificus	strain=KN286	GCA_009833495.1	2692916	2692916	type	True	76.5749	155	939	95	below_threshold
Salipiger marinus	strain=DSM 26424	GCA_900100085.1	555512	555512	type	True	76.5551	179	939	95	below_threshold
Rhodovulum steppense	strain=DSM 21153	GCA_004339675.1	540251	540251	type	True	76.5304	193	939	95	below_threshold
Mangrovicoccus algicola	strain=HB182678	GCA_014903745.1	2771008	2771008	type	True	76.4882	152	939	95	below_threshold
Rhabdonatronobacter sediminivivens	strain=IM2376	GCA_013415485.1	2743469	2743469	type	True	76.4165	183	939	95	below_threshold
Nioella sediminis	strain=JS7-11	GCA_001879695.1	1912092	1912092	type	True	76.2966	167	939	95	below_threshold
Amaricoccus solimangrovi	strain=HB172011	GCA_006385685.1	2589815	2589815	type	True	76.2571	90	939	95	below_threshold
Ruegeria haliotis	strain=B1Z28	GCA_013377785.1	2747601	2747601	type	True	76.2241	118	939	95	below_threshold
Roseivivax isoporae	strain=LMG 25204	GCA_000521865.1	591206	591206	type	True	75.9821	114	939	95	below_threshold
--------------------------------------------------------------------------------
[2023-06-29 03:47:41,004] [INFO] DFAST Taxonomy check result was written to GCA_027013095.1_ASM2701309v1_genomic.fna/tc_result.tsv
[2023-06-29 03:47:41,005] [INFO] ===== Taxonomy check completed =====
[2023-06-29 03:47:41,005] [INFO] ===== Start completeness check using CheckM =====
[2023-06-29 03:47:41,006] [INFO] Setting CHECKM_DATA_PATH to /var/lib/cwl/stg13be81c1-78b2-4ac9-93fc-9c41bcd9f101/dqc_reference/checkm_data
[2023-06-29 03:47:41,007] [INFO] Selected 'Prokaryote' markers (life, taxid=0) for CheckM
[2023-06-29 03:47:41,042] [INFO] Task started: CheckM
[2023-06-29 03:47:41,042] [INFO] Running command: checkm taxonomy_wf --tab_table -f GCA_027013095.1_ASM2701309v1_genomic.fna/cc_result.tsv -t 1 life "Prokaryote" GCA_027013095.1_ASM2701309v1_genomic.fna/checkm_input GCA_027013095.1_ASM2701309v1_genomic.fna/checkm_result
[2023-06-29 03:48:11,943] [INFO] Task succeeded: CheckM
[2023-06-29 03:48:11,946] [INFO] Completeness check finished.
--------------------------------------------------------------------------------
Completeness: 100.00%
Contamintation: 0.00%
Strain heterogeneity: 0.00%
--------------------------------------------------------------------------------
[2023-06-29 03:48:11,967] [INFO] ===== Completeness check finished =====
[2023-06-29 03:48:11,968] [INFO] ===== Start GTDB Search =====
[2023-06-29 03:48:11,968] [INFO] Query marker FASTA already exists. Will reuse it. (GCA_027013095.1_ASM2701309v1_genomic.fna/markers.fasta)
[2023-06-29 03:48:11,969] [INFO] Task started: Blastn
[2023-06-29 03:48:11,969] [INFO] Running command: blastn -query GCA_027013095.1_ASM2701309v1_genomic.fna/markers.fasta -db /var/lib/cwl/stg13be81c1-78b2-4ac9-93fc-9c41bcd9f101/dqc_reference/reference_markers_gtdb.fasta -out GCA_027013095.1_ASM2701309v1_genomic.fna/blast.markers.gtdb.tsv -outfmt 6 -max_hsps 1 -num_alignments 5
[2023-06-29 03:48:13,399] [INFO] Task succeeded: Blastn
[2023-06-29 03:48:13,405] [INFO] Selected 29 target genomes.
[2023-06-29 03:48:13,405] [INFO] Target genome list was writen to GCA_027013095.1_ASM2701309v1_genomic.fna/target_genomes_gtdb.txt
[2023-06-29 03:48:13,442] [INFO] Task started: fastANI
[2023-06-29 03:48:13,442] [INFO] Running command: fastANI --query /var/lib/cwl/stg475a5524-87d5-4e37-be94-30be8626b837/GCA_027013095.1_ASM2701309v1_genomic.fna.gz --refList GCA_027013095.1_ASM2701309v1_genomic.fna/target_genomes_gtdb.txt --output GCA_027013095.1_ASM2701309v1_genomic.fna/fastani_result_gtdb.tsv --threads 1
[2023-06-29 03:48:30,348] [INFO] Task succeeded: fastANI
[2023-06-29 03:48:30,385] [INFO] Found 29 fastANI hits (0 hits with ANI > circumscription radius)
[2023-06-29 03:48:30,385] [INFO] GTDB search result
--------------------------------------------------------------------------------
accession	gtdb_species	ani	matched_fragments	total_fragments	gtdb_taxonomy	ani_circumscription_radius	mean_intra_species_ani	min_intra_species_ani	mean_intra_species_af	min_intra_species_af	num_clustered_genomes	status
GCA_015487165.1	s__UBA5972 sp015487165	78.7589	364	939	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhodobacterales;f__Rhodobacteraceae;g__UBA5972	95.0	N/A	N/A	N/A	N/A	1	-
GCA_013151535.1	s__UBA5972 sp013151535	77.2199	260	939	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhodobacterales;f__Rhodobacteraceae;g__UBA5972	95.0	N/A	N/A	N/A	N/A	1	-
GCF_003122205.1	s__QEYE01 sp003122205	76.9867	170	939	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhodobacterales;f__Rhodobacteraceae;g__QEYE01	95.0	N/A	N/A	N/A	N/A	1	-
GCA_015491895.1	s__WFVA01 sp015491895	76.9757	159	939	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhodobacterales;f__Rhodobacteraceae;g__WFVA01	95.0	99.60	99.57	0.92	0.91	3	-
GCA_013151335.1	s__UBA5972 sp013151335	76.9342	246	939	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhodobacterales;f__Rhodobacteraceae;g__UBA5972	95.0	N/A	N/A	N/A	N/A	1	-
GCF_002797755.1	s__Brevirhabdus pacifica	76.9251	176	939	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhodobacterales;f__Rhodobacteraceae;g__Brevirhabdus	95.0	99.99	99.97	0.99	0.98	4	-
GCF_000744955.1	s__Actibacterium sp000744955	76.9068	212	939	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhodobacterales;f__Rhodobacteraceae;g__Actibacterium	95.0	N/A	N/A	N/A	N/A	1	-
GCF_002285435.1	s__Actibacterium ureilyticum	76.9042	236	939	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhodobacterales;f__Rhodobacteraceae;g__Actibacterium	95.0	N/A	N/A	N/A	N/A	1	-
GCF_018139985.1	s__JAGSOU01 sp018139985	76.8601	232	939	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhodobacterales;f__Rhodobacteraceae;g__JAGSOU01	95.0	N/A	N/A	N/A	N/A	1	-
GCF_000011965.2	s__Ruegeria_B pomeroyi	76.8453	223	939	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhodobacterales;f__Rhodobacteraceae;g__Ruegeria_B	95.0	99.97	99.92	0.99	0.98	5	-
GCA_014359745.1	s__JACIYW01 sp014359745	76.8042	163	939	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhodobacterales;f__Rhodobacteraceae;g__JACIYW01	95.0	N/A	N/A	N/A	N/A	1	-
GCF_004365965.1	s__Rhodovulum visakhapatnamense	76.7793	195	939	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhodobacterales;f__Rhodobacteraceae;g__Rhodovulum	95.0	98.55	98.44	0.90	0.89	5	-
GCF_013093415.1	s__Halovulum dunhuangense	76.7473	143	939	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhodobacterales;f__Rhodobacteraceae;g__Halovulum	95.0	N/A	N/A	N/A	N/A	1	-
GCA_003172915.1	s__Fluviibacterium sp003172915	76.7383	108	939	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhodobacterales;f__Rhodobacteraceae;g__Fluviibacterium	95.0	N/A	N/A	N/A	N/A	1	-
GCF_002237555.1	s__Antarctobacter heliothermus_B	76.7297	131	939	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhodobacterales;f__Rhodobacteraceae;g__Antarctobacter	95.0	97.31	97.29	0.90	0.89	3	-
GCF_003344785.1	s__Puniceibacterium profundi	76.6995	149	939	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhodobacterales;f__Rhodobacteraceae;g__Puniceibacterium	95.0	N/A	N/A	N/A	N/A	1	-
GCF_003350345.1	s__Rhodobaculum breve	76.6627	161	939	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhodobacterales;f__Rhodobacteraceae;g__Rhodobaculum	95.0	N/A	N/A	N/A	N/A	1	-
GCF_009908165.1	s__Frigidibacter albus	76.6607	168	939	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhodobacterales;f__Rhodobacteraceae;g__Frigidibacter	95.0	100.00	100.00	1.00	0.99	4	-
GCF_002504165.1	s__Pacificitalea manganoxidans	76.6437	142	939	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhodobacterales;f__Rhodobacteraceae;g__Pacificitalea	95.0	98.03	97.97	0.89	0.82	4	-
GCA_011048875.1	s__Rhodovulum sp011048875	76.6085	148	939	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhodobacterales;f__Rhodobacteraceae;g__Rhodovulum	95.0	N/A	N/A	N/A	N/A	1	-
GCF_009649175.1	s__Rhodovulum strictum	76.6024	207	939	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhodobacterales;f__Rhodobacteraceae;g__Rhodovulum	95.0	N/A	N/A	N/A	N/A	1	-
GCF_004339675.1	s__Rhodovulum steppense	76.575	189	939	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhodobacterales;f__Rhodobacteraceae;g__Rhodovulum	95.0	N/A	N/A	N/A	N/A	1	-
GCF_009833495.1	s__Oceanomicrobium pacificus	76.5749	155	939	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhodobacterales;f__Rhodobacteraceae;g__Oceanomicrobium	95.0	N/A	N/A	N/A	N/A	1	-
GCF_003993775.1	s__Frigidibacter sp003993775	76.51	182	939	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhodobacterales;f__Rhodobacteraceae;g__Frigidibacter	95.0	N/A	N/A	N/A	N/A	1	-
GCF_001078595.1	s__Rhodobacter_B lobularis	76.3417	166	939	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhodobacterales;f__Rhodobacteraceae;g__Rhodobacter_B	95.0	N/A	N/A	N/A	N/A	1	-
GCA_002430125.1	s__UBA5972 sp002430125	76.337	146	939	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhodobacterales;f__Rhodobacteraceae;g__UBA5972	95.0	99.40	99.40	0.79	0.79	2	-
GCF_001719615.1	s__Amylibacter sediminis	76.2657	127	939	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhodobacterales;f__Rhodobacteraceae;g__Amylibacter	95.0	N/A	N/A	N/A	N/A	1	-
GCA_001314805.1	s__Roseicyclus sp001314805	76.0916	136	939	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhodobacterales;f__Rhodobacteraceae;g__Roseicyclus	95.0	N/A	N/A	N/A	N/A	1	-
GCA_007128865.1	s__Rhodobaculum sp007128865	75.9751	88	939	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhodobacterales;f__Rhodobacteraceae;g__Rhodobaculum	95.0	97.54	97.54	0.66	0.66	2	-
--------------------------------------------------------------------------------
[2023-06-29 03:48:30,388] [INFO] GTDB search result was written to GCA_027013095.1_ASM2701309v1_genomic.fna/result_gtdb.tsv
[2023-06-29 03:48:30,388] [INFO] ===== GTDB Search completed =====
[2023-06-29 03:48:30,396] [INFO] DFAST_QC result json was written to GCA_027013095.1_ASM2701309v1_genomic.fna/dqc_result.json
[2023-06-29 03:48:30,397] [INFO] DFAST_QC completed!
[2023-06-29 03:48:30,397] [INFO] Total running time: 0h1m25s