[2023-06-30 19:04:47,471] [INFO] DFAST_QC pipeline started.
[2023-06-30 19:04:47,475] [INFO] DFAST_QC version: 0.5.7
[2023-06-30 19:04:47,476] [INFO] DQC Reference Directory: /var/lib/cwl/stg628f9293-ae32-48fa-afa0-05f0952b0dbb/dqc_reference
[2023-06-30 19:04:49,114] [INFO] ===== Start taxonomy check using ANI =====
[2023-06-30 19:04:49,115] [INFO] Task started: Prodigal
[2023-06-30 19:04:49,116] [INFO] Running command: gunzip -c /var/lib/cwl/stg4a6a26cf-4de8-40e9-8152-717d7e38f690/GCA_025993695.1_ASM2599369v1_genomic.fna.gz | prodigal -d GCA_025993695.1_ASM2599369v1_genomic.fna/cds.fna -a GCA_025993695.1_ASM2599369v1_genomic.fna/protein.faa -g 11 -q > /dev/null
[2023-06-30 19:04:56,087] [INFO] Task succeeded: Prodigal
[2023-06-30 19:04:56,088] [INFO] Task started: HMMsearch
[2023-06-30 19:04:56,088] [INFO] Running command: hmmsearch --tblout GCA_025993695.1_ASM2599369v1_genomic.fna/hmmer_result.tsv -E 1E-50 /var/lib/cwl/stg628f9293-ae32-48fa-afa0-05f0952b0dbb/dqc_reference/reference_markers.hmm GCA_025993695.1_ASM2599369v1_genomic.fna/protein.faa > /dev/null
[2023-06-30 19:04:56,344] [INFO] Task succeeded: HMMsearch
[2023-06-30 19:04:56,345] [WARNING] Found 5/6 markers. [/var/lib/cwl/stg4a6a26cf-4de8-40e9-8152-717d7e38f690/GCA_025993695.1_ASM2599369v1_genomic.fna.gz]
[2023-06-30 19:04:56,409] [INFO] Query marker FASTA was written to GCA_025993695.1_ASM2599369v1_genomic.fna/markers.fasta
[2023-06-30 19:04:56,410] [INFO] Task started: Blastn
[2023-06-30 19:04:56,410] [INFO] Running command: blastn -query GCA_025993695.1_ASM2599369v1_genomic.fna/markers.fasta -db /var/lib/cwl/stg628f9293-ae32-48fa-afa0-05f0952b0dbb/dqc_reference/reference_markers.fasta -out GCA_025993695.1_ASM2599369v1_genomic.fna/blast.markers.tsv -outfmt 6 -max_hsps 1 -num_alignments 5
[2023-06-30 19:04:57,216] [INFO] Task succeeded: Blastn
[2023-06-30 19:04:57,220] [INFO] Selected 29 target genomes.
[2023-06-30 19:04:57,221] [INFO] Target genome list was writen to GCA_025993695.1_ASM2599369v1_genomic.fna/target_genomes.txt
[2023-06-30 19:04:57,224] [INFO] Task started: fastANI
[2023-06-30 19:04:57,224] [INFO] Running command: fastANI --query /var/lib/cwl/stg4a6a26cf-4de8-40e9-8152-717d7e38f690/GCA_025993695.1_ASM2599369v1_genomic.fna.gz --refList GCA_025993695.1_ASM2599369v1_genomic.fna/target_genomes.txt --output GCA_025993695.1_ASM2599369v1_genomic.fna/fastani_result.tsv --threads 1
[2023-06-30 19:05:21,207] [INFO] Task succeeded: fastANI
[2023-06-30 19:05:21,208] [INFO] Loading species specific ANI threshold from /var/lib/cwl/stg628f9293-ae32-48fa-afa0-05f0952b0dbb/dqc_reference/prokaryote_ANI_species_specific_threshold.txt
[2023-06-30 19:05:21,208] [WARNING] Species-specific ANI threshold file not found. Will use the default threshold for all species. [/var/lib/cwl/stg628f9293-ae32-48fa-afa0-05f0952b0dbb/dqc_reference/prokaryote_ANI_species_specific_threshold.txt]
[2023-06-30 19:05:21,233] [INFO] Found 29 fastANI hits (0 hits with ANI > threshold)
[2023-06-30 19:05:21,233] [INFO] The taxonomy check result is classified as 'below_threshold'.
[2023-06-30 19:05:21,234] [INFO] DFAST Taxonomy check final result
--------------------------------------------------------------------------------
organism_name	strain	accession	taxid	species_taxid	relation_to_type	validated	ani	matched_fragments	total_fragments	ani_threshold	status
Cupriavidus gilardii	strain=CCUG 38401	GCA_008801915.1	82541	82541	type	True	77.0695	216	636	95	below_threshold
Cupriavidus gilardii	strain=ATCC 700815	GCA_013004615.1	82541	82541	type	True	77.0645	215	636	95	below_threshold
Thauera chlorobenzoica	strain=3CB1	GCA_001922305.1	96773	96773	type	True	77.0581	150	636	95	below_threshold
Rubrivivax benzoatilyticus	strain=JA2	GCA_000190375.2	316997	316997	type	True	77.0387	234	636	95	below_threshold
Thauera aromatica	strain=K172	GCA_003030465.1	59405	59405	type	True	77.0086	145	636	95	below_threshold
Cupriavidus cauae	strain=MKL-01	GCA_008632125.1	2608999	2608999	type	True	76.9961	230	636	95	below_threshold
Thauera chlorobenzoica	strain=3CB-1	GCA_900108255.1	96773	96773	type	True	76.993	148	636	95	below_threshold
Sphaerotilus sulfidivorans	strain=D-501	GCA_013426975.1	639200	639200	type	True	76.8655	193	636	95	below_threshold
Ralstonia pseudosolanacearum	strain=LMG 9673	GCA_919586305.1	1310165	1310165	type	True	76.8428	181	636	95	below_threshold
Rubrivivax gelatinosus	strain=DSM 1709	GCA_016583525.1	28068	28068	suspected-type	True	76.77	240	636	95	below_threshold
Rubrivivax gelatinosus	strain=DSM 1709	GCA_004340905.1	28068	28068	suspected-type	True	76.7222	251	636	95	below_threshold
Achromobacter pulmonis	strain=LMG 26696	GCA_902859765.1	1389932	1389932	type	True	76.657	167	636	95	below_threshold
Cupriavidus respiraculi	strain=LMG 21510	GCA_914271545.1	195930	195930	type	True	76.6126	178	636	95	below_threshold
Sphaerotilus natans	strain=ATCC 13338	GCA_900156335.1	34103	34103	type	True	76.6075	213	636	95	below_threshold
Ralstonia insidiosa	strain=CCUG 46789	GCA_008801405.1	190721	190721	type	True	76.6035	119	636	95	below_threshold
Achromobacter insuavis	strain=LMG 26845	GCA_902859645.1	1287735	1287735	type	True	76.5514	208	636	95	below_threshold
Cupriavidus malaysiensis	strain=USMAA1020	GCA_001854325.1	367825	367825	type	True	76.5341	227	636	95	below_threshold
Sphaerotilus natans subsp. natans	strain=DSM 6575	GCA_000689195.1	882627	34103	type	True	76.5279	206	636	95	below_threshold
Burkholderia glumae	strain=LMG 2196	GCA_902832765.1	337	337	type	True	76.5232	238	636	95	below_threshold
Achromobacter denitrificans	strain=NBRC 15125	GCA_001571365.1	32002	32002	type	True	76.4911	171	636	95	below_threshold
Derxia lacustris	strain=HL-12	GCA_002105195.1	764842	764842	type	True	76.4909	205	636	95	below_threshold
Achromobacter veterisilvae	strain=LMG 30378	GCA_900496975.1	2069367	2069367	type	True	76.4834	179	636	95	below_threshold
Pseudoduganella lurida	strain=CGMCC 1.10822	GCA_007830455.1	1036180	1036180	type	True	76.4236	154	636	95	below_threshold
Pseudoduganella albidiflava	strain=DSM 17472	GCA_004322755.1	321983	321983	type	True	76.3649	166	636	95	below_threshold
Paraburkholderia phosphatilytica	strain=7QSK02	GCA_003443895.1	2282883	2282883	type	True	76.3437	175	636	95	below_threshold
Burkholderia paludis	strain=MSh1	GCA_000732615.1	1506587	1506587	type	True	76.3341	232	636	95	below_threshold
Bordetella pseudohinzii	strain=8-296-03	GCA_000657795.2	1331258	1331258	type	True	76.3284	137	636	95	below_threshold
Pseudoduganella armeniaca	strain=ZMN-3	GCA_003028855.1	2072590	2072590	type	True	76.318	167	636	95	below_threshold
Rugamonas fusca	strain=FT3S	GCA_014042365.1	2758568	2758568	type	True	76.3159	138	636	95	below_threshold
--------------------------------------------------------------------------------
[2023-06-30 19:05:21,236] [INFO] DFAST Taxonomy check result was written to GCA_025993695.1_ASM2599369v1_genomic.fna/tc_result.tsv
[2023-06-30 19:05:21,236] [INFO] ===== Taxonomy check completed =====
[2023-06-30 19:05:21,236] [INFO] ===== Start completeness check using CheckM =====
[2023-06-30 19:05:21,237] [INFO] Setting CHECKM_DATA_PATH to /var/lib/cwl/stg628f9293-ae32-48fa-afa0-05f0952b0dbb/dqc_reference/checkm_data
[2023-06-30 19:05:21,238] [INFO] Selected 'Prokaryote' markers (life, taxid=0) for CheckM
[2023-06-30 19:05:21,269] [INFO] Task started: CheckM
[2023-06-30 19:05:21,269] [INFO] Running command: checkm taxonomy_wf --tab_table -f GCA_025993695.1_ASM2599369v1_genomic.fna/cc_result.tsv -t 1 life "Prokaryote" GCA_025993695.1_ASM2599369v1_genomic.fna/checkm_input GCA_025993695.1_ASM2599369v1_genomic.fna/checkm_result
[2023-06-30 19:05:47,196] [INFO] Task succeeded: CheckM
[2023-06-30 19:05:47,197] [INFO] Completeness check finished.
--------------------------------------------------------------------------------
Completeness: 45.83%
Contamintation: 2.08%
Strain heterogeneity: 100.00%
--------------------------------------------------------------------------------
[2023-06-30 19:05:47,220] [INFO] ===== Completeness check finished =====
[2023-06-30 19:05:47,221] [INFO] ===== Start GTDB Search =====
[2023-06-30 19:05:47,221] [INFO] Query marker FASTA already exists. Will reuse it. (GCA_025993695.1_ASM2599369v1_genomic.fna/markers.fasta)
[2023-06-30 19:05:47,222] [INFO] Task started: Blastn
[2023-06-30 19:05:47,222] [INFO] Running command: blastn -query GCA_025993695.1_ASM2599369v1_genomic.fna/markers.fasta -db /var/lib/cwl/stg628f9293-ae32-48fa-afa0-05f0952b0dbb/dqc_reference/reference_markers_gtdb.fasta -out GCA_025993695.1_ASM2599369v1_genomic.fna/blast.markers.gtdb.tsv -outfmt 6 -max_hsps 1 -num_alignments 5
[2023-06-30 19:05:48,548] [INFO] Task succeeded: Blastn
[2023-06-30 19:05:48,553] [INFO] Selected 9 target genomes.
[2023-06-30 19:05:48,554] [INFO] Target genome list was writen to GCA_025993695.1_ASM2599369v1_genomic.fna/target_genomes_gtdb.txt
[2023-06-30 19:05:48,556] [INFO] Task started: fastANI
[2023-06-30 19:05:48,557] [INFO] Running command: fastANI --query /var/lib/cwl/stg4a6a26cf-4de8-40e9-8152-717d7e38f690/GCA_025993695.1_ASM2599369v1_genomic.fna.gz --refList GCA_025993695.1_ASM2599369v1_genomic.fna/target_genomes_gtdb.txt --output GCA_025993695.1_ASM2599369v1_genomic.fna/fastani_result_gtdb.tsv --threads 1
[2023-06-30 19:05:55,328] [INFO] Task succeeded: fastANI
[2023-06-30 19:05:55,341] [INFO] Found 9 fastANI hits (1 hits with ANI > circumscription radius)
[2023-06-30 19:05:55,341] [INFO] GTDB search result
--------------------------------------------------------------------------------
accession	gtdb_species	ani	matched_fragments	total_fragments	gtdb_taxonomy	ani_circumscription_radius	mean_intra_species_ani	min_intra_species_ani	mean_intra_species_af	min_intra_species_af	num_clustered_genomes	status
GCA_001724855.1	s__SCN-69-89 sp001724855	98.3228	552	636	d__Bacteria;p__Proteobacteria;c__Gammaproteobacteria;o__Burkholderiales;f__Burkholderiaceae;g__SCN-69-89	95.0	N/A	N/A	N/A	N/A	1	conclusive
GCA_003577305.1	s__SCN-69-89 sp003577305	87.9343	438	636	d__Bacteria;p__Proteobacteria;c__Gammaproteobacteria;o__Burkholderiales;f__Burkholderiaceae;g__SCN-69-89	95.0	99.77	99.54	0.93	0.87	4	-
GCA_001898645.1	s__SCN-69-89 sp001898645	82.3708	435	636	d__Bacteria;p__Proteobacteria;c__Gammaproteobacteria;o__Burkholderiales;f__Burkholderiaceae;g__SCN-69-89	95.0	99.96	99.96	0.96	0.96	2	-
GCA_003576795.1	s__SCN-69-89 sp003576795	80.7988	310	636	d__Bacteria;p__Proteobacteria;c__Gammaproteobacteria;o__Burkholderiales;f__Burkholderiaceae;g__SCN-69-89	95.0	N/A	N/A	N/A	N/A	1	-
GCA_001724315.1	s__SCN-69-89 sp001724315	80.7607	254	636	d__Bacteria;p__Proteobacteria;c__Gammaproteobacteria;o__Burkholderiales;f__Burkholderiaceae;g__SCN-69-89	95.0	N/A	N/A	N/A	N/A	1	-
GCF_008039575.1	s__SCN-69-89 sp008039575	80.2938	349	636	d__Bacteria;p__Proteobacteria;c__Gammaproteobacteria;o__Burkholderiales;f__Burkholderiaceae;g__SCN-69-89	95.0	N/A	N/A	N/A	N/A	1	-
GCA_012513825.1	s__SCN-69-89 sp012513825	79.3986	296	636	d__Bacteria;p__Proteobacteria;c__Gammaproteobacteria;o__Burkholderiales;f__Burkholderiaceae;g__SCN-69-89	95.0	N/A	N/A	N/A	N/A	1	-
GCA_017307915.1	s__SCN-69-89 sp017307915	79.358	298	636	d__Bacteria;p__Proteobacteria;c__Gammaproteobacteria;o__Burkholderiales;f__Burkholderiaceae;g__SCN-69-89	95.0	N/A	N/A	N/A	N/A	1	-
GCF_014202215.1	s__Quisquiliibacterium transsilvanicum	79.0972	289	636	d__Bacteria;p__Proteobacteria;c__Gammaproteobacteria;o__Burkholderiales;f__Burkholderiaceae;g__Quisquiliibacterium	95.0	N/A	N/A	N/A	N/A	1	-
--------------------------------------------------------------------------------
[2023-06-30 19:05:55,343] [INFO] GTDB search result was written to GCA_025993695.1_ASM2599369v1_genomic.fna/result_gtdb.tsv
[2023-06-30 19:05:55,344] [INFO] ===== GTDB Search completed =====
[2023-06-30 19:05:55,349] [INFO] DFAST_QC result json was written to GCA_025993695.1_ASM2599369v1_genomic.fna/dqc_result.json
[2023-06-30 19:05:55,350] [INFO] DFAST_QC completed!
[2023-06-30 19:05:55,350] [INFO] Total running time: 0h1m8s