[2023-06-27 14:43:38,111] [INFO] DFAST_QC pipeline started.
[2023-06-27 14:43:38,114] [INFO] DFAST_QC version: 0.5.7
[2023-06-27 14:43:38,114] [INFO] DQC Reference Directory: /var/lib/cwl/stg3780c56a-06a8-4dcb-95e4-ecbfd6175789/dqc_reference
[2023-06-27 14:43:39,424] [INFO] ===== Start taxonomy check using ANI =====
[2023-06-27 14:43:39,425] [INFO] Task started: Prodigal
[2023-06-27 14:43:39,425] [INFO] Running command: gunzip -c /var/lib/cwl/stgb961330f-ed4b-4eb9-ac63-9748b392b674/GCA_026710005.1_ASM2671000v1_genomic.fna.gz | prodigal -d GCA_026710005.1_ASM2671000v1_genomic.fna/cds.fna -a GCA_026710005.1_ASM2671000v1_genomic.fna/protein.faa -g 11 -q > /dev/null
[2023-06-27 14:43:46,316] [INFO] Task succeeded: Prodigal
[2023-06-27 14:43:46,316] [INFO] Task started: HMMsearch
[2023-06-27 14:43:46,316] [INFO] Running command: hmmsearch --tblout GCA_026710005.1_ASM2671000v1_genomic.fna/hmmer_result.tsv -E 1E-50 /var/lib/cwl/stg3780c56a-06a8-4dcb-95e4-ecbfd6175789/dqc_reference/reference_markers.hmm GCA_026710005.1_ASM2671000v1_genomic.fna/protein.faa > /dev/null
[2023-06-27 14:43:46,553] [INFO] Task succeeded: HMMsearch
[2023-06-27 14:43:46,555] [INFO] Found 6/6 markers.
[2023-06-27 14:43:46,580] [INFO] Query marker FASTA was written to GCA_026710005.1_ASM2671000v1_genomic.fna/markers.fasta
[2023-06-27 14:43:46,581] [INFO] Task started: Blastn
[2023-06-27 14:43:46,581] [INFO] Running command: blastn -query GCA_026710005.1_ASM2671000v1_genomic.fna/markers.fasta -db /var/lib/cwl/stg3780c56a-06a8-4dcb-95e4-ecbfd6175789/dqc_reference/reference_markers.fasta -out GCA_026710005.1_ASM2671000v1_genomic.fna/blast.markers.tsv -outfmt 6 -max_hsps 1 -num_alignments 5
[2023-06-27 14:43:47,305] [INFO] Task succeeded: Blastn
[2023-06-27 14:43:47,310] [INFO] Selected 32 target genomes.
[2023-06-27 14:43:47,311] [INFO] Target genome list was writen to GCA_026710005.1_ASM2671000v1_genomic.fna/target_genomes.txt
[2023-06-27 14:43:47,330] [INFO] Task started: fastANI
[2023-06-27 14:43:47,330] [INFO] Running command: fastANI --query /var/lib/cwl/stgb961330f-ed4b-4eb9-ac63-9748b392b674/GCA_026710005.1_ASM2671000v1_genomic.fna.gz --refList GCA_026710005.1_ASM2671000v1_genomic.fna/target_genomes.txt --output GCA_026710005.1_ASM2671000v1_genomic.fna/fastani_result.tsv --threads 1
[2023-06-27 14:44:06,370] [INFO] Task succeeded: fastANI
[2023-06-27 14:44:06,371] [INFO] Loading species specific ANI threshold from /var/lib/cwl/stg3780c56a-06a8-4dcb-95e4-ecbfd6175789/dqc_reference/prokaryote_ANI_species_specific_threshold.txt
[2023-06-27 14:44:06,372] [WARNING] Species-specific ANI threshold file not found. Will use the default threshold for all species. [/var/lib/cwl/stg3780c56a-06a8-4dcb-95e4-ecbfd6175789/dqc_reference/prokaryote_ANI_species_specific_threshold.txt]
[2023-06-27 14:44:06,394] [INFO] Found 24 fastANI hits (0 hits with ANI > threshold)
[2023-06-27 14:44:06,395] [INFO] The taxonomy check result is classified as 'below_threshold'.
[2023-06-27 14:44:06,395] [INFO] DFAST Taxonomy check final result
--------------------------------------------------------------------------------
organism_name	strain	accession	taxid	species_taxid	relation_to_type	validated	ani	matched_fragments	total_fragments	ani_threshold	status
Halomonas lactosivorans	strain=KCTC 52281	GCA_003254665.1	2185141	2185141	type	True	76.6547	83	711	95	below_threshold
Halomonas denitrificans	strain=DSM 18045	GCA_003056305.1	370769	370769	type	True	76.6295	89	711	95	below_threshold
Pseudomonas thermotolerans	strain=DSM 14292	GCA_000364625.1	157784	157784	type	True	76.4445	78	711	95	below_threshold
Pseudomonas mangiferae	strain=DMKU BBB3-04	GCA_007109405.1	2593654	2593654	type	True	76.3815	91	711	95	below_threshold
Pseudomonas lalucatii	strain=R1b54	GCA_018398425.1	1424203	1424203	type	True	76.3785	104	711	95	below_threshold
Pseudomonas aromaticivorans	strain=MAP12	GCA_019097855.1	2849492	2849492	type	True	76.375	89	711	95	below_threshold
Halomonas sulfidoxydans	strain=MCCC 1A11059	GCA_017868775.1	2733484	2733484	type	True	76.2438	78	711	95	below_threshold
Azotobacter beijerinckii	strain=DSM 378	GCA_900110885.1	170623	170623	type	True	76.201	87	711	95	below_threshold
Pseudomonas oryzae	strain=KCTC 32247	GCA_900104805.1	1392877	1392877	type	True	76.1785	117	711	95	below_threshold
Pseudomonas cavernae	strain=K2W31S-8	GCA_003595175.1	2320867	2320867	type	True	76.1297	84	711	95	below_threshold
Pseudohaliea rubra	strain=DSM 19751	GCA_000764025.1	475795	475795	type	True	76.1059	62	711	95	below_threshold
Halomonas ethanolica	strain=MCCC 1A11081	GCA_021404305.1	2733486	2733486	type	True	76.0897	74	711	95	below_threshold
Halorhodospira neutriphila	strain=DSM 15116	GCA_016584055.1	168379	168379	type	True	76.0474	69	711	95	below_threshold
[Pseudomonas] nosocomialis	strain=A31/70	GCA_005876855.1	1056496	1056496	type	True	75.993	82	711	95	below_threshold
Marichromatium gracile	strain=DSM 203	GCA_004343155.1	1048	1048	type	True	75.9268	92	711	95	below_threshold
Marichromatium purpuratum	strain=984	GCA_000224005.3	37487	37487	type	True	75.8323	82	711	95	below_threshold
Pseudomonas insulae	strain=UL073	GCA_016901015.1	2809017	2809017	type	True	75.8248	78	711	95	below_threshold
Pseudoxanthomonas jiangsuensis	strain=DSM 22398	GCA_010093185.1	619688	619688	type	True	75.6949	71	711	95	below_threshold
Arenimonas fontis	strain=3729k	GCA_008386465.1	2608255	2608255	type	True	75.6534	55	711	95	below_threshold
Pseudoxanthomonas broegbernensis	strain=DSM 12573	GCA_014202435.1	83619	83619	type	True	75.6209	67	711	95	below_threshold
Luteimonas padinae	strain=KCTC 52403	GCA_014652935.1	1714359	1714359	type	True	75.5729	74	711	95	below_threshold
Luteimonas colneyensis	strain=Sa2BVA3	GCA_014836665.1	2762230	2762230	type	True	75.4567	73	711	95	below_threshold
Ralstonia pseudosolanacearum	strain=LMG 9673	GCA_024925465.1	1310165	1310165	type	True	75.3452	62	711	95	below_threshold
Ralstonia pseudosolanacearum	strain=LMG 9673	GCA_919586305.1	1310165	1310165	type	True	75.3452	62	711	95	below_threshold
--------------------------------------------------------------------------------
[2023-06-27 14:44:06,397] [INFO] DFAST Taxonomy check result was written to GCA_026710005.1_ASM2671000v1_genomic.fna/tc_result.tsv
[2023-06-27 14:44:06,398] [INFO] ===== Taxonomy check completed =====
[2023-06-27 14:44:06,398] [INFO] ===== Start completeness check using CheckM =====
[2023-06-27 14:44:06,398] [INFO] Setting CHECKM_DATA_PATH to /var/lib/cwl/stg3780c56a-06a8-4dcb-95e4-ecbfd6175789/dqc_reference/checkm_data
[2023-06-27 14:44:06,399] [INFO] Selected 'Prokaryote' markers (life, taxid=0) for CheckM
[2023-06-27 14:44:06,435] [INFO] Task started: CheckM
[2023-06-27 14:44:06,436] [INFO] Running command: checkm taxonomy_wf --tab_table -f GCA_026710005.1_ASM2671000v1_genomic.fna/cc_result.tsv -t 1 life "Prokaryote" GCA_026710005.1_ASM2671000v1_genomic.fna/checkm_input GCA_026710005.1_ASM2671000v1_genomic.fna/checkm_result
[2023-06-27 14:44:31,599] [INFO] Task succeeded: CheckM
[2023-06-27 14:44:31,600] [INFO] Completeness check finished.
--------------------------------------------------------------------------------
Completeness: 95.83%
Contamintation: 0.00%
Strain heterogeneity: 0.00%
--------------------------------------------------------------------------------
[2023-06-27 14:44:31,626] [INFO] ===== Completeness check finished =====
[2023-06-27 14:44:31,627] [INFO] ===== Start GTDB Search =====
[2023-06-27 14:44:31,627] [INFO] Query marker FASTA already exists. Will reuse it. (GCA_026710005.1_ASM2671000v1_genomic.fna/markers.fasta)
[2023-06-27 14:44:31,627] [INFO] Task started: Blastn
[2023-06-27 14:44:31,628] [INFO] Running command: blastn -query GCA_026710005.1_ASM2671000v1_genomic.fna/markers.fasta -db /var/lib/cwl/stg3780c56a-06a8-4dcb-95e4-ecbfd6175789/dqc_reference/reference_markers_gtdb.fasta -out GCA_026710005.1_ASM2671000v1_genomic.fna/blast.markers.gtdb.tsv -outfmt 6 -max_hsps 1 -num_alignments 5
[2023-06-27 14:44:32,707] [INFO] Task succeeded: Blastn
[2023-06-27 14:44:32,714] [INFO] Selected 27 target genomes.
[2023-06-27 14:44:32,714] [INFO] Target genome list was writen to GCA_026710005.1_ASM2671000v1_genomic.fna/target_genomes_gtdb.txt
[2023-06-27 14:44:32,756] [INFO] Task started: fastANI
[2023-06-27 14:44:32,756] [INFO] Running command: fastANI --query /var/lib/cwl/stgb961330f-ed4b-4eb9-ac63-9748b392b674/GCA_026710005.1_ASM2671000v1_genomic.fna.gz --refList GCA_026710005.1_ASM2671000v1_genomic.fna/target_genomes_gtdb.txt --output GCA_026710005.1_ASM2671000v1_genomic.fna/fastani_result_gtdb.tsv --threads 1
[2023-06-27 14:44:47,717] [INFO] Task succeeded: fastANI
[2023-06-27 14:44:47,739] [INFO] Found 22 fastANI hits (0 hits with ANI > circumscription radius)
[2023-06-27 14:44:47,739] [INFO] GTDB search result
--------------------------------------------------------------------------------
accession	gtdb_species	ani	matched_fragments	total_fragments	gtdb_taxonomy	ani_circumscription_radius	mean_intra_species_ani	min_intra_species_ani	mean_intra_species_af	min_intra_species_af	num_clustered_genomes	status
GCA_009844085.1	s__UBA9145 sp009844085	80.2926	445	711	d__Bacteria;p__Proteobacteria;c__Gammaproteobacteria;o__Pseudomonadales;f__Pseudohongiellaceae;g__UBA9145	95.0	98.17	97.62	0.91	0.85	13	-
GCA_007571115.1	s__UBA9145 sp007571115	78.8276	234	711	d__Bacteria;p__Proteobacteria;c__Gammaproteobacteria;o__Pseudomonadales;f__Pseudohongiellaceae;g__UBA9145	95.0	98.87	98.87	0.74	0.74	2	-
GCF_900105885.1	s__Pseudomonas_K guangdongensis	76.582	99	711	d__Bacteria;p__Proteobacteria;c__Gammaproteobacteria;o__Pseudomonadales;f__Pseudomonadaceae;g__Pseudomonas_K	95.0	98.99	98.99	0.99	0.99	2	-
GCF_900115715.1	s__Pseudomonas_K sagittaria	76.4949	98	711	d__Bacteria;p__Proteobacteria;c__Gammaproteobacteria;o__Pseudomonadales;f__Pseudomonadaceae;g__Pseudomonas_K	95.0	N/A	N/A	N/A	N/A	1	-
GCF_003205495.1	s__Pseudomonas_E alcaligenes_B	76.3673	96	711	d__Bacteria;p__Proteobacteria;c__Gammaproteobacteria;o__Pseudomonadales;f__Pseudomonadaceae;g__Pseudomonas_E	95.0	N/A	N/A	N/A	N/A	1	-
GCF_015992245.1	s__Halomonas sp015992245	76.3565	76	711	d__Bacteria;p__Proteobacteria;c__Gammaproteobacteria;o__Pseudomonadales;f__Halomonadaceae;g__Halomonas	95.0	99.35	99.35	0.94	0.94	2	-
GCF_003202205.1	s__Halomonas sp003202205	76.3215	89	711	d__Bacteria;p__Proteobacteria;c__Gammaproteobacteria;o__Pseudomonadales;f__Halomonadaceae;g__Halomonas	95.0	N/A	N/A	N/A	N/A	1	-
GCF_900109175.1	s__Pseudomonas_K linyingensis	76.3076	105	711	d__Bacteria;p__Proteobacteria;c__Gammaproteobacteria;o__Pseudomonadales;f__Pseudomonadaceae;g__Pseudomonas_K	95.0	N/A	N/A	N/A	N/A	1	-
GCA_016720425.1	s__CAIWHR01 sp016720425	76.2127	71	711	d__Bacteria;p__Proteobacteria;c__Gammaproteobacteria;o__Burkholderiales;f__Casimicrobiaceae;g__CAIWHR01	95.0	98.60	97.38	0.93	0.90	10	-
GCA_004551485.1	s__Halomonas azerbaijanica	76.1814	91	711	d__Bacteria;p__Proteobacteria;c__Gammaproteobacteria;o__Pseudomonadales;f__Halomonadaceae;g__Halomonas	95.0	N/A	N/A	N/A	N/A	1	-
GCA_015494165.1	s__S144-34 sp015494165	76.1396	73	711	d__Bacteria;p__Proteobacteria;c__Gammaproteobacteria;o__S144-34;f__S144-34;g__S144-34	95.0	N/A	N/A	N/A	N/A	1	-
GCF_000764025.1	s__Pseudohaliea rubra	76.0794	63	711	d__Bacteria;p__Proteobacteria;c__Gammaproteobacteria;o__Pseudomonadales;f__Halieaceae;g__Pseudohaliea	95.0	N/A	N/A	N/A	N/A	1	-
GCF_000380335.1	s__Azotobacter vinelandii	75.9731	93	711	d__Bacteria;p__Proteobacteria;c__Gammaproteobacteria;o__Pseudomonadales;f__Pseudomonadaceae;g__Azotobacter	95.0	99.84	99.43	0.98	0.95	5	-
GCF_000510725.1	s__Pseudoxanthomonas sp000510725	75.907	72	711	d__Bacteria;p__Proteobacteria;c__Gammaproteobacteria;o__Xanthomonadales;f__Xanthomonadaceae;g__Pseudoxanthomonas	95.0	99.33	99.13	0.95	0.92	5	-
GCA_018240295.1	s__Plasticicumulans sp003962905	75.7878	90	711	d__Bacteria;p__Proteobacteria;c__Gammaproteobacteria;o__Competibacterales;f__Competibacteraceae;g__Plasticicumulans	95.0	98.50	98.50	0.84	0.84	2	-
GCF_900112865.1	s__Dyella marensis	75.7661	68	711	d__Bacteria;p__Proteobacteria;c__Gammaproteobacteria;o__Xanthomonadales;f__Rhodanobacteraceae;g__Dyella	95.0	99.99	99.99	1.00	1.00	2	-
GCA_011389995.1	s__JAABTG01 sp011389995	75.6826	52	711	d__Bacteria;p__Proteobacteria;c__Gammaproteobacteria;o__JAABTG01;f__JAABTG01;g__JAABTG01	95.0	N/A	N/A	N/A	N/A	1	-
GCF_010093185.1	s__Pseudoxanthomonas jiangsuensis	75.6597	73	711	d__Bacteria;p__Proteobacteria;c__Gammaproteobacteria;o__Xanthomonadales;f__Xanthomonadaceae;g__Pseudoxanthomonas	95.0	N/A	N/A	N/A	N/A	1	-
GCA_015709695.1	s__QUBU01 sp015709695	75.6474	54	711	d__Bacteria;p__Proteobacteria;c__Gammaproteobacteria;o__GCA-2729495;f__GCA-2729495;g__QUBU01	95.0	N/A	N/A	N/A	N/A	1	-
GCF_014652935.1	s__Luteimonas padinae	75.5729	74	711	d__Bacteria;p__Proteobacteria;c__Gammaproteobacteria;o__Xanthomonadales;f__Xanthomonadaceae;g__Luteimonas	95.0	N/A	N/A	N/A	N/A	1	-
GCF_014836665.1	s__Luteimonas sp014836665	75.4567	73	711	d__Bacteria;p__Proteobacteria;c__Gammaproteobacteria;o__Xanthomonadales;f__Xanthomonadaceae;g__Luteimonas	95.0	N/A	N/A	N/A	N/A	1	-
GCA_007694905.1	s__SLTB01 sp007694905	75.3259	58	711	d__Bacteria;p__Proteobacteria;c__Gammaproteobacteria;o__Pseudomonadales;f__HTCC2089;g__SLTB01	95.0	N/A	N/A	N/A	N/A	1	-
--------------------------------------------------------------------------------
[2023-06-27 14:44:47,741] [INFO] GTDB search result was written to GCA_026710005.1_ASM2671000v1_genomic.fna/result_gtdb.tsv
[2023-06-27 14:44:47,742] [INFO] ===== GTDB Search completed =====
[2023-06-27 14:44:47,748] [INFO] DFAST_QC result json was written to GCA_026710005.1_ASM2671000v1_genomic.fna/dqc_result.json
[2023-06-27 14:44:47,749] [INFO] DFAST_QC completed!
[2023-06-27 14:44:47,749] [INFO] Total running time: 0h1m10s