[2023-06-27 01:35:30,781] [INFO] DFAST_QC pipeline started.
[2023-06-27 01:35:30,784] [INFO] DFAST_QC version: 0.5.7
[2023-06-27 01:35:30,784] [INFO] DQC Reference Directory: /var/lib/cwl/stgea62f973-8cd3-4b52-a298-db8943979c7b/dqc_reference
[2023-06-27 01:35:32,165] [INFO] ===== Start taxonomy check using ANI =====
[2023-06-27 01:35:32,166] [INFO] Task started: Prodigal
[2023-06-27 01:35:32,166] [INFO] Running command: gunzip -c /var/lib/cwl/stgb0e42cea-19f6-4d19-839b-5958cecf1a65/GCA_026708055.1_ASM2670805v1_genomic.fna.gz | prodigal -d GCA_026708055.1_ASM2670805v1_genomic.fna/cds.fna -a GCA_026708055.1_ASM2670805v1_genomic.fna/protein.faa -g 11 -q > /dev/null
[2023-06-27 01:35:44,125] [INFO] Task succeeded: Prodigal
[2023-06-27 01:35:44,126] [INFO] Task started: HMMsearch
[2023-06-27 01:35:44,126] [INFO] Running command: hmmsearch --tblout GCA_026708055.1_ASM2670805v1_genomic.fna/hmmer_result.tsv -E 1E-50 /var/lib/cwl/stgea62f973-8cd3-4b52-a298-db8943979c7b/dqc_reference/reference_markers.hmm GCA_026708055.1_ASM2670805v1_genomic.fna/protein.faa > /dev/null
[2023-06-27 01:35:44,505] [INFO] Task succeeded: HMMsearch
[2023-06-27 01:35:44,506] [INFO] Found 6/6 markers.
[2023-06-27 01:35:44,545] [INFO] Query marker FASTA was written to GCA_026708055.1_ASM2670805v1_genomic.fna/markers.fasta
[2023-06-27 01:35:44,545] [INFO] Task started: Blastn
[2023-06-27 01:35:44,546] [INFO] Running command: blastn -query GCA_026708055.1_ASM2670805v1_genomic.fna/markers.fasta -db /var/lib/cwl/stgea62f973-8cd3-4b52-a298-db8943979c7b/dqc_reference/reference_markers.fasta -out GCA_026708055.1_ASM2670805v1_genomic.fna/blast.markers.tsv -outfmt 6 -max_hsps 1 -num_alignments 5
[2023-06-27 01:35:45,329] [INFO] Task succeeded: Blastn
[2023-06-27 01:35:45,333] [INFO] Selected 28 target genomes.
[2023-06-27 01:35:45,333] [INFO] Target genome list was writen to GCA_026708055.1_ASM2670805v1_genomic.fna/target_genomes.txt
[2023-06-27 01:35:45,338] [INFO] Task started: fastANI
[2023-06-27 01:35:45,339] [INFO] Running command: fastANI --query /var/lib/cwl/stgb0e42cea-19f6-4d19-839b-5958cecf1a65/GCA_026708055.1_ASM2670805v1_genomic.fna.gz --refList GCA_026708055.1_ASM2670805v1_genomic.fna/target_genomes.txt --output GCA_026708055.1_ASM2670805v1_genomic.fna/fastani_result.tsv --threads 1
[2023-06-27 01:36:09,015] [INFO] Task succeeded: fastANI
[2023-06-27 01:36:09,016] [INFO] Loading species specific ANI threshold from /var/lib/cwl/stgea62f973-8cd3-4b52-a298-db8943979c7b/dqc_reference/prokaryote_ANI_species_specific_threshold.txt
[2023-06-27 01:36:09,017] [WARNING] Species-specific ANI threshold file not found. Will use the default threshold for all species. [/var/lib/cwl/stgea62f973-8cd3-4b52-a298-db8943979c7b/dqc_reference/prokaryote_ANI_species_specific_threshold.txt]
[2023-06-27 01:36:09,045] [INFO] Found 24 fastANI hits (0 hits with ANI > threshold)
[2023-06-27 01:36:09,045] [INFO] The taxonomy check result is classified as 'below_threshold'.
[2023-06-27 01:36:09,046] [INFO] DFAST Taxonomy check final result
--------------------------------------------------------------------------------
organism_name	strain	accession	taxid	species_taxid	relation_to_type	validated	ani	matched_fragments	total_fragments	ani_threshold	status
Rhabdothermincola salaria	strain=EGI L10124	GCA_021246445.1	2903142	2903142	type	True	76.4841	200	1264	95	below_threshold
Rhabdothermincola sediminis	strain=SYSU G02662	GCA_014805525.1	2751370	2751370	type	True	76.4471	162	1264	95	below_threshold
Actinomarinicola tropica	strain=SCSIO 58843	GCA_009650215.1	2789776	2789776	type	True	76.4415	199	1264	95	below_threshold
Desertimonas flava	strain=SYSU D60003	GCA_003426815.1	2064846	2064846	type	True	75.8011	199	1264	95	below_threshold
Ilumatobacter nonamiensis	strain=YM16-303	GCA_000350145.1	467093	467093	type	True	75.7302	85	1264	95	below_threshold
Ilumatobacter fluminis	strain=DSM 18936	GCA_004364865.1	467091	467091	type	True	75.6944	151	1264	95	below_threshold
Ilumatobacter coccineus	strain=YM16-304	GCA_000348785.1	467094	467094	type	True	75.6549	136	1264	95	below_threshold
Ornithinimicrobium sediminis	strain=EGI L100131	GCA_021272345.1	2904603	2904603	type	True	75.4282	85	1264	95	below_threshold
Streptomyces parmotrematis	strain=Ptm05	GCA_019890615.1	2873249	2873249	type	True	75.405	171	1264	95	below_threshold
Nocardioides pelophilus	strain=CGMCC 4.7388	GCA_014180685.1	2172019	2172019	type	True	75.3415	107	1264	95	below_threshold
Micrococcus flavus	strain=DSM 19079	GCA_022348285.1	384602	384602	type	True	75.2619	89	1264	95	below_threshold
Euzebya pacifica	strain=DY32-46	GCA_003344865.1	1608957	1608957	type	True	75.2479	156	1264	95	below_threshold
Thermomonospora echinospora	strain=DSM 43163	GCA_900108175.1	1992	1992	type	True	75.1419	211	1264	95	below_threshold
Pseudonocardia endophytica	strain=DSM 44969	GCA_004339565.1	401976	401976	type	True	75.0837	184	1264	95	below_threshold
Microbacterium oryzae	strain=MB-10	GCA_009735645.1	743009	743009	type	True	75.0531	80	1264	95	below_threshold
Actinomycetospora chiangmaiensis	strain=DSM 45062	GCA_000379625.1	402650	402650	type	True	75.0173	154	1264	95	below_threshold
Actinomycetospora soli	strain=SF1	GCA_021026295.1	2893887	2893887	type	True	74.988	188	1264	95	below_threshold
Amycolatopsis thailandensis	strain=JCM 16380	GCA_002234405.1	589330	589330	type	True	74.9635	149	1264	95	below_threshold
Thioalbus denitrificans	strain=DSM 26407	GCA_003337735.1	547122	547122	type	True	74.9625	90	1264	95	below_threshold
Rhodopseudomonas pentothenatexigens	strain=JA575	GCA_003385925.1	999699	999699	type	True	74.9351	88	1264	95	below_threshold
Rubrivivax benzoatilyticus	strain=JA2	GCA_000190375.2	316997	316997	type	True	74.9333	110	1264	95	below_threshold
Rhodopseudomonas pentothenatexigens	strain=JA575	GCA_900218015.1	999699	999699	type	True	74.9313	89	1264	95	below_threshold
Saccharothrix espanaensis	strain=type strain: DSM 44229	GCA_000328705.1	103731	103731	type	True	74.9043	196	1264	95	below_threshold
Caulobacter flavus	strain=CGMCC1 15093	GCA_002858845.1	1679497	1679497	type	True	74.6866	111	1264	95	below_threshold
--------------------------------------------------------------------------------
[2023-06-27 01:36:09,048] [INFO] DFAST Taxonomy check result was written to GCA_026708055.1_ASM2670805v1_genomic.fna/tc_result.tsv
[2023-06-27 01:36:09,049] [INFO] ===== Taxonomy check completed =====
[2023-06-27 01:36:09,049] [INFO] ===== Start completeness check using CheckM =====
[2023-06-27 01:36:09,050] [INFO] Setting CHECKM_DATA_PATH to /var/lib/cwl/stgea62f973-8cd3-4b52-a298-db8943979c7b/dqc_reference/checkm_data
[2023-06-27 01:36:09,052] [INFO] Selected 'Prokaryote' markers (life, taxid=0) for CheckM
[2023-06-27 01:36:09,096] [INFO] Task started: CheckM
[2023-06-27 01:36:09,097] [INFO] Running command: checkm taxonomy_wf --tab_table -f GCA_026708055.1_ASM2670805v1_genomic.fna/cc_result.tsv -t 1 life "Prokaryote" GCA_026708055.1_ASM2670805v1_genomic.fna/checkm_input GCA_026708055.1_ASM2670805v1_genomic.fna/checkm_result
[2023-06-27 01:36:49,651] [INFO] Task succeeded: CheckM
[2023-06-27 01:36:49,653] [INFO] Completeness check finished.
--------------------------------------------------------------------------------
Completeness: 100.00%
Contamintation: 0.00%
Strain heterogeneity: 0.00%
--------------------------------------------------------------------------------
[2023-06-27 01:36:49,675] [INFO] ===== Completeness check finished =====
[2023-06-27 01:36:49,675] [INFO] ===== Start GTDB Search =====
[2023-06-27 01:36:49,676] [INFO] Query marker FASTA already exists. Will reuse it. (GCA_026708055.1_ASM2670805v1_genomic.fna/markers.fasta)
[2023-06-27 01:36:49,676] [INFO] Task started: Blastn
[2023-06-27 01:36:49,676] [INFO] Running command: blastn -query GCA_026708055.1_ASM2670805v1_genomic.fna/markers.fasta -db /var/lib/cwl/stgea62f973-8cd3-4b52-a298-db8943979c7b/dqc_reference/reference_markers_gtdb.fasta -out GCA_026708055.1_ASM2670805v1_genomic.fna/blast.markers.gtdb.tsv -outfmt 6 -max_hsps 1 -num_alignments 5
[2023-06-27 01:36:50,835] [INFO] Task succeeded: Blastn
[2023-06-27 01:36:50,839] [INFO] Selected 20 target genomes.
[2023-06-27 01:36:50,839] [INFO] Target genome list was writen to GCA_026708055.1_ASM2670805v1_genomic.fna/target_genomes_gtdb.txt
[2023-06-27 01:36:50,847] [INFO] Task started: fastANI
[2023-06-27 01:36:50,847] [INFO] Running command: fastANI --query /var/lib/cwl/stgb0e42cea-19f6-4d19-839b-5958cecf1a65/GCA_026708055.1_ASM2670805v1_genomic.fna.gz --refList GCA_026708055.1_ASM2670805v1_genomic.fna/target_genomes_gtdb.txt --output GCA_026708055.1_ASM2670805v1_genomic.fna/fastani_result_gtdb.tsv --threads 1
[2023-06-27 01:37:02,093] [INFO] Task succeeded: fastANI
[2023-06-27 01:37:02,115] [INFO] Found 20 fastANI hits (0 hits with ANI > circumscription radius)
[2023-06-27 01:37:02,116] [INFO] GTDB search result
--------------------------------------------------------------------------------
accession	gtdb_species	ani	matched_fragments	total_fragments	gtdb_taxonomy	ani_circumscription_radius	mean_intra_species_ani	min_intra_species_ani	mean_intra_species_af	min_intra_species_af	num_clustered_genomes	status
GCA_009835395.1	s__VXNF01 sp009835395	89.714	822	1264	d__Bacteria;p__Actinobacteriota;c__Acidimicrobiia;o__Acidimicrobiales;f__Bin134;g__VXNF01	95.0	99.94	99.92	0.97	0.97	3	-
GCA_009840735.1	s__VXNF01 sp009840735	84.6103	677	1264	d__Bacteria;p__Actinobacteriota;c__Acidimicrobiia;o__Acidimicrobiales;f__Bin134;g__VXNF01	95.0	98.98	98.92	0.89	0.88	3	-
GCA_016845305.1	s__UBA9410 sp016845305	77.5438	211	1264	d__Bacteria;p__Actinobacteriota;c__Acidimicrobiia;o__Acidimicrobiales;f__MedAcidi-G1;g__UBA9410	95.0	N/A	N/A	N/A	N/A	1	-
GCA_009837445.1	s__VYCW01 sp009837445	77.3633	272	1264	d__Bacteria;p__Actinobacteriota;c__Acidimicrobiia;o__Acidimicrobiales;f__UBA11606;g__VYCW01	95.0	99.43	98.40	0.97	0.95	9	-
GCA_009843915.1	s__VYCW01 sp009843915	77.2959	227	1264	d__Bacteria;p__Actinobacteriota;c__Acidimicrobiia;o__Acidimicrobiales;f__UBA11606;g__VYCW01	95.0	97.45	96.57	0.82	0.78	5	-
GCA_002727895.1	s__UBA9410 sp002727895	76.9182	124	1264	d__Bacteria;p__Actinobacteriota;c__Acidimicrobiia;o__Acidimicrobiales;f__MedAcidi-G1;g__UBA9410	95.0	97.12	97.12	0.93	0.93	2	-
GCA_014381945.1	s__S20-B6 sp014381945	76.902	150	1264	d__Bacteria;p__Actinobacteriota;c__Acidimicrobiia;o__Acidimicrobiales;f__MedAcidi-G1;g__S20-B6	95.0	N/A	N/A	N/A	N/A	1	-
GCA_016716005.1	s__JADJXE01 sp016716005	76.5784	211	1264	d__Bacteria;p__Actinobacteriota;c__Acidimicrobiia;o__Acidimicrobiales;f__JADJXE01;g__JADJXE01	95.0	N/A	N/A	N/A	N/A	1	-
GCA_011051915.1	s__DRKF01 sp011051915	76.5739	112	1264	d__Bacteria;p__Actinobacteriota;c__Acidimicrobiia;o__Acidimicrobiales;f__TK06;g__DRKF01	95.0	N/A	N/A	N/A	N/A	1	-
GCA_017577565.1	s__ZC4RG19 sp017577565	76.3531	226	1264	d__Bacteria;p__Actinobacteriota;c__Acidimicrobiia;o__Acidimicrobiales;f__JACDCH01;g__ZC4RG19	95.0	N/A	N/A	N/A	N/A	1	-
GCA_902805555.1	s__CADCSY01 sp902805555	76.2955	133	1264	d__Bacteria;p__Actinobacteriota;c__Acidimicrobiia;o__Acidimicrobiales;f__CADCSY01;g__CADCSY01	95.0	N/A	N/A	N/A	N/A	1	-
GCA_903868545.1	s__CAIPVR01 sp903868545	76.1999	168	1264	d__Bacteria;p__Actinobacteriota;c__Acidimicrobiia;o__Acidimicrobiales;f__Microtrichaceae;g__CAIPVR01	95.0	99.51	99.50	0.87	0.86	3	-
GCA_902805665.1	s__CADCTB01 sp902805665	76.1893	125	1264	d__Bacteria;p__Actinobacteriota;c__Acidimicrobiia;o__Acidimicrobiales;f__AC-14;g__CADCTB01	95.0	N/A	N/A	N/A	N/A	1	-
GCA_016185275.1	s__JACPNX01 sp016185275	76.1422	165	1264	d__Bacteria;p__Actinobacteriota;c__Acidimicrobiia;o__Acidimicrobiales;f__JAAYBP01;g__JACPNX01	95.0	N/A	N/A	N/A	N/A	1	-
GCA_902805655.1	s__CADCTF01 sp902805655	76.1173	111	1264	d__Bacteria;p__Actinobacteriota;c__Acidimicrobiia;o__Acidimicrobiales;f__CADCTF01;g__CADCTF01	95.0	N/A	N/A	N/A	N/A	1	-
GCA_012730075.1	s__JAAYBP01 sp012730075	75.7437	164	1264	d__Bacteria;p__Actinobacteriota;c__Acidimicrobiia;o__Acidimicrobiales;f__JAAYBP01;g__JAAYBP01	95.0	N/A	N/A	N/A	N/A	1	-
GCA_016936035.1	s__ZC4RG19 sp016936035	75.4613	150	1264	d__Bacteria;p__Actinobacteriota;c__Acidimicrobiia;o__Acidimicrobiales;f__JACDCH01;g__ZC4RG19	95.0	N/A	N/A	N/A	N/A	1	-
GCF_014180685.1	s__Nocardioides pelophilus	75.3204	107	1264	d__Bacteria;p__Actinobacteriota;c__Actinomycetia;o__Propionibacteriales;f__Nocardioidaceae;g__Nocardioides	95.0	N/A	N/A	N/A	N/A	1	-
GCA_016793875.1	s__SSA4 sp016793875	74.9812	59	1264	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhodospirillales;f__UXAT02;g__SSA4	95.0	N/A	N/A	N/A	N/A	1	-
GCF_007097155.1	s__Glacieibacterium frigidum	74.8712	64	1264	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Sphingomonadales;f__Sphingomonadaceae;g__Glacieibacterium	95.0	N/A	N/A	N/A	N/A	1	-
--------------------------------------------------------------------------------
[2023-06-27 01:37:02,118] [INFO] GTDB search result was written to GCA_026708055.1_ASM2670805v1_genomic.fna/result_gtdb.tsv
[2023-06-27 01:37:02,119] [INFO] ===== GTDB Search completed =====
[2023-06-27 01:37:02,128] [INFO] DFAST_QC result json was written to GCA_026708055.1_ASM2670805v1_genomic.fna/dqc_result.json
[2023-06-27 01:37:02,128] [INFO] DFAST_QC completed!
[2023-06-27 01:37:02,129] [INFO] Total running time: 0h1m31s