[2024-01-25 19:58:50,751] [INFO] DFAST_QC pipeline started.
[2024-01-25 19:58:50,752] [INFO] DFAST_QC version: 0.5.7
[2024-01-25 19:58:50,753] [INFO] DQC Reference Directory: /var/lib/cwl/stgbbe38403-283b-41b9-bf48-1fc7af541216/dqc_reference
[2024-01-25 19:58:51,999] [INFO] ===== Start taxonomy check using ANI =====
[2024-01-25 19:58:52,000] [INFO] Task started: Prodigal
[2024-01-25 19:58:52,000] [INFO] Running command: gunzip -c /var/lib/cwl/stge4a8c956-6e7b-4400-8e6d-b9ca5f4995e7/GCF_009707515.1_ASM970751v1_genomic.fna.gz | prodigal -d GCF_009707515.1_ASM970751v1_genomic.fna/cds.fna -a GCF_009707515.1_ASM970751v1_genomic.fna/protein.faa -g 11 -q > /dev/null
[2024-01-25 19:59:10,303] [INFO] Task succeeded: Prodigal
[2024-01-25 19:59:10,303] [INFO] Task started: HMMsearch
[2024-01-25 19:59:10,303] [INFO] Running command: hmmsearch --tblout GCF_009707515.1_ASM970751v1_genomic.fna/hmmer_result.tsv -E 1E-50 /var/lib/cwl/stgbbe38403-283b-41b9-bf48-1fc7af541216/dqc_reference/reference_markers.hmm GCF_009707515.1_ASM970751v1_genomic.fna/protein.faa > /dev/null
[2024-01-25 19:59:10,630] [INFO] Task succeeded: HMMsearch
[2024-01-25 19:59:10,632] [INFO] Found 6/6 markers.
[2024-01-25 19:59:10,682] [INFO] Query marker FASTA was written to GCF_009707515.1_ASM970751v1_genomic.fna/markers.fasta
[2024-01-25 19:59:10,682] [INFO] Task started: Blastn
[2024-01-25 19:59:10,682] [INFO] Running command: blastn -query GCF_009707515.1_ASM970751v1_genomic.fna/markers.fasta -db /var/lib/cwl/stgbbe38403-283b-41b9-bf48-1fc7af541216/dqc_reference/reference_markers.fasta -out GCF_009707515.1_ASM970751v1_genomic.fna/blast.markers.tsv -outfmt 6 -max_hsps 1 -num_alignments 5
[2024-01-25 19:59:11,448] [INFO] Task succeeded: Blastn
[2024-01-25 19:59:11,451] [INFO] Selected 24 target genomes.
[2024-01-25 19:59:11,451] [INFO] Target genome list was written to GCF_009707515.1_ASM970751v1_genomic.fna/target_genomes.txt
[2024-01-25 19:59:11,472] [INFO] Task started: fastANI
[2024-01-25 19:59:11,472] [INFO] Running command: fastANI --query /var/lib/cwl/stge4a8c956-6e7b-4400-8e6d-b9ca5f4995e7/GCF_009707515.1_ASM970751v1_genomic.fna.gz --refList GCF_009707515.1_ASM970751v1_genomic.fna/target_genomes.txt --output GCF_009707515.1_ASM970751v1_genomic.fna/fastani_result.tsv --threads 1
[2024-01-25 19:59:43,827] [INFO] Task succeeded: fastANI
[2024-01-25 19:59:43,827] [INFO] Loading species specific ANI threshold from /var/lib/cwl/stgbbe38403-283b-41b9-bf48-1fc7af541216/dqc_reference/prokaryote_ANI_species_specific_threshold.txt
[2024-01-25 19:59:43,827] [WARNING] Species-specific ANI threshold file not found. Will use the default threshold for all species. [/var/lib/cwl/stgbbe38403-283b-41b9-bf48-1fc7af541216/dqc_reference/prokaryote_ANI_species_specific_threshold.txt]
[2024-01-25 19:59:43,841] [INFO] Found 23 fastANI hits (1 hits with ANI > threshold)
[2024-01-25 19:59:43,841] [INFO] The taxonomy check result is classified as 'conclusive'.
[2024-01-25 19:59:43,841] [INFO] DFAST Taxonomy check final result
--------------------------------------------------------------------------------
organism_name	strain	accession	taxid	species_taxid	relation_to_type	validated	ani	matched_fragments	total_fragments	ani_threshold	status
Pseudomonas karstica	strain=CCM 7891	GCA_009707515.1	1055468	1055468	type	True	100.0	1874	1876	95	conclusive
Pseudomonas spelaei	strain=CCM 7893	GCA_009724245.1	1055469	1055469	type	True	88.1107	1413	1876	95	below_threshold
Pseudomonas yamanorum	strain=LMG 27247	GCA_900105735.1	515393	515393	suspected-type	True	86.4202	1398	1876	95	below_threshold
Pseudomonas brenneri	strain=JCM 13307	GCA_014646715.1	129817	129817	type	True	85.9769	1306	1876	95	below_threshold
Pseudomonas brenneri	strain=DSM 15294	GCA_007858285.1	129817	129817	type	True	85.9388	1310	1876	95	below_threshold
Pseudomonas gessardii	strain=DSM 17152	GCA_009671285.1	78544	78544	type	True	85.8661	1273	1876	95	below_threshold
Pseudomonas gessardii	strain=DSM 17152	GCA_001983165.1	78544	78544	type	True	85.85	1275	1876	95	below_threshold
Pseudomonas proteolytica	strain=DSM 15321	GCA_007858275.1	219574	219574	type	True	85.8381	1269	1876	95	below_threshold
Pseudomonas proteolytica	strain=CCUG 51515T	GCA_008692865.1	219574	219574	type	True	85.7826	1282	1876	95	below_threshold
Pseudomonas gessardii		GCA_900625085.1	78544	78544	type	True	85.7807	1290	1876	95	below_threshold
Pseudomonas marginalis	strain=DSM 13124	GCA_007858155.1	298	298	suspected-type	True	85.5582	1301	1876	95	below_threshold
Pseudomonas simiae	strain=CCUG 50988	GCA_001730615.1	321846	321846	type	True	85.2685	1287	1876	95	below_threshold
Pseudomonas pergaminensis	strain=1008	GCA_024112395.1	2853159	2853159	type	True	85.2151	1363	1876	95	below_threshold
Pseudomonas lurida	strain=LMG 21995	GCA_002563895.1	244566	244566	type	True	85.1692	1339	1876	95	below_threshold
Pseudomonas fluorescens	strain=DSM 50090	GCA_007858165.1	294	294	suspected-type	True	85.1638	1316	1876	95	below_threshold
Pseudomonas simiae	strain=CCUG 50988	GCA_900111895.1	321846	321846	type	True	85.1551	1297	1876	95	below_threshold
Pseudomonas fluorescens	strain=NCTC10038	GCA_900475215.1	294	294	suspected-type	True	85.1466	1326	1876	95	below_threshold
Pseudomonas palleroniana	strain=LMG 23076	GCA_003031675.1	191390	191390	type	True	84.7842	1282	1876	95	below_threshold
Pseudomonas kielensis	strain=MBT-1	GCA_014236655.1	2762577	2762577	type	True	83.4899	1047	1876	95	below_threshold
Pseudomonas izuensis	strain=IzPS43_3003	GCA_009861505.1	2684212	2684212	type	True	82.929	1061	1876	95	below_threshold
Actinoplanes digitatis	strain=DSM 43149	GCA_014205335.1	1868	1868	type	True	74.8339	56	1876	95	below_threshold
Actinoplanes digitatis	strain=NBRC 12512	GCA_016862155.1	1868	1868	type	True	74.8119	55	1876	95	below_threshold
Actinoplanes utahensis	strain=NBRC 13244	GCA_016862455.1	1869	1869	type	True	74.5796	50	1876	95	below_threshold
--------------------------------------------------------------------------------
[2024-01-25 19:59:43,842] [INFO] DFAST Taxonomy check result was written to GCF_009707515.1_ASM970751v1_genomic.fna/tc_result.tsv
[2024-01-25 19:59:43,843] [INFO] ===== Taxonomy check completed =====
[2024-01-25 19:59:43,843] [INFO] ===== Start completeness check using CheckM =====
[2024-01-25 19:59:43,843] [INFO] Setting CHECKM_DATA_PATH to /var/lib/cwl/stgbbe38403-283b-41b9-bf48-1fc7af541216/dqc_reference/checkm_data
[2024-01-25 19:59:43,844] [INFO] Selected 'Prokaryote' markers (life, taxid=0) for CheckM
[2024-01-25 19:59:43,898] [INFO] Task started: CheckM
[2024-01-25 19:59:43,898] [INFO] Running command: checkm taxonomy_wf --tab_table -f GCF_009707515.1_ASM970751v1_genomic.fna/cc_result.tsv -t 1 life "Prokaryote" GCF_009707515.1_ASM970751v1_genomic.fna/checkm_input GCF_009707515.1_ASM970751v1_genomic.fna/checkm_result
[2024-01-25 20:00:39,494] [INFO] Task succeeded: CheckM
[2024-01-25 20:00:39,497] [INFO] Completeness check finished.
--------------------------------------------------------------------------------
Completeness: 100.00%
Contamination: 0.00%
Strain heterogeneity: 0.00%
--------------------------------------------------------------------------------
[2024-01-25 20:00:39,521] [INFO] ===== Completeness check finished =====
[2024-01-25 20:00:39,521] [INFO] ===== Start GTDB Search =====
[2024-01-25 20:00:39,522] [INFO] Query marker FASTA already exists. Will reuse it. (GCF_009707515.1_ASM970751v1_genomic.fna/markers.fasta)
[2024-01-25 20:00:39,522] [INFO] Task started: Blastn
[2024-01-25 20:00:39,522] [INFO] Running command: blastn -query GCF_009707515.1_ASM970751v1_genomic.fna/markers.fasta -db /var/lib/cwl/stgbbe38403-283b-41b9-bf48-1fc7af541216/dqc_reference/reference_markers_gtdb.fasta -out GCF_009707515.1_ASM970751v1_genomic.fna/blast.markers.gtdb.tsv -outfmt 6 -max_hsps 1 -num_alignments 5
[2024-01-25 20:00:40,840] [INFO] Task succeeded: Blastn
[2024-01-25 20:00:40,843] [INFO] Selected 17 target genomes.
[2024-01-25 20:00:40,843] [INFO] Target genome list was written to GCF_009707515.1_ASM970751v1_genomic.fna/target_genomes_gtdb.txt
[2024-01-25 20:00:40,876] [INFO] Task started: fastANI
[2024-01-25 20:00:40,877] [INFO] Running command: fastANI --query /var/lib/cwl/stge4a8c956-6e7b-4400-8e6d-b9ca5f4995e7/GCF_009707515.1_ASM970751v1_genomic.fna.gz --refList GCF_009707515.1_ASM970751v1_genomic.fna/target_genomes_gtdb.txt --output GCF_009707515.1_ASM970751v1_genomic.fna/fastani_result_gtdb.tsv --threads 1
[2024-01-25 20:01:07,093] [INFO] Task succeeded: fastANI
[2024-01-25 20:01:07,104] [INFO] Found 17 fastANI hits (1 hits with ANI > circumscription radius)
[2024-01-25 20:01:07,104] [INFO] GTDB search result
--------------------------------------------------------------------------------
accession	gtdb_species	ani	matched_fragments	total_fragments	gtdb_taxonomy	ani_circumscription_radius	mean_intra_species_ani	min_intra_species_ani	mean_intra_species_af	min_intra_species_af	num_clustered_genomes	status
GCF_009707515.1	s__Pseudomonas_E sp009707515	100.0	1874	1876	d__Bacteria;p__Proteobacteria;c__Gammaproteobacteria;o__Pseudomonadales;f__Pseudomonadaceae;g__Pseudomonas_E	95.0	N/A	N/A	N/A	N/A	1	conclusive
GCF_018614655.1	s__Pseudomonas_E fluorescens_BX	88.1924	1355	1876	d__Bacteria;p__Proteobacteria;c__Gammaproteobacteria;o__Pseudomonadales;f__Pseudomonadaceae;g__Pseudomonas_E	95.0	N/A	N/A	N/A	N/A	1	-
GCF_013403585.1	s__Pseudomonas_E yamanorum_B	88.0438	1415	1876	d__Bacteria;p__Proteobacteria;c__Gammaproteobacteria;o__Pseudomonadales;f__Pseudomonadaceae;g__Pseudomonas_E	95.0	98.31	98.31	0.89	0.89	2	-
GCF_012935695.1	s__Pseudomonas_E sp012935695	88.0147	1433	1876	d__Bacteria;p__Proteobacteria;c__Gammaproteobacteria;o__Pseudomonadales;f__Pseudomonadaceae;g__Pseudomonas_E	95.0	N/A	N/A	N/A	N/A	1	-
GCF_013386765.1	s__Pseudomonas_E yamanorum_A	86.6298	1413	1876	d__Bacteria;p__Proteobacteria;c__Gammaproteobacteria;o__Pseudomonadales;f__Pseudomonadaceae;g__Pseudomonas_E	95.0	98.70	98.16	0.94	0.91	5	-
GCF_008369295.1	s__Pseudomonas_E sp008369295	86.6156	1284	1876	d__Bacteria;p__Proteobacteria;c__Gammaproteobacteria;o__Pseudomonadales;f__Pseudomonadaceae;g__Pseudomonas_E	95.0	99.98	99.98	0.99	0.99	2	-
GCF_012935715.1	s__Pseudomonas_E sp000242655	86.5463	1413	1876	d__Bacteria;p__Proteobacteria;c__Gammaproteobacteria;o__Pseudomonadales;f__Pseudomonadaceae;g__Pseudomonas_E	95.0	98.42	98.02	0.94	0.92	3	-
GCF_002874965.1	s__Pseudomonas_E sp002874965	86.5238	1406	1876	d__Bacteria;p__Proteobacteria;c__Gammaproteobacteria;o__Pseudomonadales;f__Pseudomonadaceae;g__Pseudomonas_E	95.0	98.77	98.46	0.92	0.88	48	-
GCF_013386825.1	s__Pseudomonas_E sp013386825	86.4839	1388	1876	d__Bacteria;p__Proteobacteria;c__Gammaproteobacteria;o__Pseudomonadales;f__Pseudomonadaceae;g__Pseudomonas_E	95.0	99.99	99.99	0.99	0.99	2	-
GCF_007858285.1	s__Pseudomonas_E brenneri	85.9492	1309	1876	d__Bacteria;p__Proteobacteria;c__Gammaproteobacteria;o__Pseudomonadales;f__Pseudomonadaceae;g__Pseudomonas_E	95.0	99.23	97.33	0.94	0.88	10	-
GCF_003626995.1	s__Pseudomonas_E fluorescens_BA	85.9484	1301	1876	d__Bacteria;p__Proteobacteria;c__Gammaproteobacteria;o__Pseudomonadales;f__Pseudomonadaceae;g__Pseudomonas_E	95.0	98.95	98.57	0.93	0.88	19	-
GCF_012985465.1	s__Pseudomonas_E sp012985465	85.9014	1315	1876	d__Bacteria;p__Proteobacteria;c__Gammaproteobacteria;o__Pseudomonadales;f__Pseudomonadaceae;g__Pseudomonas_E	95.0	99.03	98.94	0.91	0.90	5	-
GCF_007858275.1	s__Pseudomonas_E proteolytica	85.8649	1266	1876	d__Bacteria;p__Proteobacteria;c__Gammaproteobacteria;o__Pseudomonadales;f__Pseudomonadaceae;g__Pseudomonas_E	95.0	98.69	98.38	0.90	0.86	19	-
GCF_000612585.1	s__Pseudomonas_E sp000612585	85.4768	1320	1876	d__Bacteria;p__Proteobacteria;c__Gammaproteobacteria;o__Pseudomonadales;f__Pseudomonadaceae;g__Pseudomonas_E	95.0	N/A	N/A	N/A	N/A	1	-
GCF_001186335.1	s__Pseudomonas_E trivialis_B	85.3443	1303	1876	d__Bacteria;p__Proteobacteria;c__Gammaproteobacteria;o__Pseudomonadales;f__Pseudomonadaceae;g__Pseudomonas_E	95.0	95.34	95.31	0.86	0.86	8	-
GCF_001439735.1	s__Pseudomonas_E paralactis	84.9335	1291	1876	d__Bacteria;p__Proteobacteria;c__Gammaproteobacteria;o__Pseudomonadales;f__Pseudomonadaceae;g__Pseudomonas_E	95.0	98.61	98.52	0.94	0.92	4	-
GCF_001439805.1	s__Pseudomonas_E trivialis	84.6594	1194	1876	d__Bacteria;p__Proteobacteria;c__Gammaproteobacteria;o__Pseudomonadales;f__Pseudomonadaceae;g__Pseudomonas_E	95.0	98.47	95.41	0.96	0.87	4	-
--------------------------------------------------------------------------------
[2024-01-25 20:01:07,105] [INFO] GTDB search result was written to GCF_009707515.1_ASM970751v1_genomic.fna/result_gtdb.tsv
[2024-01-25 20:01:07,106] [INFO] ===== GTDB Search completed =====
[2024-01-25 20:01:07,109] [INFO] DFAST_QC result json was written to GCF_009707515.1_ASM970751v1_genomic.fna/dqc_result.json
[2024-01-25 20:01:07,110] [INFO] DFAST_QC completed!
[2024-01-25 20:01:07,110] [INFO] Total running time: 0h2m16s
