[2024-01-24 12:06:13,404] [INFO] DFAST_QC pipeline started.
[2024-01-24 12:06:13,409] [INFO] DFAST_QC version: 0.5.7
[2024-01-24 12:06:13,410] [INFO] DQC Reference Directory: /var/lib/cwl/stg193d32b8-9001-4307-b01a-669232984661/dqc_reference
[2024-01-24 12:06:14,792] [INFO] ===== Start taxonomy check using ANI =====
[2024-01-24 12:06:14,793] [INFO] Task started: Prodigal
[2024-01-24 12:06:14,793] [INFO] Running command: gunzip -c /var/lib/cwl/stg53e1d252-016f-4892-8763-f5df552641fd/GCF_022811525.1_ASM2281152v1_genomic.fna.gz | prodigal -d GCF_022811525.1_ASM2281152v1_genomic.fna/cds.fna -a GCF_022811525.1_ASM2281152v1_genomic.fna/protein.faa -g 11 -q > /dev/null
[2024-01-24 12:06:24,534] [INFO] Task succeeded: Prodigal
[2024-01-24 12:06:24,534] [INFO] Task started: HMMsearch
[2024-01-24 12:06:24,534] [INFO] Running command: hmmsearch --tblout GCF_022811525.1_ASM2281152v1_genomic.fna/hmmer_result.tsv -E 1E-50 /var/lib/cwl/stg193d32b8-9001-4307-b01a-669232984661/dqc_reference/reference_markers.hmm GCF_022811525.1_ASM2281152v1_genomic.fna/protein.faa > /dev/null
[2024-01-24 12:06:24,814] [INFO] Task succeeded: HMMsearch
[2024-01-24 12:06:24,815] [INFO] Found 6/6 markers.
[2024-01-24 12:06:24,840] [INFO] Query marker FASTA was written to GCF_022811525.1_ASM2281152v1_genomic.fna/markers.fasta
[2024-01-24 12:06:24,841] [INFO] Task started: Blastn
[2024-01-24 12:06:24,841] [INFO] Running command: blastn -query GCF_022811525.1_ASM2281152v1_genomic.fna/markers.fasta -db /var/lib/cwl/stg193d32b8-9001-4307-b01a-669232984661/dqc_reference/reference_markers.fasta -out GCF_022811525.1_ASM2281152v1_genomic.fna/blast.markers.tsv -outfmt 6 -max_hsps 1 -num_alignments 5
[2024-01-24 12:06:25,644] [INFO] Task succeeded: Blastn
[2024-01-24 12:06:25,648] [INFO] Selected 32 target genomes.
[2024-01-24 12:06:25,648] [INFO] Target genome list was written to GCF_022811525.1_ASM2281152v1_genomic.fna/target_genomes.txt
[2024-01-24 12:06:25,660] [INFO] Task started: fastANI
[2024-01-24 12:06:25,660] [INFO] Running command: fastANI --query /var/lib/cwl/stg53e1d252-016f-4892-8763-f5df552641fd/GCF_022811525.1_ASM2281152v1_genomic.fna.gz --refList GCF_022811525.1_ASM2281152v1_genomic.fna/target_genomes.txt --output GCF_022811525.1_ASM2281152v1_genomic.fna/fastani_result.tsv --threads 1
[2024-01-24 12:06:45,291] [INFO] Task succeeded: fastANI
[2024-01-24 12:06:45,291] [INFO] Loading species specific ANI threshold from /var/lib/cwl/stg193d32b8-9001-4307-b01a-669232984661/dqc_reference/prokaryote_ANI_species_specific_threshold.txt
[2024-01-24 12:06:45,292] [WARNING] Species-specific ANI threshold file not found. Will use the default threshold for all species. [/var/lib/cwl/stg193d32b8-9001-4307-b01a-669232984661/dqc_reference/prokaryote_ANI_species_specific_threshold.txt]
[2024-01-24 12:06:45,304] [INFO] Found 14 fastANI hits (0 hits with ANI > threshold)
[2024-01-24 12:06:45,304] [INFO] The taxonomy check result is classified as 'below_threshold'.
[2024-01-24 12:06:45,305] [INFO] DFAST Taxonomy check final result
--------------------------------------------------------------------------------
organism_name	strain	accession	taxid	species_taxid	relation_to_type	validated	ani	matched_fragments	total_fragments	ani_threshold	status
Bordetella bronchialis	strain=AU3182	GCA_001676705.1	463025	463025	type	True	77.8944	52	934	95	below_threshold
Achromobacter denitrificans	strain=NCTC8582	GCA_900444675.1	32002	32002	type	True	77.661	55	934	95	below_threshold
Bordetella parapertussis	strain=FDAARGOS_1541	GCA_020735925.1	519	519	suspected-type	True	77.4204	68	934	95	below_threshold
Bordetella pertussis	strain=FDAARGOS_1543	GCA_020736265.1	520	520	type	True	77.3349	66	934	95	below_threshold
Achromobacter denitrificans	strain=FDAARGOS_786	GCA_013267395.1	32002	32002	type	True	77.3341	54	934	95	below_threshold
Orrella marina	strain=HZ20	GCA_003058465.1	2163011	2163011	type	True	77.258	82	934	95	below_threshold
Bordetella bronchiseptica	strain=NCTC452	GCA_900445725.1	518	518	suspected-type	True	77.252	68	934	95	below_threshold
Bordetella pertussis	strain=18323	GCA_000306945.1	520	520	type	True	77.2442	67	934	95	below_threshold
Achromobacter anxifer	strain=LMG 26857	GCA_903652925.1	1287737	1287737	type	True	76.9281	57	934	95	below_threshold
Candidimonas nitroreducens	strain=SC-089	GCA_002209565.1	683354	683354	type	True	76.8477	54	934	95	below_threshold
Achromobacter denitrificans	strain=LMG 1231	GCA_902859715.1	32002	32002	type	True	76.6725	52	934	95	below_threshold
Achromobacter denitrificans	strain=NBRC 15125	GCA_001571365.1	32002	32002	type	True	76.42	53	934	95	below_threshold
Bordetella bronchiseptica	strain=NBRC 13691	GCA_001598655.1	518	518	suspected-type	True	76.3772	63	934	95	below_threshold
Candidimonas humi	strain=DSM 25336	GCA_019166065.1	683355	683355	type	True	76.1371	53	934	95	below_threshold
--------------------------------------------------------------------------------
[2024-01-24 12:06:45,307] [INFO] DFAST Taxonomy check result was written to GCF_022811525.1_ASM2281152v1_genomic.fna/tc_result.tsv
[2024-01-24 12:06:45,307] [INFO] ===== Taxonomy check completed =====
[2024-01-24 12:06:45,307] [INFO] ===== Start completeness check using CheckM =====
[2024-01-24 12:06:45,308] [INFO] Setting CHECKM_DATA_PATH to /var/lib/cwl/stg193d32b8-9001-4307-b01a-669232984661/dqc_reference/checkm_data
[2024-01-24 12:06:45,309] [INFO] Selected 'Prokaryote' markers (life, taxid=0) for CheckM
[2024-01-24 12:06:45,339] [INFO] Task started: CheckM
[2024-01-24 12:06:45,339] [INFO] Running command: checkm taxonomy_wf --tab_table -f GCF_022811525.1_ASM2281152v1_genomic.fna/cc_result.tsv -t 1 life "Prokaryote" GCF_022811525.1_ASM2281152v1_genomic.fna/checkm_input GCF_022811525.1_ASM2281152v1_genomic.fna/checkm_result
[2024-01-24 12:07:19,750] [INFO] Task succeeded: CheckM
[2024-01-24 12:07:19,751] [INFO] Completeness check finished.
--------------------------------------------------------------------------------
Completeness: 100.00%
Contamination: 0.00%
Strain heterogeneity: 0.00%
--------------------------------------------------------------------------------
[2024-01-24 12:07:19,773] [INFO] ===== Completeness check finished =====
[2024-01-24 12:07:19,774] [INFO] ===== Start GTDB Search =====
[2024-01-24 12:07:19,774] [INFO] Query marker FASTA already exists. Will reuse it. (GCF_022811525.1_ASM2281152v1_genomic.fna/markers.fasta)
[2024-01-24 12:07:19,775] [INFO] Task started: Blastn
[2024-01-24 12:07:19,775] [INFO] Running command: blastn -query GCF_022811525.1_ASM2281152v1_genomic.fna/markers.fasta -db /var/lib/cwl/stg193d32b8-9001-4307-b01a-669232984661/dqc_reference/reference_markers_gtdb.fasta -out GCF_022811525.1_ASM2281152v1_genomic.fna/blast.markers.gtdb.tsv -outfmt 6 -max_hsps 1 -num_alignments 5
[2024-01-24 12:07:20,896] [INFO] Task succeeded: Blastn
[2024-01-24 12:07:20,902] [INFO] Selected 33 target genomes.
[2024-01-24 12:07:20,903] [INFO] Target genome list was written to GCF_022811525.1_ASM2281152v1_genomic.fna/target_genomes_gtdb.txt
[2024-01-24 12:07:20,956] [INFO] Task started: fastANI
[2024-01-24 12:07:20,957] [INFO] Running command: fastANI --query /var/lib/cwl/stg53e1d252-016f-4892-8763-f5df552641fd/GCF_022811525.1_ASM2281152v1_genomic.fna.gz --refList GCF_022811525.1_ASM2281152v1_genomic.fna/target_genomes_gtdb.txt --output GCF_022811525.1_ASM2281152v1_genomic.fna/fastani_result_gtdb.tsv --threads 1
[2024-01-24 12:07:42,950] [INFO] Task succeeded: fastANI
[2024-01-24 12:07:42,970] [INFO] Found 18 fastANI hits (0 hits with ANI > circumscription radius)
[2024-01-24 12:07:42,970] [INFO] GTDB search result
--------------------------------------------------------------------------------
accession	gtdb_species	ani	matched_fragments	total_fragments	gtdb_taxonomy	ani_circumscription_radius	mean_intra_species_ani	min_intra_species_ani	mean_intra_species_af	min_intra_species_af	num_clustered_genomes	status
GCF_002261215.1	s__Bordetella_C sp002261215	78.2424	55	934	d__Bacteria;p__Proteobacteria;c__Gammaproteobacteria;o__Burkholderiales;f__Burkholderiaceae;g__Bordetella_C	95.0	N/A	N/A	N/A	N/A	1	-
GCF_900078705.1	s__Bordetella_B ansorpii_A	77.6303	52	934	d__Bacteria;p__Proteobacteria;c__Gammaproteobacteria;o__Burkholderiales;f__Burkholderiaceae;g__Bordetella_B	95.0	N/A	N/A	N/A	N/A	1	-
GCF_900445835.1	s__Bordetella avium	77.5681	62	934	d__Bacteria;p__Proteobacteria;c__Gammaproteobacteria;o__Burkholderiales;f__Burkholderiaceae;g__Bordetella	95.0	99.81	99.47	0.98	0.95	26	-
GCF_900089455.2	s__Orrella dioscoreae	77.491	50	934	d__Bacteria;p__Proteobacteria;c__Gammaproteobacteria;o__Burkholderiales;f__Burkholderiaceae;g__Orrella	95.0	N/A	N/A	N/A	N/A	1	-
GCF_016127315.1	s__Achromobacter insuavis_A	77.4563	64	934	d__Bacteria;p__Proteobacteria;c__Gammaproteobacteria;o__Burkholderiales;f__Burkholderiaceae;g__Achromobacter	95.6014	97.14	97.13	0.90	0.90	4	-
GCF_003058465.1	s__Algicoccus marinus	77.4368	83	934	d__Bacteria;p__Proteobacteria;c__Gammaproteobacteria;o__Burkholderiales;f__Burkholderiaceae;g__Algicoccus	95.0	N/A	N/A	N/A	N/A	1	-
GCA_007116135.1	s__Algicoccus sp007116135	77.4003	119	934	d__Bacteria;p__Proteobacteria;c__Gammaproteobacteria;o__Burkholderiales;f__Burkholderiaceae;g__Algicoccus	95.0	N/A	N/A	N/A	N/A	1	-
GCF_000306945.1	s__Bordetella pertussis	77.233	67	934	d__Bacteria;p__Proteobacteria;c__Gammaproteobacteria;o__Burkholderiales;f__Burkholderiaceae;g__Bordetella	95.0	99.42	95.13	0.93	0.83	983	-
GCF_002902905.1	s__Achromobacter sp002902905	77.2232	69	934	d__Bacteria;p__Proteobacteria;c__Gammaproteobacteria;o__Burkholderiales;f__Burkholderiaceae;g__Achromobacter	95.0	96.15	96.15	0.87	0.87	2	-
GCF_003994415.1	s__Achromobacter spanius_C	77.1591	63	934	d__Bacteria;p__Proteobacteria;c__Gammaproteobacteria;o__Burkholderiales;f__Burkholderiaceae;g__Achromobacter	95.0	98.60	98.60	0.94	0.94	2	-
GCF_009763255.1	s__Bordetella_A sp009763255	76.8848	58	934	d__Bacteria;p__Proteobacteria;c__Gammaproteobacteria;o__Burkholderiales;f__Burkholderiaceae;g__Bordetella_A	95.0	97.90	97.90	0.93	0.93	2	-
GCF_002209565.1	s__Candidimonas nitroreducens	76.8477	54	934	d__Bacteria;p__Proteobacteria;c__Gammaproteobacteria;o__Burkholderiales;f__Burkholderiaceae;g__Candidimonas	95.0	N/A	N/A	N/A	N/A	1	-
GCA_017857895.1	s__Algicoccus sp017857895	76.8471	76	934	d__Bacteria;p__Proteobacteria;c__Gammaproteobacteria;o__Burkholderiales;f__Burkholderiaceae;g__Algicoccus	95.0	N/A	N/A	N/A	N/A	1	-
GCA_903894155.1	s__NBD-18 sp903894155	76.6521	61	934	d__Bacteria;p__Proteobacteria;c__Gammaproteobacteria;o__Burkholderiales;f__Burkholderiaceae;g__NBD-18	95.0	99.08	98.85	0.85	0.83	4	-
GCA_903821335.1	s__NBD-18 sp903821335	76.3489	59	934	d__Bacteria;p__Proteobacteria;c__Gammaproteobacteria;o__Burkholderiales;f__Burkholderiaceae;g__NBD-18	95.0	99.43	99.29	0.86	0.84	5	-
GCF_019166065.1	s__Candidimonas humi	76.1371	53	934	d__Bacteria;p__Proteobacteria;c__Gammaproteobacteria;o__Burkholderiales;f__Burkholderiaceae;g__Candidimonas	95.0	N/A	N/A	N/A	N/A	1	-
GCF_900078335.1	s__Bordetella trematum	75.8868	60	934	d__Bacteria;p__Proteobacteria;c__Gammaproteobacteria;o__Burkholderiales;f__Burkholderiaceae;g__Bordetella	95.0	99.15	97.27	0.96	0.93	14	-
GCF_002188635.1	s__Pigmentiphaga sp002188635	75.8831	58	934	d__Bacteria;p__Proteobacteria;c__Gammaproteobacteria;o__Burkholderiales;f__Burkholderiaceae;g__Pigmentiphaga	95.0	99.65	99.31	0.97	0.93	3	-
--------------------------------------------------------------------------------
[2024-01-24 12:07:42,978] [INFO] GTDB search result was written to GCF_022811525.1_ASM2281152v1_genomic.fna/result_gtdb.tsv
[2024-01-24 12:07:42,979] [INFO] ===== GTDB Search completed =====
[2024-01-24 12:07:42,985] [INFO] DFAST_QC result json was written to GCF_022811525.1_ASM2281152v1_genomic.fna/dqc_result.json
[2024-01-24 12:07:42,985] [INFO] DFAST_QC completed!
[2024-01-24 12:07:42,986] [INFO] Total running time: 0h1m30s
