[2023-06-13 01:40:16,084] [INFO] DFAST_QC pipeline started.
[2023-06-13 01:40:16,087] [INFO] DFAST_QC version: 0.5.7
[2023-06-13 01:40:16,087] [INFO] DQC Reference Directory: /var/lib/cwl/stg4605fae5-84b2-4c1a-994a-342d2e0fd3be/dqc_reference
[2023-06-13 01:40:17,272] [INFO] ===== Start taxonomy check using ANI =====
[2023-06-13 01:40:17,273] [INFO] Task started: Prodigal
[2023-06-13 01:40:17,273] [INFO] Running command: gunzip -c /var/lib/cwl/stg6b3c4b28-8538-4159-a562-f6bab67cfe00/GCA_022841585.1_ASM2284158v1_genomic.fna.gz | prodigal -d GCA_022841585.1_ASM2284158v1_genomic.fna/cds.fna -a GCA_022841585.1_ASM2284158v1_genomic.fna/protein.faa -g 11 -q > /dev/null
[2023-06-13 01:40:25,631] [INFO] Task succeeded: Prodigal
[2023-06-13 01:40:25,631] [INFO] Task started: HMMsearch
[2023-06-13 01:40:25,631] [INFO] Running command: hmmsearch --tblout GCA_022841585.1_ASM2284158v1_genomic.fna/hmmer_result.tsv -E 1E-50 /var/lib/cwl/stg4605fae5-84b2-4c1a-994a-342d2e0fd3be/dqc_reference/reference_markers.hmm GCA_022841585.1_ASM2284158v1_genomic.fna/protein.faa > /dev/null
[2023-06-13 01:40:25,890] [INFO] Task succeeded: HMMsearch
[2023-06-13 01:40:25,891] [INFO] Found 6/6 markers.
[2023-06-13 01:40:25,927] [INFO] Query marker FASTA was written to GCA_022841585.1_ASM2284158v1_genomic.fna/markers.fasta
[2023-06-13 01:40:25,927] [INFO] Task started: Blastn
[2023-06-13 01:40:25,928] [INFO] Running command: blastn -query GCA_022841585.1_ASM2284158v1_genomic.fna/markers.fasta -db /var/lib/cwl/stg4605fae5-84b2-4c1a-994a-342d2e0fd3be/dqc_reference/reference_markers.fasta -out GCA_022841585.1_ASM2284158v1_genomic.fna/blast.markers.tsv -outfmt 6 -max_hsps 1 -num_alignments 5
[2023-06-13 01:40:26,864] [INFO] Task succeeded: Blastn
[2023-06-13 01:40:26,869] [INFO] Selected 34 target genomes.
[2023-06-13 01:40:26,870] [INFO] Target genome list was written to GCA_022841585.1_ASM2284158v1_genomic.fna/target_genomes.txt
[2023-06-13 01:40:26,906] [INFO] Task started: fastANI
[2023-06-13 01:40:26,906] [INFO] Running command: fastANI --query /var/lib/cwl/stg6b3c4b28-8538-4159-a562-f6bab67cfe00/GCA_022841585.1_ASM2284158v1_genomic.fna.gz --refList GCA_022841585.1_ASM2284158v1_genomic.fna/target_genomes.txt --output GCA_022841585.1_ASM2284158v1_genomic.fna/fastani_result.tsv --threads 1
[2023-06-13 01:40:57,370] [INFO] Task succeeded: fastANI
[2023-06-13 01:40:57,370] [INFO] Loading species specific ANI threshold from /var/lib/cwl/stg4605fae5-84b2-4c1a-994a-342d2e0fd3be/dqc_reference/prokaryote_ANI_species_specific_threshold.txt
[2023-06-13 01:40:57,371] [WARNING] Species-specific ANI threshold file not found. Will use the default threshold for all species. [/var/lib/cwl/stg4605fae5-84b2-4c1a-994a-342d2e0fd3be/dqc_reference/prokaryote_ANI_species_specific_threshold.txt]
[2023-06-13 01:40:57,394] [INFO] Found 34 fastANI hits (0 hits with ANI > threshold)
[2023-06-13 01:40:57,395] [INFO] The taxonomy check result is classified as 'below_threshold'.
[2023-06-13 01:40:57,395] [INFO] DFAST Taxonomy check final result
--------------------------------------------------------------------------------
organism_name	strain	accession	taxid	species_taxid	relation_to_type	validated	ani	matched_fragments	total_fragments	ani_threshold	status
Afipia felis	strain=ATCC 53690	GCA_000314735.2	1035	1035	type	True	77.4251	172	978	95	below_threshold
Variibacter gotjawalensis	strain=GJW-30	GCA_002355335.1	1333996	1333996	type	True	77.4117	220	978	95	below_threshold
Afipia felis	strain=NCTC12499	GCA_900445155.1	1035	1035	type	True	77.4086	179	978	95	below_threshold
Rhodopseudomonas rhenobacensis	strain=DSM 12706	GCA_014203125.1	87461	87461	type	True	77.3466	194	978	95	below_threshold
Bradyrhizobium ottawaense	strain=OO99	GCA_002278135.2	931866	931866	type	True	77.312	229	978	95	below_threshold
Bradyrhizobium shewense	strain=ERR11	GCA_900094605.1	1761772	1761772	type	True	77.25	239	978	95	below_threshold
Bradyrhizobium sediminis	strain=S2-20-1	GCA_018736085.1	2840469	2840469	type	True	77.2334	223	978	95	below_threshold
Bradyrhizobium niftali	strain=CNPSo 3448	GCA_004571025.1	2560055	2560055	type	True	77.2238	250	978	95	below_threshold
Blastochloris tepida	strain=GI	GCA_003966715.1	2233851	2233851	type	True	77.2142	163	978	95	below_threshold
Bradyrhizobium lablabi	strain=CCBAU 23086	GCA_001440475.1	722472	722472	suspected-type	True	77.2031	224	978	95	below_threshold
Afipia carboxidovorans	strain=OM5; ATCC 49405	GCA_000021365.1	40137	40137	type	True	77.1489	198	978	95	below_threshold
Afipia carboxidovorans	strain=OM5	GCA_000218565.1	40137	40137	type	True	77.1256	198	978	95	below_threshold
Bradyrhizobium canariense	strain=BTA-1	GCA_019402665.1	255045	255045	suspected-type	True	77.1223	213	978	95	below_threshold
Bradyrhizobium centrolobii	strain=BR 10245	GCA_001641635.1	1505087	1505087	type	True	77.1045	231	978	95	below_threshold
Bradyrhizobium aeschynomenes	strain=83002	GCA_013178945.1	2734909	2734909	type	True	77.1029	205	978	95	below_threshold
Bradyrhizobium betae	strain=CECT 5829	GCA_024806875.1	244734	244734	type	True	77.0778	219	978	95	below_threshold
Bradyrhizobium neotropicale	strain=BR 10247	GCA_001641695.1	1497615	1497615	type	True	77.0297	239	978	95	below_threshold
Bradyrhizobium daqingense	strain=CCBAU 15774	GCA_021044685.1	993502	993502	type	True	76.9998	234	978	95	below_threshold
Ancylobacter dichloromethanicus	strain=VKM B-2484	GCA_018390645.1	518825	518825	type	True	76.9811	166	978	95	below_threshold
Bradyrhizobium vignae	strain=LMG 28791	GCA_004114425.1	1549949	1549949	type	True	76.9236	233	978	95	below_threshold
Chelatococcus caeni	strain=DSM 103737	GCA_014196925.1	1348468	1348468	type	True	76.8789	157	978	95	below_threshold
Methylobacterium terricola	strain=17Sr1-39	GCA_006151805.1	2583531	2583531	type	True	76.8589	150	978	95	below_threshold
Blastochloris viridis	strain=DSM 133	GCA_001548155.2	1079	1079	type	True	76.7914	156	978	95	below_threshold
Blastochloris viridis	strain=ATCC 19567	GCA_001402875.1	1079	1079	type	True	76.7519	159	978	95	below_threshold
Phreatobacter stygius	strain=KCTC 52518	GCA_005144885.1	1940610	1940610	type	True	76.7477	194	978	95	below_threshold
Phreatobacter cathodiphilus	strain=S-12	GCA_003008515.1	1868589	1868589	type	True	76.6751	182	978	95	below_threshold
Chelatococcus composti	strain=DSM 101465	GCA_018398355.1	1743235	1743235	type	True	76.5659	130	978	95	below_threshold
Mesorhizobium atlanticum	strain=CNPSo 3140	GCA_003289965.1	2233532	2233532	type	True	76.551	136	978	95	below_threshold
Pannonibacter phragmitetus	strain=NCTC13350	GCA_900454465.1	121719	121719	suspected-type	True	76.5219	106	978	95	below_threshold
Pannonibacter phragmitetus	strain=DSM 14782	GCA_000382365.1	121719	121719	suspected-type	True	76.5142	104	978	95	below_threshold
Mesorhizobium qingshengii	strain=CGMCC 1.12097	GCA_900103325.1	1165689	1165689	type	True	76.4886	138	978	95	below_threshold
Salinarimonas ramus	strain=CGMCC 1.9161	GCA_014645695.1	690164	690164	type	True	76.4665	134	978	95	below_threshold
Methylobacterium haplocladii	strain=NBRC 107714	GCA_007992175.1	1176176	1176176	type	True	76.4554	121	978	95	below_threshold
Oceanibaculum nanhaiense	strain=L54-1-50	GCA_002148795.1	1909734	1909734	type	True	76.4084	81	978	95	below_threshold
--------------------------------------------------------------------------------
[2023-06-13 01:40:57,397] [INFO] DFAST Taxonomy check result was written to GCA_022841585.1_ASM2284158v1_genomic.fna/tc_result.tsv
[2023-06-13 01:40:57,397] [INFO] ===== Taxonomy check completed =====
[2023-06-13 01:40:57,398] [INFO] ===== Start completeness check using CheckM =====
[2023-06-13 01:40:57,398] [INFO] Setting CHECKM_DATA_PATH to /var/lib/cwl/stg4605fae5-84b2-4c1a-994a-342d2e0fd3be/dqc_reference/checkm_data
[2023-06-13 01:40:57,399] [INFO] Selected 'Prokaryote' markers (life, taxid=0) for CheckM
[2023-06-13 01:40:57,433] [INFO] Task started: CheckM
[2023-06-13 01:40:57,433] [INFO] Running command: checkm taxonomy_wf --tab_table -f GCA_022841585.1_ASM2284158v1_genomic.fna/cc_result.tsv -t 1 life "Prokaryote" GCA_022841585.1_ASM2284158v1_genomic.fna/checkm_input GCA_022841585.1_ASM2284158v1_genomic.fna/checkm_result
[2023-06-13 01:41:26,925] [INFO] Task succeeded: CheckM
[2023-06-13 01:41:26,927] [INFO] Completeness check finished.
--------------------------------------------------------------------------------
Completeness: 100.00%
Contamination: 4.63%
Strain heterogeneity: 100.00%
--------------------------------------------------------------------------------
[2023-06-13 01:41:26,953] [INFO] ===== Completeness check finished =====
[2023-06-13 01:41:26,953] [INFO] ===== Start GTDB Search =====
[2023-06-13 01:41:26,954] [INFO] Query marker FASTA already exists. Will reuse it. (GCA_022841585.1_ASM2284158v1_genomic.fna/markers.fasta)
[2023-06-13 01:41:26,954] [INFO] Task started: Blastn
[2023-06-13 01:41:26,954] [INFO] Running command: blastn -query GCA_022841585.1_ASM2284158v1_genomic.fna/markers.fasta -db /var/lib/cwl/stg4605fae5-84b2-4c1a-994a-342d2e0fd3be/dqc_reference/reference_markers_gtdb.fasta -out GCA_022841585.1_ASM2284158v1_genomic.fna/blast.markers.gtdb.tsv -outfmt 6 -max_hsps 1 -num_alignments 5
[2023-06-13 01:41:28,770] [INFO] Task succeeded: Blastn
[2023-06-13 01:41:28,776] [INFO] Selected 20 target genomes.
[2023-06-13 01:41:28,776] [INFO] Target genome list was written to GCA_022841585.1_ASM2284158v1_genomic.fna/target_genomes_gtdb.txt
[2023-06-13 01:41:28,789] [INFO] Task started: fastANI
[2023-06-13 01:41:28,789] [INFO] Running command: fastANI --query /var/lib/cwl/stg6b3c4b28-8538-4159-a562-f6bab67cfe00/GCA_022841585.1_ASM2284158v1_genomic.fna.gz --refList GCA_022841585.1_ASM2284158v1_genomic.fna/target_genomes_gtdb.txt --output GCA_022841585.1_ASM2284158v1_genomic.fna/fastani_result_gtdb.tsv --threads 1
[2023-06-13 01:41:47,661] [INFO] Task succeeded: fastANI
[2023-06-13 01:41:47,677] [INFO] Found 20 fastANI hits (1 hits with ANI > circumscription radius)
[2023-06-13 01:41:47,678] [INFO] GTDB search result
--------------------------------------------------------------------------------
accession	gtdb_species	ani	matched_fragments	total_fragments	gtdb_taxonomy	ani_circumscription_radius	mean_intra_species_ani	min_intra_species_ani	mean_intra_species_af	min_intra_species_af	num_clustered_genomes	status
GCA_001464215.1	s__Ga0077548 sp001464215	99.7827	791	978	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhizobiales;f__Xanthobacteraceae;g__Ga0077548	95.0	N/A	N/A	N/A	N/A	1	conclusive
GCA_016713235.1	s__Ga0077548 sp016713235	83.18	729	978	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhizobiales;f__Xanthobacteraceae;g__Ga0077548	95.0	N/A	N/A	N/A	N/A	1	-
GCA_018335895.1	s__Ga0077548 sp018335895	82.3088	639	978	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhizobiales;f__Xanthobacteraceae;g__Ga0077548	95.0	N/A	N/A	N/A	N/A	1	-
GCF_001579845.1	s__Z2-YC6860 sp001579845	77.5247	234	978	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhizobiales;f__Xanthobacteraceae;g__Z2-YC6860	95.0	N/A	N/A	N/A	N/A	1	-
GCF_003367395.1	s__Pseudolabrys taiwanensis	77.3162	237	978	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhizobiales;f__Xanthobacteraceae;g__Pseudolabrys	95.0	N/A	N/A	N/A	N/A	1	-
GCF_003966715.1	s__Blastochloris tepida	77.2142	163	978	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhizobiales;f__Xanthobacteraceae;g__Blastochloris	95.0	N/A	N/A	N/A	N/A	1	-
GCA_001464835.1	s__Pseudolabrys sp001464835	77.1858	205	978	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhizobiales;f__Xanthobacteraceae;g__Pseudolabrys	95.0	N/A	N/A	N/A	N/A	1	-
GCA_004297465.1	s__SCUB01 sp004297465	77.1832	196	978	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhizobiales;f__Xanthobacteraceae;g__SCUB01	95.0	N/A	N/A	N/A	N/A	1	-
GCA_016215605.1	s__Rhodopseudomonas palustris_L	77.1432	206	978	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhizobiales;f__Xanthobacteraceae;g__Rhodopseudomonas	95.0	N/A	N/A	N/A	N/A	1	-
GCA_016201085.1	s__Pseudolabrys sp016201085	77.1412	168	978	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhizobiales;f__Xanthobacteraceae;g__Pseudolabrys	95.0	N/A	N/A	N/A	N/A	1	-
GCA_003135415.1	s__Palsa-892 sp003135415	77.1203	262	978	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhizobiales;f__Xanthobacteraceae;g__Palsa-892	95.0	96.11	96.04	0.87	0.83	3	-
GCA_004799445.1	s__Bradyrhizobium sp004799445	77.099	187	978	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhizobiales;f__Xanthobacteraceae;g__Bradyrhizobium	95.0	N/A	N/A	N/A	N/A	1	-
GCF_000473045.1	s__Bradyrhizobium sp000473045	77.0527	232	978	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhizobiales;f__Xanthobacteraceae;g__Bradyrhizobium	95.0	N/A	N/A	N/A	N/A	1	-
GCF_000938305.1	s__Bradyrhizobium sp000938305	77.0259	225	978	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhizobiales;f__Xanthobacteraceae;g__Bradyrhizobium	95.0	95.61	95.61	0.87	0.87	2	-
GCF_018130955.1	s__Bradyrhizobium liaoningense_C	77.0143	223	978	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhizobiales;f__Xanthobacteraceae;g__Bradyrhizobium	95.0	N/A	N/A	N/A	N/A	1	-
GCF_900129675.1	s__Bradyrhizobium erythrophlei_A	77.0107	227	978	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhizobiales;f__Xanthobacteraceae;g__Bradyrhizobium	95.0	N/A	N/A	N/A	N/A	1	-
GCA_903929045.1	s__Pseudolabrys sp903929045	76.9737	196	978	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhizobiales;f__Xanthobacteraceae;g__Pseudolabrys	95.0	99.84	99.84	0.87	0.87	2	-
GCF_014196925.1	s__Chelatococcus_A caeni	76.8792	156	978	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhizobiales;f__Beijerinckiaceae;g__Chelatococcus_A	95.0	98.75	98.72	0.96	0.95	3	-
GCF_005144885.1	s__Phreatobacter stygius	76.7488	194	978	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhizobiales;f__Phreatobacteraceae;g__Phreatobacter	95.0	N/A	N/A	N/A	N/A	1	-
GCF_014199915.1	s__Prosthecomicrobium pneumaticum	76.7238	164	978	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhizobiales;f__Kaistiaceae;g__Prosthecomicrobium	95.0	N/A	N/A	N/A	N/A	1	-
--------------------------------------------------------------------------------
[2023-06-13 01:41:47,681] [INFO] GTDB search result was written to GCA_022841585.1_ASM2284158v1_genomic.fna/result_gtdb.tsv
[2023-06-13 01:41:47,682] [INFO] ===== GTDB Search completed =====
[2023-06-13 01:41:47,687] [INFO] DFAST_QC result json was written to GCA_022841585.1_ASM2284158v1_genomic.fna/dqc_result.json
[2023-06-13 01:41:47,687] [INFO] DFAST_QC completed!
[2023-06-13 01:41:47,687] [INFO] Total running time: 0h1m32s
