[2024-01-24 14:19:14,945] [INFO] DFAST_QC pipeline started.
[2024-01-24 14:19:14,951] [INFO] DFAST_QC version: 0.5.7
[2024-01-24 14:19:14,951] [INFO] DQC Reference Directory: /var/lib/cwl/stg26563b22-9735-43e9-a8e5-eee5607b7920/dqc_reference
[2024-01-24 14:19:16,106] [INFO] ===== Start taxonomy check using ANI =====
[2024-01-24 14:19:16,107] [INFO] Task started: Prodigal
[2024-01-24 14:19:16,107] [INFO] Running command: gunzip -c /var/lib/cwl/stge20d2c51-7c7a-4435-90da-07fd420b63a0/GCF_000621665.1_ASM62166v1_genomic.fna.gz | prodigal -d GCF_000621665.1_ASM62166v1_genomic.fna/cds.fna -a GCF_000621665.1_ASM62166v1_genomic.fna/protein.faa -g 11 -q > /dev/null
[2024-01-24 14:19:28,828] [INFO] Task succeeded: Prodigal
[2024-01-24 14:19:28,829] [INFO] Task started: HMMsearch
[2024-01-24 14:19:28,829] [INFO] Running command: hmmsearch --tblout GCF_000621665.1_ASM62166v1_genomic.fna/hmmer_result.tsv -E 1E-50 /var/lib/cwl/stg26563b22-9735-43e9-a8e5-eee5607b7920/dqc_reference/reference_markers.hmm GCF_000621665.1_ASM62166v1_genomic.fna/protein.faa > /dev/null
[2024-01-24 14:19:29,042] [INFO] Task succeeded: HMMsearch
[2024-01-24 14:19:29,044] [INFO] Found 6/6 markers.
[2024-01-24 14:19:29,076] [INFO] Query marker FASTA was written to GCF_000621665.1_ASM62166v1_genomic.fna/markers.fasta
[2024-01-24 14:19:29,077] [INFO] Task started: Blastn
[2024-01-24 14:19:29,077] [INFO] Running command: blastn -query GCF_000621665.1_ASM62166v1_genomic.fna/markers.fasta -db /var/lib/cwl/stg26563b22-9735-43e9-a8e5-eee5607b7920/dqc_reference/reference_markers.fasta -out GCF_000621665.1_ASM62166v1_genomic.fna/blast.markers.tsv -outfmt 6 -max_hsps 1 -num_alignments 5
[2024-01-24 14:19:30,001] [INFO] Task succeeded: Blastn
[2024-01-24 14:19:30,004] [INFO] Selected 27 target genomes.
[2024-01-24 14:19:30,005] [INFO] Target genome list was written to GCF_000621665.1_ASM62166v1_genomic.fna/target_genomes.txt
[2024-01-24 14:19:30,020] [INFO] Task started: fastANI
[2024-01-24 14:19:30,021] [INFO] Running command: fastANI --query /var/lib/cwl/stge20d2c51-7c7a-4435-90da-07fd420b63a0/GCF_000621665.1_ASM62166v1_genomic.fna.gz --refList GCF_000621665.1_ASM62166v1_genomic.fna/target_genomes.txt --output GCF_000621665.1_ASM62166v1_genomic.fna/fastani_result.tsv --threads 1
[2024-01-24 14:19:53,399] [INFO] Task succeeded: fastANI
[2024-01-24 14:19:53,400] [INFO] Loading species specific ANI threshold from /var/lib/cwl/stg26563b22-9735-43e9-a8e5-eee5607b7920/dqc_reference/prokaryote_ANI_species_specific_threshold.txt
[2024-01-24 14:19:53,400] [WARNING] Species-specific ANI threshold file not found. Will use the default threshold for all species. [/var/lib/cwl/stg26563b22-9735-43e9-a8e5-eee5607b7920/dqc_reference/prokaryote_ANI_species_specific_threshold.txt]
[2024-01-24 14:19:53,425] [INFO] Found 27 fastANI hits (0 hits with ANI > threshold)
[2024-01-24 14:19:53,425] [INFO] The taxonomy check result is classified as 'below_threshold'.
[2024-01-24 14:19:53,425] [INFO] DFAST Taxonomy check final result
--------------------------------------------------------------------------------
organism_name	strain	accession	taxid	species_taxid	relation_to_type	validated	ani	matched_fragments	total_fragments	ani_threshold	status
Rhizobium taibaishanense	strain=DSM 100021	GCA_014196735.1	887144	887144	type	True	79.5439	667	1354	95	below_threshold
Rhizobium taibaishanense	strain=14971	GCA_001938985.1	887144	887144	type	True	79.49	681	1354	95	below_threshold
Ciceribacter selenitireducens	strain=ATCC BAA-1503	GCA_000518785.1	448181	448181	type	True	78.9886	509	1354	95	below_threshold
Rhizobium rosettiformans	strain=DSM 26376	GCA_014202175.1	1368430	1368430	type	True	78.8354	479	1354	95	below_threshold
Rhizobium rosettiformans	strain=W3	GCA_004912135.1	1368430	1368430	type	True	78.7961	483	1354	95	below_threshold
Agrobacterium leguminum	strain=MOPV5	GCA_015704895.1	2792015	2792015	type	True	78.7178	533	1354	95	below_threshold
Agrobacterium salinitolerans	strain=YIC 5082	GCA_002008225.1	1183413	1183413	type	True	78.6888	500	1354	95	below_threshold
Rhizobium daejeonense	strain=CCBAU10050	GCA_011045155.1	240521	240521	type	True	78.6456	469	1354	95	below_threshold
Shinella zoogloeoides	strain=DSM 287	GCA_009826855.1	352475	352475	type	True	78.6293	442	1354	95	below_threshold
Rhizobium daejeonense	strain=L61	GCA_014280875.1	240521	240521	type	True	78.6183	474	1354	95	below_threshold
Ciceribacter lividus	strain=DSM 25528	GCA_003337715.1	1197950	1197950	type	True	78.595	407	1354	95	below_threshold
Rhizobium glycinendophyticum	strain=CL12	GCA_006443685.1	2589807	2589807	type	True	78.5838	467	1354	95	below_threshold
Ciceribacter ferrooxidans	strain=F8825	GCA_004137355.1	2509717	2509717	type	True	78.4895	414	1354	95	below_threshold
Ciceribacter thiooxidans	strain=F43B	GCA_014126615.1	1969821	1969821	type	True	78.3931	412	1354	95	below_threshold
Rhizobium croatiense	strain=13T	GCA_019793465.1	2867516	2867516	type	True	78.3716	474	1354	95	below_threshold
Rhizobium rhizoryzae	strain=DSM 29514	GCA_011046895.1	451876	451876	type	True	78.3381	384	1354	95	below_threshold
Rhizobium rhizoryzae	strain=DSM 29514	GCA_014196605.1	451876	451876	type	True	78.3155	383	1354	95	below_threshold
Rhizobium phaseoli	strain=ATCC 14482	GCA_003985125.1	396	396	type	True	78.225	470	1354	95	below_threshold
Rhizobium sophoriradicis	strain=CCBAU 03470	GCA_003939025.1	1535245	1535245	type	True	78.2202	475	1354	95	below_threshold
Rhizobium binae	strain=BLR195	GCA_017357225.1	1138190	1138190	type	True	78.2032	466	1354	95	below_threshold
Rhizobium binae	strain=BLR195	GCA_019684455.1	1138190	1138190	type	True	78.1896	471	1354	95	below_threshold
Rhizobium lentis	strain=BLR27	GCA_017352135.1	1138194	1138194	type	True	78.1063	449	1354	95	below_threshold
Rhizobium lentis	strain=BLR27	GCA_019684715.1	1138194	1138194	type	True	78.1052	444	1354	95	below_threshold
Rhizobium petrolearium	strain=DSM 26482	GCA_017873175.1	515361	515361	type	True	78.0743	396	1354	95	below_threshold
Rhizobium etli	strain=CFN 42	GCA_000092045.1	29449	29449	suspected-type	True	78.0632	460	1354	95	below_threshold
Rhizobium skierniewicense	strain=DSM 26438	GCA_014196515.1	984260	984260	type	True	77.8859	347	1354	95	below_threshold
Rhizobium skierniewicense	strain=Ch11	GCA_023757665.1	984260	984260	type	True	77.828	346	1354	95	below_threshold
--------------------------------------------------------------------------------
[2024-01-24 14:19:53,428] [INFO] DFAST Taxonomy check result was written to GCF_000621665.1_ASM62166v1_genomic.fna/tc_result.tsv
[2024-01-24 14:19:53,428] [INFO] ===== Taxonomy check completed =====
[2024-01-24 14:19:53,429] [INFO] ===== Start completeness check using CheckM =====
[2024-01-24 14:19:53,429] [INFO] Setting CHECKM_DATA_PATH to /var/lib/cwl/stg26563b22-9735-43e9-a8e5-eee5607b7920/dqc_reference/checkm_data
[2024-01-24 14:19:53,430] [INFO] Selected 'Prokaryote' markers (life, taxid=0) for CheckM
[2024-01-24 14:19:53,471] [INFO] Task started: CheckM
[2024-01-24 14:19:53,471] [INFO] Running command: checkm taxonomy_wf --tab_table -f GCF_000621665.1_ASM62166v1_genomic.fna/cc_result.tsv -t 1 life "Prokaryote" GCF_000621665.1_ASM62166v1_genomic.fna/checkm_input GCF_000621665.1_ASM62166v1_genomic.fna/checkm_result
[2024-01-24 14:20:36,323] [INFO] Task succeeded: CheckM
[2024-01-24 14:20:36,325] [INFO] Completeness check finished.
--------------------------------------------------------------------------------
Completeness: 100.00%
Contamination: 0.00%
Strain heterogeneity: 0.00%
--------------------------------------------------------------------------------
[2024-01-24 14:20:36,343] [INFO] ===== Completeness check finished =====
[2024-01-24 14:20:36,343] [INFO] ===== Start GTDB Search =====
[2024-01-24 14:20:36,343] [INFO] Query marker FASTA already exists. Will reuse it. (GCF_000621665.1_ASM62166v1_genomic.fna/markers.fasta)
[2024-01-24 14:20:36,344] [INFO] Task started: Blastn
[2024-01-24 14:20:36,344] [INFO] Running command: blastn -query GCF_000621665.1_ASM62166v1_genomic.fna/markers.fasta -db /var/lib/cwl/stg26563b22-9735-43e9-a8e5-eee5607b7920/dqc_reference/reference_markers_gtdb.fasta -out GCF_000621665.1_ASM62166v1_genomic.fna/blast.markers.gtdb.tsv -outfmt 6 -max_hsps 1 -num_alignments 5
[2024-01-24 14:20:38,174] [INFO] Task succeeded: Blastn
[2024-01-24 14:20:38,177] [INFO] Selected 24 target genomes.
[2024-01-24 14:20:38,178] [INFO] Target genome list was written to GCF_000621665.1_ASM62166v1_genomic.fna/target_genomes_gtdb.txt
[2024-01-24 14:20:38,227] [INFO] Task started: fastANI
[2024-01-24 14:20:38,228] [INFO] Running command: fastANI --query /var/lib/cwl/stge20d2c51-7c7a-4435-90da-07fd420b63a0/GCF_000621665.1_ASM62166v1_genomic.fna.gz --refList GCF_000621665.1_ASM62166v1_genomic.fna/target_genomes_gtdb.txt --output GCF_000621665.1_ASM62166v1_genomic.fna/fastani_result_gtdb.tsv --threads 1
[2024-01-24 14:21:02,567] [INFO] Task succeeded: fastANI
[2024-01-24 14:21:02,585] [INFO] Found 24 fastANI hits (1 hits with ANI > circumscription radius)
[2024-01-24 14:21:02,586] [INFO] GTDB search result
--------------------------------------------------------------------------------
accession	gtdb_species	ani	matched_fragments	total_fragments	gtdb_taxonomy	ani_circumscription_radius	mean_intra_species_ani	min_intra_species_ani	mean_intra_species_af	min_intra_species_af	num_clustered_genomes	status
GCF_000621665.1	s__Allorhizobium undicola	100.0	1352	1354	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhizobiales;f__Rhizobiaceae;g__Allorhizobium	95.0	N/A	N/A	N/A	N/A	1	conclusive
GCF_009744115.1	s__Allorhizobium vitis_E	79.7237	697	1354	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhizobiales;f__Rhizobiaceae;g__Allorhizobium	95.0	N/A	N/A	N/A	N/A	1	-
GCF_001758275.2	s__Allorhizobium vitis_B	79.6315	683	1354	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhizobiales;f__Rhizobiaceae;g__Allorhizobium	95.0	98.87	95.34	0.94	0.85	14	-
GCF_001541345.2	s__Allorhizobium vitis	79.5085	666	1354	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhizobiales;f__Rhizobiaceae;g__Allorhizobium	95.0	98.04	96.12	0.92	0.85	14	-
GCF_001938985.1	s__Allorhizobium taibaishanense	79.4951	681	1354	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhizobiales;f__Rhizobiaceae;g__Allorhizobium	95.0	100.00	100.00	1.00	1.00	2	-
GCF_013426735.1	s__Allorhizobium vitis_D	79.48	683	1354	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhizobiales;f__Rhizobiaceae;g__Allorhizobium	95.0	97.49	97.49	0.89	0.89	2	-
GCF_014237835.1	s__Allorhizobium sp014237835	79.0367	467	1354	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhizobiales;f__Rhizobiaceae;g__Allorhizobium	95.0	N/A	N/A	N/A	N/A	1	-
GCF_000300855.1	s__Allorhizobium albertimagni	78.9712	449	1354	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhizobiales;f__Rhizobiaceae;g__Allorhizobium	95.0	N/A	N/A	N/A	N/A	1	-
GCF_002600635.1	s__Allorhizobium sp002600635	78.9363	515	1354	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhizobiales;f__Rhizobiaceae;g__Allorhizobium	95.0	N/A	N/A	N/A	N/A	1	-
GCA_002796995.1	s__Allorhizobium sp002796995	78.8141	536	1354	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhizobiales;f__Rhizobiaceae;g__Allorhizobium	95.0	96.22	96.20	0.94	0.93	7	-
GCF_004912135.1	s__Allorhizobium rosettiformans	78.8095	481	1354	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhizobiales;f__Rhizobiaceae;g__Allorhizobium	95.0	96.40	95.42	0.87	0.84	7	-
GCF_001429245.1	s__Allorhizobium sp001429245	78.7095	473	1354	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhizobiales;f__Rhizobiaceae;g__Allorhizobium	95.0	100.00	100.00	1.00	1.00	2	-
GCF_900156055.1	s__Allorhizobium sp900156055	78.6315	452	1354	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhizobiales;f__Rhizobiaceae;g__Allorhizobium	95.0	N/A	N/A	N/A	N/A	1	-
GCA_900466945.1	s__Allorhizobium sp900466945	78.5703	452	1354	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhizobiales;f__Rhizobiaceae;g__Allorhizobium	95.0	100.00	100.00	1.00	1.00	2	-
GCA_900473805.1	s__Allorhizobium sp900473805	78.5589	478	1354	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhizobiales;f__Rhizobiaceae;g__Allorhizobium	95.0	99.45	97.28	0.98	0.92	6	-
GCF_014196535.1	s__Rhizobium_B borbori	78.5178	455	1354	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhizobiales;f__Rhizobiaceae;g__Rhizobium_B	95.0	98.16	98.16	0.88	0.88	2	-
GCF_004801395.1	s__Allorhizobium terrae	78.4295	531	1354	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhizobiales;f__Rhizobiaceae;g__Allorhizobium	95.0	98.05	98.05	0.96	0.96	2	-
GCF_002211305.1	s__Rhizobium sp002211305	78.4228	472	1354	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhizobiales;f__Rhizobiaceae;g__Rhizobium	95.0	96.97	96.97	0.89	0.89	2	-
GCF_001939045.1	s__Allorhizobium oryziradicis	78.3331	523	1354	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhizobiales;f__Rhizobiaceae;g__Allorhizobium	95.0	N/A	N/A	N/A	N/A	1	-
GCF_002531955.1	s__Rhizobium sp002531955	78.2933	478	1354	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhizobiales;f__Rhizobiaceae;g__Rhizobium	95.0	96.59	95.76	0.89	0.85	5	-
GCF_017599385.1	s__Allorhizobium sp017599385	78.2476	395	1354	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhizobiales;f__Rhizobiaceae;g__Allorhizobium	95.0	97.20	97.20	0.96	0.96	2	-
GCF_003985125.1	s__Rhizobium phaseoli	78.2366	469	1354	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhizobiales;f__Rhizobiaceae;g__Rhizobium	95.0	97.51	96.92	0.91	0.87	41	-
GCF_001662075.1	s__Rhizobium bangladeshense_B	78.1369	471	1354	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhizobiales;f__Rhizobiaceae;g__Rhizobium	95.0	N/A	N/A	N/A	N/A	1	-
GCF_000092045.1	s__Rhizobium etli	78.0643	459	1354	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhizobiales;f__Rhizobiaceae;g__Rhizobium	95.0	98.53	98.53	0.90	0.89	3	-
--------------------------------------------------------------------------------
[2024-01-24 14:21:02,588] [INFO] GTDB search result was written to GCF_000621665.1_ASM62166v1_genomic.fna/result_gtdb.tsv
[2024-01-24 14:21:02,588] [INFO] ===== GTDB Search completed =====
[2024-01-24 14:21:02,593] [INFO] DFAST_QC result json was written to GCF_000621665.1_ASM62166v1_genomic.fna/dqc_result.json
[2024-01-24 14:21:02,593] [INFO] DFAST_QC completed!
[2024-01-24 14:21:02,593] [INFO] Total running time: 0h1m48s
