[2024-01-24 13:22:02,331] [INFO] DFAST_QC pipeline started.
[2024-01-24 13:22:02,333] [INFO] DFAST_QC version: 0.5.7
[2024-01-24 13:22:02,333] [INFO] DQC Reference Directory: /var/lib/cwl/stgf497cda5-e140-4b13-ba92-05aefb021135/dqc_reference
[2024-01-24 13:22:03,843] [INFO] ===== Start taxonomy check using ANI =====
[2024-01-24 13:22:03,844] [INFO] Task started: Prodigal
[2024-01-24 13:22:03,844] [INFO] Running command: gunzip -c /var/lib/cwl/stg78b346ce-ab56-4651-ae13-fb9fff9d4d5f/GCF_016722925.1_ASM1672292v1_genomic.fna.gz | prodigal -d GCF_016722925.1_ASM1672292v1_genomic.fna/cds.fna -a GCF_016722925.1_ASM1672292v1_genomic.fna/protein.faa -g 11 -q > /dev/null
[2024-01-24 13:22:21,563] [INFO] Task succeeded: Prodigal
[2024-01-24 13:22:21,563] [INFO] Task started: HMMsearch
[2024-01-24 13:22:21,563] [INFO] Running command: hmmsearch --tblout GCF_016722925.1_ASM1672292v1_genomic.fna/hmmer_result.tsv -E 1E-50 /var/lib/cwl/stgf497cda5-e140-4b13-ba92-05aefb021135/dqc_reference/reference_markers.hmm GCF_016722925.1_ASM1672292v1_genomic.fna/protein.faa > /dev/null
[2024-01-24 13:22:21,910] [INFO] Task succeeded: HMMsearch
[2024-01-24 13:22:21,911] [INFO] Found 6/6 markers.
[2024-01-24 13:22:21,957] [INFO] Query marker FASTA was written to GCF_016722925.1_ASM1672292v1_genomic.fna/markers.fasta
[2024-01-24 13:22:21,958] [INFO] Task started: Blastn
[2024-01-24 13:22:21,958] [INFO] Running command: blastn -query GCF_016722925.1_ASM1672292v1_genomic.fna/markers.fasta -db /var/lib/cwl/stgf497cda5-e140-4b13-ba92-05aefb021135/dqc_reference/reference_markers.fasta -out GCF_016722925.1_ASM1672292v1_genomic.fna/blast.markers.tsv -outfmt 6 -max_hsps 1 -num_alignments 5
[2024-01-24 13:22:22,985] [INFO] Task succeeded: Blastn
[2024-01-24 13:22:22,989] [INFO] Selected 34 target genomes.
[2024-01-24 13:22:22,989] [INFO] Target genome list was written to GCF_016722925.1_ASM1672292v1_genomic.fna/target_genomes.txt
[2024-01-24 13:22:23,008] [INFO] Task started: fastANI
[2024-01-24 13:22:23,009] [INFO] Running command: fastANI --query /var/lib/cwl/stg78b346ce-ab56-4651-ae13-fb9fff9d4d5f/GCF_016722925.1_ASM1672292v1_genomic.fna.gz --refList GCF_016722925.1_ASM1672292v1_genomic.fna/target_genomes.txt --output GCF_016722925.1_ASM1672292v1_genomic.fna/fastani_result.tsv --threads 1
[2024-01-24 13:23:05,795] [INFO] Task succeeded: fastANI
[2024-01-24 13:23:05,796] [INFO] Loading species specific ANI threshold from /var/lib/cwl/stgf497cda5-e140-4b13-ba92-05aefb021135/dqc_reference/prokaryote_ANI_species_specific_threshold.txt
[2024-01-24 13:23:05,796] [WARNING] Species-specific ANI threshold file not found. Will use the default threshold for all species. [/var/lib/cwl/stgf497cda5-e140-4b13-ba92-05aefb021135/dqc_reference/prokaryote_ANI_species_specific_threshold.txt]
[2024-01-24 13:23:05,828] [INFO] Found 34 fastANI hits (0 hits with ANI > threshold)
[2024-01-24 13:23:05,828] [INFO] The taxonomy check result is classified as 'below_threshold'.
[2024-01-24 13:23:05,829] [INFO] DFAST Taxonomy check final result
--------------------------------------------------------------------------------
organism_name	strain	accession	taxid	species_taxid	relation_to_type	validated	ani	matched_fragments	total_fragments	ani_threshold	status
Shinella curvata	strain=C3	GCA_022899935.1	1817964	1817964	type	True	78.8435	572	1851	95	below_threshold
Ciceribacter selenitireducens	strain=ATCC BAA-1503	GCA_000518785.1	448181	448181	type	True	78.8235	486	1851	95	below_threshold
Shinella yambaruensis	strain=DSM 18801	GCA_022899355.1	415996	415996	type	True	78.8109	618	1851	95	below_threshold
Shinella fusca	strain=DSM 21319	GCA_014203155.1	544480	544480	type	True	78.8034	513	1851	95	below_threshold
Shinella zoogloeoides	strain=ATCC 19623	GCA_020883495.1	352475	352475	type	True	78.7911	528	1851	95	below_threshold
Shinella pollutisoli	strain=KCTC 52677	GCA_024609765.1	2250594	2250594	type	True	78.7908	493	1851	95	below_threshold
Shinella sumterensis	strain=MEC087	GCA_004514425.2	1967501	1967501	type	True	78.7541	499	1851	95	below_threshold
Rhizobium giardinii	strain=H152	GCA_000379605.1	56731	56731	type	True	78.6578	533	1851	95	below_threshold
Rhizobium sophoriradicis	strain=CCBAU 03470	GCA_003939025.1	1535245	1535245	type	True	78.6561	470	1851	95	below_threshold
Rhizobium vallis	strain=CCBAU 65647	GCA_003985155.1	634290	634290	type	True	78.6091	505	1851	95	below_threshold
Rhizobium pisi	strain=DSM 30132	GCA_003938655.1	574561	574561	type	True	78.5554	492	1851	95	below_threshold
Rhizobium metallidurans	strain=DSM 26575	GCA_014196505.1	1265931	1265931	type	True	78.5312	461	1851	95	below_threshold
Rhizobium wuzhouense	strain=W44	GCA_003205195.1	1986026	1986026	type	True	78.513	428	1851	95	below_threshold
Rhizobium phaseoli	strain=ATCC 14482	GCA_003985125.1	396	396	type	True	78.5054	501	1851	95	below_threshold
Neorhizobium vignae	strain=CCBAU 05176	GCA_000732195.1	690585	690585	type	True	78.4507	469	1851	95	below_threshold
Rhizobium tropici	strain=CIAT 899	GCA_000330885.1	398	398	type	True	78.4266	471	1851	95	below_threshold
Agrobacterium rhizogenes	strain=LMG150	GCA_007002985.1	359	359	type	True	78.4113	467	1851	95	below_threshold
Rhizobium binae	strain=BLR195	GCA_019684455.1	1138190	1138190	type	True	78.3895	480	1851	95	below_threshold
Agrobacterium rhizogenes	strain=NBRC 13257	GCA_000696095.1	359	359	type	True	78.3463	477	1851	95	below_threshold
Rhizobium binae	strain=BLR195	GCA_017357225.1	1138190	1138190	type	True	78.3328	500	1851	95	below_threshold
Rhizobium multihospitium	strain=HAMBI 2975	GCA_900094585.1	410764	410764	type	True	78.2895	503	1851	95	below_threshold
Rhizobium hainanense	strain=CCBAU 57015	GCA_900094555.1	52131	52131	type	True	78.2793	479	1851	95	below_threshold
Rhizobium glycinendophyticum	strain=CL12	GCA_006443685.1	2589807	2589807	type	True	78.2499	391	1851	95	below_threshold
Ensifer mexicanus	strain=ITTG R7	GCA_013488225.1	375549	375549	type	True	78.2427	468	1851	95	below_threshold
Shinella daejeonensis	strain=JCM 16236	GCA_024281235.1	659017	659017	type	True	78.2266	412	1851	95	below_threshold
Rhizobium jaguaris	strain=CCGE525	GCA_003627755.1	1312183	1312183	type	True	78.2249	473	1851	95	below_threshold
Sinorhizobium meliloti	strain=NBRC 14782	GCA_006539625.1	382	382	type	True	78.2039	457	1851	95	below_threshold
Ensifer psoraleae	strain=CCBAU 65732	GCA_013283645.1	520838	520838	type	True	78.1981	468	1851	95	below_threshold
Neorhizobium huautlense	strain=DSM 21817	GCA_002968575.1	67774	67774	type	True	78.1458	348	1851	95	below_threshold
Ensifer mexicanus	strain=DSM 18446	GCA_017873105.1	375549	375549	type	True	78.1402	472	1851	95	below_threshold
Ciceribacter ferrooxidans	strain=F8825	GCA_004137355.1	2509717	2509717	type	True	78.1309	410	1851	95	below_threshold
Neorhizobium alkalisoli	strain=DSM 21826	GCA_002968635.1	528178	528178	type	True	78.1122	402	1851	95	below_threshold
Rhizobium altiplani	strain=BR 10423	GCA_001542405.1	1864509	1864509	type	True	78.0572	411	1851	95	below_threshold
Rhizobium mongolense subsp. loessense	strain=CGMCC 1.3401	GCA_900099775.1	158890	57676	type	True	77.9404	421	1851	95	below_threshold
--------------------------------------------------------------------------------
[2024-01-24 13:23:05,830] [INFO] DFAST Taxonomy check result was written to GCF_016722925.1_ASM1672292v1_genomic.fna/tc_result.tsv
[2024-01-24 13:23:05,831] [INFO] ===== Taxonomy check completed =====
[2024-01-24 13:23:05,831] [INFO] ===== Start completeness check using CheckM =====
[2024-01-24 13:23:05,831] [INFO] Setting CHECKM_DATA_PATH to /var/lib/cwl/stgf497cda5-e140-4b13-ba92-05aefb021135/dqc_reference/checkm_data
[2024-01-24 13:23:05,832] [INFO] Selected 'Prokaryote' markers (life, taxid=0) for CheckM
[2024-01-24 13:23:05,883] [INFO] Task started: CheckM
[2024-01-24 13:23:05,884] [INFO] Running command: checkm taxonomy_wf --tab_table -f GCF_016722925.1_ASM1672292v1_genomic.fna/cc_result.tsv -t 1 life "Prokaryote" GCF_016722925.1_ASM1672292v1_genomic.fna/checkm_input GCF_016722925.1_ASM1672292v1_genomic.fna/checkm_result
[2024-01-24 13:23:56,412] [INFO] Task succeeded: CheckM
[2024-01-24 13:23:56,413] [INFO] Completeness check finished.
--------------------------------------------------------------------------------
Completeness: 100.00%
Contamination: 0.00%
Strain heterogeneity: 0.00%
--------------------------------------------------------------------------------
[2024-01-24 13:23:56,434] [INFO] ===== Completeness check finished =====
[2024-01-24 13:23:56,434] [INFO] ===== Start GTDB Search =====
[2024-01-24 13:23:56,435] [INFO] Query marker FASTA already exists. Will reuse it. (GCF_016722925.1_ASM1672292v1_genomic.fna/markers.fasta)
[2024-01-24 13:23:56,436] [INFO] Task started: Blastn
[2024-01-24 13:23:56,436] [INFO] Running command: blastn -query GCF_016722925.1_ASM1672292v1_genomic.fna/markers.fasta -db /var/lib/cwl/stgf497cda5-e140-4b13-ba92-05aefb021135/dqc_reference/reference_markers_gtdb.fasta -out GCF_016722925.1_ASM1672292v1_genomic.fna/blast.markers.gtdb.tsv -outfmt 6 -max_hsps 1 -num_alignments 5
[2024-01-24 13:23:58,271] [INFO] Task succeeded: Blastn
[2024-01-24 13:23:58,276] [INFO] Selected 18 target genomes.
[2024-01-24 13:23:58,276] [INFO] Target genome list was written to GCF_016722925.1_ASM1672292v1_genomic.fna/target_genomes_gtdb.txt
[2024-01-24 13:23:58,341] [INFO] Task started: fastANI
[2024-01-24 13:23:58,341] [INFO] Running command: fastANI --query /var/lib/cwl/stg78b346ce-ab56-4651-ae13-fb9fff9d4d5f/GCF_016722925.1_ASM1672292v1_genomic.fna.gz --refList GCF_016722925.1_ASM1672292v1_genomic.fna/target_genomes_gtdb.txt --output GCF_016722925.1_ASM1672292v1_genomic.fna/fastani_result_gtdb.tsv --threads 1
[2024-01-24 13:24:17,517] [INFO] Task succeeded: fastANI
[2024-01-24 13:24:17,537] [INFO] Found 18 fastANI hits (1 hits with ANI > circumscription radius)
[2024-01-24 13:24:17,538] [INFO] GTDB search result
--------------------------------------------------------------------------------
accession	gtdb_species	ani	matched_fragments	total_fragments	gtdb_taxonomy	ani_circumscription_radius	mean_intra_species_ani	min_intra_species_ani	mean_intra_species_af	min_intra_species_af	num_clustered_genomes	status
GCF_016722925.1	s__FKL33 sp016722925	100.0	1846	1851	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhizobiales;f__Rhizobiaceae;g__FKL33	95.0	N/A	N/A	N/A	N/A	1	conclusive
GCF_014158035.1	s__FKL33 sp014158035	79.5866	705	1851	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhizobiales;f__Rhizobiaceae;g__FKL33	95.0	N/A	N/A	N/A	N/A	1	-
GCF_005222915.1	s__FKL33 sp005222915	79.5829	722	1851	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhizobiales;f__Rhizobiaceae;g__FKL33	95.0	N/A	N/A	N/A	N/A	1	-
GCF_014203155.1	s__Shinella fusca	78.8034	513	1851	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhizobiales;f__Rhizobiaceae;g__Shinella	95.0	N/A	N/A	N/A	N/A	1	-
GCF_900155885.1	s__RU20A sp900155885	78.6995	439	1851	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhizobiales;f__Rhizobiaceae;g__RU20A	95.0	N/A	N/A	N/A	N/A	1	-
GCF_014195095.1	s__Rhizobium sp014195095	78.5824	518	1851	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhizobiales;f__Rhizobiaceae;g__Rhizobium	95.0	N/A	N/A	N/A	N/A	1	-
GCF_001651875.1	s__Sinorhizobium saheli	78.5726	495	1851	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhizobiales;f__Rhizobiaceae;g__Sinorhizobium	95.0	99.97	99.97	0.96	0.96	2	-
GCA_900466575.1	s__Ensifer_A sp900466575	78.5301	413	1851	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhizobiales;f__Rhizobiaceae;g__Ensifer_A	95.0	100.00	100.00	1.00	1.00	2	-
GCF_900215285.1	s__Ensifer_A adhaerens_A	78.5259	405	1851	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhizobiales;f__Rhizobiaceae;g__Ensifer_A	95.0	N/A	N/A	N/A	N/A	1	-
GCF_014193025.1	s__Rhizobium sp014193025	78.4655	530	1851	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhizobiales;f__Rhizobiaceae;g__Rhizobium	95.0	N/A	N/A	N/A	N/A	1	-
GCF_902706285.1	s__Rhizobium sp902706285	78.4444	476	1851	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhizobiales;f__Rhizobiaceae;g__Rhizobium	95.0	N/A	N/A	N/A	N/A	1	-
GCF_003627795.1	s__Rhizobium sp003627795	78.3786	482	1851	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhizobiales;f__Rhizobiaceae;g__Rhizobium	95.0	99.80	99.80	0.98	0.98	2	-
GCF_001296045.1	s__Allorhizobium sp001296045	78.2679	372	1851	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhizobiales;f__Rhizobiaceae;g__Allorhizobium	95.0	N/A	N/A	N/A	N/A	1	-
GCA_002968845.1	s__Neorhizobium tomejilense	78.2436	479	1851	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhizobiales;f__Rhizobiaceae;g__Neorhizobium	95.0	97.02	95.22	0.90	0.84	9	-
GCF_013283645.1	s__Sinorhizobium psoraleae	78.1801	471	1851	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhizobiales;f__Rhizobiaceae;g__Sinorhizobium	95.0	N/A	N/A	N/A	N/A	1	-
GCA_900469175.1	s__Rhizobium sp900469175	78.1473	471	1851	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhizobiales;f__Rhizobiaceae;g__Rhizobium	95.0	98.45	95.56	0.94	0.85	48	-
GCF_002968575.1	s__Neorhizobium huautlense	78.1384	348	1851	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhizobiales;f__Rhizobiaceae;g__Neorhizobium	95.0	98.14	97.99	0.92	0.90	5	-
GCF_002968635.1	s__Neorhizobium alkalisoli	78.079	405	1851	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhizobiales;f__Rhizobiaceae;g__Neorhizobium	95.0	N/A	N/A	N/A	N/A	1	-
--------------------------------------------------------------------------------
[2024-01-24 13:24:17,539] [INFO] GTDB search result was written to GCF_016722925.1_ASM1672292v1_genomic.fna/result_gtdb.tsv
[2024-01-24 13:24:17,539] [INFO] ===== GTDB Search completed =====
[2024-01-24 13:24:17,544] [INFO] DFAST_QC result json was written to GCF_016722925.1_ASM1672292v1_genomic.fna/dqc_result.json
[2024-01-24 13:24:17,544] [INFO] DFAST_QC completed!
[2024-01-24 13:24:17,544] [INFO] Total running time: 0h2m15s
