[2024-01-24 14:39:00,557] [INFO] DFAST_QC pipeline started.
[2024-01-24 14:39:00,559] [INFO] DFAST_QC version: 0.5.7
[2024-01-24 14:39:00,560] [INFO] DQC Reference Directory: /var/lib/cwl/stgfc1d05f0-64d8-40de-89c0-d7f7300c9ee5/dqc_reference
[2024-01-24 14:39:01,876] [INFO] ===== Start taxonomy check using ANI =====
[2024-01-24 14:39:01,877] [INFO] Task started: Prodigal
[2024-01-24 14:39:01,877] [INFO] Running command: gunzip -c /var/lib/cwl/stg843edd70-0cff-4393-8306-36eead5af112/GCF_004358105.1_ASM435810v1_genomic.fna.gz | prodigal -d GCF_004358105.1_ASM435810v1_genomic.fna/cds.fna -a GCF_004358105.1_ASM435810v1_genomic.fna/protein.faa -g 11 -q > /dev/null
[2024-01-24 14:39:15,212] [INFO] Task succeeded: Prodigal
[2024-01-24 14:39:15,212] [INFO] Task started: HMMsearch
[2024-01-24 14:39:15,212] [INFO] Running command: hmmsearch --tblout GCF_004358105.1_ASM435810v1_genomic.fna/hmmer_result.tsv -E 1E-50 /var/lib/cwl/stgfc1d05f0-64d8-40de-89c0-d7f7300c9ee5/dqc_reference/reference_markers.hmm GCF_004358105.1_ASM435810v1_genomic.fna/protein.faa > /dev/null
[2024-01-24 14:39:15,577] [INFO] Task succeeded: HMMsearch
[2024-01-24 14:39:15,578] [INFO] Found 6/6 markers.
[2024-01-24 14:39:15,621] [INFO] Query marker FASTA was written to GCF_004358105.1_ASM435810v1_genomic.fna/markers.fasta
[2024-01-24 14:39:15,622] [INFO] Task started: Blastn
[2024-01-24 14:39:15,622] [INFO] Running command: blastn -query GCF_004358105.1_ASM435810v1_genomic.fna/markers.fasta -db /var/lib/cwl/stgfc1d05f0-64d8-40de-89c0-d7f7300c9ee5/dqc_reference/reference_markers.fasta -out GCF_004358105.1_ASM435810v1_genomic.fna/blast.markers.tsv -outfmt 6 -max_hsps 1 -num_alignments 5
[2024-01-24 14:39:16,343] [INFO] Task succeeded: Blastn
[2024-01-24 14:39:16,347] [INFO] Selected 24 target genomes.
[2024-01-24 14:39:16,347] [INFO] Target genome list was written to GCF_004358105.1_ASM435810v1_genomic.fna/target_genomes.txt
[2024-01-24 14:39:16,372] [INFO] Task started: fastANI
[2024-01-24 14:39:16,372] [INFO] Running command: fastANI --query /var/lib/cwl/stg843edd70-0cff-4393-8306-36eead5af112/GCF_004358105.1_ASM435810v1_genomic.fna.gz --refList GCF_004358105.1_ASM435810v1_genomic.fna/target_genomes.txt --output GCF_004358105.1_ASM435810v1_genomic.fna/fastani_result.tsv --threads 1
[2024-01-24 14:39:36,721] [INFO] Task succeeded: fastANI
[2024-01-24 14:39:36,722] [INFO] Loading species specific ANI threshold from /var/lib/cwl/stgfc1d05f0-64d8-40de-89c0-d7f7300c9ee5/dqc_reference/prokaryote_ANI_species_specific_threshold.txt
[2024-01-24 14:39:36,722] [WARNING] Species-specific ANI threshold file not found. Will use the default threshold for all species. [/var/lib/cwl/stgfc1d05f0-64d8-40de-89c0-d7f7300c9ee5/dqc_reference/prokaryote_ANI_species_specific_threshold.txt]
[2024-01-24 14:39:36,738] [INFO] Found 20 fastANI hits (1 hits with ANI > threshold)
[2024-01-24 14:39:36,738] [INFO] The taxonomy check result is classified as 'conclusive'.
[2024-01-24 14:39:36,738] [INFO] DFAST Taxonomy check final result
--------------------------------------------------------------------------------
organism_name	strain	accession	taxid	species_taxid	relation_to_type	validated	ani	matched_fragments	total_fragments	ani_threshold	status
Sapientia aquatica	strain=SA-152	GCA_004358105.1	1549640	1549640	type	True	100.0	1450	1452	95	conclusive
Solimicrobium silvestre	strain=S20-91	GCA_002976435.1	2099400	2099400	type	True	78.9071	401	1452	95	below_threshold
Undibacterium fentianense	strain=FT137W	GCA_018139545.1	2828728	2828728	type	True	78.1473	120	1452	95	below_threshold
Undibacterium seohonense	strain=KACC 16656	GCA_014284305.1	1344950	1344950	type	True	77.7938	143	1452	95	below_threshold
Herminiimonas fonticola	strain=DSM 18555	GCA_004361795.1	303380	303380	type	True	77.6441	110	1452	95	below_threshold
Herminiimonas fonticola	strain=S-94	GCA_003293745.1	303380	303380	type	True	77.634	109	1452	95	below_threshold
Undibacterium baiyunense	strain=BYS107W	GCA_018139645.1	2828731	2828731	type	True	77.6291	142	1452	95	below_threshold
Undibacterium pigrum	strain=DSM 19792	GCA_003201815.1	401470	401470	type	True	77.516	118	1452	95	below_threshold
Undibacterium terreum	strain=CGMCC 1.10998	GCA_014636315.1	1224302	1224302	type	True	77.4956	131	1452	95	below_threshold
Undibacterium umbellatum	strain=NL8W	GCA_014284125.1	2762300	2762300	type	True	77.4696	115	1452	95	below_threshold
Undibacterium hunanense	strain=CY18W	GCA_014284335.1	2762292	2762292	type	True	77.2998	129	1452	95	below_threshold
Undibacterium rivi	strain=FT147W	GCA_018139565.1	2828729	2828729	type	True	77.2489	147	1452	95	below_threshold
Undibacterium aquatile	strain=CCTCC AB 2015119	GCA_014699135.1	1537398	1537398	type	True	77.2423	148	1452	95	below_threshold
Janthinobacterium lividum	strain=H-24	GCA_001758635.1	29581	29581	suspected-type	True	77.0063	83	1452	95	below_threshold
Collimonas arenae	strain=Ter10	GCA_001584165.1	279058	279058	type	True	76.9582	99	1452	95	below_threshold
Janthinobacterium lividum	strain=ATCC 12473	GCA_020858175.1	29581	29581	suspected-type	True	76.9021	84	1452	95	below_threshold
Glaciimonas soli	strain=GS1	GCA_009497155.1	2590999	2590999	type	True	76.8163	125	1452	95	below_threshold
Herbaspirillum autotrophicum	strain=IAM 14942	GCA_001189915.1	180195	180195	type	True	76.5535	104	1452	95	below_threshold
Rugamonas apoptosis	strain=LX47W	GCA_014042355.1	2758570	2758570	type	True	76.4232	74	1452	95	below_threshold
Duganella phyllosphaerae	strain=T54	GCA_001758785.1	762836	762836	type	True	76.1913	84	1452	95	below_threshold
--------------------------------------------------------------------------------
[2024-01-24 14:39:36,740] [INFO] DFAST Taxonomy check result was written to GCF_004358105.1_ASM435810v1_genomic.fna/tc_result.tsv
[2024-01-24 14:39:36,741] [INFO] ===== Taxonomy check completed =====
[2024-01-24 14:39:36,741] [INFO] ===== Start completeness check using CheckM =====
[2024-01-24 14:39:36,741] [INFO] Setting CHECKM_DATA_PATH to /var/lib/cwl/stgfc1d05f0-64d8-40de-89c0-d7f7300c9ee5/dqc_reference/checkm_data
[2024-01-24 14:39:36,742] [INFO] Selected 'Prokaryote' markers (life, taxid=0) for CheckM
[2024-01-24 14:39:36,787] [INFO] Task started: CheckM
[2024-01-24 14:39:36,787] [INFO] Running command: checkm taxonomy_wf --tab_table -f GCF_004358105.1_ASM435810v1_genomic.fna/cc_result.tsv -t 1 life "Prokaryote" GCF_004358105.1_ASM435810v1_genomic.fna/checkm_input GCF_004358105.1_ASM435810v1_genomic.fna/checkm_result
[2024-01-24 14:40:20,331] [INFO] Task succeeded: CheckM
[2024-01-24 14:40:20,333] [INFO] Completeness check finished.
--------------------------------------------------------------------------------
Completeness: 100.00%
Contamination: 0.00%
Strain heterogeneity: 0.00%
--------------------------------------------------------------------------------
[2024-01-24 14:40:20,354] [INFO] ===== Completeness check finished =====
[2024-01-24 14:40:20,355] [INFO] ===== Start GTDB Search =====
[2024-01-24 14:40:20,355] [INFO] Query marker FASTA already exists. Will reuse it. (GCF_004358105.1_ASM435810v1_genomic.fna/markers.fasta)
[2024-01-24 14:40:20,356] [INFO] Task started: Blastn
[2024-01-24 14:40:20,356] [INFO] Running command: blastn -query GCF_004358105.1_ASM435810v1_genomic.fna/markers.fasta -db /var/lib/cwl/stgfc1d05f0-64d8-40de-89c0-d7f7300c9ee5/dqc_reference/reference_markers_gtdb.fasta -out GCF_004358105.1_ASM435810v1_genomic.fna/blast.markers.gtdb.tsv -outfmt 6 -max_hsps 1 -num_alignments 5
[2024-01-24 14:40:21,500] [INFO] Task succeeded: Blastn
[2024-01-24 14:40:21,504] [INFO] Selected 18 target genomes.
[2024-01-24 14:40:21,505] [INFO] Target genome list was written to GCF_004358105.1_ASM435810v1_genomic.fna/target_genomes_gtdb.txt
[2024-01-24 14:40:21,518] [INFO] Task started: fastANI
[2024-01-24 14:40:21,519] [INFO] Running command: fastANI --query /var/lib/cwl/stg843edd70-0cff-4393-8306-36eead5af112/GCF_004358105.1_ASM435810v1_genomic.fna.gz --refList GCF_004358105.1_ASM435810v1_genomic.fna/target_genomes_gtdb.txt --output GCF_004358105.1_ASM435810v1_genomic.fna/fastani_result_gtdb.tsv --threads 1
[2024-01-24 14:40:35,619] [INFO] Task succeeded: fastANI
[2024-01-24 14:40:35,634] [INFO] Found 17 fastANI hits (1 hits with ANI > circumscription radius)
[2024-01-24 14:40:35,634] [INFO] GTDB search result
--------------------------------------------------------------------------------
accession	gtdb_species	ani	matched_fragments	total_fragments	gtdb_taxonomy	ani_circumscription_radius	mean_intra_species_ani	min_intra_species_ani	mean_intra_species_af	min_intra_species_af	num_clustered_genomes	status
GCF_004358105.1	s__Solimicrobium aquaticum	100.0	1450	1452	d__Bacteria;p__Proteobacteria;c__Gammaproteobacteria;o__Burkholderiales;f__Burkholderiaceae;g__Solimicrobium	95.0	N/A	N/A	N/A	N/A	1	conclusive
GCF_002976435.1	s__Solimicrobium silvestre	78.9071	401	1452	d__Bacteria;p__Proteobacteria;c__Gammaproteobacteria;o__Burkholderiales;f__Burkholderiaceae;g__Solimicrobium	95.0	N/A	N/A	N/A	N/A	1	-
GCA_903862415.1	s__Solimicrobium sp903862415	78.3419	308	1452	d__Bacteria;p__Proteobacteria;c__Gammaproteobacteria;o__Burkholderiales;f__Burkholderiaceae;g__Solimicrobium	95.0	99.96	99.96	0.98	0.98	2	-
GCF_014284275.1	s__Undibacterium amnicola	77.8024	149	1452	d__Bacteria;p__Proteobacteria;c__Gammaproteobacteria;o__Burkholderiales;f__Burkholderiaceae;g__Undibacterium	95.0	N/A	N/A	N/A	N/A	1	-
GCF_014284305.1	s__Undibacterium seohonense	77.7248	147	1452	d__Bacteria;p__Proteobacteria;c__Gammaproteobacteria;o__Burkholderiales;f__Burkholderiaceae;g__Undibacterium	95.0	N/A	N/A	N/A	N/A	1	-
GCF_003293745.1	s__Herminiimonas fonticola	77.6066	111	1452	d__Bacteria;p__Proteobacteria;c__Gammaproteobacteria;o__Burkholderiales;f__Burkholderiaceae;g__Herminiimonas	95.0	100.00	100.00	1.00	1.00	2	-
GCF_018139645.1	s__Undibacterium sp018139645	77.5842	145	1452	d__Bacteria;p__Proteobacteria;c__Gammaproteobacteria;o__Burkholderiales;f__Burkholderiaceae;g__Undibacterium	95.0	N/A	N/A	N/A	N/A	1	-
GCF_015645465.1	s__Herminiimonas contaminans	77.2939	105	1452	d__Bacteria;p__Proteobacteria;c__Gammaproteobacteria;o__Burkholderiales;f__Burkholderiaceae;g__Herminiimonas	95.0	97.73	97.73	0.94	0.94	3	-
GCF_014284335.1	s__Undibacterium sp014284335	77.2843	131	1452	d__Bacteria;p__Proteobacteria;c__Gammaproteobacteria;o__Burkholderiales;f__Burkholderiaceae;g__Undibacterium	95.0	N/A	N/A	N/A	N/A	1	-
GCF_018139565.1	s__Undibacterium sp018139565	77.2492	148	1452	d__Bacteria;p__Proteobacteria;c__Gammaproteobacteria;o__Burkholderiales;f__Burkholderiaceae;g__Undibacterium	95.0	N/A	N/A	N/A	N/A	1	-
GCF_014699135.1	s__Undibacterium aquatile	77.2441	149	1452	d__Bacteria;p__Proteobacteria;c__Gammaproteobacteria;o__Burkholderiales;f__Burkholderiaceae;g__Undibacterium	95.0	96.61	96.29	0.93	0.93	4	-
GCA_016200845.1	s__Undibacterium sp016200845	77.2121	107	1452	d__Bacteria;p__Proteobacteria;c__Gammaproteobacteria;o__Burkholderiales;f__Burkholderiaceae;g__Undibacterium	95.0	N/A	N/A	N/A	N/A	1	-
GCA_001464355.1	s__Herminiimonas sp001464355	77.0805	119	1452	d__Bacteria;p__Proteobacteria;c__Gammaproteobacteria;o__Burkholderiales;f__Burkholderiaceae;g__Herminiimonas	95.0	N/A	N/A	N/A	N/A	1	-
GCF_001584165.1	s__Collimonas arenae	76.9406	101	1452	d__Bacteria;p__Proteobacteria;c__Gammaproteobacteria;o__Burkholderiales;f__Burkholderiaceae;g__Collimonas	95.0	99.97	99.97	1.00	1.00	2	-
GCF_009497155.1	s__Glaciimonas soli	76.8163	125	1452	d__Bacteria;p__Proteobacteria;c__Gammaproteobacteria;o__Burkholderiales;f__Burkholderiaceae;g__Glaciimonas	95.0	N/A	N/A	N/A	N/A	1	-
GCF_002564105.1	s__Collimonas sp002564105	76.6633	97	1452	d__Bacteria;p__Proteobacteria;c__Gammaproteobacteria;o__Burkholderiales;f__Burkholderiaceae;g__Collimonas	95.0	N/A	N/A	N/A	N/A	1	-
GCF_001189915.1	s__Herbaspirillum autotrophicum	76.5361	105	1452	d__Bacteria;p__Proteobacteria;c__Gammaproteobacteria;o__Burkholderiales;f__Burkholderiaceae;g__Herbaspirillum	95.0	N/A	N/A	N/A	N/A	1	-
--------------------------------------------------------------------------------
[2024-01-24 14:40:35,636] [INFO] GTDB search result was written to GCF_004358105.1_ASM435810v1_genomic.fna/result_gtdb.tsv
[2024-01-24 14:40:35,636] [INFO] ===== GTDB Search completed =====
[2024-01-24 14:40:35,641] [INFO] DFAST_QC result json was written to GCF_004358105.1_ASM435810v1_genomic.fna/dqc_result.json
[2024-01-24 14:40:35,641] [INFO] DFAST_QC completed!
[2024-01-24 14:40:35,641] [INFO] Total running time: 0h1m35s
