[2024-01-24 13:13:32,040] [INFO] DFAST_QC pipeline started.
[2024-01-24 13:13:32,042] [INFO] DFAST_QC version: 0.5.7
[2024-01-24 13:13:32,042] [INFO] DQC Reference Directory: /var/lib/cwl/stg8771a36f-c426-43ca-8da5-70b1eb3d3507/dqc_reference
[2024-01-24 13:13:33,476] [INFO] ===== Start taxonomy check using ANI =====
[2024-01-24 13:13:33,477] [INFO] Task started: Prodigal
[2024-01-24 13:13:33,477] [INFO] Running command: gunzip -c /var/lib/cwl/stgb9bfe147-a1a7-4c15-b8dc-009175af8f17/GCF_003065385.1_ASM306538v1_genomic.fna.gz | prodigal -d GCF_003065385.1_ASM306538v1_genomic.fna/cds.fna -a GCF_003065385.1_ASM306538v1_genomic.fna/protein.faa -g 11 -q > /dev/null
[2024-01-24 13:13:39,103] [INFO] Task succeeded: Prodigal
[2024-01-24 13:13:39,103] [INFO] Task started: HMMsearch
[2024-01-24 13:13:39,103] [INFO] Running command: hmmsearch --tblout GCF_003065385.1_ASM306538v1_genomic.fna/hmmer_result.tsv -E 1E-50 /var/lib/cwl/stg8771a36f-c426-43ca-8da5-70b1eb3d3507/dqc_reference/reference_markers.hmm GCF_003065385.1_ASM306538v1_genomic.fna/protein.faa > /dev/null
[2024-01-24 13:13:39,423] [INFO] Task succeeded: HMMsearch
[2024-01-24 13:13:39,424] [INFO] Found 6/6 markers.
[2024-01-24 13:13:39,460] [INFO] Query marker FASTA was written to GCF_003065385.1_ASM306538v1_genomic.fna/markers.fasta
[2024-01-24 13:13:39,460] [INFO] Task started: Blastn
[2024-01-24 13:13:39,461] [INFO] Running command: blastn -query GCF_003065385.1_ASM306538v1_genomic.fna/markers.fasta -db /var/lib/cwl/stg8771a36f-c426-43ca-8da5-70b1eb3d3507/dqc_reference/reference_markers.fasta -out GCF_003065385.1_ASM306538v1_genomic.fna/blast.markers.tsv -outfmt 6 -max_hsps 1 -num_alignments 5
[2024-01-24 13:13:40,230] [INFO] Task succeeded: Blastn
[2024-01-24 13:13:40,234] [INFO] Selected 21 target genomes.
[2024-01-24 13:13:40,235] [INFO] Target genome list was written to GCF_003065385.1_ASM306538v1_genomic.fna/target_genomes.txt
[2024-01-24 13:13:40,252] [INFO] Task started: fastANI
[2024-01-24 13:13:40,252] [INFO] Running command: fastANI --query /var/lib/cwl/stgb9bfe147-a1a7-4c15-b8dc-009175af8f17/GCF_003065385.1_ASM306538v1_genomic.fna.gz --refList GCF_003065385.1_ASM306538v1_genomic.fna/target_genomes.txt --output GCF_003065385.1_ASM306538v1_genomic.fna/fastani_result.tsv --threads 1
[2024-01-24 13:13:48,151] [INFO] Task succeeded: fastANI
[2024-01-24 13:13:48,151] [INFO] Loading species specific ANI threshold from /var/lib/cwl/stg8771a36f-c426-43ca-8da5-70b1eb3d3507/dqc_reference/prokaryote_ANI_species_specific_threshold.txt
[2024-01-24 13:13:48,151] [WARNING] Species-specific ANI threshold file not found. Will use the default threshold for all species. [/var/lib/cwl/stg8771a36f-c426-43ca-8da5-70b1eb3d3507/dqc_reference/prokaryote_ANI_species_specific_threshold.txt]
[2024-01-24 13:13:48,170] [INFO] Found 19 fastANI hits (1 hits with ANI > threshold)
[2024-01-24 13:13:48,171] [INFO] The taxonomy check result is classified as 'conclusive'.
[2024-01-24 13:13:48,171] [INFO] DFAST Taxonomy check final result
--------------------------------------------------------------------------------
organism_name	strain	accession	taxid	species_taxid	relation_to_type	validated	ani	matched_fragments	total_fragments	ani_threshold	status
Polynucleobacter acidiphobus	strain=MWH-PoolGreenA3	GCA_003065385.1	556053	556053	type	True	100.0	617	617	95	conclusive
Polynucleobacter difficilis	strain=AM-8B5	GCA_003065365.1	556054	556054	type	True	77.9375	126	617	95	below_threshold
Polynucleobacter asymbioticus	strain=QLW-P1DMWA-1	GCA_000016345.1	576611	576611	type	True	77.4267	87	617	95	below_threshold
Polynucleobacter duraquae	strain=MWH-MoK4	GCA_000973625.1	1835254	1835254	type	True	77.3963	100	617	95	below_threshold
Polynucleobacter nymphae	strain=AP-Mumm-500A-B3	GCA_018882155.1	2081043	2081043	type	True	77.3467	89	617	95	below_threshold
Polynucleobacter yangtzensis	strain=MWH-JaK3	GCA_001595965.1	1743159	1743159	type	True	77.3163	100	617	95	below_threshold
Polynucleobacter finlandensis	strain=MWH-Mekk-B1	GCA_018881755.1	1855894	1855894	type	True	77.3063	110	617	95	below_threshold
Polynucleobacter paludilacus	strain=MWH-Mekk-C3	GCA_018687595.1	1855895	1855895	type	True	77.1169	99	617	95	below_threshold
Polynucleobacter hirudinilacicola	strain=MWH-EgelM1-30-B4	GCA_002192535.1	1743166	1743166	type	True	77.1066	102	617	95	below_threshold
Polynucleobacter sphagniphilus	strain=MWH-Weng1-1	GCA_001953355.1	1743169	1743169	type	True	77.0811	102	617	95	below_threshold
Polynucleobacter parvulilacunae	strain=Ross1-W9	GCA_018881715.1	1855631	1855631	type	True	77.0641	95	617	95	below_threshold
Polynucleobacter tropicus	strain=MWH-UH21B	GCA_013307225.1	1743174	1743174	type	True	77.0172	94	617	95	below_threshold
Polynucleobacter antarcticus	strain=LimPoW16	GCA_013307245.1	1743162	1743162	type	True	76.9723	93	617	95	below_threshold
Polynucleobacter sinensis	strain=MWH-HuW1	GCA_001595985.1	1743157	1743157	type	True	76.9668	103	617	95	below_threshold
Polynucleobacter corsicus	strain=AP-Melu-1000-A1	GCA_018688255.1	2081042	2081042	type	True	76.9273	96	617	95	below_threshold
Polynucleobacter alcilacus	strain=UK-Pondora-W15	GCA_018881545.1	1819739	1819739	type	True	76.9133	95	617	95	below_threshold
Polynucleobacter hallstattensis	strain=MWH-Hall10	GCA_018881835.1	1855586	1855586	type	True	76.8965	99	617	95	below_threshold
Polynucleobacter brandtiae	strain=UB-Domo-W1	GCA_002797575.1	1938816	1938816	type	True	76.852	88	617	95	below_threshold
Polynucleobacter bastaniensis	strain=AP-Basta-1000A-D1	GCA_018882205.1	2081039	2081039	type	True	76.7339	97	617	95	below_threshold
--------------------------------------------------------------------------------
[2024-01-24 13:13:48,173] [INFO] DFAST Taxonomy check result was written to GCF_003065385.1_ASM306538v1_genomic.fna/tc_result.tsv
[2024-01-24 13:13:48,173] [INFO] ===== Taxonomy check completed =====
[2024-01-24 13:13:48,174] [INFO] ===== Start completeness check using CheckM =====
[2024-01-24 13:13:48,174] [INFO] Setting CHECKM_DATA_PATH to /var/lib/cwl/stg8771a36f-c426-43ca-8da5-70b1eb3d3507/dqc_reference/checkm_data
[2024-01-24 13:13:48,175] [INFO] Selected 'Prokaryote' markers (life, taxid=0) for CheckM
[2024-01-24 13:13:48,200] [INFO] Task started: CheckM
[2024-01-24 13:13:48,200] [INFO] Running command: checkm taxonomy_wf --tab_table -f GCF_003065385.1_ASM306538v1_genomic.fna/cc_result.tsv -t 1 life "Prokaryote" GCF_003065385.1_ASM306538v1_genomic.fna/checkm_input GCF_003065385.1_ASM306538v1_genomic.fna/checkm_result
[2024-01-24 13:14:11,078] [INFO] Task succeeded: CheckM
[2024-01-24 13:14:11,079] [INFO] Completeness check finished.
--------------------------------------------------------------------------------
Completeness: 100.00%
Contamination: 0.00%
Strain heterogeneity: 0.00%
--------------------------------------------------------------------------------
[2024-01-24 13:14:11,101] [INFO] ===== Completeness check finished =====
[2024-01-24 13:14:11,102] [INFO] ===== Start GTDB Search =====
[2024-01-24 13:14:11,102] [INFO] Query marker FASTA already exists. Will reuse it. (GCF_003065385.1_ASM306538v1_genomic.fna/markers.fasta)
[2024-01-24 13:14:11,103] [INFO] Task started: Blastn
[2024-01-24 13:14:11,103] [INFO] Running command: blastn -query GCF_003065385.1_ASM306538v1_genomic.fna/markers.fasta -db /var/lib/cwl/stg8771a36f-c426-43ca-8da5-70b1eb3d3507/dqc_reference/reference_markers_gtdb.fasta -out GCF_003065385.1_ASM306538v1_genomic.fna/blast.markers.gtdb.tsv -outfmt 6 -max_hsps 1 -num_alignments 5
[2024-01-24 13:14:12,265] [INFO] Task succeeded: Blastn
[2024-01-24 13:14:12,269] [INFO] Selected 16 target genomes.
[2024-01-24 13:14:12,269] [INFO] Target genome list was written to GCF_003065385.1_ASM306538v1_genomic.fna/target_genomes_gtdb.txt
[2024-01-24 13:14:12,305] [INFO] Task started: fastANI
[2024-01-24 13:14:12,306] [INFO] Running command: fastANI --query /var/lib/cwl/stgb9bfe147-a1a7-4c15-b8dc-009175af8f17/GCF_003065385.1_ASM306538v1_genomic.fna.gz --refList GCF_003065385.1_ASM306538v1_genomic.fna/target_genomes_gtdb.txt --output GCF_003065385.1_ASM306538v1_genomic.fna/fastani_result_gtdb.tsv --threads 1
[2024-01-24 13:14:18,875] [INFO] Task succeeded: fastANI
[2024-01-24 13:14:18,889] [INFO] Found 16 fastANI hits (1 hits with ANI > circumscription radius)
[2024-01-24 13:14:18,889] [INFO] GTDB search result
--------------------------------------------------------------------------------
accession	gtdb_species	ani	matched_fragments	total_fragments	gtdb_taxonomy	ani_circumscription_radius	mean_intra_species_ani	min_intra_species_ani	mean_intra_species_af	min_intra_species_af	num_clustered_genomes	status
GCF_003065385.1	s__Polynucleobacter acidiphobus	100.0	617	617	d__Bacteria;p__Proteobacteria;c__Gammaproteobacteria;o__Burkholderiales;f__Burkholderiaceae;g__Polynucleobacter	95.0	N/A	N/A	N/A	N/A	1	conclusive
GCA_016867965.1	s__Polynucleobacter sp016867965	86.5712	422	617	d__Bacteria;p__Proteobacteria;c__Gammaproteobacteria;o__Burkholderiales;f__Burkholderiaceae;g__Polynucleobacter	95.0	N/A	N/A	N/A	N/A	1	-
GCF_018687475.1	s__Polynucleobacter sp009928245	85.7928	522	617	d__Bacteria;p__Proteobacteria;c__Gammaproteobacteria;o__Burkholderiales;f__Burkholderiaceae;g__Polynucleobacter	95.0	97.52	97.29	0.97	0.94	13	-
GCA_002359975.1	s__Polynucleobacter sp002359975	78.8225	121	617	d__Bacteria;p__Proteobacteria;c__Gammaproteobacteria;o__Burkholderiales;f__Burkholderiaceae;g__Polynucleobacter	95.0	98.53	97.93	0.70	0.67	7	-
GCA_002292975.1	s__Polynucleobacter sp002292975	78.5545	211	617	d__Bacteria;p__Proteobacteria;c__Gammaproteobacteria;o__Burkholderiales;f__Burkholderiaceae;g__Polynucleobacter	95.0	98.75	98.63	0.92	0.90	4	-
GCA_009924985.1	s__Polynucleobacter sp009924985	78.2933	189	617	d__Bacteria;p__Proteobacteria;c__Gammaproteobacteria;o__Burkholderiales;f__Burkholderiaceae;g__Polynucleobacter	95.0	N/A	N/A	N/A	N/A	1	-
GCF_018687975.1	s__Polynucleobacter sp018687975	77.8871	108	617	d__Bacteria;p__Proteobacteria;c__Gammaproteobacteria;o__Burkholderiales;f__Burkholderiaceae;g__Polynucleobacter	95.0	N/A	N/A	N/A	N/A	1	-
GCF_018687575.1	s__Polynucleobacter asymbioticus_C	77.4509	105	617	d__Bacteria;p__Proteobacteria;c__Gammaproteobacteria;o__Burkholderiales;f__Burkholderiaceae;g__Polynucleobacter	95.0	N/A	N/A	N/A	N/A	1	-
GCF_900095215.1	s__Polynucleobacter necessarius_H	77.3868	75	617	d__Bacteria;p__Proteobacteria;c__Gammaproteobacteria;o__Burkholderiales;f__Burkholderiaceae;g__Polynucleobacter	95.0	N/A	N/A	N/A	N/A	1	-
GCF_018688335.1	s__Polynucleobacter sp018688335	77.3412	88	617	d__Bacteria;p__Proteobacteria;c__Gammaproteobacteria;o__Burkholderiales;f__Burkholderiaceae;g__Polynucleobacter	95.0	95.20	95.20	0.89	0.89	2	-
GCF_018687455.1	s__Polynucleobacter sp001870365	77.2597	100	617	d__Bacteria;p__Proteobacteria;c__Gammaproteobacteria;o__Burkholderiales;f__Burkholderiaceae;g__Polynucleobacter	95.0	96.21	96.15	0.92	0.92	3	-
GCF_018687815.1	s__Polynucleobacter sp018687815	77.1669	83	617	d__Bacteria;p__Proteobacteria;c__Gammaproteobacteria;o__Burkholderiales;f__Burkholderiaceae;g__Polynucleobacter	95.0	N/A	N/A	N/A	N/A	1	-
GCF_002192535.1	s__Polynucleobacter hirudinilacicola	77.1066	102	617	d__Bacteria;p__Proteobacteria;c__Gammaproteobacteria;o__Burkholderiales;f__Burkholderiaceae;g__Polynucleobacter	95.0	N/A	N/A	N/A	N/A	1	-
GCF_002797575.1	s__Polynucleobacter sp002797575	76.83	90	617	d__Bacteria;p__Proteobacteria;c__Gammaproteobacteria;o__Burkholderiales;f__Burkholderiaceae;g__Polynucleobacter	95.0	N/A	N/A	N/A	N/A	1	-
GCA_016462905.1	s__Polynucleobacter sp016462905	76.6905	90	617	d__Bacteria;p__Proteobacteria;c__Gammaproteobacteria;o__Burkholderiales;f__Burkholderiaceae;g__Polynucleobacter	95.0	N/A	N/A	N/A	N/A	1	-
GCA_903843235.1	s__Polynucleobacter sp903843235	76.199	75	617	d__Bacteria;p__Proteobacteria;c__Gammaproteobacteria;o__Burkholderiales;f__Burkholderiaceae;g__Polynucleobacter	95.0	99.64	99.31	0.88	0.84	5	-
--------------------------------------------------------------------------------
[2024-01-24 13:14:18,891] [INFO] GTDB search result was written to GCF_003065385.1_ASM306538v1_genomic.fna/result_gtdb.tsv
[2024-01-24 13:14:18,891] [INFO] ===== GTDB Search completed =====
[2024-01-24 13:14:18,896] [INFO] DFAST_QC result json was written to GCF_003065385.1_ASM306538v1_genomic.fna/dqc_result.json
[2024-01-24 13:14:18,896] [INFO] DFAST_QC completed!
[2024-01-24 13:14:18,896] [INFO] Total running time: 0h0m47s
