[2024-01-25 20:02:05,541] [INFO] DFAST_QC pipeline started.
[2024-01-25 20:02:05,542] [INFO] DFAST_QC version: 0.5.7
[2024-01-25 20:02:05,542] [INFO] DQC Reference Directory: /var/lib/cwl/stga5170bd8-b92a-4987-8c15-8ae1bea9dfde/dqc_reference
[2024-01-25 20:02:06,745] [INFO] ===== Start taxonomy check using ANI =====
[2024-01-25 20:02:06,745] [INFO] Task started: Prodigal
[2024-01-25 20:02:06,745] [INFO] Running command: gunzip -c /var/lib/cwl/stg9298af2d-9dce-49bd-80a9-4b5efa0af7a9/GCF_900101365.1_IMG-taxon_2596583511_annotated_assembly_genomic.fna.gz | prodigal -d GCF_900101365.1_IMG-taxon_2596583511_annotated_assembly_genomic.fna/cds.fna -a GCF_900101365.1_IMG-taxon_2596583511_annotated_assembly_genomic.fna/protein.faa -g 11 -q > /dev/null
[2024-01-25 20:02:14,094] [INFO] Task succeeded: Prodigal
[2024-01-25 20:02:14,095] [INFO] Task started: HMMsearch
[2024-01-25 20:02:14,095] [INFO] Running command: hmmsearch --tblout GCF_900101365.1_IMG-taxon_2596583511_annotated_assembly_genomic.fna/hmmer_result.tsv -E 1E-50 /var/lib/cwl/stga5170bd8-b92a-4987-8c15-8ae1bea9dfde/dqc_reference/reference_markers.hmm GCF_900101365.1_IMG-taxon_2596583511_annotated_assembly_genomic.fna/protein.faa > /dev/null
[2024-01-25 20:02:14,314] [INFO] Task succeeded: HMMsearch
[2024-01-25 20:02:14,315] [INFO] Found 6/6 markers.
[2024-01-25 20:02:14,337] [INFO] Query marker FASTA was written to GCF_900101365.1_IMG-taxon_2596583511_annotated_assembly_genomic.fna/markers.fasta
[2024-01-25 20:02:14,337] [INFO] Task started: Blastn
[2024-01-25 20:02:14,337] [INFO] Running command: blastn -query GCF_900101365.1_IMG-taxon_2596583511_annotated_assembly_genomic.fna/markers.fasta -db /var/lib/cwl/stga5170bd8-b92a-4987-8c15-8ae1bea9dfde/dqc_reference/reference_markers.fasta -out GCF_900101365.1_IMG-taxon_2596583511_annotated_assembly_genomic.fna/blast.markers.tsv -outfmt 6 -max_hsps 1 -num_alignments 5
[2024-01-25 20:02:15,231] [INFO] Task succeeded: Blastn
[2024-01-25 20:02:15,234] [INFO] Selected 21 target genomes.
[2024-01-25 20:02:15,234] [INFO] Target genome list was written to GCF_900101365.1_IMG-taxon_2596583511_annotated_assembly_genomic.fna/target_genomes.txt
[2024-01-25 20:02:15,257] [INFO] Task started: fastANI
[2024-01-25 20:02:15,257] [INFO] Running command: fastANI --query /var/lib/cwl/stg9298af2d-9dce-49bd-80a9-4b5efa0af7a9/GCF_900101365.1_IMG-taxon_2596583511_annotated_assembly_genomic.fna.gz --refList GCF_900101365.1_IMG-taxon_2596583511_annotated_assembly_genomic.fna/target_genomes.txt --output GCF_900101365.1_IMG-taxon_2596583511_annotated_assembly_genomic.fna/fastani_result.tsv --threads 1
[2024-01-25 20:02:28,168] [INFO] Task succeeded: fastANI
[2024-01-25 20:02:28,168] [INFO] Loading species specific ANI threshold from /var/lib/cwl/stga5170bd8-b92a-4987-8c15-8ae1bea9dfde/dqc_reference/prokaryote_ANI_species_specific_threshold.txt
[2024-01-25 20:02:28,168] [WARNING] Species-specific ANI threshold file not found. Will use the default threshold for all species. [/var/lib/cwl/stga5170bd8-b92a-4987-8c15-8ae1bea9dfde/dqc_reference/prokaryote_ANI_species_specific_threshold.txt]
[2024-01-25 20:02:28,180] [INFO] Found 21 fastANI hits (2 hits with ANI > threshold)
[2024-01-25 20:02:28,181] [INFO] The taxonomy check result is classified as 'conclusive'.
[2024-01-25 20:02:28,181] [INFO] DFAST Taxonomy check final result
--------------------------------------------------------------------------------
organism_name	strain	accession	taxid	species_taxid	relation_to_type	validated	ani	matched_fragments	total_fragments	ani_threshold	status
Thiohalorhabdus denitrificans	strain=HL 19	GCA_900101365.1	381306	381306	type	True	100.0	950	951	95	conclusive
Thiohalorhabdus denitrificans	strain=HL 19	GCA_001399755.1	381306	381306	type	True	99.9915	943	951	95	conclusive
Thiohalospira halophila	strain=HL 3	GCA_900112605.1	381300	381300	type	True	78.1339	221	951	95	below_threshold
Thioalbus denitrificans	strain=DSM 26407	GCA_003337735.1	547122	547122	type	True	77.4568	226	951	95	below_threshold
Thermithiobacillus tepidarius	strain=DSM 3134	GCA_000423825.1	929	929	type	True	77.4527	185	951	95	below_threshold
Arhodomonas aquaeolei	strain=DSM 8974	GCA_000374645.1	2369	2369	type	True	77.1045	156	951	95	below_threshold
Halomonas denitrificans	strain=DSM 18045	GCA_003056305.1	370769	370769	type	True	77.0528	168	951	95	below_threshold
Inmirania thermothiophila	strain=DSM 100275	GCA_003751635.1	1750597	1750597	type	True	77.0173	184	951	95	below_threshold
Thiohalobacter thiocyanaticus	strain=Hrh1	GCA_003932505.1	585455	585455	type	True	76.9696	165	951	95	below_threshold
Thioalkalivibrio thiocyanodenitrificans	strain=ARhD 1	GCA_000378965.1	243063	243063	type	True	76.8943	131	951	95	below_threshold
Sulfuritortus calidifontis	strain=DSM 103923	GCA_004346085.1	1914471	1914471	type	True	76.8417	116	951	95	below_threshold
Sulfuritortus calidifontis	strain=J1A	GCA_003967275.1	1914471	1914471	type	True	76.7994	117	951	95	below_threshold
Halomonas ventosae	strain=CECT 5797	GCA_004363555.1	229007	229007	type	True	76.7436	163	951	95	below_threshold
Halomonas aestuarii	strain=Hb3	GCA_001886615.1	1897729	1897729	type	True	76.6836	146	951	95	below_threshold
Halorhodospira halophila	strain=SL1	GCA_000015585.1	1053	1053	suspected-type	True	76.673	109	951	95	below_threshold
Nitrogeniibacter mangrovi	strain=M9-3-2	GCA_010983895.1	2016596	2016596	type	True	76.6589	134	951	95	below_threshold
Thiohalocapsa halophila	strain=DSM 6210	GCA_016583825.1	69359	69359	type	True	76.6215	156	951	95	below_threshold
Thioalkalivibrio versutus	strain=AL 2	GCA_001999325.1	106634	106634	type	True	76.5827	106	951	95	below_threshold
Pseudomonas aromaticivorans	strain=MAP12	GCA_019097855.1	2849492	2849492	type	True	76.1376	123	951	95	below_threshold
Ottowia beijingensis	strain=GCS-AN-3	GCA_013423955.1	1207057	1207057	type	True	75.9611	105	951	95	below_threshold
Mitsuaria chitinivorans	strain=HWN-4	GCA_002761755.1	2917965	2917965	type	True	75.6819	110	951	95	below_threshold
--------------------------------------------------------------------------------
[2024-01-25 20:02:28,182] [INFO] DFAST Taxonomy check result was written to GCF_900101365.1_IMG-taxon_2596583511_annotated_assembly_genomic.fna/tc_result.tsv
[2024-01-25 20:02:28,183] [INFO] ===== Taxonomy check completed =====
[2024-01-25 20:02:28,183] [INFO] ===== Start completeness check using CheckM =====
[2024-01-25 20:02:28,183] [INFO] Setting CHECKM_DATA_PATH to /var/lib/cwl/stga5170bd8-b92a-4987-8c15-8ae1bea9dfde/dqc_reference/checkm_data
[2024-01-25 20:02:28,184] [INFO] Selected 'Prokaryote' markers (life, taxid=0) for CheckM
[2024-01-25 20:02:28,214] [INFO] Task started: CheckM
[2024-01-25 20:02:28,214] [INFO] Running command: checkm taxonomy_wf --tab_table -f GCF_900101365.1_IMG-taxon_2596583511_annotated_assembly_genomic.fna/cc_result.tsv -t 1 life "Prokaryote" GCF_900101365.1_IMG-taxon_2596583511_annotated_assembly_genomic.fna/checkm_input GCF_900101365.1_IMG-taxon_2596583511_annotated_assembly_genomic.fna/checkm_result
[2024-01-25 20:02:52,541] [INFO] Task succeeded: CheckM
[2024-01-25 20:02:52,542] [INFO] Completeness check finished.
--------------------------------------------------------------------------------
Completeness: 100.00%
Contamination: 0.00%
Strain heterogeneity: 0.00%
--------------------------------------------------------------------------------
[2024-01-25 20:02:52,560] [INFO] ===== Completeness check finished =====
[2024-01-25 20:02:52,560] [INFO] ===== Start GTDB Search =====
[2024-01-25 20:02:52,561] [INFO] Query marker FASTA already exists. Will reuse it. (GCF_900101365.1_IMG-taxon_2596583511_annotated_assembly_genomic.fna/markers.fasta)
[2024-01-25 20:02:52,561] [INFO] Task started: Blastn
[2024-01-25 20:02:52,561] [INFO] Running command: blastn -query GCF_900101365.1_IMG-taxon_2596583511_annotated_assembly_genomic.fna/markers.fasta -db /var/lib/cwl/stga5170bd8-b92a-4987-8c15-8ae1bea9dfde/dqc_reference/reference_markers_gtdb.fasta -out GCF_900101365.1_IMG-taxon_2596583511_annotated_assembly_genomic.fna/blast.markers.gtdb.tsv -outfmt 6 -max_hsps 1 -num_alignments 5
[2024-01-25 20:02:54,351] [INFO] Task succeeded: Blastn
[2024-01-25 20:02:54,354] [INFO] Selected 25 target genomes.
[2024-01-25 20:02:54,355] [INFO] Target genome list was written to GCF_900101365.1_IMG-taxon_2596583511_annotated_assembly_genomic.fna/target_genomes_gtdb.txt
[2024-01-25 20:02:54,375] [INFO] Task started: fastANI
[2024-01-25 20:02:54,375] [INFO] Running command: fastANI --query /var/lib/cwl/stg9298af2d-9dce-49bd-80a9-4b5efa0af7a9/GCF_900101365.1_IMG-taxon_2596583511_annotated_assembly_genomic.fna.gz --refList GCF_900101365.1_IMG-taxon_2596583511_annotated_assembly_genomic.fna/target_genomes_gtdb.txt --output GCF_900101365.1_IMG-taxon_2596583511_annotated_assembly_genomic.fna/fastani_result_gtdb.tsv --threads 1
[2024-01-25 20:03:08,759] [INFO] Task succeeded: fastANI
[2024-01-25 20:03:08,772] [INFO] Found 24 fastANI hits (1 hits with ANI > circumscription radius)
[2024-01-25 20:03:08,773] [INFO] GTDB search result
--------------------------------------------------------------------------------
accession	gtdb_species	ani	matched_fragments	total_fragments	gtdb_taxonomy	ani_circumscription_radius	mean_intra_species_ani	min_intra_species_ani	mean_intra_species_af	min_intra_species_af	num_clustered_genomes	status
GCF_001399755.1	s__Thiohalorhabdus denitrificans	99.9915	943	951	d__Bacteria;p__Proteobacteria;c__Gammaproteobacteria;o__Thiohalorhabdales;f__Thiohalorhabdaceae;g__Thiohalorhabdus	95.0	99.99	99.99	0.99	0.99	2	conclusive
GCA_018609925.1	s__JAHEOHG01 sp018609925	79.7508	255	951	d__Bacteria;p__Proteobacteria;c__Gammaproteobacteria;o__Thiohalorhabdales;f__Thiohalorhabdaceae;g__JAHEOHG01	95.0	N/A	N/A	N/A	N/A	1	-
GCF_900112605.1	s__Thiohalospira halophila	78.147	221	951	d__Bacteria;p__Proteobacteria;c__Gammaproteobacteria;o__Thiohalospirales;f__Thiohalospiraceae;g__Thiohalospira	95.0	N/A	N/A	N/A	N/A	1	-
GCF_003337735.1	s__Thioalbus denitrificans	77.4694	225	951	d__Bacteria;p__Proteobacteria;c__Gammaproteobacteria;o__DSM-26407;f__DSM-26407;g__Thioalbus	95.0	N/A	N/A	N/A	N/A	1	-
GCF_000423825.1	s__Thermithiobacillus tepidarius	77.4658	185	951	d__Bacteria;p__Proteobacteria;c__Gammaproteobacteria;o__Acidithiobacillales;f__Thermithiobacillaceae;g__Thermithiobacillus	95.0	N/A	N/A	N/A	N/A	1	-
GCA_015491615.1	s__S140-43 sp015491615	77.1912	140	951	d__Bacteria;p__Proteobacteria;c__Gammaproteobacteria;o__S140-43;f__S140-43;g__S140-43	95.0	99.42	99.32	0.88	0.87	3	-
GCA_003972985.1	s__Thiolapillus sp003972985	77.1771	109	951	d__Bacteria;p__Proteobacteria;c__Gammaproteobacteria;o__Chromatiales;f__Sedimenticolaceae;g__Thiolapillus	95.0	99.52	99.52	0.88	0.88	2	-
GCF_000374645.1	s__Arhodomonas aquaeolei	77.1036	156	951	d__Bacteria;p__Proteobacteria;c__Gammaproteobacteria;o__Nitrococcales;f__Nitrococcaceae;g__Arhodomonas	95.0	N/A	N/A	N/A	N/A	1	-
GCF_003751635.1	s__Inmirania thermothiophila	77.0067	185	951	d__Bacteria;p__Proteobacteria;c__Gammaproteobacteria;o__DSM-100275;f__DSM-100275;g__Inmirania	95.0	N/A	N/A	N/A	N/A	1	-
GCA_015490395.1	s__S141-70 sp015490395	76.8802	135	951	d__Bacteria;p__Proteobacteria;c__Gammaproteobacteria;o__S141-70;f__S141-70;g__S141-70	95.0	N/A	N/A	N/A	N/A	1	-
GCA_015494165.1	s__S144-34 sp015494165	76.8524	157	951	d__Bacteria;p__Proteobacteria;c__Gammaproteobacteria;o__S144-34;f__S144-34;g__S144-34	95.0	N/A	N/A	N/A	N/A	1	-
GCA_011051715.1	s__HyVt-443 sp011051715	76.7524	143	951	d__Bacteria;p__Proteobacteria;c__Gammaproteobacteria;o__Chromatiales;f__Sedimenticolaceae;g__HyVt-443	95.0	N/A	N/A	N/A	N/A	1	-
GCF_004363555.1	s__Halomonas ventosae	76.7436	163	951	d__Bacteria;p__Proteobacteria;c__Gammaproteobacteria;o__Pseudomonadales;f__Halomonadaceae;g__Halomonas	95.0	95.59	95.59	0.89	0.89	2	-
GCA_003695825.1	s__J048 sp003695825	76.6959	112	951	d__Bacteria;p__Proteobacteria;c__Gammaproteobacteria;o__J048;f__J048;g__J048	95.0	N/A	N/A	N/A	N/A	1	-
GCF_004337445.1	s__Parasulfuritortus cantonensis	76.6913	117	951	d__Bacteria;p__Proteobacteria;c__Gammaproteobacteria;o__Burkholderiales;f__Thiobacillaceae;g__Parasulfuritortus	95.0	N/A	N/A	N/A	N/A	1	-
GCF_010983895.1	s__Denitromonas sp010983895	76.6749	133	951	d__Bacteria;p__Proteobacteria;c__Gammaproteobacteria;o__Burkholderiales;f__Rhodocyclaceae;g__Denitromonas	95.0	N/A	N/A	N/A	N/A	1	-
GCF_001999325.1	s__Thioalkalivibrio versutus	76.5827	106	951	d__Bacteria;p__Proteobacteria;c__Gammaproteobacteria;o__Ectothiorhodospirales;f__Thioalkalivibrionaceae;g__Thioalkalivibrio	95.0	97.30	95.68	0.91	0.87	14	-
GCF_002286975.1	s__Halomonas_B sp002286975	76.5799	112	951	d__Bacteria;p__Proteobacteria;c__Gammaproteobacteria;o__Pseudomonadales;f__Halomonadaceae;g__Halomonas_B	95.0	N/A	N/A	N/A	N/A	1	-
GCA_007133445.1	s__Aquisalimonas sp007133445	76.2335	59	951	d__Bacteria;p__Proteobacteria;c__Gammaproteobacteria;o__Nitrococcales;f__Aquisalimonadaceae;g__Aquisalimonas	95.0	N/A	N/A	N/A	N/A	1	-
GCF_012927165.1	s__Zoogloea sp012927165	76.1908	110	951	d__Bacteria;p__Proteobacteria;c__Gammaproteobacteria;o__Burkholderiales;f__Rhodocyclaceae;g__Zoogloea	95.0	N/A	N/A	N/A	N/A	1	-
GCF_000474255.1	s__Pseudomonas_F alcaligenes_A	76.0679	134	951	d__Bacteria;p__Proteobacteria;c__Gammaproteobacteria;o__Pseudomonadales;f__Pseudomonadaceae;g__Pseudomonas_F	95.0	98.08	97.69	0.89	0.89	3	-
GCF_008107625.1	s__Zoogloea oleivorans	75.9558	76	951	d__Bacteria;p__Proteobacteria;c__Gammaproteobacteria;o__Burkholderiales;f__Rhodocyclaceae;g__Zoogloea	95.0	N/A	N/A	N/A	N/A	1	-
GCA_016790365.1	s__Thauera sp016790365	75.8726	66	951	d__Bacteria;p__Proteobacteria;c__Gammaproteobacteria;o__Burkholderiales;f__Rhodocyclaceae;g__Thauera	95.0	N/A	N/A	N/A	N/A	1	-
GCF_002261335.1	s__Bordetella sp002261335	75.7275	107	951	d__Bacteria;p__Proteobacteria;c__Gammaproteobacteria;o__Burkholderiales;f__Burkholderiaceae;g__Bordetella	95.0	98.98	98.98	0.96	0.96	2	-
--------------------------------------------------------------------------------
[2024-01-25 20:03:08,774] [INFO] GTDB search result was written to GCF_900101365.1_IMG-taxon_2596583511_annotated_assembly_genomic.fna/result_gtdb.tsv
[2024-01-25 20:03:08,775] [INFO] ===== GTDB Search completed =====
[2024-01-25 20:03:08,778] [INFO] DFAST_QC result json was written to GCF_900101365.1_IMG-taxon_2596583511_annotated_assembly_genomic.fna/dqc_result.json
[2024-01-25 20:03:08,778] [INFO] DFAST_QC completed!
[2024-01-25 20:03:08,778] [INFO] Total running time: 0h1m3s
