[2023-06-13 04:13:07,835] [INFO] DFAST_QC pipeline started.
[2023-06-13 04:13:07,837] [INFO] DFAST_QC version: 0.5.7
[2023-06-13 04:13:07,837] [INFO] DQC Reference Directory: /var/lib/cwl/stgfcb20327-c1ee-4788-b9bd-ed3e90d1e878/dqc_reference
[2023-06-13 04:13:09,160] [INFO] ===== Start taxonomy check using ANI =====
[2023-06-13 04:13:09,161] [INFO] Task started: Prodigal
[2023-06-13 04:13:09,161] [INFO] Running command: gunzip -c /var/lib/cwl/stgc7d94cd8-d518-4dd4-83d6-9b5697688be4/GCA_947458265.1_RE-18aug17-172_genomic.fna.gz | prodigal -d GCA_947458265.1_RE-18aug17-172_genomic.fna/cds.fna -a GCA_947458265.1_RE-18aug17-172_genomic.fna/protein.faa -g 11 -q > /dev/null
[2023-06-13 04:13:21,913] [INFO] Task succeeded: Prodigal
[2023-06-13 04:13:21,914] [INFO] Task started: HMMsearch
[2023-06-13 04:13:21,914] [INFO] Running command: hmmsearch --tblout GCA_947458265.1_RE-18aug17-172_genomic.fna/hmmer_result.tsv -E 1E-50 /var/lib/cwl/stgfcb20327-c1ee-4788-b9bd-ed3e90d1e878/dqc_reference/reference_markers.hmm GCA_947458265.1_RE-18aug17-172_genomic.fna/protein.faa > /dev/null
[2023-06-13 04:13:22,230] [INFO] Task succeeded: HMMsearch
[2023-06-13 04:13:22,231] [INFO] Found 6/6 markers.
[2023-06-13 04:13:22,281] [INFO] Query marker FASTA was written to GCA_947458265.1_RE-18aug17-172_genomic.fna/markers.fasta
[2023-06-13 04:13:22,282] [INFO] Task started: Blastn
[2023-06-13 04:13:22,282] [INFO] Running command: blastn -query GCA_947458265.1_RE-18aug17-172_genomic.fna/markers.fasta -db /var/lib/cwl/stgfcb20327-c1ee-4788-b9bd-ed3e90d1e878/dqc_reference/reference_markers.fasta -out GCA_947458265.1_RE-18aug17-172_genomic.fna/blast.markers.tsv -outfmt 6 -max_hsps 1 -num_alignments 5
[2023-06-13 04:13:23,155] [INFO] Task succeeded: Blastn
[2023-06-13 04:13:23,159] [INFO] Selected 33 target genomes.
[2023-06-13 04:13:23,160] [INFO] Target genome list was written to GCA_947458265.1_RE-18aug17-172_genomic.fna/target_genomes.txt
[2023-06-13 04:13:23,163] [INFO] Task started: fastANI
[2023-06-13 04:13:23,163] [INFO] Running command: fastANI --query /var/lib/cwl/stgc7d94cd8-d518-4dd4-83d6-9b5697688be4/GCA_947458265.1_RE-18aug17-172_genomic.fna.gz --refList GCA_947458265.1_RE-18aug17-172_genomic.fna/target_genomes.txt --output GCA_947458265.1_RE-18aug17-172_genomic.fna/fastani_result.tsv --threads 1
[2023-06-13 04:13:48,894] [INFO] Task succeeded: fastANI
[2023-06-13 04:13:48,895] [INFO] Loading species specific ANI threshold from /var/lib/cwl/stgfcb20327-c1ee-4788-b9bd-ed3e90d1e878/dqc_reference/prokaryote_ANI_species_specific_threshold.txt
[2023-06-13 04:13:48,896] [WARNING] Species-specific ANI threshold file not found. Will use the default threshold for all species. [/var/lib/cwl/stgfcb20327-c1ee-4788-b9bd-ed3e90d1e878/dqc_reference/prokaryote_ANI_species_specific_threshold.txt]
[2023-06-13 04:13:48,930] [INFO] Found 33 fastANI hits (0 hits with ANI > threshold)
[2023-06-13 04:13:48,931] [INFO] The taxonomy check result is classified as 'below_threshold'.
[2023-06-13 04:13:48,931] [INFO] DFAST Taxonomy check final result
--------------------------------------------------------------------------------
organism_name	strain	accession	taxid	species_taxid	relation_to_type	validated	ani	matched_fragments	total_fragments	ani_threshold	status
Thauera aminoaromatica	strain=S2	GCA_000310185.1	164330	164330	type	True	77.6724	246	1322	95	below_threshold
Sulfurisoma sediminicola	strain=DSM 26916	GCA_003663955.1	1381557	1381557	type	True	77.6674	221	1322	95	below_threshold
Thauera phenylacetica	strain=B4P	GCA_000310225.1	164400	164400	type	True	77.6419	261	1322	95	below_threshold
Sulfurisoma sediminicola	strain=BSN1	GCA_003865015.1	1381557	1381557	type	True	77.6241	225	1322	95	below_threshold
Thauera butanivorans	strain=NBRC 103042	GCA_001591165.1	86174	86174	type	True	77.5301	262	1322	95	below_threshold
Thauera chlorobenzoica	strain=3CB1	GCA_001922305.1	96773	96773	type	True	77.5228	232	1322	95	below_threshold
Sulfuritortus calidifontis	strain=DSM 103923	GCA_004346085.1	1914471	1914471	type	True	77.5059	211	1322	95	below_threshold
Thauera chlorobenzoica	strain=3CB-1	GCA_900108255.1	96773	96773	type	True	77.4887	234	1322	95	below_threshold
Ferrigenium kumadai	strain=An22	GCA_018324385.1	1682490	1682490	type	True	77.4465	154	1322	95	below_threshold
Crenobacter sedimenti	strain=HX-7-9	GCA_010435965.1	2705474	2705474	type	True	77.4439	200	1322	95	below_threshold
Thauera phenolivorans	strain=ZV1C	GCA_001696715.1	1792543	1792543	type	True	77.2762	277	1322	95	below_threshold
Thauera aromatica	strain=K172	GCA_003030465.1	59405	59405	type	True	77.2692	244	1322	95	below_threshold
Massilia agilis	strain=JCM 31605	GCA_024756255.1	1811226	1811226	type	True	77.2648	239	1322	95	below_threshold
Chitinimonas koreensis	strain=DSM 17726	GCA_000428465.1	356302	356302	type	True	77.2374	314	1322	95	below_threshold
Crenobacter cavernae	strain=K1W11S-77	GCA_003355495.1	2290923	2290923	type	True	77.2113	218	1322	95	below_threshold
Thauera linaloolentis	strain=47Lol = DSM 12138	GCA_000310205.1	76112	76112	type	True	77.1059	225	1322	95	below_threshold
Thauera linaloolentis	strain=DSM 12138	GCA_000621305.1	76112	76112	type	True	77.0589	237	1322	95	below_threshold
Massilia terrae	strain=JCM 31606	GCA_024753145.1	1811224	1811224	type	True	77.053	250	1322	95	below_threshold
Sphaerotilus sulfidivorans	strain=D-501	GCA_013426975.1	639200	639200	type	True	77.0133	284	1322	95	below_threshold
Massilia cavernae	strain=K1S02-61	GCA_003590855.1	2320864	2320864	type	True	76.8197	216	1322	95	below_threshold
Massilia glaciei	strain=B448-2	GCA_003011895.2	1524097	1524097	type	True	76.8135	236	1322	95	below_threshold
Burkholderia perseverans	strain=INN12	GCA_022870505.1	2615214	2615214	type	True	76.7611	338	1322	95	below_threshold
Burkholderia plantarii	strain=LMG 9035	GCA_902832905.1	41899	41899	type	True	76.7358	323	1322	95	below_threshold
Burkholderia ubonensis		GCA_902499185.1	101571	101571	type	True	76.7295	294	1322	95	below_threshold
Bordetella bronchiseptica	strain=CCUG 219	GCA_021391275.1	518	518	suspected-type	True	76.677	234	1322	95	below_threshold
Pseudoduganella namucuonensis	strain=CGMCC 1.11014	GCA_900116645.1	1035707	1035707	type	True	76.6678	289	1322	95	below_threshold
Achromobacter dolens	strain=LMG 26840	GCA_902859745.1	1287738	1287738	type	True	76.6119	251	1322	95	below_threshold
Cupriavidus malaysiensis	strain=USMAA1020	GCA_001854325.1	367825	367825	type	True	76.5313	335	1322	95	below_threshold
Duganella lactea	strain=FT50W	GCA_009857505.1	2692173	2692173	type	True	76.4527	199	1322	95	below_threshold
Duganella radicis	strain=KCTC 22382	GCA_009720825.1	551988	551988	type	True	76.4136	198	1322	95	below_threshold
Paludibacterium paludis	strain=BCRC 80514	GCA_018802605.1	1225769	1225769	type	True	76.1596	115	1322	95	below_threshold
Paludibacterium paludis	strain=KCTC 32182	GCA_014652495.1	1225769	1225769	type	True	76.1257	113	1322	95	below_threshold
Algiphilus aromaticivorans	strain=DG1253	GCA_000733765.1	382454	382454	type	True	75.7719	99	1322	95	below_threshold
--------------------------------------------------------------------------------
[2023-06-13 04:13:48,956] [INFO] DFAST Taxonomy check result was written to GCA_947458265.1_RE-18aug17-172_genomic.fna/tc_result.tsv
[2023-06-13 04:13:48,956] [INFO] ===== Taxonomy check completed =====
[2023-06-13 04:13:48,957] [INFO] ===== Start completeness check using CheckM =====
[2023-06-13 04:13:48,957] [INFO] Setting CHECKM_DATA_PATH to /var/lib/cwl/stgfcb20327-c1ee-4788-b9bd-ed3e90d1e878/dqc_reference/checkm_data
[2023-06-13 04:13:48,959] [INFO] Selected 'Prokaryote' markers (life, taxid=0) for CheckM
[2023-06-13 04:13:49,008] [INFO] Task started: CheckM
[2023-06-13 04:13:49,008] [INFO] Running command: checkm taxonomy_wf --tab_table -f GCA_947458265.1_RE-18aug17-172_genomic.fna/cc_result.tsv -t 1 life "Prokaryote" GCA_947458265.1_RE-18aug17-172_genomic.fna/checkm_input GCA_947458265.1_RE-18aug17-172_genomic.fna/checkm_result
[2023-06-13 04:14:31,509] [INFO] Task succeeded: CheckM
[2023-06-13 04:14:31,510] [INFO] Completeness check finished.
--------------------------------------------------------------------------------
Completeness: 95.83%
Contamination: 2.08%
Strain heterogeneity: 0.00%
--------------------------------------------------------------------------------
[2023-06-13 04:14:31,534] [INFO] ===== Completeness check finished =====
[2023-06-13 04:14:31,535] [INFO] ===== Start GTDB Search =====
[2023-06-13 04:14:31,535] [INFO] Query marker FASTA already exists. Will reuse it. (GCA_947458265.1_RE-18aug17-172_genomic.fna/markers.fasta)
[2023-06-13 04:14:31,535] [INFO] Task started: Blastn
[2023-06-13 04:14:31,536] [INFO] Running command: blastn -query GCA_947458265.1_RE-18aug17-172_genomic.fna/markers.fasta -db /var/lib/cwl/stgfcb20327-c1ee-4788-b9bd-ed3e90d1e878/dqc_reference/reference_markers_gtdb.fasta -out GCA_947458265.1_RE-18aug17-172_genomic.fna/blast.markers.gtdb.tsv -outfmt 6 -max_hsps 1 -num_alignments 5
[2023-06-13 04:14:32,972] [INFO] Task succeeded: Blastn
[2023-06-13 04:14:32,979] [INFO] Selected 26 target genomes.
[2023-06-13 04:14:32,980] [INFO] Target genome list was written to GCA_947458265.1_RE-18aug17-172_genomic.fna/target_genomes_gtdb.txt
[2023-06-13 04:14:32,992] [INFO] Task started: fastANI
[2023-06-13 04:14:32,992] [INFO] Running command: fastANI --query /var/lib/cwl/stgc7d94cd8-d518-4dd4-83d6-9b5697688be4/GCA_947458265.1_RE-18aug17-172_genomic.fna.gz --refList GCA_947458265.1_RE-18aug17-172_genomic.fna/target_genomes_gtdb.txt --output GCA_947458265.1_RE-18aug17-172_genomic.fna/fastani_result_gtdb.tsv --threads 1
[2023-06-13 04:14:50,929] [INFO] Task succeeded: fastANI
[2023-06-13 04:14:50,953] [INFO] Found 26 fastANI hits (0 hits with ANI > circumscription radius)
[2023-06-13 04:14:50,954] [INFO] GTDB search result
--------------------------------------------------------------------------------
accession	gtdb_species	ani	matched_fragments	total_fragments	gtdb_taxonomy	ani_circumscription_radius	mean_intra_species_ani	min_intra_species_ani	mean_intra_species_af	min_intra_species_af	num_clustered_genomes	status
GCA_903873475.1	s__CAIKXV01 sp903873475	87.2454	929	1322	d__Bacteria;p__Proteobacteria;c__Gammaproteobacteria;o__Burkholderiales;f__CAIKXV01;g__CAIKXV01	95.0	N/A	N/A	N/A	N/A	1	-
GCA_903831455.1	s__CAIKXV01 sp903831455	80.0394	674	1322	d__Bacteria;p__Proteobacteria;c__Gammaproteobacteria;o__Burkholderiales;f__CAIKXV01;g__CAIKXV01	95.0	99.62	99.56	0.92	0.90	5	-
GCA_014379045.1	s__CAIKXV01 sp014379045	79.8787	556	1322	d__Bacteria;p__Proteobacteria;c__Gammaproteobacteria;o__Burkholderiales;f__CAIKXV01;g__CAIKXV01	95.0	N/A	N/A	N/A	N/A	1	-
GCF_003865015.1	s__Sulfurisoma sediminicola	77.6385	224	1322	d__Bacteria;p__Proteobacteria;c__Gammaproteobacteria;o__Burkholderiales;f__Rhodocyclaceae;g__Sulfurisoma	95.0	99.99	99.99	0.99	0.99	2	-
GCA_016716275.1	s__JADJWR01 sp016716275	77.5755	276	1322	d__Bacteria;p__Proteobacteria;c__Gammaproteobacteria;o__Burkholderiales;f__Burkholderiaceae;g__JADJWR01	95.0	N/A	N/A	N/A	N/A	1	-
GCA_008933825.1	s__Desulfobacillus sp008933825	77.5102	216	1322	d__Bacteria;p__Proteobacteria;c__Gammaproteobacteria;o__Burkholderiales;f__Rhodocyclaceae;g__Desulfobacillus	95.0	99.14	98.51	0.86	0.84	3	-
GCA_017347485.1	s__Desulfobacillus denitrificans	77.5004	248	1322	d__Bacteria;p__Proteobacteria;c__Gammaproteobacteria;o__Burkholderiales;f__Rhodocyclaceae;g__Desulfobacillus	95.0	98.90	98.32	0.92	0.90	5	-
GCA_016791205.1	s__Desulfobacillus sp016791205	77.4809	216	1322	d__Bacteria;p__Proteobacteria;c__Gammaproteobacteria;o__Burkholderiales;f__Rhodocyclaceae;g__Desulfobacillus	95.0	N/A	N/A	N/A	N/A	1	-
GCF_018324385.1	s__Gallionella kumadai	77.4465	154	1322	d__Bacteria;p__Proteobacteria;c__Gammaproteobacteria;o__Burkholderiales;f__Gallionellaceae;g__Gallionella	95.0	N/A	N/A	N/A	N/A	1	-
GCA_016720425.1	s__CAIWHR01 sp016720425	77.4436	290	1322	d__Bacteria;p__Proteobacteria;c__Gammaproteobacteria;o__Burkholderiales;f__Casimicrobiaceae;g__CAIWHR01	95.0	98.60	97.38	0.93	0.90	10	-
GCA_016716425.1	s__VBCG01 sp016716425	77.3136	274	1322	d__Bacteria;p__Proteobacteria;c__Gammaproteobacteria;o__Burkholderiales;f__Casimicrobiaceae;g__VBCG01	95.0	N/A	N/A	N/A	N/A	1	-
GCF_000428465.1	s__Chitinimonas koreensis	77.236	314	1322	d__Bacteria;p__Proteobacteria;c__Gammaproteobacteria;o__Burkholderiales;f__Chitinimonadaceae;g__Chitinimonas	95.0	N/A	N/A	N/A	N/A	1	-
GCF_003355495.1	s__Crenobacter cavernae	77.1985	220	1322	d__Bacteria;p__Proteobacteria;c__Gammaproteobacteria;o__Burkholderiales;f__Chromobacteriaceae;g__Crenobacter	95.0	96.17	96.17	0.93	0.93	2	-
GCA_018820765.1	s__Thiobacillus sp018820765	77.1653	194	1322	d__Bacteria;p__Proteobacteria;c__Gammaproteobacteria;o__Burkholderiales;f__Thiobacillaceae;g__Thiobacillus	95.0	99.99	99.99	0.98	0.97	3	-
GCA_019137045.1	s__JAGVSZ01 sp019137045	77.1522	254	1322	d__Bacteria;p__Proteobacteria;c__Gammaproteobacteria;o__Burkholderiales;f__SG8-39;g__JAGVSZ01	95.0	N/A	N/A	N/A	N/A	1	-
GCA_001464765.1	s__Ga0077526 sp001464765	77.148	223	1322	d__Bacteria;p__Proteobacteria;c__Gammaproteobacteria;o__Burkholderiales;f__Ga0077523;g__Ga0077526	95.0	N/A	N/A	N/A	N/A	1	-
GCA_001771975.1	s__2-02-FULL-66-14 sp001771975	77.0758	169	1322	d__Bacteria;p__Proteobacteria;c__Gammaproteobacteria;o__Burkholderiales;f__SG8-39;g__2-02-FULL-66-14	95.0	99.59	99.59	0.95	0.95	2	-
GCA_903933915.1	s__Ga0077527 sp903933915	77.0529	204	1322	d__Bacteria;p__Proteobacteria;c__Gammaproteobacteria;o__Burkholderiales;f__SG8-41;g__Ga0077527	95.0	99.39	99.32	0.93	0.92	3	-
GCA_016861185.1	s__DSNY01 sp016861185	77.0162	230	1322	d__Bacteria;p__Proteobacteria;c__Gammaproteobacteria;o__Burkholderiales;f__Burkholderiaceae;g__DSNY01	95.0	N/A	N/A	N/A	N/A	1	-
GCF_008329925.1	s__Sphaerotilus sulfidivorans	76.9753	287	1322	d__Bacteria;p__Proteobacteria;c__Gammaproteobacteria;o__Burkholderiales;f__Burkholderiaceae;g__Sphaerotilus	95.0	99.98	99.98	1.00	1.00	2	-
GCA_016790695.1	s__JAEUOS01 sp016790695	76.8884	346	1322	d__Bacteria;p__Proteobacteria;c__Gammaproteobacteria;o__Burkholderiales;f__JAEUOS01;g__JAEUOS01	95.0	N/A	N/A	N/A	N/A	1	-
GCA_012274205.1	s__2-12-FULL-64-23 sp012274205	76.7019	198	1322	d__Bacteria;p__Proteobacteria;c__Gammaproteobacteria;o__Burkholderiales;f__SG8-39;g__2-12-FULL-64-23	95.0	N/A	N/A	N/A	N/A	1	-
GCA_017592775.1	s__Accumulibacter sp017592775	76.6171	169	1322	d__Bacteria;p__Proteobacteria;c__Gammaproteobacteria;o__Burkholderiales;f__Rhodocyclaceae;g__Accumulibacter	95.0	99.29	99.29	0.85	0.85	2	-
GCF_001854325.1	s__Cupriavidus malaysiensis	76.5725	328	1322	d__Bacteria;p__Proteobacteria;c__Gammaproteobacteria;o__Burkholderiales;f__Burkholderiaceae;g__Cupriavidus	95.0	99.34	99.30	0.91	0.91	3	-
GCA_016202615.1	s__PALSA-1004 sp016202615	76.5321	161	1322	d__Bacteria;p__Proteobacteria;c__Gammaproteobacteria;o__Burkholderiales;f__SG8-41;g__PALSA-1004	95.0	N/A	N/A	N/A	N/A	1	-
GCA_016789955.1	s__Rubrivivax sp016789955	76.3068	252	1322	d__Bacteria;p__Proteobacteria;c__Gammaproteobacteria;o__Burkholderiales;f__Burkholderiaceae;g__Rubrivivax	95.0	N/A	N/A	N/A	N/A	1	-
--------------------------------------------------------------------------------
[2023-06-13 04:14:50,956] [INFO] GTDB search result was written to GCA_947458265.1_RE-18aug17-172_genomic.fna/result_gtdb.tsv
[2023-06-13 04:14:50,957] [INFO] ===== GTDB Search completed =====
[2023-06-13 04:14:50,963] [INFO] DFAST_QC result json was written to GCA_947458265.1_RE-18aug17-172_genomic.fna/dqc_result.json
[2023-06-13 04:14:50,963] [INFO] DFAST_QC completed!
[2023-06-13 04:14:50,963] [INFO] Total running time: 0h1m43s
