[2024-01-24 12:22:14,107] [INFO] DFAST_QC pipeline started.
[2024-01-24 12:22:14,109] [INFO] DFAST_QC version: 0.5.7
[2024-01-24 12:22:14,110] [INFO] DQC Reference Directory: /var/lib/cwl/stg926685c2-b457-4de9-a6de-88dd8abf21a5/dqc_reference
[2024-01-24 12:22:15,565] [INFO] ===== Start taxonomy check using ANI =====
[2024-01-24 12:22:15,566] [INFO] Task started: Prodigal
[2024-01-24 12:22:15,567] [INFO] Running command: gunzip -c /var/lib/cwl/stgd047e940-386f-480e-8754-2f65094e94df/GCF_002088235.1_ASM208823v1_genomic.fna.gz | prodigal -d GCF_002088235.1_ASM208823v1_genomic.fna/cds.fna -a GCF_002088235.1_ASM208823v1_genomic.fna/protein.faa -g 11 -q > /dev/null
[2024-01-24 12:22:28,454] [INFO] Task succeeded: Prodigal
[2024-01-24 12:22:28,454] [INFO] Task started: HMMsearch
[2024-01-24 12:22:28,454] [INFO] Running command: hmmsearch --tblout GCF_002088235.1_ASM208823v1_genomic.fna/hmmer_result.tsv -E 1E-50 /var/lib/cwl/stg926685c2-b457-4de9-a6de-88dd8abf21a5/dqc_reference/reference_markers.hmm GCF_002088235.1_ASM208823v1_genomic.fna/protein.faa > /dev/null
[2024-01-24 12:22:28,769] [INFO] Task succeeded: HMMsearch
[2024-01-24 12:22:28,770] [INFO] Found 6/6 markers.
[2024-01-24 12:22:28,814] [INFO] Query marker FASTA was written to GCF_002088235.1_ASM208823v1_genomic.fna/markers.fasta
[2024-01-24 12:22:28,814] [INFO] Task started: Blastn
[2024-01-24 12:22:28,814] [INFO] Running command: blastn -query GCF_002088235.1_ASM208823v1_genomic.fna/markers.fasta -db /var/lib/cwl/stg926685c2-b457-4de9-a6de-88dd8abf21a5/dqc_reference/reference_markers.fasta -out GCF_002088235.1_ASM208823v1_genomic.fna/blast.markers.tsv -outfmt 6 -max_hsps 1 -num_alignments 5
[2024-01-24 12:22:29,824] [INFO] Task succeeded: Blastn
[2024-01-24 12:22:29,827] [INFO] Selected 30 target genomes.
[2024-01-24 12:22:29,828] [INFO] Target genome list was written to GCF_002088235.1_ASM208823v1_genomic.fna/target_genomes.txt
[2024-01-24 12:22:29,839] [INFO] Task started: fastANI
[2024-01-24 12:22:29,839] [INFO] Running command: fastANI --query /var/lib/cwl/stgd047e940-386f-480e-8754-2f65094e94df/GCF_002088235.1_ASM208823v1_genomic.fna.gz --refList GCF_002088235.1_ASM208823v1_genomic.fna/target_genomes.txt --output GCF_002088235.1_ASM208823v1_genomic.fna/fastani_result.tsv --threads 1
[2024-01-24 12:22:49,962] [INFO] Task succeeded: fastANI
[2024-01-24 12:22:49,962] [INFO] Loading species specific ANI threshold from /var/lib/cwl/stg926685c2-b457-4de9-a6de-88dd8abf21a5/dqc_reference/prokaryote_ANI_species_specific_threshold.txt
[2024-01-24 12:22:49,963] [WARNING] Species-specific ANI threshold file not found. Will use the default threshold for all species. [/var/lib/cwl/stg926685c2-b457-4de9-a6de-88dd8abf21a5/dqc_reference/prokaryote_ANI_species_specific_threshold.txt]
[2024-01-24 12:22:49,980] [INFO] Found 23 fastANI hits (1 hits with ANI > threshold)
[2024-01-24 12:22:49,980] [INFO] The taxonomy check result is classified as 'conclusive'.
[2024-01-24 12:22:49,981] [INFO] DFAST Taxonomy check final result
--------------------------------------------------------------------------------
organism_name	strain	accession	taxid	species_taxid	relation_to_type	validated	ani	matched_fragments	total_fragments	ani_threshold	status
Oceanococcus atlanticus	strain=22II-S10r2	GCA_002088235.1	1317117	1317117	type	True	100.0	1208	1210	95	conclusive
Abyssibacter profundi	strain=OUC007	GCA_003151135.1	2182787	2182787	type	True	77.1157	109	1210	95	below_threshold
Aquisalimonas asiatica	strain=CGMCC 1.6291	GCA_900110585.1	406100	406100	type	True	77.035	62	1210	95	below_threshold
Methylonatrum kenyense	strain=AMT 1	GCA_023195885.1	455253	455253	type	True	76.6045	57	1210	95	below_threshold
Solimonas aquatica	strain=DSM 25927	GCA_900111015.1	489703	489703	type	True	76.546	126	1210	95	below_threshold
Nevskia soli	strain=DSM 19509	GCA_000711955.1	418856	418856	type	True	76.4968	137	1210	95	below_threshold
Pseudomonas oryzae	strain=KCTC 32247	GCA_900104805.1	1392877	1392877	type	True	76.4964	104	1210	95	below_threshold
Sulfurivermis fontis	strain=JG42	GCA_004001245.1	1972068	1972068	type	True	76.4476	75	1210	95	below_threshold
Arenimonas terrae	strain=R29	GCA_006265115.1	2546226	2546226	type	True	76.4285	58	1210	95	below_threshold
Pseudomonas germanica	strain=FIT28	GCA_019614655.1	2815720	2815720	type	True	76.4057	54	1210	95	below_threshold
Solimonas marina	strain=C16B3	GCA_012241385.1	2714601	2714601	type	True	76.3907	124	1210	95	below_threshold
Stenotrophomonas maltophilia	strain=NBRC 14161	GCA_001591205.1	40324	40324	type	True	76.2193	72	1210	95	below_threshold
Stenotrophomonas maltophilia	strain=NCTC10257	GCA_900186865.1	40324	40324	type	True	76.2144	77	1210	95	below_threshold
Stenotrophomonas maltophilia	strain=MTCC 434	GCA_000597745.1	40324	40324	type	True	76.1793	75	1210	95	below_threshold
Arenimonas composti	strain=TR7-09	GCA_000747175.1	370776	370776	type	True	76.178	60	1210	95	below_threshold
Stenotrophomonas maltophilia	strain=ATCC 13637	GCA_001997185.1	40324	40324	type	True	76.174	77	1210	95	below_threshold
Halomonas tianxiuensis	strain=BC-M4-5	GCA_009834345.1	2497861	2497861	type	True	76.1525	68	1210	95	below_threshold
Salinisphaera japonica	strain=YTM-1	GCA_003788585.1	1304270	1304270	type	True	76.1228	66	1210	95	below_threshold
Arenimonas composti	strain=DSM 18010	GCA_000426365.1	370776	370776	type	True	76.0378	64	1210	95	below_threshold
Pseudomonas sessilinigenes	strain=CMR12a	GCA_019139855.1	658629	658629	type	True	75.9408	75	1210	95	below_threshold
Pseudomonas sessilinigenes	strain=CMR12a	GCA_003850565.1	658629	658629	type	True	75.9007	77	1210	95	below_threshold
Pseudomonas pharyngis	strain=BML-PP036	GCA_021602345.1	2892333	2892333	type	True	75.7716	64	1210	95	below_threshold
Metallibacterium scheffleri	strain=DSM 24874	GCA_004798955.1	993689	993689	type	True	75.7345	57	1210	95	below_threshold
--------------------------------------------------------------------------------
[2024-01-24 12:22:49,982] [INFO] DFAST Taxonomy check result was written to GCF_002088235.1_ASM208823v1_genomic.fna/tc_result.tsv
[2024-01-24 12:22:49,983] [INFO] ===== Taxonomy check completed =====
[2024-01-24 12:22:49,983] [INFO] ===== Start completeness check using CheckM =====
[2024-01-24 12:22:49,983] [INFO] Setting CHECKM_DATA_PATH to /var/lib/cwl/stg926685c2-b457-4de9-a6de-88dd8abf21a5/dqc_reference/checkm_data
[2024-01-24 12:22:49,984] [INFO] Selected 'Prokaryote' markers (life, taxid=0) for CheckM
[2024-01-24 12:22:50,023] [INFO] Task started: CheckM
[2024-01-24 12:22:50,024] [INFO] Running command: checkm taxonomy_wf --tab_table -f GCF_002088235.1_ASM208823v1_genomic.fna/cc_result.tsv -t 1 life "Prokaryote" GCF_002088235.1_ASM208823v1_genomic.fna/checkm_input GCF_002088235.1_ASM208823v1_genomic.fna/checkm_result
[2024-01-24 12:23:33,877] [INFO] Task succeeded: CheckM
[2024-01-24 12:23:33,879] [INFO] Completeness check finished.
--------------------------------------------------------------------------------
Completeness: 100.00%
Contamination: 0.00%
Strain heterogeneity: 0.00%
--------------------------------------------------------------------------------
[2024-01-24 12:23:33,904] [INFO] ===== Completeness check finished =====
[2024-01-24 12:23:33,904] [INFO] ===== Start GTDB Search =====
[2024-01-24 12:23:33,905] [INFO] Query marker FASTA already exists. Will reuse it. (GCF_002088235.1_ASM208823v1_genomic.fna/markers.fasta)
[2024-01-24 12:23:33,905] [INFO] Task started: Blastn
[2024-01-24 12:23:33,905] [INFO] Running command: blastn -query GCF_002088235.1_ASM208823v1_genomic.fna/markers.fasta -db /var/lib/cwl/stg926685c2-b457-4de9-a6de-88dd8abf21a5/dqc_reference/reference_markers_gtdb.fasta -out GCF_002088235.1_ASM208823v1_genomic.fna/blast.markers.gtdb.tsv -outfmt 6 -max_hsps 1 -num_alignments 5
[2024-01-24 12:23:35,599] [INFO] Task succeeded: Blastn
[2024-01-24 12:23:35,604] [INFO] Selected 29 target genomes.
[2024-01-24 12:23:35,605] [INFO] Target genome list was written to GCF_002088235.1_ASM208823v1_genomic.fna/target_genomes_gtdb.txt
[2024-01-24 12:23:35,635] [INFO] Task started: fastANI
[2024-01-24 12:23:35,635] [INFO] Running command: fastANI --query /var/lib/cwl/stgd047e940-386f-480e-8754-2f65094e94df/GCF_002088235.1_ASM208823v1_genomic.fna.gz --refList GCF_002088235.1_ASM208823v1_genomic.fna/target_genomes_gtdb.txt --output GCF_002088235.1_ASM208823v1_genomic.fna/fastani_result_gtdb.tsv --threads 1
[2024-01-24 12:23:52,326] [INFO] Task succeeded: fastANI
[2024-01-24 12:23:52,351] [INFO] Found 23 fastANI hits (1 hits with ANI > circumscription radius)
[2024-01-24 12:23:52,352] [INFO] GTDB search result
--------------------------------------------------------------------------------
accession	gtdb_species	ani	matched_fragments	total_fragments	gtdb_taxonomy	ani_circumscription_radius	mean_intra_species_ani	min_intra_species_ani	mean_intra_species_af	min_intra_species_af	num_clustered_genomes	status
GCF_002088235.1	s__Oceanococcus atlanticus	100.0	1208	1210	d__Bacteria;p__Proteobacteria;c__Gammaproteobacteria;o__Nevskiales;f__Oceanococcaceae;g__Oceanococcus	95.0	98.37	98.37	0.93	0.93	2	conclusive
GCF_003151135.1	s__Abyssibacter profundi	77.1157	109	1210	d__Bacteria;p__Proteobacteria;c__Gammaproteobacteria;o__Nevskiales;f__OUC007;g__Abyssibacter	95.0	98.39	98.39	0.93	0.93	2	-
GCF_900110585.1	s__Aquisalimonas asiatica	77.035	62	1210	d__Bacteria;p__Proteobacteria;c__Gammaproteobacteria;o__Nitrococcales;f__Aquisalimonadaceae;g__Aquisalimonas	95.0	N/A	N/A	N/A	N/A	1	-
GCF_000021985.1	s__Thioalkalivibrio_A sulfidiphilus	76.7293	86	1210	d__Bacteria;p__Proteobacteria;c__Gammaproteobacteria;o__Ectothiorhodospirales;f__Ectothiorhodospiraceae;g__Thioalkalivibrio_A	95.0	N/A	N/A	N/A	N/A	1	-
GCA_002683405.1	s__GCA-2683405 sp002683405	76.7201	91	1210	d__Bacteria;p__Proteobacteria;c__Gammaproteobacteria;o__Nevskiales;f__Salinisphaeraceae;g__GCA-2683405	95.0	N/A	N/A	N/A	N/A	1	-
GCA_019192445.1	s__MS8 sp019192445	76.7111	75	1210	d__Bacteria;p__Proteobacteria;c__Gammaproteobacteria;o__Nevskiales;f__Oceanococcaceae;g__MS8	95.0	N/A	N/A	N/A	N/A	1	-
GCF_000377785.1	s__Thioalkalivibrio sp000377785	76.7049	70	1210	d__Bacteria;p__Proteobacteria;c__Gammaproteobacteria;o__Ectothiorhodospirales;f__Thioalkalivibrionaceae;g__Thioalkalivibrio	95.0	N/A	N/A	N/A	N/A	1	-
GCF_900111015.1	s__Solimonas aquatica	76.5629	125	1210	d__Bacteria;p__Proteobacteria;c__Gammaproteobacteria;o__Nevskiales;f__Nevskiaceae;g__Solimonas	95.0	N/A	N/A	N/A	N/A	1	-
GCF_013395055.1	s__Pseudomonas_Q sp013395055	76.5429	67	1210	d__Bacteria;p__Proteobacteria;c__Gammaproteobacteria;o__Pseudomonadales;f__Pseudomonadaceae;g__Pseudomonas_Q	95.0	N/A	N/A	N/A	N/A	1	-
GCF_900104805.1	s__Pseudomonas_K oryzae	76.5133	103	1210	d__Bacteria;p__Proteobacteria;c__Gammaproteobacteria;o__Pseudomonadales;f__Pseudomonadaceae;g__Pseudomonas_K	95.0	N/A	N/A	N/A	N/A	1	-
GCA_018222345.1	s__JAAFAL01 sp018222345	76.4272	92	1210	d__Bacteria;p__Proteobacteria;c__Gammaproteobacteria;o__Nevskiales;f__OUC007;g__JAAFAL01	95.0	N/A	N/A	N/A	N/A	1	-
GCF_016008875.1	s__Pseudomonas_E sp002439135	76.3481	98	1210	d__Bacteria;p__Proteobacteria;c__Gammaproteobacteria;o__Pseudomonadales;f__Pseudomonadaceae;g__Pseudomonas_E	95.0	96.91	96.75	0.87	0.85	5	-
GCA_002840095.1	s__Sulfurivermis sp002840095	76.3183	74	1210	d__Bacteria;p__Proteobacteria;c__Gammaproteobacteria;o__Thiohalomonadales;f__Thiohalomonadaceae;g__Sulfurivermis	95.0	N/A	N/A	N/A	N/A	1	-
GCA_011321775.1	s__Thiogranum sp011321775	76.2918	60	1210	d__Bacteria;p__Proteobacteria;c__Gammaproteobacteria;o__DSM-19610;f__DSM-19610;g__Thiogranum	95.0	N/A	N/A	N/A	N/A	1	-
GCA_002862435.1	s__Macondimonas sp002862435	76.2849	59	1210	d__Bacteria;p__Proteobacteria;c__Gammaproteobacteria;o__UBA5335;f__UBA5335;g__Macondimonas	95.0	N/A	N/A	N/A	N/A	1	-
GCF_900186865.1	s__Stenotrophomonas maltophilia	76.2144	77	1210	d__Bacteria;p__Proteobacteria;c__Gammaproteobacteria;o__Xanthomonadales;f__Xanthomonadaceae;g__Stenotrophomonas	95.0	97.99	97.31	0.92	0.87	234	-
GCF_004346925.1	s__Stenotrophomonas maltophilia_A	76.1857	70	1210	d__Bacteria;p__Proteobacteria;c__Gammaproteobacteria;o__Xanthomonadales;f__Xanthomonadaceae;g__Stenotrophomonas	95.0	97.29	95.44	0.89	0.84	41	-
GCF_003788585.1	s__Salinisphaera japonica	76.1228	66	1210	d__Bacteria;p__Proteobacteria;c__Gammaproteobacteria;o__Nevskiales;f__Salinisphaeraceae;g__Salinisphaera	95.0	N/A	N/A	N/A	N/A	1	-
GCF_003182095.1	s__Plasticicumulans acidivorans	76.0763	87	1210	d__Bacteria;p__Proteobacteria;c__Gammaproteobacteria;o__Competibacterales;f__Competibacteraceae;g__Plasticicumulans	95.0	N/A	N/A	N/A	N/A	1	-
GCA_003565625.1	s__Wenzhouxiangella sp003565625	75.9824	61	1210	d__Bacteria;p__Proteobacteria;c__Gammaproteobacteria;o__Xanthomonadales;f__Wenzhouxiangellaceae;g__Wenzhouxiangella	95.0	N/A	N/A	N/A	N/A	1	-
GCF_003850565.1	s__Pseudomonas_E sp001705835	75.9168	76	1210	d__Bacteria;p__Proteobacteria;c__Gammaproteobacteria;o__Pseudomonadales;f__Pseudomonadaceae;g__Pseudomonas_E	95.0	99.20	98.99	0.91	0.87	6	-
GCA_002483055.1	s__Dokdonella sp002483055	75.7339	69	1210	d__Bacteria;p__Proteobacteria;c__Gammaproteobacteria;o__Xanthomonadales;f__Rhodanobacteraceae;g__Dokdonella	95.0	N/A	N/A	N/A	N/A	1	-
GCA_017744955.1	s__Dokdonella_A sp017744955	75.6516	73	1210	d__Bacteria;p__Proteobacteria;c__Gammaproteobacteria;o__Xanthomonadales;f__Rhodanobacteraceae;g__Dokdonella_A	95.0	N/A	N/A	N/A	N/A	1	-
--------------------------------------------------------------------------------
[2024-01-24 12:23:52,355] [INFO] GTDB search result was written to GCF_002088235.1_ASM208823v1_genomic.fna/result_gtdb.tsv
[2024-01-24 12:23:52,356] [INFO] ===== GTDB Search completed =====
[2024-01-24 12:23:52,360] [INFO] DFAST_QC result json was written to GCF_002088235.1_ASM208823v1_genomic.fna/dqc_result.json
[2024-01-24 12:23:52,361] [INFO] DFAST_QC completed!
[2024-01-24 12:23:52,361] [INFO] Total running time: 0h1m38s
