[2024-01-25 18:06:20,784] [INFO] DFAST_QC pipeline started.
[2024-01-25 18:06:20,785] [INFO] DFAST_QC version: 0.5.7
[2024-01-25 18:06:20,785] [INFO] DQC Reference Directory: /var/lib/cwl/stg35634178-fb6d-4eea-92b1-cbba292151db/dqc_reference
[2024-01-25 18:06:21,966] [INFO] ===== Start taxonomy check using ANI =====
[2024-01-25 18:06:21,967] [INFO] Task started: Prodigal
[2024-01-25 18:06:21,967] [INFO] Running command: gunzip -c /var/lib/cwl/stg2c78cc8c-763f-4338-b160-189bec8f6a2a/GCF_000969705.1_ASM96970v1_genomic.fna.gz | prodigal -d GCF_000969705.1_ASM96970v1_genomic.fna/cds.fna -a GCF_000969705.1_ASM96970v1_genomic.fna/protein.faa -g 11 -q > /dev/null
[2024-01-25 18:06:36,352] [INFO] Task succeeded: Prodigal
[2024-01-25 18:06:36,353] [INFO] Task started: HMMsearch
[2024-01-25 18:06:36,353] [INFO] Running command: hmmsearch --tblout GCF_000969705.1_ASM96970v1_genomic.fna/hmmer_result.tsv -E 1E-50 /var/lib/cwl/stg35634178-fb6d-4eea-92b1-cbba292151db/dqc_reference/reference_markers.hmm GCF_000969705.1_ASM96970v1_genomic.fna/protein.faa > /dev/null
[2024-01-25 18:06:36,636] [INFO] Task succeeded: HMMsearch
[2024-01-25 18:06:36,637] [INFO] Found 6/6 markers.
[2024-01-25 18:06:36,683] [INFO] Query marker FASTA was written to GCF_000969705.1_ASM96970v1_genomic.fna/markers.fasta
[2024-01-25 18:06:36,683] [INFO] Task started: Blastn
[2024-01-25 18:06:36,683] [INFO] Running command: blastn -query GCF_000969705.1_ASM96970v1_genomic.fna/markers.fasta -db /var/lib/cwl/stg35634178-fb6d-4eea-92b1-cbba292151db/dqc_reference/reference_markers.fasta -out GCF_000969705.1_ASM96970v1_genomic.fna/blast.markers.tsv -outfmt 6 -max_hsps 1 -num_alignments 5
[2024-01-25 18:06:37,611] [INFO] Task succeeded: Blastn
[2024-01-25 18:06:37,614] [INFO] Selected 21 target genomes.
[2024-01-25 18:06:37,614] [INFO] Target genome list was written to GCF_000969705.1_ASM96970v1_genomic.fna/target_genomes.txt
[2024-01-25 18:06:37,637] [INFO] Task started: fastANI
[2024-01-25 18:06:37,638] [INFO] Running command: fastANI --query /var/lib/cwl/stg2c78cc8c-763f-4338-b160-189bec8f6a2a/GCF_000969705.1_ASM96970v1_genomic.fna.gz --refList GCF_000969705.1_ASM96970v1_genomic.fna/target_genomes.txt --output GCF_000969705.1_ASM96970v1_genomic.fna/fastani_result.tsv --threads 1
[2024-01-25 18:06:58,256] [INFO] Task succeeded: fastANI
[2024-01-25 18:06:58,256] [INFO] Loading species specific ANI threshold from /var/lib/cwl/stg35634178-fb6d-4eea-92b1-cbba292151db/dqc_reference/prokaryote_ANI_species_specific_threshold.txt
[2024-01-25 18:06:58,257] [WARNING] Species-specific ANI threshold file not found. Will use the default threshold for all species. [/var/lib/cwl/stg35634178-fb6d-4eea-92b1-cbba292151db/dqc_reference/prokaryote_ANI_species_specific_threshold.txt]
[2024-01-25 18:06:58,269] [INFO] Found 20 fastANI hits (1 hits with ANI > threshold)
[2024-01-25 18:06:58,269] [INFO] The taxonomy check result is classified as 'conclusive'.
[2024-01-25 18:06:58,269] [INFO] DFAST Taxonomy check final result
--------------------------------------------------------------------------------
organism_name	strain	accession	taxid	species_taxid	relation_to_type	validated	ani	matched_fragments	total_fragments	ani_threshold	status
Nitriliruptor alkaliphilus	strain=DSM 45188	GCA_000969705.1	427918	427918	type	True	100.0	1851	1852	95	conclusive
Egicoccus halophilus	strain=EGI 80432	GCA_004300825.1	1670830	1670830	type	True	78.8311	597	1852	95	below_threshold
Egicoccus halophilus	strain=CGMCC 1.14988	GCA_014640475.1	1670830	1670830	type	True	78.812	593	1852	95	below_threshold
Salsipaludibacter albus	strain=AS10	GCA_019798055.1	2849650	2849650	type	True	77.1175	387	1852	95	below_threshold
Euzebya rosea	strain=DSW09	GCA_003073135.1	2052804	2052804	type	True	76.1639	373	1852	95	below_threshold
Euzebya pacifica	strain=DY32-46	GCA_003344865.1	1608957	1608957	type	True	75.9946	351	1852	95	below_threshold
Dermacoccus nishinomiyaensis	strain=FDAARGOS_1119	GCA_016766835.1	1274	1274	type	True	75.6238	127	1852	95	below_threshold
Glycomyces buryatensis	strain=18	GCA_004912275.1	2570927	2570927	type	True	75.6109	192	1852	95	below_threshold
Ruania albidiflava	strain=DSM 18029	GCA_000421225.1	366586	366586	type	True	75.5192	158	1852	95	below_threshold
Actinomyces viscosus	strain=CCUG 14476	GCA_004525795.1	1656	1656	type	True	75.4721	106	1852	95	below_threshold
Pseudonocardia hierapolitana	strain=DSM 45671	GCA_007994075.1	1128676	1128676	type	True	75.4637	390	1852	95	below_threshold
Amycolatopsis sacchari	strain=DSM 44468	GCA_900114035.1	115433	115433	type	True	75.3912	315	1852	95	below_threshold
Isoptericola chiayiensis	strain=KCTC 19740	GCA_013149805.1	579446	579446	type	True	75.3853	257	1852	95	below_threshold
Saccharopolyspora dendranthemae	strain=DSM 46699	GCA_007829955.1	1181886	1181886	type	True	75.3628	247	1852	95	below_threshold
Streptomyces harenosi	strain=PRKS01-65	GCA_011008945.1	2697029	2697029	type	True	75.3291	306	1852	95	below_threshold
Glycomyces dulcitolivorans	strain=SJ-25	GCA_003265355.1	2200759	2200759	type	True	75.293	272	1852	95	below_threshold
Acrocarpospora pleiomorpha	strain=NBRC 16267	GCA_009687885.1	90975	90975	type	True	75.2045	315	1852	95	below_threshold
Cellulosimicrobium arenosum	strain=KCTC 49039	GCA_014837295.1	2708133	2708133	type	True	75.2	223	1852	95	below_threshold
Sciscionella marina	strain=DSM 45152	GCA_000379465.1	508770	508770	type	True	75.0652	206	1852	95	below_threshold
Thalassobaculum fulvum	strain=KCTC 42651	GCA_014652915.1	1633335	1633335	type	True	74.8947	258	1852	95	below_threshold
--------------------------------------------------------------------------------
[2024-01-25 18:06:58,271] [INFO] DFAST Taxonomy check result was written to GCF_000969705.1_ASM96970v1_genomic.fna/tc_result.tsv
[2024-01-25 18:06:58,271] [INFO] ===== Taxonomy check completed =====
[2024-01-25 18:06:58,271] [INFO] ===== Start completeness check using CheckM =====
[2024-01-25 18:06:58,271] [INFO] Setting CHECKM_DATA_PATH to /var/lib/cwl/stg35634178-fb6d-4eea-92b1-cbba292151db/dqc_reference/checkm_data
[2024-01-25 18:06:58,272] [INFO] Selected 'Prokaryote' markers (life, taxid=0) for CheckM
[2024-01-25 18:06:58,323] [INFO] Task started: CheckM
[2024-01-25 18:06:58,323] [INFO] Running command: checkm taxonomy_wf --tab_table -f GCF_000969705.1_ASM96970v1_genomic.fna/cc_result.tsv -t 1 life "Prokaryote" GCF_000969705.1_ASM96970v1_genomic.fna/checkm_input GCF_000969705.1_ASM96970v1_genomic.fna/checkm_result
[2024-01-25 18:09:07,226] [INFO] Task succeeded: CheckM
[2024-01-25 18:09:07,228] [INFO] Completeness check finished.
--------------------------------------------------------------------------------
Completeness: 100.00%
Contamination: 0.00%
Strain heterogeneity: 0.00%
--------------------------------------------------------------------------------
[2024-01-25 18:09:07,249] [INFO] ===== Completeness check finished =====
[2024-01-25 18:09:07,250] [INFO] ===== Start GTDB Search =====
[2024-01-25 18:09:07,251] [INFO] Query marker FASTA already exists. Will reuse it. (GCF_000969705.1_ASM96970v1_genomic.fna/markers.fasta)
[2024-01-25 18:09:07,251] [INFO] Task started: Blastn
[2024-01-25 18:09:07,251] [INFO] Running command: blastn -query GCF_000969705.1_ASM96970v1_genomic.fna/markers.fasta -db /var/lib/cwl/stg35634178-fb6d-4eea-92b1-cbba292151db/dqc_reference/reference_markers_gtdb.fasta -out GCF_000969705.1_ASM96970v1_genomic.fna/blast.markers.gtdb.tsv -outfmt 6 -max_hsps 1 -num_alignments 5
[2024-01-25 18:09:08,768] [INFO] Task succeeded: Blastn
[2024-01-25 18:09:08,770] [INFO] Selected 14 target genomes.
[2024-01-25 18:09:08,771] [INFO] Target genome list was written to GCF_000969705.1_ASM96970v1_genomic.fna/target_genomes_gtdb.txt
[2024-01-25 18:09:08,794] [INFO] Task started: fastANI
[2024-01-25 18:09:08,795] [INFO] Running command: fastANI --query /var/lib/cwl/stg2c78cc8c-763f-4338-b160-189bec8f6a2a/GCF_000969705.1_ASM96970v1_genomic.fna.gz --refList GCF_000969705.1_ASM96970v1_genomic.fna/target_genomes_gtdb.txt --output GCF_000969705.1_ASM96970v1_genomic.fna/fastani_result_gtdb.tsv --threads 1
[2024-01-25 18:09:19,521] [INFO] Task succeeded: fastANI
[2024-01-25 18:09:19,530] [INFO] Found 14 fastANI hits (1 hits with ANI > circumscription radius)
[2024-01-25 18:09:19,531] [INFO] GTDB search result
--------------------------------------------------------------------------------
accession	gtdb_species	ani	matched_fragments	total_fragments	gtdb_taxonomy	ani_circumscription_radius	mean_intra_species_ani	min_intra_species_ani	mean_intra_species_af	min_intra_species_af	num_clustered_genomes	status
GCF_000969705.1	s__Nitriliruptor alkaliphilus	100.0	1852	1852	d__Bacteria;p__Actinobacteriota;c__Actinomycetia;o__Nitriliruptorales;f__Nitriliruptoraceae;g__Nitriliruptor	95.0	N/A	N/A	N/A	N/A	1	conclusive
GCF_004300825.1	s__Egicoccus halophilus	78.8226	598	1852	d__Bacteria;p__Actinobacteriota;c__Actinomycetia;o__Nitriliruptorales;f__Nitriliruptoraceae;g__Egicoccus	95.0	100.00	100.00	1.00	1.00	2	-
GCA_007133465.1	s__SLMP01 sp007133465	78.6001	392	1852	d__Bacteria;p__Actinobacteriota;c__Actinomycetia;o__Nitriliruptorales;f__Nitriliruptoraceae;g__SLMP01	95.0	99.22	99.05	0.81	0.81	3	-
GCA_003558435.1	s__PWLR01 sp003558435	78.1395	465	1852	d__Bacteria;p__Actinobacteriota;c__Actinomycetia;o__Nitriliruptorales;f__Nitriliruptoraceae;g__PWLR01	95.0	99.63	99.63	0.94	0.94	2	-
GCA_007130695.1	s__SKLC01 sp007130695	78.1164	413	1852	d__Bacteria;p__Actinobacteriota;c__Actinomycetia;o__Nitriliruptorales;f__Nitriliruptoraceae;g__SKLC01	95.0	99.35	99.35	0.92	0.92	2	-
GCA_007120465.1	s__CSSed11-175R1 sp007120465	77.9743	398	1852	d__Bacteria;p__Actinobacteriota;c__Actinomycetia;o__Nitriliruptorales;f__Nitriliruptoraceae;g__CSSed11-175R1	95.0	N/A	N/A	N/A	N/A	1	-
GCA_007123425.1	s__SKLC01 sp007123425	77.9705	303	1852	d__Bacteria;p__Actinobacteriota;c__Actinomycetia;o__Nitriliruptorales;f__Nitriliruptoraceae;g__SKLC01	95.0	N/A	N/A	N/A	N/A	1	-
GCA_007136095.1	s__CSSed11-175R1 sp007136095	77.8992	359	1852	d__Bacteria;p__Actinobacteriota;c__Actinomycetia;o__Nitriliruptorales;f__Nitriliruptoraceae;g__CSSed11-175R1	95.0	99.27	97.70	0.87	0.80	7	-
GCA_007121785.1	s__SKTG01 sp007121785	77.5686	412	1852	d__Bacteria;p__Actinobacteriota;c__Actinomycetia;o__Nitriliruptorales;f__Nitriliruptoraceae;g__SKTG01	95.0	99.65	99.65	0.87	0.87	2	-
GCA_003565155.1	s__T1Sed10-7 sp003565155	77.4685	366	1852	d__Bacteria;p__Actinobacteriota;c__Actinomycetia;o__Nitriliruptorales;f__Nitriliruptoraceae;g__T1Sed10-7	95.0	N/A	N/A	N/A	N/A	1	-
GCA_007122985.1	s__SKTG01 sp007122985	77.4455	360	1852	d__Bacteria;p__Actinobacteriota;c__Actinomycetia;o__Nitriliruptorales;f__Nitriliruptoraceae;g__SKTG01	95.0	N/A	N/A	N/A	N/A	1	-
GCA_003561535.1	s__T1Sed10-7 sp003561535	77.4007	369	1852	d__Bacteria;p__Actinobacteriota;c__Actinomycetia;o__Nitriliruptorales;f__Nitriliruptoraceae;g__T1Sed10-7	95.0	98.89	95.80	0.87	0.76	8	-
GCA_007117945.1	s__T1Sed10-7 sp007117945	77.2087	294	1852	d__Bacteria;p__Actinobacteriota;c__Actinomycetia;o__Nitriliruptorales;f__Nitriliruptoraceae;g__T1Sed10-7	95.0	N/A	N/A	N/A	N/A	1	-
GCA_007693335.1	s__SKSP01 sp007693335	75.5361	162	1852	d__Bacteria;p__Actinobacteriota;c__Acidimicrobiia;o__Acidimicrobiales;f__Ilumatobacteraceae;g__SKSP01	95.0	N/A	N/A	N/A	N/A	1	-
--------------------------------------------------------------------------------
[2024-01-25 18:09:19,532] [INFO] GTDB search result was written to GCF_000969705.1_ASM96970v1_genomic.fna/result_gtdb.tsv
[2024-01-25 18:09:19,532] [INFO] ===== GTDB Search completed =====
[2024-01-25 18:09:19,536] [INFO] DFAST_QC result json was written to GCF_000969705.1_ASM96970v1_genomic.fna/dqc_result.json
[2024-01-25 18:09:19,536] [INFO] DFAST_QC completed!
[2024-01-25 18:09:19,536] [INFO] Total running time: 0h2m59s
