[2024-01-24 13:31:11,973] [INFO] DFAST_QC pipeline started.
[2024-01-24 13:31:11,975] [INFO] DFAST_QC version: 0.5.7
[2024-01-24 13:31:11,975] [INFO] DQC Reference Directory: /var/lib/cwl/stg14b2cac3-bea3-4d5c-b642-3febc3effbc4/dqc_reference
[2024-01-24 13:31:13,160] [INFO] ===== Start taxonomy check using ANI =====
[2024-01-24 13:31:13,161] [INFO] Task started: Prodigal
[2024-01-24 13:31:13,161] [INFO] Running command: gunzip -c /var/lib/cwl/stg0b9e9b83-344d-4804-b95b-9df567ef6714/GCF_014207395.1_ASM1420739v1_genomic.fna.gz | prodigal -d GCF_014207395.1_ASM1420739v1_genomic.fna/cds.fna -a GCF_014207395.1_ASM1420739v1_genomic.fna/protein.faa -g 11 -q > /dev/null
[2024-01-24 13:31:22,996] [INFO] Task succeeded: Prodigal
[2024-01-24 13:31:22,997] [INFO] Task started: HMMsearch
[2024-01-24 13:31:22,997] [INFO] Running command: hmmsearch --tblout GCF_014207395.1_ASM1420739v1_genomic.fna/hmmer_result.tsv -E 1E-50 /var/lib/cwl/stg14b2cac3-bea3-4d5c-b642-3febc3effbc4/dqc_reference/reference_markers.hmm GCF_014207395.1_ASM1420739v1_genomic.fna/protein.faa > /dev/null
[2024-01-24 13:31:23,241] [INFO] Task succeeded: HMMsearch
[2024-01-24 13:31:23,242] [INFO] Found 6/6 markers.
[2024-01-24 13:31:23,272] [INFO] Query marker FASTA was written to GCF_014207395.1_ASM1420739v1_genomic.fna/markers.fasta
[2024-01-24 13:31:23,272] [INFO] Task started: Blastn
[2024-01-24 13:31:23,272] [INFO] Running command: blastn -query GCF_014207395.1_ASM1420739v1_genomic.fna/markers.fasta -db /var/lib/cwl/stg14b2cac3-bea3-4d5c-b642-3febc3effbc4/dqc_reference/reference_markers.fasta -out GCF_014207395.1_ASM1420739v1_genomic.fna/blast.markers.tsv -outfmt 6 -max_hsps 1 -num_alignments 5
[2024-01-24 13:31:23,982] [INFO] Task succeeded: Blastn
[2024-01-24 13:31:23,986] [INFO] Selected 22 target genomes.
[2024-01-24 13:31:23,987] [INFO] Target genome list was written to GCF_014207395.1_ASM1420739v1_genomic.fna/target_genomes.txt
[2024-01-24 13:31:23,994] [INFO] Task started: fastANI
[2024-01-24 13:31:23,994] [INFO] Running command: fastANI --query /var/lib/cwl/stg0b9e9b83-344d-4804-b95b-9df567ef6714/GCF_014207395.1_ASM1420739v1_genomic.fna.gz --refList GCF_014207395.1_ASM1420739v1_genomic.fna/target_genomes.txt --output GCF_014207395.1_ASM1420739v1_genomic.fna/fastani_result.tsv --threads 1
[2024-01-24 13:31:44,209] [INFO] Task succeeded: fastANI
[2024-01-24 13:31:44,209] [INFO] Loading species specific ANI threshold from /var/lib/cwl/stg14b2cac3-bea3-4d5c-b642-3febc3effbc4/dqc_reference/prokaryote_ANI_species_specific_threshold.txt
[2024-01-24 13:31:44,210] [WARNING] Species-specific ANI threshold file not found. Will use the default threshold for all species. [/var/lib/cwl/stg14b2cac3-bea3-4d5c-b642-3febc3effbc4/dqc_reference/prokaryote_ANI_species_specific_threshold.txt]
[2024-01-24 13:31:44,225] [INFO] Found 19 fastANI hits (2 hits with ANI > threshold)
[2024-01-24 13:31:44,225] [INFO] The taxonomy check result is classified as 'conclusive'.
[2024-01-24 13:31:44,225] [INFO] DFAST Taxonomy check final result
--------------------------------------------------------------------------------
organism_name	strain	accession	taxid	species_taxid	relation_to_type	validated	ani	matched_fragments	total_fragments	ani_threshold	status
Phycisphaera mikurensis	strain=DSM 103959	GCA_014207395.1	547188	547188	type	True	100.0	1274	1275	95	conclusive
Phycisphaera mikurensis	strain=NBRC 102666	GCA_000284115.1	547188	547188	type	True	99.9915	1275	1275	95	conclusive
Algisphaera agarilytica	strain=DSM 103725	GCA_014207595.1	1385975	1385975	type	True	77.4854	260	1275	95	below_threshold
Mucisphaera calidilacus	strain=Pan265	GCA_007748075.1	2527982	2527982	type	True	76.5762	138	1275	95	below_threshold
Posidoniimonas corsicana	strain=KOR34	GCA_007859765.1	1938618	1938618	type	True	75.4064	211	1275	95	below_threshold
Paludisphaera borealis	strain=PX4	GCA_001956985.1	1387353	1387353	type	True	75.0889	163	1275	95	below_threshold
Aquisphaera giovannonii	strain=OJF2	GCA_008087625.1	406548	406548	type	True	74.9556	339	1275	95	below_threshold
Nocardioides perillae	strain=DSM 24552	GCA_013409425.1	1119534	1119534	type	True	74.9104	234	1275	95	below_threshold
Bosea thiooxidans	strain=DSM 9653	GCA_900168195.1	53254	53254	type	True	74.9002	132	1275	95	below_threshold
Actinomyces ruminicola	strain=DSM 27982	GCA_900103885.1	332524	332524	type	True	74.8704	97	1275	95	below_threshold
Actinacidiphila guanduensis	strain=CGMCC 4.2022	GCA_900103985.1	310781	310781	type	True	74.8231	322	1275	95	below_threshold
Opitutus terrae	strain=PB90-1	GCA_000019965.1	107709	107709	type	True	74.8179	108	1275	95	below_threshold
Nocardia donostiensis	strain=X1654	GCA_002081715.1	1538463	1538463	type	True	74.8135	78	1275	95	below_threshold
Saccharothrix syringae	strain=NRRL B-16468	GCA_009498035.1	103733	103733	type	True	74.8119	385	1275	95	below_threshold
Nocardiopsis gilva	strain=YIM 90087	GCA_002263495.1	280236	280236	type	True	74.8029	162	1275	95	below_threshold
Kineococcus rhizosphaerae	strain=DSM 19711	GCA_003002055.1	559628	559628	type	True	74.7925	238	1275	95	below_threshold
Saccharothrix syringae	strain=NRRL B-16468	GCA_000716755.1	103733	103733	type	True	74.7802	386	1275	95	below_threshold
Actinotalea subterranea	strain=HO-Ch2	GCA_008364845.1	2607497	2607497	type	True	74.7093	174	1275	95	below_threshold
Nocardiopsis gilva	strain=YIM 90087	GCA_000341165.1	280236	280236	type	True	74.7088	148	1275	95	below_threshold
--------------------------------------------------------------------------------
[2024-01-24 13:31:44,227] [INFO] DFAST Taxonomy check result was written to GCF_014207395.1_ASM1420739v1_genomic.fna/tc_result.tsv
[2024-01-24 13:31:44,228] [INFO] ===== Taxonomy check completed =====
[2024-01-24 13:31:44,228] [INFO] ===== Start completeness check using CheckM =====
[2024-01-24 13:31:44,228] [INFO] Setting CHECKM_DATA_PATH to /var/lib/cwl/stg14b2cac3-bea3-4d5c-b642-3febc3effbc4/dqc_reference/checkm_data
[2024-01-24 13:31:44,230] [INFO] Selected 'Prokaryote' markers (life, taxid=0) for CheckM
[2024-01-24 13:31:44,269] [INFO] Task started: CheckM
[2024-01-24 13:31:44,269] [INFO] Running command: checkm taxonomy_wf --tab_table -f GCF_014207395.1_ASM1420739v1_genomic.fna/cc_result.tsv -t 1 life "Prokaryote" GCF_014207395.1_ASM1420739v1_genomic.fna/checkm_input GCF_014207395.1_ASM1420739v1_genomic.fna/checkm_result
[2024-01-24 13:32:33,394] [INFO] Task succeeded: CheckM
[2024-01-24 13:32:33,395] [INFO] Completeness check finished.
--------------------------------------------------------------------------------
Completeness: 95.83%
Contamination: 0.00%
Strain heterogeneity: 0.00%
--------------------------------------------------------------------------------
[2024-01-24 13:32:33,413] [INFO] ===== Completeness check finished =====
[2024-01-24 13:32:33,413] [INFO] ===== Start GTDB Search =====
[2024-01-24 13:32:33,414] [INFO] Query marker FASTA already exists. Will reuse it. (GCF_014207395.1_ASM1420739v1_genomic.fna/markers.fasta)
[2024-01-24 13:32:33,414] [INFO] Task started: Blastn
[2024-01-24 13:32:33,414] [INFO] Running command: blastn -query GCF_014207395.1_ASM1420739v1_genomic.fna/markers.fasta -db /var/lib/cwl/stg14b2cac3-bea3-4d5c-b642-3febc3effbc4/dqc_reference/reference_markers_gtdb.fasta -out GCF_014207395.1_ASM1420739v1_genomic.fna/blast.markers.gtdb.tsv -outfmt 6 -max_hsps 1 -num_alignments 5
[2024-01-24 13:32:34,580] [INFO] Task succeeded: Blastn
[2024-01-24 13:32:34,583] [INFO] Selected 25 target genomes.
[2024-01-24 13:32:34,583] [INFO] Target genome list was written to GCF_014207395.1_ASM1420739v1_genomic.fna/target_genomes_gtdb.txt
[2024-01-24 13:32:34,602] [INFO] Task started: fastANI
[2024-01-24 13:32:34,602] [INFO] Running command: fastANI --query /var/lib/cwl/stg0b9e9b83-344d-4804-b95b-9df567ef6714/GCF_014207395.1_ASM1420739v1_genomic.fna.gz --refList GCF_014207395.1_ASM1420739v1_genomic.fna/target_genomes_gtdb.txt --output GCF_014207395.1_ASM1420739v1_genomic.fna/fastani_result_gtdb.tsv --threads 1
[2024-01-24 13:32:52,782] [INFO] Task succeeded: fastANI
[2024-01-24 13:32:52,798] [INFO] Found 20 fastANI hits (1 hit with ANI > circumscription radius)
[2024-01-24 13:32:52,799] [INFO] GTDB search result
--------------------------------------------------------------------------------
accession	gtdb_species	ani	matched_fragments	total_fragments	gtdb_taxonomy	ani_circumscription_radius	mean_intra_species_ani	min_intra_species_ani	mean_intra_species_af	min_intra_species_af	num_clustered_genomes	status
GCF_000284115.1	s__Phycisphaera mikurensis	99.9915	1275	1275	d__Bacteria;p__Planctomycetota;c__Phycisphaerae;o__Phycisphaerales;f__Phycisphaeraceae;g__Phycisphaera	95.0	99.99	99.99	1.00	1.00	2	conclusive
GCF_014207595.1	s__Algisphaera agarilytica	77.5062	258	1275	d__Bacteria;p__Planctomycetota;c__Phycisphaerae;o__Phycisphaerales;f__Phycisphaeraceae;g__Algisphaera	95.0	N/A	N/A	N/A	N/A	1	-
GCA_007748075.1	s__Pan265 sp007748075	76.5872	137	1275	d__Bacteria;p__Planctomycetota;c__Phycisphaerae;o__Phycisphaerales;f__Phycisphaeraceae;g__Pan265	95.0	N/A	N/A	N/A	N/A	1	-
GCA_018970045.1	s__SYIB01 sp018970045	76.3672	72	1275	d__Bacteria;p__Planctomycetota;c__Phycisphaerae;o__Phycisphaerales;f__SM1A02;g__SYIB01	95.0	N/A	N/A	N/A	N/A	1	-
GCA_007693965.1	s__RECQ01 sp007693965	75.9134	137	1275	d__Bacteria;p__Planctomycetota;c__Phycisphaerae;o__Phycisphaerales;f__UBA1924;g__RECQ01	95.0	N/A	N/A	N/A	N/A	1	-
GCA_007693765.1	s__RECZ01 sp007693765	75.8469	158	1275	d__Bacteria;p__Planctomycetota;c__Phycisphaerae;o__Phycisphaerales;f__UBA1924;g__RECZ01	95.0	N/A	N/A	N/A	N/A	1	-
GCA_007694925.1	s__SLFH01 sp007694925	75.8338	123	1275	d__Bacteria;p__Planctomycetota;c__Phycisphaerae;o__Phycisphaerales;f__UBA1924;g__SLFH01	95.0	N/A	N/A	N/A	N/A	1	-
GCA_018883105.1	s__UBA966 sp018883105	75.7709	147	1275	d__Bacteria;p__Planctomycetota;c__Phycisphaerae;o__Phycisphaerales;f__SM1A02;g__UBA966	95.0	N/A	N/A	N/A	N/A	1	-
GCA_002686995.1	s__GCA-2686995 sp002686995	75.7332	180	1275	d__Bacteria;p__Planctomycetota;c__Phycisphaerae;o__Phycisphaerales;f__Phycisphaeraceae;g__GCA-2686995	95.0	N/A	N/A	N/A	N/A	1	-
GCA_016742075.1	s__JACVCL01 sp016742075	75.6633	194	1275	d__Bacteria;p__Planctomycetota;c__Phycisphaerae;o__Phycisphaerales;f__UBA1924;g__JACVCL01	95.0	N/A	N/A	N/A	N/A	1	-
GCA_903927725.1	s__SYIB01 sp903927725	75.6366	77	1275	d__Bacteria;p__Planctomycetota;c__Phycisphaerae;o__Phycisphaerales;f__SM1A02;g__SYIB01	95.0	N/A	N/A	N/A	N/A	1	-
GCA_009936695.1	s__UBA12014 sp009936695	75.6265	80	1275	d__Bacteria;p__Planctomycetota;c__Phycisphaerae;o__Phycisphaerales;f__SM1A02;g__UBA12014	95.0	N/A	N/A	N/A	N/A	1	-
GCA_012729815.1	s__JAAYCJ01 sp012729815	75.2648	75	1275	d__Bacteria;p__Planctomycetota;c__Phycisphaerae;o__JAAYCJ01;f__JAAYCJ01;g__JAAYCJ01	95.0	N/A	N/A	N/A	N/A	1	-
GCF_002865585.1	s__Nocardioides houyundeii	74.8512	136	1275	d__Bacteria;p__Actinobacteriota;c__Actinomycetia;o__Propionibacteriales;f__Nocardioidaceae;g__Nocardioides	95.0	97.48	96.00	0.92	0.88	3	-
GCF_004135735.1	s__Sorangium cellulosum_F	74.8363	378	1275	d__Bacteria;p__Myxococcota;c__Polyangia;o__Polyangiales;f__Polyangiaceae;g__Sorangium	95.0	N/A	N/A	N/A	N/A	1	-
GCA_001464385.1	s__Ga0077550 sp001464385	74.799	309	1275	d__Bacteria;p__Myxococcota;c__Polyangia;o__Nannocystales;f__Nannocystaceae;g__Ga0077550	95.0	N/A	N/A	N/A	N/A	1	-
GCF_900107175.1	s__Modestobacter sp900107175	74.7899	202	1275	d__Bacteria;p__Actinobacteriota;c__Actinomycetia;o__Mycobacteriales;f__Geodermatophilaceae;g__Modestobacter	95.0	N/A	N/A	N/A	N/A	1	-
GCA_009379955.1	s__WHST01 sp009379955	74.7486	221	1275	d__Bacteria;p__Actinobacteriota;c__Actinomycetia;o__Streptosporangiales;f__WHST01;g__WHST01	95.0	N/A	N/A	N/A	N/A	1	-
GCF_001665395.1	s__Mycobacterium sp001665395	74.7072	72	1275	d__Bacteria;p__Actinobacteriota;c__Actinomycetia;o__Mycobacteriales;f__Mycobacteriaceae;g__Mycobacterium	95.0	N/A	N/A	N/A	N/A	1	-
GCA_016210845.1	s__Gp6-AA40 sp016210845	74.6756	73	1275	d__Bacteria;p__Acidobacteriota;c__Vicinamibacteria;o__Vicinamibacterales;f__UBA2999;g__Gp6-AA40	95.0	N/A	N/A	N/A	N/A	1	-
--------------------------------------------------------------------------------
[2024-01-24 13:32:52,802] [INFO] GTDB search result was written to GCF_014207395.1_ASM1420739v1_genomic.fna/result_gtdb.tsv
[2024-01-24 13:32:52,802] [INFO] ===== GTDB Search completed =====
[2024-01-24 13:32:52,825] [INFO] DFAST_QC result json was written to GCF_014207395.1_ASM1420739v1_genomic.fna/dqc_result.json
[2024-01-24 13:32:52,826] [INFO] DFAST_QC completed!
[2024-01-24 13:32:52,826] [INFO] Total running time: 0h1m41s
