[2024-01-24 13:27:58,705] [INFO] DFAST_QC pipeline started.
[2024-01-24 13:27:58,706] [INFO] DFAST_QC version: 0.5.7
[2024-01-24 13:27:58,706] [INFO] DQC Reference Directory: /var/lib/cwl/stg5af8a8c1-14fc-4474-a517-612e4b5a6424/dqc_reference
[2024-01-24 13:27:59,955] [INFO] ===== Start taxonomy check using ANI =====
[2024-01-24 13:27:59,956] [INFO] Task started: Prodigal
[2024-01-24 13:27:59,956] [INFO] Running command: gunzip -c /var/lib/cwl/stg8b336b5d-c063-414f-b731-17129ffdf4cd/GCF_000340885.1_ASM34088v1_genomic.fna.gz | prodigal -d GCF_000340885.1_ASM34088v1_genomic.fna/cds.fna -a GCF_000340885.1_ASM34088v1_genomic.fna/protein.faa -g 11 -q > /dev/null
[2024-01-24 13:28:10,829] [INFO] Task succeeded: Prodigal
[2024-01-24 13:28:10,829] [INFO] Task started: HMMsearch
[2024-01-24 13:28:10,829] [INFO] Running command: hmmsearch --tblout GCF_000340885.1_ASM34088v1_genomic.fna/hmmer_result.tsv -E 1E-50 /var/lib/cwl/stg5af8a8c1-14fc-4474-a517-612e4b5a6424/dqc_reference/reference_markers.hmm GCF_000340885.1_ASM34088v1_genomic.fna/protein.faa > /dev/null
[2024-01-24 13:28:11,231] [INFO] Task succeeded: HMMsearch
[2024-01-24 13:28:11,233] [INFO] Found 6/6 markers.
[2024-01-24 13:28:11,287] [INFO] Query marker FASTA was written to GCF_000340885.1_ASM34088v1_genomic.fna/markers.fasta
[2024-01-24 13:28:11,287] [INFO] Task started: Blastn
[2024-01-24 13:28:11,287] [INFO] Running command: blastn -query GCF_000340885.1_ASM34088v1_genomic.fna/markers.fasta -db /var/lib/cwl/stg5af8a8c1-14fc-4474-a517-612e4b5a6424/dqc_reference/reference_markers.fasta -out GCF_000340885.1_ASM34088v1_genomic.fna/blast.markers.tsv -outfmt 6 -max_hsps 1 -num_alignments 5
[2024-01-24 13:28:11,889] [INFO] Task succeeded: Blastn
[2024-01-24 13:28:11,892] [INFO] Selected 17 target genomes.
[2024-01-24 13:28:11,893] [INFO] Target genome list was written to GCF_000340885.1_ASM34088v1_genomic.fna/target_genomes.txt
[2024-01-24 13:28:11,921] [INFO] Task started: fastANI
[2024-01-24 13:28:11,921] [INFO] Running command: fastANI --query /var/lib/cwl/stg8b336b5d-c063-414f-b731-17129ffdf4cd/GCF_000340885.1_ASM34088v1_genomic.fna.gz --refList GCF_000340885.1_ASM34088v1_genomic.fna/target_genomes.txt --output GCF_000340885.1_ASM34088v1_genomic.fna/fastani_result.tsv --threads 1
[2024-01-24 13:28:37,371] [INFO] Task succeeded: fastANI
[2024-01-24 13:28:37,371] [INFO] Loading species specific ANI threshold from /var/lib/cwl/stg5af8a8c1-14fc-4474-a517-612e4b5a6424/dqc_reference/prokaryote_ANI_species_specific_threshold.txt
[2024-01-24 13:28:37,372] [WARNING] Species-specific ANI threshold file not found. Will use the default threshold for all species. [/var/lib/cwl/stg5af8a8c1-14fc-4474-a517-612e4b5a6424/dqc_reference/prokaryote_ANI_species_specific_threshold.txt]
[2024-01-24 13:28:37,386] [INFO] Found 17 fastANI hits (2 hits with ANI > threshold)
[2024-01-24 13:28:37,386] [INFO] The taxonomy check result is classified as 'conclusive'.
[2024-01-24 13:28:37,386] [INFO] DFAST Taxonomy check final result
--------------------------------------------------------------------------------
organism_name	strain	accession	taxid	species_taxid	relation_to_type	validated	ani	matched_fragments	total_fragments	ani_threshold	status
Clostridium saccharoperbutylacetonicum	strain=N1-4(HMT)	GCA_000340885.1	36745	36745	type	True	100.0	2221	2221	95	conclusive
Clostridium saccharoperbutylacetonicum	strain=ATCC 27021	GCA_000334435.1	36745	36745	type	True	99.943	2085	2221	95	conclusive
Clostridium puniceum	strain=DSM 2619	GCA_002006345.1	29367	29367	type	True	81.0062	991	2221	95	below_threshold
Clostridium gelidum	strain=C5S11	GCA_019977655.1	704125	704125	type	True	80.9094	954	2221	95	below_threshold
Clostridium diolis	strain=DSM 15410	GCA_008705175.1	223919	223919	suspected-type	True	80.902	963	2221	95	below_threshold
Clostridium beijerinckii	strain=NCTC13035	GCA_900447025.1	1520	1520	suspected-type	True	80.8066	940	2221	95	below_threshold
Clostridium chromiireducens	strain=DSM 23318	GCA_002029255.1	225345	225345	type	True	80.7722	866	2221	95	below_threshold
Clostridium beijerinckii	strain=DSM 791	GCA_018223745.1	1520	1520	suspected-type	True	80.7569	934	2221	95	below_threshold
Clostridium beijerinckii	strain=DSM 791	GCA_002006445.1	1520	1520	suspected-type	True	80.6192	894	2221	95	below_threshold
Clostridium saccharobutylicum	strain=DSM 13864	GCA_000473995.1	169679	169679	type	True	80.3381	760	2221	95	below_threshold
Clostridium saccharobutylicum	strain=DSM 13864	GCA_001657435.1	169679	169679	type	True	80.0488	679	2221	95	below_threshold
Clostridium weizhouense	strain=YB-6	GCA_019431045.1	2859781	2859781	type	True	78.6737	512	2221	95	below_threshold
Clostridium zeae	strain=CSC2	GCA_017312485.1	2759022	2759022	type	True	76.7542	277	2221	95	below_threshold
Clostridium mobile	strain=MSJ-11	GCA_018918285.1	2841512	2841512	type	True	75.9114	137	2221	95	below_threshold
Malaciobacter canalis	strain=LMG 29148	GCA_008000835.1	1912871	1912871	type	True	75.6137	71	2221	95	below_threshold
Anaerovirgula multivorans	strain=SCA	GCA_900188145.1	312168	312168	type	True	75.356	67	2221	95	below_threshold
Malaciobacter canalis	strain=F138-33	GCA_002723485.1	1912871	1912871	type	True	74.7856	59	2221	95	below_threshold
--------------------------------------------------------------------------------
[2024-01-24 13:28:37,392] [INFO] DFAST Taxonomy check result was written to GCF_000340885.1_ASM34088v1_genomic.fna/tc_result.tsv
[2024-01-24 13:28:37,394] [INFO] ===== Taxonomy check completed =====
[2024-01-24 13:28:37,394] [INFO] ===== Start completeness check using CheckM =====
[2024-01-24 13:28:37,394] [INFO] Setting CHECKM_DATA_PATH to /var/lib/cwl/stg5af8a8c1-14fc-4474-a517-612e4b5a6424/dqc_reference/checkm_data
[2024-01-24 13:28:37,398] [INFO] Selected 'Prokaryote' markers (life, taxid=0) for CheckM
[2024-01-24 13:28:37,499] [INFO] Task started: CheckM
[2024-01-24 13:28:37,499] [INFO] Running command: checkm taxonomy_wf --tab_table -f GCF_000340885.1_ASM34088v1_genomic.fna/cc_result.tsv -t 1 life "Prokaryote" GCF_000340885.1_ASM34088v1_genomic.fna/checkm_input GCF_000340885.1_ASM34088v1_genomic.fna/checkm_result
[2024-01-24 13:29:14,946] [INFO] Task succeeded: CheckM
[2024-01-24 13:29:14,947] [INFO] Completeness check finished.
--------------------------------------------------------------------------------
Completeness: 100.00%
Contamination: 0.00%
Strain heterogeneity: 0.00%
--------------------------------------------------------------------------------
[2024-01-24 13:29:14,964] [INFO] ===== Completeness check finished =====
[2024-01-24 13:29:14,965] [INFO] ===== Start GTDB Search =====
[2024-01-24 13:29:14,965] [INFO] Query marker FASTA already exists. Will reuse it. (GCF_000340885.1_ASM34088v1_genomic.fna/markers.fasta)
[2024-01-24 13:29:14,965] [INFO] Task started: Blastn
[2024-01-24 13:29:14,966] [INFO] Running command: blastn -query GCF_000340885.1_ASM34088v1_genomic.fna/markers.fasta -db /var/lib/cwl/stg5af8a8c1-14fc-4474-a517-612e4b5a6424/dqc_reference/reference_markers_gtdb.fasta -out GCF_000340885.1_ASM34088v1_genomic.fna/blast.markers.gtdb.tsv -outfmt 6 -max_hsps 1 -num_alignments 5
[2024-01-24 13:29:15,798] [INFO] Task succeeded: Blastn
[2024-01-24 13:29:15,802] [INFO] Selected 16 target genomes.
[2024-01-24 13:29:15,802] [INFO] Target genome list was written to GCF_000340885.1_ASM34088v1_genomic.fna/target_genomes_gtdb.txt
[2024-01-24 13:29:15,835] [INFO] Task started: fastANI
[2024-01-24 13:29:15,835] [INFO] Running command: fastANI --query /var/lib/cwl/stg8b336b5d-c063-414f-b731-17129ffdf4cd/GCF_000340885.1_ASM34088v1_genomic.fna.gz --refList GCF_000340885.1_ASM34088v1_genomic.fna/target_genomes_gtdb.txt --output GCF_000340885.1_ASM34088v1_genomic.fna/fastani_result_gtdb.tsv --threads 1
[2024-01-24 13:29:37,200] [INFO] Task succeeded: fastANI
[2024-01-24 13:29:37,216] [INFO] Found 15 fastANI hits (1 hit with ANI > circumscription radius)
[2024-01-24 13:29:37,216] [INFO] GTDB search result
--------------------------------------------------------------------------------
accession	gtdb_species	ani	matched_fragments	total_fragments	gtdb_taxonomy	ani_circumscription_radius	mean_intra_species_ani	min_intra_species_ani	mean_intra_species_af	min_intra_species_af	num_clustered_genomes	status
GCF_000340885.1	s__Clostridium saccharoperbutylacetonicum	100.0	2221	2221	d__Bacteria;p__Firmicutes_A;c__Clostridia;o__Clostridiales;f__Clostridiaceae;g__Clostridium	95.0	99.35	98.00	0.96	0.89	7	conclusive
GCF_009928485.1	s__Clostridium sp009928485	81.0395	967	2221	d__Bacteria;p__Firmicutes_A;c__Clostridia;o__Clostridiales;f__Clostridiaceae;g__Clostridium	95.0	N/A	N/A	N/A	N/A	1	-
GCF_002006345.1	s__Clostridium puniceum	80.9978	991	2221	d__Bacteria;p__Firmicutes_A;c__Clostridia;o__Clostridiales;f__Clostridiaceae;g__Clostridium	95.0	N/A	N/A	N/A	N/A	1	-
GCA_003129525.1	s__Clostridium beijerinckii_D	80.9553	759	2221	d__Bacteria;p__Firmicutes_A;c__Clostridia;o__Clostridiales;f__Clostridiaceae;g__Clostridium	95.0	N/A	N/A	N/A	N/A	1	-
GCF_000230835.1	s__Clostridium sp000230835	80.903	987	2221	d__Bacteria;p__Firmicutes_A;c__Clostridia;o__Clostridiales;f__Clostridiaceae;g__Clostridium	95.0	96.12	96.12	0.82	0.82	2	-
GCF_018223745.1	s__Clostridium beijerinckii	80.8034	932	2221	d__Bacteria;p__Firmicutes_A;c__Clostridia;o__Clostridiales;f__Clostridiaceae;g__Clostridium	95.0	97.01	95.18	0.85	0.79	244	-
GCF_002029255.1	s__Clostridium chromiireducens	80.7891	862	2221	d__Bacteria;p__Firmicutes_A;c__Clostridia;o__Clostridiales;f__Clostridiaceae;g__Clostridium	95.0	98.07	97.40	0.90	0.87	3	-
GCF_000621745.1	s__Clostridium beijerinckii_A	80.7571	933	2221	d__Bacteria;p__Firmicutes_A;c__Clostridia;o__Clostridiales;f__Clostridiaceae;g__Clostridium	95.0	N/A	N/A	N/A	N/A	1	-
GCF_002760435.1	s__Clostridium sp002760435	80.5857	857	2221	d__Bacteria;p__Firmicutes_A;c__Clostridia;o__Clostridiales;f__Clostridiaceae;g__Clostridium	95.0	99.99	99.99	0.98	0.98	2	-
GCF_000473995.1	s__Clostridium saccharobutylicum	80.3016	760	2221	d__Bacteria;p__Firmicutes_A;c__Clostridia;o__Clostridiales;f__Clostridiaceae;g__Clostridium	95.0	99.89	95.82	0.99	0.83	65	-
GCF_012843905.1	s__Clostridium sp012843905	78.4446	394	2221	d__Bacteria;p__Firmicutes_A;c__Clostridia;o__Clostridiales;f__Clostridiaceae;g__Clostridium	95.0	N/A	N/A	N/A	N/A	1	-
GCA_003539755.1	s__Clostridium sp003539755	78.0708	296	2221	d__Bacteria;p__Firmicutes_A;c__Clostridia;o__Clostridiales;f__Clostridiaceae;g__Clostridium	95.0	N/A	N/A	N/A	N/A	1	-
GCF_900129365.1	s__Clostridium_AH fallax	76.5369	219	2221	d__Bacteria;p__Firmicutes_A;c__Clostridia;o__Clostridiales;f__Clostridiaceae;g__Clostridium_AH	95.0	100.00	100.00	1.00	1.00	2	-
GCF_018918285.1	s__MSJ-11 sp018918285	75.9114	137	2221	d__Bacteria;p__Firmicutes_A;c__Clostridia;o__Clostridiales;f__Clostridiaceae;g__MSJ-11	95.0	N/A	N/A	N/A	N/A	1	-
GCF_900188145.1	s__Anaerovirgula multivorans	75.3228	66	2221	d__Bacteria;p__Firmicutes_A;c__Clostridia;o__Peptostreptococcales;f__Natronincolaceae;g__Anaerovirgula	95.0	N/A	N/A	N/A	N/A	1	-
--------------------------------------------------------------------------------
[2024-01-24 13:29:37,219] [INFO] GTDB search result was written to GCF_000340885.1_ASM34088v1_genomic.fna/result_gtdb.tsv
[2024-01-24 13:29:37,219] [INFO] ===== GTDB Search completed =====
[2024-01-24 13:29:37,225] [INFO] DFAST_QC result json was written to GCF_000340885.1_ASM34088v1_genomic.fna/dqc_result.json
[2024-01-24 13:29:37,225] [INFO] DFAST_QC completed!
[2024-01-24 13:29:37,225] [INFO] Total running time: 0h1m39s
