[2023-06-30 10:22:06,238] [INFO] DFAST_QC pipeline started.
[2023-06-30 10:22:06,261] [INFO] DFAST_QC version: 0.5.7
[2023-06-30 10:22:06,261] [INFO] DQC Reference Directory: /var/lib/cwl/stg52036e55-cfce-41c9-a40c-f517f4c43c85/dqc_reference
[2023-06-30 10:22:07,863] [INFO] ===== Start taxonomy check using ANI =====
[2023-06-30 10:22:07,864] [INFO] Task started: Prodigal
[2023-06-30 10:22:07,864] [INFO] Running command: gunzip -c /var/lib/cwl/stgb8f317f1-4aa5-40dd-8505-a1814f4ff6c1/GCA_024277255.1_ASM2427725v1_genomic.fna.gz | prodigal -d GCA_024277255.1_ASM2427725v1_genomic.fna/cds.fna -a GCA_024277255.1_ASM2427725v1_genomic.fna/protein.faa -g 11 -q > /dev/null
[2023-06-30 10:22:30,637] [INFO] Task succeeded: Prodigal
[2023-06-30 10:22:30,637] [INFO] Task started: HMMsearch
[2023-06-30 10:22:30,637] [INFO] Running command: hmmsearch --tblout GCA_024277255.1_ASM2427725v1_genomic.fna/hmmer_result.tsv -E 1E-50 /var/lib/cwl/stg52036e55-cfce-41c9-a40c-f517f4c43c85/dqc_reference/reference_markers.hmm GCA_024277255.1_ASM2427725v1_genomic.fna/protein.faa > /dev/null
[2023-06-30 10:22:30,909] [INFO] Task succeeded: HMMsearch
[2023-06-30 10:22:30,910] [INFO] Found 6/6 markers.
[2023-06-30 10:22:30,952] [INFO] Query marker FASTA was written to GCA_024277255.1_ASM2427725v1_genomic.fna/markers.fasta
[2023-06-30 10:22:30,953] [INFO] Task started: Blastn
[2023-06-30 10:22:30,953] [INFO] Running command: blastn -query GCA_024277255.1_ASM2427725v1_genomic.fna/markers.fasta -db /var/lib/cwl/stg52036e55-cfce-41c9-a40c-f517f4c43c85/dqc_reference/reference_markers.fasta -out GCA_024277255.1_ASM2427725v1_genomic.fna/blast.markers.tsv -outfmt 6 -max_hsps 1 -num_alignments 5
[2023-06-30 10:22:31,517] [INFO] Task succeeded: Blastn
[2023-06-30 10:22:31,522] [INFO] Selected 29 target genomes.
[2023-06-30 10:22:31,522] [INFO] Target genome list was written to GCA_024277255.1_ASM2427725v1_genomic.fna/target_genomes.txt
[2023-06-30 10:22:31,523] [INFO] Task started: fastANI
[2023-06-30 10:22:31,523] [INFO] Running command: fastANI --query /var/lib/cwl/stgb8f317f1-4aa5-40dd-8505-a1814f4ff6c1/GCA_024277255.1_ASM2427725v1_genomic.fna.gz --refList GCA_024277255.1_ASM2427725v1_genomic.fna/target_genomes.txt --output GCA_024277255.1_ASM2427725v1_genomic.fna/fastani_result.tsv --threads 1
[2023-06-30 10:22:53,251] [INFO] Task succeeded: fastANI
[2023-06-30 10:22:53,251] [INFO] Loading species specific ANI threshold from /var/lib/cwl/stg52036e55-cfce-41c9-a40c-f517f4c43c85/dqc_reference/prokaryote_ANI_species_specific_threshold.txt
[2023-06-30 10:22:53,252] [WARNING] Species-specific ANI threshold file not found. Will use the default threshold for all species. [/var/lib/cwl/stg52036e55-cfce-41c9-a40c-f517f4c43c85/dqc_reference/prokaryote_ANI_species_specific_threshold.txt]
[2023-06-30 10:22:53,262] [INFO] Found 15 fastANI hits (0 hits with ANI > threshold)
[2023-06-30 10:22:53,262] [INFO] The taxonomy check result is classified as 'below_threshold'.
[2023-06-30 10:22:53,262] [INFO] DFAST Taxonomy check final result
--------------------------------------------------------------------------------
organism_name	strain	accession	taxid	species_taxid	relation_to_type	validated	ani	matched_fragments	total_fragments	ani_threshold	status
Desulfuromonas versatilis	strain=NIT-T3	GCA_019704135.1	2802975	2802975	type	True	75.5681	62	1882	95	below_threshold
Fulvimonas soli	strain=LMG 19981	GCA_006352285.1	155197	155197	type	True	75.2425	66	1882	95	below_threshold
Halomonas ventosae	strain=CECT 5797	GCA_004363555.1	229007	229007	type	True	75.2019	69	1882	95	below_threshold
Halomonas aestuarii	strain=Hb3	GCA_001886615.1	1897729	1897729	type	True	75.132	64	1882	95	below_threshold
Pseudomonas lalucatii	strain=R1b54	GCA_018398425.1	1424203	1424203	type	True	75.1054	64	1882	95	below_threshold
Pseudomonas rhizoryzae	strain=RY24	GCA_005250615.1	2571129	2571129	type	True	75.0845	54	1882	95	below_threshold
Fulvimonas soli	strain=DSM 14263	GCA_003148905.1	155197	155197	type	True	75.0365	73	1882	95	below_threshold
Plasticicumulans lactativorans	strain=DSM 25287	GCA_004341245.1	1133106	1133106	type	True	74.9709	115	1882	95	below_threshold
Halomonas lysinitropha	strain=3(2)	GCA_902500215.1	2607506	2607506	type	True	74.9692	55	1882	95	below_threshold
Halomonas shengliensis	strain=CGMCC 1.6444	GCA_900104135.1	419597	419597	type	True	74.9386	65	1882	95	below_threshold
Prescottella defluvii	strain=Ca11	GCA_000738775.1	1323361	1323361	type	True	74.8297	50	1882	95	below_threshold
Thauera butanivorans	strain=NBRC 103042	GCA_001591165.1	86174	86174	type	True	74.8223	67	1882	95	below_threshold
Frankia inefficax	strain=EuI1c	GCA_000166135.1	298654	298654	type	True	74.7577	109	1882	95	below_threshold
Frankia asymbiotica	strain=NRRL B-16386	GCA_001983105.1	1834516	1834516	type	True	74.7421	111	1882	95	below_threshold
Nocardia colli	strain=CICC 11023	GCA_008704205.1	2545717	2545717	type	True	74.6476	82	1882	95	below_threshold
--------------------------------------------------------------------------------
[2023-06-30 10:22:53,264] [INFO] DFAST Taxonomy check result was written to GCA_024277255.1_ASM2427725v1_genomic.fna/tc_result.tsv
[2023-06-30 10:22:53,264] [INFO] ===== Taxonomy check completed =====
[2023-06-30 10:22:53,264] [INFO] ===== Start completeness check using CheckM =====
[2023-06-30 10:22:53,264] [INFO] Setting CHECKM_DATA_PATH to /var/lib/cwl/stg52036e55-cfce-41c9-a40c-f517f4c43c85/dqc_reference/checkm_data
[2023-06-30 10:22:53,265] [INFO] Selected 'Prokaryote' markers (life, taxid=0) for CheckM
[2023-06-30 10:22:53,319] [INFO] Task started: CheckM
[2023-06-30 10:22:53,319] [INFO] Running command: checkm taxonomy_wf --tab_table -f GCA_024277255.1_ASM2427725v1_genomic.fna/cc_result.tsv -t 1 life "Prokaryote" GCA_024277255.1_ASM2427725v1_genomic.fna/checkm_input GCA_024277255.1_ASM2427725v1_genomic.fna/checkm_result
[2023-06-30 10:23:56,441] [INFO] Task succeeded: CheckM
[2023-06-30 10:23:56,442] [INFO] Completeness check finished.
--------------------------------------------------------------------------------
Completeness: 100.00%
Contamination: 12.50%
Strain heterogeneity: 33.33%
--------------------------------------------------------------------------------
[2023-06-30 10:23:56,460] [INFO] ===== Completeness check finished =====
[2023-06-30 10:23:56,460] [INFO] ===== Start GTDB Search =====
[2023-06-30 10:23:56,461] [INFO] Query marker FASTA already exists. Will reuse it. (GCA_024277255.1_ASM2427725v1_genomic.fna/markers.fasta)
[2023-06-30 10:23:56,461] [INFO] Task started: Blastn
[2023-06-30 10:23:56,461] [INFO] Running command: blastn -query GCA_024277255.1_ASM2427725v1_genomic.fna/markers.fasta -db /var/lib/cwl/stg52036e55-cfce-41c9-a40c-f517f4c43c85/dqc_reference/reference_markers_gtdb.fasta -out GCA_024277255.1_ASM2427725v1_genomic.fna/blast.markers.gtdb.tsv -outfmt 6 -max_hsps 1 -num_alignments 5
[2023-06-30 10:23:57,291] [INFO] Task succeeded: Blastn
[2023-06-30 10:23:57,294] [INFO] Selected 28 target genomes.
[2023-06-30 10:23:57,294] [INFO] Target genome list was written to GCA_024277255.1_ASM2427725v1_genomic.fna/target_genomes_gtdb.txt
[2023-06-30 10:23:57,296] [INFO] Task started: fastANI
[2023-06-30 10:23:57,296] [INFO] Running command: fastANI --query /var/lib/cwl/stgb8f317f1-4aa5-40dd-8505-a1814f4ff6c1/GCA_024277255.1_ASM2427725v1_genomic.fna.gz --refList GCA_024277255.1_ASM2427725v1_genomic.fna/target_genomes_gtdb.txt --output GCA_024277255.1_ASM2427725v1_genomic.fna/fastani_result_gtdb.tsv --threads 1
[2023-06-30 10:24:15,730] [INFO] Task succeeded: fastANI
[2023-06-30 10:24:15,749] [INFO] Found 19 fastANI hits (0 hits with ANI > circumscription radius)
[2023-06-30 10:24:15,749] [INFO] GTDB search result
--------------------------------------------------------------------------------
accession	gtdb_species	ani	matched_fragments	total_fragments	gtdb_taxonomy	ani_circumscription_radius	mean_intra_species_ani	min_intra_species_ani	mean_intra_species_af	min_intra_species_af	num_clustered_genomes	status
GCA_003231035.1	s__SZUA-115 sp003231035	81.3092	812	1882	d__Bacteria;p__Acidobacteriota;c__Thermoanaerobaculia;o__UBA5704;f__UBA5704;g__SZUA-115	95.0	N/A	N/A	N/A	N/A	1	-
GCA_016699405.1	s__JAAYLR01 sp016699405	76.3971	197	1882	d__Bacteria;p__Acidobacteriota;c__Thermoanaerobaculia;o__UBA5704;f__UBA5704;g__JAAYLR01	95.0	N/A	N/A	N/A	N/A	1	-
GCA_017998715.1	s__JAGPDF01 sp017998715	76.3398	135	1882	d__Bacteria;p__Acidobacteriota;c__Thermoanaerobaculia;o__UBA5704;f__UBA5704;g__JAGPDF01	95.0	N/A	N/A	N/A	N/A	1	-
GCA_012521815.1	s__JAAYLR01 sp012521815	76.257	136	1882	d__Bacteria;p__Acidobacteriota;c__Thermoanaerobaculia;o__UBA5704;f__UBA5704;g__JAAYLR01	95.0	99.12	98.38	0.89	0.78	5	-
GCA_009843505.1	s__WTGL01 sp009843505	76.253	105	1882	d__Bacteria;p__Acidobacteriota;c__Thermoanaerobaculia;o__UBA5704;f__QQVD01;g__WTGL01	95.0	99.97	99.97	0.99	0.99	2	-
GCA_011525905.1	s__JACTMI01 sp011525905	76.2006	157	1882	d__Bacteria;p__Acidobacteriota;c__Thermoanaerobaculia;o__UBA5704;f__UBA5704;g__JACTMI01	95.0	N/A	N/A	N/A	N/A	1	-
GCA_009837085.1	s__WTGL01 sp009837085	76.181	106	1882	d__Bacteria;p__Acidobacteriota;c__Thermoanaerobaculia;o__UBA5704;f__QQVD01;g__WTGL01	95.0	99.78	99.78	0.96	0.95	3	-
GCA_003388555.1	s__QQVD01 sp003388555	76.1731	212	1882	d__Bacteria;p__Acidobacteriota;c__Thermoanaerobaculia;o__UBA5704;f__QQVD01;g__QQVD01	95.0	99.97	99.97	0.99	0.99	2	-
GCA_018057705.1	s__JAGPDF01 sp018057705	76.1644	163	1882	d__Bacteria;p__Acidobacteriota;c__Thermoanaerobaculia;o__UBA5704;f__UBA5704;g__JAGPDF01	95.0	99.73	99.73	0.94	0.94	2	-
GCA_017860005.1	s__JACTMI01 sp017860005	76.1211	142	1882	d__Bacteria;p__Acidobacteriota;c__Thermoanaerobaculia;o__UBA5704;f__UBA5704;g__JACTMI01	95.0	N/A	N/A	N/A	N/A	1	-
GCA_905479655.1	s__CAJQNK01 sp905479655	76.1049	176	1882	d__Bacteria;p__Acidobacteriota;c__Thermoanaerobaculia;o__UBA5704;f__UBA5704;g__CAJQNK01	95.0	N/A	N/A	N/A	N/A	1	-
GCA_011525365.1	s__WTGL01 sp011525365	76.0659	117	1882	d__Bacteria;p__Acidobacteriota;c__Thermoanaerobaculia;o__UBA5704;f__QQVD01;g__WTGL01	95.0	N/A	N/A	N/A	N/A	1	-
GCA_009837885.1	s__WTGL01 sp009837885	76.0551	117	1882	d__Bacteria;p__Acidobacteriota;c__Thermoanaerobaculia;o__UBA5704;f__QQVD01;g__WTGL01	95.0	99.92	99.92	0.99	0.99	2	-
GCA_012270995.1	s__WTGL01 sp012270995	75.8931	107	1882	d__Bacteria;p__Acidobacteriota;c__Thermoanaerobaculia;o__UBA5704;f__QQVD01;g__WTGL01	95.0	N/A	N/A	N/A	N/A	1	-
GCA_002420005.1	s__UBA5704 sp002420005	75.8826	232	1882	d__Bacteria;p__Acidobacteriota;c__Thermoanaerobaculia;o__UBA5704;f__UBA5704;g__UBA5704	95.0	N/A	N/A	N/A	N/A	1	-
GCA_011523565.1	s__WTGL01 sp011523565	75.7725	72	1882	d__Bacteria;p__Acidobacteriota;c__Thermoanaerobaculia;o__UBA5704;f__QQVD01;g__WTGL01	95.0	N/A	N/A	N/A	N/A	1	-
GCA_019247495.1	s__JADGNZ01 sp019247495	75.4555	58	1882	d__Bacteria;p__Acidobacteriota;c__Thermoanaerobaculia;o__Gp7-AA8;f__Gp7-AA8;g__JADGNZ01	95.0	99.75	99.65	0.94	0.93	3	-
GCA_017883235.1	s__JADGNZ01 sp017883235	75.3532	100	1882	d__Bacteria;p__Acidobacteriota;c__Thermoanaerobaculia;o__Gp7-AA8;f__Gp7-AA8;g__JADGNZ01	95.0	N/A	N/A	N/A	N/A	1	-
GCF_014192275.1	s__Halomonas stenophila	75.0277	70	1882	d__Bacteria;p__Proteobacteria;c__Gammaproteobacteria;o__Pseudomonadales;f__Halomonadaceae;g__Halomonas	95.0	N/A	N/A	N/A	N/A	1	-
--------------------------------------------------------------------------------
[2023-06-30 10:24:15,758] [INFO] GTDB search result was written to GCA_024277255.1_ASM2427725v1_genomic.fna/result_gtdb.tsv
[2023-06-30 10:24:15,758] [INFO] ===== GTDB Search completed =====
[2023-06-30 10:24:15,761] [INFO] DFAST_QC result json was written to GCA_024277255.1_ASM2427725v1_genomic.fna/dqc_result.json
[2023-06-30 10:24:15,761] [INFO] DFAST_QC completed!
[2023-06-30 10:24:15,761] [INFO] Total running time: 0h2m10s
