[2023-03-16 17:08:13,823] [INFO] DFAST_QC pipeline started.
[2023-03-16 17:08:13,823] [INFO] DFAST_QC version: 0.5.7
[2023-03-16 17:08:13,823] [INFO] DQC Reference Directory: /var/lib/cwl/stg93efb681-5922-4fa1-ae53-678df9153568/dqc_reference
[2023-03-16 17:08:16,715] [INFO] ===== Start taxonomy check using ANI =====
[2023-03-16 17:08:16,716] [INFO] Task started: Prodigal
[2023-03-16 17:08:16,716] [INFO] Running command: cat /var/lib/cwl/stg1fcf24de-ce95-430c-b4c7-e3453abc524c/OceanDNA-b24095.fa | prodigal -d OceanDNA-b24095/cds.fna -a OceanDNA-b24095/protein.faa -g 11 -q > /dev/null
[2023-03-16 17:08:46,019] [INFO] Task succeeded: Prodigal
[2023-03-16 17:08:46,019] [INFO] Task started: HMMsearch
[2023-03-16 17:08:46,020] [INFO] Running command: hmmsearch --tblout OceanDNA-b24095/hmmer_result.tsv -E 1E-50 /var/lib/cwl/stg93efb681-5922-4fa1-ae53-678df9153568/dqc_reference/reference_markers.hmm OceanDNA-b24095/protein.faa > /dev/null
[2023-03-16 17:08:46,250] [INFO] Task succeeded: HMMsearch
[2023-03-16 17:08:46,251] [INFO] Found 6/6 markers.
[2023-03-16 17:08:46,283] [INFO] Query marker FASTA was written to OceanDNA-b24095/markers.fasta
[2023-03-16 17:08:46,284] [INFO] Task started: Blastn
[2023-03-16 17:08:46,284] [INFO] Running command: blastn -query OceanDNA-b24095/markers.fasta -db /var/lib/cwl/stg93efb681-5922-4fa1-ae53-678df9153568/dqc_reference/reference_markers.fasta -out OceanDNA-b24095/blast.markers.tsv -outfmt 6 -max_hsps 1 -num_alignments 5
[2023-03-16 17:08:47,090] [INFO] Task succeeded: Blastn
[2023-03-16 17:08:47,091] [INFO] Selected 35 target genomes.
[2023-03-16 17:08:47,092] [INFO] Target genome list was written to OceanDNA-b24095/target_genomes.txt
[2023-03-16 17:08:47,225] [INFO] Task started: fastANI
[2023-03-16 17:08:47,225] [INFO] Running command: fastANI --query /var/lib/cwl/stg1fcf24de-ce95-430c-b4c7-e3453abc524c/OceanDNA-b24095.fa --refList OceanDNA-b24095/target_genomes.txt --output OceanDNA-b24095/fastani_result.tsv --threads 1
[2023-03-16 17:09:22,189] [INFO] Task succeeded: fastANI
[2023-03-16 17:09:22,189] [INFO] Loading species specific ANI threshold from /var/lib/cwl/stg93efb681-5922-4fa1-ae53-678df9153568/dqc_reference/prokaryote_ANI_species_specific_threshold.txt
[2023-03-16 17:09:22,190] [WARNING] Species-specific ANI threshold file not found. Will use the default threshold for all species. [/var/lib/cwl/stg93efb681-5922-4fa1-ae53-678df9153568/dqc_reference/prokaryote_ANI_species_specific_threshold.txt]
[2023-03-16 17:09:22,221] [INFO] Found 35 fastANI hits (0 hits with ANI > threshold)
[2023-03-16 17:09:22,221] [INFO] The taxonomy check result is classified as 'below_threshold'.
[2023-03-16 17:09:22,222] [INFO] DFAST Taxonomy check final result
--------------------------------------------------------------------------------
organism_name	strain	accession	taxid	species_taxid	relation_to_type	validated	ani	matched_fragments	total_fragments	ani_threshold	status
Oceanibaculum indicum	strain=P24	GCA_000299935.1	526216	526216	type	True	76.6489	206	1602	95	below_threshold
Oceanibaculum pacificum	strain=MCCC 1A02656	GCA_001618175.1	580166	580166	type	True	76.6444	191	1602	95	below_threshold
Nisaea sediminum	strain=NBU1469	GCA_014904705.1	2775867	2775867	type	True	76.5015	196	1602	95	below_threshold
Nisaea acidiphila	strain=MEBiC11861	GCA_024662015.1	1862145	1862145	type	True	76.4611	160	1602	95	below_threshold
Aliidongia dinghuensis	strain=CGMCC 1.15725	GCA_014643535.1	1867774	1867774	type	True	76.4578	224	1602	95	below_threshold
Stella humosa	strain=ATCC 43930	GCA_006738645.1	94	94	type	True	76.456	229	1602	95	below_threshold
Stella humosa	strain=DSM 5900	GCA_003751345.1	94	94	type	True	76.4042	231	1602	95	below_threshold
Hypericibacter adhaerens	strain=R5959	GCA_008728835.1	2602016	2602016	type	True	76.4024	240	1602	95	below_threshold
Tistlia consotensis	strain=DSM 21585	GCA_900188055.1	1321365	1321365	type	True	76.3458	318	1602	95	below_threshold
Roseospirillum parvum	strain=930I	GCA_900100455.1	83401	83401	type	True	76.3373	158	1602	95	below_threshold
Azospirillum palustre	strain=B2	GCA_002573965.1	2044885	2044885	type	True	76.3116	243	1602	95	below_threshold
Nisaea nitritireducens	strain=DSM 19540	GCA_014904795.1	568392	568392	type	True	76.2986	128	1602	95	below_threshold
Inquilinus limosus	strain=DSM 16000	GCA_000423185.1	171674	171674	type	True	76.2843	260	1602	95	below_threshold
Bradyrhizobium jicamae	strain=PAC68	GCA_001440395.1	280332	280332	type	True	76.2411	154	1602	95	below_threshold
Skermanella stibiiresistens	strain=SB22	GCA_000576635.1	913326	913326	type	True	76.2223	200	1602	95	below_threshold
Bradyrhizobium lablabi	strain=CCBAU 23086	GCA_001440475.1	722472	722472	suspected-type	True	76.1611	163	1602	95	below_threshold
Nisaea denitrificans	strain=DSM 18348	GCA_000426505.1	390877	390877	type	True	76.1319	106	1602	95	below_threshold
Kaistia hirudinis	strain=DSM 25966	GCA_014196455.1	1293440	1293440	type	True	76.1082	155	1602	95	below_threshold
Rhodovibrio sodomensis	strain=DSM 9895	GCA_016583645.1	1088	1088	type	True	76.1063	200	1602	95	below_threshold
Reyranella aquatilis	strain=KCTC 52223	GCA_020880995.1	2035356	2035356	type	True	76.0881	203	1602	95	below_threshold
Hoeflea alexandrii	strain=DSM 16655	GCA_024105735.1	288436	288436	type	True	76.0744	99	1602	95	below_threshold
Mesorhizobium opportunistum	strain=WSM2075	GCA_000176035.2	593909	593909	type	True	76.0396	151	1602	95	below_threshold
Jiella sonneratiae	strain=MQZ13P-4	GCA_017353515.1	2816856	2816856	type	True	76.0157	188	1602	95	below_threshold
Vineibacter terrae	strain=CC-CFT640	GCA_008039615.1	2586908	2586908	type	True	75.9967	263	1602	95	below_threshold
Tepidamorphus gemmatus	strain=DSM 19345	GCA_004346195.1	747076	747076	type	True	75.9807	138	1602	95	below_threshold
Shinella pollutisoli	strain=KCTC 52677	GCA_024609765.1	2250594	2250594	type	True	75.9535	182	1602	95	below_threshold
Rhodoplanes piscinae	strain=DSM 19946	GCA_003258855.1	444923	444923	type	True	75.9306	171	1602	95	below_threshold
Oricola indica	strain=JL-62	GCA_019966595.1	2872591	2872591	type	True	75.9131	127	1602	95	below_threshold
Azorhizobium caulinodans	strain=ORS 571	GCA_000010525.1	7	7	type	True	75.892	126	1602	95	below_threshold
Phreatobacter stygius	strain=KCTC 52518	GCA_005144885.1	1940610	1940610	type	True	75.8841	200	1602	95	below_threshold
Rhizobium croatiense	strain=13T	GCA_019793465.1	2867516	2867516	type	True	75.8608	119	1602	95	below_threshold
Rhizobium redzepovicii	strain=18T	GCA_019793435.1	2867518	2867518	type	True	75.7954	117	1602	95	below_threshold
Roseococcus pinisoli	strain=XZZS9	GCA_018413645.1	2835040	2835040	type	True	75.7225	165	1602	95	below_threshold
Roseococcus microcysteis	strain=NIBR12	GCA_014764365.1	2771361	2771361	type	True	75.5796	142	1602	95	below_threshold
Sphingomonas hylomeconis	strain=CCTCC AB 2013304	GCA_025370105.1	1395958	1395958	type	True	75.5546	121	1602	95	below_threshold
--------------------------------------------------------------------------------
[2023-03-16 17:09:22,222] [INFO] DFAST Taxonomy check result was written to OceanDNA-b24095/tc_result.tsv
[2023-03-16 17:09:22,222] [INFO] ===== Taxonomy check completed =====
[2023-03-16 17:09:22,222] [INFO] ===== Start completeness check using CheckM =====
[2023-03-16 17:09:22,222] [INFO] Setting CHECKM_DATA_PATH to /var/lib/cwl/stg93efb681-5922-4fa1-ae53-678df9153568/dqc_reference/checkm_data
[2023-03-16 17:09:22,223] [INFO] Selected 'Prokaryote' markers (life, taxid=0) for CheckM
[2023-03-16 17:09:22,238] [INFO] Task started: CheckM
[2023-03-16 17:09:22,238] [INFO] Running command: checkm taxonomy_wf --tab_table -f OceanDNA-b24095/cc_result.tsv -t 1 life "Prokaryote" OceanDNA-b24095/checkm_input OceanDNA-b24095/checkm_result
[2023-03-16 17:10:47,844] [INFO] Task succeeded: CheckM
[2023-03-16 17:10:47,845] [INFO] Completeness check finished.
--------------------------------------------------------------------------------
Completeness: 100.00%
Contamination: 0.00%
Strain heterogeneity: 0.00%
--------------------------------------------------------------------------------
[2023-03-16 17:10:47,849] [INFO] ===== Completeness check finished =====
[2023-03-16 17:10:47,849] [INFO] ===== Start GTDB Search =====
[2023-03-16 17:10:47,850] [INFO] Query marker FASTA already exists. Will reuse it. (OceanDNA-b24095/markers.fasta)
[2023-03-16 17:10:47,851] [INFO] Task started: Blastn
[2023-03-16 17:10:47,851] [INFO] Running command: blastn -query OceanDNA-b24095/markers.fasta -db /var/lib/cwl/stg93efb681-5922-4fa1-ae53-678df9153568/dqc_reference/reference_markers_gtdb.fasta -out OceanDNA-b24095/blast.markers.gtdb.tsv -outfmt 6 -max_hsps 1 -num_alignments 5
[2023-03-16 17:10:49,522] [INFO] Task succeeded: Blastn
[2023-03-16 17:10:49,523] [INFO] Selected 34 target genomes.
[2023-03-16 17:10:49,523] [INFO] Target genome list was written to OceanDNA-b24095/target_genomes_gtdb.txt
[2023-03-16 17:10:49,589] [INFO] Task started: fastANI
[2023-03-16 17:10:49,589] [INFO] Running command: fastANI --query /var/lib/cwl/stg1fcf24de-ce95-430c-b4c7-e3453abc524c/OceanDNA-b24095.fa --refList OceanDNA-b24095/target_genomes_gtdb.txt --output OceanDNA-b24095/fastani_result_gtdb.tsv --threads 1
[2023-03-16 17:11:17,143] [INFO] Task succeeded: fastANI
[2023-03-16 17:11:17,162] [INFO] Found 34 fastANI hits (0 hits with ANI > circumscription radius)
[2023-03-16 17:11:17,162] [INFO] GTDB search result
--------------------------------------------------------------------------------
accession	gtdb_species	ani	matched_fragments	total_fragments	gtdb_taxonomy	ani_circumscription_radius	mean_intra_species_ani	min_intra_species_ani	mean_intra_species_af	min_intra_species_af	num_clustered_genomes	status
GCA_905479125.1	s__UBA830 sp905479125	77.9192	364	1602	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__MPNO01;f__UBA830;g__UBA830	95.0	98.22	98.19	0.84	0.83	3	-
GCA_905181725.1	s__UBA830 sp905181725	77.9085	380	1602	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__MPNO01;f__UBA830;g__UBA830	95.0	N/A	N/A	N/A	N/A	1	-
GCA_002471575.1	s__UBA830 sp002471575	77.3755	315	1602	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__MPNO01;f__UBA830;g__UBA830	95.0	97.90	95.90	0.91	0.84	7	-
GCA_009377375.1	s__WHUE01 sp009377375	77.2827	353	1602	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__MPNO01;f__UBA2964;g__WHUE01	95.0	N/A	N/A	N/A	N/A	1	-
GCA_016869145.1	s__VGEU01 sp016869145	76.9318	294	1602	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__MPNO01;f__VGEU01;g__VGEU01	95.0	N/A	N/A	N/A	N/A	1	-
GCA_003250275.1	s__Alpha-05 sp003250275	76.6705	133	1602	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhodospirillales;f__Rhodospirillaceae;g__Alpha-05	95.0	N/A	N/A	N/A	N/A	1	-
GCA_017640745.1	s__Nisaea sp017640745	76.6685	178	1602	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Thalassobaculales;f__Thalassobaculaceae;g__Nisaea	95.0	N/A	N/A	N/A	N/A	1	-
GCF_001618175.1	s__Oceanibaculum pacificum	76.6354	192	1602	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Oceanibaculales;f__Oceanibaculaceae;g__Oceanibaculum	95.0	N/A	N/A	N/A	N/A	1	-
GCF_014904705.1	s__Nisaea sp014904705	76.5258	195	1602	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Thalassobaculales;f__Thalassobaculaceae;g__Nisaea	95.0	N/A	N/A	N/A	N/A	1	-
GCF_006738645.1	s__Stella humosa	76.4597	230	1602	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__ATCC43930;f__Stellaceae;g__Stella	95.0	100.00	100.00	1.00	1.00	2	-
GCA_001603335.1	s__Alpha-05 sp001603335	76.4459	95	1602	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhodospirillales;f__Rhodospirillaceae;g__Alpha-05	95.0	N/A	N/A	N/A	N/A	1	-
GCA_002689465.1	s__UBA830 sp002689465	76.3949	66	1602	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__MPNO01;f__UBA830;g__UBA830	95.0	N/A	N/A	N/A	N/A	1	-
GCA_005884705.1	s__Reyranella sp005884705	76.3808	150	1602	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Reyranellales;f__Reyranellaceae;g__Reyranella	95.0	N/A	N/A	N/A	N/A	1	-
GCF_900177295.1	s__Tistlia consotensis	76.3356	314	1602	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Kiloniellales;f__DSM-21159;g__Tistlia	95.0	99.99	99.99	0.99	0.99	2	-
GCA_016792605.1	s__JAEULC01 sp016792605	76.3319	198	1602	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__JAEULC01;f__JAEULC01;g__JAEULC01	95.0	N/A	N/A	N/A	N/A	1	-
GCA_018823745.1	s__Oceanibaculum sp018823745	76.31	164	1602	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Oceanibaculales;f__Oceanibaculaceae;g__Oceanibaculum	95.0	99.98	99.98	1.00	1.00	4	-
GCA_018432935.1	s__JAHDSF01 sp018432935	76.2341	160	1602	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhodospirillales;f__JAHDSF01;g__JAHDSF01	95.0	N/A	N/A	N/A	N/A	1	-
GCA_016869645.1	s__SHVW01 sp016869645	76.1545	203	1602	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__SHVW01;f__SHVW01;g__SHVW01	95.0	N/A	N/A	N/A	N/A	1	-
GCA_017307375.1	s__JAFKFH01 sp017307375	76.134	279	1602	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Ferrovibrionales;f__Ferrovibrionaceae;g__JAFKFH01	95.0	N/A	N/A	N/A	N/A	1	-
GCA_903858635.1	s__Telmatospirillum sp903858635	76.1048	168	1602	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhodospirillales;f__Magnetospirillaceae;g__Telmatospirillum	95.0	99.88	99.86	0.92	0.90	3	-
GCA_004297265.1	s__Reyranella sp004297265	76.0691	188	1602	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Reyranellales;f__Reyranellaceae;g__Reyranella	95.0	N/A	N/A	N/A	N/A	1	-
GCF_008039615.1	s__SYSU-D60007 sp008039615	76.0014	262	1602	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Reyranellales;f__Reyranellaceae;g__SYSU-D60007	95.0	N/A	N/A	N/A	N/A	1	-
GCA_900110395.1	s__Reyranella sp900110395	75.9895	219	1602	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Reyranellales;f__Reyranellaceae;g__Reyranella	95.0	N/A	N/A	N/A	N/A	1	-
GCF_001703635.1	s__Hoeflea olei	75.9797	146	1602	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhizobiales;f__Rhizobiaceae;g__Hoeflea	95.0	N/A	N/A	N/A	N/A	1	-
GCF_004346195.1	s__Tepidamorphus gemmatus	75.9717	139	1602	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhizobiales;f__Tepidamorphaceae;g__Tepidamorphus	95.0	N/A	N/A	N/A	N/A	1	-
GCA_014359905.1	s__Hoeflea sp014359905	75.9159	164	1602	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhizobiales;f__Rhizobiaceae;g__Hoeflea	95.0	N/A	N/A	N/A	N/A	1	-
GCA_017307955.1	s__Kaistia sp017307955	75.9081	168	1602	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhizobiales;f__Kaistiaceae;g__Kaistia	95.0	N/A	N/A	N/A	N/A	1	-
GCA_019240395.1	s__JAFAYA01 sp019240395	75.8961	157	1602	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Acetobacterales;f__Acetobacteraceae;g__JAFAYA01	95.0	N/A	N/A	N/A	N/A	1	-
GCA_019084245.1	s__Hoeflea sp019084245	75.8863	119	1602	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhizobiales;f__Rhizobiaceae;g__Hoeflea	95.0	100.00	100.00	0.99	0.99	6	-
GCF_000010525.1	s__Azorhizobium caulinodans	75.8691	128	1602	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhizobiales;f__Xanthobacteraceae;g__Azorhizobium	95.0	97.54	97.54	0.94	0.94	2	-
GCA_018829385.1	s__Hoeflea sp018829385	75.8503	84	1602	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhizobiales;f__Rhizobiaceae;g__Hoeflea	95.0	N/A	N/A	N/A	N/A	1	-
GCA_009720755.1	s__Rhodoplanes serenus	75.8398	198	1602	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhizobiales;f__Xanthobacteraceae;g__Rhodoplanes	95.0	97.72	97.48	0.91	0.88	4	-
GCF_000759435.1	s__Hoeflea sp000759435	75.7898	170	1602	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhizobiales;f__Rhizobiaceae;g__Hoeflea	95.0	N/A	N/A	N/A	N/A	1	-
GCF_000526315.1	s__Methylopila sp000526315	75.7895	158	1602	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhizobiales;f__Methylopilaceae;g__Methylopila	95.0	N/A	N/A	N/A	N/A	1	-
--------------------------------------------------------------------------------
[2023-03-16 17:11:17,163] [INFO] GTDB search result was written to OceanDNA-b24095/result_gtdb.tsv
[2023-03-16 17:11:17,164] [INFO] ===== GTDB Search completed =====
[2023-03-16 17:11:17,168] [INFO] DFAST_QC result json was written to OceanDNA-b24095/dqc_result.json
[2023-03-16 17:11:17,168] [INFO] DFAST_QC completed!
[2023-03-16 17:11:17,168] [INFO] Total running time: 0h3m3s
