[2023-03-16 16:35:56,429] [INFO] DFAST_QC pipeline started.
[2023-03-16 16:35:56,430] [INFO] DFAST_QC version: 0.5.7
[2023-03-16 16:35:56,430] [INFO] DQC Reference Directory: /var/lib/cwl/stg72fae799-dc7e-40a0-9cd7-ad80ad33db21/dqc_reference
[2023-03-16 16:35:57,560] [INFO] ===== Start taxonomy check using ANI =====
[2023-03-16 16:35:57,560] [INFO] Task started: Prodigal
[2023-03-16 16:35:57,560] [INFO] Running command: cat /var/lib/cwl/stgb31136d8-9176-4a55-bbcb-8c1e0e40e2cc/OceanDNA-b10366.fa | prodigal -d OceanDNA-b10366/cds.fna -a OceanDNA-b10366/protein.faa -g 11 -q > /dev/null
[2023-03-16 16:36:44,996] [INFO] Task succeeded: Prodigal
[2023-03-16 16:36:44,996] [INFO] Task started: HMMsearch
[2023-03-16 16:36:44,996] [INFO] Running command: hmmsearch --tblout OceanDNA-b10366/hmmer_result.tsv -E 1E-50 /var/lib/cwl/stg72fae799-dc7e-40a0-9cd7-ad80ad33db21/dqc_reference/reference_markers.hmm OceanDNA-b10366/protein.faa > /dev/null
[2023-03-16 16:36:45,226] [INFO] Task succeeded: HMMsearch
[2023-03-16 16:36:45,227] [INFO] Found 6/6 markers.
[2023-03-16 16:36:45,250] [INFO] Query marker FASTA was written to OceanDNA-b10366/markers.fasta
[2023-03-16 16:36:45,252] [INFO] Task started: Blastn
[2023-03-16 16:36:45,252] [INFO] Running command: blastn -query OceanDNA-b10366/markers.fasta -db /var/lib/cwl/stg72fae799-dc7e-40a0-9cd7-ad80ad33db21/dqc_reference/reference_markers.fasta -out OceanDNA-b10366/blast.markers.tsv -outfmt 6 -max_hsps 1 -num_alignments 5
[2023-03-16 16:36:45,797] [INFO] Task succeeded: Blastn
[2023-03-16 16:36:45,798] [INFO] Selected 29 target genomes.
[2023-03-16 16:36:45,798] [INFO] Target genome list was written to OceanDNA-b10366/target_genomes.txt
[2023-03-16 16:36:45,884] [INFO] Task started: fastANI
[2023-03-16 16:36:45,884] [INFO] Running command: fastANI --query /var/lib/cwl/stgb31136d8-9176-4a55-bbcb-8c1e0e40e2cc/OceanDNA-b10366.fa --refList OceanDNA-b10366/target_genomes.txt --output OceanDNA-b10366/fastani_result.tsv --threads 1
[2023-03-16 16:37:05,158] [INFO] Task succeeded: fastANI
[2023-03-16 16:37:05,159] [INFO] Loading species specific ANI threshold from /var/lib/cwl/stg72fae799-dc7e-40a0-9cd7-ad80ad33db21/dqc_reference/prokaryote_ANI_species_specific_threshold.txt
[2023-03-16 16:37:05,159] [WARNING] Species-specific ANI threshold file not found. Will use the default threshold for all species. [/var/lib/cwl/stg72fae799-dc7e-40a0-9cd7-ad80ad33db21/dqc_reference/prokaryote_ANI_species_specific_threshold.txt]
[2023-03-16 16:37:05,173] [INFO] Found 25 fastANI hits (0 hits with ANI > threshold)
[2023-03-16 16:37:05,173] [INFO] The taxonomy check result is classified as 'below_threshold'.
[2023-03-16 16:37:05,173] [INFO] DFAST Taxonomy check final result
--------------------------------------------------------------------------------
organism_name	strain	accession	taxid	species_taxid	relation_to_type	validated	ani	matched_fragments	total_fragments	ani_threshold	status
Arenibacter hampyeongensis	strain=JCM 17788	GCA_002909255.1	1028743	1028743	type	True	76.8895	120	1409	95	below_threshold
Muricauda taeanensis	strain=JCM 17757	GCA_003584105.1	1005926	1005926	type	True	76.7927	126	1409	95	below_threshold
Arenibacter echinorum	strain=DSM 23522	GCA_003259375.1	440515	440515	type	True	76.6721	125	1409	95	below_threshold
Arenibacter troitsensis	strain=DSM 19835	GCA_900177645.1	188872	188872	type	True	76.6262	154	1409	95	below_threshold
Arenibacter palladensis	strain=DSM 17539	GCA_900129275.1	237373	237373	type	True	76.6233	150	1409	95	below_threshold
Arenibacter catalasegens	strain=P308H10	GCA_002909235.1	2056779	2056779	type	True	76.6199	127	1409	95	below_threshold
Arenibacter arenosicollis	strain=BSSL-BM3	GCA_014397245.1	2762274	2762274	type	True	76.6051	144	1409	95	below_threshold
Arenibacter aquaticus	strain=GUO	GCA_003957295.1	2489054	2489054	type	True	76.583	134	1409	95	below_threshold
Arenibacter algicola	strain=TG409	GCA_000733925.1	616991	616991	type	True	76.4439	139	1409	95	below_threshold
Muricauda hadalis	strain=MT-229	GCA_007785775.2	2597517	2597517	type	True	76.3834	133	1409	95	below_threshold
Muricauda amoyensis	strain=GCL-11	GCA_003058265.1	2169401	2169401	type	True	76.3775	155	1409	95	below_threshold
Maribacter luteus	strain=RZ05	GCA_009674825.1	2594478	2594478	type	True	76.3578	98	1409	95	below_threshold
Muricauda beolgyonensis	strain=KCTC 23501	GCA_003992615.1	864064	864064	type	True	76.2259	115	1409	95	below_threshold
Pelagihabitans pacificus	strain=TP-CH-4	GCA_009371985.2	2696054	2696054	type	True	76.2112	107	1409	95	below_threshold
Muricauda sediminis	strain=40Bstr401	GCA_010500845.1	2696468	2696468	type	True	76.1601	121	1409	95	below_threshold
Maribacter polysiphoniae	strain=DSM 23514	GCA_003148665.1	429344	429344	type	True	76.159	118	1409	95	below_threshold
Maribacter algarum	strain=RZ26	GCA_005885635.1	2578118	2578118	type	True	76.1438	60	1409	95	below_threshold
Aggregatimonas sangjinii	strain=F202Z8	GCA_005943945.1	2583587	2583587	type	True	76.1107	68	1409	95	below_threshold
Muricauda lutaonensis	strain=CC-HSB-11	GCA_000963865.1	516051	516051	type	True	76.0617	77	1409	95	below_threshold
Muricauda oceanensis	strain=40DY170	GCA_003992595.1	2499163	2499163	type	True	76.0518	81	1409	95	below_threshold
Muricauda zhangzhouensis	strain=DSM 25030	GCA_900106825.1	1073328	1073328	type	True	75.9344	84	1409	95	below_threshold
Muricauda zhangzhouensis	strain=CGMCC 1.11028	GCA_900102925.1	1073328	1073328	type	True	75.9219	82	1409	95	below_threshold
Muricauda ochracea	strain=JGD-17	GCA_009903685.1	2696472	2696472	type	True	75.9113	90	1409	95	below_threshold
Sinomicrobium kalidii	strain=HD2P242	GCA_021183825.1	2900738	2900738	type	True	75.901	60	1409	95	below_threshold
Ulvibacterium marinum	strain=CCMM003	GCA_003626755.1	2419782	2419782	type	True	75.8463	115	1409	95	below_threshold
--------------------------------------------------------------------------------
[2023-03-16 16:37:05,174] [INFO] DFAST Taxonomy check result was written to OceanDNA-b10366/tc_result.tsv
[2023-03-16 16:37:05,174] [INFO] ===== Taxonomy check completed =====
[2023-03-16 16:37:05,174] [INFO] ===== Start completeness check using CheckM =====
[2023-03-16 16:37:05,174] [INFO] Setting CHECKM_DATA_PATH to /var/lib/cwl/stg72fae799-dc7e-40a0-9cd7-ad80ad33db21/dqc_reference/checkm_data
[2023-03-16 16:37:05,175] [INFO] Selected 'Prokaryote' markers (life, taxid=0) for CheckM
[2023-03-16 16:37:05,180] [INFO] Task started: CheckM
[2023-03-16 16:37:05,180] [INFO] Running command: checkm taxonomy_wf --tab_table -f OceanDNA-b10366/cc_result.tsv -t 1 life "Prokaryote" OceanDNA-b10366/checkm_input OceanDNA-b10366/checkm_result
[2023-03-16 16:38:46,876] [INFO] Task succeeded: CheckM
[2023-03-16 16:38:46,877] [INFO] Completeness check finished.
--------------------------------------------------------------------------------
Completeness: 91.67%
Contamination: 0.00%
Strain heterogeneity: 0.00%
--------------------------------------------------------------------------------
[2023-03-16 16:38:46,881] [INFO] ===== Completeness check finished =====
[2023-03-16 16:38:46,881] [INFO] ===== Start GTDB Search =====
[2023-03-16 16:38:46,882] [INFO] Query marker FASTA already exists. Will reuse it. (OceanDNA-b10366/markers.fasta)
[2023-03-16 16:38:46,883] [INFO] Task started: Blastn
[2023-03-16 16:38:46,884] [INFO] Running command: blastn -query OceanDNA-b10366/markers.fasta -db /var/lib/cwl/stg72fae799-dc7e-40a0-9cd7-ad80ad33db21/dqc_reference/reference_markers_gtdb.fasta -out OceanDNA-b10366/blast.markers.gtdb.tsv -outfmt 6 -max_hsps 1 -num_alignments 5
[2023-03-16 16:38:48,172] [INFO] Task succeeded: Blastn
[2023-03-16 16:38:48,173] [INFO] Selected 27 target genomes.
[2023-03-16 16:38:48,174] [INFO] Target genome list was written to OceanDNA-b10366/target_genomes_gtdb.txt
[2023-03-16 16:38:49,075] [INFO] Task started: fastANI
[2023-03-16 16:38:49,075] [INFO] Running command: fastANI --query /var/lib/cwl/stgb31136d8-9176-4a55-bbcb-8c1e0e40e2cc/OceanDNA-b10366.fa --refList OceanDNA-b10366/target_genomes_gtdb.txt --output OceanDNA-b10366/fastani_result_gtdb.tsv --threads 1
[2023-03-16 16:39:07,954] [INFO] Task succeeded: fastANI
[2023-03-16 16:39:07,968] [INFO] Found 24 fastANI hits (0 hits with ANI > circumscription radius)
[2023-03-16 16:39:07,968] [INFO] GTDB search result
--------------------------------------------------------------------------------
accession	gtdb_species	ani	matched_fragments	total_fragments	gtdb_taxonomy	ani_circumscription_radius	mean_intra_species_ani	min_intra_species_ani	mean_intra_species_af	min_intra_species_af	num_clustered_genomes	status
GCF_001430825.1	s__Sediminicola sp001430825	77.6542	211	1409	d__Bacteria;p__Bacteroidota;c__Bacteroidia;o__Flavobacteriales;f__Flavobacteriaceae;g__Sediminicola	95.0	N/A	N/A	N/A	N/A	1	-
GCF_003426735.1	s__Arenibacter sp003426735	76.9327	163	1409	d__Bacteria;p__Bacteroidota;c__Bacteroidia;o__Flavobacteriales;f__Flavobacteriaceae;g__Arenibacter	95.0	N/A	N/A	N/A	N/A	1	-
GCF_002909255.1	s__Arenibacter hampyeongensis	76.9076	120	1409	d__Bacteria;p__Bacteroidota;c__Bacteroidia;o__Flavobacteriales;f__Flavobacteriaceae;g__Arenibacter	95.0	N/A	N/A	N/A	N/A	1	-
GCF_003584105.1	s__Muricauda taeanensis	76.7944	126	1409	d__Bacteria;p__Bacteroidota;c__Bacteroidia;o__Flavobacteriales;f__Flavobacteriaceae;g__Muricauda	95.0	97.68	97.68	0.85	0.85	3	-
GCF_003201775.1	s__Arenibacter sp003201775	76.6794	150	1409	d__Bacteria;p__Bacteroidota;c__Bacteroidia;o__Flavobacteriales;f__Flavobacteriaceae;g__Arenibacter	95.0	N/A	N/A	N/A	N/A	1	-
GCF_003259375.1	s__Arenibacter echinorum	76.6721	125	1409	d__Bacteria;p__Bacteroidota;c__Bacteroidia;o__Flavobacteriales;f__Flavobacteriaceae;g__Arenibacter	95.0	N/A	N/A	N/A	N/A	1	-
GCF_900129275.1	s__Arenibacter palladensis	76.6248	151	1409	d__Bacteria;p__Bacteroidota;c__Bacteroidia;o__Flavobacteriales;f__Flavobacteriaceae;g__Arenibacter	95.0	97.38	97.38	0.84	0.84	2	-
GCF_002909235.1	s__Arenibacter catalasegens	76.6015	127	1409	d__Bacteria;p__Bacteroidota;c__Bacteroidia;o__Flavobacteriales;f__Flavobacteriaceae;g__Arenibacter	95.0	N/A	N/A	N/A	N/A	1	-
GCF_014397245.1	s__Arenibacter arenosicollis	76.578	143	1409	d__Bacteria;p__Bacteroidota;c__Bacteroidia;o__Flavobacteriales;f__Flavobacteriaceae;g__Arenibacter	95.0	N/A	N/A	N/A	N/A	1	-
GCF_003957295.1	s__Arenibacter aquaticus	76.5675	135	1409	d__Bacteria;p__Bacteroidota;c__Bacteroidia;o__Flavobacteriales;f__Flavobacteriaceae;g__Arenibacter	95.0	N/A	N/A	N/A	N/A	1	-
GCF_900141855.1	s__Pseudozobellia thermophila	76.3953	106	1409	d__Bacteria;p__Bacteroidota;c__Bacteroidia;o__Flavobacteriales;f__Flavobacteriaceae;g__Pseudozobellia	95.0	N/A	N/A	N/A	N/A	1	-
GCF_007785775.2	s__Muricauda sp003973595	76.3834	133	1409	d__Bacteria;p__Bacteroidota;c__Bacteroidia;o__Flavobacteriales;f__Flavobacteriaceae;g__Muricauda	95.0	98.36	98.36	0.87	0.87	2	-
GCF_013407935.1	s__Muricauda sp013407935	76.3017	93	1409	d__Bacteria;p__Bacteroidota;c__Bacteroidia;o__Flavobacteriales;f__Flavobacteriaceae;g__Muricauda	95.0	96.58	96.50	0.87	0.86	8	-
GCA_001683825.1	s__Zeaxanthinibacter sp001683825	76.2775	96	1409	d__Bacteria;p__Bacteroidota;c__Bacteroidia;o__Flavobacteriales;f__Flavobacteriaceae;g__Zeaxanthinibacter	95.0	N/A	N/A	N/A	N/A	1	-
GCF_003992615.1	s__Muricauda beolgyonensis	76.2259	115	1409	d__Bacteria;p__Bacteroidota;c__Bacteroidia;o__Flavobacteriales;f__Flavobacteriaceae;g__Muricauda	95.0	N/A	N/A	N/A	N/A	1	-
GCF_005885635.1	s__RZ26 sp005885635	76.1705	59	1409	d__Bacteria;p__Bacteroidota;c__Bacteroidia;o__Flavobacteriales;f__Flavobacteriaceae;g__RZ26	95.0	N/A	N/A	N/A	N/A	1	-
GCF_010500845.1	s__Muricauda sediminis	76.1601	121	1409	d__Bacteria;p__Bacteroidota;c__Bacteroidia;o__Flavobacteriales;f__Flavobacteriaceae;g__Muricauda	95.0	N/A	N/A	N/A	N/A	1	-
GCF_003385855.1	s__Muricauda sp003385855	76.0806	117	1409	d__Bacteria;p__Bacteroidota;c__Bacteroidia;o__Flavobacteriales;f__Flavobacteriaceae;g__Muricauda	95.0	97.98	97.98	0.89	0.89	2	-
GCF_000963865.1	s__Muricauda_A lutaonensis	76.0617	77	1409	d__Bacteria;p__Bacteroidota;c__Bacteroidia;o__Flavobacteriales;f__Flavobacteriaceae;g__Muricauda_A	95.0	97.74	97.69	0.86	0.85	5	-
GCF_000224085.1	s__Muricauda ruestringensis	76.0614	99	1409	d__Bacteria;p__Bacteroidota;c__Bacteroidia;o__Flavobacteriales;f__Flavobacteriaceae;g__Muricauda	95.0	N/A	N/A	N/A	N/A	1	-
GCA_013214065.1	s__DT-32 sp013214065	76.0614	60	1409	d__Bacteria;p__Bacteroidota;c__Bacteroidia;o__Flavobacteriales;f__Flavobacteriaceae;g__DT-32	95.0	N/A	N/A	N/A	N/A	1	-
GCF_003992595.1	s__Muricauda sp002452975	76.0327	82	1409	d__Bacteria;p__Bacteroidota;c__Bacteroidia;o__Flavobacteriales;f__Flavobacteriaceae;g__Muricauda	95.0	99.01	99.01	0.89	0.89	2	-
GCF_900102925.1	s__Muricauda zhangzhouensis	75.9219	82	1409	d__Bacteria;p__Bacteroidota;c__Bacteroidia;o__Flavobacteriales;f__Flavobacteriaceae;g__Muricauda	95.0	100.00	100.00	1.00	1.00	2	-
GCF_003626755.1	s__Ulvibacterium marinum	75.8463	115	1409	d__Bacteria;p__Bacteroidota;c__Bacteroidia;o__Flavobacteriales;f__Flavobacteriaceae;g__Ulvibacterium	95.0	N/A	N/A	N/A	N/A	1	-
--------------------------------------------------------------------------------
[2023-03-16 16:39:07,969] [INFO] GTDB search result was written to OceanDNA-b10366/result_gtdb.tsv
[2023-03-16 16:39:07,969] [INFO] ===== GTDB Search completed =====
[2023-03-16 16:39:07,973] [INFO] DFAST_QC result json was written to OceanDNA-b10366/dqc_result.json
[2023-03-16 16:39:07,973] [INFO] DFAST_QC completed!
[2023-03-16 16:39:07,973] [INFO] Total running time: 0h3m12s
