[2023-03-15 07:52:40,350] [INFO] DFAST_QC pipeline started.
[2023-03-15 07:52:40,350] [INFO] DFAST_QC version: 0.5.7
[2023-03-15 07:52:40,350] [INFO] DQC Reference Directory: /var/lib/cwl/stg07358fe8-1650-459f-bb99-e2d2c4f4a1dc/dqc_reference
[2023-03-15 07:52:41,648] [INFO] ===== Start taxonomy check using ANI =====
[2023-03-15 07:52:41,649] [INFO] Task started: Prodigal
[2023-03-15 07:52:41,649] [INFO] Running command: cat /var/lib/cwl/stg97e4bc29-7833-4688-bdc5-a32e3919143f/OceanDNA-b33241.fa | prodigal -d OceanDNA-b33241/cds.fna -a OceanDNA-b33241/protein.faa -g 11 -q > /dev/null
[2023-03-15 07:52:57,819] [INFO] Task succeeded: Prodigal
[2023-03-15 07:52:57,819] [INFO] Task started: HMMsearch
[2023-03-15 07:52:57,819] [INFO] Running command: hmmsearch --tblout OceanDNA-b33241/hmmer_result.tsv -E 1E-50 /var/lib/cwl/stg07358fe8-1650-459f-bb99-e2d2c4f4a1dc/dqc_reference/reference_markers.hmm OceanDNA-b33241/protein.faa > /dev/null
[2023-03-15 07:52:58,024] [INFO] Task succeeded: HMMsearch
[2023-03-15 07:52:58,024] [INFO] Found 6/6 markers.
[2023-03-15 07:52:58,041] [INFO] Query marker FASTA was written to OceanDNA-b33241/markers.fasta
[2023-03-15 07:52:58,041] [INFO] Task started: Blastn
[2023-03-15 07:52:58,042] [INFO] Running command: blastn -query OceanDNA-b33241/markers.fasta -db /var/lib/cwl/stg07358fe8-1650-459f-bb99-e2d2c4f4a1dc/dqc_reference/reference_markers.fasta -out OceanDNA-b33241/blast.markers.tsv -outfmt 6 -max_hsps 1 -num_alignments 5
[2023-03-15 07:52:58,993] [INFO] Task succeeded: Blastn
[2023-03-15 07:52:58,993] [INFO] Selected 32 target genomes.
[2023-03-15 07:52:58,994] [INFO] Target genome list was written to OceanDNA-b33241/target_genomes.txt
[2023-03-15 07:52:59,010] [INFO] Task started: fastANI
[2023-03-15 07:52:59,010] [INFO] Running command: fastANI --query /var/lib/cwl/stg97e4bc29-7833-4688-bdc5-a32e3919143f/OceanDNA-b33241.fa --refList OceanDNA-b33241/target_genomes.txt --output OceanDNA-b33241/fastani_result.tsv --threads 1
[2023-03-15 07:53:20,140] [INFO] Task succeeded: fastANI
[2023-03-15 07:53:20,141] [INFO] Loading species specific ANI threshold from /var/lib/cwl/stg07358fe8-1650-459f-bb99-e2d2c4f4a1dc/dqc_reference/prokaryote_ANI_species_specific_threshold.txt
[2023-03-15 07:53:20,141] [WARNING] Species-specific ANI threshold file not found. Will use the default threshold for all species. [/var/lib/cwl/stg07358fe8-1650-459f-bb99-e2d2c4f4a1dc/dqc_reference/prokaryote_ANI_species_specific_threshold.txt]
[2023-03-15 07:53:20,158] [INFO] Found 32 fastANI hits (0 hits with ANI > threshold)
[2023-03-15 07:53:20,158] [INFO] The taxonomy check result is classified as 'below_threshold'.
[2023-03-15 07:53:20,158] [INFO] DFAST Taxonomy check final result
--------------------------------------------------------------------------------
organism_name	strain	accession	taxid	species_taxid	relation_to_type	validated	ani	matched_fragments	total_fragments	ani_threshold	status
Thioalkalivibrio sulfidiphilus	strain=HL-EbGR7	GCA_000021985.1	1033854	1033854	type	True	77.2684	141	840	95	below_threshold
Arenimonas caeni	strain=z29	GCA_003024235.1	2058085	2058085	type	True	77.1848	192	840	95	below_threshold
Thiohalobacter thiocyanaticus	strain=Hrh1	GCA_003932505.1	585455	585455	type	True	77.0464	133	840	95	below_threshold
Vulcaniibacterium gelatinicum	strain=R-5-52-3	GCA_008033445.1	2598725	2598725	type	True	76.9666	205	840	95	below_threshold
Thioalbus denitrificans	strain=DSM 26407	GCA_003337735.1	547122	547122	type	True	76.9225	178	840	95	below_threshold
Fulvimonas soli	strain=DSM 14263	GCA_003148905.1	155197	155197	type	True	76.9099	232	840	95	below_threshold
Inmirania thermothiophila	strain=DSM 100275	GCA_003751635.1	1750597	1750597	type	True	76.9088	172	840	95	below_threshold
Allochromatium tepidum	strain=NZ	GCA_018409545.1	553982	553982	type	True	76.9062	104	840	95	below_threshold
Fulvimonas soli	strain=LMG 19981	GCA_006352285.1	155197	155197	type	True	76.9061	222	840	95	below_threshold
Vulcaniibacterium thermophilum	strain=KCTC 32020	GCA_007923255.1	1169913	1169913	type	True	76.8075	171	840	95	below_threshold
Rhodanobacter denitrificans	strain=2APBS1	GCA_000230695.3	666685	666685	type	True	76.7998	151	840	95	below_threshold
Plasticicumulans lactativorans	strain=DSM 25287	GCA_004341245.1	1133106	1133106	type	True	76.7942	222	840	95	below_threshold
Steroidobacter gossypii	strain=S1-65	GCA_016801985.1	2805490	2805490	type	True	76.7707	137	840	95	below_threshold
Thioalkalivibrio halophilus	strain=HL17	GCA_001995255.1	252474	252474	type	True	76.7174	107	840	95	below_threshold
Thioalkalivibrio paradoxus	strain=ARh 1	GCA_000227685.3	108010	108010	type	True	76.661	107	840	95	below_threshold
Halorhodospira neutriphila	strain=DSM 15116	GCA_016584055.1	168379	168379	type	True	76.6405	131	840	95	below_threshold
Frateuria terrea	strain=CGMCC 1.7053	GCA_900115705.1	529704	529704	type	True	76.6322	145	840	95	below_threshold
Frateuria terrea	strain=DSM 26515	GCA_900109025.1	529704	529704	type	True	76.6307	146	840	95	below_threshold
Marichromatium gracile	strain=DSM 203	GCA_004343155.1	1048	1048	type	True	76.538	146	840	95	below_threshold
Rhodanobacter spathiphylli	strain=B39	GCA_000264295.1	347483	347483	type	True	76.4746	150	840	95	below_threshold
Steroidobacter soli	strain=JW-3	GCA_004138485.1	2497877	2497877	type	True	76.4576	145	840	95	below_threshold
Marichromatium gracile	strain=DSM 203	GCA_016583515.1	1048	1048	type	True	76.4493	147	840	95	below_threshold
Luteimonas salinisoli	strain=SJ-92	GCA_013425525.1	2752307	2752307	type	True	76.4216	194	840	95	below_threshold
Marichromatium purpuratum	strain=984	GCA_000224005.3	37487	37487	type	True	76.4146	135	840	95	below_threshold
Pseudomonas lalucatii	strain=R1b54	GCA_018398425.1	1424203	1424203	type	True	76.3212	165	840	95	below_threshold
Thauera aminoaromatica	strain=S2	GCA_000310185.1	164330	164330	type	True	76.2922	148	840	95	below_threshold
Lysobacter bugurensis	strain=KCTC 23077	GCA_014652095.1	543356	543356	type	True	76.2872	114	840	95	below_threshold
Pseudomonas yangonensis	strain=MY50	GCA_009932725.1	2579922	2579922	type	True	76.2185	98	840	95	below_threshold
Plasticicumulans acidivorans	strain=DSM 23606	GCA_003182095.1	886464	886464	type	True	76.1584	149	840	95	below_threshold
Lysobacter terrigena	strain=17J7-1	GCA_004361065.1	2488749	2488749	type	True	76.1124	118	840	95	below_threshold
Pseudomonas mangiferae	strain=DMKU BBB3-04	GCA_007109405.1	2593654	2593654	type	True	76.0784	131	840	95	below_threshold
Pseudomonas insulae	strain=UL073	GCA_016901015.1	2809017	2809017	type	True	75.7939	127	840	95	below_threshold
--------------------------------------------------------------------------------
[2023-03-15 07:53:20,159] [INFO] DFAST Taxonomy check result was written to OceanDNA-b33241/tc_result.tsv
[2023-03-15 07:53:20,159] [INFO] ===== Taxonomy check completed =====
[2023-03-15 07:53:20,159] [INFO] ===== Start completeness check using CheckM =====
[2023-03-15 07:53:20,159] [INFO] Setting CHECKM_DATA_PATH to /var/lib/cwl/stg07358fe8-1650-459f-bb99-e2d2c4f4a1dc/dqc_reference/checkm_data
[2023-03-15 07:53:20,160] [INFO] Selected 'Prokaryote' markers (life, taxid=0) for CheckM
[2023-03-15 07:53:20,169] [INFO] Task started: CheckM
[2023-03-15 07:53:20,170] [INFO] Running command: checkm taxonomy_wf --tab_table -f OceanDNA-b33241/cc_result.tsv -t 1 life "Prokaryote" OceanDNA-b33241/checkm_input OceanDNA-b33241/checkm_result
[2023-03-15 07:54:02,281] [INFO] Task succeeded: CheckM
[2023-03-15 07:54:02,281] [INFO] Completeness check finished.
--------------------------------------------------------------------------------
Completeness: 75.00%
Contamination: 0.00%
Strain heterogeneity: 0.00%
--------------------------------------------------------------------------------
[2023-03-15 07:54:02,283] [INFO] ===== Completeness check finished =====
[2023-03-15 07:54:02,283] [INFO] ===== Start GTDB Search =====
[2023-03-15 07:54:02,284] [INFO] Query marker FASTA already exists. Will reuse it. (OceanDNA-b33241/markers.fasta)
[2023-03-15 07:54:02,284] [INFO] Task started: Blastn
[2023-03-15 07:54:02,284] [INFO] Running command: blastn -query OceanDNA-b33241/markers.fasta -db /var/lib/cwl/stg07358fe8-1650-459f-bb99-e2d2c4f4a1dc/dqc_reference/reference_markers_gtdb.fasta -out OceanDNA-b33241/blast.markers.gtdb.tsv -outfmt 6 -max_hsps 1 -num_alignments 5
[2023-03-15 07:54:04,151] [INFO] Task succeeded: Blastn
[2023-03-15 07:54:04,152] [INFO] Selected 20 target genomes.
[2023-03-15 07:54:04,152] [INFO] Target genome list was written to OceanDNA-b33241/target_genomes_gtdb.txt
[2023-03-15 07:54:04,183] [INFO] Task started: fastANI
[2023-03-15 07:54:04,184] [INFO] Running command: fastANI --query /var/lib/cwl/stg97e4bc29-7833-4688-bdc5-a32e3919143f/OceanDNA-b33241.fa --refList OceanDNA-b33241/target_genomes_gtdb.txt --output OceanDNA-b33241/fastani_result_gtdb.tsv --threads 1
[2023-03-15 07:54:15,084] [INFO] Task succeeded: fastANI
[2023-03-15 07:54:15,096] [INFO] Found 20 fastANI hits (0 hits with ANI > circumscription radius)
[2023-03-15 07:54:15,096] [INFO] GTDB search result
--------------------------------------------------------------------------------
accession	gtdb_species	ani	matched_fragments	total_fragments	gtdb_taxonomy	ani_circumscription_radius	mean_intra_species_ani	min_intra_species_ani	mean_intra_species_af	min_intra_species_af	num_clustered_genomes	status
GCF_011089885.1	s__XN24 sp011089885	81.9662	529	840	d__Bacteria;p__Proteobacteria;c__Gammaproteobacteria;o__XN24;f__XN24;g__XN24	95.0	N/A	N/A	N/A	N/A	1	-
GCF_011058255.1	s__XN24 sp011058255	81.7616	534	840	d__Bacteria;p__Proteobacteria;c__Gammaproteobacteria;o__XN24;f__XN24;g__XN24	95.0	N/A	N/A	N/A	N/A	1	-
GCF_011064545.1	s__XN24 sp011064545	80.7849	517	840	d__Bacteria;p__Proteobacteria;c__Gammaproteobacteria;o__XN24;f__XN24;g__XN24	95.0	N/A	N/A	N/A	N/A	1	-
GCA_019136105.1	s__RPQJ01 sp019136105	77.8105	126	840	d__Bacteria;p__Proteobacteria;c__Gammaproteobacteria;o__Steroidobacterales;f__Steroidobacteraceae;g__RPQJ01	95.0	N/A	N/A	N/A	N/A	1	-
GCA_007127715.1	s__PWYM01 sp007127715	77.5093	141	840	d__Bacteria;p__Proteobacteria;c__Gammaproteobacteria;o__PWYM01;f__PWYM01;g__PWYM01	95.0	99.45	99.43	0.90	0.89	3	-
GCA_003242495.1	s__ZC4RG39 sp003242495	77.421	131	840	d__Bacteria;p__Proteobacteria;c__Gammaproteobacteria;o__Steroidobacterales;f__Steroidobacteraceae;g__ZC4RG39	95.0	N/A	N/A	N/A	N/A	1	-
GCA_016789885.1	s__SHZI01 sp016789885	77.266	179	840	d__Bacteria;p__Proteobacteria;c__Gammaproteobacteria;o__GCA-2729495;f__GCA-2729495;g__SHZI01	95.0	N/A	N/A	N/A	N/A	1	-
GCA_014337915.1	s__QUBU01 sp014337915	77.1972	157	840	d__Bacteria;p__Proteobacteria;c__Gammaproteobacteria;o__GCA-2729495;f__GCA-2729495;g__QUBU01	95.0	99.72	99.56	0.97	0.95	4	-
GCA_014762505.1	s__SpSt-1174 sp014762505	77.1341	163	840	d__Bacteria;p__Proteobacteria;c__Gammaproteobacteria;o__SpSt-1174;f__SpSt-1174;g__SpSt-1174	95.0	N/A	N/A	N/A	N/A	1	-
GCF_000953855.2	s__Mizugakiibacter sediminis	77.0684	225	840	d__Bacteria;p__Proteobacteria;c__Gammaproteobacteria;o__Xanthomonadales;f__Rhodanobacteraceae;g__Mizugakiibacter	95.0	N/A	N/A	N/A	N/A	1	-
GCA_003695765.1	s__J054 sp003695765	76.9596	144	840	d__Bacteria;p__Proteobacteria;c__Gammaproteobacteria;o__GCA-2729495;f__GCA-2729495;g__J054	95.0	N/A	N/A	N/A	N/A	1	-
GCA_003819815.1	s__RPQJ01 sp003819815	76.9548	147	840	d__Bacteria;p__Proteobacteria;c__Gammaproteobacteria;o__Steroidobacterales;f__Steroidobacteraceae;g__RPQJ01	95.0	N/A	N/A	N/A	N/A	1	-
GCF_001182895.1	s__Frateuria_B defendens	76.9152	201	840	d__Bacteria;p__Proteobacteria;c__Gammaproteobacteria;o__Xanthomonadales;f__Rhodanobacteraceae;g__Frateuria_B	95.0	N/A	N/A	N/A	N/A	1	-
GCA_011046015.1	s__SpSt-1174 sp011046015	76.9057	95	840	d__Bacteria;p__Proteobacteria;c__Gammaproteobacteria;o__SpSt-1174;f__SpSt-1174;g__SpSt-1174	95.0	N/A	N/A	N/A	N/A	1	-
GCA_017577365.1	s__ZC4RG20 sp017577365	76.7532	112	840	d__Bacteria;p__Proteobacteria;c__Gammaproteobacteria;o__UBA6522;f__UBA6522;g__ZC4RG20	95.0	N/A	N/A	N/A	N/A	1	-
GCA_018240295.1	s__Plasticicumulans sp003962905	76.7037	191	840	d__Bacteria;p__Proteobacteria;c__Gammaproteobacteria;o__Competibacterales;f__Competibacteraceae;g__Plasticicumulans	95.0	98.50	98.50	0.84	0.84	2	-
GCF_001428405.1	s__Frateuria_A sp001428405	76.6551	199	840	d__Bacteria;p__Proteobacteria;c__Gammaproteobacteria;o__Xanthomonadales;f__Rhodanobacteraceae;g__Frateuria_A	95.0	N/A	N/A	N/A	N/A	1	-
GCA_002405525.1	s__UBA4656 sp002405525	76.398	166	840	d__Bacteria;p__Proteobacteria;c__Gammaproteobacteria;o__Xanthomonadales;f__Ahniellaceae;g__UBA4656	95.0	N/A	N/A	N/A	N/A	1	-
GCF_003182095.1	s__Plasticicumulans acidivorans	76.1675	148	840	d__Bacteria;p__Proteobacteria;c__Gammaproteobacteria;o__Competibacterales;f__Competibacteraceae;g__Plasticicumulans	95.0	N/A	N/A	N/A	N/A	1	-
GCA_018241145.1	s__66-474 sp018241145	76.0961	90	840	d__Bacteria;p__Proteobacteria;c__Gammaproteobacteria;o__Xanthomonadales;f__Rhodanobacteraceae;g__66-474	95.0	N/A	N/A	N/A	N/A	1	-
--------------------------------------------------------------------------------
[2023-03-15 07:54:15,096] [INFO] GTDB search result was written to OceanDNA-b33241/result_gtdb.tsv
[2023-03-15 07:54:15,096] [INFO] ===== GTDB Search completed =====
[2023-03-15 07:54:15,099] [INFO] DFAST_QC result json was written to OceanDNA-b33241/dqc_result.json
[2023-03-15 07:54:15,099] [INFO] DFAST_QC completed!
[2023-03-15 07:54:15,099] [INFO] Total running time: 0h1m35s
