[2023-03-18 01:26:26,344] [INFO] DFAST_QC pipeline started.
[2023-03-18 01:26:26,344] [INFO] DFAST_QC version: 0.5.7
[2023-03-18 01:26:26,345] [INFO] DQC Reference Directory: /var/lib/cwl/stgefbc69a2-28da-4953-8357-434d8598fd65/dqc_reference
[2023-03-18 01:26:27,495] [INFO] ===== Start taxonomy check using ANI =====
[2023-03-18 01:26:27,496] [INFO] Task started: Prodigal
[2023-03-18 01:26:27,496] [INFO] Running command: cat /var/lib/cwl/stgcc9672ca-3dca-4980-b2fa-89d2951a672e/OceanDNA-b26662.fa | prodigal -d OceanDNA-b26662/cds.fna -a OceanDNA-b26662/protein.faa -g 11 -q > /dev/null
[2023-03-18 01:26:56,736] [INFO] Task succeeded: Prodigal
[2023-03-18 01:26:56,736] [INFO] Task started: HMMsearch
[2023-03-18 01:26:56,736] [INFO] Running command: hmmsearch --tblout OceanDNA-b26662/hmmer_result.tsv -E 1E-50 /var/lib/cwl/stgefbc69a2-28da-4953-8357-434d8598fd65/dqc_reference/reference_markers.hmm OceanDNA-b26662/protein.faa > /dev/null
[2023-03-18 01:26:57,112] [INFO] Task succeeded: HMMsearch
[2023-03-18 01:26:57,113] [INFO] Found 6/6 markers.
[2023-03-18 01:26:57,144] [INFO] Query marker FASTA was written to OceanDNA-b26662/markers.fasta
[2023-03-18 01:26:57,145] [INFO] Task started: Blastn
[2023-03-18 01:26:57,146] [INFO] Running command: blastn -query OceanDNA-b26662/markers.fasta -db /var/lib/cwl/stgefbc69a2-28da-4953-8357-434d8598fd65/dqc_reference/reference_markers.fasta -out OceanDNA-b26662/blast.markers.tsv -outfmt 6 -max_hsps 1 -num_alignments 5
[2023-03-18 01:26:58,230] [INFO] Task succeeded: Blastn
[2023-03-18 01:26:58,231] [INFO] Selected 25 target genomes.
[2023-03-18 01:26:58,231] [INFO] Target genome list was written to OceanDNA-b26662/target_genomes.txt
[2023-03-18 01:26:58,285] [INFO] Task started: fastANI
[2023-03-18 01:26:58,285] [INFO] Running command: fastANI --query /var/lib/cwl/stgcc9672ca-3dca-4980-b2fa-89d2951a672e/OceanDNA-b26662.fa --refList OceanDNA-b26662/target_genomes.txt --output OceanDNA-b26662/fastani_result.tsv --threads 1
[2023-03-18 01:27:23,676] [INFO] Task succeeded: fastANI
[2023-03-18 01:27:23,677] [INFO] Loading species specific ANI threshold from /var/lib/cwl/stgefbc69a2-28da-4953-8357-434d8598fd65/dqc_reference/prokaryote_ANI_species_specific_threshold.txt
[2023-03-18 01:27:23,677] [WARNING] Species-specific ANI threshold file not found. Will use the default threshold for all species. [/var/lib/cwl/stgefbc69a2-28da-4953-8357-434d8598fd65/dqc_reference/prokaryote_ANI_species_specific_threshold.txt]
[2023-03-18 01:27:23,691] [INFO] Found 25 fastANI hits (0 hits with ANI > threshold)
[2023-03-18 01:27:23,691] [INFO] The taxonomy check result is classified as 'below_threshold'.
[2023-03-18 01:27:23,691] [INFO] DFAST Taxonomy check final result
--------------------------------------------------------------------------------
organism_name	strain	accession	taxid	species_taxid	relation_to_type	validated	ani	matched_fragments	total_fragments	ani_threshold	status
Oharaeibacter diazotrophicus	strain=SM30	GCA_011317485.1	1920512	1920512	type	True	78.7562	553	1628	95	below_threshold
Methylobrevis albus	strain=L22	GCA_015904235.1	2793297	2793297	type	True	78.7462	616	1628	95	below_threshold
Methylobrevis pamukkalensis	strain=PK2	GCA_001720135.1	1439726	1439726	type	True	78.3985	503	1628	95	below_threshold
Oharaeibacter diazotrophicus	strain=DSM 102969	GCA_004362745.1	1920512	1920512	type	True	78.3671	667	1628	95	below_threshold
Starkeya novella	strain=DSM 506	GCA_000092925.1	921	921	type	True	78.0762	499	1628	95	below_threshold
Rhodobium orientis	strain=DSM 11290	GCA_016653375.1	34017	34017	type	True	77.9076	474	1628	95	below_threshold
Rhodobium orientis	strain=DSM 11290	GCA_014197785.1	34017	34017	type	True	77.8777	483	1628	95	below_threshold
Rhodobium orientis	strain=DSM 11290	GCA_003258835.1	34017	34017	type	True	77.8754	476	1628	95	below_threshold
Blastochloris sulfoviridis	strain=DSM 729	GCA_008630065.1	50712	50712	type	True	77.7724	400	1628	95	below_threshold
Ancylobacter sonchi	strain=VKM B-3145	GCA_018390695.1	1937790	1937790	type	True	77.746	539	1628	95	below_threshold
Ancylobacter oerskovii	strain=CCM 7435	GCA_018390555.1	459519	459519	type	True	77.7347	521	1628	95	below_threshold
Pleomorphomonas carboxyditropha	strain=SVCO-16	GCA_002770725.1	2023338	2023338	type	True	77.6832	464	1628	95	below_threshold
Ancylobacter defluvii	strain=VKM B-2789	GCA_018390605.1	1282440	1282440	type	True	77.6588	495	1628	95	below_threshold
Starkeya koreensis	strain=Jip08	GCA_023016525.1	266121	266121	type	True	77.5772	477	1628	95	below_threshold
Methylobacterium terrae	strain=17Sr1-28	GCA_003173755.1	2202827	2202827	type	True	77.3228	560	1628	95	below_threshold
Methylobacterium terricola	strain=17Sr1-39	GCA_006151805.1	2583531	2583531	type	True	77.3164	603	1628	95	below_threshold
Methylobacterium oryzihabitans	strain=TER-1	GCA_004004555.2	2499852	2499852	type	True	77.2929	581	1628	95	below_threshold
Methylobacterium tarhaniae	strain=DSM 25844	GCA_001043955.1	1187852	1187852	type	True	77.2776	499	1628	95	below_threshold
Methylobacterium nonmethylotrophicum	strain=6HR-1	GCA_004745635.1	1141884	1141884	type	True	77.241	592	1628	95	below_threshold
Bauldia litoralis	strain=ATCC 35022	GCA_900104485.1	665467	665467	type	True	77.2345	413	1628	95	below_threshold
Xanthobacter tagetidis	strain=DSM 11105	GCA_014206845.1	60216	60216	type	True	77.2239	462	1628	95	below_threshold
Methylobacterium currus	strain=PR1016A	GCA_003058325.1	2051553	2051553	type	True	77.2133	550	1628	95	below_threshold
Xanthobacter tagetidis	strain=ATCC 700314	GCA_003667445.1	60216	60216	type	True	77.2008	465	1628	95	below_threshold
Methylobacterium crusticola	strain=KCTC 52305	GCA_022179145.1	1697972	1697972	type	True	77.0928	573	1628	95	below_threshold
Methylobacterium goesingense	strain=DSM 21331	GCA_022179225.1	243690	243690	type	True	76.9352	360	1628	95	below_threshold
--------------------------------------------------------------------------------
[2023-03-18 01:27:23,691] [INFO] DFAST Taxonomy check result was written to OceanDNA-b26662/tc_result.tsv
[2023-03-18 01:27:23,692] [INFO] ===== Taxonomy check completed =====
[2023-03-18 01:27:23,692] [INFO] ===== Start completeness check using CheckM =====
[2023-03-18 01:27:23,692] [INFO] Setting CHECKM_DATA_PATH to /var/lib/cwl/stgefbc69a2-28da-4953-8357-434d8598fd65/dqc_reference/checkm_data
[2023-03-18 01:27:23,693] [INFO] Selected 'Prokaryote' markers (life, taxid=0) for CheckM
[2023-03-18 01:27:23,700] [INFO] Task started: CheckM
[2023-03-18 01:27:23,700] [INFO] Running command: checkm taxonomy_wf --tab_table -f OceanDNA-b26662/cc_result.tsv -t 1 life "Prokaryote" OceanDNA-b26662/checkm_input OceanDNA-b26662/checkm_result
[2023-03-18 01:28:37,423] [INFO] Task succeeded: CheckM
[2023-03-18 01:28:37,423] [INFO] Completeness check finished.
--------------------------------------------------------------------------------
Completeness: 83.33%
Contamination: 5.21%
Strain heterogeneity: 100.00%
--------------------------------------------------------------------------------
[2023-03-18 01:28:37,427] [INFO] ===== Completeness check finished =====
[2023-03-18 01:28:37,427] [INFO] ===== Start GTDB Search =====
[2023-03-18 01:28:37,427] [INFO] Query marker FASTA already exists. Will reuse it. (OceanDNA-b26662/markers.fasta)
[2023-03-18 01:28:37,428] [INFO] Task started: Blastn
[2023-03-18 01:28:37,429] [INFO] Running command: blastn -query OceanDNA-b26662/markers.fasta -db /var/lib/cwl/stgefbc69a2-28da-4953-8357-434d8598fd65/dqc_reference/reference_markers_gtdb.fasta -out OceanDNA-b26662/blast.markers.gtdb.tsv -outfmt 6 -max_hsps 1 -num_alignments 5
[2023-03-18 01:28:39,565] [INFO] Task succeeded: Blastn
[2023-03-18 01:28:39,566] [INFO] Selected 28 target genomes.
[2023-03-18 01:28:39,566] [INFO] Target genome list was written to OceanDNA-b26662/target_genomes_gtdb.txt
[2023-03-18 01:28:39,790] [INFO] Task started: fastANI
[2023-03-18 01:28:39,791] [INFO] Running command: fastANI --query /var/lib/cwl/stgcc9672ca-3dca-4980-b2fa-89d2951a672e/OceanDNA-b26662.fa --refList OceanDNA-b26662/target_genomes_gtdb.txt --output OceanDNA-b26662/fastani_result_gtdb.tsv --threads 1
[2023-03-18 01:29:06,355] [INFO] Task succeeded: fastANI
[2023-03-18 01:29:06,371] [INFO] Found 28 fastANI hits (0 hits with ANI > circumscription radius)
[2023-03-18 01:29:06,371] [INFO] GTDB search result
--------------------------------------------------------------------------------
accession	gtdb_species	ani	matched_fragments	total_fragments	gtdb_taxonomy	ani_circumscription_radius	mean_intra_species_ani	min_intra_species_ani	mean_intra_species_af	min_intra_species_af	num_clustered_genomes	status
GCF_015904235.1	s__L22 sp015904235	78.7786	611	1628	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhizobiales;f__Pleomorphomonadaceae;g__L22	95.0	N/A	N/A	N/A	N/A	1	-
GCF_009600605.1	s__Pseudoxanthobacter spirostomi	78.7762	629	1628	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhizobiales;f__Pseudoxanthobacteraceae;g__Pseudoxanthobacter	95.0	N/A	N/A	N/A	N/A	1	-
GCF_900148505.1	s__Pseudoxanthobacter soli	78.5661	533	1628	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhizobiales;f__Pseudoxanthobacteraceae;g__Pseudoxanthobacter	95.0	N/A	N/A	N/A	N/A	1	-
GCF_001720135.1	s__Methylobrevis pamukkalensis	78.3915	503	1628	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhizobiales;f__Pleomorphomonadaceae;g__Methylobrevis	95.0	N/A	N/A	N/A	N/A	1	-
GCF_004362745.1	s__Oharaeibacter diazotrophicus	78.3741	665	1628	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhizobiales;f__Pleomorphomonadaceae;g__Oharaeibacter	95.0	99.98	99.97	1.00	1.00	3	-
GCF_017872635.1	s__Starkeya sp017872635	77.9519	490	1628	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhizobiales;f__Xanthobacteraceae;g__Starkeya	95.0	100.00	100.00	1.00	1.00	2	-
GCF_003258835.1	s__Rhodobium orientis	77.9254	469	1628	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhizobiales;f__Rhodobiaceae;g__Rhodobium	95.0	99.99	99.98	0.99	0.98	3	-
GCF_014199915.1	s__Prosthecomicrobium pneumaticum	77.9019	556	1628	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhizobiales;f__Kaistiaceae;g__Prosthecomicrobium	95.0	N/A	N/A	N/A	N/A	1	-
GCF_000702305.1	s__GCF-000702305 sp000702305	77.8971	490	1628	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhizobiales;f__Xanthobacteraceae;g__GCF-000702305	95.0	N/A	N/A	N/A	N/A	1	-
GCF_013839525.1	s__Mongoliimonas rhizosphaerae	77.8306	510	1628	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhizobiales;f__Pleomorphomonadaceae;g__Mongoliimonas	95.0	N/A	N/A	N/A	N/A	1	-
GCF_018390695.1	s__Ancylobacter_B sonchi	77.7911	530	1628	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhizobiales;f__Xanthobacteraceae;g__Ancylobacter_B	95.0	N/A	N/A	N/A	N/A	1	-
GCF_001696535.1	s__Stappia indica	77.7455	458	1628	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhizobiales;f__Stappiaceae;g__Stappia	95.0	97.23	96.79	0.94	0.94	3	-
GCA_018729455.1	s__Prosthecomicrobium_A sp018729455	77.7286	562	1628	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhizobiales;f__Ancalomicrobiaceae;g__Prosthecomicrobium_A	95.0	N/A	N/A	N/A	N/A	1	-
GCA_012517025.1	s__Pinisolibacter sp012517025	77.6742	419	1628	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhizobiales;f__Ancalomicrobiaceae;g__Pinisolibacter	95.0	N/A	N/A	N/A	N/A	1	-
GCF_002770725.1	s__Pleomorphomonas carboxyditropha	77.6715	464	1628	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhizobiales;f__Pleomorphomonadaceae;g__Pleomorphomonas	95.0	98.10	98.10	0.89	0.89	2	-
GCF_018390635.1	s__Ancylobacter_C lacus	77.6406	495	1628	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhizobiales;f__Xanthobacteraceae;g__Ancylobacter_C	95.0	N/A	N/A	N/A	N/A	1	-
GCF_003610575.1	s__Stappia sp003610575	77.5009	443	1628	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhizobiales;f__Stappiaceae;g__Stappia	95.0	98.18	98.18	0.97	0.97	2	-
GCF_006151805.1	s__Methylobacterium sp006151805	77.3195	604	1628	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhizobiales;f__Beijerinckiaceae;g__Methylobacterium	95.0	N/A	N/A	N/A	N/A	1	-
GCA_013377085.1	s__Rhodoplanes sp013377085	77.3072	499	1628	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhizobiales;f__Xanthobacteraceae;g__Rhodoplanes	95.0	N/A	N/A	N/A	N/A	1	-
GCA_016793065.1	s__JAEULV01 sp016793065	77.2902	417	1628	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhizobiales;f__Ancalomicrobiaceae;g__JAEULV01	95.0	N/A	N/A	N/A	N/A	1	-
GCF_900104485.1	s__Bauldia litoralis	77.2812	406	1628	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhizobiales;f__Kaistiaceae;g__Bauldia	95.0	N/A	N/A	N/A	N/A	1	-
GCA_009720755.1	s__Rhodoplanes serenus	77.2801	519	1628	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhizobiales;f__Xanthobacteraceae;g__Rhodoplanes	95.0	97.72	97.48	0.91	0.88	4	-
GCA_016793595.1	s__Phreatobacter sp016793595	77.2394	387	1628	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhizobiales;f__Phreatobacteraceae;g__Phreatobacter	95.0	N/A	N/A	N/A	N/A	1	-
GCF_003667445.1	s__Xanthobacter tagetidis	77.1954	466	1628	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhizobiales;f__Xanthobacteraceae;g__Xanthobacter	95.0	100.00	100.00	1.00	1.00	2	-
GCA_016793505.1	s__GCA-013693735 sp016793505	77.1196	447	1628	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhizobiales;f__Beijerinckiaceae;g__GCA-013693735	95.0	N/A	N/A	N/A	N/A	1	-
GCF_000019365.1	s__Methylobacterium sp000019365	76.7301	549	1628	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhizobiales;f__Beijerinckiaceae;g__Methylobacterium	95.0	99.09	99.09	0.89	0.89	2	-
GCA_007118805.1	s__SKQW01 sp007118805	76.6183	214	1628	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhizobiales;f__Tepidamorphaceae;g__SKQW01	95.0	N/A	N/A	N/A	N/A	1	-
GCF_014644455.1	s__Caldovatus sediminis	75.9679	447	1628	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Acetobacterales;f__Acetobacteraceae;g__Caldovatus	95.0	N/A	N/A	N/A	N/A	1	-
--------------------------------------------------------------------------------
[2023-03-18 01:29:06,371] [INFO] GTDB search result was written to OceanDNA-b26662/result_gtdb.tsv
[2023-03-18 01:29:06,371] [INFO] ===== GTDB Search completed =====
[2023-03-18 01:29:06,374] [INFO] DFAST_QC result json was written to OceanDNA-b26662/dqc_result.json
[2023-03-18 01:29:06,374] [INFO] DFAST_QC completed!
[2023-03-18 01:29:06,374] [INFO] Total running time: 0h2m40s
