[2023-03-18 00:35:04,589] [INFO] DFAST_QC pipeline started.
[2023-03-18 00:35:04,590] [INFO] DFAST_QC version: 0.5.7
[2023-03-18 00:35:04,590] [INFO] DQC Reference Directory: /var/lib/cwl/stg80f9ce44-217e-4a31-b961-edb3b7e25ea7/dqc_reference
[2023-03-18 00:35:05,669] [INFO] ===== Start taxonomy check using ANI =====
[2023-03-18 00:35:05,669] [INFO] Task started: Prodigal
[2023-03-18 00:35:05,670] [INFO] Running command: cat /var/lib/cwl/stg297f4130-2b58-43ed-9809-ee2b5daca818/OceanDNA-b31687.fa | prodigal -d OceanDNA-b31687/cds.fna -a OceanDNA-b31687/protein.faa -g 11 -q > /dev/null
[2023-03-18 00:35:29,798] [INFO] Task succeeded: Prodigal
[2023-03-18 00:35:29,799] [INFO] Task started: HMMsearch
[2023-03-18 00:35:29,799] [INFO] Running command: hmmsearch --tblout OceanDNA-b31687/hmmer_result.tsv -E 1E-50 /var/lib/cwl/stg80f9ce44-217e-4a31-b961-edb3b7e25ea7/dqc_reference/reference_markers.hmm OceanDNA-b31687/protein.faa > /dev/null
[2023-03-18 00:35:30,049] [INFO] Task succeeded: HMMsearch
[2023-03-18 00:35:30,050] [INFO] Found 6/6 markers.
[2023-03-18 00:35:30,072] [INFO] Query marker FASTA was written to OceanDNA-b31687/markers.fasta
[2023-03-18 00:35:30,073] [INFO] Task started: Blastn
[2023-03-18 00:35:30,073] [INFO] Running command: blastn -query OceanDNA-b31687/markers.fasta -db /var/lib/cwl/stg80f9ce44-217e-4a31-b961-edb3b7e25ea7/dqc_reference/reference_markers.fasta -out OceanDNA-b31687/blast.markers.tsv -outfmt 6 -max_hsps 1 -num_alignments 5
[2023-03-18 00:35:30,829] [INFO] Task succeeded: Blastn
[2023-03-18 00:35:30,829] [INFO] Selected 26 target genomes.
[2023-03-18 00:35:30,830] [INFO] Target genome list was written to OceanDNA-b31687/target_genomes.txt
[2023-03-18 00:35:30,846] [INFO] Task started: fastANI
[2023-03-18 00:35:30,847] [INFO] Running command: fastANI --query /var/lib/cwl/stg297f4130-2b58-43ed-9809-ee2b5daca818/OceanDNA-b31687.fa --refList OceanDNA-b31687/target_genomes.txt --output OceanDNA-b31687/fastani_result.tsv --threads 1
[2023-03-18 00:35:47,063] [INFO] Task succeeded: fastANI
[2023-03-18 00:35:47,063] [INFO] Loading species specific ANI threshold from /var/lib/cwl/stg80f9ce44-217e-4a31-b961-edb3b7e25ea7/dqc_reference/prokaryote_ANI_species_specific_threshold.txt
[2023-03-18 00:35:47,064] [WARNING] Species-specific ANI threshold file not found. Will use the default threshold for all species. [/var/lib/cwl/stg80f9ce44-217e-4a31-b961-edb3b7e25ea7/dqc_reference/prokaryote_ANI_species_specific_threshold.txt]
[2023-03-18 00:35:47,077] [INFO] Found 26 fastANI hits (0 hits with ANI > threshold)
[2023-03-18 00:35:47,077] [INFO] The taxonomy check result is classified as 'below_threshold'.
[2023-03-18 00:35:47,078] [INFO] DFAST Taxonomy check final result
--------------------------------------------------------------------------------
organism_name	strain	accession	taxid	species_taxid	relation_to_type	validated	ani	matched_fragments	total_fragments	ani_threshold	status
Erythrobacter litoralis	strain=DSM 8509	GCA_000714795.1	39960	39960	type	True	78.942	488	1232	95	below_threshold
Erythrobacter litoralis	strain=DSM 8509	GCA_001719165.1	39960	39960	type	True	78.9259	486	1232	95	below_threshold
Erythrobacter rubeus	strain=KMU-140	GCA_014705715.1	2760803	2760803	type	True	78.8528	455	1232	95	below_threshold
Erythrobacter colymbi	strain=JCM 18338	GCA_002155685.1	1161202	1161202	type	True	78.6516	453	1232	95	below_threshold
Qipengyuania proteolytica	strain=6B39	GCA_019711565.1	2867239	2867239	type	True	78.5914	338	1232	95	below_threshold
Erythrobacter sanguineus	strain=DSM 11032	GCA_900143235.1	198312	198312	type	True	78.5335	437	1232	95	below_threshold
Erythrobacter sanguineus	strain=JCM 20691	GCA_002155655.1	198312	198312	type	True	78.5292	448	1232	95	below_threshold
Erythrobacter tepidarius	strain=DSM 10594	GCA_002155695.1	60454	60454	type	True	78.4845	441	1232	95	below_threshold
Alteriqipengyuania lutimaris	strain=S-5	GCA_003363135.1	1538146	1538146	type	True	78.3368	299	1232	95	below_threshold
Alteriqipengyuania lutimaris	strain=CECT 8624	GCA_014191645.1	1538146	1538146	type	True	78.3272	302	1232	95	below_threshold
Qipengyuania qiaonensis	strain=6D47A	GCA_019711515.1	2867240	2867240	type	True	78.2735	288	1232	95	below_threshold
Qipengyuania xiamenensis	strain=1XM1-15A	GCA_019711495.1	2867237	2867237	type	True	78.2284	315	1232	95	below_threshold
Qipengyuania flava	strain=DSM 16421	GCA_011762005.1	192812	192812	type	True	78.2108	304	1232	95	below_threshold
Pelagerythrobacter aerophilus	strain=Ery1	GCA_003581645.1	2306995	2306995	type	True	78.1489	292	1232	95	below_threshold
Qipengyuania algicida	strain=KEMB 9005-328	GCA_009828025.1	1836209	1836209	type	True	78.1255	248	1232	95	below_threshold
Qipengyuania polymorpha	strain=1NDH17	GCA_019711435.1	2867234	2867234	type	True	78.0963	293	1232	95	below_threshold
Erythrobacter cryptus	strain=DSM 12079	GCA_000422985.1	196588	196588	type	True	78.0916	400	1232	95	below_threshold
Qipengyuania huizhouensis	strain=YG19	GCA_019711635.1	2867245	2867245	type	True	78.0855	236	1232	95	below_threshold
Qipengyuania pelagi	strain=JCM 17468	GCA_009827295.1	994320	994320	type	True	78.059	313	1232	95	below_threshold
Pelagerythrobacter marinus	strain=H32	GCA_009827515.1	538382	538382	type	True	77.8959	324	1232	95	below_threshold
Tsuneonella rigui	strain=KCTC 42620	GCA_003958625.1	1708790	1708790	type	True	77.8528	275	1232	95	below_threshold
Aurantiacibacter xanthus	strain=CCTCC AB 2015396	GCA_003584015.1	1784712	1784712	type	True	77.8024	246	1232	95	below_threshold
Croceicoccus marinus	strain=E4A9	GCA_001661675.2	450378	450378	type	True	77.6131	224	1232	95	below_threshold
Tsuneonella amylolytica	strain=NS1	GCA_003626915.1	2338327	2338327	type	True	77.5336	256	1232	95	below_threshold
Novosphingobium percolationis	strain=c1	GCA_020179425.1	2871811	2871811	type	True	77.252	232	1232	95	below_threshold
Novosphingobium rosa	strain=NBRC 15208	GCA_001598555.1	76978	76978	type	True	77.0666	171	1232	95	below_threshold
--------------------------------------------------------------------------------
[2023-03-18 00:35:47,078] [INFO] DFAST Taxonomy check result was written to OceanDNA-b31687/tc_result.tsv
[2023-03-18 00:35:47,078] [INFO] ===== Taxonomy check completed =====
[2023-03-18 00:35:47,078] [INFO] ===== Start completeness check using CheckM =====
[2023-03-18 00:35:47,078] [INFO] Setting CHECKM_DATA_PATH to /var/lib/cwl/stg80f9ce44-217e-4a31-b961-edb3b7e25ea7/dqc_reference/checkm_data
[2023-03-18 00:35:47,079] [INFO] Selected 'Prokaryote' markers (life, taxid=0) for CheckM
[2023-03-18 00:35:47,084] [INFO] Task started: CheckM
[2023-03-18 00:35:47,084] [INFO] Running command: checkm taxonomy_wf --tab_table -f OceanDNA-b31687/cc_result.tsv -t 1 life "Prokaryote" OceanDNA-b31687/checkm_input OceanDNA-b31687/checkm_result
[2023-03-18 00:36:45,479] [INFO] Task succeeded: CheckM
[2023-03-18 00:36:45,479] [INFO] Completeness check finished.
--------------------------------------------------------------------------------
Completeness: 100.00%
Contamination: 0.00%
Strain heterogeneity: 0.00%
--------------------------------------------------------------------------------
[2023-03-18 00:36:45,482] [INFO] ===== Completeness check finished =====
[2023-03-18 00:36:45,482] [INFO] ===== Start GTDB Search =====
[2023-03-18 00:36:45,482] [INFO] Query marker FASTA already exists. Will reuse it. (OceanDNA-b31687/markers.fasta)
[2023-03-18 00:36:45,482] [INFO] Task started: Blastn
[2023-03-18 00:36:45,482] [INFO] Running command: blastn -query OceanDNA-b31687/markers.fasta -db /var/lib/cwl/stg80f9ce44-217e-4a31-b961-edb3b7e25ea7/dqc_reference/reference_markers_gtdb.fasta -out OceanDNA-b31687/blast.markers.gtdb.tsv -outfmt 6 -max_hsps 1 -num_alignments 5
[2023-03-18 00:36:46,851] [INFO] Task succeeded: Blastn
[2023-03-18 00:36:46,852] [INFO] Selected 18 target genomes.
[2023-03-18 00:36:46,852] [INFO] Target genome list was written to OceanDNA-b31687/target_genomes_gtdb.txt
[2023-03-18 00:36:46,873] [INFO] Task started: fastANI
[2023-03-18 00:36:46,873] [INFO] Running command: fastANI --query /var/lib/cwl/stg297f4130-2b58-43ed-9809-ee2b5daca818/OceanDNA-b31687.fa --refList OceanDNA-b31687/target_genomes_gtdb.txt --output OceanDNA-b31687/fastani_result_gtdb.tsv --threads 1
[2023-03-18 00:36:57,263] [INFO] Task succeeded: fastANI
[2023-03-18 00:36:57,273] [INFO] Found 18 fastANI hits (0 hits with ANI > circumscription radius)
[2023-03-18 00:36:57,274] [INFO] GTDB search result
--------------------------------------------------------------------------------
accession	gtdb_species	ani	matched_fragments	total_fragments	gtdb_taxonomy	ani_circumscription_radius	mean_intra_species_ani	min_intra_species_ani	mean_intra_species_af	min_intra_species_af	num_clustered_genomes	status
GCF_018205975.1	s__Erythrobacter sp018205975	80.8097	723	1232	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Sphingomonadales;f__Sphingomonadaceae;g__Erythrobacter	95.0	N/A	N/A	N/A	N/A	1	-
GCA_000714765.1	s__Erythrobacter sp000714765	79.2153	512	1232	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Sphingomonadales;f__Sphingomonadaceae;g__Erythrobacter	95.0	N/A	N/A	N/A	N/A	1	-
GCF_009363635.1	s__Erythrobacter sp009363635	79.1027	498	1232	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Sphingomonadales;f__Sphingomonadaceae;g__Erythrobacter	95.0	N/A	N/A	N/A	N/A	1	-
GCF_000152865.1	s__Erythrobacter sp000152865	78.9651	478	1232	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Sphingomonadales;f__Sphingomonadaceae;g__Erythrobacter	95.0	N/A	N/A	N/A	N/A	1	-
GCF_003264115.1	s__Erythrobacter sp003264115	78.9427	438	1232	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Sphingomonadales;f__Sphingomonadaceae;g__Erythrobacter	95.0	N/A	N/A	N/A	N/A	1	-
GCF_014705715.1	s__Erythrobacter sp014705715	78.8672	453	1232	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Sphingomonadales;f__Sphingomonadaceae;g__Erythrobacter	95.0	N/A	N/A	N/A	N/A	1	-
GCA_903934325.1	s__Erythrobacter sp903934325	78.7535	424	1232	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Sphingomonadales;f__Sphingomonadaceae;g__Erythrobacter	95.0	99.99	99.99	1.00	1.00	2	-
GCF_001720465.1	s__Erythrobacter sp001720465	78.7395	441	1232	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Sphingomonadales;f__Sphingomonadaceae;g__Erythrobacter	95.0	N/A	N/A	N/A	N/A	1	-
GCA_002706445.1	s__Erythrobacter sp002706445	78.6217	317	1232	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Sphingomonadales;f__Sphingomonadaceae;g__Erythrobacter	95.0	N/A	N/A	N/A	N/A	1	-
GCA_013822695.1	s__Erythrobacter sp013822695	78.5646	407	1232	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Sphingomonadales;f__Sphingomonadaceae;g__Erythrobacter	95.0	99.78	99.78	0.82	0.82	2	-
GCF_900143235.1	s__Erythrobacter sanguineus	78.5379	436	1232	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Sphingomonadales;f__Sphingomonadaceae;g__Erythrobacter	95.0	99.99	99.99	1.00	1.00	2	-
GCF_900177715.1	s__Erythrobacter xiamenensis	78.5221	408	1232	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Sphingomonadales;f__Sphingomonadaceae;g__Erythrobacter	95.0	N/A	N/A	N/A	N/A	1	-
GCF_001557325.1	s__Erythrobacter donghaensis_B	78.5065	393	1232	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Sphingomonadales;f__Sphingomonadaceae;g__Erythrobacter	95.0	99.70	99.64	0.86	0.85	7	-
GCA_019204025.1	s__Erythrobacter sp019204025	78.4359	421	1232	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Sphingomonadales;f__Sphingomonadaceae;g__Erythrobacter	95.0	N/A	N/A	N/A	N/A	1	-
GCF_016803135.1	s__Pelagerythrobacter sp016803135	78.2466	299	1232	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Sphingomonadales;f__Sphingomonadaceae;g__Pelagerythrobacter	95.0	N/A	N/A	N/A	N/A	1	-
GCF_009827295.1	s__Qipengyuania pelagi	78.059	313	1232	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Sphingomonadales;f__Sphingomonadaceae;g__Qipengyuania	95.0	N/A	N/A	N/A	N/A	1	-
GCA_013911865.1	s__Qipengyuania sp013911865	77.7629	196	1232	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Sphingomonadales;f__Sphingomonadaceae;g__Qipengyuania	95.0	N/A	N/A	N/A	N/A	1	-
GCF_000715015.1	s__Erythrobacter longus	77.3732	274	1232	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Sphingomonadales;f__Sphingomonadaceae;g__Erythrobacter	95.0	N/A	N/A	N/A	N/A	1	-
--------------------------------------------------------------------------------
[2023-03-18 00:36:57,274] [INFO] GTDB search result was written to OceanDNA-b31687/result_gtdb.tsv
[2023-03-18 00:36:57,274] [INFO] ===== GTDB Search completed =====
[2023-03-18 00:36:57,276] [INFO] DFAST_QC result json was written to OceanDNA-b31687/dqc_result.json
[2023-03-18 00:36:57,276] [INFO] DFAST_QC completed!
[2023-03-18 00:36:57,276] [INFO] Total running time: 0h1m53s
