[2023-03-15 11:54:50,862] [INFO] DFAST_QC pipeline started.
[2023-03-15 11:54:50,862] [INFO] DFAST_QC version: 0.5.7
[2023-03-15 11:54:50,862] [INFO] DQC Reference Directory: /var/lib/cwl/stg09f817f9-b21c-4156-bc32-a021bf1157c8/dqc_reference
[2023-03-15 11:54:51,984] [INFO] ===== Start taxonomy check using ANI =====
[2023-03-15 11:54:51,984] [INFO] Task started: Prodigal
[2023-03-15 11:54:51,984] [INFO] Running command: cat /var/lib/cwl/stgcd3813fb-4e09-43f3-afda-6dd09f24d0a9/OceanDNA-b27564.fa | prodigal -d OceanDNA-b27564/cds.fna -a OceanDNA-b27564/protein.faa -g 11 -q > /dev/null
[2023-03-15 11:55:11,008] [INFO] Task succeeded: Prodigal
[2023-03-15 11:55:11,009] [INFO] Task started: HMMsearch
[2023-03-15 11:55:11,009] [INFO] Running command: hmmsearch --tblout OceanDNA-b27564/hmmer_result.tsv -E 1E-50 /var/lib/cwl/stg09f817f9-b21c-4156-bc32-a021bf1157c8/dqc_reference/reference_markers.hmm OceanDNA-b27564/protein.faa > /dev/null
[2023-03-15 11:55:11,199] [INFO] Task succeeded: HMMsearch
[2023-03-15 11:55:11,200] [INFO] Found 6/6 markers.
[2023-03-15 11:55:11,224] [INFO] Query marker FASTA was written to OceanDNA-b27564/markers.fasta
[2023-03-15 11:55:11,226] [INFO] Task started: Blastn
[2023-03-15 11:55:11,226] [INFO] Running command: blastn -query OceanDNA-b27564/markers.fasta -db /var/lib/cwl/stg09f817f9-b21c-4156-bc32-a021bf1157c8/dqc_reference/reference_markers.fasta -out OceanDNA-b27564/blast.markers.tsv -outfmt 6 -max_hsps 1 -num_alignments 5
[2023-03-15 11:55:12,051] [INFO] Task succeeded: Blastn
[2023-03-15 11:55:12,053] [INFO] Selected 34 target genomes.
[2023-03-15 11:55:12,054] [INFO] Target genome list was written to OceanDNA-b27564/target_genomes.txt
[2023-03-15 11:55:12,073] [INFO] Task started: fastANI
[2023-03-15 11:55:12,073] [INFO] Running command: fastANI --query /var/lib/cwl/stgcd3813fb-4e09-43f3-afda-6dd09f24d0a9/OceanDNA-b27564.fa --refList OceanDNA-b27564/target_genomes.txt --output OceanDNA-b27564/fastani_result.tsv --threads 1
[2023-03-15 11:55:34,294] [INFO] Task succeeded: fastANI
[2023-03-15 11:55:34,294] [INFO] Loading species specific ANI threshold from /var/lib/cwl/stg09f817f9-b21c-4156-bc32-a021bf1157c8/dqc_reference/prokaryote_ANI_species_specific_threshold.txt
[2023-03-15 11:55:34,295] [WARNING] Species-specific ANI threshold file not found. Will use the default threshold for all species. [/var/lib/cwl/stg09f817f9-b21c-4156-bc32-a021bf1157c8/dqc_reference/prokaryote_ANI_species_specific_threshold.txt]
[2023-03-15 11:55:34,312] [INFO] Found 34 fastANI hits (0 hits with ANI > threshold)
[2023-03-15 11:55:34,312] [INFO] The taxonomy check result is classified as 'below_threshold'.
[2023-03-15 11:55:34,313] [INFO] DFAST Taxonomy check final result
--------------------------------------------------------------------------------
organism_name	strain	accession	taxid	species_taxid	relation_to_type	validated	ani	matched_fragments	total_fragments	ani_threshold	status
Rhodovulum marinum	strain=DSM 18063	GCA_004343075.1	320662	320662	type	True	77.4496	282	1019	95	below_threshold
Brevirhabdus pacifica	strain=22DY15	GCA_002094875.1	1267768	1267768	type	True	77.3898	215	1019	95	below_threshold
Brevirhabdus pacifica	strain=DSM 27767	GCA_002797755.1	1267768	1267768	type	True	77.3175	241	1019	95	below_threshold
Ruegeria pomeroyi	strain=DSS-3	GCA_000011965.2	89184	89184	suspected-type	True	77.2634	330	1019	95	below_threshold
Sulfitobacter sabulilitoris	strain=HSMS-29	GCA_005887615.1	2562655	2562655	type	True	77.2491	264	1019	95	below_threshold
Pelagivirga sediminicola	strain=BH-SD19	GCA_003072125.1	2170575	2170575	type	True	77.2445	253	1019	95	below_threshold
Loktanella atrilutea	strain=DSM 29326	GCA_900128995.1	366533	366533	type	True	77.2243	228	1019	95	below_threshold
Cribrihabitans marinus	strain=CGMCC 1.13219	GCA_014640375.1	1227549	1227549	type	True	77.2212	305	1019	95	below_threshold
Cribrihabitans marinus	strain=DSM 29340	GCA_900109035.1	1227549	1227549	type	True	77.2144	309	1019	95	below_threshold
Actibacterium mucosum	strain=KCTC 23349	GCA_000647975.1	1087332	1087332	type	True	77.2019	215	1019	95	below_threshold
Thalassobius mangrovi	strain=GS-10	GCA_009857745.1	2692236	2692236	type	True	77.1907	297	1019	95	below_threshold
Salipiger marinus	strain=DSM 26424	GCA_900100085.1	555512	555512	type	True	77.1268	271	1019	95	below_threshold
Pelagivirga dicentrarchi	strain=YLY04	GCA_003316635.1	2250573	2250573	type	True	77.1039	204	1019	95	below_threshold
Actibacterium atlanticum	strain=22II-S11-z10	GCA_000671395.1	1461693	1461693	type	True	77.054	170	1019	95	below_threshold
Roseovarius halotolerans	strain=CECT 8110	GCA_900172255.1	505353	505353	type	True	77.0013	243	1019	95	below_threshold
Roseovarius halotolerans	strain=DSM 29507	GCA_003634925.1	505353	505353	type	True	76.9603	244	1019	95	below_threshold
Puniceibacterium antarcticum	strain=SM1211	GCA_002760615.1	1206336	1206336	type	True	76.9136	231	1019	95	below_threshold
Roseovarius aestuarii	strain=KCTC 22174	GCA_019966515.1	475083	475083	type	True	76.8853	189	1019	95	below_threshold
Puniceibacterium sediminis	strain=DSM 29052	GCA_900188035.1	1608407	1608407	type	True	76.8831	216	1019	95	below_threshold
Nioella nitratireducens	strain=SSW136	GCA_001879715.1	1287720	1287720	type	True	76.8562	230	1019	95	below_threshold
Pseudopuniceibacterium antarcticum	strain=HQ09	GCA_014899185.1	2613965	2613965	type	True	76.8385	229	1019	95	below_threshold
Phaeovulum veldkampii	strain=DSM 11550	GCA_003034995.1	33049	33049	type	True	76.8335	228	1019	95	below_threshold
Phaeovulum veldkampii	strain=DSM 11550	GCA_016653435.1	33049	33049	type	True	76.8288	224	1019	95	below_threshold
Phaeovulum veldkampii	strain=DSM 11550	GCA_004363195.1	33049	33049	type	True	76.8154	226	1019	95	below_threshold
Yoonia vestfoldensis	strain=DSM 16212	GCA_000382265.1	245188	245188	type	True	76.8145	241	1019	95	below_threshold
Alexandriicola marinus	strain=LZ-14	GCA_004000435.1	2081710	2081710	type	True	76.7774	223	1019	95	below_threshold
Actibacterium pelagium	strain=JN33	GCA_002285415.1	2029103	2029103	type	True	76.7467	154	1019	95	below_threshold
Flavimaricola marinus	strain=CECT 8899	GCA_900184895.1	1819565	1819565	type	True	76.7412	202	1019	95	below_threshold
Paracoccus limosus	strain=JCM 17370	GCA_009711185.1	913252	913252	type	True	76.7334	205	1019	95	below_threshold
Pseudooceanicola algae	strain=Lw-13e	GCA_003590145.2	1537215	1537215	type	True	76.6496	214	1019	95	below_threshold
Roseobacter denitrificans	strain=OCh 114	GCA_000014045.1	2434	2434	type	True	76.6081	159	1019	95	below_threshold
Salipiger pallidus	strain=CGMCC 1.15762	GCA_014643635.1	1775170	1775170	type	True	76.5946	160	1019	95	below_threshold
Cereibacter azotoformans	strain=KA25	GCA_003050905.1	43057	43057	type	True	76.5644	185	1019	95	below_threshold
Marivivens niveibacter	strain=MCCC 1A06712	GCA_002150005.2	1930667	1930667	type	True	75.7135	90	1019	95	below_threshold
--------------------------------------------------------------------------------
[2023-03-15 11:55:34,314] [INFO] DFAST Taxonomy check result was written to OceanDNA-b27564/tc_result.tsv
[2023-03-15 11:55:34,315] [INFO] ===== Taxonomy check completed =====
[2023-03-15 11:55:34,316] [INFO] ===== Start completeness check using CheckM =====
[2023-03-15 11:55:34,316] [INFO] Setting CHECKM_DATA_PATH to /var/lib/cwl/stg09f817f9-b21c-4156-bc32-a021bf1157c8/dqc_reference/checkm_data
[2023-03-15 11:55:34,316] [INFO] Selected 'Prokaryote' markers (life, taxid=0) for CheckM
[2023-03-15 11:55:34,323] [INFO] Task started: CheckM
[2023-03-15 11:55:34,323] [INFO] Running command: checkm taxonomy_wf --tab_table -f OceanDNA-b27564/cc_result.tsv -t 1 life "Prokaryote" OceanDNA-b27564/checkm_input OceanDNA-b27564/checkm_result
[2023-03-15 11:56:22,819] [INFO] Task succeeded: CheckM
[2023-03-15 11:56:22,819] [INFO] Completeness check finished.
--------------------------------------------------------------------------------
Completeness: 83.33%
Contamination: 0.00%
Strain heterogeneity: 0.00%
--------------------------------------------------------------------------------
[2023-03-15 11:56:22,840] [INFO] ===== Completeness check finished =====
[2023-03-15 11:56:22,840] [INFO] ===== Start GTDB Search =====
[2023-03-15 11:56:22,840] [INFO] Query marker FASTA already exists. Will reuse it. (OceanDNA-b27564/markers.fasta)
[2023-03-15 11:56:22,841] [INFO] Task started: Blastn
[2023-03-15 11:56:22,841] [INFO] Running command: blastn -query OceanDNA-b27564/markers.fasta -db /var/lib/cwl/stg09f817f9-b21c-4156-bc32-a021bf1157c8/dqc_reference/reference_markers_gtdb.fasta -out OceanDNA-b27564/blast.markers.gtdb.tsv -outfmt 6 -max_hsps 1 -num_alignments 5
[2023-03-15 11:56:24,391] [INFO] Task succeeded: Blastn
[2023-03-15 11:56:24,395] [INFO] Selected 31 target genomes.
[2023-03-15 11:56:24,395] [INFO] Target genome list was written to OceanDNA-b27564/target_genomes_gtdb.txt
[2023-03-15 11:56:24,458] [INFO] Task started: fastANI
[2023-03-15 11:56:24,458] [INFO] Running command: fastANI --query /var/lib/cwl/stgcd3813fb-4e09-43f3-afda-6dd09f24d0a9/OceanDNA-b27564.fa --refList OceanDNA-b27564/target_genomes_gtdb.txt --output OceanDNA-b27564/fastani_result_gtdb.tsv --threads 1
[2023-03-15 11:56:46,179] [INFO] Task succeeded: fastANI
[2023-03-15 11:56:46,195] [INFO] Found 31 fastANI hits (0 hits with ANI > circumscription radius)
[2023-03-15 11:56:46,196] [INFO] GTDB search result
--------------------------------------------------------------------------------
accession	gtdb_species	ani	matched_fragments	total_fragments	gtdb_taxonomy	ani_circumscription_radius	mean_intra_species_ani	min_intra_species_ani	mean_intra_species_af	min_intra_species_af	num_clustered_genomes	status
GCF_014196965.1	s__Actibacterium_A naphthalenivorans	77.7099	288	1019	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhodobacterales;f__Rhodobacteraceae;g__Actibacterium_A	95.0	98.28	97.38	0.93	0.89	4	-
GCA_017792445.1	s__Rhodobacter_B sp017792445	77.6732	329	1019	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhodobacterales;f__Rhodobacteraceae;g__Rhodobacter_B	95.0	N/A	N/A	N/A	N/A	1	-
GCF_900142185.1	s__Lutimaribacter pacificus	77.6373	329	1019	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhodobacterales;f__Rhodobacteraceae;g__Lutimaribacter	95.0	100.00	100.00	1.00	1.00	2	-
GCF_019104745.1	s__CAU-1522 sp019104745	77.5699	291	1019	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhodobacterales;f__Rhodobacteraceae;g__CAU-1522	95.0	N/A	N/A	N/A	N/A	1	-
GCA_002701395.1	s__Rhodobacter_B sp002701395	77.5506	335	1019	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhodobacterales;f__Rhodobacteraceae;g__Rhodobacter_B	95.0	N/A	N/A	N/A	N/A	1	-
GCA_016938875.1	s__Rhodovulum sp016938875	77.5413	279	1019	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhodobacterales;f__Rhodobacteraceae;g__Rhodovulum	95.0	N/A	N/A	N/A	N/A	1	-
GCF_004343075.1	s__Rhodovulum marinum	77.4822	279	1019	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhodobacterales;f__Rhodobacteraceae;g__Rhodovulum	95.0	N/A	N/A	N/A	N/A	1	-
GCA_006226775.1	s__Marinovum sp006226775	77.392	306	1019	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhodobacterales;f__Rhodobacteraceae;g__Marinovum	95.0	N/A	N/A	N/A	N/A	1	-
GCA_001650895.1	s__EhC02 sp001650895	77.3549	287	1019	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhodobacterales;f__Rhodobacteraceae;g__EhC02	95.0	N/A	N/A	N/A	N/A	1	-
GCF_000744955.1	s__Actibacterium sp000744955	77.3276	313	1019	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhodobacterales;f__Rhodobacteraceae;g__Actibacterium	95.0	N/A	N/A	N/A	N/A	1	-
GCF_000011965.2	s__Ruegeria_B pomeroyi	77.2647	330	1019	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhodobacterales;f__Rhodobacteraceae;g__Ruegeria_B	95.0	99.97	99.92	0.99	0.98	5	-
GCF_003072125.1	s__Roseovarius sediminicola	77.2445	253	1019	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhodobacterales;f__Rhodobacteraceae;g__Roseovarius	95.0	N/A	N/A	N/A	N/A	1	-
GCA_001314655.1	s__Salibaculum sp001314655	77.2243	271	1019	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhodobacterales;f__Rhodobacteraceae;g__Salibaculum	95.0	N/A	N/A	N/A	N/A	1	-
GCA_013042605.1	s__IMCC34051 sp013042605	77.1985	231	1019	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhodobacterales;f__Rhodobacteraceae;g__IMCC34051	95.0	99.10	98.86	0.88	0.86	4	-
GCF_013184805.1	s__Rhodobacter_B sp013184805	77.1131	292	1019	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhodobacterales;f__Rhodobacteraceae;g__Rhodobacter_B	95.0	N/A	N/A	N/A	N/A	1	-
GCF_003316635.1	s__Roseovarius dicentrarchi	77.1062	204	1019	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhodobacterales;f__Rhodobacteraceae;g__Roseovarius	95.0	N/A	N/A	N/A	N/A	1	-
GCF_002210105.1	s__Pseudooceanicola sp002210105	77.0451	258	1019	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhodobacterales;f__Rhodobacteraceae;g__Pseudooceanicola	95.0	N/A	N/A	N/A	N/A	1	-
GCF_003390845.1	s__Tropicimonas sp003390845	77.035	225	1019	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhodobacterales;f__Rhodobacteraceae;g__Tropicimonas	95.0	N/A	N/A	N/A	N/A	1	-
GCF_003335585.1	s__Sulfitobacter sp003335585	76.956	293	1019	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhodobacterales;f__Rhodobacteraceae;g__Sulfitobacter	95.0	N/A	N/A	N/A	N/A	1	-
GCF_001650875.1	s__Sulfitobacter sp001650875	76.9553	254	1019	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhodobacterales;f__Rhodobacteraceae;g__Sulfitobacter	95.0	N/A	N/A	N/A	N/A	1	-
GCF_003053725.1	s__Rhodovulum kholense	76.9236	275	1019	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhodobacterales;f__Rhodobacteraceae;g__Rhodovulum	95.0	98.27	98.27	0.96	0.96	2	-
GCF_900188035.1	s__Puniceibacterium sediminis	76.8947	215	1019	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhodobacterales;f__Rhodobacteraceae;g__Puniceibacterium	95.0	N/A	N/A	N/A	N/A	1	-
GCF_003034995.1	s__Phaeovulum veldkampii	76.8439	227	1019	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhodobacterales;f__Rhodobacteraceae;g__Phaeovulum	95.0	99.98	99.97	0.97	0.94	3	-
GCF_000813965.1	s__Leisingera sp000813965	76.835	226	1019	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhodobacterales;f__Rhodobacteraceae;g__Leisingera	95.0	N/A	N/A	N/A	N/A	1	-
GCF_900172235.1	s__Pseudoruegeria aquimaris	76.8339	248	1019	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhodobacterales;f__Rhodobacteraceae;g__Pseudoruegeria	95.0	N/A	N/A	N/A	N/A	1	-
GCF_009711185.1	s__Paracoccus limosus	76.7567	203	1019	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhodobacterales;f__Rhodobacteraceae;g__Paracoccus	95.0	N/A	N/A	N/A	N/A	1	-
GCF_900184895.1	s__Flavimaricola marinus	76.7411	203	1019	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhodobacterales;f__Rhodobacteraceae;g__Flavimaricola	95.0	N/A	N/A	N/A	N/A	1	-
GCF_003663885.1	s__Litoreibacter meonggei	76.7094	170	1019	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhodobacterales;f__Rhodobacteraceae;g__Litoreibacter	95.0	N/A	N/A	N/A	N/A	1	-
GCA_016278285.1	s__JADQCB01 sp016278285	76.6776	179	1019	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhodobacterales;f__Rhodobacteraceae;g__JADQCB01	95.0	N/A	N/A	N/A	N/A	1	-
GCF_003590145.2	s__Pseudooceanicola algae	76.669	212	1019	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhodobacterales;f__Rhodobacteraceae;g__Pseudooceanicola	95.0	N/A	N/A	N/A	N/A	1	-
GCF_002222635.1	s__Ascidiaceihabitans pseudonitzschiae_A	76.6441	213	1019	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhodobacterales;f__Rhodobacteraceae;g__Ascidiaceihabitans	95.0	99.92	99.92	0.96	0.96	2	-
--------------------------------------------------------------------------------
[2023-03-15 11:56:46,197] [INFO] GTDB search result was written to OceanDNA-b27564/result_gtdb.tsv
[2023-03-15 11:56:46,200] [INFO] ===== GTDB Search completed =====
[2023-03-15 11:56:46,205] [INFO] DFAST_QC result json was written to OceanDNA-b27564/dqc_result.json
[2023-03-15 11:56:46,205] [INFO] DFAST_QC completed!
[2023-03-15 11:56:46,205] [INFO] Total running time: 0h1m55s
