[2023-03-19 01:26:02,179] [INFO] DFAST_QC pipeline started.
[2023-03-19 01:26:02,180] [INFO] DFAST_QC version: 0.5.7
[2023-03-19 01:26:02,180] [INFO] DQC Reference Directory: /var/lib/cwl/stga9f7a561-222b-4a9f-a3d5-96f8bf676178/dqc_reference
[2023-03-19 01:26:03,786] [INFO] ===== Start taxonomy check using ANI =====
[2023-03-19 01:26:03,787] [INFO] Task started: Prodigal
[2023-03-19 01:26:03,787] [INFO] Running command: cat /var/lib/cwl/stgd2e368eb-9ff4-4adb-b160-e355755d40ba/OceanDNA-b29505.fa | prodigal -d OceanDNA-b29505/cds.fna -a OceanDNA-b29505/protein.faa -g 11 -q > /dev/null
[2023-03-19 01:26:18,929] [INFO] Task succeeded: Prodigal
[2023-03-19 01:26:18,929] [INFO] Task started: HMMsearch
[2023-03-19 01:26:18,929] [INFO] Running command: hmmsearch --tblout OceanDNA-b29505/hmmer_result.tsv -E 1E-50 /var/lib/cwl/stga9f7a561-222b-4a9f-a3d5-96f8bf676178/dqc_reference/reference_markers.hmm OceanDNA-b29505/protein.faa > /dev/null
[2023-03-19 01:26:19,124] [INFO] Task succeeded: HMMsearch
[2023-03-19 01:26:19,124] [WARNING] Found 4/6 markers. [/var/lib/cwl/stgd2e368eb-9ff4-4adb-b160-e355755d40ba/OceanDNA-b29505.fa]
[2023-03-19 01:26:19,151] [INFO] Query marker FASTA was written to OceanDNA-b29505/markers.fasta
[2023-03-19 01:26:19,152] [INFO] Task started: Blastn
[2023-03-19 01:26:19,152] [INFO] Running command: blastn -query OceanDNA-b29505/markers.fasta -db /var/lib/cwl/stga9f7a561-222b-4a9f-a3d5-96f8bf676178/dqc_reference/reference_markers.fasta -out OceanDNA-b29505/blast.markers.tsv -outfmt 6 -max_hsps 1 -num_alignments 5
[2023-03-19 01:26:19,783] [INFO] Task succeeded: Blastn
[2023-03-19 01:26:19,784] [INFO] Selected 29 target genomes.
[2023-03-19 01:26:19,785] [INFO] Target genome list was written to OceanDNA-b29505/target_genomes.txt
[2023-03-19 01:26:19,822] [INFO] Task started: fastANI
[2023-03-19 01:26:19,822] [INFO] Running command: fastANI --query /var/lib/cwl/stgd2e368eb-9ff4-4adb-b160-e355755d40ba/OceanDNA-b29505.fa --refList OceanDNA-b29505/target_genomes.txt --output OceanDNA-b29505/fastani_result.tsv --threads 1
[2023-03-19 01:26:37,742] [INFO] Task succeeded: fastANI
[2023-03-19 01:26:37,743] [INFO] Loading species specific ANI threshold from /var/lib/cwl/stga9f7a561-222b-4a9f-a3d5-96f8bf676178/dqc_reference/prokaryote_ANI_species_specific_threshold.txt
[2023-03-19 01:26:37,743] [WARNING] Species-specific ANI threshold file not found. Will use the default threshold for all species. [/var/lib/cwl/stga9f7a561-222b-4a9f-a3d5-96f8bf676178/dqc_reference/prokaryote_ANI_species_specific_threshold.txt]
[2023-03-19 01:26:37,758] [INFO] Found 27 fastANI hits (0 hits with ANI > threshold)
[2023-03-19 01:26:37,758] [INFO] The taxonomy check result is classified as 'below_threshold'.
[2023-03-19 01:26:37,758] [INFO] DFAST Taxonomy check final result
--------------------------------------------------------------------------------
organism_name	strain	accession	taxid	species_taxid	relation_to_type	validated	ani	matched_fragments	total_fragments	ani_threshold	status
Antarcticimicrobium luteum	strain=318-1	GCA_004358185.1	2547397	2547397	type	True	77.2237	188	751	95	below_threshold
Aestuariivita boseongensis	strain=BS-B2	GCA_001262635.1	1470562	1470562	type	True	77.132	192	751	95	below_threshold
Ruegeria pomeroyi	strain=DSS-3	GCA_000011965.2	89184	89184	suspected-type	True	77.0661	208	751	95	below_threshold
Falsiruegeria litorea	strain=CECT 7639	GCA_900172225.1	1280831	1280831	type	True	76.9047	150	751	95	below_threshold
Ruegeria sediminis	strain=CAU 1488	GCA_005938215.1	2583820	2583820	type	True	76.8207	158	751	95	below_threshold
Pseudodonghicola xiamenensis	strain=DSM 18339	GCA_000429365.1	337702	337702	type	True	76.6176	149	751	95	below_threshold
Roseovarius litoreus	strain=DSM 28249	GCA_900142765.1	1155722	1155722	type	True	76.6066	109	751	95	below_threshold
Leisingera aquimarina	strain=DSM 24565	GCA_000473165.1	476529	476529	type	True	76.5374	143	751	95	below_threshold
Marivita hallyeonensis	strain=DSM 29431	GCA_900129875.1	996342	996342	type	True	76.5155	101	751	95	below_threshold
Rhodovulum steppense	strain=DSM 21153	GCA_004339675.1	540251	540251	type	True	76.4989	108	751	95	below_threshold
Roseovarius confluentis	strain=SAG6	GCA_002917925.1	1852027	1852027	type	True	76.4542	116	751	95	below_threshold
Pseudooceanicola aestuarii	strain=E2-1	GCA_010614805.1	2697319	2697319	type	True	76.4366	111	751	95	below_threshold
Pseudosulfitobacter pseudonitzschiae	strain=H3	GCA_000712315.1	1402135	1402135	type	True	76.436	150	751	95	below_threshold
Pseudosulfitobacter pseudonitzschiae	strain=DSM 26824	GCA_900129395.1	1402135	1402135	type	True	76.4328	150	751	95	below_threshold
Ruegeria faecimaris	strain=DSM 28009	GCA_900182615.1	686389	686389	type	True	76.4061	90	751	95	below_threshold
Pseudophaeobacter flagellatus	strain=MA21411-1	GCA_021228235.1	2899119	2899119	type	True	76.364	98	751	95	below_threshold
Mangrovicoccus algicola	strain=HB182678	GCA_014903745.1	2771008	2771008	type	True	76.3221	91	751	95	below_threshold
Sulfitobacter indolifex	strain=DSM 14862	GCA_022788655.1	225422	225422	type	True	76.3126	79	751	95	below_threshold
Rhodovulum euryhalinum	strain=DSM 4868	GCA_004342445.1	35805	35805	type	True	76.2793	108	751	95	below_threshold
Tritonibacter scottomollicae	strain=DSM 25328	GCA_003003215.1	483013	483013	type	True	76.2291	109	751	95	below_threshold
Cereibacter ovatus	strain=JA234	GCA_900207575.1	439529	439529	type	True	76.0995	89	751	95	below_threshold
Halovulum dunhuangense	strain=YYQ-30	GCA_013093415.1	1505036	1505036	type	True	76.0628	68	751	95	below_threshold
Phaeovulum veldkampii	strain=DSM 11550	GCA_003034995.1	33049	33049	type	True	75.9887	97	751	95	below_threshold
Phaeovulum veldkampii	strain=DSM 11550	GCA_004363195.1	33049	33049	type	True	75.9756	98	751	95	below_threshold
Phaeovulum veldkampii	strain=DSM 11550	GCA_016653435.1	33049	33049	type	True	75.9319	98	751	95	below_threshold
Alexandriicola marinus	strain=LZ-14	GCA_004000435.1	2081710	2081710	type	True	75.9291	117	751	95	below_threshold
Rhodobacter tardus	strain=CYK-10	GCA_009925085.1	2699202	2699202	type	True	75.9039	55	751	95	below_threshold
--------------------------------------------------------------------------------
[2023-03-19 01:26:37,759] [INFO] DFAST Taxonomy check result was written to OceanDNA-b29505/tc_result.tsv
[2023-03-19 01:26:37,759] [INFO] ===== Taxonomy check completed =====
[2023-03-19 01:26:37,759] [INFO] ===== Start completeness check using CheckM =====
[2023-03-19 01:26:37,759] [INFO] Setting CHECKM_DATA_PATH to /var/lib/cwl/stga9f7a561-222b-4a9f-a3d5-96f8bf676178/dqc_reference/checkm_data
[2023-03-19 01:26:37,760] [INFO] Selected 'Prokaryote' markers (life, taxid=0) for CheckM
[2023-03-19 01:26:37,773] [INFO] Task started: CheckM
[2023-03-19 01:26:37,773] [INFO] Running command: checkm taxonomy_wf --tab_table -f OceanDNA-b29505/cc_result.tsv -t 1 life "Prokaryote" OceanDNA-b29505/checkm_input OceanDNA-b29505/checkm_result
[2023-03-19 01:27:18,396] [INFO] Task succeeded: CheckM
[2023-03-19 01:27:18,397] [INFO] Completeness check finished.
--------------------------------------------------------------------------------
Completeness: 41.67%
Contamination: 0.00%
Strain heterogeneity: 0.00%
--------------------------------------------------------------------------------
[2023-03-19 01:27:18,478] [INFO] ===== Completeness check finished =====
[2023-03-19 01:27:18,478] [INFO] ===== Start GTDB Search =====
[2023-03-19 01:27:18,478] [INFO] Query marker FASTA already exists. Will reuse it. (OceanDNA-b29505/markers.fasta)
[2023-03-19 01:27:18,480] [INFO] Task started: Blastn
[2023-03-19 01:27:18,480] [INFO] Running command: blastn -query OceanDNA-b29505/markers.fasta -db /var/lib/cwl/stga9f7a561-222b-4a9f-a3d5-96f8bf676178/dqc_reference/reference_markers_gtdb.fasta -out OceanDNA-b29505/blast.markers.gtdb.tsv -outfmt 6 -max_hsps 1 -num_alignments 5
[2023-03-19 01:27:19,525] [INFO] Task succeeded: Blastn
[2023-03-19 01:27:19,528] [INFO] Selected 24 target genomes.
[2023-03-19 01:27:19,528] [INFO] Target genome list was written to OceanDNA-b29505/target_genomes_gtdb.txt
[2023-03-19 01:27:19,556] [INFO] Task started: fastANI
[2023-03-19 01:27:19,556] [INFO] Running command: fastANI --query /var/lib/cwl/stgd2e368eb-9ff4-4adb-b160-e355755d40ba/OceanDNA-b29505.fa --refList OceanDNA-b29505/target_genomes_gtdb.txt --output OceanDNA-b29505/fastani_result_gtdb.tsv --threads 1
[2023-03-19 01:27:35,881] [INFO] Task succeeded: fastANI
[2023-03-19 01:27:35,895] [INFO] Found 24 fastANI hits (1 hits with ANI > circumscription radius)
[2023-03-19 01:27:35,895] [INFO] GTDB search result
--------------------------------------------------------------------------------
accession	gtdb_species	ani	matched_fragments	total_fragments	gtdb_taxonomy	ani_circumscription_radius	mean_intra_species_ani	min_intra_species_ani	mean_intra_species_af	min_intra_species_af	num_clustered_genomes	status
GCA_017643305.1	s__HXMU1420-2 sp017643305	95.8906	645	751	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhodobacterales;f__Rhodobacteraceae;g__HXMU1420-2	95.0	N/A	N/A	N/A	N/A	1	conclusive
GCF_002786325.1	s__Pseudooceanicola_C lipolyticus	77.1593	195	751	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhodobacterales;f__Rhodobacteraceae;g__Pseudooceanicola_C	95.0	N/A	N/A	N/A	N/A	1	-
GCF_000161775.1	s__Ruegeria lacuscaerulensis	77.1361	170	751	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhodobacterales;f__Rhodobacteraceae;g__Ruegeria	95.0	99.98	99.98	1.00	1.00	2	-
GCF_003008555.2	s__Pukyongiella litopenaei	77.0996	185	751	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhodobacterales;f__Rhodobacteraceae;g__Pukyongiella	95.0	N/A	N/A	N/A	N/A	1	-
GCF_013032025.1	s__Ruegeria atlantica_C	77.0897	128	751	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhodobacterales;f__Rhodobacteraceae;g__Ruegeria	95.0	100.00	100.00	0.99	0.99	2	-
GCF_000011965.2	s__Ruegeria_B pomeroyi	77.079	207	751	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhodobacterales;f__Rhodobacteraceae;g__Ruegeria_B	95.0	99.97	99.92	0.99	0.98	5	-
GCF_900142185.1	s__Lutimaribacter pacificus	77.0782	174	751	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhodobacterales;f__Rhodobacteraceae;g__Lutimaribacter	95.0	100.00	100.00	1.00	1.00	2	-
GCF_900108275.1	s__Jhaorihella thermophila	77.0323	162	751	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhodobacterales;f__Rhodobacteraceae;g__Jhaorihella	95.0	N/A	N/A	N/A	N/A	1	-
GCF_002210095.1	s__Marinibacterium profundimaris	76.9857	176	751	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhodobacterales;f__Rhodobacteraceae;g__Marinibacterium	95.0	N/A	N/A	N/A	N/A	1	-
GCA_000158135.1	s__Ruegeria sp000158135	76.9743	125	751	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhodobacterales;f__Rhodobacteraceae;g__Ruegeria	95.0	N/A	N/A	N/A	N/A	1	-
GCF_005938225.1	s__Arenibacterium halophilum	76.9007	189	751	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhodobacterales;f__Rhodobacteraceae;g__Arenibacterium	95.0	96.35	96.35	0.87	0.87	2	-
GCA_003217735.2	s__Marinibacterium sp003217735	76.8977	201	751	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhodobacterales;f__Rhodobacteraceae;g__Marinibacterium	95.0	98.81	98.75	0.89	0.89	3	-
GCF_002210165.1	s__CAU-1492 sp002210165	76.8923	155	751	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhodobacterales;f__Rhodobacteraceae;g__CAU-1492	95.0	N/A	N/A	N/A	N/A	1	-
GCF_013031405.1	s__Ruegeria sp013031405	76.8862	199	751	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhodobacterales;f__Rhodobacteraceae;g__Ruegeria	95.0	N/A	N/A	N/A	N/A	1	-
GCA_018401465.1	s__Aestuariivita sp018401465	76.8808	167	751	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhodobacterales;f__Rhodobacteraceae;g__Aestuariivita	95.0	N/A	N/A	N/A	N/A	1	-
GCA_014762635.1	s__JABURR01 sp014762635	76.7019	101	751	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhodobacterales;f__Rhodobacteraceae;g__JABURR01	95.0	N/A	N/A	N/A	N/A	1	-
GCF_000203975.2	s__Leisingera sp000203975	76.6902	156	751	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhodobacterales;f__Rhodobacteraceae;g__Leisingera	95.0	98.92	98.17	0.96	0.94	6	-
GCF_009789075.1	s__Pseudooceanicola pacificus	76.6589	90	751	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhodobacterales;f__Rhodobacteraceae;g__Pseudooceanicola	95.0	N/A	N/A	N/A	N/A	1	-
GCF_000429365.1	s__Pseudodonghicola xiamenensis	76.6156	148	751	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhodobacterales;f__Rhodobacteraceae;g__Pseudodonghicola	95.0	99.98	99.98	0.98	0.98	2	-
GCF_900142765.1	s__Roseovarius litoreus	76.6022	107	751	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhodobacterales;f__Rhodobacteraceae;g__Roseovarius	95.0	95.85	95.85	0.86	0.86	2	-
GCF_900129875.1	s__Marivita hallyeonensis	76.5155	101	751	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhodobacterales;f__Rhodobacteraceae;g__Marivita	95.0	N/A	N/A	N/A	N/A	1	-
GCF_002917925.1	s__Roseovarius confluentis	76.4542	116	751	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhodobacterales;f__Rhodobacteraceae;g__Roseovarius	95.0	98.93	98.93	0.93	0.93	2	-
GCF_004342445.1	s__Rhodovulum euryhalinum	76.2733	107	751	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhodobacterales;f__Rhodobacteraceae;g__Rhodovulum	95.0	N/A	N/A	N/A	N/A	1	-
GCA_903917595.1	s__Tabrizicola sp903917595	75.8395	63	751	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhodobacterales;f__Rhodobacteraceae;g__Tabrizicola	95.0	N/A	N/A	N/A	N/A	1	-
--------------------------------------------------------------------------------
[2023-03-19 01:27:35,896] [INFO] GTDB search result was written to OceanDNA-b29505/result_gtdb.tsv
[2023-03-19 01:27:35,896] [INFO] ===== GTDB Search completed =====
[2023-03-19 01:27:35,903] [INFO] DFAST_QC result json was written to OceanDNA-b29505/dqc_result.json
[2023-03-19 01:27:35,903] [INFO] DFAST_QC completed!
[2023-03-19 01:27:35,903] [INFO] Total running time: 0h1m34s
