[2023-03-17 23:15:40,546] [INFO] DFAST_QC pipeline started.
[2023-03-17 23:15:40,547] [INFO] DFAST_QC version: 0.5.7
[2023-03-17 23:15:40,547] [INFO] DQC Reference Directory: /var/lib/cwl/stg54fcbfd4-6ce8-40de-b0fe-a444420892d4/dqc_reference
[2023-03-17 23:15:41,622] [INFO] ===== Start taxonomy check using ANI =====
[2023-03-17 23:15:41,623] [INFO] Task started: Prodigal
[2023-03-17 23:15:41,623] [INFO] Running command: cat /var/lib/cwl/stg5a750e88-ac65-45cb-876f-7c8222384ccc/OceanDNA-b32665.fa | prodigal -d OceanDNA-b32665/cds.fna -a OceanDNA-b32665/protein.faa -g 11 -q > /dev/null
[2023-03-17 23:15:55,572] [INFO] Task succeeded: Prodigal
[2023-03-17 23:15:55,572] [INFO] Task started: HMMsearch
[2023-03-17 23:15:55,572] [INFO] Running command: hmmsearch --tblout OceanDNA-b32665/hmmer_result.tsv -E 1E-50 /var/lib/cwl/stg54fcbfd4-6ce8-40de-b0fe-a444420892d4/dqc_reference/reference_markers.hmm OceanDNA-b32665/protein.faa > /dev/null
[2023-03-17 23:15:55,754] [INFO] Task succeeded: HMMsearch
[2023-03-17 23:15:55,754] [WARNING] Found 5/6 markers. [/var/lib/cwl/stg5a750e88-ac65-45cb-876f-7c8222384ccc/OceanDNA-b32665.fa]
[2023-03-17 23:15:55,777] [INFO] Query marker FASTA was written to OceanDNA-b32665/markers.fasta
[2023-03-17 23:15:55,778] [INFO] Task started: Blastn
[2023-03-17 23:15:55,778] [INFO] Running command: blastn -query OceanDNA-b32665/markers.fasta -db /var/lib/cwl/stg54fcbfd4-6ce8-40de-b0fe-a444420892d4/dqc_reference/reference_markers.fasta -out OceanDNA-b32665/blast.markers.tsv -outfmt 6 -max_hsps 1 -num_alignments 5
[2023-03-17 23:15:56,385] [INFO] Task succeeded: Blastn
[2023-03-17 23:15:56,387] [INFO] Selected 23 target genomes.
[2023-03-17 23:15:56,387] [INFO] Target genome list was written to OceanDNA-b32665/target_genomes.txt
[2023-03-17 23:15:56,395] [INFO] Task started: fastANI
[2023-03-17 23:15:56,396] [INFO] Running command: fastANI --query /var/lib/cwl/stg5a750e88-ac65-45cb-876f-7c8222384ccc/OceanDNA-b32665.fa --refList OceanDNA-b32665/target_genomes.txt --output OceanDNA-b32665/fastani_result.tsv --threads 1
[2023-03-17 23:16:12,456] [INFO] Task succeeded: fastANI
[2023-03-17 23:16:12,457] [INFO] Loading species specific ANI threshold from /var/lib/cwl/stg54fcbfd4-6ce8-40de-b0fe-a444420892d4/dqc_reference/prokaryote_ANI_species_specific_threshold.txt
[2023-03-17 23:16:12,457] [WARNING] Species-specific ANI threshold file not found. Will use the default threshold for all species. [/var/lib/cwl/stg54fcbfd4-6ce8-40de-b0fe-a444420892d4/dqc_reference/prokaryote_ANI_species_specific_threshold.txt]
[2023-03-17 23:16:12,512] [INFO] Found 13 fastANI hits (0 hits with ANI > threshold)
[2023-03-17 23:16:12,512] [INFO] The taxonomy check result is classified as 'below_threshold'.
[2023-03-17 23:16:12,512] [INFO] DFAST Taxonomy check final result
--------------------------------------------------------------------------------
organism_name	strain	accession	taxid	species_taxid	relation_to_type	validated	ani	matched_fragments	total_fragments	ani_threshold	status
Zavarzinia aquatilis	strain=HR-AS	GCA_003173035.1	2211142	2211142	type	True	75.6233	50	710	95	below_threshold
Zavarzinia compransoris	strain=DSM 1231	GCA_004362995.1	1264899	1264899	type	True	75.4245	64	710	95	below_threshold
Azospirillum oryzae	strain=COC8	GCA_008364795.1	286727	286727	type	True	75.422	58	710	95	below_threshold
Zavarzinia compransoris	strain=DSM 1231	GCA_003173055.1	1264899	1264899	type	True	75.3966	66	710	95	below_threshold
Skermanella rosea	strain=KEMB 2255-458	GCA_016806835.2	1817965	1817965	type	True	75.3576	61	710	95	below_threshold
Azospirillum brasilense	strain=Sp 7	GCA_008274945.1	192	192	type	True	75.3088	63	710	95	below_threshold
Azospirillum brasilense	strain=Sp 7	GCA_001315015.1	192	192	type	True	75.2907	67	710	95	below_threshold
Azospirillum brasilense	strain=Sp 7	GCA_002027385.1	192	192	type	True	75.2869	65	710	95	below_threshold
Azospirillum brasilense	strain=Sp 7	GCA_007827425.1	192	192	type	True	75.284	68	710	95	below_threshold
Azospirillum griseum	strain=L-25-5 w-1	GCA_003966125.1	2496639	2496639	type	True	75.2646	60	710	95	below_threshold
Ferrovibrio terrae	strain=K5	GCA_007197755.1	2594003	2594003	type	True	75.2515	52	710	95	below_threshold
Oceanicella actignis	strain=DSM 22673	GCA_008124525.1	1189325	1189325	type	True	75.1809	50	710	95	below_threshold
Azospirillum formosense	strain=CC-NFb-7	GCA_013340925.1	861533	861533	type	True	75.0233	61	710	95	below_threshold
--------------------------------------------------------------------------------
[2023-03-17 23:16:12,514] [INFO] DFAST Taxonomy check result was written to OceanDNA-b32665/tc_result.tsv
[2023-03-17 23:16:12,514] [INFO] ===== Taxonomy check completed =====
[2023-03-17 23:16:12,514] [INFO] ===== Start completeness check using CheckM =====
[2023-03-17 23:16:12,514] [INFO] Setting CHECKM_DATA_PATH to /var/lib/cwl/stg54fcbfd4-6ce8-40de-b0fe-a444420892d4/dqc_reference/checkm_data
[2023-03-17 23:16:12,515] [INFO] Selected 'Prokaryote' markers (life, taxid=0) for CheckM
[2023-03-17 23:16:12,554] [INFO] Task started: CheckM
[2023-03-17 23:16:12,554] [INFO] Running command: checkm taxonomy_wf --tab_table -f OceanDNA-b32665/cc_result.tsv -t 1 life "Prokaryote" OceanDNA-b32665/checkm_input OceanDNA-b32665/checkm_result
[2023-03-17 23:16:50,861] [INFO] Task succeeded: CheckM
[2023-03-17 23:16:50,862] [INFO] Completeness check finished.
--------------------------------------------------------------------------------
Completeness: 75.46%
Contamination: 0.00%
Strain heterogeneity: 0.00%
--------------------------------------------------------------------------------
[2023-03-17 23:16:50,864] [INFO] ===== Completeness check finished =====
[2023-03-17 23:16:50,864] [INFO] ===== Start GTDB Search =====
[2023-03-17 23:16:50,864] [INFO] Query marker FASTA already exists. Will reuse it. (OceanDNA-b32665/markers.fasta)
[2023-03-17 23:16:50,866] [INFO] Task started: Blastn
[2023-03-17 23:16:50,866] [INFO] Running command: blastn -query OceanDNA-b32665/markers.fasta -db /var/lib/cwl/stg54fcbfd4-6ce8-40de-b0fe-a444420892d4/dqc_reference/reference_markers_gtdb.fasta -out OceanDNA-b32665/blast.markers.gtdb.tsv -outfmt 6 -max_hsps 1 -num_alignments 5
[2023-03-17 23:16:51,972] [INFO] Task succeeded: Blastn
[2023-03-17 23:16:51,973] [INFO] Selected 21 target genomes.
[2023-03-17 23:16:51,973] [INFO] Target genome list was written to OceanDNA-b32665/target_genomes_gtdb.txt
[2023-03-17 23:16:51,992] [INFO] Task started: fastANI
[2023-03-17 23:16:51,992] [INFO] Running command: fastANI --query /var/lib/cwl/stg5a750e88-ac65-45cb-876f-7c8222384ccc/OceanDNA-b32665.fa --refList OceanDNA-b32665/target_genomes_gtdb.txt --output OceanDNA-b32665/fastani_result_gtdb.tsv --threads 1
[2023-03-17 23:17:06,608] [INFO] Task succeeded: fastANI
[2023-03-17 23:17:06,617] [INFO] Found 16 fastANI hits (0 hits with ANI > circumscription radius)
[2023-03-17 23:17:06,617] [INFO] GTDB search result
--------------------------------------------------------------------------------
accession	gtdb_species	ani	matched_fragments	total_fragments	gtdb_taxonomy	ani_circumscription_radius	mean_intra_species_ani	min_intra_species_ani	mean_intra_species_af	min_intra_species_af	num_clustered_genomes	status
GCA_018650465.1	s__JABILA01 sp018650465	76.1685	103	710	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__UBA2966;f__UBA2966;g__JABILA01	95.0	99.90	99.89	0.96	0.95	4	-
GCA_018654745.1	s__JABILA01 sp018654745	76.0146	102	710	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__UBA2966;f__UBA2966;g__JABILA01	95.0	99.91	99.83	0.93	0.89	4	-
GCA_002690215.1	s__GCA-2690215 sp002690215	75.949	83	710	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__UBA2966;f__UBA2966;g__GCA-2690215	95.0	N/A	N/A	N/A	N/A	1	-
GCA_016869175.1	s__VGEV01 sp016869175	75.8158	50	710	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__GCA-2731375;f__GCA-2731375;g__VGEV01	95.0	N/A	N/A	N/A	N/A	1	-
GCA_017307375.1	s__JAFKFH01 sp017307375	75.8081	94	710	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Ferrovibrionales;f__Ferrovibrionaceae;g__JAFKFH01	95.0	N/A	N/A	N/A	N/A	1	-
GCA_008012315.1	s__SSEL01 sp008012315	75.6705	52	710	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Dongiales;f__Dongiaceae;g__SSEL01	95.0	N/A	N/A	N/A	N/A	1	-
GCF_003590945.1	s__Zavarzinia sp003590945	75.6697	64	710	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Zavarziniales;f__Zavarziniaceae;g__Zavarzinia	95.0	N/A	N/A	N/A	N/A	1	-
GCA_018336035.1	s__Ferrovibrio sp018336035	75.6397	59	710	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Ferrovibrionales;f__Ferrovibrionaceae;g__Ferrovibrio	95.0	N/A	N/A	N/A	N/A	1	-
GCA_003576705.1	s__SYSU-D60015 sp003576705	75.544	67	710	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Ferrovibrionales;f__Ferrovibrionaceae;g__SYSU-D60015	95.0	N/A	N/A	N/A	N/A	1	-
GCA_016185145.1	s__Reyranella sp016185145	75.5396	68	710	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Reyranellales;f__Reyranellaceae;g__Reyranella	95.0	N/A	N/A	N/A	N/A	1	-
GCF_902729435.1	s__Magnetospirillum sp902729435	75.4853	64	710	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhodospirillales;f__Magnetospirillaceae;g__Magnetospirillum	95.0	N/A	N/A	N/A	N/A	1	-
GCA_002796975.1	s__Ferrovibrio sp002796975	75.294	63	710	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Ferrovibrionales;f__Ferrovibrionaceae;g__Ferrovibrio	95.0	N/A	N/A	N/A	N/A	1	-
GCF_007827815.1	s__Azospirillum brasilense_C	75.2791	68	710	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Azospirillales;f__Azospirillaceae;g__Azospirillum	95.0	97.36	95.98	0.93	0.90	3	-
GCF_007197755.1	s__Ferrovibrio terrae	75.2515	52	710	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Ferrovibrionales;f__Ferrovibrionaceae;g__Ferrovibrio	95.0	N/A	N/A	N/A	N/A	1	-
GCF_008124525.1	s__Oceanicella actignis	75.1809	50	710	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhodobacterales;f__Rhodobacteraceae;g__Oceanicella	95.0	98.87	98.87	0.96	0.96	3	-
GCF_000620685.1	s__Dongia sp000620685	75.1017	50	710	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Dongiales;f__Dongiaceae;g__Dongia	95.0	N/A	N/A	N/A	N/A	1	-
--------------------------------------------------------------------------------
[2023-03-17 23:17:06,618] [INFO] GTDB search result was written to OceanDNA-b32665/result_gtdb.tsv
[2023-03-17 23:17:06,618] [INFO] ===== GTDB Search completed =====
[2023-03-17 23:17:06,620] [INFO] DFAST_QC result json was written to OceanDNA-b32665/dqc_result.json
[2023-03-17 23:17:06,620] [INFO] DFAST_QC completed!
[2023-03-17 23:17:06,620] [INFO] Total running time: 0h1m26s
