[2023-03-18 09:11:00,871] [INFO] DFAST_QC pipeline started.
[2023-03-18 09:11:00,872] [INFO] DFAST_QC version: 0.5.7
[2023-03-18 09:11:00,872] [INFO] DQC Reference Directory: /var/lib/cwl/stg8d3f8e0c-2b7f-4224-99cd-548c2cf9e8f7/dqc_reference
[2023-03-18 09:11:01,986] [INFO] ===== Start taxonomy check using ANI =====
[2023-03-18 09:11:01,986] [INFO] Task started: Prodigal
[2023-03-18 09:11:01,986] [INFO] Running command: cat /var/lib/cwl/stg87a95e54-150c-4819-bdbd-05fa5816698c/OceanDNA-b26398.fa | prodigal -d OceanDNA-b26398/cds.fna -a OceanDNA-b26398/protein.faa -g 11 -q > /dev/null
[2023-03-18 09:11:13,388] [INFO] Task succeeded: Prodigal
[2023-03-18 09:11:13,389] [INFO] Task started: HMMsearch
[2023-03-18 09:11:13,389] [INFO] Running command: hmmsearch --tblout OceanDNA-b26398/hmmer_result.tsv -E 1E-50 /var/lib/cwl/stg8d3f8e0c-2b7f-4224-99cd-548c2cf9e8f7/dqc_reference/reference_markers.hmm OceanDNA-b26398/protein.faa > /dev/null
[2023-03-18 09:11:13,556] [INFO] Task succeeded: HMMsearch
[2023-03-18 09:11:13,556] [WARNING] Found 5/6 markers. [/var/lib/cwl/stg87a95e54-150c-4819-bdbd-05fa5816698c/OceanDNA-b26398.fa]
[2023-03-18 09:11:13,571] [INFO] Query marker FASTA was written to OceanDNA-b26398/markers.fasta
[2023-03-18 09:11:13,572] [INFO] Task started: Blastn
[2023-03-18 09:11:13,572] [INFO] Running command: blastn -query OceanDNA-b26398/markers.fasta -db /var/lib/cwl/stg8d3f8e0c-2b7f-4224-99cd-548c2cf9e8f7/dqc_reference/reference_markers.fasta -out OceanDNA-b26398/blast.markers.tsv -outfmt 6 -max_hsps 1 -num_alignments 5
[2023-03-18 09:11:14,236] [INFO] Task succeeded: Blastn
[2023-03-18 09:11:14,237] [INFO] Selected 28 target genomes.
[2023-03-18 09:11:14,237] [INFO] Target genome list was written to OceanDNA-b26398/target_genomes.txt
[2023-03-18 09:11:14,248] [INFO] Task started: fastANI
[2023-03-18 09:11:14,248] [INFO] Running command: fastANI --query /var/lib/cwl/stg87a95e54-150c-4819-bdbd-05fa5816698c/OceanDNA-b26398.fa --refList OceanDNA-b26398/target_genomes.txt --output OceanDNA-b26398/fastani_result.tsv --threads 1
[2023-03-18 09:11:34,563] [INFO] Task succeeded: fastANI
[2023-03-18 09:11:34,563] [INFO] Loading species specific ANI threshold from /var/lib/cwl/stg8d3f8e0c-2b7f-4224-99cd-548c2cf9e8f7/dqc_reference/prokaryote_ANI_species_specific_threshold.txt
[2023-03-18 09:11:34,563] [WARNING] Species-specific ANI threshold file not found. Will use the default threshold for all species. [/var/lib/cwl/stg8d3f8e0c-2b7f-4224-99cd-548c2cf9e8f7/dqc_reference/prokaryote_ANI_species_specific_threshold.txt]
[2023-03-18 09:11:34,572] [INFO] Found 13 fastANI hits (0 hits with ANI > threshold)
[2023-03-18 09:11:34,572] [INFO] The taxonomy check result is classified as 'below_threshold'.
[2023-03-18 09:11:34,572] [INFO] DFAST Taxonomy check final result
--------------------------------------------------------------------------------
organism_name	strain	accession	taxid	species_taxid	relation_to_type	validated	ani	matched_fragments	total_fragments	ani_threshold	status
Oceanibaculum indicum	strain=P24	GCA_000299935.1	526216	526216	type	True	76.5417	72	589	95	below_threshold
Oceanibaculum pacificum	strain=MCCC 1A02656	GCA_001618175.1	580166	580166	type	True	76.2736	69	589	95	below_threshold
Thalassobaculum fulvum	strain=KCTC 42651	GCA_014652915.1	1633335	1633335	type	True	75.7496	80	589	95	below_threshold
Nisaea sediminum	strain=NBU1469	GCA_014904705.1	2775867	2775867	type	True	75.7302	58	589	95	below_threshold
Azospirillum soli	strain=CC-LY788	GCA_017876165.1	1304799	1304799	type	True	75.6708	54	589	95	below_threshold
Tistrella bauzanensis	strain=CGMCC 1.10188	GCA_014636235.1	657419	657419	type	True	75.6347	57	589	95	below_threshold
Azospirillum oryzae	strain=COC8	GCA_008364795.1	286727	286727	type	True	75.5517	59	589	95	below_threshold
Ferrovibrio terrae	strain=K5	GCA_007197755.1	2594003	2594003	type	True	75.455	55	589	95	below_threshold
Azospirillum ramasamyi	strain=M2T2B2	GCA_003233655.1	682998	682998	type	True	75.4204	58	589	95	below_threshold
Methylobacterium terricola	strain=17Sr1-39	GCA_006151805.1	2583531	2583531	type	True	75.3551	52	589	95	below_threshold
Azospirillum thiophilum	strain=BV-S	GCA_001305595.1	528244	528244	type	True	75.3494	64	589	95	below_threshold
Azospirillum thiophilum	strain=DSM 21654	GCA_000960825.1	528244	528244	type	True	75.349	64	589	95	below_threshold
Mesorhizobium qingshengii	strain=CGMCC 1.12097	GCA_900103325.1	1165689	1165689	type	True	75.2122	65	589	95	below_threshold
--------------------------------------------------------------------------------
[2023-03-18 09:11:34,572] [INFO] DFAST Taxonomy check result was written to OceanDNA-b26398/tc_result.tsv
[2023-03-18 09:11:34,573] [INFO] ===== Taxonomy check completed =====
[2023-03-18 09:11:34,573] [INFO] ===== Start completeness check using CheckM =====
[2023-03-18 09:11:34,573] [INFO] Setting CHECKM_DATA_PATH to /var/lib/cwl/stg8d3f8e0c-2b7f-4224-99cd-548c2cf9e8f7/dqc_reference/checkm_data
[2023-03-18 09:11:34,573] [INFO] Selected 'Prokaryote' markers (life, taxid=0) for CheckM
[2023-03-18 09:11:34,577] [INFO] Task started: CheckM
[2023-03-18 09:11:34,577] [INFO] Running command: checkm taxonomy_wf --tab_table -f OceanDNA-b26398/cc_result.tsv -t 1 life "Prokaryote" OceanDNA-b26398/checkm_input OceanDNA-b26398/checkm_result
[2023-03-18 09:12:06,964] [INFO] Task succeeded: CheckM
[2023-03-18 09:12:06,964] [INFO] Completeness check finished.
--------------------------------------------------------------------------------
Completeness: 62.15%
Contamination: 0.00%
Strain heterogeneity: 0.00%
--------------------------------------------------------------------------------
[2023-03-18 09:12:06,998] [INFO] ===== Completeness check finished =====
[2023-03-18 09:12:06,998] [INFO] ===== Start GTDB Search =====
[2023-03-18 09:12:06,998] [INFO] Query marker FASTA already exists. Will reuse it. (OceanDNA-b26398/markers.fasta)
[2023-03-18 09:12:06,999] [INFO] Task started: Blastn
[2023-03-18 09:12:06,999] [INFO] Running command: blastn -query OceanDNA-b26398/markers.fasta -db /var/lib/cwl/stg8d3f8e0c-2b7f-4224-99cd-548c2cf9e8f7/dqc_reference/reference_markers_gtdb.fasta -out OceanDNA-b26398/blast.markers.gtdb.tsv -outfmt 6 -max_hsps 1 -num_alignments 5
[2023-03-18 09:12:08,170] [INFO] Task succeeded: Blastn
[2023-03-18 09:12:08,171] [INFO] Selected 15 target genomes.
[2023-03-18 09:12:08,171] [INFO] Target genome list was written to OceanDNA-b26398/target_genomes_gtdb.txt
[2023-03-18 09:12:08,203] [INFO] Task started: fastANI
[2023-03-18 09:12:08,203] [INFO] Running command: fastANI --query /var/lib/cwl/stg87a95e54-150c-4819-bdbd-05fa5816698c/OceanDNA-b26398.fa --refList OceanDNA-b26398/target_genomes_gtdb.txt --output OceanDNA-b26398/fastani_result_gtdb.tsv --threads 1
[2023-03-18 09:12:15,498] [INFO] Task succeeded: fastANI
[2023-03-18 09:12:15,503] [INFO] Found 7 fastANI hits (0 hits with ANI > circumscription radius)
[2023-03-18 09:12:15,503] [INFO] GTDB search result
--------------------------------------------------------------------------------
accession	gtdb_species	ani	matched_fragments	total_fragments	gtdb_taxonomy	ani_circumscription_radius	mean_intra_species_ani	min_intra_species_ani	mean_intra_species_af	min_intra_species_af	num_clustered_genomes	status
GCA_002457745.1	s__UBA8309 sp002457745	84.3999	341	589	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Puniceispirillales;f__Puniceispirillaceae;g__UBA8309	95.0	N/A	N/A	N/A	N/A	1	-
GCA_011522725.1	s__UBA8309 sp001627655	80.8744	396	589	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Puniceispirillales;f__Puniceispirillaceae;g__UBA8309	95.0	98.91	98.58	0.92	0.89	8	-
GCA_018624035.1	s__UBA8309 sp018624035	79.613	343	589	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Puniceispirillales;f__Puniceispirillaceae;g__UBA8309	95.0	N/A	N/A	N/A	N/A	1	-
GCA_009920665.1	s__UBA8309 sp009920665	79.6055	363	589	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Puniceispirillales;f__Puniceispirillaceae;g__UBA8309	95.0	N/A	N/A	N/A	N/A	1	-
GCA_016780765.1	s__UBA8309 sp016780765	77.0631	173	589	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Puniceispirillales;f__Puniceispirillaceae;g__UBA8309	95.0	98.39	97.46	0.74	0.59	14	-
GCA_011524735.1	s__UBA8309 sp011524735	76.9443	133	589	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Puniceispirillales;f__Puniceispirillaceae;g__UBA8309	95.0	N/A	N/A	N/A	N/A	1	-
GCA_004212735.1	s__MED-G116 sp004212735	76.2519	101	589	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Puniceispirillales;f__Puniceispirillaceae;g__MED-G116	95.0	N/A	N/A	N/A	N/A	1	-
--------------------------------------------------------------------------------
[2023-03-18 09:12:15,504] [INFO] GTDB search result was written to OceanDNA-b26398/result_gtdb.tsv
[2023-03-18 09:12:15,504] [INFO] ===== GTDB Search completed =====
[2023-03-18 09:12:15,505] [INFO] DFAST_QC result json was written to OceanDNA-b26398/dqc_result.json
[2023-03-18 09:12:15,505] [INFO] DFAST_QC completed!
[2023-03-18 09:12:15,505] [INFO] Total running time: 0h1m15s
