[2023-03-18 10:05:40,212] [INFO] DFAST_QC pipeline started.
[2023-03-18 10:05:40,212] [INFO] DFAST_QC version: 0.5.7
[2023-03-18 10:05:40,212] [INFO] DQC Reference Directory: /var/lib/cwl/stg312497ea-3225-4159-b529-f0f43981b703/dqc_reference
[2023-03-18 10:05:41,301] [INFO] ===== Start taxonomy check using ANI =====
[2023-03-18 10:05:41,302] [INFO] Task started: Prodigal
[2023-03-18 10:05:41,302] [INFO] Running command: cat /var/lib/cwl/stgfe383055-6775-4d8b-a53c-02f818a23de8/OceanDNA-b11434.fa | prodigal -d OceanDNA-b11434/cds.fna -a OceanDNA-b11434/protein.faa -g 11 -q > /dev/null
[2023-03-18 10:05:52,118] [INFO] Task succeeded: Prodigal
[2023-03-18 10:05:52,118] [INFO] Task started: HMMsearch
[2023-03-18 10:05:52,118] [INFO] Running command: hmmsearch --tblout OceanDNA-b11434/hmmer_result.tsv -E 1E-50 /var/lib/cwl/stg312497ea-3225-4159-b529-f0f43981b703/dqc_reference/reference_markers.hmm OceanDNA-b11434/protein.faa > /dev/null
[2023-03-18 10:05:52,276] [INFO] Task succeeded: HMMsearch
[2023-03-18 10:05:52,277] [WARNING] Found 4/6 markers. [/var/lib/cwl/stgfe383055-6775-4d8b-a53c-02f818a23de8/OceanDNA-b11434.fa]
[2023-03-18 10:05:52,305] [INFO] Query marker FASTA was written to OceanDNA-b11434/markers.fasta
[2023-03-18 10:05:52,306] [INFO] Task started: Blastn
[2023-03-18 10:05:52,306] [INFO] Running command: blastn -query OceanDNA-b11434/markers.fasta -db /var/lib/cwl/stg312497ea-3225-4159-b529-f0f43981b703/dqc_reference/reference_markers.fasta -out OceanDNA-b11434/blast.markers.tsv -outfmt 6 -max_hsps 1 -num_alignments 5
[2023-03-18 10:05:52,790] [INFO] Task succeeded: Blastn
[2023-03-18 10:05:52,794] [INFO] Selected 19 target genomes.
[2023-03-18 10:05:52,794] [INFO] Target genome list was writen to OceanDNA-b11434/target_genomes.txt
[2023-03-18 10:05:52,806] [INFO] Task started: fastANI
[2023-03-18 10:05:52,806] [INFO] Running command: fastANI --query /var/lib/cwl/stgfe383055-6775-4d8b-a53c-02f818a23de8/OceanDNA-b11434.fa --refList OceanDNA-b11434/target_genomes.txt --output OceanDNA-b11434/fastani_result.tsv --threads 1
[2023-03-18 10:06:04,126] [INFO] Task succeeded: fastANI
[2023-03-18 10:06:04,127] [INFO] Loading species specific ANI threshold from /var/lib/cwl/stg312497ea-3225-4159-b529-f0f43981b703/dqc_reference/prokaryote_ANI_species_specific_threshold.txt
[2023-03-18 10:06:04,127] [WARNING] Species-specific ANI threshold file not found. Will use the default threshold for all species. [/var/lib/cwl/stg312497ea-3225-4159-b529-f0f43981b703/dqc_reference/prokaryote_ANI_species_specific_threshold.txt]
[2023-03-18 10:06:04,137] [INFO] Found 17 fastANI hits (0 hits with ANI > threshold)
[2023-03-18 10:06:04,137] [INFO] The taxonomy check result is classified as 'below_threshold'.
[2023-03-18 10:06:04,137] [INFO] DFAST Taxonomy check final result
--------------------------------------------------------------------------------
organism_name	strain	accession	taxid	species_taxid	relation_to_type	validated	ani	matched_fragments	total_fragments	ani_threshold	status
Joostella marina	strain=DSM 19592	GCA_000260115.1	453852	453852	type	True	76.6534	86	547	95	below_threshold
Abyssalbus ytuae	strain=MT3330	GCA_022807975.1	2926907	2926907	type	True	76.5501	131	547	95	below_threshold
Mesoflavibacter profundi	strain=YC1039	GCA_014764305.1	2708110	2708110	type	True	76.3353	64	547	95	below_threshold
Bizionia algoritergicola	strain=APA-1	GCA_008086165.1	291187	291187	type	True	76.2977	51	547	95	below_threshold
Arenitalea lutea	strain=CGMCC 1.12213	GCA_900141715.1	1178825	1178825	type	True	76.2174	60	547	95	below_threshold
Arenitalea lutea	strain=P7-3-5	GCA_000283015.1	1178825	1178825	type	True	76.1756	60	547	95	below_threshold
Cellulophaga fucicola	strain=DSM 24786	GCA_900119145.1	76595	76595	type	True	76.0366	81	547	95	below_threshold
Winogradskyella vidalii	strain=HL634	GCA_013403955.1	2615024	2615024	type	True	76.0221	52	547	95	below_threshold
Cellulophaga lytica	strain=DSM 7489	GCA_000190595.1	979	979	type	True	75.9776	80	547	95	below_threshold
Cellulophaga baltica	strain=DSM 24729	GCA_900102165.1	76594	76594	type	True	75.9578	73	547	95	below_threshold
Seonamhaeicola algicola	strain=Gy8	GCA_007997385.1	1719036	1719036	type	True	75.9192	93	547	95	below_threshold
Mariniflexile fucanivorans	strain=DSM 18792	GCA_004341235.1	264023	264023	type	True	75.8536	75	547	95	below_threshold
Flavivirga rizhaonensis	strain=RZ03	GCA_004791695.1	2559571	2559571	type	True	75.8524	75	547	95	below_threshold
Hyunsoonleella pacifica	strain=SW033	GCA_004310335.1	1080224	1080224	type	True	75.8161	68	547	95	below_threshold
Hyunsoonleella pacifica	strain=CGMCC 1.11009	GCA_014636335.1	1080224	1080224	type	True	75.8161	68	547	95	below_threshold
Algibacter aestuarii	strain=KCTC 23449	GCA_006148925.1	912802	912802	type	True	75.7662	75	547	95	below_threshold
Algibacter alginicilyticus	strain=HZ22	GCA_001310225.1	1736674	1736674	type	True	75.7512	70	547	95	below_threshold
--------------------------------------------------------------------------------
[2023-03-18 10:06:04,146] [INFO] DFAST Taxonomy check result was written to OceanDNA-b11434/tc_result.tsv
[2023-03-18 10:06:04,146] [INFO] ===== Taxonomy check completed =====
[2023-03-18 10:06:04,146] [INFO] ===== Start completeness check using CheckM =====
[2023-03-18 10:06:04,146] [INFO] Setting CHECKM_DATA_PATH to /var/lib/cwl/stg312497ea-3225-4159-b529-f0f43981b703/dqc_reference/checkm_data
[2023-03-18 10:06:04,147] [INFO] Selected 'Prokaryote' markers (life, taxid=0) for CheckM
[2023-03-18 10:06:04,167] [INFO] Task started: CheckM
[2023-03-18 10:06:04,167] [INFO] Running command: checkm taxonomy_wf --tab_table -f OceanDNA-b11434/cc_result.tsv -t 1 life "Prokaryote" OceanDNA-b11434/checkm_input OceanDNA-b11434/checkm_result
[2023-03-18 10:06:34,570] [INFO] Task succeeded: CheckM
[2023-03-18 10:06:34,571] [INFO] Completeness check finished.
--------------------------------------------------------------------------------
Completeness: 29.17%
Contamintation: 0.00%
Strain heterogeneity: 0.00%
--------------------------------------------------------------------------------
[2023-03-18 10:06:34,594] [INFO] ===== Completeness check finished =====
[2023-03-18 10:06:34,594] [INFO] ===== Start GTDB Search =====
[2023-03-18 10:06:34,595] [INFO] Query marker FASTA already exists. Will reuse it. (OceanDNA-b11434/markers.fasta)
[2023-03-18 10:06:34,596] [INFO] Task started: Blastn
[2023-03-18 10:06:34,596] [INFO] Running command: blastn -query OceanDNA-b11434/markers.fasta -db /var/lib/cwl/stg312497ea-3225-4159-b529-f0f43981b703/dqc_reference/reference_markers_gtdb.fasta -out OceanDNA-b11434/blast.markers.gtdb.tsv -outfmt 6 -max_hsps 1 -num_alignments 5
[2023-03-18 10:06:35,100] [INFO] Task succeeded: Blastn
[2023-03-18 10:06:35,102] [INFO] Selected 20 target genomes.
[2023-03-18 10:06:35,102] [INFO] Target genome list was writen to OceanDNA-b11434/target_genomes_gtdb.txt
[2023-03-18 10:06:35,119] [INFO] Task started: fastANI
[2023-03-18 10:06:35,119] [INFO] Running command: fastANI --query /var/lib/cwl/stgfe383055-6775-4d8b-a53c-02f818a23de8/OceanDNA-b11434.fa --refList OceanDNA-b11434/target_genomes_gtdb.txt --output OceanDNA-b11434/fastani_result_gtdb.tsv --threads 1
[2023-03-18 10:06:47,128] [INFO] Task succeeded: fastANI
[2023-03-18 10:06:47,139] [INFO] Found 18 fastANI hits (0 hits with ANI > circumscription radius)
[2023-03-18 10:06:47,139] [INFO] GTDB search result
--------------------------------------------------------------------------------
accession	gtdb_species	ani	matched_fragments	total_fragments	gtdb_taxonomy	ani_circumscription_radius	mean_intra_species_ani	min_intra_species_ani	mean_intra_species_af	min_intra_species_af	num_clustered_genomes	status
GCF_900114265.1	s__Flaviramulus basaltis	78.1801	107	547	d__Bacteria;p__Bacteroidota;c__Bacteroidia;o__Flavobacteriales;f__Flavobacteriaceae;g__Flaviramulus	95.0	N/A	N/A	N/A	N/A	1	-
GCF_000260115.1	s__Joostella marina	76.6491	85	547	d__Bacteria;p__Bacteroidota;c__Bacteroidia;o__Flavobacteriales;f__Flavobacteriaceae;g__Joostella	95.0	N/A	N/A	N/A	N/A	1	-
GCF_018861125.1	s__Cellulophaga baltica_A	76.3719	86	547	d__Bacteria;p__Bacteroidota;c__Bacteroidia;o__Flavobacteriales;f__Flavobacteriaceae;g__Cellulophaga	95.0	N/A	N/A	N/A	N/A	1	-
GCF_900141715.1	s__Arenitalea lutea	76.2037	59	547	d__Bacteria;p__Bacteroidota;c__Bacteroidia;o__Flavobacteriales;f__Flavobacteriaceae;g__Arenitalea	95.0	100.00	100.00	1.00	1.00	2	-
GCF_013403955.1	s__Winogradskyella vidalii	76.0221	52	547	d__Bacteria;p__Bacteroidota;c__Bacteroidia;o__Flavobacteriales;f__Flavobacteriaceae;g__Winogradskyella	95.0	N/A	N/A	N/A	N/A	1	-
GCF_014641635.1	s__Aquaticitalea lipolytica	76.0198	81	547	d__Bacteria;p__Bacteroidota;c__Bacteroidia;o__Flavobacteriales;f__Flavobacteriaceae;g__Aquaticitalea	95.0	N/A	N/A	N/A	N/A	1	-
GCF_900119145.1	s__Cellulophaga fucicola	76.0053	81	547	d__Bacteria;p__Bacteroidota;c__Bacteroidia;o__Flavobacteriales;f__Flavobacteriaceae;g__Cellulophaga	95.0	N/A	N/A	N/A	N/A	1	-
GCF_015355625.1	s__Tamlana_A sp015355625	75.9759	51	547	d__Bacteria;p__Bacteroidota;c__Bacteroidia;o__Flavobacteriales;f__Flavobacteriaceae;g__Tamlana_A	95.0	N/A	N/A	N/A	N/A	1	-
GCF_000190595.1	s__Cellulophaga lytica	75.9473	80	547	d__Bacteria;p__Bacteroidota;c__Bacteroidia;o__Flavobacteriales;f__Flavobacteriaceae;g__Cellulophaga	95.0	99.01	98.71	0.94	0.93	6	-
GCF_900102165.1	s__Cellulophaga baltica	75.943	72	547	d__Bacteria;p__Bacteroidota;c__Bacteroidia;o__Flavobacteriales;f__Flavobacteriaceae;g__Cellulophaga	95.0	97.62	97.40	0.90	0.89	7	-
GCF_007997385.1	s__Seonamhaeicola algicola	75.9362	92	547	d__Bacteria;p__Bacteroidota;c__Bacteroidia;o__Flavobacteriales;f__Flavobacteriaceae;g__Seonamhaeicola	95.0	N/A	N/A	N/A	N/A	1	-
GCA_009901575.1	s__Leptobacterium sp009901575	75.9341	84	547	d__Bacteria;p__Bacteroidota;c__Bacteroidia;o__Flavobacteriales;f__Flavobacteriaceae;g__Leptobacterium	95.0	N/A	N/A	N/A	N/A	1	-
GCA_905480415.1	s__Algibacter_B sp905480415	75.9334	65	547	d__Bacteria;p__Bacteroidota;c__Bacteroidia;o__Flavobacteriales;f__Flavobacteriaceae;g__Algibacter_B	95.0	N/A	N/A	N/A	N/A	1	-
GCF_000211855.2	s__Lacinutrix sp000211855	75.8667	78	547	d__Bacteria;p__Bacteroidota;c__Bacteroidia;o__Flavobacteriales;f__Flavobacteriaceae;g__Lacinutrix	95.0	N/A	N/A	N/A	N/A	1	-
GCA_001874145.1	s__Lacinutrix sp001874145	75.841	76	547	d__Bacteria;p__Bacteroidota;c__Bacteroidia;o__Flavobacteriales;f__Flavobacteriaceae;g__Lacinutrix	95.0	N/A	N/A	N/A	N/A	1	-
GCF_004310335.1	s__Jejuia pacifica	75.7981	67	547	d__Bacteria;p__Bacteroidota;c__Bacteroidia;o__Flavobacteriales;f__Flavobacteriaceae;g__Jejuia	95.0	100.00	100.00	1.00	1.00	2	-
GCF_006148925.1	s__Algibacter_D aestuarii	75.7492	74	547	d__Bacteria;p__Bacteroidota;c__Bacteroidia;o__Flavobacteriales;f__Flavobacteriaceae;g__Algibacter_D	95.0	N/A	N/A	N/A	N/A	1	-
GCA_002366675.1	s__Xanthomarina sp002366675	75.5333	65	547	d__Bacteria;p__Bacteroidota;c__Bacteroidia;o__Flavobacteriales;f__Flavobacteriaceae;g__Xanthomarina	95.0	N/A	N/A	N/A	N/A	1	-
--------------------------------------------------------------------------------
[2023-03-18 10:06:47,143] [INFO] GTDB search result was written to OceanDNA-b11434/result_gtdb.tsv
[2023-03-18 10:06:47,144] [INFO] ===== GTDB Search completed =====
[2023-03-18 10:06:47,145] [INFO] DFAST_QC result json was written to OceanDNA-b11434/dqc_result.json
[2023-03-18 10:06:47,146] [INFO] DFAST_QC completed!
[2023-03-18 10:06:47,146] [INFO] Total running time: 0h1m7s
