[2023-03-15 10:21:47,925] [INFO] DFAST_QC pipeline started.
[2023-03-15 10:21:47,925] [INFO] DFAST_QC version: 0.5.7
[2023-03-15 10:21:47,925] [INFO] DQC Reference Directory: /var/lib/cwl/stg0364e319-0867-4fc2-8786-0a9872ddb948/dqc_reference
[2023-03-15 10:21:49,163] [INFO] ===== Start taxonomy check using ANI =====
[2023-03-15 10:21:49,163] [INFO] Task started: Prodigal
[2023-03-15 10:21:49,164] [INFO] Running command: cat /var/lib/cwl/stgeaa26cfc-4081-4db0-a8e9-fb1346baf66b/OceanDNA-b7004.fa | prodigal -d OceanDNA-b7004/cds.fna -a OceanDNA-b7004/protein.faa -g 11 -q > /dev/null
[2023-03-15 10:22:02,899] [INFO] Task succeeded: Prodigal
[2023-03-15 10:22:02,899] [INFO] Task started: HMMsearch
[2023-03-15 10:22:02,899] [INFO] Running command: hmmsearch --tblout OceanDNA-b7004/hmmer_result.tsv -E 1E-50 /var/lib/cwl/stg0364e319-0867-4fc2-8786-0a9872ddb948/dqc_reference/reference_markers.hmm OceanDNA-b7004/protein.faa > /dev/null
[2023-03-15 10:22:03,143] [INFO] Task succeeded: HMMsearch
[2023-03-15 10:22:03,143] [INFO] Found 6/6 markers.
[2023-03-15 10:22:03,157] [INFO] Query marker FASTA was written to OceanDNA-b7004/markers.fasta
[2023-03-15 10:22:03,158] [INFO] Task started: Blastn
[2023-03-15 10:22:03,158] [INFO] Running command: blastn -query OceanDNA-b7004/markers.fasta -db /var/lib/cwl/stg0364e319-0867-4fc2-8786-0a9872ddb948/dqc_reference/reference_markers.fasta -out OceanDNA-b7004/blast.markers.tsv -outfmt 6 -max_hsps 1 -num_alignments 5
[2023-03-15 10:22:03,816] [INFO] Task succeeded: Blastn
[2023-03-15 10:22:03,817] [INFO] Selected 35 target genomes.
[2023-03-15 10:22:03,817] [INFO] Target genome list was written to OceanDNA-b7004/target_genomes.txt
[2023-03-15 10:22:03,881] [INFO] Task started: fastANI
[2023-03-15 10:22:03,881] [INFO] Running command: fastANI --query /var/lib/cwl/stgeaa26cfc-4081-4db0-a8e9-fb1346baf66b/OceanDNA-b7004.fa --refList OceanDNA-b7004/target_genomes.txt --output OceanDNA-b7004/fastani_result.tsv --threads 1
[2023-03-15 10:22:26,041] [INFO] Task succeeded: fastANI
[2023-03-15 10:22:26,042] [INFO] Loading species specific ANI threshold from /var/lib/cwl/stg0364e319-0867-4fc2-8786-0a9872ddb948/dqc_reference/prokaryote_ANI_species_specific_threshold.txt
[2023-03-15 10:22:26,042] [WARNING] Species-specific ANI threshold file not found. Will use the default threshold for all species. [/var/lib/cwl/stg0364e319-0867-4fc2-8786-0a9872ddb948/dqc_reference/prokaryote_ANI_species_specific_threshold.txt]
[2023-03-15 10:22:26,061] [INFO] Found 31 fastANI hits (0 hits with ANI > threshold)
[2023-03-15 10:22:26,062] [INFO] The taxonomy check result is classified as 'below_threshold'.
[2023-03-15 10:22:26,062] [INFO] DFAST Taxonomy check final result
--------------------------------------------------------------------------------
organism_name	strain	accession	taxid	species_taxid	relation_to_type	validated	ani	matched_fragments	total_fragments	ani_threshold	status
Patiriisocius marinus	strain=NBRC 109484	GCA_008974325.1	1397112	1397112	type	True	76.6612	139	755	95	below_threshold
Pontimicrobium aquaticum	strain=CAU 1491	GCA_005047595.1	2565367	2565367	type	True	76.6227	103	755	95	below_threshold
Cochleicola gelatinilyticus	strain=LPB0005	GCA_001637325.1	1763537	1763537	type	True	76.562	137	755	95	below_threshold
Psychroserpens jangbogonensis	strain=PAMC 27130	GCA_000797465.1	1484460	1484460	type	True	76.553	113	755	95	below_threshold
Algibacter alginicilyticus	strain=HZ22	GCA_001310225.1	1736674	1736674	type	True	76.5245	106	755	95	below_threshold
Winogradskyella litoriviva	strain=KMM6491	GCA_013249065.1	1220182	1220182	type	True	76.4887	118	755	95	below_threshold
Winogradskyella wichelsiae	strain=Z738	GCA_013403925.1	2697007	2697007	type	True	76.4062	118	755	95	below_threshold
Flavivirga algicola	strain=Y03	GCA_012910715.1	2729136	2729136	type	True	76.3921	90	755	95	below_threshold
Algibacter pacificus	strain=H164	GCA_008033385.1	2599389	2599389	type	True	76.3505	98	755	95	below_threshold
Winogradskyella eckloniae	strain=EC29	GCA_013249045.1	1089306	1089306	type	True	76.3484	102	755	95	below_threshold
Formosa algae	strain=KMM 3553	GCA_001439665.1	225843	225843	type	True	76.348	91	755	95	below_threshold
Winogradskyella ludwigii	strain=HL116	GCA_013403985.1	2686076	2686076	type	True	76.3478	120	755	95	below_threshold
Winogradskyella echinorum	strain=KCTC 22026	GCA_014284085.1	538189	538189	type	True	76.3446	115	755	95	below_threshold
Winogradskyella echinorum	strain=KCTC 22026	GCA_014297365.1	538189	538189	type	True	76.3446	115	755	95	below_threshold
Bizionia argentinensis	strain=JUB59	GCA_000224335.2	456455	456455	type	True	76.3403	98	755	95	below_threshold
Marixanthomonas ophiurae	strain=KMM 3046	GCA_003413745.1	387659	387659	type	True	76.2861	124	755	95	below_threshold
Winogradskyella vidalii	strain=HL634	GCA_013403955.1	2615024	2615024	type	True	76.2771	97	755	95	below_threshold
Formosa agariphila	strain=type strain: KMM 3901	GCA_000723205.1	320324	320324	type	True	76.2678	109	755	95	below_threshold
Winogradskyella sediminis	strain=DSM 28134	GCA_003387355.1	1382466	1382466	type	True	76.2371	92	755	95	below_threshold
Aequorivita sinensis	strain=S1-10	GCA_006346335.1	1382458	1382458	type	True	76.2343	108	755	95	below_threshold
Psychroserpens mesophilus	strain=JCM 13413	GCA_000826645.1	325473	325473	type	True	76.1778	103	755	95	below_threshold
Arenitalea lutea	strain=CGMCC 1.12213	GCA_900141715.1	1178825	1178825	type	True	76.0957	110	755	95	below_threshold
Aquimarina amphilecti	strain=DSM 25232	GCA_900109375.1	1038014	1038014	type	True	76.0875	100	755	95	below_threshold
Snuella sedimenti	strain=CAU 1569	GCA_016428595.1	2798802	2798802	type	True	76.0804	69	755	95	below_threshold
Mangrovimonas spongiae	strain=HN-E26	GCA_003944795.1	2494697	2494697	type	True	76.0765	78	755	95	below_threshold
Arenitalea lutea	strain=P7-3-5	GCA_000283015.1	1178825	1178825	type	True	76.0165	114	755	95	below_threshold
Leeuwenhoekiella aequorea	strain=LMG 22550	GCA_004104375.1	283736	283736	type	True	75.9871	64	755	95	below_threshold
Cellulophaga algicola	strain=DSM 14237	GCA_000186265.1	59600	59600	type	True	75.9385	81	755	95	below_threshold
Aquimarina pacifica	strain=SW150	GCA_000520955.1	1296415	1296415	type	True	75.8455	70	755	95	below_threshold
Flavobacterium celericrescens	strain=TWA-26	GCA_011392075.1	2709780	2709780	type	True	75.6796	67	755	95	below_threshold
Flavobacterium davisii	strain=90-106	GCA_019565505.1	2906077	2906077	type	True	75.4734	57	755	95	below_threshold
--------------------------------------------------------------------------------
[2023-03-15 10:22:26,062] [INFO] DFAST Taxonomy check result was written to OceanDNA-b7004/tc_result.tsv
[2023-03-15 10:22:26,062] [INFO] ===== Taxonomy check completed =====
[2023-03-15 10:22:26,062] [INFO] ===== Start completeness check using CheckM =====
[2023-03-15 10:22:26,062] [INFO] Setting CHECKM_DATA_PATH to /var/lib/cwl/stg0364e319-0867-4fc2-8786-0a9872ddb948/dqc_reference/checkm_data
[2023-03-15 10:22:26,063] [INFO] Selected 'Prokaryote' markers (life, taxid=0) for CheckM
[2023-03-15 10:22:26,066] [INFO] Task started: CheckM
[2023-03-15 10:22:26,066] [INFO] Running command: checkm taxonomy_wf --tab_table -f OceanDNA-b7004/cc_result.tsv -t 1 life "Prokaryote" OceanDNA-b7004/checkm_input OceanDNA-b7004/checkm_result
[2023-03-15 10:23:03,017] [INFO] Task succeeded: CheckM
[2023-03-15 10:23:03,018] [INFO] Completeness check finished.
--------------------------------------------------------------------------------
Completeness: 83.33%
Contamination: 0.00%
Strain heterogeneity: 0.00%
--------------------------------------------------------------------------------
[2023-03-15 10:23:03,020] [INFO] ===== Completeness check finished =====
[2023-03-15 10:23:03,020] [INFO] ===== Start GTDB Search =====
[2023-03-15 10:23:03,020] [INFO] Query marker FASTA already exists. Will reuse it. (OceanDNA-b7004/markers.fasta)
[2023-03-15 10:23:03,020] [INFO] Task started: Blastn
[2023-03-15 10:23:03,021] [INFO] Running command: blastn -query OceanDNA-b7004/markers.fasta -db /var/lib/cwl/stg0364e319-0867-4fc2-8786-0a9872ddb948/dqc_reference/reference_markers_gtdb.fasta -out OceanDNA-b7004/blast.markers.gtdb.tsv -outfmt 6 -max_hsps 1 -num_alignments 5
[2023-03-15 10:23:03,877] [INFO] Task succeeded: Blastn
[2023-03-15 10:23:03,878] [INFO] Selected 9 target genomes.
[2023-03-15 10:23:03,878] [INFO] Target genome list was written to OceanDNA-b7004/target_genomes_gtdb.txt
[2023-03-15 10:23:04,289] [INFO] Task started: fastANI
[2023-03-15 10:23:04,289] [INFO] Running command: fastANI --query /var/lib/cwl/stgeaa26cfc-4081-4db0-a8e9-fb1346baf66b/OceanDNA-b7004.fa --refList OceanDNA-b7004/target_genomes_gtdb.txt --output OceanDNA-b7004/fastani_result_gtdb.tsv --threads 1
[2023-03-15 10:23:08,539] [INFO] Task succeeded: fastANI
[2023-03-15 10:23:08,545] [INFO] Found 9 fastANI hits (0 hits with ANI > circumscription radius)
[2023-03-15 10:23:08,545] [INFO] GTDB search result
--------------------------------------------------------------------------------
accession	gtdb_species	ani	matched_fragments	total_fragments	gtdb_taxonomy	ani_circumscription_radius	mean_intra_species_ani	min_intra_species_ani	mean_intra_species_af	min_intra_species_af	num_clustered_genomes	status
GCA_018669795.1	s__GCA-002733185 sp018669795	87.9225	561	755	d__Bacteria;p__Bacteroidota;c__Bacteroidia;o__Flavobacteriales;f__Flavobacteriaceae;g__GCA-002733185	95.0	98.54	95.82	0.94	0.91	4	-
GCA_002713705.1	s__GCA-002733185 sp002713705	84.9519	564	755	d__Bacteria;p__Bacteroidota;c__Bacteroidia;o__Flavobacteriales;f__Flavobacteriaceae;g__GCA-002733185	95.0	97.32	97.11	0.93	0.92	3	-
GCA_905181835.1	s__GCA-002733185 sp905181835	78.8988	314	755	d__Bacteria;p__Bacteroidota;c__Bacteroidia;o__Flavobacteriales;f__Flavobacteriaceae;g__GCA-002733185	95.0	N/A	N/A	N/A	N/A	1	-
GCA_002733185.1	s__GCA-002733185 sp002733185	78.8135	353	755	d__Bacteria;p__Bacteroidota;c__Bacteroidia;o__Flavobacteriales;f__Flavobacteriaceae;g__GCA-002733185	95.0	N/A	N/A	N/A	N/A	1	-
GCA_004214175.1	s__GCA-002733185 sp004214175	78.3409	192	755	d__Bacteria;p__Bacteroidota;c__Bacteroidia;o__Flavobacteriales;f__Flavobacteriaceae;g__GCA-002733185	95.0	N/A	N/A	N/A	N/A	1	-
GCA_004213605.1	s__GCA-002733185 sp004213605	78.3121	221	755	d__Bacteria;p__Bacteroidota;c__Bacteroidia;o__Flavobacteriales;f__Flavobacteriaceae;g__GCA-002733185	95.0	95.79	95.79	0.89	0.89	2	-
GCA_019090365.1	s__GCA-002733185 sp019090365	77.2859	206	755	d__Bacteria;p__Bacteroidota;c__Bacteroidia;o__Flavobacteriales;f__Flavobacteriaceae;g__GCA-002733185	95.0	N/A	N/A	N/A	N/A	1	-
GCF_009796865.1	s__Formosa sp009796865	76.2973	110	755	d__Bacteria;p__Bacteroidota;c__Bacteroidia;o__Flavobacteriales;f__Flavobacteriaceae;g__Formosa	95.0	N/A	N/A	N/A	N/A	1	-
GCA_002441545.1	s__Yeosuana sp002441545	76.0683	101	755	d__Bacteria;p__Bacteroidota;c__Bacteroidia;o__Flavobacteriales;f__Flavobacteriaceae;g__Yeosuana	95.0	N/A	N/A	N/A	N/A	1	-
--------------------------------------------------------------------------------
[2023-03-15 10:23:08,545] [INFO] GTDB search result was written to OceanDNA-b7004/result_gtdb.tsv
[2023-03-15 10:23:08,545] [INFO] ===== GTDB Search completed =====
[2023-03-15 10:23:08,547] [INFO] DFAST_QC result json was written to OceanDNA-b7004/dqc_result.json
[2023-03-15 10:23:08,548] [INFO] DFAST_QC completed!
[2023-03-15 10:23:08,548] [INFO] Total running time: 0h1m21s
