[2023-03-19 05:01:20,170] [INFO] DFAST_QC pipeline started.
[2023-03-19 05:01:20,170] [INFO] DFAST_QC version: 0.5.7
[2023-03-19 05:01:20,170] [INFO] DQC Reference Directory: /var/lib/cwl/stge8f88aaa-8ec8-4da6-8964-ceacae2198e7/dqc_reference
[2023-03-19 05:01:21,332] [INFO] ===== Start taxonomy check using ANI =====
[2023-03-19 05:01:21,332] [INFO] Task started: Prodigal
[2023-03-19 05:01:21,333] [INFO] Running command: cat /var/lib/cwl/stg6a94cb44-a292-43f5-846b-9a1c20d38408/OceanDNA-b31070.fa | prodigal -d OceanDNA-b31070/cds.fna -a OceanDNA-b31070/protein.faa -g 11 -q > /dev/null
[2023-03-19 05:01:46,427] [INFO] Task succeeded: Prodigal
[2023-03-19 05:01:46,427] [INFO] Task started: HMMsearch
[2023-03-19 05:01:46,427] [INFO] Running command: hmmsearch --tblout OceanDNA-b31070/hmmer_result.tsv -E 1E-50 /var/lib/cwl/stge8f88aaa-8ec8-4da6-8964-ceacae2198e7/dqc_reference/reference_markers.hmm OceanDNA-b31070/protein.faa > /dev/null
[2023-03-19 05:01:46,621] [INFO] Task succeeded: HMMsearch
[2023-03-19 05:01:46,622] [WARNING] Found 5/6 markers. [/var/lib/cwl/stg6a94cb44-a292-43f5-846b-9a1c20d38408/OceanDNA-b31070.fa]
[2023-03-19 05:01:46,647] [INFO] Query marker FASTA was written to OceanDNA-b31070/markers.fasta
[2023-03-19 05:01:46,649] [INFO] Task started: Blastn
[2023-03-19 05:01:46,649] [INFO] Running command: blastn -query OceanDNA-b31070/markers.fasta -db /var/lib/cwl/stge8f88aaa-8ec8-4da6-8964-ceacae2198e7/dqc_reference/reference_markers.fasta -out OceanDNA-b31070/blast.markers.tsv -outfmt 6 -max_hsps 1 -num_alignments 5
[2023-03-19 05:01:47,283] [INFO] Task succeeded: Blastn
[2023-03-19 05:01:47,284] [INFO] Selected 26 target genomes.
[2023-03-19 05:01:47,284] [INFO] Target genome list was written to OceanDNA-b31070/target_genomes.txt
[2023-03-19 05:01:47,301] [INFO] Task started: fastANI
[2023-03-19 05:01:47,301] [INFO] Running command: fastANI --query /var/lib/cwl/stg6a94cb44-a292-43f5-846b-9a1c20d38408/OceanDNA-b31070.fa --refList OceanDNA-b31070/target_genomes.txt --output OceanDNA-b31070/fastani_result.tsv --threads 1
[2023-03-19 05:02:07,199] [INFO] Task succeeded: fastANI
[2023-03-19 05:02:07,200] [INFO] Loading species specific ANI threshold from /var/lib/cwl/stge8f88aaa-8ec8-4da6-8964-ceacae2198e7/dqc_reference/prokaryote_ANI_species_specific_threshold.txt
[2023-03-19 05:02:07,200] [WARNING] Species-specific ANI threshold file not found. Will use the default threshold for all species. [/var/lib/cwl/stge8f88aaa-8ec8-4da6-8964-ceacae2198e7/dqc_reference/prokaryote_ANI_species_specific_threshold.txt]
[2023-03-19 05:02:07,212] [INFO] Found 20 fastANI hits (0 hits with ANI > threshold)
[2023-03-19 05:02:07,212] [INFO] The taxonomy check result is classified as 'below_threshold'.
[2023-03-19 05:02:07,212] [INFO] DFAST Taxonomy check final result
--------------------------------------------------------------------------------
organism_name	strain	accession	taxid	species_taxid	relation_to_type	validated	ani	matched_fragments	total_fragments	ani_threshold	status
Magnetospirillum moscoviense	strain=BB-1	GCA_001650635.1	1437059	1437059	type	True	76.0242	129	1317	95	below_threshold
Oceanibaculum pacificum	strain=MCCC 1A02656	GCA_001618175.1	580166	580166	type	True	76.0178	118	1317	95	below_threshold
Pelagibius marinus	strain=NBU2595	GCA_014925385.1	2762760	2762760	type	True	76.0046	129	1317	95	below_threshold
Magnetospirillum kuznetsovii	strain=LBB-42	GCA_003284725.1	2053833	2053833	type	True	75.9939	143	1317	95	below_threshold
Magnetospirillum aberrantis	strain=SpK	GCA_011022235.1	1105283	1105283	type	True	75.9429	103	1317	95	below_threshold
Magnetospirillum gryphiswaldense	strain=MSR-1	GCA_000513295.1	55518	55518	type	True	75.9179	131	1317	95	below_threshold
Magnetospirillum gryphiswaldense	strain=MSR-1	GCA_002995515.1	55518	55518	type	True	75.8826	127	1317	95	below_threshold
Stella humosa	strain=ATCC 43930	GCA_006738645.1	94	94	type	True	75.5836	115	1317	95	below_threshold
Nisaea nitritireducens	strain=DSM 19540	GCA_014904795.1	568392	568392	type	True	75.5768	61	1317	95	below_threshold
Stella humosa	strain=DSM 5900	GCA_003751345.1	94	94	type	True	75.5472	119	1317	95	below_threshold
Mesorhizobium carmichaelinearum	strain=ICMP 18942	GCA_900199455.1	1208188	1208188	type	True	75.506	106	1317	95	below_threshold
Xanthobacter aminoxidans	strain=ATCC BAA-299	GCA_023571765.1	186280	186280	type	True	75.4017	98	1317	95	below_threshold
Reyranella aquatilis	strain=KCTC 52223	GCA_020880995.1	2035356	2035356	type	True	75.3896	89	1317	95	below_threshold
Bradyrhizobium neotropicale	strain=BR 10247	GCA_001641695.1	1497615	1497615	type	True	75.3659	91	1317	95	below_threshold
Rhodopseudomonas rhenobacensis	strain=DSM 12706	GCA_014203125.1	87461	87461	type	True	75.3214	102	1317	95	below_threshold
Terrihabitans soli	strain=IZ6	GCA_014191545.1	708113	708113	type	True	75.2845	58	1317	95	below_threshold
Xanthobacter agilis	strain=LMG 16336	GCA_021730435.1	47492	47492	type	True	75.2763	83	1317	95	below_threshold
Vineibacter terrae	strain=CC-CFT640	GCA_008039615.1	2586908	2586908	type	True	75.2686	142	1317	95	below_threshold
Limimaricola pyoseonensis	strain=DSM 21424	GCA_900102015.1	521013	521013	type	True	75.1834	92	1317	95	below_threshold
Xanthobacter oligotrophicus	strain=29k	GCA_008364685.1	2607286	2607286	type	True	75.1403	102	1317	95	below_threshold
--------------------------------------------------------------------------------
[2023-03-19 05:02:07,212] [INFO] DFAST Taxonomy check result was written to OceanDNA-b31070/tc_result.tsv
[2023-03-19 05:02:07,212] [INFO] ===== Taxonomy check completed =====
[2023-03-19 05:02:07,212] [INFO] ===== Start completeness check using CheckM =====
[2023-03-19 05:02:07,213] [INFO] Setting CHECKM_DATA_PATH to /var/lib/cwl/stge8f88aaa-8ec8-4da6-8964-ceacae2198e7/dqc_reference/checkm_data
[2023-03-19 05:02:07,213] [INFO] Selected 'Prokaryote' markers (life, taxid=0) for CheckM
[2023-03-19 05:02:07,505] [INFO] Task started: CheckM
[2023-03-19 05:02:07,505] [INFO] Running command: checkm taxonomy_wf --tab_table -f OceanDNA-b31070/cc_result.tsv -t 1 life "Prokaryote" OceanDNA-b31070/checkm_input OceanDNA-b31070/checkm_result
[2023-03-19 05:03:08,250] [INFO] Task succeeded: CheckM
[2023-03-19 05:03:08,251] [INFO] Completeness check finished.
--------------------------------------------------------------------------------
Completeness: 90.36%
Contamination: 0.00%
Strain heterogeneity: 0.00%
--------------------------------------------------------------------------------
[2023-03-19 05:03:08,253] [INFO] ===== Completeness check finished =====
[2023-03-19 05:03:08,254] [INFO] ===== Start GTDB Search =====
[2023-03-19 05:03:08,254] [INFO] Query marker FASTA already exists. Will reuse it. (OceanDNA-b31070/markers.fasta)
[2023-03-19 05:03:08,255] [INFO] Task started: Blastn
[2023-03-19 05:03:08,255] [INFO] Running command: blastn -query OceanDNA-b31070/markers.fasta -db /var/lib/cwl/stge8f88aaa-8ec8-4da6-8964-ceacae2198e7/dqc_reference/reference_markers_gtdb.fasta -out OceanDNA-b31070/blast.markers.gtdb.tsv -outfmt 6 -max_hsps 1 -num_alignments 5
[2023-03-19 05:03:09,303] [INFO] Task succeeded: Blastn
[2023-03-19 05:03:09,304] [INFO] Selected 23 target genomes.
[2023-03-19 05:03:09,304] [INFO] Target genome list was written to OceanDNA-b31070/target_genomes_gtdb.txt
[2023-03-19 05:03:09,323] [INFO] Task started: fastANI
[2023-03-19 05:03:09,323] [INFO] Running command: fastANI --query /var/lib/cwl/stg6a94cb44-a292-43f5-846b-9a1c20d38408/OceanDNA-b31070.fa --refList OceanDNA-b31070/target_genomes_gtdb.txt --output OceanDNA-b31070/fastani_result_gtdb.tsv --threads 1
[2023-03-19 05:03:24,301] [INFO] Task succeeded: fastANI
[2023-03-19 05:03:24,313] [INFO] Found 20 fastANI hits (0 hits with ANI > circumscription radius)
[2023-03-19 05:03:24,313] [INFO] GTDB search result
--------------------------------------------------------------------------------
accession	gtdb_species	ani	matched_fragments	total_fragments	gtdb_taxonomy	ani_circumscription_radius	mean_intra_species_ani	min_intra_species_ani	mean_intra_species_af	min_intra_species_af	num_clustered_genomes	status
GCA_018674495.1	s__GCA-2687515 sp018674495	88.5641	1071	1317	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhodospirillales;f__UBA2165;g__GCA-2687515	95.0	99.77	99.72	0.95	0.94	13	-
GCA_002687515.1	s__GCA-2687515 sp002687515	80.4585	646	1317	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhodospirillales;f__UBA2165;g__GCA-2687515	95.0	N/A	N/A	N/A	N/A	1	-
GCF_000968135.1	s__Magnetospira sp000968135	76.2146	88	1317	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhodospirillales;f__Magnetospiraceae;g__Magnetospira	95.0	N/A	N/A	N/A	N/A	1	-
GCA_002706405.1	s__UBA1479 sp002706405	76.0432	108	1317	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhodospirillales;f__Casp-alpha2;g__UBA1479	95.0	99.68	99.68	0.89	0.89	2	-
GCA_002325375.1	s__UBA1479 sp002325375	76.03	133	1317	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhodospirillales;f__Casp-alpha2;g__UBA1479	95.0	99.20	96.43	0.95	0.85	6	-
GCF_902729435.1	s__Magnetospirillum sp902729435	76.0231	141	1317	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhodospirillales;f__Magnetospirillaceae;g__Magnetospirillum	95.0	N/A	N/A	N/A	N/A	1	-
GCF_018224925.1	s__Magnetospirillum sp018224925	75.9479	108	1317	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhodospirillales;f__Magnetospirillaceae;g__Magnetospirillum	95.0	N/A	N/A	N/A	N/A	1	-
GCA_002696205.1	s__UBA1479 sp002696205	75.9053	106	1317	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhodospirillales;f__Casp-alpha2;g__UBA1479	95.0	N/A	N/A	N/A	N/A	1	-
GCA_002725625.1	s__GCA-2725625 sp002725625	75.9053	72	1317	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhodospirillales;f__UBA2165;g__GCA-2725625	95.0	N/A	N/A	N/A	N/A	1	-
GCA_015231605.1	s__JADGBG01 sp015231605	75.8787	86	1317	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhodospirillales;f__Magnetovibrionaceae;g__JADGBG01	95.0	99.19	99.19	0.87	0.87	2	-
GCA_006218045.1	s__Magnetospirillum sp006218045	75.7979	151	1317	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhodospirillales;f__Magnetospirillaceae;g__Magnetospirillum	95.0	99.95	99.95	0.98	0.98	2	-
GCF_900101965.1	s__Rhodospira trueperi	75.7622	95	1317	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhodospirillales;f__Rhodospirillaceae;g__Rhodospira	95.0	N/A	N/A	N/A	N/A	1	-
GCA_015232025.1	s__JADFZP01 sp015232025	75.7525	88	1317	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhodospirillales;f__Magnetospirillaceae;g__JADFZP01	95.0	N/A	N/A	N/A	N/A	1	-
GCA_018336035.1	s__Ferrovibrio sp018336035	75.5945	81	1317	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Ferrovibrionales;f__Ferrovibrionaceae;g__Ferrovibrio	95.0	N/A	N/A	N/A	N/A	1	-
GCA_000961745.1	s__BRH-c57 sp000961745	75.5129	63	1317	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhodospirillales;f__Rhodospirillaceae;g__BRH-c57	95.0	N/A	N/A	N/A	N/A	1	-
GCF_008364955.1	s__Azospirillum lipoferum	75.4803	144	1317	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Azospirillales;f__Azospirillaceae;g__Azospirillum	95.0	N/A	N/A	N/A	N/A	1	-
GCF_004923295.1	s__Azospirillum sp003115975	75.3941	140	1317	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Azospirillales;f__Azospirillaceae;g__Azospirillum	95.0	99.99	99.99	0.99	0.99	2	-
GCA_014191545.1	s__Ga0077545 sp014191545	75.2898	57	1317	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhizobiales;f__Methylopilaceae;g__Ga0077545	95.0	N/A	N/A	N/A	N/A	1	-
GCF_900102015.1	s__Limimaricola pyoseonensis	75.1771	93	1317	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhodobacterales;f__Rhodobacteraceae;g__Limimaricola	95.0	N/A	N/A	N/A	N/A	1	-
GCA_903839965.1	s__BOG-930 sp903839965	75.0613	75	1317	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Acetobacterales;f__Acetobacteraceae;g__BOG-930	95.0	N/A	N/A	N/A	N/A	1	-
--------------------------------------------------------------------------------
[2023-03-19 05:03:24,313] [INFO] GTDB search result was written to OceanDNA-b31070/result_gtdb.tsv
[2023-03-19 05:03:24,313] [INFO] ===== GTDB Search completed =====
[2023-03-19 05:03:24,315] [INFO] DFAST_QC result json was written to OceanDNA-b31070/dqc_result.json
[2023-03-19 05:03:24,315] [INFO] DFAST_QC completed!
[2023-03-19 05:03:24,315] [INFO] Total running time: 0h2m4s
