[2023-03-18 22:34:59,279] [INFO] DFAST_QC pipeline started.
[2023-03-18 22:34:59,280] [INFO] DFAST_QC version: 0.5.7
[2023-03-18 22:34:59,280] [INFO] DQC Reference Directory: /var/lib/cwl/stgc04cc6b4-ce05-44e9-a52a-baa1ff3a4aae/dqc_reference
[2023-03-18 22:35:00,397] [INFO] ===== Start taxonomy check using ANI =====
[2023-03-18 22:35:00,397] [INFO] Task started: Prodigal
[2023-03-18 22:35:00,398] [INFO] Running command: cat /var/lib/cwl/stg2619a48a-b3c1-48aa-b4b5-7ee1a0816209/OceanDNA-b24286.fa | prodigal -d OceanDNA-b24286/cds.fna -a OceanDNA-b24286/protein.faa -g 11 -q > /dev/null
[2023-03-18 22:35:32,587] [INFO] Task succeeded: Prodigal
[2023-03-18 22:35:32,588] [INFO] Task started: HMMsearch
[2023-03-18 22:35:32,588] [INFO] Running command: hmmsearch --tblout OceanDNA-b24286/hmmer_result.tsv -E 1E-50 /var/lib/cwl/stgc04cc6b4-ce05-44e9-a52a-baa1ff3a4aae/dqc_reference/reference_markers.hmm OceanDNA-b24286/protein.faa > /dev/null
[2023-03-18 22:35:32,822] [INFO] Task succeeded: HMMsearch
[2023-03-18 22:35:32,822] [INFO] Found 6/6 markers.
[2023-03-18 22:35:32,853] [INFO] Query marker FASTA was written to OceanDNA-b24286/markers.fasta
[2023-03-18 22:35:32,854] [INFO] Task started: Blastn
[2023-03-18 22:35:32,854] [INFO] Running command: blastn -query OceanDNA-b24286/markers.fasta -db /var/lib/cwl/stgc04cc6b4-ce05-44e9-a52a-baa1ff3a4aae/dqc_reference/reference_markers.fasta -out OceanDNA-b24286/blast.markers.tsv -outfmt 6 -max_hsps 1 -num_alignments 5
[2023-03-18 22:35:33,529] [INFO] Task succeeded: Blastn
[2023-03-18 22:35:33,530] [INFO] Selected 33 target genomes.
[2023-03-18 22:35:33,530] [INFO] Target genome list was written to OceanDNA-b24286/target_genomes.txt
[2023-03-18 22:35:33,546] [INFO] Task started: fastANI
[2023-03-18 22:35:33,546] [INFO] Running command: fastANI --query /var/lib/cwl/stg2619a48a-b3c1-48aa-b4b5-7ee1a0816209/OceanDNA-b24286.fa --refList OceanDNA-b24286/target_genomes.txt --output OceanDNA-b24286/fastani_result.tsv --threads 1
[2023-03-18 22:35:55,498] [INFO] Task succeeded: fastANI
[2023-03-18 22:35:55,498] [INFO] Loading species specific ANI threshold from /var/lib/cwl/stgc04cc6b4-ce05-44e9-a52a-baa1ff3a4aae/dqc_reference/prokaryote_ANI_species_specific_threshold.txt
[2023-03-18 22:35:55,499] [WARNING] Species-specific ANI threshold file not found. Will use the default threshold for all species. [/var/lib/cwl/stgc04cc6b4-ce05-44e9-a52a-baa1ff3a4aae/dqc_reference/prokaryote_ANI_species_specific_threshold.txt]
[2023-03-18 22:35:55,513] [INFO] Found 27 fastANI hits (0 hits with ANI > threshold)
[2023-03-18 22:35:55,513] [INFO] The taxonomy check result is classified as 'below_threshold'.
[2023-03-18 22:35:55,513] [INFO] DFAST Taxonomy check final result
--------------------------------------------------------------------------------
organism_name	strain	accession	taxid	species_taxid	relation_to_type	validated	ani	matched_fragments	total_fragments	ani_threshold	status
Pyruvatibacter mobilis	strain=GYP-11	GCA_009910475.1	1712261	1712261	type	True	75.8225	65	1670	95	below_threshold
Oceanibaculum nanhaiense	strain=L54-1-50	GCA_002148795.1	1909734	1909734	type	True	75.8143	94	1670	95	below_threshold
Pyruvatibacter mobilis	strain=CGMCC 1.15125	GCA_014640905.1	1712261	1712261	type	True	75.8058	64	1670	95	below_threshold
Pyruvatibacter mobilis	strain=CGMCC 1.15125	GCA_012848855.1	1712261	1712261	type	True	75.8027	66	1670	95	below_threshold
Oceanibaculum indicum	strain=P24	GCA_000299935.1	526216	526216	type	True	75.8026	92	1670	95	below_threshold
Ferrovibrio terrae	strain=K5	GCA_007197755.1	2594003	2594003	type	True	75.6879	90	1670	95	below_threshold
Oceanibaculum pacificum	strain=MCCC 1A02656	GCA_001618175.1	580166	580166	type	True	75.6788	96	1670	95	below_threshold
Zavarzinia aquatilis	strain=HR-AS	GCA_003173035.1	2211142	2211142	type	True	75.6388	98	1670	95	below_threshold
Sneathiella chungangensis	strain=KCTC 32476	GCA_009882935.1	1418234	1418234	type	True	75.6383	59	1670	95	below_threshold
Nisaea sediminum	strain=NBU1469	GCA_014904705.1	2775867	2775867	type	True	75.5489	76	1670	95	below_threshold
Skermanella aerolata	strain=5416T-32	GCA_000936425.1	393310	393310	type	True	75.3867	100	1670	95	below_threshold
Hwanghaeella grinnelliae	strain=Gri0909	GCA_004005845.1	2500179	2500179	type	True	75.3595	53	1670	95	below_threshold
Alexandriicola marinus	strain=LZ-14	GCA_004000435.1	2081710	2081710	type	True	75.3519	52	1670	95	below_threshold
Skermanella rosea	strain=KEMB 2255-458	GCA_016806835.2	1817965	1817965	type	True	75.298	101	1670	95	below_threshold
Xanthobacter oligotrophicus	strain=29k	GCA_008364685.1	2607286	2607286	type	True	75.2962	84	1670	95	below_threshold
Limimaricola hongkongensis	strain=UST950701-009P	GCA_000365005.1	278132	278132	type	True	75.2885	56	1670	95	below_threshold
Limimaricola hongkongensis	strain=DSM 17492	GCA_000600975.2	278132	278132	type	True	75.2852	56	1670	95	below_threshold
Candidatus Rhodoblastus alkanivorans		GCA_022760755.1	2954117	2954117	type	True	75.2795	72	1670	95	below_threshold
Rhabdonatronobacter sediminivivens	strain=IM2376	GCA_013415485.1	2743469	2743469	type	True	75.2472	64	1670	95	below_threshold
Kaistia granuli	strain=Ko04	GCA_000380505.1	363259	363259	type	True	75.247	66	1670	95	below_threshold
Skermanella stibiiresistens	strain=SB22	GCA_000576635.1	913326	913326	type	True	75.2115	101	1670	95	below_threshold
Methylobacterium adhaesivum	strain=DSM 17169	GCA_022179065.1	333297	333297	type	True	75.1931	52	1670	95	below_threshold
Geminicoccus roseus	strain=DSM 18922	GCA_000427665.1	404900	404900	type	True	75.1635	67	1670	95	below_threshold
Roseococcus pinisoli	strain=XZZS9	GCA_018413645.1	2835040	2835040	type	True	75.065	68	1670	95	below_threshold
Roseomonas marmotae	strain=1318	GCA_017654485.1	2768161	2768161	type	True	75.0524	78	1670	95	below_threshold
Nioella ostreopsis	strain=Z7-4	GCA_004000255.1	2448479	2448479	type	True	75.009	63	1670	95	below_threshold
Roseomonas haemaphysalidis	strain=546	GCA_017355405.1	2768162	2768162	type	True	74.9363	96	1670	95	below_threshold
--------------------------------------------------------------------------------
[2023-03-18 22:35:55,514] [INFO] DFAST Taxonomy check result was written to OceanDNA-b24286/tc_result.tsv
[2023-03-18 22:35:55,514] [INFO] ===== Taxonomy check completed =====
[2023-03-18 22:35:55,515] [INFO] ===== Start completeness check using CheckM =====
[2023-03-18 22:35:55,515] [INFO] Setting CHECKM_DATA_PATH to /var/lib/cwl/stgc04cc6b4-ce05-44e9-a52a-baa1ff3a4aae/dqc_reference/checkm_data
[2023-03-18 22:35:55,515] [INFO] Selected 'Prokaryote' markers (life, taxid=0) for CheckM
[2023-03-18 22:35:55,522] [INFO] Task started: CheckM
[2023-03-18 22:35:55,522] [INFO] Running command: checkm taxonomy_wf --tab_table -f OceanDNA-b24286/cc_result.tsv -t 1 life "Prokaryote" OceanDNA-b24286/checkm_input OceanDNA-b24286/checkm_result
[2023-03-18 22:37:11,743] [INFO] Task succeeded: CheckM
[2023-03-18 22:37:11,743] [INFO] Completeness check finished.
--------------------------------------------------------------------------------
Completeness: 83.33%
Contamination: 0.00%
Strain heterogeneity: 0.00%
--------------------------------------------------------------------------------
[2023-03-18 22:37:11,746] [INFO] ===== Completeness check finished =====
[2023-03-18 22:37:11,746] [INFO] ===== Start GTDB Search =====
[2023-03-18 22:37:11,747] [INFO] Query marker FASTA already exists. Will reuse it. (OceanDNA-b24286/markers.fasta)
[2023-03-18 22:37:11,747] [INFO] Task started: Blastn
[2023-03-18 22:37:11,748] [INFO] Running command: blastn -query OceanDNA-b24286/markers.fasta -db /var/lib/cwl/stgc04cc6b4-ce05-44e9-a52a-baa1ff3a4aae/dqc_reference/reference_markers_gtdb.fasta -out OceanDNA-b24286/blast.markers.gtdb.tsv -outfmt 6 -max_hsps 1 -num_alignments 5
[2023-03-18 22:37:12,973] [INFO] Task succeeded: Blastn
[2023-03-18 22:37:12,973] [INFO] Selected 9 target genomes.
[2023-03-18 22:37:12,973] [INFO] Target genome list was written to OceanDNA-b24286/target_genomes_gtdb.txt
[2023-03-18 22:37:12,983] [INFO] Task started: fastANI
[2023-03-18 22:37:12,983] [INFO] Running command: fastANI --query /var/lib/cwl/stg2619a48a-b3c1-48aa-b4b5-7ee1a0816209/OceanDNA-b24286.fa --refList OceanDNA-b24286/target_genomes_gtdb.txt --output OceanDNA-b24286/fastani_result_gtdb.tsv --threads 1
[2023-03-18 22:37:21,679] [INFO] Task succeeded: fastANI
[2023-03-18 22:37:21,684] [INFO] Found 7 fastANI hits (1 hits with ANI > circumscription radius)
[2023-03-18 22:37:21,684] [INFO] GTDB search result
--------------------------------------------------------------------------------
accession	gtdb_species	ani	matched_fragments	total_fragments	gtdb_taxonomy	ani_circumscription_radius	mean_intra_species_ani	min_intra_species_ani	mean_intra_species_af	min_intra_species_af	num_clustered_genomes	status
GCA_018701215.1	s__GCA-2731375 sp018701215	99.9593	1630	1670	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__GCA-2731375;f__GCA-2731375;g__GCA-2731375	95.0	99.97	99.95	0.98	0.97	15	conclusive
GCA_002731375.1	s__GCA-2731375 sp002731375	85.0778	1108	1670	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__GCA-2731375;f__GCA-2731375;g__GCA-2731375	95.0	99.58	99.58	0.91	0.91	2	-
GCA_016776605.1	s__GCA-2731375 sp016776605	79.2927	884	1670	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__GCA-2731375;f__GCA-2731375;g__GCA-2731375	95.0	N/A	N/A	N/A	N/A	1	-
GCA_018667855.1	s__GCA-2731375 sp018667855	79.1201	847	1670	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__GCA-2731375;f__GCA-2731375;g__GCA-2731375	95.0	99.62	99.58	0.85	0.84	3	-
GCA_018660265.1	s__GCA-2731375 sp018660265	78.578	769	1670	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__GCA-2731375;f__GCA-2731375;g__GCA-2731375	95.0	99.92	99.87	0.96	0.94	11	-
GCF_007197755.1	s__Ferrovibrio terrae	75.6693	92	1670	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Ferrovibrionales;f__Ferrovibrionaceae;g__Ferrovibrio	95.0	N/A	N/A	N/A	N/A	1	-
GCF_003119115.1	s__Azospirillum sp003119115	75.3034	109	1670	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Azospirillales;f__Azospirillaceae;g__Azospirillum	95.0	97.55	95.10	0.94	0.88	3	-
--------------------------------------------------------------------------------
[2023-03-18 22:37:21,685] [INFO] GTDB search result was written to OceanDNA-b24286/result_gtdb.tsv
[2023-03-18 22:37:21,685] [INFO] ===== GTDB Search completed =====
[2023-03-18 22:37:21,687] [INFO] DFAST_QC result json was written to OceanDNA-b24286/dqc_result.json
[2023-03-18 22:37:21,687] [INFO] DFAST_QC completed!
[2023-03-18 22:37:21,687] [INFO] Total running time: 0h2m22s
