[2023-03-15 21:24:38,452] [INFO] DFAST_QC pipeline started.
[2023-03-15 21:24:38,452] [INFO] DFAST_QC version: 0.5.7
[2023-03-15 21:24:38,452] [INFO] DQC Reference Directory: /var/lib/cwl/stg9476ef5f-51aa-4b55-a9fb-71c3a51d67a1/dqc_reference
[2023-03-15 21:24:39,526] [INFO] ===== Start taxonomy check using ANI =====
[2023-03-15 21:24:39,526] [INFO] Task started: Prodigal
[2023-03-15 21:24:39,526] [INFO] Running command: cat /var/lib/cwl/stg73893f29-51d4-4881-8ae4-8a8b7aa91048/OceanDNA-b27513.fa | prodigal -d OceanDNA-b27513/cds.fna -a OceanDNA-b27513/protein.faa -g 11 -q > /dev/null
[2023-03-15 21:24:52,531] [INFO] Task succeeded: Prodigal
[2023-03-15 21:24:52,531] [INFO] Task started: HMMsearch
[2023-03-15 21:24:52,531] [INFO] Running command: hmmsearch --tblout OceanDNA-b27513/hmmer_result.tsv -E 1E-50 /var/lib/cwl/stg9476ef5f-51aa-4b55-a9fb-71c3a51d67a1/dqc_reference/reference_markers.hmm OceanDNA-b27513/protein.faa > /dev/null
[2023-03-15 21:24:52,718] [INFO] Task succeeded: HMMsearch
[2023-03-15 21:24:52,718] [WARNING] Found 4/6 markers. [/var/lib/cwl/stg73893f29-51d4-4881-8ae4-8a8b7aa91048/OceanDNA-b27513.fa]
[2023-03-15 21:24:52,740] [INFO] Query marker FASTA was written to OceanDNA-b27513/markers.fasta
[2023-03-15 21:24:52,740] [INFO] Task started: Blastn
[2023-03-15 21:24:52,740] [INFO] Running command: blastn -query OceanDNA-b27513/markers.fasta -db /var/lib/cwl/stg9476ef5f-51aa-4b55-a9fb-71c3a51d67a1/dqc_reference/reference_markers.fasta -out OceanDNA-b27513/blast.markers.tsv -outfmt 6 -max_hsps 1 -num_alignments 5
[2023-03-15 21:24:53,468] [INFO] Task succeeded: Blastn
[2023-03-15 21:24:53,469] [INFO] Selected 27 target genomes.
[2023-03-15 21:24:53,469] [INFO] Target genome list was written to OceanDNA-b27513/target_genomes.txt
[2023-03-15 21:24:53,488] [INFO] Task started: fastANI
[2023-03-15 21:24:53,488] [INFO] Running command: fastANI --query /var/lib/cwl/stg73893f29-51d4-4881-8ae4-8a8b7aa91048/OceanDNA-b27513.fa --refList OceanDNA-b27513/target_genomes.txt --output OceanDNA-b27513/fastani_result.tsv --threads 1
[2023-03-15 21:25:22,037] [INFO] Task succeeded: fastANI
[2023-03-15 21:25:22,038] [INFO] Loading species specific ANI threshold from /var/lib/cwl/stg9476ef5f-51aa-4b55-a9fb-71c3a51d67a1/dqc_reference/prokaryote_ANI_species_specific_threshold.txt
[2023-03-15 21:25:22,038] [WARNING] Species-specific ANI threshold file not found. Will use the default threshold for all species. [/var/lib/cwl/stg9476ef5f-51aa-4b55-a9fb-71c3a51d67a1/dqc_reference/prokaryote_ANI_species_specific_threshold.txt]
[2023-03-15 21:25:22,053] [INFO] Found 27 fastANI hits (0 hits with ANI > threshold)
[2023-03-15 21:25:22,053] [INFO] The taxonomy check result is classified as 'below_threshold'.
[2023-03-15 21:25:22,053] [INFO] DFAST Taxonomy check final result
--------------------------------------------------------------------------------
organism_name	strain	accession	taxid	species_taxid	relation_to_type	validated	ani	matched_fragments	total_fragments	ani_threshold	status
Bradyrhizobium nitroreducens	strain=TSA1	GCA_002776695.1	709803	709803	type	True	77.6562	197	616	95	below_threshold
Rhodopseudomonas rhenobacensis	strain=DSM 12706	GCA_014203125.1	87461	87461	type	True	77.6016	203	616	95	below_threshold
Bradyrhizobium sediminis	strain=S2-20-1	GCA_018736085.1	2840469	2840469	type	True	77.5719	205	616	95	below_threshold
Bradyrhizobium frederickii	strain=CNPSo 3426	GCA_004570865.1	2560054	2560054	type	True	77.5303	203	616	95	below_threshold
Pseudorhodoplanes sinuspersici	strain=CECT 8374	GCA_003610435.1	1235591	1235591	type	True	77.4669	200	616	95	below_threshold
Afipia broomeae	strain=ATCC 49717	GCA_000314675.2	56946	56946	type	True	77.4626	159	616	95	below_threshold
Pseudorhodoplanes sinuspersici	strain=RIPI110	GCA_002119765.1	1235591	1235591	type	True	77.453	202	616	95	below_threshold
Bradyrhizobium valentinum	strain=LmjM3	GCA_001440405.1	1518501	1518501	type	True	77.3965	188	616	95	below_threshold
Bradyrhizobium elkanii	strain=USDA 76	GCA_000379145.1	29448	29448	type	True	77.3349	208	616	95	below_threshold
Bradyrhizobium elkanii	strain=USDA 76	GCA_023278185.1	29448	29448	type	True	77.3321	206	616	95	below_threshold
Bradyrhizobium oligotrophicum	strain=S58	GCA_000344805.1	44255	44255	type	True	77.325	192	616	95	below_threshold
Bradyrhizobium cenepequi	strain=CNPSo 4026	GCA_020329485.1	2821403	2821403	type	True	77.305	176	616	95	below_threshold
Bradyrhizobium quebecense	strain=66S1MB, /ecotype=symbiovar septentrionalis	GCA_013373795.3	2748629	2748629	type	True	77.286	195	616	95	below_threshold
Bradyrhizobium icense	strain=LMTR 13	GCA_001693385.1	1274631	1274631	type	True	77.2749	179	616	95	below_threshold
Bradyrhizobium elkanii	strain=NBRC 14791	GCA_006539665.1	29448	29448	type	True	77.2694	207	616	95	below_threshold
Bradyrhizobium altum	strain=Pear77	GCA_020889705.1	1571202	1571202	type	True	77.253	193	616	95	below_threshold
Bradyrhizobium cosmicum	strain=58S1	GCA_007290395.1	1404864	1404864	type	True	77.2065	207	616	95	below_threshold
Bradyrhizobium pachyrhizi	strain=PAC 48	GCA_001189245.1	280333	280333	type	True	77.1029	215	616	95	below_threshold
Bradyrhizobium niftali	strain=CNPSo 3448	GCA_004571025.1	2560055	2560055	type	True	77.0883	214	616	95	below_threshold
Rhodoplanes piscinae	strain=DSM 19946	GCA_003258855.1	444923	444923	type	True	76.9335	161	616	95	below_threshold
Rhodoplanes roseus	strain=DSM 5909	GCA_003258865.1	29409	29409	type	True	76.8	188	616	95	below_threshold
Methylobacterium bullatum	strain=DSM 21893	GCA_022179105.1	570505	570505	type	True	76.4677	69	616	95	below_threshold
Methylobacterium crusticola	strain=KCTC 52305	GCA_022179145.1	1697972	1697972	type	True	76.3075	103	616	95	below_threshold
Rhizobium populisoli	strain=XQZ8	GCA_019430945.1	2859785	2859785	type	True	76.1219	76	616	95	below_threshold
Methylobacterium mesophilicum	strain=NBRC 15688	GCA_022179445.1	39956	39956	type	True	76.0593	90	616	95	below_threshold
Methylobacterium haplocladii	strain=DSM 24195	GCA_022179265.1	1176176	1176176	type	True	76.0244	89	616	95	below_threshold
Oricola indica	strain=JL-62	GCA_019966595.1	2872591	2872591	type	True	75.8213	67	616	95	below_threshold
--------------------------------------------------------------------------------
[2023-03-15 21:25:22,054] [INFO] DFAST Taxonomy check result was written to OceanDNA-b27513/tc_result.tsv
[2023-03-15 21:25:22,054] [INFO] ===== Taxonomy check completed =====
[2023-03-15 21:25:22,054] [INFO] ===== Start completeness check using CheckM =====
[2023-03-15 21:25:22,054] [INFO] Setting CHECKM_DATA_PATH to /var/lib/cwl/stg9476ef5f-51aa-4b55-a9fb-71c3a51d67a1/dqc_reference/checkm_data
[2023-03-15 21:25:22,054] [INFO] Selected 'Prokaryote' markers (life, taxid=0) for CheckM
[2023-03-15 21:25:22,059] [INFO] Task started: CheckM
[2023-03-15 21:25:22,059] [INFO] Running command: checkm taxonomy_wf --tab_table -f OceanDNA-b27513/cc_result.tsv -t 1 life "Prokaryote" OceanDNA-b27513/checkm_input OceanDNA-b27513/checkm_result
[2023-03-15 21:25:57,773] [INFO] Task succeeded: CheckM
[2023-03-15 21:25:57,774] [INFO] Completeness check finished.
--------------------------------------------------------------------------------
Completeness: 71.58%
Contamination: 6.77%
Strain heterogeneity: 66.67%
--------------------------------------------------------------------------------
[2023-03-15 21:25:57,776] [INFO] ===== Completeness check finished =====
[2023-03-15 21:25:57,776] [INFO] ===== Start GTDB Search =====
[2023-03-15 21:25:57,777] [INFO] Query marker FASTA already exists. Will reuse it. (OceanDNA-b27513/markers.fasta)
[2023-03-15 21:25:57,778] [INFO] Task started: Blastn
[2023-03-15 21:25:57,778] [INFO] Running command: blastn -query OceanDNA-b27513/markers.fasta -db /var/lib/cwl/stg9476ef5f-51aa-4b55-a9fb-71c3a51d67a1/dqc_reference/reference_markers_gtdb.fasta -out OceanDNA-b27513/blast.markers.gtdb.tsv -outfmt 6 -max_hsps 1 -num_alignments 5
[2023-03-15 21:25:59,181] [INFO] Task succeeded: Blastn
[2023-03-15 21:25:59,181] [INFO] Selected 13 target genomes.
[2023-03-15 21:25:59,181] [INFO] Target genome list was written to OceanDNA-b27513/target_genomes_gtdb.txt
[2023-03-15 21:25:59,369] [INFO] Task started: fastANI
[2023-03-15 21:25:59,369] [INFO] Running command: fastANI --query /var/lib/cwl/stg73893f29-51d4-4881-8ae4-8a8b7aa91048/OceanDNA-b27513.fa --refList OceanDNA-b27513/target_genomes_gtdb.txt --output OceanDNA-b27513/fastani_result_gtdb.tsv --threads 1
[2023-03-15 21:26:08,696] [INFO] Task succeeded: fastANI
[2023-03-15 21:26:08,704] [INFO] Found 13 fastANI hits (0 hits with ANI > circumscription radius)
[2023-03-15 21:26:08,704] [INFO] GTDB search result
--------------------------------------------------------------------------------
accession	gtdb_species	ani	matched_fragments	total_fragments	gtdb_taxonomy	ani_circumscription_radius	mean_intra_species_ani	min_intra_species_ani	mean_intra_species_af	min_intra_species_af	num_clustered_genomes	status
GCA_903839845.1	s__Pseudolabrys sp903839845	81.6195	454	616	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhizobiales;f__Xanthobacteraceae;g__Pseudolabrys	95.0	99.70	99.58	0.91	0.88	17	-
GCA_001464835.1	s__Pseudolabrys sp001464835	81.5309	425	616	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhizobiales;f__Xanthobacteraceae;g__Pseudolabrys	95.0	N/A	N/A	N/A	N/A	1	-
GCA_903892675.1	s__Pseudolabrys sp903892675	80.8608	393	616	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhizobiales;f__Xanthobacteraceae;g__Pseudolabrys	95.0	N/A	N/A	N/A	N/A	1	-
GCA_016124795.1	s__Pseudolabrys sp016124795	80.8413	447	616	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhizobiales;f__Xanthobacteraceae;g__Pseudolabrys	95.0	N/A	N/A	N/A	N/A	1	-
GCA_903878085.1	s__Pseudolabrys sp903878085	80.2475	414	616	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhizobiales;f__Xanthobacteraceae;g__Pseudolabrys	95.0	99.95	99.95	0.96	0.96	2	-
GCA_903885555.1	s__Pseudolabrys sp903885555	79.958	384	616	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhizobiales;f__Xanthobacteraceae;g__Pseudolabrys	95.0	99.81	99.81	0.96	0.96	2	-
GCA_018240595.1	s__Pseudolabrys sp018240595	79.8552	386	616	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhizobiales;f__Xanthobacteraceae;g__Pseudolabrys	95.0	N/A	N/A	N/A	N/A	1	-
GCF_003367195.1	s__Pseudolabrys sp003367195	79.8316	364	616	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhizobiales;f__Xanthobacteraceae;g__Pseudolabrys	95.0	N/A	N/A	N/A	N/A	1	-
GCA_018242205.1	s__Pseudolabrys sp018242205	79.7915	366	616	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhizobiales;f__Xanthobacteraceae;g__Pseudolabrys	95.0	N/A	N/A	N/A	N/A	1	-
GCA_017849615.1	s__Pseudolabrys sp017849615	79.7443	376	616	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhizobiales;f__Xanthobacteraceae;g__Pseudolabrys	95.0	N/A	N/A	N/A	N/A	1	-
GCF_001426945.1	s__Pseudolabrys sp001426945	79.6725	378	616	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhizobiales;f__Xanthobacteraceae;g__Pseudolabrys	95.0	N/A	N/A	N/A	N/A	1	-
GCF_005153485.1	s__Pseudolabrys sp005153485	79.6587	386	616	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhizobiales;f__Xanthobacteraceae;g__Pseudolabrys	95.0	N/A	N/A	N/A	N/A	1	-
GCA_903891395.1	s__Pseudolabrys sp903891395	78.7541	265	616	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhizobiales;f__Xanthobacteraceae;g__Pseudolabrys	95.0	N/A	N/A	N/A	N/A	1	-
--------------------------------------------------------------------------------
[2023-03-15 21:26:08,704] [INFO] GTDB search result was written to OceanDNA-b27513/result_gtdb.tsv
[2023-03-15 21:26:08,704] [INFO] ===== GTDB Search completed =====
[2023-03-15 21:26:08,706] [INFO] DFAST_QC result json was written to OceanDNA-b27513/dqc_result.json
[2023-03-15 21:26:08,707] [INFO] DFAST_QC completed!
[2023-03-15 21:26:08,707] [INFO] Total running time: 0h1m30s
