[2023-03-19 03:28:07,264] [INFO] DFAST_QC pipeline started.
[2023-03-19 03:28:07,265] [INFO] DFAST_QC version: 0.5.7
[2023-03-19 03:28:07,265] [INFO] DQC Reference Directory: /var/lib/cwl/stg5938aaef-64ad-4175-a806-eac41b7365f6/dqc_reference
[2023-03-19 03:28:08,374] [INFO] ===== Start taxonomy check using ANI =====
[2023-03-19 03:28:08,374] [INFO] Task started: Prodigal
[2023-03-19 03:28:08,374] [INFO] Running command: cat /var/lib/cwl/stgfb0b79b1-43bf-40c6-aa21-449f4c7f7d83/OceanDNA-b27498.fa | prodigal -d OceanDNA-b27498/cds.fna -a OceanDNA-b27498/protein.faa -g 11 -q > /dev/null
[2023-03-19 03:28:25,909] [INFO] Task succeeded: Prodigal
[2023-03-19 03:28:25,909] [INFO] Task started: HMMsearch
[2023-03-19 03:28:25,909] [INFO] Running command: hmmsearch --tblout OceanDNA-b27498/hmmer_result.tsv -E 1E-50 /var/lib/cwl/stg5938aaef-64ad-4175-a806-eac41b7365f6/dqc_reference/reference_markers.hmm OceanDNA-b27498/protein.faa > /dev/null
[2023-03-19 03:28:26,093] [INFO] Task succeeded: HMMsearch
[2023-03-19 03:28:26,094] [INFO] Found 6/6 markers.
[2023-03-19 03:28:26,113] [INFO] Query marker FASTA was written to OceanDNA-b27498/markers.fasta
[2023-03-19 03:28:26,113] [INFO] Task started: Blastn
[2023-03-19 03:28:26,113] [INFO] Running command: blastn -query OceanDNA-b27498/markers.fasta -db /var/lib/cwl/stg5938aaef-64ad-4175-a806-eac41b7365f6/dqc_reference/reference_markers.fasta -out OceanDNA-b27498/blast.markers.tsv -outfmt 6 -max_hsps 1 -num_alignments 5
[2023-03-19 03:28:26,994] [INFO] Task succeeded: Blastn
[2023-03-19 03:28:26,995] [INFO] Selected 34 target genomes.
[2023-03-19 03:28:26,996] [INFO] Target genome list was written to OceanDNA-b27498/target_genomes.txt
[2023-03-19 03:28:27,037] [INFO] Task started: fastANI
[2023-03-19 03:28:27,037] [INFO] Running command: fastANI --query /var/lib/cwl/stgfb0b79b1-43bf-40c6-aa21-449f4c7f7d83/OceanDNA-b27498.fa --refList OceanDNA-b27498/target_genomes.txt --output OceanDNA-b27498/fastani_result.tsv --threads 1
[2023-03-19 03:28:57,934] [INFO] Task succeeded: fastANI
[2023-03-19 03:28:57,934] [INFO] Loading species specific ANI threshold from /var/lib/cwl/stg5938aaef-64ad-4175-a806-eac41b7365f6/dqc_reference/prokaryote_ANI_species_specific_threshold.txt
[2023-03-19 03:28:57,934] [WARNING] Species-specific ANI threshold file not found. Will use the default threshold for all species. [/var/lib/cwl/stg5938aaef-64ad-4175-a806-eac41b7365f6/dqc_reference/prokaryote_ANI_species_specific_threshold.txt]
[2023-03-19 03:28:57,951] [INFO] Found 34 fastANI hits (0 hits with ANI > threshold)
[2023-03-19 03:28:57,951] [INFO] The taxonomy check result is classified as 'below_threshold'.
[2023-03-19 03:28:57,952] [INFO] DFAST Taxonomy check final result
--------------------------------------------------------------------------------
organism_name	strain	accession	taxid	species_taxid	relation_to_type	validated	ani	matched_fragments	total_fragments	ani_threshold	status
Bradyrhizobium diazoefficiens	strain=USDA110	GCA_000011365.1	1355477	1355477	type	True	77.4214	264	959	95	below_threshold
Bradyrhizobium niftali	strain=CNPSo 3448	GCA_004571025.1	2560055	2560055	type	True	77.4064	269	959	95	below_threshold
Bradyrhizobium diazoefficiens	strain=USDA 110	GCA_001642675.1	1355477	1355477	type	True	77.3844	265	959	95	below_threshold
Bradyrhizobium vignae	strain=LMG 28791	GCA_004114425.1	1549949	1549949	type	True	77.3184	237	959	95	below_threshold
Bradyrhizobium sediminis	strain=S2-20-1	GCA_018736085.1	2840469	2840469	type	True	77.3123	214	959	95	below_threshold
Bradyrhizobium cosmicum	strain=58S1	GCA_007290395.1	1404864	1404864	type	True	77.252	255	959	95	below_threshold
Rhodopseudomonas rhenobacensis	strain=DSM 12706	GCA_014203125.1	87461	87461	type	True	77.2433	211	959	95	below_threshold
Bradyrhizobium aeschynomenes	strain=83002	GCA_013178945.1	2734909	2734909	type	True	77.2396	231	959	95	below_threshold
Bradyrhizobium frederickii	strain=CNPSo 3426	GCA_004570865.1	2560054	2560054	type	True	77.2136	259	959	95	below_threshold
Blastochloris sulfoviridis	strain=DSM 729	GCA_008630065.1	50712	50712	type	True	77.1936	186	959	95	below_threshold
Bradyrhizobium yuanmingense	strain=CCBAU 10071	GCA_900094575.1	108015	108015	type	True	77.1844	258	959	95	below_threshold
Afipia massiliensis	strain=DSM 17498	GCA_014203115.1	211460	211460	type	True	77.1788	219	959	95	below_threshold
Bradyrhizobium guangxiense	strain=CCBAU 53363	GCA_004114915.1	1325115	1325115	type	True	77.1753	254	959	95	below_threshold
Bradyrhizobium guangdongense	strain=CGMCC 1.15034	GCA_014640515.1	1325090	1325090	type	True	77.1621	243	959	95	below_threshold
Bradyrhizobium rifense	strain=CTAW71	GCA_008123425.1	515499	515499	type	True	77.1408	256	959	95	below_threshold
Bradyrhizobium canariense	strain=BTA-1	GCA_019402665.1	255045	255045	suspected-type	True	77.1162	246	959	95	below_threshold
Bradyrhizobium valentinum	strain=LmjM3	GCA_001440405.1	1518501	1518501	type	True	77.0627	233	959	95	below_threshold
Hansschlegelia beijingensis	strain=DSM 25481	GCA_014196425.1	1133344	1133344	type	True	77.0475	185	959	95	below_threshold
Chelatococcus caeni	strain=DSM 103737	GCA_014196925.1	1348468	1348468	type	True	77.0354	211	959	95	below_threshold
Oharaeibacter diazotrophicus	strain=SM30	GCA_011317485.1	1920512	1920512	type	True	76.896	173	959	95	below_threshold
Rhodomicrobium lacus	strain=JA980	GCA_003992725.1	2498452	2498452	type	True	76.8877	88	959	95	below_threshold
Chelatococcus sambhunathii	strain=DSM 18167	GCA_001517345.1	363953	363953	type	True	76.8721	175	959	95	below_threshold
Oharaeibacter diazotrophicus	strain=DSM 102969	GCA_004362745.1	1920512	1920512	type	True	76.7996	186	959	95	below_threshold
Chelatococcus composti	strain=DSM 101465	GCA_018398355.1	1743235	1743235	type	True	76.7642	151	959	95	below_threshold
Chelatococcus composti	strain=CGMCC 1.15283	GCA_014641535.1	1743235	1743235	type	True	76.7638	152	959	95	below_threshold
Phreatobacter cathodiphilus	strain=S-12	GCA_003008515.1	1868589	1868589	type	True	76.759	218	959	95	below_threshold
Chelatococcus composti	strain=DSM 101465	GCA_014201415.1	1743235	1743235	type	True	76.7474	154	959	95	below_threshold
Mesorhizobium composti	strain=CC-YTH430	GCA_004801285.1	2675109	2675109	type	True	76.7343	163	959	95	below_threshold
Phreatobacter oligotrophus	strain=DSM 25521	GCA_003046185.1	1122261	1122261	type	True	76.691	214	959	95	below_threshold
Xanthobacter oligotrophicus	strain=29k	GCA_008364685.1	2607286	2607286	type	True	76.6734	158	959	95	below_threshold
Methylobacterium crusticola	strain=KCTC 52305	GCA_022179145.1	1697972	1697972	type	True	76.6579	188	959	95	below_threshold
Xanthobacter dioxanivorans	strain=YN2	GCA_016807805.1	2528964	2528964	type	True	76.6439	170	959	95	below_threshold
Methylobacterium gregans	strain=NBRC 103626	GCA_022179245.1	374424	374424	type	True	76.6257	169	959	95	below_threshold
Lichenibacterium ramalinae	strain=RmlP001	GCA_004137085.1	2316527	2316527	type	True	76.4284	125	959	95	below_threshold
--------------------------------------------------------------------------------
[2023-03-19 03:28:57,952] [INFO] DFAST Taxonomy check result was written to OceanDNA-b27498/tc_result.tsv
[2023-03-19 03:28:57,952] [INFO] ===== Taxonomy check completed =====
[2023-03-19 03:28:57,952] [INFO] ===== Start completeness check using CheckM =====
[2023-03-19 03:28:57,952] [INFO] Setting CHECKM_DATA_PATH to /var/lib/cwl/stg5938aaef-64ad-4175-a806-eac41b7365f6/dqc_reference/checkm_data
[2023-03-19 03:28:57,953] [INFO] Selected 'Prokaryote' markers (life, taxid=0) for CheckM
[2023-03-19 03:28:58,358] [INFO] Task started: CheckM
[2023-03-19 03:28:58,358] [INFO] Running command: checkm taxonomy_wf --tab_table -f OceanDNA-b27498/cc_result.tsv -t 1 life "Prokaryote" OceanDNA-b27498/checkm_input OceanDNA-b27498/checkm_result
[2023-03-19 03:29:43,085] [INFO] Task succeeded: CheckM
[2023-03-19 03:29:43,085] [INFO] Completeness check finished.
--------------------------------------------------------------------------------
Completeness: 99.54%
Contamination: 0.46%
Strain heterogeneity: 0.00%
--------------------------------------------------------------------------------
[2023-03-19 03:29:43,087] [INFO] ===== Completeness check finished =====
[2023-03-19 03:29:43,087] [INFO] ===== Start GTDB Search =====
[2023-03-19 03:29:43,088] [INFO] Query marker FASTA already exists. Will reuse it. (OceanDNA-b27498/markers.fasta)
[2023-03-19 03:29:43,088] [INFO] Task started: Blastn
[2023-03-19 03:29:43,088] [INFO] Running command: blastn -query OceanDNA-b27498/markers.fasta -db /var/lib/cwl/stg5938aaef-64ad-4175-a806-eac41b7365f6/dqc_reference/reference_markers_gtdb.fasta -out OceanDNA-b27498/blast.markers.gtdb.tsv -outfmt 6 -max_hsps 1 -num_alignments 5
[2023-03-19 03:29:44,799] [INFO] Task succeeded: Blastn
[2023-03-19 03:29:44,800] [INFO] Selected 19 target genomes.
[2023-03-19 03:29:44,800] [INFO] Target genome list was written to OceanDNA-b27498/target_genomes_gtdb.txt
[2023-03-19 03:29:44,815] [INFO] Task started: fastANI
[2023-03-19 03:29:44,815] [INFO] Running command: fastANI --query /var/lib/cwl/stgfb0b79b1-43bf-40c6-aa21-449f4c7f7d83/OceanDNA-b27498.fa --refList OceanDNA-b27498/target_genomes_gtdb.txt --output OceanDNA-b27498/fastani_result_gtdb.tsv --threads 1
[2023-03-19 03:30:01,756] [INFO] Task succeeded: fastANI
[2023-03-19 03:30:01,766] [INFO] Found 19 fastANI hits (1 hits with ANI > circumscription radius)
[2023-03-19 03:30:01,767] [INFO] GTDB search result
--------------------------------------------------------------------------------
accession	gtdb_species	ani	matched_fragments	total_fragments	gtdb_taxonomy	ani_circumscription_radius	mean_intra_species_ani	min_intra_species_ani	mean_intra_species_af	min_intra_species_af	num_clustered_genomes	status
GCA_018335895.1	s__Ga0077548 sp018335895	99.1023	800	959	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhizobiales;f__Xanthobacteraceae;g__Ga0077548	95.0	N/A	N/A	N/A	N/A	1	conclusive
GCA_016713235.1	s__Ga0077548 sp016713235	82.2321	677	959	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhizobiales;f__Xanthobacteraceae;g__Ga0077548	95.0	N/A	N/A	N/A	N/A	1	-
GCA_001464215.1	s__Ga0077548 sp001464215	81.9799	663	959	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhizobiales;f__Xanthobacteraceae;g__Ga0077548	95.0	N/A	N/A	N/A	N/A	1	-
GCA_003135415.1	s__Palsa-892 sp003135415	77.56	278	959	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhizobiales;f__Xanthobacteraceae;g__Palsa-892	95.0	96.11	96.04	0.87	0.83	3	-
GCF_001579845.1	s__Z2-YC6860 sp001579845	77.5021	267	959	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhizobiales;f__Xanthobacteraceae;g__Z2-YC6860	95.0	N/A	N/A	N/A	N/A	1	-
GCF_900103365.1	s__Bradyrhizobium sp900103365	77.3251	229	959	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhizobiales;f__Xanthobacteraceae;g__Bradyrhizobium	95.0	N/A	N/A	N/A	N/A	1	-
GCF_001590795.1	s__Bradyrhizobium sp001590795	77.2101	256	959	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhizobiales;f__Xanthobacteraceae;g__Bradyrhizobium	95.0	N/A	N/A	N/A	N/A	1	-
GCF_900099825.1	s__Bradyrhizobium ottawaense_A	77.1413	239	959	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhizobiales;f__Xanthobacteraceae;g__Bradyrhizobium	95.0	99.73	99.45	0.98	0.96	3	-
GCF_015291645.1	s__Bradyrhizobium sp015291645	77.1309	249	959	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhizobiales;f__Xanthobacteraceae;g__Bradyrhizobium	95.0	N/A	N/A	N/A	N/A	1	-
GCF_000426105.1	s__Bradyrhizobium sp000426105	77.1233	216	959	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhizobiales;f__Xanthobacteraceae;g__Bradyrhizobium	95.0	N/A	N/A	N/A	N/A	1	-
GCF_000617845.2	s__Bradyrhizobium sp000617845	77.1205	264	959	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhizobiales;f__Xanthobacteraceae;g__Bradyrhizobium	95.0	N/A	N/A	N/A	N/A	1	-
GCF_016616425.1	s__Bradyrhizobium diazoefficiens_E	77.0842	235	959	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhizobiales;f__Xanthobacteraceae;g__Bradyrhizobium	95.0	N/A	N/A	N/A	N/A	1	-
GCF_014196925.1	s__Chelatococcus_A caeni	77.0469	210	959	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhizobiales;f__Beijerinckiaceae;g__Chelatococcus_A	95.0	98.75	98.72	0.96	0.95	3	-
GCF_000935205.1	s__Rhodopseudomonas palustris_E	76.8804	183	959	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhizobiales;f__Xanthobacteraceae;g__Rhodopseudomonas	95.0	N/A	N/A	N/A	N/A	1	-
GCF_001418005.1	s__Chelatococcus_A sambhunathii	76.8721	175	959	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhizobiales;f__Beijerinckiaceae;g__Chelatococcus_A	95.0	99.33	99.03	0.97	0.94	5	-
GCF_014201415.1	s__Chelatococcus_A composti	76.7474	154	959	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhizobiales;f__Beijerinckiaceae;g__Chelatococcus_A	95.0	99.84	99.53	0.99	0.98	4	-
GCF_003667445.1	s__Xanthobacter tagetidis	76.7181	202	959	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhizobiales;f__Xanthobacteraceae;g__Xanthobacter	95.0	100.00	100.00	1.00	1.00	2	-
GCF_001429395.1	s__Bosea sp001429395	76.6241	210	959	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhizobiales;f__Beijerinckiaceae;g__Bosea	95.0	N/A	N/A	N/A	N/A	1	-
GCA_011391185.1	s__JAABRB01 sp011391185	75.6344	59	959	d__Bacteria;p__Proteobacteria;c__Gammaproteobacteria;o__Burkholderiales;f__SG8-39;g__JAABRB01	95.0	N/A	N/A	N/A	N/A	1	-
--------------------------------------------------------------------------------
[2023-03-19 03:30:01,767] [INFO] GTDB search result was written to OceanDNA-b27498/result_gtdb.tsv
[2023-03-19 03:30:01,767] [INFO] ===== GTDB Search completed =====
[2023-03-19 03:30:01,770] [INFO] DFAST_QC result json was written to OceanDNA-b27498/dqc_result.json
[2023-03-19 03:30:01,770] [INFO] DFAST_QC completed!
[2023-03-19 03:30:01,770] [INFO] Total running time: 0h1m55s
