[2024-01-24 15:02:07,440] [INFO] DFAST_QC pipeline started.
[2024-01-24 15:02:07,443] [INFO] DFAST_QC version: 0.5.7
[2024-01-24 15:02:07,443] [INFO] DQC Reference Directory: /var/lib/cwl/stge908608a-8e31-4e61-b028-222f5e97186e/dqc_reference
[2024-01-24 15:02:09,600] [INFO] ===== Start taxonomy check using ANI =====
[2024-01-24 15:02:09,601] [INFO] Task started: Prodigal
[2024-01-24 15:02:09,602] [INFO] Running command: gunzip -c /var/lib/cwl/stgc69b4951-dc12-4fc5-aea9-c067f65a04b1/GCF_008711325.1_ASM871132v1_genomic.fna.gz | prodigal -d GCF_008711325.1_ASM871132v1_genomic.fna/cds.fna -a GCF_008711325.1_ASM871132v1_genomic.fna/protein.faa -g 11 -q > /dev/null
[2024-01-24 15:02:24,380] [INFO] Task succeeded: Prodigal
[2024-01-24 15:02:24,381] [INFO] Task started: HMMsearch
[2024-01-24 15:02:24,381] [INFO] Running command: hmmsearch --tblout GCF_008711325.1_ASM871132v1_genomic.fna/hmmer_result.tsv -E 1E-50 /var/lib/cwl/stge908608a-8e31-4e61-b028-222f5e97186e/dqc_reference/reference_markers.hmm GCF_008711325.1_ASM871132v1_genomic.fna/protein.faa > /dev/null
[2024-01-24 15:02:24,655] [INFO] Task succeeded: HMMsearch
[2024-01-24 15:02:24,656] [INFO] Found 6/6 markers.
[2024-01-24 15:02:24,703] [INFO] Query marker FASTA was written to GCF_008711325.1_ASM871132v1_genomic.fna/markers.fasta
[2024-01-24 15:02:24,704] [INFO] Task started: Blastn
[2024-01-24 15:02:24,704] [INFO] Running command: blastn -query GCF_008711325.1_ASM871132v1_genomic.fna/markers.fasta -db /var/lib/cwl/stge908608a-8e31-4e61-b028-222f5e97186e/dqc_reference/reference_markers.fasta -out GCF_008711325.1_ASM871132v1_genomic.fna/blast.markers.tsv -outfmt 6 -max_hsps 1 -num_alignments 5
[2024-01-24 15:02:25,523] [INFO] Task succeeded: Blastn
[2024-01-24 15:02:25,527] [INFO] Selected 20 target genomes.
[2024-01-24 15:02:25,527] [INFO] Target genome list was written to GCF_008711325.1_ASM871132v1_genomic.fna/target_genomes.txt
[2024-01-24 15:02:25,537] [INFO] Task started: fastANI
[2024-01-24 15:02:25,537] [INFO] Running command: fastANI --query /var/lib/cwl/stgc69b4951-dc12-4fc5-aea9-c067f65a04b1/GCF_008711325.1_ASM871132v1_genomic.fna.gz --refList GCF_008711325.1_ASM871132v1_genomic.fna/target_genomes.txt --output GCF_008711325.1_ASM871132v1_genomic.fna/fastani_result.tsv --threads 1
[2024-01-24 15:02:45,844] [INFO] Task succeeded: fastANI
[2024-01-24 15:02:45,845] [INFO] Loading species specific ANI threshold from /var/lib/cwl/stge908608a-8e31-4e61-b028-222f5e97186e/dqc_reference/prokaryote_ANI_species_specific_threshold.txt
[2024-01-24 15:02:45,845] [WARNING] Species-specific ANI threshold file not found. Will use the default threshold for all species. [/var/lib/cwl/stge908608a-8e31-4e61-b028-222f5e97186e/dqc_reference/prokaryote_ANI_species_specific_threshold.txt]
[2024-01-24 15:02:45,860] [INFO] Found 17 fastANI hits (1 hits with ANI > threshold)
[2024-01-24 15:02:45,860] [INFO] The taxonomy check result is classified as 'conclusive'.
[2024-01-24 15:02:45,861] [INFO] DFAST Taxonomy check final result
--------------------------------------------------------------------------------
organism_name	strain	accession	taxid	species_taxid	relation_to_type	validated	ani	matched_fragments	total_fragments	ani_threshold	status
Roseibium sediminis	strain=KCTC 52373	GCA_008711325.1	1775174	1775174	type	True	100.0	1613	1615	95	conclusive
Roseibium aestuarii	strain=SYSU M00256-3	GCA_008477545.1	2600299	2600299	type	True	78.7094	390	1615	95	below_threshold
Roseibium suaedae	strain=DSM 22153	GCA_900142725.1	735517	735517	type	True	78.6695	427	1615	95	below_threshold
Roseibium litorale	strain=4C16A	GCA_014842915.1	2803841	2803841	type	True	78.4766	406	1615	95	below_threshold
Roseibium aggregatum	strain=IAM 12614	GCA_000168975.1	187304	187304	suspected-type	True	78.4124	474	1615	95	below_threshold
Roseibium polysiphoniae	strain=KACC 19711	GCA_014842925.1	2571221	2571221	type	True	78.0682	368	1615	95	below_threshold
Pannonibacter carbonis	strain=Q4.6	GCA_003012935.1	2067569	2067569	type	True	78.0228	327	1615	95	below_threshold
Pannonibacter indicus	strain=DSM 23407	GCA_001517385.1	466044	466044	type	True	77.9361	348	1615	95	below_threshold
Pannonibacter indicus	strain=DSM 23407	GCA_001418225.1	466044	466044	type	True	77.9271	349	1615	95	below_threshold
Roseibium aquae	strain=CGMCC 1.12426	GCA_014637575.1	1323746	1323746	type	True	77.8573	349	1615	95	below_threshold
Pannonibacter phragmitetus	strain=NCTC13350	GCA_900454465.1	121719	121719	suspected-type	True	77.8217	355	1615	95	below_threshold
Devosia insulae	strain=DS-56	GCA_000970465.2	408174	408174	type	True	76.6338	77	1615	95	below_threshold
Neorhizobium lilium	strain=24NR	GCA_004053875.1	2503024	2503024	type	True	76.4105	111	1615	95	below_threshold
Pedomonas mirosovicensis	strain=A1X5R2	GCA_022569295.1	2908641	2908641	type	True	76.2978	53	1615	95	below_threshold
Bradyrhizobium frederickii	strain=CNPSo 3426	GCA_004570865.1	2560054	2560054	type	True	76.2187	113	1615	95	below_threshold
Bradyrhizobium lablabi	strain=CCBAU 23086	GCA_001440475.1	722472	722472	suspected-type	True	76.002	115	1615	95	below_threshold
Bradyrhizobium embrapense	strain=SEMIA 6208	GCA_001189235.2	630921	630921	type	True	75.952	121	1615	95	below_threshold
--------------------------------------------------------------------------------
[2024-01-24 15:02:45,862] [INFO] DFAST Taxonomy check result was written to GCF_008711325.1_ASM871132v1_genomic.fna/tc_result.tsv
[2024-01-24 15:02:45,863] [INFO] ===== Taxonomy check completed =====
[2024-01-24 15:02:45,863] [INFO] ===== Start completeness check using CheckM =====
[2024-01-24 15:02:45,863] [INFO] Setting CHECKM_DATA_PATH to /var/lib/cwl/stge908608a-8e31-4e61-b028-222f5e97186e/dqc_reference/checkm_data
[2024-01-24 15:02:45,865] [INFO] Selected 'Prokaryote' markers (life, taxid=0) for CheckM
[2024-01-24 15:02:45,915] [INFO] Task started: CheckM
[2024-01-24 15:02:45,915] [INFO] Running command: checkm taxonomy_wf --tab_table -f GCF_008711325.1_ASM871132v1_genomic.fna/cc_result.tsv -t 1 life "Prokaryote" GCF_008711325.1_ASM871132v1_genomic.fna/checkm_input GCF_008711325.1_ASM871132v1_genomic.fna/checkm_result
[2024-01-24 15:03:34,784] [INFO] Task succeeded: CheckM
[2024-01-24 15:03:34,786] [INFO] Completeness check finished.
--------------------------------------------------------------------------------
Completeness: 100.00%
Contamination: 4.17%
Strain heterogeneity: 0.00%
--------------------------------------------------------------------------------
[2024-01-24 15:03:34,805] [INFO] ===== Completeness check finished =====
[2024-01-24 15:03:34,806] [INFO] ===== Start GTDB Search =====
[2024-01-24 15:03:34,806] [INFO] Query marker FASTA already exists. Will reuse it. (GCF_008711325.1_ASM871132v1_genomic.fna/markers.fasta)
[2024-01-24 15:03:34,806] [INFO] Task started: Blastn
[2024-01-24 15:03:34,806] [INFO] Running command: blastn -query GCF_008711325.1_ASM871132v1_genomic.fna/markers.fasta -db /var/lib/cwl/stge908608a-8e31-4e61-b028-222f5e97186e/dqc_reference/reference_markers_gtdb.fasta -out GCF_008711325.1_ASM871132v1_genomic.fna/blast.markers.gtdb.tsv -outfmt 6 -max_hsps 1 -num_alignments 5
[2024-01-24 15:03:36,165] [INFO] Task succeeded: Blastn
[2024-01-24 15:03:36,169] [INFO] Selected 18 target genomes.
[2024-01-24 15:03:36,169] [INFO] Target genome list was written to GCF_008711325.1_ASM871132v1_genomic.fna/target_genomes_gtdb.txt
[2024-01-24 15:03:36,185] [INFO] Task started: fastANI
[2024-01-24 15:03:36,185] [INFO] Running command: fastANI --query /var/lib/cwl/stgc69b4951-dc12-4fc5-aea9-c067f65a04b1/GCF_008711325.1_ASM871132v1_genomic.fna.gz --refList GCF_008711325.1_ASM871132v1_genomic.fna/target_genomes_gtdb.txt --output GCF_008711325.1_ASM871132v1_genomic.fna/fastani_result_gtdb.tsv --threads 1
[2024-01-24 15:03:54,642] [INFO] Task succeeded: fastANI
[2024-01-24 15:03:54,657] [INFO] Found 18 fastANI hits (1 hits with ANI > circumscription radius)
[2024-01-24 15:03:54,657] [INFO] GTDB search result
--------------------------------------------------------------------------------
accession	gtdb_species	ani	matched_fragments	total_fragments	gtdb_taxonomy	ani_circumscription_radius	mean_intra_species_ani	min_intra_species_ani	mean_intra_species_af	min_intra_species_af	num_clustered_genomes	status
GCF_008711325.1	s__Roseibium sediminis	100.0	1613	1615	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhizobiales;f__Stappiaceae;g__Roseibium	95.0	N/A	N/A	N/A	N/A	1	conclusive
GCF_000148725.1	s__Roseibium sp000148725	80.5056	701	1615	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhizobiales;f__Stappiaceae;g__Roseibium	95.0	N/A	N/A	N/A	N/A	1	-
GCF_008477545.1	s__Roseibium aestuarii	78.6993	391	1615	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhizobiales;f__Stappiaceae;g__Roseibium	95.0	N/A	N/A	N/A	N/A	1	-
GCF_900142725.1	s__Roseibium suaedae	78.6795	426	1615	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhizobiales;f__Stappiaceae;g__Roseibium	95.0	N/A	N/A	N/A	N/A	1	-
GCF_001999245.1	s__Roseibium aggregatum_A	78.6776	433	1615	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhizobiales;f__Stappiaceae;g__Roseibium	95.0	97.61	96.85	0.93	0.89	16	-
GCF_002237595.1	s__Roseibium sp002237595	78.4575	478	1615	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhizobiales;f__Stappiaceae;g__Roseibium	95.0	N/A	N/A	N/A	N/A	1	-
GCF_014692675.1	s__Roseibium aggregatum_C	78.3157	443	1615	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhizobiales;f__Stappiaceae;g__Roseibium	95.0	N/A	N/A	N/A	N/A	1	-
GCF_003075075.1	s__Roseibium sp003075075	78.2577	396	1615	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhizobiales;f__Stappiaceae;g__Roseibium	95.0	N/A	N/A	N/A	N/A	1	-
GCF_014842925.1	s__Roseibium polysiphoniae	78.0682	368	1615	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhizobiales;f__Stappiaceae;g__Roseibium	95.0	96.20	96.20	0.93	0.93	2	-
GCF_003012935.1	s__Pannonibacter carbonis	78.0228	327	1615	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhizobiales;f__Stappiaceae;g__Pannonibacter	95.0	N/A	N/A	N/A	N/A	1	-
GCF_009650605.1	s__Roseibium sp009650605	78.0183	322	1615	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhizobiales;f__Stappiaceae;g__Roseibium	95.0	N/A	N/A	N/A	N/A	1	-
GCF_001418225.1	s__Pannonibacter indicus	77.9262	349	1615	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhizobiales;f__Stappiaceae;g__Pannonibacter	95.0	96.01	95.55	0.94	0.89	12	-
GCF_000382365.1	s__Pannonibacter phragmitetus	77.7657	358	1615	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhizobiales;f__Stappiaceae;g__Pannonibacter	95.0	99.99	99.99	1.00	1.00	2	-
GCA_017643215.1	s__Roseibium sp017643215	77.6244	377	1615	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhizobiales;f__Stappiaceae;g__Roseibium	95.0	N/A	N/A	N/A	N/A	1	-
GCF_000282615.1	s__Bradyrhizobium sp000282615	76.222	114	1615	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhizobiales;f__Xanthobacteraceae;g__Bradyrhizobium	95.0	N/A	N/A	N/A	N/A	1	-
GCF_004570865.1	s__Bradyrhizobium frederickii	76.2187	113	1615	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhizobiales;f__Xanthobacteraceae;g__Bradyrhizobium	95.0	97.35	97.32	0.89	0.89	3	-
GCF_001440475.1	s__Bradyrhizobium lablabi	76.002	115	1615	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhizobiales;f__Xanthobacteraceae;g__Bradyrhizobium	95.0	N/A	N/A	N/A	N/A	1	-
GCA_003152455.1	s__Pseudolabrys sp003152455	75.8265	56	1615	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhizobiales;f__Xanthobacteraceae;g__Pseudolabrys	95.0	N/A	N/A	N/A	N/A	1	-
--------------------------------------------------------------------------------
[2024-01-24 15:03:54,659] [INFO] GTDB search result was written to GCF_008711325.1_ASM871132v1_genomic.fna/result_gtdb.tsv
[2024-01-24 15:03:54,660] [INFO] ===== GTDB Search completed =====
[2024-01-24 15:03:54,664] [INFO] DFAST_QC result json was written to GCF_008711325.1_ASM871132v1_genomic.fna/dqc_result.json
[2024-01-24 15:03:54,664] [INFO] DFAST_QC completed!
[2024-01-24 15:03:54,664] [INFO] Total running time: 0h1m47s
