[2023-06-27 21:27:04,853] [INFO] DFAST_QC pipeline started.
[2023-06-27 21:27:04,855] [INFO] DFAST_QC version: 0.5.7
[2023-06-27 21:27:04,855] [INFO] DQC Reference Directory: /var/lib/cwl/stg35c5aa92-5d23-41d3-862d-7e60c699d9f2/dqc_reference
[2023-06-27 21:27:06,009] [INFO] ===== Start taxonomy check using ANI =====
[2023-06-27 21:27:06,010] [INFO] Task started: Prodigal
[2023-06-27 21:27:06,010] [INFO] Running command: gunzip -c /var/lib/cwl/stga9624f0c-bae1-4170-ae2b-1fd5eb612676/GCA_002869085.1_ASM286908v1_genomic.fna.gz | prodigal -d GCA_002869085.1_ASM286908v1_genomic.fna/cds.fna -a GCA_002869085.1_ASM286908v1_genomic.fna/protein.faa -g 11 -q > /dev/null
[2023-06-27 21:27:14,762] [INFO] Task succeeded: Prodigal
[2023-06-27 21:27:14,763] [INFO] Task started: HMMsearch
[2023-06-27 21:27:14,763] [INFO] Running command: hmmsearch --tblout GCA_002869085.1_ASM286908v1_genomic.fna/hmmer_result.tsv -E 1E-50 /var/lib/cwl/stg35c5aa92-5d23-41d3-862d-7e60c699d9f2/dqc_reference/reference_markers.hmm GCA_002869085.1_ASM286908v1_genomic.fna/protein.faa > /dev/null
[2023-06-27 21:27:14,969] [INFO] Task succeeded: HMMsearch
[2023-06-27 21:27:14,970] [INFO] Found 6/6 markers.
[2023-06-27 21:27:15,001] [INFO] Query marker FASTA was written to GCA_002869085.1_ASM286908v1_genomic.fna/markers.fasta
[2023-06-27 21:27:15,002] [INFO] Task started: Blastn
[2023-06-27 21:27:15,002] [INFO] Running command: blastn -query GCA_002869085.1_ASM286908v1_genomic.fna/markers.fasta -db /var/lib/cwl/stg35c5aa92-5d23-41d3-862d-7e60c699d9f2/dqc_reference/reference_markers.fasta -out GCA_002869085.1_ASM286908v1_genomic.fna/blast.markers.tsv -outfmt 6 -max_hsps 1 -num_alignments 5
[2023-06-27 21:27:15,753] [INFO] Task succeeded: Blastn
[2023-06-27 21:27:15,757] [INFO] Selected 34 target genomes.
[2023-06-27 21:27:15,757] [INFO] Target genome list was written to GCA_002869085.1_ASM286908v1_genomic.fna/target_genomes.txt
[2023-06-27 21:27:15,760] [INFO] Task started: fastANI
[2023-06-27 21:27:15,760] [INFO] Running command: fastANI --query /var/lib/cwl/stga9624f0c-bae1-4170-ae2b-1fd5eb612676/GCA_002869085.1_ASM286908v1_genomic.fna.gz --refList GCA_002869085.1_ASM286908v1_genomic.fna/target_genomes.txt --output GCA_002869085.1_ASM286908v1_genomic.fna/fastani_result.tsv --threads 1
[2023-06-27 21:27:41,076] [INFO] Task succeeded: fastANI
[2023-06-27 21:27:41,077] [INFO] Loading species specific ANI threshold from /var/lib/cwl/stg35c5aa92-5d23-41d3-862d-7e60c699d9f2/dqc_reference/prokaryote_ANI_species_specific_threshold.txt
[2023-06-27 21:27:41,077] [WARNING] Species-specific ANI threshold file not found. Will use the default threshold for all species. [/var/lib/cwl/stg35c5aa92-5d23-41d3-862d-7e60c699d9f2/dqc_reference/prokaryote_ANI_species_specific_threshold.txt]
[2023-06-27 21:27:41,100] [INFO] Found 34 fastANI hits (0 hits with ANI > threshold)
[2023-06-27 21:27:41,100] [INFO] The taxonomy check result is classified as 'below_threshold'.
[2023-06-27 21:27:41,101] [INFO] DFAST Taxonomy check final result
--------------------------------------------------------------------------------
organism_name	strain	accession	taxid	species_taxid	relation_to_type	validated	ani	matched_fragments	total_fragments	ani_threshold	status
Rhodobium orientis	strain=DSM 11290	GCA_003258835.1	34017	34017	type	True	76.9064	206	955	95	below_threshold
Tepidicaulis marinus	strain=MA2	GCA_000739695.1	1333998	1333998	type	True	76.84	122	955	95	below_threshold
Tepidamorphus gemmatus	strain=DSM 19345	GCA_004346195.1	747076	747076	type	True	76.7119	134	955	95	below_threshold
Bauldia litoralis	strain=ATCC 35022	GCA_900104485.1	665467	665467	type	True	76.661	143	955	95	below_threshold
Blastochloris tepida	strain=GI	GCA_003966715.1	2233851	2233851	type	True	76.655	160	955	95	below_threshold
Ancylobacter aquaticus	strain=DSM 101	GCA_004339465.1	100	100	type	True	76.6409	135	955	95	below_threshold
Rhodoligotrophos defluvii	strain=lm1	GCA_005281615.1	2561934	2561934	type	True	76.565	113	955	95	below_threshold
Cucumibacter marinus	strain=DSM 18995	GCA_000429865.1	1121252	1121252	type	True	76.4807	113	955	95	below_threshold
Methyloceanibacter marginalis	strain=R-67177	GCA_001723295.1	1774971	1774971	type	True	76.4658	74	955	95	below_threshold
Starkeya novella	strain=DSM 506	GCA_000092925.1	921	921	type	True	76.4344	140	955	95	below_threshold
Pseudorhodoplanes sinuspersici	strain=RIPI110	GCA_002119765.1	1235591	1235591	type	True	76.434	89	955	95	below_threshold
Pseudorhodoplanes sinuspersici	strain=CECT 8374	GCA_003610435.1	1235591	1235591	type	True	76.413	90	955	95	below_threshold
Rhodomicrobium lacus	strain=JA980	GCA_003992725.1	2498452	2498452	type	True	76.3436	68	955	95	below_threshold
Methyloceanibacter stevinii	strain=R-67176	GCA_001723355.1	1774970	1774970	type	True	76.3325	79	955	95	below_threshold
Camelimonas fluminis	strain=KCTC 42282	GCA_014656355.1	1576911	1576911	type	True	76.2887	106	955	95	below_threshold
Bosea vaviloviae	strain=Vaf18	GCA_001741865.1	1526658	1526658	type	True	76.2452	155	955	95	below_threshold
Oceanicella actignis	strain=DSM 22673	GCA_008124525.1	1189325	1189325	type	True	76.2236	135	955	95	below_threshold
Oricola indica	strain=JL-62	GCA_019966595.1	2872591	2872591	type	True	76.1489	104	955	95	below_threshold
Shinella pollutisoli	strain=KCTC 52677	GCA_024609765.1	2250594	2250594	type	True	76.1331	128	955	95	below_threshold
Blastochloris viridis	strain=ATCC 19567	GCA_001402875.1	1079	1079	type	True	76.1248	140	955	95	below_threshold
Chelatococcus asaccharovorans	strain=DSM 6462	GCA_018398275.1	28210	28210	type	True	76.0867	113	955	95	below_threshold
Chelatococcus asaccharovorans	strain=DSM 6462	GCA_003201475.1	28210	28210	type	True	76.0866	113	955	95	below_threshold
Azorhizobium caulinodans	strain=ORS 571	GCA_000010525.1	7	7	type	True	76.0646	123	955	95	below_threshold
Rhodopseudomonas pentothenatexigens	strain=JA575	GCA_003385925.1	999699	999699	type	True	76.0519	139	955	95	below_threshold
Rhodopseudomonas pentothenatexigens	strain=JA575	GCA_900218015.1	999699	999699	type	True	76.0519	139	955	95	below_threshold
Neomegalonema perideroedes	strain=DSM 15528	GCA_000374145.1	217219	217219	type	True	76.047	76	955	95	below_threshold
Shinella kummerowiae	strain=CCBAU 25048	GCA_009827055.1	417745	417745	type	True	75.9597	96	955	95	below_threshold
Methylobacterium terrae	strain=17Sr1-28	GCA_003173755.1	2202827	2202827	type	True	75.8531	135	955	95	below_threshold
Celeribacter neptunius	strain=DSM 26471	GCA_900113955.1	588602	588602	type	True	75.8137	75	955	95	below_threshold
Roseococcus pinisoli	strain=XZZS9	GCA_018413645.1	2835040	2835040	type	True	75.8121	74	955	95	below_threshold
Rhodovastum atsumiense	strain=G2-11	GCA_937425535.1	504468	504468	type	True	75.8054	146	955	95	below_threshold
Rhodovarius lipocyclicus	strain=CCUG 44693	GCA_009900765.1	268410	268410	type	True	75.7581	108	955	95	below_threshold
Shinella yambaruensis	strain=DSM 18801	GCA_022899355.1	415996	415996	type	True	75.7519	143	955	95	below_threshold
Cohaesibacter intestini	strain=YE-B6	GCA_003324485.1	2211145	2211145	type	True	75.6325	76	955	95	below_threshold
--------------------------------------------------------------------------------
[2023-06-27 21:27:41,103] [INFO] DFAST Taxonomy check result was written to GCA_002869085.1_ASM286908v1_genomic.fna/tc_result.tsv
[2023-06-27 21:27:41,103] [INFO] ===== Taxonomy check completed =====
[2023-06-27 21:27:41,103] [INFO] ===== Start completeness check using CheckM =====
[2023-06-27 21:27:41,103] [INFO] Setting CHECKM_DATA_PATH to /var/lib/cwl/stg35c5aa92-5d23-41d3-862d-7e60c699d9f2/dqc_reference/checkm_data
[2023-06-27 21:27:41,104] [INFO] Selected 'Prokaryote' markers (life, taxid=0) for CheckM
[2023-06-27 21:27:41,140] [INFO] Task started: CheckM
[2023-06-27 21:27:41,140] [INFO] Running command: checkm taxonomy_wf --tab_table -f GCA_002869085.1_ASM286908v1_genomic.fna/cc_result.tsv -t 1 life "Prokaryote" GCA_002869085.1_ASM286908v1_genomic.fna/checkm_input GCA_002869085.1_ASM286908v1_genomic.fna/checkm_result
[2023-06-27 21:28:10,561] [INFO] Task succeeded: CheckM
[2023-06-27 21:28:10,562] [INFO] Completeness check finished.
--------------------------------------------------------------------------------
Completeness: 100.00%
Contamination: 0.46%
Strain heterogeneity: 100.00%
--------------------------------------------------------------------------------
[2023-06-27 21:28:10,582] [INFO] ===== Completeness check finished =====
[2023-06-27 21:28:10,582] [INFO] ===== Start GTDB Search =====
[2023-06-27 21:28:10,582] [INFO] Query marker FASTA already exists. Will reuse it. (GCA_002869085.1_ASM286908v1_genomic.fna/markers.fasta)
[2023-06-27 21:28:10,582] [INFO] Task started: Blastn
[2023-06-27 21:28:10,582] [INFO] Running command: blastn -query GCA_002869085.1_ASM286908v1_genomic.fna/markers.fasta -db /var/lib/cwl/stg35c5aa92-5d23-41d3-862d-7e60c699d9f2/dqc_reference/reference_markers_gtdb.fasta -out GCA_002869085.1_ASM286908v1_genomic.fna/blast.markers.gtdb.tsv -outfmt 6 -max_hsps 1 -num_alignments 5
[2023-06-27 21:28:11,922] [INFO] Task succeeded: Blastn
[2023-06-27 21:28:11,934] [INFO] Selected 28 target genomes.
[2023-06-27 21:28:11,934] [INFO] Target genome list was written to GCA_002869085.1_ASM286908v1_genomic.fna/target_genomes_gtdb.txt
[2023-06-27 21:28:11,949] [INFO] Task started: fastANI
[2023-06-27 21:28:11,949] [INFO] Running command: fastANI --query /var/lib/cwl/stga9624f0c-bae1-4170-ae2b-1fd5eb612676/GCA_002869085.1_ASM286908v1_genomic.fna.gz --refList GCA_002869085.1_ASM286908v1_genomic.fna/target_genomes_gtdb.txt --output GCA_002869085.1_ASM286908v1_genomic.fna/fastani_result_gtdb.tsv --threads 1
[2023-06-27 21:28:30,853] [INFO] Task succeeded: fastANI
[2023-06-27 21:28:30,874] [INFO] Found 28 fastANI hits (1 hits with ANI > circumscription radius)
[2023-06-27 21:28:30,874] [INFO] GTDB search result
--------------------------------------------------------------------------------
accession	gtdb_species	ani	matched_fragments	total_fragments	gtdb_taxonomy	ani_circumscription_radius	mean_intra_species_ani	min_intra_species_ani	mean_intra_species_af	min_intra_species_af	num_clustered_genomes	status
GCA_002869085.1	s__BM303 sp002869085	100.0	940	955	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhizobiales;f__BM303;g__BM303	95.0	N/A	N/A	N/A	N/A	1	conclusive
GCF_013341275.1	s__Methyloligella sp013341275	77.1531	149	955	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhizobiales;f__Methyloligellaceae;g__Methyloligella	95.0	N/A	N/A	N/A	N/A	1	-
GCA_003576695.1	s__Nordella sp003576695	76.9454	146	955	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhizobiales;f__Aestuariivirgaceae;g__Nordella	95.0	N/A	N/A	N/A	N/A	1	-
GCF_000739695.1	s__Tepidicaulis marinus	76.84	122	955	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Parvibaculales;f__Parvibaculaceae;g__Tepidicaulis	95.0	98.55	98.55	0.94	0.94	2	-
GCA_003232175.1	s__SZUA-8 sp003232175	76.8234	54	955	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhizobiales;f__SZUA-8;g__SZUA-8	95.0	N/A	N/A	N/A	N/A	1	-
GCF_900104485.1	s__Bauldia litoralis	76.661	143	955	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhizobiales;f__Kaistiaceae;g__Bauldia	95.0	N/A	N/A	N/A	N/A	1	-
GCF_003966715.1	s__Blastochloris tepida	76.6421	161	955	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhizobiales;f__Xanthobacteraceae;g__Blastochloris	95.0	N/A	N/A	N/A	N/A	1	-
GCA_001898995.1	s__63-22 sp001898995	76.5596	123	955	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhizobiales_A;f__Rhizobiaceae_A;g__63-22	95.0	99.99	99.99	0.99	0.99	2	-
GCF_016595235.1	s__Nordella sp016595235	76.5346	151	955	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhizobiales;f__Aestuariivirgaceae;g__Nordella	95.0	N/A	N/A	N/A	N/A	1	-
GCF_900156025.1	s__Bosea sp900156025	76.5068	127	955	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhizobiales;f__Beijerinckiaceae;g__Bosea	95.0	99.08	99.08	0.82	0.82	2	-
GCF_001938945.1	s__Pararhizobium rhizosphaerae	76.5019	143	955	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhizobiales;f__Rhizobiaceae;g__Pararhizobium	95.0	97.96	97.96	0.94	0.94	2	-
GCF_001723295.1	s__Methyloceanibacter marginalis	76.4658	74	955	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhizobiales;f__Methyloligellaceae;g__Methyloceanibacter	95.0	N/A	N/A	N/A	N/A	1	-
GCA_002304505.1	s__NCEH01 sp002304505	76.4563	127	955	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhizobiales;f__Xanthobacteraceae;g__NCEH01	95.0	N/A	N/A	N/A	N/A	1	-
GCA_002389665.1	s__Tepidicaulis sp002389665	76.4536	116	955	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Parvibaculales;f__Parvibaculaceae;g__Tepidicaulis	95.0	N/A	N/A	N/A	N/A	1	-
GCF_002119765.1	s__Pseudorhodoplanes sinuspersici	76.434	89	955	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhizobiales;f__Xanthobacteraceae;g__Pseudorhodoplanes	95.0	100.00	100.00	1.00	1.00	2	-
GCF_002198715.1	s__R-RK-3 sp002198715	76.4175	176	955	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhizobiales;f__Rhodomicrobiaceae;g__R-RK-3	95.0	99.98	99.98	0.99	0.99	2	-
GCF_014656355.1	s__Camelimonas fluminis	76.2829	105	955	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhizobiales;f__Beijerinckiaceae;g__Camelimonas	95.0	N/A	N/A	N/A	N/A	1	-
GCF_014692675.1	s__Roseibium aggregatum_C	76.2675	131	955	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhizobiales;f__Stappiaceae;g__Roseibium	95.0	N/A	N/A	N/A	N/A	1	-
GCF_009915765.1	s__Pannonibacter sp009915765	76.2609	135	955	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhizobiales;f__Stappiaceae;g__Pannonibacter	95.0	98.76	98.76	0.96	0.96	2	-
GCA_005502925.1	s__Nordella sp005502925	76.2594	149	955	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhizobiales;f__Aestuariivirgaceae;g__Nordella	95.0	100.00	100.00	1.00	1.00	3	-
GCF_001423265.1	s__Methylobacterium sp001423265	76.1255	95	955	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhizobiales;f__Beijerinckiaceae;g__Methylobacterium	95.0	98.78	98.73	0.95	0.94	3	-
GCF_000374145.1	s__Neomegalonema perideroedes	76.0264	77	955	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhodobacterales;f__Rhodobacteraceae;g__Neomegalonema	95.0	N/A	N/A	N/A	N/A	1	-
GCF_017309185.1	s__Roseibium aggregatum_B	75.9146	107	955	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhizobiales;f__Stappiaceae;g__Roseibium	95.0	N/A	N/A	N/A	N/A	1	-
GCA_002383105.1	s__Methyloceanibacter sp002383105	75.9036	77	955	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhizobiales;f__Methyloligellaceae;g__Methyloceanibacter	95.0	99.16	99.02	0.80	0.75	10	-
GCA_007131685.1	s__Rhodobaculum sp007131685	75.6934	66	955	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhodobacterales;f__Rhodobacteraceae;g__Rhodobaculum	95.0	99.44	99.41	0.86	0.86	3	-
GCA_003258405.1	s__Amaricoccus sp003258405	75.6358	89	955	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhodobacterales;f__Rhodobacteraceae;g__Amaricoccus	95.0	N/A	N/A	N/A	N/A	1	-
GCF_003324485.1	s__Cohaesibacter intestini	75.6325	76	955	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhizobiales;f__Cohaesibacteraceae;g__Cohaesibacter	95.0	98.38	98.38	0.96	0.96	2	-
GCA_903930095.1	s__Aestuariivirga sp903930095	75.5235	76	955	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhizobiales;f__Aestuariivirgaceae;g__Aestuariivirga	95.0	N/A	N/A	N/A	N/A	1	-
--------------------------------------------------------------------------------
[2023-06-27 21:28:30,876] [INFO] GTDB search result was written to GCA_002869085.1_ASM286908v1_genomic.fna/result_gtdb.tsv
[2023-06-27 21:28:30,877] [INFO] ===== GTDB Search completed =====
[2023-06-27 21:28:30,882] [INFO] DFAST_QC result json was written to GCA_002869085.1_ASM286908v1_genomic.fna/dqc_result.json
[2023-06-27 21:28:30,882] [INFO] DFAST_QC completed!
[2023-06-27 21:28:30,882] [INFO] Total running time: 0h1m26s
