[2024-01-24 13:45:57,212] [INFO] DFAST_QC pipeline started.
[2024-01-24 13:45:57,241] [INFO] DFAST_QC version: 0.5.7
[2024-01-24 13:45:57,242] [INFO] DQC Reference Directory: /var/lib/cwl/stg7b55e5bb-d680-4654-9af5-37d06e27d40c/dqc_reference
[2024-01-24 13:45:58,441] [INFO] ===== Start taxonomy check using ANI =====
[2024-01-24 13:45:58,442] [INFO] Task started: Prodigal
[2024-01-24 13:45:58,442] [INFO] Running command: gunzip -c /var/lib/cwl/stgb64e5be5-2dfc-4043-8398-69a39f0322c0/GCF_014199445.1_ASM1419944v1_genomic.fna.gz | prodigal -d GCF_014199445.1_ASM1419944v1_genomic.fna/cds.fna -a GCF_014199445.1_ASM1419944v1_genomic.fna/protein.faa -g 11 -q > /dev/null
[2024-01-24 13:46:09,345] [INFO] Task succeeded: Prodigal
[2024-01-24 13:46:09,346] [INFO] Task started: HMMsearch
[2024-01-24 13:46:09,346] [INFO] Running command: hmmsearch --tblout GCF_014199445.1_ASM1419944v1_genomic.fna/hmmer_result.tsv -E 1E-50 /var/lib/cwl/stg7b55e5bb-d680-4654-9af5-37d06e27d40c/dqc_reference/reference_markers.hmm GCF_014199445.1_ASM1419944v1_genomic.fna/protein.faa > /dev/null
[2024-01-24 13:46:09,615] [INFO] Task succeeded: HMMsearch
[2024-01-24 13:46:09,617] [INFO] Found 6/6 markers.
[2024-01-24 13:46:09,654] [INFO] Query marker FASTA was written to GCF_014199445.1_ASM1419944v1_genomic.fna/markers.fasta
[2024-01-24 13:46:09,654] [INFO] Task started: Blastn
[2024-01-24 13:46:09,654] [INFO] Running command: blastn -query GCF_014199445.1_ASM1419944v1_genomic.fna/markers.fasta -db /var/lib/cwl/stg7b55e5bb-d680-4654-9af5-37d06e27d40c/dqc_reference/reference_markers.fasta -out GCF_014199445.1_ASM1419944v1_genomic.fna/blast.markers.tsv -outfmt 6 -max_hsps 1 -num_alignments 5
[2024-01-24 13:46:10,460] [INFO] Task succeeded: Blastn
[2024-01-24 13:46:10,463] [INFO] Selected 29 target genomes.
[2024-01-24 13:46:10,464] [INFO] Target genome list was written to GCF_014199445.1_ASM1419944v1_genomic.fna/target_genomes.txt
[2024-01-24 13:46:10,559] [INFO] Task started: fastANI
[2024-01-24 13:46:10,560] [INFO] Running command: fastANI --query /var/lib/cwl/stgb64e5be5-2dfc-4043-8398-69a39f0322c0/GCF_014199445.1_ASM1419944v1_genomic.fna.gz --refList GCF_014199445.1_ASM1419944v1_genomic.fna/target_genomes.txt --output GCF_014199445.1_ASM1419944v1_genomic.fna/fastani_result.tsv --threads 1
[2024-01-24 13:46:27,666] [INFO] Task succeeded: fastANI
[2024-01-24 13:46:27,667] [INFO] Loading species specific ANI threshold from /var/lib/cwl/stg7b55e5bb-d680-4654-9af5-37d06e27d40c/dqc_reference/prokaryote_ANI_species_specific_threshold.txt
[2024-01-24 13:46:27,668] [WARNING] Species-specific ANI threshold file not found. Will use the default threshold for all species. [/var/lib/cwl/stg7b55e5bb-d680-4654-9af5-37d06e27d40c/dqc_reference/prokaryote_ANI_species_specific_threshold.txt]
[2024-01-24 13:46:27,694] [INFO] Found 29 fastANI hits (1 hits with ANI > threshold)
[2024-01-24 13:46:27,694] [INFO] The taxonomy check result is classified as 'conclusive'.
[2024-01-24 13:46:27,695] [INFO] DFAST Taxonomy check final result
--------------------------------------------------------------------------------
organism_name	strain	accession	taxid	species_taxid	relation_to_type	validated	ani	matched_fragments	total_fragments	ani_threshold	status
Rubricella aquisinus	strain=DSM 103377	GCA_014199445.1	2028108	2028108	type	True	100.0	1069	1069	95	conclusive
Pontivivens insulae	strain=CeCT 8812	GCA_900302495.1	1639689	1639689	type	True	77.3263	217	1069	95	below_threshold
Pontivivens insulae	strain=DSM 103361	GCA_003385715.1	1639689	1639689	type	True	77.3141	218	1069	95	below_threshold
Pontivivens ytuae	strain=MT2928	GCA_015679265.1	2789856	2789856	type	True	77.1753	283	1069	95	below_threshold
Oceanomicrobium pacificus	strain=KN286	GCA_009833495.1	2692916	2692916	type	True	77.1165	228	1069	95	below_threshold
Alexandriicola marinus	strain=LZ-14	GCA_004000435.1	2081710	2081710	type	True	76.9062	175	1069	95	below_threshold
Maribius pelagius	strain=DSM 26893	GCA_900110115.1	387096	387096	type	True	76.8047	152	1069	95	below_threshold
Rhodovulum adriaticum	strain=DSM 2781	GCA_016583705.1	35804	35804	type	True	76.7543	163	1069	95	below_threshold
Rhodovulum adriaticum	strain=DSM 2781	GCA_004345735.1	35804	35804	type	True	76.7227	169	1069	95	below_threshold
Pelagivirga dicentrarchi	strain=YLY04	GCA_003316635.1	2250573	2250573	type	True	76.7193	138	1069	95	below_threshold
Roseinatronobacter thiooxidans	strain=ALG1	GCA_001870675.1	121821	121821	type	True	76.6845	133	1069	95	below_threshold
Salibaculum griseiflavum	strain=WDS4C29	GCA_003129565.1	1914409	1914409	type	True	76.6577	164	1069	95	below_threshold
Phaeovulum veldkampii	strain=DSM 11550	GCA_004363195.1	33049	33049	type	True	76.6416	165	1069	95	below_threshold
Gemmobacter fulva	strain=con5	GCA_018798885.1	2840474	2840474	type	True	76.6385	206	1069	95	below_threshold
Pseudooceanicola algae	strain=Lw-13e	GCA_003590145.2	1537215	1537215	type	True	76.6111	137	1069	95	below_threshold
Phaeovulum veldkampii	strain=DSM 11550	GCA_003034995.1	33049	33049	type	True	76.6075	167	1069	95	below_threshold
Nioella ostreopsis	strain=Z7-4	GCA_004000255.1	2448479	2448479	type	True	76.605	202	1069	95	below_threshold
Paracoccus nototheniae	strain=I-41R45	GCA_004335005.1	2489002	2489002	type	True	76.5701	152	1069	95	below_threshold
Salinihabitans flavidus	strain=DSM 27842	GCA_900110425.1	569882	569882	type	True	76.5461	177	1069	95	below_threshold
Sulfitobacter dubius	strain=DSM 16472	GCA_900113435.1	218673	218673	type	True	76.5395	154	1069	95	below_threshold
Paracoccus mutanolyticus	strain=RSP-02	GCA_003285265.1	1499308	1499308	type	True	76.5018	110	1069	95	below_threshold
Thalassorhabdomicrobium marinisediminis	strain=BH-SD16	GCA_003072065.1	2170577	2170577	type	True	76.4863	142	1069	95	below_threshold
Albimonas donghaensis	strain=DSM 17890	GCA_900106695.1	356660	356660	type	True	76.4142	151	1069	95	below_threshold
Oceanicella actignis	strain=DSM 22673	GCA_008124525.1	1189325	1189325	type	True	76.3779	138	1069	95	below_threshold
Salipiger bermudensis	strain=HTCC2601	GCA_000153725.1	344736	344736	type	True	76.3753	147	1069	95	below_threshold
Rhodovulum imhoffii	strain=DSM 18064	GCA_016653305.1	365340	365340	type	True	76.3011	105	1069	95	below_threshold
Rhodovulum imhoffii	strain=DSM 18064	GCA_003046545.1	365340	365340	type	True	76.2833	106	1069	95	below_threshold
Rubellimicrobium aerolatum	strain=DSM 19297	GCA_017872975.1	490979	490979	type	True	76.0986	89	1069	95	below_threshold
Paracoccus tegillarcae	strain=BM15	GCA_002847305.1	1529068	1529068	type	True	76.0676	138	1069	95	below_threshold
--------------------------------------------------------------------------------
[2024-01-24 13:46:27,697] [INFO] DFAST Taxonomy check result was written to GCF_014199445.1_ASM1419944v1_genomic.fna/tc_result.tsv
[2024-01-24 13:46:27,698] [INFO] ===== Taxonomy check completed =====
[2024-01-24 13:46:27,698] [INFO] ===== Start completeness check using CheckM =====
[2024-01-24 13:46:27,698] [INFO] Setting CHECKM_DATA_PATH to /var/lib/cwl/stg7b55e5bb-d680-4654-9af5-37d06e27d40c/dqc_reference/checkm_data
[2024-01-24 13:46:27,700] [INFO] Selected 'Prokaryote' markers (life, taxid=0) for CheckM
[2024-01-24 13:46:27,735] [INFO] Task started: CheckM
[2024-01-24 13:46:27,735] [INFO] Running command: checkm taxonomy_wf --tab_table -f GCF_014199445.1_ASM1419944v1_genomic.fna/cc_result.tsv -t 1 life "Prokaryote" GCF_014199445.1_ASM1419944v1_genomic.fna/checkm_input GCF_014199445.1_ASM1419944v1_genomic.fna/checkm_result
[2024-01-24 13:47:03,713] [INFO] Task succeeded: CheckM
[2024-01-24 13:47:03,714] [INFO] Completeness check finished.
--------------------------------------------------------------------------------
Completeness: 100.00%
Contamination: 0.00%
Strain heterogeneity: 0.00%
--------------------------------------------------------------------------------
[2024-01-24 13:47:03,734] [INFO] ===== Completeness check finished =====
[2024-01-24 13:47:03,735] [INFO] ===== Start GTDB Search =====
[2024-01-24 13:47:03,735] [INFO] Query marker FASTA already exists. Will reuse it. (GCF_014199445.1_ASM1419944v1_genomic.fna/markers.fasta)
[2024-01-24 13:47:03,735] [INFO] Task started: Blastn
[2024-01-24 13:47:03,736] [INFO] Running command: blastn -query GCF_014199445.1_ASM1419944v1_genomic.fna/markers.fasta -db /var/lib/cwl/stg7b55e5bb-d680-4654-9af5-37d06e27d40c/dqc_reference/reference_markers_gtdb.fasta -out GCF_014199445.1_ASM1419944v1_genomic.fna/blast.markers.gtdb.tsv -outfmt 6 -max_hsps 1 -num_alignments 5
[2024-01-24 13:47:05,151] [INFO] Task succeeded: Blastn
[2024-01-24 13:47:05,156] [INFO] Selected 27 target genomes.
[2024-01-24 13:47:05,156] [INFO] Target genome list was written to GCF_014199445.1_ASM1419944v1_genomic.fna/target_genomes_gtdb.txt
[2024-01-24 13:47:05,235] [INFO] Task started: fastANI
[2024-01-24 13:47:05,235] [INFO] Running command: fastANI --query /var/lib/cwl/stgb64e5be5-2dfc-4043-8398-69a39f0322c0/GCF_014199445.1_ASM1419944v1_genomic.fna.gz --refList GCF_014199445.1_ASM1419944v1_genomic.fna/target_genomes_gtdb.txt --output GCF_014199445.1_ASM1419944v1_genomic.fna/fastani_result_gtdb.tsv --threads 1
[2024-01-24 13:47:21,663] [INFO] Task succeeded: fastANI
[2024-01-24 13:47:21,687] [INFO] Found 27 fastANI hits (1 hits with ANI > circumscription radius)
[2024-01-24 13:47:21,687] [INFO] GTDB search result
--------------------------------------------------------------------------------
accession	gtdb_species	ani	matched_fragments	total_fragments	gtdb_taxonomy	ani_circumscription_radius	mean_intra_species_ani	min_intra_species_ani	mean_intra_species_af	min_intra_species_af	num_clustered_genomes	status
GCF_014199445.1	s__Rubricella aquisinus	100.0	1069	1069	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhodobacterales;f__Rhodobacteraceae;g__Rubricella	95.0	N/A	N/A	N/A	N/A	1	conclusive
GCF_011290545.1	s__Monaibacterium sp011290545	77.649	231	1069	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhodobacterales;f__Rhodobacteraceae;g__Monaibacterium	95.0	96.75	96.75	0.88	0.88	2	-
GCF_900302495.1	s__Pontivivens insulae	77.3399	216	1069	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhodobacterales;f__Rhodobacteraceae;g__Pontivivens	95.0	100.00	100.00	1.00	1.00	2	-
GCF_015679265.1	s__MT2928 sp015679265	77.1753	283	1069	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhodobacterales;f__Rhodobacteraceae;g__MT2928	95.0	N/A	N/A	N/A	N/A	1	-
GCF_003993775.1	s__Frigidibacter sp003993775	76.722	166	1069	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhodobacterales;f__Rhodobacteraceae;g__Frigidibacter	95.0	N/A	N/A	N/A	N/A	1	-
GCF_003316635.1	s__Roseovarius dicentrarchi	76.7193	138	1069	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhodobacterales;f__Rhodobacteraceae;g__Roseovarius	95.0	N/A	N/A	N/A	N/A	1	-
GCA_015487165.1	s__UBA5972 sp015487165	76.7043	109	1069	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhodobacterales;f__Rhodobacteraceae;g__UBA5972	95.0	N/A	N/A	N/A	N/A	1	-
GCF_001879695.1	s__Nioella sediminis	76.6944	181	1069	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhodobacterales;f__Rhodobacteraceae;g__Nioella	95.0	100.00	100.00	0.99	0.99	2	-
GCF_003129565.1	s__Salibaculum griseiflavum	76.6577	164	1069	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhodobacterales;f__Rhodobacteraceae;g__Salibaculum	95.0	N/A	N/A	N/A	N/A	1	-
GCA_018798885.1	s__Gemmobacter sp018798885	76.6385	206	1069	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhodobacterales;f__Rhodobacteraceae;g__Gemmobacter	95.0	100.00	100.00	1.00	1.00	2	-
GCF_004000255.1	s__Nioella ostreopsis	76.6149	201	1069	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhodobacterales;f__Rhodobacteraceae;g__Nioella	95.0	N/A	N/A	N/A	N/A	1	-
GCF_003590145.2	s__Pseudooceanicola algae	76.6111	137	1069	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhodobacterales;f__Rhodobacteraceae;g__Pseudooceanicola	95.0	N/A	N/A	N/A	N/A	1	-
GCF_900110425.1	s__Salinihabitans flavidus	76.5461	177	1069	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhodobacterales;f__Rhodobacteraceae;g__Salinihabitans	95.0	N/A	N/A	N/A	N/A	1	-
GCA_018524345.1	s__KMM-3653 sp018524345	76.5447	153	1069	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhodobacterales;f__Rhodobacteraceae;g__KMM-3653	95.0	N/A	N/A	N/A	N/A	1	-
GCF_012395815.1	s__Roseicyclus sp012395815	76.5423	181	1069	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhodobacterales;f__Rhodobacteraceae;g__Roseicyclus	95.0	N/A	N/A	N/A	N/A	1	-
GCF_008086115.1	s__Muriiphilus fusiformis	76.5421	151	1069	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhodobacterales;f__Rhodobacteraceae;g__Muriiphilus	95.0	98.34	98.34	0.92	0.92	2	-
GCF_900113435.1	s__Sulfitobacter dubius	76.5395	154	1069	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhodobacterales;f__Rhodobacteraceae;g__Sulfitobacter	95.0	97.36	97.36	0.82	0.82	2	-
GCA_014359745.1	s__JACIYW01 sp014359745	76.5276	146	1069	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhodobacterales;f__Rhodobacteraceae;g__JACIYW01	95.0	N/A	N/A	N/A	N/A	1	-
GCF_003072065.1	s__Thalassorhabdomicrobium marinisediminis	76.4724	143	1069	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhodobacterales;f__Rhodobacteraceae;g__Thalassorhabdomicrobium	95.0	N/A	N/A	N/A	N/A	1	-
GCF_008124525.1	s__Oceanicella actignis	76.4329	134	1069	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhodobacterales;f__Rhodobacteraceae;g__Oceanicella	95.0	98.87	98.87	0.96	0.96	3	-
GCF_900106695.1	s__Albimonas donghaensis	76.4142	151	1069	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhodobacterales;f__Rhodobacteraceae;g__Albimonas	95.0	97.11	97.08	0.90	0.88	3	-
GCA_007116445.1	s__Roseinatronobacter sp007116445	76.3643	76	1069	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhodobacterales;f__Rhodobacteraceae;g__Roseinatronobacter	95.0	N/A	N/A	N/A	N/A	1	-
GCA_014762635.1	s__JABURR01 sp014762635	76.3103	163	1069	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhodobacterales;f__Rhodobacteraceae;g__JABURR01	95.0	N/A	N/A	N/A	N/A	1	-
GCA_002280515.1	s__Albidovulum sp002280515	76.2691	135	1069	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhodobacterales;f__Rhodobacteraceae;g__Albidovulum	95.0	N/A	N/A	N/A	N/A	1	-
GCA_001314655.1	s__Salibaculum sp001314655	76.219	185	1069	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhodobacterales;f__Rhodobacteraceae;g__Salibaculum	95.0	N/A	N/A	N/A	N/A	1	-
GCF_017872975.1	s__Rubellimicrobium aerolatum	76.0986	89	1069	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhodobacterales;f__Rhodobacteraceae;g__Rubellimicrobium	95.0	N/A	N/A	N/A	N/A	1	-
GCA_013151335.1	s__UBA5972 sp013151335	75.9272	75	1069	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhodobacterales;f__Rhodobacteraceae;g__UBA5972	95.0	N/A	N/A	N/A	N/A	1	-
--------------------------------------------------------------------------------
[2024-01-24 13:47:21,689] [INFO] GTDB search result was written to GCF_014199445.1_ASM1419944v1_genomic.fna/result_gtdb.tsv
[2024-01-24 13:47:21,690] [INFO] ===== GTDB Search completed =====
[2024-01-24 13:47:21,695] [INFO] DFAST_QC result json was written to GCF_014199445.1_ASM1419944v1_genomic.fna/dqc_result.json
[2024-01-24 13:47:21,695] [INFO] DFAST_QC completed!
[2024-01-24 13:47:21,695] [INFO] Total running time: 0h1m24s
