[2024-01-25 20:19:05,573] [INFO] DFAST_QC pipeline started.
[2024-01-25 20:19:05,577] [INFO] DFAST_QC version: 0.5.7
[2024-01-25 20:19:05,577] [INFO] DQC Reference Directory: /var/lib/cwl/stg2fc771f6-4ac3-4b6b-bf62-7ba63e2ec399/dqc_reference
[2024-01-25 20:19:06,706] [INFO] ===== Start taxonomy check using ANI =====
[2024-01-25 20:19:06,707] [INFO] Task started: Prodigal
[2024-01-25 20:19:06,707] [INFO] Running command: gunzip -c /var/lib/cwl/stg131a6f51-d4ac-44b3-aa4f-9b6d0cdab7d1/GCF_003336875.1_ASM333687v1_genomic.fna.gz | prodigal -d GCF_003336875.1_ASM333687v1_genomic.fna/cds.fna -a GCF_003336875.1_ASM333687v1_genomic.fna/protein.faa -g 11 -q > /dev/null
[2024-01-25 20:19:19,528] [INFO] Task succeeded: Prodigal
[2024-01-25 20:19:19,528] [INFO] Task started: HMMsearch
[2024-01-25 20:19:19,528] [INFO] Running command: hmmsearch --tblout GCF_003336875.1_ASM333687v1_genomic.fna/hmmer_result.tsv -E 1E-50 /var/lib/cwl/stg2fc771f6-4ac3-4b6b-bf62-7ba63e2ec399/dqc_reference/reference_markers.hmm GCF_003336875.1_ASM333687v1_genomic.fna/protein.faa > /dev/null
[2024-01-25 20:19:19,789] [INFO] Task succeeded: HMMsearch
[2024-01-25 20:19:19,790] [INFO] Found 6/6 markers.
[2024-01-25 20:19:19,836] [INFO] Query marker FASTA was written to GCF_003336875.1_ASM333687v1_genomic.fna/markers.fasta
[2024-01-25 20:19:19,836] [INFO] Task started: Blastn
[2024-01-25 20:19:19,836] [INFO] Running command: blastn -query GCF_003336875.1_ASM333687v1_genomic.fna/markers.fasta -db /var/lib/cwl/stg2fc771f6-4ac3-4b6b-bf62-7ba63e2ec399/dqc_reference/reference_markers.fasta -out GCF_003336875.1_ASM333687v1_genomic.fna/blast.markers.tsv -outfmt 6 -max_hsps 1 -num_alignments 5
[2024-01-25 20:19:20,770] [INFO] Task succeeded: Blastn
[2024-01-25 20:19:20,774] [INFO] Selected 21 target genomes.
[2024-01-25 20:19:20,775] [INFO] Target genome list was written to GCF_003336875.1_ASM333687v1_genomic.fna/target_genomes.txt
[2024-01-25 20:19:20,798] [INFO] Task started: fastANI
[2024-01-25 20:19:20,798] [INFO] Running command: fastANI --query /var/lib/cwl/stg131a6f51-d4ac-44b3-aa4f-9b6d0cdab7d1/GCF_003336875.1_ASM333687v1_genomic.fna.gz --refList GCF_003336875.1_ASM333687v1_genomic.fna/target_genomes.txt --output GCF_003336875.1_ASM333687v1_genomic.fna/fastani_result.tsv --threads 1
[2024-01-25 20:19:43,275] [INFO] Task succeeded: fastANI
[2024-01-25 20:19:43,276] [INFO] Loading species specific ANI threshold from /var/lib/cwl/stg2fc771f6-4ac3-4b6b-bf62-7ba63e2ec399/dqc_reference/prokaryote_ANI_species_specific_threshold.txt
[2024-01-25 20:19:43,276] [WARNING] Species-specific ANI threshold file not found. Will use the default threshold for all species. [/var/lib/cwl/stg2fc771f6-4ac3-4b6b-bf62-7ba63e2ec399/dqc_reference/prokaryote_ANI_species_specific_threshold.txt]
[2024-01-25 20:19:43,289] [INFO] Found 21 fastANI hits (0 hits with ANI > threshold)
[2024-01-25 20:19:43,290] [INFO] The taxonomy check result is classified as 'below_threshold'.
[2024-01-25 20:19:43,290] [INFO] DFAST Taxonomy check final result
--------------------------------------------------------------------------------
organism_name	strain	accession	taxid	species_taxid	relation_to_type	validated	ani	matched_fragments	total_fragments	ani_threshold	status
Niveispirillum irakense	strain=DSM 11586	GCA_000429645.1	34011	34011	type	True	79.334	612	1539	95	below_threshold
Nitrospirillum iridis	strain=DSM 22198	GCA_014205765.1	765888	765888	type	True	79.0167	601	1539	95	below_threshold
Niveispirillum cyanobacteriorum	strain=CGMCC 1.12958	GCA_014640215.1	1612173	1612173	type	True	78.8728	615	1539	95	below_threshold
Niveispirillum cyanobacteriorum	strain=TH16	GCA_002868735.1	1612173	1612173	type	True	78.871	605	1539	95	below_threshold
Niveispirillum lacus	strain=1-14	GCA_002251795.1	1981099	1981099	type	True	78.4903	471	1539	95	below_threshold
Azospirillum thermophilum	strain=CFH 70021	GCA_003130795.1	2202148	2202148	type	True	78.2384	581	1539	95	below_threshold
Azospirillum brasilense	strain=Sp 7	GCA_001315015.1	192	192	type	True	78.1941	539	1539	95	below_threshold
Azospirillum rugosum	strain=IMMIB AFH-6	GCA_017876155.1	416170	416170	type	True	78.1734	586	1539	95	below_threshold
Azospirillum brasilense	strain=Sp 7	GCA_007827425.1	192	192	type	True	78.1546	551	1539	95	below_threshold
Azospirillum tabaci	strain=W712	GCA_014596085.1	2752310	2752310	type	True	78.1519	538	1539	95	below_threshold
Azospirillum brasilense	strain=Sp 7	GCA_002027385.1	192	192	type	True	78.1223	525	1539	95	below_threshold
Azospirillum brasilense	strain=Sp 7	GCA_008274945.1	192	192	type	True	78.1144	544	1539	95	below_threshold
Azospirillum formosense	strain=CC-NFb-7	GCA_013340925.1	861533	861533	type	True	78.0787	514	1539	95	below_threshold
Azospirillum picis	strain=IMMIB TAR-3	GCA_017876115.1	488438	488438	type	True	77.9786	565	1539	95	below_threshold
Azospirillum agricola	strain=CC-HIH038	GCA_017876095.1	1720247	1720247	type	True	77.8073	590	1539	95	below_threshold
Inquilinus limosus	strain=DSM 16000	GCA_000423185.1	171674	171674	type	True	77.3794	497	1539	95	below_threshold
Arenibaculum pallidiluteum	strain=SYSU D00532	GCA_017355985.1	2812559	2812559	type	True	77.3574	438	1539	95	below_threshold
Xanthobacter oligotrophicus	strain=29k	GCA_008364685.1	2607286	2607286	type	True	76.7537	282	1539	95	below_threshold
Rhodovastum atsumiense	strain=G2-11	GCA_937425535.1	504468	504468	type	True	76.6503	353	1539	95	below_threshold
Roseococcus pinisoli	strain=XZZS9	GCA_018413645.1	2835040	2835040	type	True	76.4623	258	1539	95	below_threshold
Roseococcus thiosulfatophilus	strain=RB-3	GCA_017311575.1	35813	35813	type	True	76.1848	313	1539	95	below_threshold
--------------------------------------------------------------------------------
[2024-01-25 20:19:43,291] [INFO] DFAST Taxonomy check result was written to GCF_003336875.1_ASM333687v1_genomic.fna/tc_result.tsv
[2024-01-25 20:19:43,292] [INFO] ===== Taxonomy check completed =====
[2024-01-25 20:19:43,292] [INFO] ===== Start completeness check using CheckM =====
[2024-01-25 20:19:43,292] [INFO] Setting CHECKM_DATA_PATH to /var/lib/cwl/stg2fc771f6-4ac3-4b6b-bf62-7ba63e2ec399/dqc_reference/checkm_data
[2024-01-25 20:19:43,293] [INFO] Selected 'Prokaryote' markers (life, taxid=0) for CheckM
[2024-01-25 20:19:43,340] [INFO] Task started: CheckM
[2024-01-25 20:19:43,341] [INFO] Running command: checkm taxonomy_wf --tab_table -f GCF_003336875.1_ASM333687v1_genomic.fna/cc_result.tsv -t 1 life "Prokaryote" GCF_003336875.1_ASM333687v1_genomic.fna/checkm_input GCF_003336875.1_ASM333687v1_genomic.fna/checkm_result
[2024-01-25 20:20:35,200] [INFO] Task succeeded: CheckM
[2024-01-25 20:20:35,202] [INFO] Completeness check finished.
--------------------------------------------------------------------------------
Completeness: 100.00%
Contamination: 0.00%
Strain heterogeneity: 0.00%
--------------------------------------------------------------------------------
[2024-01-25 20:20:35,249] [INFO] ===== Completeness check finished =====
[2024-01-25 20:20:35,250] [INFO] ===== Start GTDB Search =====
[2024-01-25 20:20:35,251] [INFO] Query marker FASTA already exists. Will reuse it. (GCF_003336875.1_ASM333687v1_genomic.fna/markers.fasta)
[2024-01-25 20:20:35,251] [INFO] Task started: Blastn
[2024-01-25 20:20:35,251] [INFO] Running command: blastn -query GCF_003336875.1_ASM333687v1_genomic.fna/markers.fasta -db /var/lib/cwl/stg2fc771f6-4ac3-4b6b-bf62-7ba63e2ec399/dqc_reference/reference_markers_gtdb.fasta -out GCF_003336875.1_ASM333687v1_genomic.fna/blast.markers.gtdb.tsv -outfmt 6 -max_hsps 1 -num_alignments 5
[2024-01-25 20:20:37,083] [INFO] Task succeeded: Blastn
[2024-01-25 20:20:37,087] [INFO] Selected 14 target genomes.
[2024-01-25 20:20:37,087] [INFO] Target genome list was written to GCF_003336875.1_ASM333687v1_genomic.fna/target_genomes_gtdb.txt
[2024-01-25 20:20:37,100] [INFO] Task started: fastANI
[2024-01-25 20:20:37,101] [INFO] Running command: fastANI --query /var/lib/cwl/stg131a6f51-d4ac-44b3-aa4f-9b6d0cdab7d1/GCF_003336875.1_ASM333687v1_genomic.fna.gz --refList GCF_003336875.1_ASM333687v1_genomic.fna/target_genomes_gtdb.txt --output GCF_003336875.1_ASM333687v1_genomic.fna/fastani_result_gtdb.tsv --threads 1
[2024-01-25 20:20:54,846] [INFO] Task succeeded: fastANI
[2024-01-25 20:20:54,856] [INFO] Found 14 fastANI hits (1 hits with ANI > circumscription radius)
[2024-01-25 20:20:54,856] [INFO] GTDB search result
--------------------------------------------------------------------------------
accession	gtdb_species	ani	matched_fragments	total_fragments	gtdb_taxonomy	ani_circumscription_radius	mean_intra_species_ani	min_intra_species_ani	mean_intra_species_af	min_intra_species_af	num_clustered_genomes	status
GCF_003336875.1	s__Oleisolibacter albus	100.0	1534	1539	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Azospirillales;f__Azospirillaceae;g__Oleisolibacter	95.0	N/A	N/A	N/A	N/A	1	conclusive
GCA_001939945.1	s__Aerophototrophica crusticola	81.2228	889	1539	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Azospirillales;f__Azospirillaceae;g__Aerophototrophica	95.0	99.94	99.94	0.97	0.97	2	-
GCF_014197805.1	s__Rhodospirillum_A centenum	80.7093	765	1539	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Azospirillales;f__Azospirillaceae;g__Rhodospirillum_A	95.0	100.00	100.00	1.00	1.00	2	-
GCF_003568845.1	s__Indioceanicola profundi	80.4177	696	1539	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Azospirillales;f__Azospirillaceae;g__Indioceanicola	95.0	N/A	N/A	N/A	N/A	1	-
GCF_900188385.1	s__Niveispirillum sp900188385	79.6896	688	1539	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Azospirillales;f__Azospirillaceae;g__Niveispirillum	95.0	100.00	100.00	1.00	1.00	2	-
GCF_009495745.1	s__Niveispirillum sp009495745	79.4116	692	1539	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Azospirillales;f__Azospirillaceae;g__Niveispirillum	95.0	N/A	N/A	N/A	N/A	1	-
GCF_000429645.1	s__Niveispirillum irakense	79.3578	609	1539	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Azospirillales;f__Azospirillaceae;g__Niveispirillum	95.0	N/A	N/A	N/A	N/A	1	-
GCF_007827955.1	s__Nitrospirillum amazonense_B	79.1344	587	1539	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Azospirillales;f__Azospirillaceae;g__Nitrospirillum	95.0	N/A	N/A	N/A	N/A	1	-
GCF_007828045.1	s__Nitrospirillum amazonense_C	79.0205	614	1539	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Azospirillales;f__Azospirillaceae;g__Nitrospirillum	95.0	N/A	N/A	N/A	N/A	1	-
GCF_002868735.1	s__Niveispirillum cyanobacteriorum	78.8822	603	1539	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Azospirillales;f__Azospirillaceae;g__Niveispirillum	95.0	100.00	100.00	1.00	1.00	2	-
GCF_007828035.1	s__Nitrospirillum amazonense_A	78.8561	602	1539	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Azospirillales;f__Azospirillaceae;g__Nitrospirillum	95.0	N/A	N/A	N/A	N/A	1	-
GCF_001315015.1	s__Azospirillum brasilense	78.1952	540	1539	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Azospirillales;f__Azospirillaceae;g__Azospirillum	95.0321	99.35	95.17	0.92	0.83	16	-
GCF_017876155.1	s__Azospirillum rugosum	78.1497	592	1539	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Azospirillales;f__Azospirillaceae;g__Azospirillum	95.0	N/A	N/A	N/A	N/A	1	-
GCF_003349955.1	s__Azospirillum brasilense_B	78.0601	575	1539	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Azospirillales;f__Azospirillaceae;g__Azospirillum	95.0	95.36	95.36	0.89	0.89	2	-
--------------------------------------------------------------------------------
[2024-01-25 20:20:54,857] [INFO] GTDB search result was written to GCF_003336875.1_ASM333687v1_genomic.fna/result_gtdb.tsv
[2024-01-25 20:20:54,858] [INFO] ===== GTDB Search completed =====
[2024-01-25 20:20:54,861] [INFO] DFAST_QC result json was written to GCF_003336875.1_ASM333687v1_genomic.fna/dqc_result.json
[2024-01-25 20:20:54,861] [INFO] DFAST_QC completed!
[2024-01-25 20:20:54,861] [INFO] Total running time: 0h1m49s
