[2024-01-24 13:01:39,628] [INFO] DFAST_QC pipeline started.
[2024-01-24 13:01:39,632] [INFO] DFAST_QC version: 0.5.7
[2024-01-24 13:01:39,633] [INFO] DQC Reference Directory: /var/lib/cwl/stga6cdbaea-2b11-4225-8afb-aa8d1eb39d2d/dqc_reference
[2024-01-24 13:01:41,327] [INFO] ===== Start taxonomy check using ANI =====
[2024-01-24 13:01:41,328] [INFO] Task started: Prodigal
[2024-01-24 13:01:41,328] [INFO] Running command: gunzip -c /var/lib/cwl/stg766eed70-81e4-4ed8-b7fd-624ca6dd9120/GCF_025136295.1_ASM2513629v1_genomic.fna.gz | prodigal -d GCF_025136295.1_ASM2513629v1_genomic.fna/cds.fna -a GCF_025136295.1_ASM2513629v1_genomic.fna/protein.faa -g 11 -q > /dev/null
[2024-01-24 13:01:55,263] [INFO] Task succeeded: Prodigal
[2024-01-24 13:01:55,263] [INFO] Task started: HMMsearch
[2024-01-24 13:01:55,264] [INFO] Running command: hmmsearch --tblout GCF_025136295.1_ASM2513629v1_genomic.fna/hmmer_result.tsv -E 1E-50 /var/lib/cwl/stga6cdbaea-2b11-4225-8afb-aa8d1eb39d2d/dqc_reference/reference_markers.hmm GCF_025136295.1_ASM2513629v1_genomic.fna/protein.faa > /dev/null
[2024-01-24 13:01:55,635] [INFO] Task succeeded: HMMsearch
[2024-01-24 13:01:55,636] [INFO] Found 6/6 markers.
[2024-01-24 13:01:55,680] [INFO] Query marker FASTA was written to GCF_025136295.1_ASM2513629v1_genomic.fna/markers.fasta
[2024-01-24 13:01:55,680] [INFO] Task started: Blastn
[2024-01-24 13:01:55,681] [INFO] Running command: blastn -query GCF_025136295.1_ASM2513629v1_genomic.fna/markers.fasta -db /var/lib/cwl/stga6cdbaea-2b11-4225-8afb-aa8d1eb39d2d/dqc_reference/reference_markers.fasta -out GCF_025136295.1_ASM2513629v1_genomic.fna/blast.markers.tsv -outfmt 6 -max_hsps 1 -num_alignments 5
[2024-01-24 13:01:56,537] [INFO] Task succeeded: Blastn
[2024-01-24 13:01:56,542] [INFO] Selected 27 target genomes.
[2024-01-24 13:01:56,542] [INFO] Target genome list was written to GCF_025136295.1_ASM2513629v1_genomic.fna/target_genomes.txt
[2024-01-24 13:01:56,552] [INFO] Task started: fastANI
[2024-01-24 13:01:56,552] [INFO] Running command: fastANI --query /var/lib/cwl/stg766eed70-81e4-4ed8-b7fd-624ca6dd9120/GCF_025136295.1_ASM2513629v1_genomic.fna.gz --refList GCF_025136295.1_ASM2513629v1_genomic.fna/target_genomes.txt --output GCF_025136295.1_ASM2513629v1_genomic.fna/fastani_result.tsv --threads 1
[2024-01-24 13:02:17,872] [INFO] Task succeeded: fastANI
[2024-01-24 13:02:17,873] [INFO] Loading species specific ANI threshold from /var/lib/cwl/stga6cdbaea-2b11-4225-8afb-aa8d1eb39d2d/dqc_reference/prokaryote_ANI_species_specific_threshold.txt
[2024-01-24 13:02:17,873] [WARNING] Species-specific ANI threshold file not found. Will use the default threshold for all species. [/var/lib/cwl/stga6cdbaea-2b11-4225-8afb-aa8d1eb39d2d/dqc_reference/prokaryote_ANI_species_specific_threshold.txt]
[2024-01-24 13:02:17,890] [INFO] Found 22 fastANI hits (1 hits with ANI > threshold)
[2024-01-24 13:02:17,890] [INFO] The taxonomy check result is classified as 'conclusive'.
[2024-01-24 13:02:17,890] [INFO] DFAST Taxonomy check final result
--------------------------------------------------------------------------------
organism_name	strain	accession	taxid	species_taxid	relation_to_type	validated	ani	matched_fragments	total_fragments	ani_threshold	status
Aestuariispira ectoiniformans	strain=SWCN16	GCA_025136295.1	2775080	2775080	type	True	100.0	1452	1453	95	conclusive
Aestuariispira insulae	strain=CECT 8488	GCA_003385955.1	1461337	1461337	type	True	77.5881	285	1453	95	below_threshold
Martelella endophytica	strain=YC6887	GCA_000960975.1	1486262	1486262	type	True	76.9617	56	1453	95	below_threshold
Azospirillum baldaniorum	strain=Sp245	GCA_003119195.2	1064539	1064539	type	True	76.8305	105	1453	95	below_threshold
Pacificispira spongiicola	strain=KN72	GCA_012926505.1	2729598	2729598	type	True	76.8295	160	1453	95	below_threshold
Azospirillum baldaniorum	strain=Sp245	GCA_000237365.1	1064539	1064539	type	True	76.6872	107	1453	95	below_threshold
Thalassospira indica	strain=PB8BT	GCA_003403095.1	1891279	1891279	type	True	76.5428	98	1453	95	below_threshold
Thalassospira marina	strain=CSC3H3	GCA_002844375.1	2048283	2048283	type	True	76.4563	95	1453	95	below_threshold
Reyranella massiliensis	strain=521	GCA_000312425.1	445220	445220	type	True	76.2909	62	1453	95	below_threshold
Dongia mobilis	strain=CGMCC 1.7660	GCA_004363235.1	578943	578943	type	True	76.2461	88	1453	95	below_threshold
Zavarzinia aquatilis	strain=HR-AS	GCA_003173035.1	2211142	2211142	type	True	76.1744	81	1453	95	below_threshold
Niveispirillum irakense	strain=DSM 11586	GCA_000429645.1	34011	34011	type	True	75.9844	98	1453	95	below_threshold
Skermanella aerolata	strain=5416T-32	GCA_000936425.1	393310	393310	type	True	75.9812	97	1453	95	below_threshold
Thalassospira mesophila	strain=JCM 18969	GCA_002115755.1	1293891	1293891	type	True	75.91	51	1453	95	below_threshold
Paracoccus denitrificans	strain=NBRC 102528	GCA_007989485.1	266	266	type	True	75.9047	73	1453	95	below_threshold
Paracoccus limosus	strain=JCM 17370	GCA_009711185.1	913252	913252	type	True	75.8504	59	1453	95	below_threshold
Paracoccus halophilus	strain=CGMCC 1.6117	GCA_900111785.1	376733	376733	type	True	75.8474	65	1453	95	below_threshold
Azospirillum agricola	strain=CC-HIH038	GCA_017876095.1	1720247	1720247	type	True	75.8178	98	1453	95	below_threshold
Xanthobacter dioxanivorans	strain=YN2	GCA_016807805.1	2528964	2528964	type	True	75.7765	71	1453	95	below_threshold
Pararhodospirillum oryzae	strain=NBRC 107573	GCA_007992075.1	478448	478448	type	True	75.7443	54	1453	95	below_threshold
Tistrella bauzanensis	strain=CGMCC 1.10188	GCA_014636235.1	657419	657419	type	True	75.7396	83	1453	95	below_threshold
Paracoccus halophilus	strain=JCM 14014	GCA_000763905.1	376733	376733	type	True	75.6439	64	1453	95	below_threshold
--------------------------------------------------------------------------------
[2024-01-24 13:02:17,892] [INFO] DFAST Taxonomy check result was written to GCF_025136295.1_ASM2513629v1_genomic.fna/tc_result.tsv
[2024-01-24 13:02:17,893] [INFO] ===== Taxonomy check completed =====
[2024-01-24 13:02:17,893] [INFO] ===== Start completeness check using CheckM =====
[2024-01-24 13:02:17,893] [INFO] Setting CHECKM_DATA_PATH to /var/lib/cwl/stga6cdbaea-2b11-4225-8afb-aa8d1eb39d2d/dqc_reference/checkm_data
[2024-01-24 13:02:17,894] [INFO] Selected 'Prokaryote' markers (life, taxid=0) for CheckM
[2024-01-24 13:02:17,935] [INFO] Task started: CheckM
[2024-01-24 13:02:17,936] [INFO] Running command: checkm taxonomy_wf --tab_table -f GCF_025136295.1_ASM2513629v1_genomic.fna/cc_result.tsv -t 1 life "Prokaryote" GCF_025136295.1_ASM2513629v1_genomic.fna/checkm_input GCF_025136295.1_ASM2513629v1_genomic.fna/checkm_result
[2024-01-24 13:03:02,309] [INFO] Task succeeded: CheckM
[2024-01-24 13:03:02,311] [INFO] Completeness check finished.
--------------------------------------------------------------------------------
Completeness: 100.00%
Contamination: 0.00%
Strain heterogeneity: 0.00%
--------------------------------------------------------------------------------
[2024-01-24 13:03:02,332] [INFO] ===== Completeness check finished =====
[2024-01-24 13:03:02,333] [INFO] ===== Start GTDB Search =====
[2024-01-24 13:03:02,333] [INFO] Query marker FASTA already exists. Will reuse it. (GCF_025136295.1_ASM2513629v1_genomic.fna/markers.fasta)
[2024-01-24 13:03:02,333] [INFO] Task started: Blastn
[2024-01-24 13:03:02,334] [INFO] Running command: blastn -query GCF_025136295.1_ASM2513629v1_genomic.fna/markers.fasta -db /var/lib/cwl/stga6cdbaea-2b11-4225-8afb-aa8d1eb39d2d/dqc_reference/reference_markers_gtdb.fasta -out GCF_025136295.1_ASM2513629v1_genomic.fna/blast.markers.gtdb.tsv -outfmt 6 -max_hsps 1 -num_alignments 5
[2024-01-24 13:03:03,642] [INFO] Task succeeded: Blastn
[2024-01-24 13:03:03,647] [INFO] Selected 30 target genomes.
[2024-01-24 13:03:03,647] [INFO] Target genome list was written to GCF_025136295.1_ASM2513629v1_genomic.fna/target_genomes_gtdb.txt
[2024-01-24 13:03:03,682] [INFO] Task started: fastANI
[2024-01-24 13:03:03,682] [INFO] Running command: fastANI --query /var/lib/cwl/stg766eed70-81e4-4ed8-b7fd-624ca6dd9120/GCF_025136295.1_ASM2513629v1_genomic.fna.gz --refList GCF_025136295.1_ASM2513629v1_genomic.fna/target_genomes_gtdb.txt --output GCF_025136295.1_ASM2513629v1_genomic.fna/fastani_result_gtdb.tsv --threads 1
[2024-01-24 13:03:26,719] [INFO] Task succeeded: fastANI
[2024-01-24 13:03:26,744] [INFO] Found 26 fastANI hits (0 hits with ANI > circumscription radius)
[2024-01-24 13:03:26,744] [INFO] GTDB search result
--------------------------------------------------------------------------------
accession	gtdb_species	ani	matched_fragments	total_fragments	gtdb_taxonomy	ani_circumscription_radius	mean_intra_species_ani	min_intra_species_ani	mean_intra_species_af	min_intra_species_af	num_clustered_genomes	status
GCF_003385955.1	s__Aestuariispira insulae	77.6008	283	1453	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__UBA8366;f__GCA-2696645;g__Aestuariispira	95.0	N/A	N/A	N/A	N/A	1	-
GCA_017792345.1	s__GCA-2696645 sp017792345	77.1849	185	1453	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__UBA8366;f__GCA-2696645;g__GCA-2696645	95.0	95.39	95.37	0.96	0.95	3	-
GCF_003119195.2	s__Azospirillum baldaniorum	76.9094	107	1453	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Azospirillales;f__Azospirillaceae;g__Azospirillum	95.0321	97.20	95.48	0.90	0.83	11	-
GCA_012926505.1	s__GCA-2696645 sp012926505	76.8441	159	1453	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__UBA8366;f__GCA-2696645;g__GCA-2696645	95.0	N/A	N/A	N/A	N/A	1	-
GCF_009495745.1	s__Niveispirillum sp009495745	76.4389	86	1453	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Azospirillales;f__Azospirillaceae;g__Niveispirillum	95.0	N/A	N/A	N/A	N/A	1	-
GCF_017963645.1	s__Marivibrio halodurans	76.3831	133	1453	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__UBA8366;f__GCA-2696645;g__Marivibrio	95.0	N/A	N/A	N/A	N/A	1	-
GCF_000960975.1	s__Martelella endophytica	76.3227	53	1453	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhizobiales;f__Rhizobiaceae;g__Martelella	95.0	N/A	N/A	N/A	N/A	1	-
GCF_003336875.1	s__Oleisolibacter albus	76.3095	83	1453	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Azospirillales;f__Azospirillaceae;g__Oleisolibacter	95.0	N/A	N/A	N/A	N/A	1	-
GCF_004363235.1	s__Dongia mobilis	76.2904	89	1453	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Dongiales;f__Dongiaceae;g__Dongia	95.0	N/A	N/A	N/A	N/A	1	-
GCA_002238725.1	s__Bin65 sp002238725	76.2485	61	1453	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Bin65;f__Bin65;g__Bin65	95.0	N/A	N/A	N/A	N/A	1	-
GCF_004799325.1	s__Thalassobius vesicularis	76.1912	70	1453	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhodobacterales;f__Rhodobacteraceae;g__Thalassobius	95.0	N/A	N/A	N/A	N/A	1	-
GCA_002709955.1	s__GCA-2696645 sp002709955	76.1866	92	1453	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__UBA8366;f__GCA-2696645;g__GCA-2696645	95.0	99.86	99.71	0.96	0.91	10	-
GCF_014192915.1	s__Azospirillum sp014192915	76.0857	99	1453	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Azospirillales;f__Azospirillaceae;g__Azospirillum	95.0	N/A	N/A	N/A	N/A	1	-
GCA_002168225.1	s__TMED2 sp002168225	76.0203	62	1453	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__TMED2;f__TMED2;g__TMED2	95.0	99.94	99.84	0.98	0.96	15	-
GCA_002696645.1	s__GCA-2696645 sp002696645	75.9864	91	1453	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__UBA8366;f__GCA-2696645;g__GCA-2696645	95.0	99.85	99.85	0.96	0.96	2	-
GCF_000936425.1	s__Skermanella aerolata	75.9812	97	1453	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Azospirillales;f__Azospirillaceae;g__Skermanella	95.0	99.99	99.99	0.98	0.98	2	-
GCA_009649675.1	s__HT1-32 sp009649675	75.9423	62	1453	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhodospirillales;f__HT1-32;g__HT1-32	95.0	N/A	N/A	N/A	N/A	1	-
GCF_016595245.1	s__Azospirillum sp016595245	75.919	108	1453	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Azospirillales;f__Azospirillaceae;g__Azospirillum	95.0	N/A	N/A	N/A	N/A	1	-
GCF_003326755.1	s__Thalassospira profundimaris_B	75.8786	92	1453	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhodospirillales;f__Thalassospiraceae;g__Thalassospira	95.0	N/A	N/A	N/A	N/A	1	-
GCF_007827815.1	s__Azospirillum brasilense_C	75.8733	104	1453	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Azospirillales;f__Azospirillaceae;g__Azospirillum	95.0	97.36	95.98	0.93	0.90	3	-
GCF_009711185.1	s__Paracoccus limosus	75.8504	59	1453	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhodobacterales;f__Rhodobacteraceae;g__Paracoccus	95.0	N/A	N/A	N/A	N/A	1	-
GCF_001907695.1	s__Thalassospira sp001907695	75.8291	82	1453	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhodospirillales;f__Thalassospiraceae;g__Thalassospira	95.0	N/A	N/A	N/A	N/A	1	-
GCF_017876095.1	s__Azospirillum agricola	75.8036	99	1453	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Azospirillales;f__Azospirillaceae;g__Azospirillum	95.0	96.88	96.88	0.87	0.87	2	-
GCA_016212565.1	s__JACRFY01 sp016212565	75.6606	52	1453	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhodospirillales;f__Magnetospirillaceae;g__JACRFY01	95.0	N/A	N/A	N/A	N/A	1	-
GCA_903843125.1	s__CAIMOR01 sp903843125	75.5345	60	1453	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Acetobacterales;f__Acetobacteraceae;g__CAIMOR01	95.0	99.35	99.29	0.86	0.84	4	-
GCA_017307275.1	s__Reyranella sp017307275	75.3003	50	1453	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Reyranellales;f__Reyranellaceae;g__Reyranella	95.0	N/A	N/A	N/A	N/A	1	-
--------------------------------------------------------------------------------
[2024-01-24 13:03:26,746] [INFO] GTDB search result was written to GCF_025136295.1_ASM2513629v1_genomic.fna/result_gtdb.tsv
[2024-01-24 13:03:26,746] [INFO] ===== GTDB Search completed =====
[2024-01-24 13:03:26,751] [INFO] DFAST_QC result json was written to GCF_025136295.1_ASM2513629v1_genomic.fna/dqc_result.json
[2024-01-24 13:03:26,751] [INFO] DFAST_QC completed!
[2024-01-24 13:03:26,751] [INFO] Total running time: 0h1m47s
