[2024-01-24 13:17:10,856] [INFO] DFAST_QC pipeline started.
[2024-01-24 13:17:10,858] [INFO] DFAST_QC version: 0.5.7
[2024-01-24 13:17:10,858] [INFO] DQC Reference Directory: /var/lib/cwl/stg4e3b5464-9cce-470a-8fad-3453b54cf924/dqc_reference
[2024-01-24 13:17:12,093] [INFO] ===== Start taxonomy check using ANI =====
[2024-01-24 13:17:12,094] [INFO] Task started: Prodigal
[2024-01-24 13:17:12,094] [INFO] Running command: gunzip -c /var/lib/cwl/stg53c75546-5ade-4a5b-b500-c1f144438858/GCF_004362855.1_ASM436285v1_genomic.fna.gz | prodigal -d GCF_004362855.1_ASM436285v1_genomic.fna/cds.fna -a GCF_004362855.1_ASM436285v1_genomic.fna/protein.faa -g 11 -q > /dev/null
[2024-01-24 13:17:26,927] [INFO] Task succeeded: Prodigal
[2024-01-24 13:17:26,928] [INFO] Task started: HMMsearch
[2024-01-24 13:17:26,928] [INFO] Running command: hmmsearch --tblout GCF_004362855.1_ASM436285v1_genomic.fna/hmmer_result.tsv -E 1E-50 /var/lib/cwl/stg4e3b5464-9cce-470a-8fad-3453b54cf924/dqc_reference/reference_markers.hmm GCF_004362855.1_ASM436285v1_genomic.fna/protein.faa > /dev/null
[2024-01-24 13:17:27,196] [INFO] Task succeeded: HMMsearch
[2024-01-24 13:17:27,197] [INFO] Found 6/6 markers.
[2024-01-24 13:17:27,242] [INFO] Query marker FASTA was written to GCF_004362855.1_ASM436285v1_genomic.fna/markers.fasta
[2024-01-24 13:17:27,243] [INFO] Task started: Blastn
[2024-01-24 13:17:27,243] [INFO] Running command: blastn -query GCF_004362855.1_ASM436285v1_genomic.fna/markers.fasta -db /var/lib/cwl/stg4e3b5464-9cce-470a-8fad-3453b54cf924/dqc_reference/reference_markers.fasta -out GCF_004362855.1_ASM436285v1_genomic.fna/blast.markers.tsv -outfmt 6 -max_hsps 1 -num_alignments 5
[2024-01-24 13:17:28,189] [INFO] Task succeeded: Blastn
[2024-01-24 13:17:28,194] [INFO] Selected 22 target genomes.
[2024-01-24 13:17:28,194] [INFO] Target genome list was written to GCF_004362855.1_ASM436285v1_genomic.fna/target_genomes.txt
[2024-01-24 13:17:28,244] [INFO] Task started: fastANI
[2024-01-24 13:17:28,245] [INFO] Running command: fastANI --query /var/lib/cwl/stg53c75546-5ade-4a5b-b500-c1f144438858/GCF_004362855.1_ASM436285v1_genomic.fna.gz --refList GCF_004362855.1_ASM436285v1_genomic.fna/target_genomes.txt --output GCF_004362855.1_ASM436285v1_genomic.fna/fastani_result.tsv --threads 1
[2024-01-24 13:17:47,024] [INFO] Task succeeded: fastANI
[2024-01-24 13:17:47,024] [INFO] Loading species specific ANI threshold from /var/lib/cwl/stg4e3b5464-9cce-470a-8fad-3453b54cf924/dqc_reference/prokaryote_ANI_species_specific_threshold.txt
[2024-01-24 13:17:47,025] [WARNING] Species-specific ANI threshold file not found. Will use the default threshold for all species. [/var/lib/cwl/stg4e3b5464-9cce-470a-8fad-3453b54cf924/dqc_reference/prokaryote_ANI_species_specific_threshold.txt]
[2024-01-24 13:17:47,043] [INFO] Found 22 fastANI hits (1 hits with ANI > threshold)
[2024-01-24 13:17:47,044] [INFO] The taxonomy check result is classified as 'conclusive'.
[2024-01-24 13:17:47,044] [INFO] DFAST Taxonomy check final result
--------------------------------------------------------------------------------
organism_name	strain	accession	taxid	species_taxid	relation_to_type	validated	ani	matched_fragments	total_fragments	ani_threshold	status
Aquabacterium commune	strain=DSM 11901	GCA_004362855.1	70586	70586	type	True	100.0	1382	1383	95	conclusive
Aquabacterium parvum	strain=B6	GCA_001447195.1	70584	70584	type	True	85.1001	903	1383	95	below_threshold
Aquabacterium olei	strain=NBRC 110486	GCA_003100395.1	1296669	1296669	type	True	83.2679	764	1383	95	below_threshold
Aquabacterium soli	strain=SJQ9	GCA_003933735.1	2493092	2493092	type	True	82.8485	768	1383	95	below_threshold
Aquabacterium lacunae	strain=KMB7	GCA_004310865.1	2528630	2528630	type	True	82.3611	686	1383	95	below_threshold
Aquabacterium fontiphilum	strain=CS-6	GCA_009909205.1	450365	450365	type	True	82.1514	687	1383	95	below_threshold
Ideonella dechloratans	strain=CCUG 30977	GCA_021049305.1	36863	36863	type	True	79.9711	564	1383	95	below_threshold
Schlegelella thermodepolymerans	strain=DSM 15344	GCA_002933415.1	215580	215580	type	True	79.8658	521	1383	95	below_threshold
Sphaerotilus hippei	strain=DSM 566	GCA_003201595.1	744406	744406	type	True	79.8596	553	1383	95	below_threshold
Schlegelella thermodepolymerans	strain=DSM 15344	GCA_015476235.1	215580	215580	type	True	79.7954	537	1383	95	below_threshold
Sphaerotilus natans	strain=ATCC 13338	GCA_900156335.1	34103	34103	type	True	79.7771	525	1383	95	below_threshold
Sphaerotilus natans subsp. natans	strain=DSM 6575	GCA_000689195.1	882627	34103	type	True	79.7722	500	1383	95	below_threshold
Sphaerotilus sulfidivorans	strain=D-501	GCA_013426975.1	639200	639200	type	True	79.6723	511	1383	95	below_threshold
Ideonella benzenivorans	strain=B7	GCA_020387415.1	2831643	2831643	type	True	79.427	531	1383	95	below_threshold
Azohydromonas aeria	strain=CFCC 13393	GCA_009760915.1	2590212	2590212	type	True	79.4188	576	1383	95	below_threshold
Hydrogenophaga crocea	strain=BA0156	GCA_011388215.1	2716225	2716225	type	True	79.2704	501	1383	95	below_threshold
Schlegelella brevitalea	strain=DSM 7029	GCA_001017435.1	413882	413882	type	True	79.1475	459	1383	95	below_threshold
Hydrogenophaga palleronii	strain=NBRC 102513	GCA_001571225.1	65655	65655	type	True	79.1112	450	1383	95	below_threshold
Ideonella paludis	strain=KCTC 32238	GCA_018069865.1	1233411	1233411	type	True	78.8075	370	1383	95	below_threshold
Ramlibacter alkalitolerans	strain=KACC 19305	GCA_016722765.1	2039631	2039631	type	True	78.4956	417	1383	95	below_threshold
Inhella gelatinilytica	strain=4Y10	GCA_016093295.1	2795030	2795030	type	True	78.3726	256	1383	95	below_threshold
Luteibacter pinisoli	strain=MAH-14	GCA_006385595.1	2589080	2589080	type	True	76.1282	124	1383	95	below_threshold
--------------------------------------------------------------------------------
[2024-01-24 13:17:47,046] [INFO] DFAST Taxonomy check result was written to GCF_004362855.1_ASM436285v1_genomic.fna/tc_result.tsv
[2024-01-24 13:17:47,050] [INFO] ===== Taxonomy check completed =====
[2024-01-24 13:17:47,050] [INFO] ===== Start completeness check using CheckM =====
[2024-01-24 13:17:47,050] [INFO] Setting CHECKM_DATA_PATH to /var/lib/cwl/stg4e3b5464-9cce-470a-8fad-3453b54cf924/dqc_reference/checkm_data
[2024-01-24 13:17:47,052] [INFO] Selected 'Prokaryote' markers (life, taxid=0) for CheckM
[2024-01-24 13:17:47,096] [INFO] Task started: CheckM
[2024-01-24 13:17:47,096] [INFO] Running command: checkm taxonomy_wf --tab_table -f GCF_004362855.1_ASM436285v1_genomic.fna/cc_result.tsv -t 1 life "Prokaryote" GCF_004362855.1_ASM436285v1_genomic.fna/checkm_input GCF_004362855.1_ASM436285v1_genomic.fna/checkm_result
[2024-01-24 13:19:52,092] [INFO] Task succeeded: CheckM
[2024-01-24 13:19:52,093] [INFO] Completeness check finished.
--------------------------------------------------------------------------------
Completeness: 100.00%
Contamination: 0.00%
Strain heterogeneity: 0.00%
--------------------------------------------------------------------------------
[2024-01-24 13:19:52,115] [INFO] ===== Completeness check finished =====
[2024-01-24 13:19:52,115] [INFO] ===== Start GTDB Search =====
[2024-01-24 13:19:52,116] [INFO] Query marker FASTA already exists. Will reuse it. (GCF_004362855.1_ASM436285v1_genomic.fna/markers.fasta)
[2024-01-24 13:19:52,116] [INFO] Task started: Blastn
[2024-01-24 13:19:52,116] [INFO] Running command: blastn -query GCF_004362855.1_ASM436285v1_genomic.fna/markers.fasta -db /var/lib/cwl/stg4e3b5464-9cce-470a-8fad-3453b54cf924/dqc_reference/reference_markers_gtdb.fasta -out GCF_004362855.1_ASM436285v1_genomic.fna/blast.markers.gtdb.tsv -outfmt 6 -max_hsps 1 -num_alignments 5
[2024-01-24 13:19:53,930] [INFO] Task succeeded: Blastn
[2024-01-24 13:19:53,935] [INFO] Selected 15 target genomes.
[2024-01-24 13:19:53,935] [INFO] Target genome list was written to GCF_004362855.1_ASM436285v1_genomic.fna/target_genomes_gtdb.txt
[2024-01-24 13:19:54,057] [INFO] Task started: fastANI
[2024-01-24 13:19:54,057] [INFO] Running command: fastANI --query /var/lib/cwl/stg53c75546-5ade-4a5b-b500-c1f144438858/GCF_004362855.1_ASM436285v1_genomic.fna.gz --refList GCF_004362855.1_ASM436285v1_genomic.fna/target_genomes_gtdb.txt --output GCF_004362855.1_ASM436285v1_genomic.fna/fastani_result_gtdb.tsv --threads 1
[2024-01-24 13:20:08,324] [INFO] Task succeeded: fastANI
[2024-01-24 13:20:08,339] [INFO] Found 15 fastANI hits (1 hits with ANI > circumscription radius)
[2024-01-24 13:20:08,340] [INFO] GTDB search result
--------------------------------------------------------------------------------
accession	gtdb_species	ani	matched_fragments	total_fragments	gtdb_taxonomy	ani_circumscription_radius	mean_intra_species_ani	min_intra_species_ani	mean_intra_species_af	min_intra_species_af	num_clustered_genomes	status
GCF_004362855.1	s__Aquabacterium commune	100.0	1382	1383	d__Bacteria;p__Proteobacteria;c__Gammaproteobacteria;o__Burkholderiales;f__Burkholderiaceae;g__Aquabacterium	95.0	95.44	95.44	0.90	0.90	2	conclusive
GCA_903894125.1	s__Aquabacterium sp903894125	93.5751	761	1383	d__Bacteria;p__Proteobacteria;c__Gammaproteobacteria;o__Burkholderiales;f__Burkholderiaceae;g__Aquabacterium	95.0	99.35	99.35	0.82	0.82	2	-
GCA_903871865.1	s__Aquabacterium sp903871865	92.6419	609	1383	d__Bacteria;p__Proteobacteria;c__Gammaproteobacteria;o__Burkholderiales;f__Burkholderiaceae;g__Aquabacterium	95.0	N/A	N/A	N/A	N/A	1	-
GCA_016000265.1	s__Aquabacterium sp016000265	86.1117	879	1383	d__Bacteria;p__Proteobacteria;c__Gammaproteobacteria;o__Burkholderiales;f__Burkholderiaceae;g__Aquabacterium	95.0	99.96	99.96	0.95	0.95	2	-
GCA_001770725.1	s__Aquabacterium sp001770725	85.5272	893	1383	d__Bacteria;p__Proteobacteria;c__Gammaproteobacteria;o__Burkholderiales;f__Burkholderiaceae;g__Aquabacterium	95.0	N/A	N/A	N/A	N/A	1	-
GCA_005502875.1	s__Aquabacterium sp005502875	85.1376	858	1383	d__Bacteria;p__Proteobacteria;c__Gammaproteobacteria;o__Burkholderiales;f__Burkholderiaceae;g__Aquabacterium	95.0	N/A	N/A	N/A	N/A	1	-
GCF_001447195.1	s__Aquabacterium parvum	85.1116	902	1383	d__Bacteria;p__Proteobacteria;c__Gammaproteobacteria;o__Burkholderiales;f__Burkholderiaceae;g__Aquabacterium	95.0	97.32	96.21	0.91	0.89	4	-
GCA_016195855.1	s__Aquabacterium sp016195855	82.8761	788	1383	d__Bacteria;p__Proteobacteria;c__Gammaproteobacteria;o__Burkholderiales;f__Burkholderiaceae;g__Aquabacterium	95.0	N/A	N/A	N/A	N/A	1	-
GCA_003963115.1	s__Aquabacterium sp003963115	82.8667	770	1383	d__Bacteria;p__Proteobacteria;c__Gammaproteobacteria;o__Burkholderiales;f__Burkholderiaceae;g__Aquabacterium	95.0	N/A	N/A	N/A	N/A	1	-
GCF_003201595.1	s__Sphaerotilus hippei	79.85	554	1383	d__Bacteria;p__Proteobacteria;c__Gammaproteobacteria;o__Burkholderiales;f__Burkholderiaceae;g__Sphaerotilus	95.0	N/A	N/A	N/A	N/A	1	-
GCF_013294065.1	s__Sphaerotilus sp013294065	79.7565	546	1383	d__Bacteria;p__Proteobacteria;c__Gammaproteobacteria;o__Burkholderiales;f__Burkholderiaceae;g__Sphaerotilus	95.0	N/A	N/A	N/A	N/A	1	-
GCA_014873965.1	s__VBDL01 sp014873965	79.5532	549	1383	d__Bacteria;p__Proteobacteria;c__Gammaproteobacteria;o__Burkholderiales;f__Burkholderiaceae;g__VBDL01	95.0	N/A	N/A	N/A	N/A	1	-
GCA_003265685.1	s__Piscinibacter caeni	79.3271	545	1383	d__Bacteria;p__Proteobacteria;c__Gammaproteobacteria;o__Burkholderiales;f__Burkholderiaceae;g__Piscinibacter	95.0	N/A	N/A	N/A	N/A	1	-
GCF_000799305.1	s__Rhizobacter sp000799305	78.7153	482	1383	d__Bacteria;p__Proteobacteria;c__Gammaproteobacteria;o__Burkholderiales;f__Burkholderiaceae;g__Rhizobacter	95.0	N/A	N/A	N/A	N/A	1	-
GCA_018240325.1	s__Rubrivivax sp018240325	78.3874	409	1383	d__Bacteria;p__Proteobacteria;c__Gammaproteobacteria;o__Burkholderiales;f__Burkholderiaceae;g__Rubrivivax	95.0	N/A	N/A	N/A	N/A	1	-
--------------------------------------------------------------------------------
[2024-01-24 13:20:08,344] [INFO] GTDB search result was written to GCF_004362855.1_ASM436285v1_genomic.fna/result_gtdb.tsv
[2024-01-24 13:20:08,345] [INFO] ===== GTDB Search completed =====
[2024-01-24 13:20:08,349] [INFO] DFAST_QC result json was written to GCF_004362855.1_ASM436285v1_genomic.fna/dqc_result.json
[2024-01-24 13:20:08,349] [INFO] DFAST_QC completed!
[2024-01-24 13:20:08,349] [INFO] Total running time: 0h2m57s
