[2023-06-30 12:26:54,669] [INFO] DFAST_QC pipeline started.
[2023-06-30 12:26:54,671] [INFO] DFAST_QC version: 0.5.7
[2023-06-30 12:26:54,671] [INFO] DQC Reference Directory: /var/lib/cwl/stg31aa7e01-5cc1-4d97-9ca4-87b76a5d64f5/dqc_reference
[2023-06-30 12:26:55,950] [INFO] ===== Start taxonomy check using ANI =====
[2023-06-30 12:26:55,951] [INFO] Task started: Prodigal
[2023-06-30 12:26:55,951] [INFO] Running command: gunzip -c /var/lib/cwl/stg44a8c2d7-bb96-4a51-8932-09d1937a79e4/GCA_016708475.1_ASM1670847v1_genomic.fna.gz | prodigal -d GCA_016708475.1_ASM1670847v1_genomic.fna/cds.fna -a GCA_016708475.1_ASM1670847v1_genomic.fna/protein.faa -g 11 -q > /dev/null
[2023-06-30 12:27:19,880] [INFO] Task succeeded: Prodigal
[2023-06-30 12:27:19,881] [INFO] Task started: HMMsearch
[2023-06-30 12:27:19,881] [INFO] Running command: hmmsearch --tblout GCA_016708475.1_ASM1670847v1_genomic.fna/hmmer_result.tsv -E 1E-50 /var/lib/cwl/stg31aa7e01-5cc1-4d97-9ca4-87b76a5d64f5/dqc_reference/reference_markers.hmm GCA_016708475.1_ASM1670847v1_genomic.fna/protein.faa > /dev/null
[2023-06-30 12:27:20,235] [INFO] Task succeeded: HMMsearch
[2023-06-30 12:27:20,237] [INFO] Found 6/6 markers.
[2023-06-30 12:27:20,305] [INFO] Query marker FASTA was written to GCA_016708475.1_ASM1670847v1_genomic.fna/markers.fasta
[2023-06-30 12:27:20,305] [INFO] Task started: Blastn
[2023-06-30 12:27:20,305] [INFO] Running command: blastn -query GCA_016708475.1_ASM1670847v1_genomic.fna/markers.fasta -db /var/lib/cwl/stg31aa7e01-5cc1-4d97-9ca4-87b76a5d64f5/dqc_reference/reference_markers.fasta -out GCA_016708475.1_ASM1670847v1_genomic.fna/blast.markers.tsv -outfmt 6 -max_hsps 1 -num_alignments 5
[2023-06-30 12:27:21,096] [INFO] Task succeeded: Blastn
[2023-06-30 12:27:21,100] [INFO] Selected 36 target genomes.
[2023-06-30 12:27:21,101] [INFO] Target genome list was written to GCA_016708475.1_ASM1670847v1_genomic.fna/target_genomes.txt
[2023-06-30 12:27:21,109] [INFO] Task started: fastANI
[2023-06-30 12:27:21,110] [INFO] Running command: fastANI --query /var/lib/cwl/stg44a8c2d7-bb96-4a51-8932-09d1937a79e4/GCA_016708475.1_ASM1670847v1_genomic.fna.gz --refList GCA_016708475.1_ASM1670847v1_genomic.fna/target_genomes.txt --output GCA_016708475.1_ASM1670847v1_genomic.fna/fastani_result.tsv --threads 1
[2023-06-30 12:27:57,776] [INFO] Task succeeded: fastANI
[2023-06-30 12:27:57,777] [INFO] Loading species specific ANI threshold from /var/lib/cwl/stg31aa7e01-5cc1-4d97-9ca4-87b76a5d64f5/dqc_reference/prokaryote_ANI_species_specific_threshold.txt
[2023-06-30 12:27:57,777] [WARNING] Species-specific ANI threshold file not found. Will use the default threshold for all species. [/var/lib/cwl/stg31aa7e01-5cc1-4d97-9ca4-87b76a5d64f5/dqc_reference/prokaryote_ANI_species_specific_threshold.txt]
[2023-06-30 12:27:57,802] [INFO] Found 35 fastANI hits (0 hits with ANI > threshold)
[2023-06-30 12:27:57,802] [INFO] The taxonomy check result is classified as 'below_threshold'.
[2023-06-30 12:27:57,803] [INFO] DFAST Taxonomy check final result
--------------------------------------------------------------------------------
organism_name	strain	accession	taxid	species_taxid	relation_to_type	validated	ani	matched_fragments	total_fragments	ani_threshold	status
Polyangium aurulentum	strain=SDU3-1	GCA_005144635.2	2567896	2567896	type	True	75.5283	646	2844	95	below_threshold
Labilithrix luteola	strain=DSM 27648	GCA_001263205.1	1391654	1391654	type	True	75.4993	417	2844	95	below_threshold
Thioalkalivibrio sulfidiphilus	strain=HL-EbGR7	GCA_000021985.1	1033854	1033854	type	True	75.4149	100	2844	95	below_threshold
Polyangium spumosum	strain=DSM 14734	GCA_009649845.1	889282	889282	type	True	75.3299	612	2844	95	below_threshold
Microvirga thermotolerans	strain=HR1	GCA_009363855.1	2651334	2651334	type	True	75.2838	140	2844	95	below_threshold
Myxococcus stipitatus	strain=DSM 14675	GCA_000331735.1	83455	83455	neotype	True	75.2641	393	2844	95	below_threshold
Sandaracinus amylolyticus	strain=DSM 53668	GCA_000737325.2	927083	927083	type	True	75.262	756	2844	95	below_threshold
Polyangium fumosum	strain=DSM 14668	GCA_005144585.1	889272	889272	neotype	True	75.2468	606	2844	95	below_threshold
Luteimonas mephitis	strain=DSM 12574	GCA_000422305.1	83615	83615	type	True	75.2144	134	2844	95	below_threshold
Rhodocyclus purpureus	strain=DSM 168	GCA_016653115.1	1067	1067	type	True	75.2093	128	2844	95	below_threshold
Myxococcus fulvus	strain=DSM 16525	GCA_900111765.1	33	33	type	True	75.1888	432	2844	95	below_threshold
Luteimonas terricola	strain=CGMCC 1.8985	GCA_014645675.1	645597	645597	type	True	75.1446	153	2844	95	below_threshold
Pelagibius marinus	strain=NBU2595	GCA_014925385.1	2762760	2762760	type	True	75.1391	165	2844	95	below_threshold
Luteimonas aquatica	strain=RIB1-20	GCA_022662575.1	450364	450364	type	True	75.0825	216	2844	95	below_threshold
Marinicauda algicola	strain=RMAR8-3	GCA_017161425.1	2029849	2029849	type	True	75.0532	145	2844	95	below_threshold
Marinicauda algicola	strain=JCM 31718	GCA_004793685.1	2029849	2029849	type	True	75.0236	140	2844	95	below_threshold
Paenibacillus albicereus	strain=UniB2	GCA_012676905.1	2726185	2726185	type	True	75.0224	114	2844	95	below_threshold
Longimicrobium terrae	strain=CB-286315	GCA_013000925.1	1639882	1639882	type	True	74.9959	224	2844	95	below_threshold
Salinarimonas rosea	strain=DSM 21201	GCA_000429045.1	552063	552063	type	True	74.9461	361	2844	95	below_threshold
Salinarimonas ramus	strain=CGMCC 1.9161	GCA_014645695.1	690164	690164	type	True	74.8808	310	2844	95	below_threshold
Longimicrobium terrae	strain=DSM 29007	GCA_014202995.1	1639882	1639882	type	True	74.8603	219	2844	95	below_threshold
Longimicrobium terrae	strain=CECT 8660	GCA_014198875.1	1639882	1639882	type	True	74.8598	220	2844	95	below_threshold
Luteimonas huabeiensis	strain=HB2	GCA_000559025.1	1244513	1244513	type	True	74.8496	290	2844	95	below_threshold
Sphingomonas yunnanensis	strain=YIM 3	GCA_019898765.1	310400	310400	type	True	74.8479	261	2844	95	below_threshold
Methylobacterium isbiliense	strain=DSM 17168	GCA_022179325.1	315478	315478	type	True	74.8397	372	2844	95	below_threshold
Sphingomonas jejuensis	strain=DSM 27651	GCA_011927695.1	904715	904715	type	True	74.8095	127	2844	95	below_threshold
Amycolatopsis arida	strain=DSM 45648	GCA_004365925.1	587909	587909	type	True	74.8085	364	2844	95	below_threshold
Achromobacter pulmonis	strain=LMG 26696	GCA_902859765.1	1389932	1389932	type	True	74.7966	165	2844	95	below_threshold
Paenibacillus pasadenensis	strain=DSM 19293	GCA_000422485.1	217090	217090	type	True	74.7908	127	2844	95	below_threshold
Kribbella speibonae	strain=YM55	GCA_004331375.1	1572660	1572660	type	True	74.7877	244	2844	95	below_threshold
Amycolatopsis arida	strain=CGMCC 4.5579	GCA_900115565.1	587909	587909	type	True	74.7873	359	2844	95	below_threshold
Burkholderia glumae	strain=ATCC 33617	GCA_000960995.1	337	337	type	True	74.7867	290	2844	95	below_threshold
Kribbella sindirgiensis	strain=DSM 27082	GCA_004331435.1	1124744	1124744	type	True	74.783	224	2844	95	below_threshold
Paenibacillus pasadenensis	strain=NBRC 101214	GCA_004001085.1	217090	217090	type	True	74.7824	121	2844	95	below_threshold
Burkholderia glumae	strain=LMG 2196	GCA_902832765.1	337	337	type	True	74.7652	284	2844	95	below_threshold
--------------------------------------------------------------------------------
[2023-06-30 12:27:57,805] [INFO] DFAST Taxonomy check result was written to GCA_016708475.1_ASM1670847v1_genomic.fna/tc_result.tsv
[2023-06-30 12:27:57,806] [INFO] ===== Taxonomy check completed =====
[2023-06-30 12:27:57,807] [INFO] ===== Start completeness check using CheckM =====
[2023-06-30 12:27:57,807] [INFO] Setting CHECKM_DATA_PATH to /var/lib/cwl/stg31aa7e01-5cc1-4d97-9ca4-87b76a5d64f5/dqc_reference/checkm_data
[2023-06-30 12:27:57,809] [INFO] Selected 'Prokaryote' markers (life, taxid=0) for CheckM
[2023-06-30 12:27:57,888] [INFO] Task started: CheckM
[2023-06-30 12:27:57,889] [INFO] Running command: checkm taxonomy_wf --tab_table -f GCA_016708475.1_ASM1670847v1_genomic.fna/cc_result.tsv -t 1 life "Prokaryote" GCA_016708475.1_ASM1670847v1_genomic.fna/checkm_input GCA_016708475.1_ASM1670847v1_genomic.fna/checkm_result
[2023-06-30 12:29:18,493] [INFO] Task succeeded: CheckM
[2023-06-30 12:29:18,494] [INFO] Completeness check finished.
--------------------------------------------------------------------------------
Completeness: 85.61%
Contamination: 5.09%
Strain heterogeneity: 0.00%
--------------------------------------------------------------------------------
[2023-06-30 12:29:18,523] [INFO] ===== Completeness check finished =====
[2023-06-30 12:29:18,524] [INFO] ===== Start GTDB Search =====
[2023-06-30 12:29:18,524] [INFO] Query marker FASTA already exists. Will reuse it. (GCA_016708475.1_ASM1670847v1_genomic.fna/markers.fasta)
[2023-06-30 12:29:18,525] [INFO] Task started: Blastn
[2023-06-30 12:29:18,525] [INFO] Running command: blastn -query GCA_016708475.1_ASM1670847v1_genomic.fna/markers.fasta -db /var/lib/cwl/stg31aa7e01-5cc1-4d97-9ca4-87b76a5d64f5/dqc_reference/reference_markers_gtdb.fasta -out GCA_016708475.1_ASM1670847v1_genomic.fna/blast.markers.gtdb.tsv -outfmt 6 -max_hsps 1 -num_alignments 5
[2023-06-30 12:29:19,802] [INFO] Task succeeded: Blastn
[2023-06-30 12:29:19,808] [INFO] Selected 17 target genomes.
[2023-06-30 12:29:19,808] [INFO] Target genome list was written to GCA_016708475.1_ASM1670847v1_genomic.fna/target_genomes_gtdb.txt
[2023-06-30 12:29:19,818] [INFO] Task started: fastANI
[2023-06-30 12:29:19,818] [INFO] Running command: fastANI --query /var/lib/cwl/stg44a8c2d7-bb96-4a51-8932-09d1937a79e4/GCA_016708475.1_ASM1670847v1_genomic.fna.gz --refList GCA_016708475.1_ASM1670847v1_genomic.fna/target_genomes_gtdb.txt --output GCA_016708475.1_ASM1670847v1_genomic.fna/fastani_result_gtdb.tsv --threads 1
[2023-06-30 12:29:48,801] [INFO] Task succeeded: fastANI
[2023-06-30 12:29:48,818] [INFO] Found 17 fastANI hits (2 hits with ANI > circumscription radius)
[2023-06-30 12:29:48,818] [INFO] GTDB search result
--------------------------------------------------------------------------------
accession	gtdb_species	ani	matched_fragments	total_fragments	gtdb_taxonomy	ani_circumscription_radius	mean_intra_species_ani	min_intra_species_ani	mean_intra_species_af	min_intra_species_af	num_clustered_genomes	status
GCA_016704445.1	s__SCUS01 sp016704445	98.3272	2521	2844	d__Bacteria;p__Myxococcota;c__Polyangia;o__Polyangiales;f__Ga0077539;g__SCUS01	95.0	98.47	98.28	0.90	0.89	4	inconclusive
GCA_004297725.1	s__SCUS01 sp004297725	95.0593	2231	2844	d__Bacteria;p__Myxococcota;c__Polyangia;o__Polyangiales;f__Ga0077539;g__SCUS01	95.0	N/A	N/A	N/A	N/A	1	inconclusive
GCA_903845045.1	s__SCUS01 sp903845045	79.9607	1193	2844	d__Bacteria;p__Myxococcota;c__Polyangia;o__Polyangiales;f__Ga0077539;g__SCUS01	95.0	99.02	99.02	0.83	0.83	2	-
GCA_016792845.1	s__JAEUKH01 sp016792845	77.0535	845	2844	d__Bacteria;p__Myxococcota;c__Polyangia;o__Polyangiales;f__Ga0077539;g__JAEUKH01	95.0	N/A	N/A	N/A	N/A	1	-
GCA_016218725.1	s__JACRDB01 sp016218725	76.7438	757	2844	d__Bacteria;p__Myxococcota;c__Polyangia;o__Polyangiales;f__Ga0077539;g__JACRDB01	95.0	N/A	N/A	N/A	N/A	1	-
GCA_001465015.1	s__Ga0077539 sp001465015	75.8961	532	2844	d__Bacteria;p__Myxococcota;c__Polyangia;o__Polyangiales;f__Ga0077539;g__Ga0077539	95.0	N/A	N/A	N/A	N/A	1	-
GCA_903893335.1	s__CAITLX01 sp903893335	75.7626	481	2844	d__Bacteria;p__Myxococcota;c__Polyangia;o__Polyangiales;f__Polyangiaceae;g__CAITLX01	95.0	N/A	N/A	N/A	N/A	1	-
GCA_013298625.1	s__Ga0077539 sp013298625	75.7089	706	2844	d__Bacteria;p__Myxococcota;c__Polyangia;o__Polyangiales;f__Ga0077539;g__Ga0077539	95.0	N/A	N/A	N/A	N/A	1	-
GCA_016793325.1	s__Ga0077539 sp016793325	75.6582	607	2844	d__Bacteria;p__Myxococcota;c__Polyangia;o__Polyangiales;f__Ga0077539;g__Ga0077539	95.0	N/A	N/A	N/A	N/A	1	-
GCF_002305855.1	s__Melittangium boletus	75.4006	342	2844	d__Bacteria;p__Myxococcota;c__Myxococcia;o__Myxococcales;f__Myxococcaceae;g__Melittangium	95.0	N/A	N/A	N/A	N/A	1	-
GCA_016707895.1	s__UBA2376 sp016707895	75.2481	734	2844	d__Bacteria;p__Myxococcota;c__Polyangia;o__Haliangiales;f__Haliangiaceae;g__UBA2376	95.0	99.64	99.63	0.96	0.96	3	-
GCA_016710095.1	s__CAIXRL01 sp016710095	75.1941	247	2844	d__Bacteria;p__Eisenbacteria;c__RBG-16-71-46;o__RBG-16-71-46;f__RBG-16-71-46;g__CAIXRL01	95.0	N/A	N/A	N/A	N/A	1	-
GCF_003611735.1	s__Corallococcus sicarius	75.1818	399	2844	d__Bacteria;p__Myxococcota;c__Myxococcia;o__Myxococcales;f__Myxococcaceae;g__Corallococcus	95.0	N/A	N/A	N/A	N/A	1	-
GCF_003149435.1	s__E85 sp003149435	75.1725	176	2844	d__Bacteria;p__Proteobacteria;c__Gammaproteobacteria;o__Nitrococcales;f__Nitrococcaceae;g__E85	95.0	N/A	N/A	N/A	N/A	1	-
GCF_012676905.1	s__Paenibacillus_O sp012676905	75.005	110	2844	d__Bacteria;p__Firmicutes;c__Bacilli;o__Paenibacillales;f__Paenibacillaceae;g__Paenibacillus_O	95.0	N/A	N/A	N/A	N/A	1	-
GCF_000422485.1	s__Paenibacillus_O pasadenensis	74.7902	127	2844	d__Bacteria;p__Firmicutes;c__Bacilli;o__Paenibacillales;f__Paenibacillaceae;g__Paenibacillus_O	95.0	99.23	98.68	0.94	0.90	4	-
GCA_003164475.1	s__Bog-877 sp003164475	74.6542	57	2844	d__Bacteria;p__Dormibacterota;c__Dormibacteria;o__UBA8260;f__Bog-877;g__Bog-877	95.0	N/A	N/A	N/A	N/A	1	-
--------------------------------------------------------------------------------
[2023-06-30 12:29:48,821] [INFO] GTDB search result was written to GCA_016708475.1_ASM1670847v1_genomic.fna/result_gtdb.tsv
[2023-06-30 12:29:48,821] [INFO] ===== GTDB Search completed =====
[2023-06-30 12:29:48,827] [INFO] DFAST_QC result json was written to GCA_016708475.1_ASM1670847v1_genomic.fna/dqc_result.json
[2023-06-30 12:29:48,827] [INFO] DFAST_QC completed!
[2023-06-30 12:29:48,827] [INFO] Total running time: 0h2m54s
