[2023-06-18 17:35:10,429] [INFO] DFAST_QC pipeline started.
[2023-06-18 17:35:10,431] [INFO] DFAST_QC version: 0.5.7
[2023-06-18 17:35:10,431] [INFO] DQC Reference Directory: /var/lib/cwl/stg1319343e-53c8-4c2c-9299-0a80b768b40a/dqc_reference
[2023-06-18 17:35:11,732] [INFO] ===== Start taxonomy check using ANI =====
[2023-06-18 17:35:11,733] [INFO] Task started: Prodigal
[2023-06-18 17:35:11,733] [INFO] Running command: gunzip -c /var/lib/cwl/stg8ff0de62-1f72-423f-b218-91d00ac17dab/GCA_018679975.1_ASM1867997v1_genomic.fna.gz | prodigal -d GCA_018679975.1_ASM1867997v1_genomic.fna/cds.fna -a GCA_018679975.1_ASM1867997v1_genomic.fna/protein.faa -g 11 -q > /dev/null
[2023-06-18 17:35:24,864] [INFO] Task succeeded: Prodigal
[2023-06-18 17:35:24,864] [INFO] Task started: HMMsearch
[2023-06-18 17:35:24,864] [INFO] Running command: hmmsearch --tblout GCA_018679975.1_ASM1867997v1_genomic.fna/hmmer_result.tsv -E 1E-50 /var/lib/cwl/stg1319343e-53c8-4c2c-9299-0a80b768b40a/dqc_reference/reference_markers.hmm GCA_018679975.1_ASM1867997v1_genomic.fna/protein.faa > /dev/null
[2023-06-18 17:35:25,114] [INFO] Task succeeded: HMMsearch
[2023-06-18 17:35:25,116] [WARNING] Found 5/6 markers. [/var/lib/cwl/stg8ff0de62-1f72-423f-b218-91d00ac17dab/GCA_018679975.1_ASM1867997v1_genomic.fna.gz]
[2023-06-18 17:35:25,156] [INFO] Query marker FASTA was written to GCA_018679975.1_ASM1867997v1_genomic.fna/markers.fasta
[2023-06-18 17:35:25,156] [INFO] Task started: Blastn
[2023-06-18 17:35:25,157] [INFO] Running command: blastn -query GCA_018679975.1_ASM1867997v1_genomic.fna/markers.fasta -db /var/lib/cwl/stg1319343e-53c8-4c2c-9299-0a80b768b40a/dqc_reference/reference_markers.fasta -out GCA_018679975.1_ASM1867997v1_genomic.fna/blast.markers.tsv -outfmt 6 -max_hsps 1 -num_alignments 5
[2023-06-18 17:35:25,822] [INFO] Task succeeded: Blastn
[2023-06-18 17:35:25,827] [INFO] Selected 26 target genomes.
[2023-06-18 17:35:25,827] [INFO] Target genome list was written to GCA_018679975.1_ASM1867997v1_genomic.fna/target_genomes.txt
[2023-06-18 17:35:25,835] [INFO] Task started: fastANI
[2023-06-18 17:35:25,835] [INFO] Running command: fastANI --query /var/lib/cwl/stg8ff0de62-1f72-423f-b218-91d00ac17dab/GCA_018679975.1_ASM1867997v1_genomic.fna.gz --refList GCA_018679975.1_ASM1867997v1_genomic.fna/target_genomes.txt --output GCA_018679975.1_ASM1867997v1_genomic.fna/fastani_result.tsv --threads 1
[2023-06-18 17:35:42,788] [INFO] Task succeeded: fastANI
[2023-06-18 17:35:42,789] [INFO] Loading species specific ANI threshold from /var/lib/cwl/stg1319343e-53c8-4c2c-9299-0a80b768b40a/dqc_reference/prokaryote_ANI_species_specific_threshold.txt
[2023-06-18 17:35:42,789] [WARNING] Species-specific ANI threshold file not found. Will use the default threshold for all species. [/var/lib/cwl/stg1319343e-53c8-4c2c-9299-0a80b768b40a/dqc_reference/prokaryote_ANI_species_specific_threshold.txt]
[2023-06-18 17:35:42,810] [INFO] Found 23 fastANI hits (0 hits with ANI > threshold)
[2023-06-18 17:35:42,810] [INFO] The taxonomy check result is classified as 'below_threshold'.
[2023-06-18 17:35:42,811] [INFO] DFAST Taxonomy check final result
--------------------------------------------------------------------------------
organism_name	strain	accession	taxid	species_taxid	relation_to_type	validated	ani	matched_fragments	total_fragments	ani_threshold	status
Maribacter arenosus	strain=CAU 1321	GCA_014610845.1	1854708	1854708	type	True	78.0308	246	652	95	below_threshold
Maribacter polysiphoniae	strain=KCTC 22021	GCA_014673435.1	429344	429344	type	True	77.826	213	652	95	below_threshold
Maribacter polysiphoniae	strain=DSM 23514	GCA_003148665.1	429344	429344	type	True	77.8197	214	652	95	below_threshold
Maribacter luteus	strain=RZ05	GCA_009674825.1	2594478	2594478	type	True	77.7389	205	652	95	below_threshold
Arenibacter arenosicollis	strain=BSSL-BM3	GCA_014397245.1	2762274	2762274	type	True	77.087	78	652	95	below_threshold
Maribacter algarum	strain=RZ26	GCA_005885635.1	2578118	2578118	type	True	76.7346	101	652	95	below_threshold
Arenibacter palladensis	strain=DSM 17539	GCA_900129275.1	237373	237373	type	True	76.6572	89	652	95	below_threshold
Arenibacter troitsensis	strain=DSM 19835	GCA_900177645.1	188872	188872	type	True	76.6415	77	652	95	below_threshold
Maribacter cobaltidurans	strain=B1	GCA_002269385.1	1178778	1178778	type	True	76.6317	72	652	95	below_threshold
Ulvibacterium marinum	strain=CCMM003	GCA_003626755.1	2419782	2419782	type	True	76.6149	93	652	95	below_threshold
Maribacter cobaltidurans	strain=CGMCC 1.15508	GCA_014643435.1	1178778	1178778	type	True	76.6134	71	652	95	below_threshold
Costertonia aggregata	strain=KCCM 42265	GCA_013402795.1	343403	343403	type	True	76.5845	85	652	95	below_threshold
Maribacter dokdonensis	strain=DSW-8	GCA_001447995.1	320912	320912	type	True	76.4477	92	652	95	below_threshold
Arenibacter algicola	strain=TG409	GCA_000733925.1	616991	616991	type	True	76.4376	89	652	95	below_threshold
Muricauda parva	strain=DSM 25885	GCA_900215465.1	1247520	1247520	type	True	76.3977	61	652	95	below_threshold
Eudoraea adriatica	strain=DSM 19308	GCA_000382125.1	446681	446681	type	True	76.3919	77	652	95	below_threshold
Muricauda onchidii	strain=XY-359	GCA_004804315.1	2562684	2562684	type	True	76.3559	57	652	95	below_threshold
Arenibacter echinorum	strain=DSM 23522	GCA_003259375.1	440515	440515	type	True	76.34	88	652	95	below_threshold
Arenibacter catalasegens	strain=P308H10	GCA_002909235.1	2056779	2056779	type	True	76.3342	87	652	95	below_threshold
Maribacter arcticus	strain=DSM 23546	GCA_900167935.1	561365	561365	type	True	76.0909	105	652	95	below_threshold
Muricauda lutimaris	strain=KCTC 22173	GCA_003581615.1	475082	475082	type	True	76.0373	62	652	95	below_threshold
Muricauda hymeniacidonis	strain=176CP4-71	GCA_004296335.1	2517819	2517819	type	True	76.0326	50	652	95	below_threshold
Muricauda profundi	strain=BC31-3-A3	GCA_017313275.1	2915620	2915620	type	True	75.8722	54	652	95	below_threshold
--------------------------------------------------------------------------------
[2023-06-18 17:35:42,812] [INFO] DFAST Taxonomy check result was written to GCA_018679975.1_ASM1867997v1_genomic.fna/tc_result.tsv
[2023-06-18 17:35:42,813] [INFO] ===== Taxonomy check completed =====
[2023-06-18 17:35:42,814] [INFO] ===== Start completeness check using CheckM =====
[2023-06-18 17:35:42,814] [INFO] Setting CHECKM_DATA_PATH to /var/lib/cwl/stg1319343e-53c8-4c2c-9299-0a80b768b40a/dqc_reference/checkm_data
[2023-06-18 17:35:42,815] [INFO] Selected 'Prokaryote' markers (life, taxid=0) for CheckM
[2023-06-18 17:35:42,851] [INFO] Task started: CheckM
[2023-06-18 17:35:42,852] [INFO] Running command: checkm taxonomy_wf --tab_table -f GCA_018679975.1_ASM1867997v1_genomic.fna/cc_result.tsv -t 1 life "Prokaryote" GCA_018679975.1_ASM1867997v1_genomic.fna/checkm_input GCA_018679975.1_ASM1867997v1_genomic.fna/checkm_result
[2023-06-18 17:36:21,745] [INFO] Task succeeded: CheckM
[2023-06-18 17:36:21,746] [INFO] Completeness check finished.
--------------------------------------------------------------------------------
Completeness: 85.52%
Contamination: 0.00%
Strain heterogeneity: 0.00%
--------------------------------------------------------------------------------
[2023-06-18 17:36:21,772] [INFO] ===== Completeness check finished =====
[2023-06-18 17:36:21,772] [INFO] ===== Start GTDB Search =====
[2023-06-18 17:36:21,773] [INFO] Query marker FASTA already exists. Will reuse it. (GCA_018679975.1_ASM1867997v1_genomic.fna/markers.fasta)
[2023-06-18 17:36:21,773] [INFO] Task started: Blastn
[2023-06-18 17:36:21,773] [INFO] Running command: blastn -query GCA_018679975.1_ASM1867997v1_genomic.fna/markers.fasta -db /var/lib/cwl/stg1319343e-53c8-4c2c-9299-0a80b768b40a/dqc_reference/reference_markers_gtdb.fasta -out GCA_018679975.1_ASM1867997v1_genomic.fna/blast.markers.gtdb.tsv -outfmt 6 -max_hsps 1 -num_alignments 5
[2023-06-18 17:36:22,839] [INFO] Task succeeded: Blastn
[2023-06-18 17:36:22,844] [INFO] Selected 19 target genomes.
[2023-06-18 17:36:22,844] [INFO] Target genome list was written to GCA_018679975.1_ASM1867997v1_genomic.fna/target_genomes_gtdb.txt
[2023-06-18 17:36:22,861] [INFO] Task started: fastANI
[2023-06-18 17:36:22,861] [INFO] Running command: fastANI --query /var/lib/cwl/stg8ff0de62-1f72-423f-b218-91d00ac17dab/GCA_018679975.1_ASM1867997v1_genomic.fna.gz --refList GCA_018679975.1_ASM1867997v1_genomic.fna/target_genomes_gtdb.txt --output GCA_018679975.1_ASM1867997v1_genomic.fna/fastani_result_gtdb.tsv --threads 1
[2023-06-18 17:36:36,873] [INFO] Task succeeded: fastANI
[2023-06-18 17:36:36,887] [INFO] Found 17 fastANI hits (1 hit with ANI > circumscription radius)
[2023-06-18 17:36:36,888] [INFO] GTDB search result
--------------------------------------------------------------------------------
accession	gtdb_species	ani	matched_fragments	total_fragments	gtdb_taxonomy	ani_circumscription_radius	mean_intra_species_ani	min_intra_species_ani	mean_intra_species_af	min_intra_species_af	num_clustered_genomes	status
GCA_013001935.1	s__Maribacter_A sp013001935	96.7681	385	652	d__Bacteria;p__Bacteroidota;c__Bacteroidia;o__Flavobacteriales;f__Flavobacteriaceae;g__Maribacter_A	95.0	96.76	96.72	0.66	0.66	3	conclusive
GCA_018679695.1	s__Maribacter_A sp018679695	80.547	369	652	d__Bacteria;p__Bacteroidota;c__Bacteroidia;o__Flavobacteriales;f__Flavobacteriaceae;g__Maribacter_A	95.0	99.00	99.00	0.82	0.82	2	-
GCF_000153165.2	s__Maribacter_A sp000153165	78.6574	301	652	d__Bacteria;p__Bacteroidota;c__Bacteroidia;o__Flavobacteriales;f__Flavobacteriaceae;g__Maribacter_A	95.0	N/A	N/A	N/A	N/A	1	-
GCF_014610845.1	s__Maribacter_A arenosus	78.0308	246	652	d__Bacteria;p__Bacteroidota;c__Bacteroidia;o__Flavobacteriales;f__Flavobacteriaceae;g__Maribacter_A	95.0	N/A	N/A	N/A	N/A	1	-
GCF_003148665.1	s__Maribacter_A polysiphoniae	77.8197	214	652	d__Bacteria;p__Bacteroidota;c__Bacteroidia;o__Flavobacteriales;f__Flavobacteriaceae;g__Maribacter_A	95.0	99.27	98.55	0.95	0.91	3	-
GCF_014596745.1	s__Maribacter_A sp014596745	77.749	201	652	d__Bacteria;p__Bacteroidota;c__Bacteroidia;o__Flavobacteriales;f__Flavobacteriaceae;g__Maribacter_A	95.0	N/A	N/A	N/A	N/A	1	-
GCF_009674825.1	s__Maribacter_A luteus	77.7389	205	652	d__Bacteria;p__Bacteroidota;c__Bacteroidia;o__Flavobacteriales;f__Flavobacteriaceae;g__Maribacter_A	95.0	N/A	N/A	N/A	N/A	1	-
GCF_014397245.1	s__Arenibacter arenosicollis	77.087	78	652	d__Bacteria;p__Bacteroidota;c__Bacteroidia;o__Flavobacteriales;f__Flavobacteriaceae;g__Arenibacter	95.0	N/A	N/A	N/A	N/A	1	-
GCF_900177645.1	s__Arenibacter troitsensis	76.6415	77	652	d__Bacteria;p__Bacteroidota;c__Bacteroidia;o__Flavobacteriales;f__Flavobacteriaceae;g__Arenibacter	95.0	N/A	N/A	N/A	N/A	1	-
GCF_003626755.1	s__Ulvibacterium marinum	76.6149	93	652	d__Bacteria;p__Bacteroidota;c__Bacteroidia;o__Flavobacteriales;f__Flavobacteriaceae;g__Ulvibacterium	95.0	N/A	N/A	N/A	N/A	1	-
GCF_003970695.1	s__Maribacter sp002742365	76.5851	98	652	d__Bacteria;p__Bacteroidota;c__Bacteroidia;o__Flavobacteriales;f__Flavobacteriaceae;g__Maribacter	95.0	96.61	96.61	0.86	0.86	2	-
GCF_001413955.1	s__Muricauda eckloniae	76.5352	77	652	d__Bacteria;p__Bacteroidota;c__Bacteroidia;o__Flavobacteriales;f__Flavobacteriaceae;g__Muricauda	95.0	N/A	N/A	N/A	N/A	1	-
GCF_900215465.1	s__Muricauda pacifica_A	76.3977	61	652	d__Bacteria;p__Bacteroidota;c__Bacteroidia;o__Flavobacteriales;f__Flavobacteriaceae;g__Muricauda	95.0	N/A	N/A	N/A	N/A	1	-
GCF_001430825.1	s__Sediminicola sp001430825	76.3787	81	652	d__Bacteria;p__Bacteroidota;c__Bacteroidia;o__Flavobacteriales;f__Flavobacteriaceae;g__Sediminicola	95.0	N/A	N/A	N/A	N/A	1	-
GCF_008040165.1	s__Muricauda hymeniacidonis	76.3648	79	652	d__Bacteria;p__Bacteroidota;c__Bacteroidia;o__Flavobacteriales;f__Flavobacteriaceae;g__Muricauda	95.0	N/A	N/A	N/A	N/A	1	-
GCF_017313105.1	s__Muricauda sp017313105	76.1693	55	652	d__Bacteria;p__Bacteroidota;c__Bacteroidia;o__Flavobacteriales;f__Flavobacteriaceae;g__Muricauda	95.0	N/A	N/A	N/A	N/A	1	-
GCF_003581615.1	s__Muricauda lutimaris	76.0373	62	652	d__Bacteria;p__Bacteroidota;c__Bacteroidia;o__Flavobacteriales;f__Flavobacteriaceae;g__Muricauda	95.0	N/A	N/A	N/A	N/A	1	-
--------------------------------------------------------------------------------
[2023-06-18 17:36:36,890] [INFO] GTDB search result was written to GCA_018679975.1_ASM1867997v1_genomic.fna/result_gtdb.tsv
[2023-06-18 17:36:36,891] [INFO] ===== GTDB Search completed =====
[2023-06-18 17:36:36,900] [INFO] DFAST_QC result json was written to GCA_018679975.1_ASM1867997v1_genomic.fna/dqc_result.json
[2023-06-18 17:36:36,900] [INFO] DFAST_QC completed!
[2023-06-18 17:36:36,901] [INFO] Total running time: 0h1m26s
