[2023-03-16 23:55:45,749] [INFO] DFAST_QC pipeline started.
[2023-03-16 23:55:45,751] [INFO] DFAST_QC version: 0.5.7
[2023-03-16 23:55:45,752] [INFO] DQC Reference Directory: /var/lib/cwl/stge64875d9-0a23-4b1d-9c71-22d9faeb2d15/dqc_reference
[2023-03-16 23:55:46,851] [INFO] ===== Start taxonomy check using ANI =====
[2023-03-16 23:55:46,851] [INFO] Task started: Prodigal
[2023-03-16 23:55:46,852] [INFO] Running command: cat /var/lib/cwl/stg5a2b5412-3a20-4a9e-9b08-3b611dab5d34/OceanDNA-b27619.fa | prodigal -d OceanDNA-b27619/cds.fna -a OceanDNA-b27619/protein.faa -g 11 -q > /dev/null
[2023-03-16 23:56:15,165] [INFO] Task succeeded: Prodigal
[2023-03-16 23:56:15,165] [INFO] Task started: HMMsearch
[2023-03-16 23:56:15,165] [INFO] Running command: hmmsearch --tblout OceanDNA-b27619/hmmer_result.tsv -E 1E-50 /var/lib/cwl/stge64875d9-0a23-4b1d-9c71-22d9faeb2d15/dqc_reference/reference_markers.hmm OceanDNA-b27619/protein.faa > /dev/null
[2023-03-16 23:56:15,385] [INFO] Task succeeded: HMMsearch
[2023-03-16 23:56:15,385] [INFO] Found 6/6 markers.
[2023-03-16 23:56:15,542] [INFO] Query marker FASTA was written to OceanDNA-b27619/markers.fasta
[2023-03-16 23:56:15,544] [INFO] Task started: Blastn
[2023-03-16 23:56:15,544] [INFO] Running command: blastn -query OceanDNA-b27619/markers.fasta -db /var/lib/cwl/stge64875d9-0a23-4b1d-9c71-22d9faeb2d15/dqc_reference/reference_markers.fasta -out OceanDNA-b27619/blast.markers.tsv -outfmt 6 -max_hsps 1 -num_alignments 5
[2023-03-16 23:56:16,416] [INFO] Task succeeded: Blastn
[2023-03-16 23:56:16,473] [INFO] Selected 16 target genomes.
[2023-03-16 23:56:16,474] [INFO] Target genome list was written to OceanDNA-b27619/target_genomes.txt
[2023-03-16 23:56:16,483] [INFO] Task started: fastANI
[2023-03-16 23:56:16,484] [INFO] Running command: fastANI --query /var/lib/cwl/stg5a2b5412-3a20-4a9e-9b08-3b611dab5d34/OceanDNA-b27619.fa --refList OceanDNA-b27619/target_genomes.txt --output OceanDNA-b27619/fastani_result.tsv --threads 1
[2023-03-16 23:56:31,681] [INFO] Task succeeded: fastANI
[2023-03-16 23:56:31,681] [INFO] Loading species specific ANI threshold from /var/lib/cwl/stge64875d9-0a23-4b1d-9c71-22d9faeb2d15/dqc_reference/prokaryote_ANI_species_specific_threshold.txt
[2023-03-16 23:56:31,682] [WARNING] Species-specific ANI threshold file not found. Will use the default threshold for all species. [/var/lib/cwl/stge64875d9-0a23-4b1d-9c71-22d9faeb2d15/dqc_reference/prokaryote_ANI_species_specific_threshold.txt]
[2023-03-16 23:56:31,692] [INFO] Found 16 fastANI hits (0 hits with ANI > threshold)
[2023-03-16 23:56:31,692] [INFO] The taxonomy check result is classified as 'below_threshold'.
[2023-03-16 23:56:31,692] [INFO] DFAST Taxonomy check final result
--------------------------------------------------------------------------------
organism_name	strain	accession	taxid	species_taxid	relation_to_type	validated	ani	matched_fragments	total_fragments	ani_threshold	status
Albimonas pacifica	strain=CGMCC 1.11030	GCA_900113695.1	1114924	1114924	type	True	89.2202	1311	1608	95	below_threshold
Albimonas donghaensis	strain=DSM 17890	GCA_900106695.1	356660	356660	type	True	81.7499	968	1608	95	below_threshold
Oceanicella actignis	strain=DSM 22673	GCA_008124525.1	1189325	1189325	type	True	78.5989	587	1608	95	below_threshold
Allosediminivita pacifica	strain=DSM 29329	GCA_003054175.1	1267769	1267769	type	True	77.8791	387	1608	95	below_threshold
Allosediminivita pacifica	strain=CGMCC 1.12410	GCA_014637495.1	1267769	1267769	type	True	77.8256	388	1608	95	below_threshold
Pseudooceanicola nanhaiensis	strain=DSM 18065	GCA_000688295.1	375761	375761	type	True	77.7633	471	1608	95	below_threshold
Pseudooceanicola nanhaiensis	strain=CGMCC 1.6293	GCA_014645095.1	375761	375761	type	True	77.7162	482	1608	95	below_threshold
Yangia pacifica	strain=DSM 26894	GCA_900116195.1	311180	311180	suspected-type	True	77.5387	458	1608	95	below_threshold
Yangia pacifica	strain=CGMCC 1.3455	GCA_900100725.1	311180	311180	suspected-type	True	77.5336	459	1608	95	below_threshold
Sinirhodobacter hankyongi	strain=BO-81	GCA_003664585.1	2294033	2294033	type	True	77.486	402	1608	95	below_threshold
Cereibacter changlensis	strain=DSM 18774	GCA_003254335.1	402884	402884	type	True	77.432	466	1608	95	below_threshold
Roseivivax isoporae	strain=LMG 25204	GCA_000521865.1	591206	591206	type	True	77.2676	492	1608	95	below_threshold
Rubrimonas cliftonensis	strain=DSM 15345	GCA_900107585.1	89524	89524	type	True	77.2629	563	1608	95	below_threshold
Amaricoccus solimangrovi	strain=HB172011	GCA_006385685.1	2589815	2589815	type	True	77.2346	441	1608	95	below_threshold
Dinoroseobacter shibae	strain=DFL 12	GCA_000018145.1	215813	215813	type	True	77.0118	314	1608	95	below_threshold
Frigidibacter albus	strain=SP32	GCA_009881095.1	1465486	1465486	type	True	76.9482	408	1608	95	below_threshold
--------------------------------------------------------------------------------
[2023-03-16 23:56:31,695] [INFO] DFAST Taxonomy check result was written to OceanDNA-b27619/tc_result.tsv
[2023-03-16 23:56:31,701] [INFO] ===== Taxonomy check completed =====
[2023-03-16 23:56:31,701] [INFO] ===== Start completeness check using CheckM =====
[2023-03-16 23:56:31,701] [INFO] Setting CHECKM_DATA_PATH to /var/lib/cwl/stge64875d9-0a23-4b1d-9c71-22d9faeb2d15/dqc_reference/checkm_data
[2023-03-16 23:56:31,702] [INFO] Selected 'Prokaryote' markers (life, taxid=0) for CheckM
[2023-03-16 23:56:31,712] [INFO] Task started: CheckM
[2023-03-16 23:56:31,712] [INFO] Running command: checkm taxonomy_wf --tab_table -f OceanDNA-b27619/cc_result.tsv -t 1 life "Prokaryote" OceanDNA-b27619/checkm_input OceanDNA-b27619/checkm_result
[2023-03-16 23:57:43,782] [INFO] Task succeeded: CheckM
[2023-03-16 23:57:43,783] [INFO] Completeness check finished.
--------------------------------------------------------------------------------
Completeness: 93.52%
Contamination: 0.00%
Strain heterogeneity: 0.00%
--------------------------------------------------------------------------------
[2023-03-16 23:57:43,997] [INFO] ===== Completeness check finished =====
[2023-03-16 23:57:43,998] [INFO] ===== Start GTDB Search =====
[2023-03-16 23:57:43,998] [INFO] Query marker FASTA already exists. Will reuse it. (OceanDNA-b27619/markers.fasta)
[2023-03-16 23:57:43,999] [INFO] Task started: Blastn
[2023-03-16 23:57:43,999] [INFO] Running command: blastn -query OceanDNA-b27619/markers.fasta -db /var/lib/cwl/stge64875d9-0a23-4b1d-9c71-22d9faeb2d15/dqc_reference/reference_markers_gtdb.fasta -out OceanDNA-b27619/blast.markers.gtdb.tsv -outfmt 6 -max_hsps 1 -num_alignments 5
[2023-03-16 23:57:45,559] [INFO] Task succeeded: Blastn
[2023-03-16 23:57:45,572] [INFO] Selected 15 target genomes.
[2023-03-16 23:57:45,572] [INFO] Target genome list was written to OceanDNA-b27619/target_genomes_gtdb.txt
[2023-03-16 23:57:45,840] [INFO] Task started: fastANI
[2023-03-16 23:57:45,840] [INFO] Running command: fastANI --query /var/lib/cwl/stg5a2b5412-3a20-4a9e-9b08-3b611dab5d34/OceanDNA-b27619.fa --refList OceanDNA-b27619/target_genomes_gtdb.txt --output OceanDNA-b27619/fastani_result_gtdb.tsv --threads 1
[2023-03-16 23:58:01,156] [INFO] Task succeeded: fastANI
[2023-03-16 23:58:01,165] [INFO] Found 15 fastANI hits (0 hits with ANI > circumscription radius)
[2023-03-16 23:58:01,165] [INFO] GTDB search result
--------------------------------------------------------------------------------
accession	gtdb_species	ani	matched_fragments	total_fragments	gtdb_taxonomy	ani_circumscription_radius	mean_intra_species_ani	min_intra_species_ani	mean_intra_species_af	min_intra_species_af	num_clustered_genomes	status
GCF_900113695.1	s__Albimonas pacifica	89.1514	1317	1608	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhodobacterales;f__Rhodobacteraceae;g__Albimonas	95.0	N/A	N/A	N/A	N/A	1	-
GCF_900106695.1	s__Albimonas donghaensis	81.7552	967	1608	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhodobacterales;f__Rhodobacteraceae;g__Albimonas	95.0	97.11	97.08	0.90	0.88	3	-
GCF_008124525.1	s__Oceanicella actignis	78.5544	594	1608	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhodobacterales;f__Rhodobacteraceae;g__Oceanicella	95.0	98.87	98.87	0.96	0.96	3	-
GCF_003448965.1	s__Paroceanicella sp003448965	78.4294	686	1608	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhodobacterales;f__Rhodobacteraceae;g__Paroceanicella	95.0	N/A	N/A	N/A	N/A	1	-
GCF_002300555.1	s__Salipiger sp002300555	77.9534	366	1608	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhodobacterales;f__Rhodobacteraceae;g__Salipiger	95.0	N/A	N/A	N/A	N/A	1	-
GCF_003054175.1	s__Allosediminivita pacifica	77.9057	384	1608	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhodobacterales;f__Rhodobacteraceae;g__Allosediminivita	95.0	100.00	100.00	1.00	1.00	2	-
GCF_012395815.1	s__Roseicyclus sp012395815	77.6492	497	1608	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhodobacterales;f__Rhodobacteraceae;g__Roseicyclus	95.0	N/A	N/A	N/A	N/A	1	-
GCF_000153305.1	s__Oceanicola granulosus	77.5018	486	1608	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhodobacterales;f__Rhodobacteraceae;g__Oceanicola	95.0	N/A	N/A	N/A	N/A	1	-
GCF_003340565.1	s__HLUCCA09 sp003340565	77.4829	472	1608	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhodobacterales;f__Rhodobacteraceae;g__HLUCCA09	95.0	N/A	N/A	N/A	N/A	1	-
GCF_002814095.1	s__Sagittula sp002814095	77.4266	380	1608	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhodobacterales;f__Rhodobacteraceae;g__Sagittula	95.0	N/A	N/A	N/A	N/A	1	-
GCF_000521865.1	s__Roseivivax isoporae	77.2789	490	1608	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhodobacterales;f__Rhodobacteraceae;g__Roseivivax	95.0	N/A	N/A	N/A	N/A	1	-
GCF_005870025.1	s__Mangrovicoccus sp005870025	77.2144	592	1608	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhodobacterales;f__Rhodobacteraceae;g__Mangrovicoccus	95.0	N/A	N/A	N/A	N/A	1	-
GCF_003122205.1	s__QEYE01 sp003122205	77.2132	321	1608	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhodobacterales;f__Rhodobacteraceae;g__QEYE01	95.0	N/A	N/A	N/A	N/A	1	-
GCF_009908165.1	s__Frigidibacter albus	77.0067	394	1608	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhodobacterales;f__Rhodobacteraceae;g__Frigidibacter	95.0	100.00	100.00	1.00	0.99	4	-
GCF_000169415.1	s__Sagittula stellata	76.9595	309	1608	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhodobacterales;f__Rhodobacteraceae;g__Sagittula	95.0	N/A	N/A	N/A	N/A	1	-
--------------------------------------------------------------------------------
[2023-03-16 23:58:01,170] [INFO] GTDB search result was written to OceanDNA-b27619/result_gtdb.tsv
[2023-03-16 23:58:01,176] [INFO] ===== GTDB Search completed =====
[2023-03-16 23:58:01,183] [INFO] DFAST_QC result json was written to OceanDNA-b27619/dqc_result.json
[2023-03-16 23:58:01,183] [INFO] DFAST_QC completed!
[2023-03-16 23:58:01,183] [INFO] Total running time: 0h2m15s
