[2023-03-18 10:54:49,134] [INFO] DFAST_QC pipeline started.
[2023-03-18 10:54:49,134] [INFO] DFAST_QC version: 0.5.7
[2023-03-18 10:54:49,134] [INFO] DQC Reference Directory: /var/lib/cwl/stg93b5791f-620d-434f-a54a-b17eb790e7f3/dqc_reference
[2023-03-18 10:54:50,287] [INFO] ===== Start taxonomy check using ANI =====
[2023-03-18 10:54:50,288] [INFO] Task started: Prodigal
[2023-03-18 10:54:50,288] [INFO] Running command: cat /var/lib/cwl/stg1d9fe82a-330f-41bb-9128-d4c4af4d6b60/OceanDNA-b24941.fa | prodigal -d OceanDNA-b24941/cds.fna -a OceanDNA-b24941/protein.faa -g 11 -q > /dev/null
[2023-03-18 10:55:12,763] [INFO] Task succeeded: Prodigal
[2023-03-18 10:55:12,764] [INFO] Task started: HMMsearch
[2023-03-18 10:55:12,764] [INFO] Running command: hmmsearch --tblout OceanDNA-b24941/hmmer_result.tsv -E 1E-50 /var/lib/cwl/stg93b5791f-620d-434f-a54a-b17eb790e7f3/dqc_reference/reference_markers.hmm OceanDNA-b24941/protein.faa > /dev/null
[2023-03-18 10:55:12,959] [INFO] Task succeeded: HMMsearch
[2023-03-18 10:55:12,959] [INFO] Found 6/6 markers.
[2023-03-18 10:55:12,991] [INFO] Query marker FASTA was written to OceanDNA-b24941/markers.fasta
[2023-03-18 10:55:12,992] [INFO] Task started: Blastn
[2023-03-18 10:55:12,992] [INFO] Running command: blastn -query OceanDNA-b24941/markers.fasta -db /var/lib/cwl/stg93b5791f-620d-434f-a54a-b17eb790e7f3/dqc_reference/reference_markers.fasta -out OceanDNA-b24941/blast.markers.tsv -outfmt 6 -max_hsps 1 -num_alignments 5
[2023-03-18 10:55:13,911] [INFO] Task succeeded: Blastn
[2023-03-18 10:55:13,921] [INFO] Selected 21 target genomes.
[2023-03-18 10:55:13,921] [INFO] Target genome list was written to OceanDNA-b24941/target_genomes.txt
[2023-03-18 10:55:13,933] [INFO] Task started: fastANI
[2023-03-18 10:55:13,933] [INFO] Running command: fastANI --query /var/lib/cwl/stg1d9fe82a-330f-41bb-9128-d4c4af4d6b60/OceanDNA-b24941.fa --refList OceanDNA-b24941/target_genomes.txt --output OceanDNA-b24941/fastani_result.tsv --threads 1
[2023-03-18 10:55:29,573] [INFO] Task succeeded: fastANI
[2023-03-18 10:55:29,573] [INFO] Loading species specific ANI threshold from /var/lib/cwl/stg93b5791f-620d-434f-a54a-b17eb790e7f3/dqc_reference/prokaryote_ANI_species_specific_threshold.txt
[2023-03-18 10:55:29,573] [WARNING] Species-specific ANI threshold file not found. Will use the default threshold for all species. [/var/lib/cwl/stg93b5791f-620d-434f-a54a-b17eb790e7f3/dqc_reference/prokaryote_ANI_species_specific_threshold.txt]
[2023-03-18 10:55:29,584] [INFO] Found 19 fastANI hits (0 hits with ANI > threshold)
[2023-03-18 10:55:29,584] [INFO] The taxonomy check result is classified as 'below_threshold'.
[2023-03-18 10:55:29,585] [INFO] DFAST Taxonomy check final result
--------------------------------------------------------------------------------
organism_name	strain	accession	taxid	species_taxid	relation_to_type	validated	ani	matched_fragments	total_fragments	ani_threshold	status
Parvibaculum lavamentivorans	strain=DS-1	GCA_000017565.1	256618	256618	type	True	82.9747	857	1218	95	below_threshold
Parvibaculum indicum	strain=DSM 25305	GCA_011762095.1	562969	562969	type	True	79.6259	583	1218	95	below_threshold
Parvibaculum sedimenti	strain=HXT-9	GCA_009184905.1	2608632	2608632	type	True	79.5723	541	1218	95	below_threshold
Tepidicaulis marinus	strain=MA2	GCA_000739695.1	1333998	1333998	type	True	77.9758	297	1218	95	below_threshold
Kaistia granuli	strain=Ko04	GCA_000380505.1	363259	363259	type	True	76.9556	197	1218	95	below_threshold
Kaistia adipata	strain=DSM 17808	GCA_000423225.1	166954	166954	type	True	76.937	214	1218	95	below_threshold
Kaistia hirudinis	strain=DSM 25966	GCA_014196455.1	1293440	1293440	type	True	76.8838	212	1218	95	below_threshold
Oharaeibacter diazotrophicus	strain=SM30	GCA_011317485.1	1920512	1920512	type	True	76.792	186	1218	95	below_threshold
Kaistia soli	strain=DSM 19436	GCA_900129325.1	446684	446684	type	True	76.7886	160	1218	95	below_threshold
Chelatococcus caeni	strain=DSM 103737	GCA_014196925.1	1348468	1348468	type	True	76.6695	230	1218	95	below_threshold
Methylobrevis pamukkalensis	strain=PK2	GCA_001720135.1	1439726	1439726	type	True	76.6368	193	1218	95	below_threshold
Oharaeibacter diazotrophicus	strain=DSM 102969	GCA_004362745.1	1920512	1920512	type	True	76.592	222	1218	95	below_threshold
Stappia albiluteola	strain=F7233	GCA_014050225.1	2758565	2758565	type	True	76.5411	193	1218	95	below_threshold
Methylobacterium bullatum	strain=DSM 21893	GCA_022179105.1	570505	570505	type	True	76.5134	145	1218	95	below_threshold
Shinella daejeonensis	strain=JCM 16236	GCA_024281235.1	659017	659017	type	True	76.3047	150	1218	95	below_threshold
Nitratireductor aquibiodomus	strain=JCM 21793	GCA_000615975.1	204799	204799	type	True	76.2909	117	1218	95	below_threshold
Sphingomonas citri	strain=RRHST34	GCA_019429485.1	2862499	2862499	type	True	76.2453	108	1218	95	below_threshold
Rhodovarius crocodyli	strain=CCP-6	GCA_004005855.1	1979269	1979269	type	True	75.8129	126	1218	95	below_threshold
Sphingomonas hylomeconis	strain=CCTCC AB 2013304	GCA_025370105.1	1395958	1395958	type	True	75.6385	131	1218	95	below_threshold
--------------------------------------------------------------------------------
[2023-03-18 10:55:29,664] [INFO] DFAST Taxonomy check result was written to OceanDNA-b24941/tc_result.tsv
[2023-03-18 10:55:29,665] [INFO] ===== Taxonomy check completed =====
[2023-03-18 10:55:29,665] [INFO] ===== Start completeness check using CheckM =====
[2023-03-18 10:55:29,665] [INFO] Setting CHECKM_DATA_PATH to /var/lib/cwl/stg93b5791f-620d-434f-a54a-b17eb790e7f3/dqc_reference/checkm_data
[2023-03-18 10:55:29,666] [INFO] Selected 'Prokaryote' markers (life, taxid=0) for CheckM
[2023-03-18 10:55:29,677] [INFO] Task started: CheckM
[2023-03-18 10:55:29,677] [INFO] Running command: checkm taxonomy_wf --tab_table -f OceanDNA-b24941/cc_result.tsv -t 1 life "Prokaryote" OceanDNA-b24941/checkm_input OceanDNA-b24941/checkm_result
[2023-03-18 10:56:24,998] [INFO] Task succeeded: CheckM
[2023-03-18 10:56:24,998] [INFO] Completeness check finished.
--------------------------------------------------------------------------------
Completeness: 83.33%
Contamination: 0.00%
Strain heterogeneity: 0.00%
--------------------------------------------------------------------------------
[2023-03-18 10:56:25,001] [INFO] ===== Completeness check finished =====
[2023-03-18 10:56:25,001] [INFO] ===== Start GTDB Search =====
[2023-03-18 10:56:25,001] [INFO] Query marker FASTA already exists. Will reuse it. (OceanDNA-b24941/markers.fasta)
[2023-03-18 10:56:25,003] [INFO] Task started: Blastn
[2023-03-18 10:56:25,003] [INFO] Running command: blastn -query OceanDNA-b24941/markers.fasta -db /var/lib/cwl/stg93b5791f-620d-434f-a54a-b17eb790e7f3/dqc_reference/reference_markers_gtdb.fasta -out OceanDNA-b24941/blast.markers.gtdb.tsv -outfmt 6 -max_hsps 1 -num_alignments 5
[2023-03-18 10:56:26,864] [INFO] Task succeeded: Blastn
[2023-03-18 10:56:26,864] [INFO] Selected 11 target genomes.
[2023-03-18 10:56:26,865] [INFO] Target genome list was written to OceanDNA-b24941/target_genomes_gtdb.txt
[2023-03-18 10:56:26,875] [INFO] Task started: fastANI
[2023-03-18 10:56:26,875] [INFO] Running command: fastANI --query /var/lib/cwl/stg1d9fe82a-330f-41bb-9128-d4c4af4d6b60/OceanDNA-b24941.fa --refList OceanDNA-b24941/target_genomes_gtdb.txt --output OceanDNA-b24941/fastani_result_gtdb.tsv --threads 1
[2023-03-18 10:56:35,216] [INFO] Task succeeded: fastANI
[2023-03-18 10:56:35,222] [INFO] Found 10 fastANI hits (0 hits with ANI > circumscription radius)
[2023-03-18 10:56:35,222] [INFO] GTDB search result
--------------------------------------------------------------------------------
accession	gtdb_species	ani	matched_fragments	total_fragments	gtdb_taxonomy	ani_circumscription_radius	mean_intra_species_ani	min_intra_species_ani	mean_intra_species_af	min_intra_species_af	num_clustered_genomes	status
GCA_017642695.1	s__Parvibaculum sp017642695	90.6185	1087	1218	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Parvibaculales;f__Parvibaculaceae;g__Parvibaculum	95.0	99.96	99.92	0.99	0.97	4	-
GCA_002480495.1	s__Parvibaculum sp002480495	83.9653	904	1218	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Parvibaculales;f__Parvibaculaceae;g__Parvibaculum	95.0	99.50	98.87	0.94	0.88	9	-
GCA_002842875.1	s__Parvibaculum sp002842875	83.3033	762	1218	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Parvibaculales;f__Parvibaculaceae;g__Parvibaculum	95.0	N/A	N/A	N/A	N/A	1	-
GCF_000017565.1	s__Parvibaculum lavamentivorans	82.9942	855	1218	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Parvibaculales;f__Parvibaculaceae;g__Parvibaculum	95.0	N/A	N/A	N/A	N/A	1	-
GCA_014359855.1	s__Parvibaculum sp014359855	82.8298	509	1218	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Parvibaculales;f__Parvibaculaceae;g__Parvibaculum	95.0	N/A	N/A	N/A	N/A	1	-
GCA_002694985.1	s__Parvibaculum sp002694985	82.5793	754	1218	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Parvibaculales;f__Parvibaculaceae;g__Parvibaculum	95.0	99.99	99.99	0.99	0.99	2	-
GCA_002841195.1	s__Parvibaculum sp002841195	81.8446	636	1218	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Parvibaculales;f__Parvibaculaceae;g__Parvibaculum	95.0	N/A	N/A	N/A	N/A	1	-
GCA_008933495.1	s__Pseudorhodoplanes sp008933495	76.8924	117	1218	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhizobiales;f__Xanthobacteraceae;g__Pseudorhodoplanes	95.0	N/A	N/A	N/A	N/A	1	-
GCF_000615975.1	s__Nitratireductor aquibiodomus	76.2756	118	1218	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhizobiales;f__Rhizobiaceae;g__Nitratireductor	95.0	97.24	95.23	0.87	0.84	4	-
GCF_014194975.1	s__Sphingomonas sp014194975	75.9658	143	1218	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Sphingomonadales;f__Sphingomonadaceae;g__Sphingomonas	95.0	N/A	N/A	N/A	N/A	1	-
--------------------------------------------------------------------------------
[2023-03-18 10:56:35,222] [INFO] GTDB search result was written to OceanDNA-b24941/result_gtdb.tsv
[2023-03-18 10:56:35,222] [INFO] ===== GTDB Search completed =====
[2023-03-18 10:56:35,224] [INFO] DFAST_QC result json was written to OceanDNA-b24941/dqc_result.json
[2023-03-18 10:56:35,224] [INFO] DFAST_QC completed!
[2023-03-18 10:56:35,224] [INFO] Total running time: 0h1m46s
