[2023-03-16 23:35:12,980] [INFO] DFAST_QC pipeline started.
[2023-03-16 23:35:12,980] [INFO] DFAST_QC version: 0.5.7
[2023-03-16 23:35:12,981] [INFO] DQC Reference Directory: /var/lib/cwl/stgdce84a48-cfb0-4fd4-9510-706778775473/dqc_reference
[2023-03-16 23:35:14,132] [INFO] ===== Start taxonomy check using ANI =====
[2023-03-16 23:35:14,132] [INFO] Task started: Prodigal
[2023-03-16 23:35:14,132] [INFO] Running command: cat /var/lib/cwl/stg05c4542a-3d6b-47e0-b209-b081502e24db/OceanDNA-b29385.fa | prodigal -d OceanDNA-b29385/cds.fna -a OceanDNA-b29385/protein.faa -g 11 -q > /dev/null
[2023-03-16 23:35:32,410] [INFO] Task succeeded: Prodigal
[2023-03-16 23:35:32,411] [INFO] Task started: HMMsearch
[2023-03-16 23:35:32,411] [INFO] Running command: hmmsearch --tblout OceanDNA-b29385/hmmer_result.tsv -E 1E-50 /var/lib/cwl/stgdce84a48-cfb0-4fd4-9510-706778775473/dqc_reference/reference_markers.hmm OceanDNA-b29385/protein.faa > /dev/null
[2023-03-16 23:35:32,622] [INFO] Task succeeded: HMMsearch
[2023-03-16 23:35:32,623] [WARNING] Found 5/6 markers. [/var/lib/cwl/stg05c4542a-3d6b-47e0-b209-b081502e24db/OceanDNA-b29385.fa]
[2023-03-16 23:35:32,647] [INFO] Query marker FASTA was written to OceanDNA-b29385/markers.fasta
[2023-03-16 23:35:32,647] [INFO] Task started: Blastn
[2023-03-16 23:35:32,647] [INFO] Running command: blastn -query OceanDNA-b29385/markers.fasta -db /var/lib/cwl/stgdce84a48-cfb0-4fd4-9510-706778775473/dqc_reference/reference_markers.fasta -out OceanDNA-b29385/blast.markers.tsv -outfmt 6 -max_hsps 1 -num_alignments 5
[2023-03-16 23:35:33,391] [INFO] Task succeeded: Blastn
[2023-03-16 23:35:33,392] [INFO] Selected 29 target genomes.
[2023-03-16 23:35:33,392] [INFO] Target genome list was written to OceanDNA-b29385/target_genomes.txt
[2023-03-16 23:35:33,409] [INFO] Task started: fastANI
[2023-03-16 23:35:33,409] [INFO] Running command: fastANI --query /var/lib/cwl/stg05c4542a-3d6b-47e0-b209-b081502e24db/OceanDNA-b29385.fa --refList OceanDNA-b29385/target_genomes.txt --output OceanDNA-b29385/fastani_result.tsv --threads 1
[2023-03-16 23:35:53,085] [INFO] Task succeeded: fastANI
[2023-03-16 23:35:53,085] [INFO] Loading species specific ANI threshold from /var/lib/cwl/stgdce84a48-cfb0-4fd4-9510-706778775473/dqc_reference/prokaryote_ANI_species_specific_threshold.txt
[2023-03-16 23:35:53,086] [WARNING] Species-specific ANI threshold file not found. Will use the default threshold for all species. [/var/lib/cwl/stgdce84a48-cfb0-4fd4-9510-706778775473/dqc_reference/prokaryote_ANI_species_specific_threshold.txt]
[2023-03-16 23:35:53,102] [INFO] Found 29 fastANI hits (0 hits with ANI > threshold)
[2023-03-16 23:35:53,102] [INFO] The taxonomy check result is classified as 'below_threshold'.
[2023-03-16 23:35:53,102] [INFO] DFAST Taxonomy check final result
--------------------------------------------------------------------------------
organism_name	strain	accession	taxid	species_taxid	relation_to_type	validated	ani	matched_fragments	total_fragments	ani_threshold	status
Hasllibacter halocynthiae	strain=DSM 29318	GCA_003003095.1	595589	595589	type	True	77.5961	329	1036	95	below_threshold
Limimaricola pyoseonensis	strain=DSM 21424	GCA_900102015.1	521013	521013	type	True	77.5839	357	1036	95	below_threshold
Jannaschia formosa	strain=12N15	GCA_003340555.1	2259592	2259592	type	True	77.1796	344	1036	95	below_threshold
Rubellimicrobium roseum	strain=YIM 48858	GCA_006152145.1	687525	687525	type	True	77.1668	287	1036	95	below_threshold
Roseitranquillus sediminis	strain=MCCB 386	GCA_016918935.1	2809051	2809051	type	True	77.0683	262	1036	95	below_threshold
Jannaschia marina	strain=SHC163	GCA_013404595.1	2741674	2741674	type	True	77.0451	277	1036	95	below_threshold
Roseivivax isoporae	strain=LMG 25204	GCA_000521865.1	591206	591206	type	True	76.9987	338	1036	95	below_threshold
Wenxinia marina	strain=DSM 24838	GCA_000379485.1	390641	390641	type	True	76.9043	337	1036	95	below_threshold
Rubellimicrobium mesophilum	strain=DSM 19309	GCA_000600335.2	1123067	1123067	type	True	76.9028	295	1036	95	below_threshold
Wenxinia marina	strain=CGMCC 1.6105	GCA_014645075.1	390641	390641	type	True	76.8634	342	1036	95	below_threshold
Meinhardsimonia xiamenensis	strain=CGMCC 1.10789	GCA_900102905.1	990712	990712	type	True	76.7428	219	1036	95	below_threshold
Meinhardsimonia xiamenensis	strain=DSM 24422	GCA_003001835.1	990712	990712	type	True	76.739	223	1036	95	below_threshold
Cereibacter sediminicola	strain=JA983	GCA_007668225.1	2584941	2584941	type	True	76.6536	257	1036	95	below_threshold
Maritimibacter harenae	strain=DP07	GCA_009882975.1	2606218	2606218	type	True	76.6429	197	1036	95	below_threshold
Cereibacter changlensis	strain=DSM 18774	GCA_003254335.1	402884	402884	type	True	76.5376	258	1036	95	below_threshold
Allosediminivita pacifica	strain=CGMCC 1.12410	GCA_014637495.1	1267769	1267769	type	True	76.4891	216	1036	95	below_threshold
Silicimonas algicola	strain=KC90	GCA_003970735.1	1826607	1826607	type	True	76.4855	208	1036	95	below_threshold
Allosediminivita pacifica	strain=DSM 29329	GCA_003054175.1	1267769	1267769	type	True	76.4621	218	1036	95	below_threshold
Tropicimonas isoalkanivorans	strain=DSM 19548	GCA_900112335.1	441112	441112	type	True	76.4566	170	1036	95	below_threshold
Silicimonas algicola	strain=DSM 103371	GCA_003148765.1	1826607	1826607	type	True	76.4455	210	1036	95	below_threshold
Halovulum marinum	strain=2CG4	GCA_009697225.1	2662447	2662447	type	True	76.445	275	1036	95	below_threshold
Halovulum dunhuangense	strain=YYQ-30	GCA_013093415.1	1505036	1505036	type	True	76.3674	191	1036	95	below_threshold
Phaeovulum vinaykumarii	strain=DSM 18714	GCA_900156695.1	407234	407234	type	True	76.1854	174	1036	95	below_threshold
Phaeovulum vinaykumarii	strain=JA123	GCA_900217755.1	407234	407234	type	True	76.1779	174	1036	95	below_threshold
Mangrovicoccus algicola	strain=HB182678	GCA_014903745.1	2771008	2771008	type	True	76.1778	205	1036	95	below_threshold
Antarcticimicrobium luteum	strain=318-1	GCA_004358185.1	2547397	2547397	type	True	76.0819	171	1036	95	below_threshold
Paracoccus mutanolyticus	strain=RSP-02	GCA_003285265.1	1499308	1499308	type	True	76.0521	159	1036	95	below_threshold
Leisingera daeponensis	strain=DSM 23529	GCA_000473145.1	405746	405746	type	True	75.9329	137	1036	95	below_threshold
Paracoccus tegillarcae	strain=BM15	GCA_002847305.1	1529068	1529068	type	True	75.5772	70	1036	95	below_threshold
--------------------------------------------------------------------------------
[2023-03-16 23:35:53,102] [INFO] DFAST Taxonomy check result was written to OceanDNA-b29385/tc_result.tsv
[2023-03-16 23:35:53,102] [INFO] ===== Taxonomy check completed =====
[2023-03-16 23:35:53,102] [INFO] ===== Start completeness check using CheckM =====
[2023-03-16 23:35:53,102] [INFO] Setting CHECKM_DATA_PATH to /var/lib/cwl/stgdce84a48-cfb0-4fd4-9510-706778775473/dqc_reference/checkm_data
[2023-03-16 23:35:53,103] [INFO] Selected 'Prokaryote' markers (life, taxid=0) for CheckM
[2023-03-16 23:35:53,108] [INFO] Task started: CheckM
[2023-03-16 23:35:53,108] [INFO] Running command: checkm taxonomy_wf --tab_table -f OceanDNA-b29385/cc_result.tsv -t 1 life "Prokaryote" OceanDNA-b29385/checkm_input OceanDNA-b29385/checkm_result
[2023-03-16 23:36:39,906] [INFO] Task succeeded: CheckM
[2023-03-16 23:36:39,906] [INFO] Completeness check finished.
--------------------------------------------------------------------------------
Completeness: 54.17%
Contamination: 0.00%
Strain heterogeneity: 0.00%
--------------------------------------------------------------------------------
[2023-03-16 23:36:39,909] [INFO] ===== Completeness check finished =====
[2023-03-16 23:36:39,909] [INFO] ===== Start GTDB Search =====
[2023-03-16 23:36:39,910] [INFO] Query marker FASTA already exists. Will reuse it. (OceanDNA-b29385/markers.fasta)
[2023-03-16 23:36:39,911] [INFO] Task started: Blastn
[2023-03-16 23:36:39,911] [INFO] Running command: blastn -query OceanDNA-b29385/markers.fasta -db /var/lib/cwl/stgdce84a48-cfb0-4fd4-9510-706778775473/dqc_reference/reference_markers_gtdb.fasta -out OceanDNA-b29385/blast.markers.gtdb.tsv -outfmt 6 -max_hsps 1 -num_alignments 5
[2023-03-16 23:36:41,272] [INFO] Task succeeded: Blastn
[2023-03-16 23:36:41,273] [INFO] Selected 26 target genomes.
[2023-03-16 23:36:41,273] [INFO] Target genome list was written to OceanDNA-b29385/target_genomes_gtdb.txt
[2023-03-16 23:36:41,720] [INFO] Task started: fastANI
[2023-03-16 23:36:41,720] [INFO] Running command: fastANI --query /var/lib/cwl/stg05c4542a-3d6b-47e0-b209-b081502e24db/OceanDNA-b29385.fa --refList OceanDNA-b29385/target_genomes_gtdb.txt --output OceanDNA-b29385/fastani_result_gtdb.tsv --threads 1
[2023-03-16 23:37:02,785] [INFO] Task succeeded: fastANI
[2023-03-16 23:37:02,800] [INFO] Found 26 fastANI hits (0 hits with ANI > circumscription radius)
[2023-03-16 23:37:02,801] [INFO] GTDB search result
--------------------------------------------------------------------------------
accession	gtdb_species	ani	matched_fragments	total_fragments	gtdb_taxonomy	ani_circumscription_radius	mean_intra_species_ani	min_intra_species_ani	mean_intra_species_af	min_intra_species_af	num_clustered_genomes	status
GCF_900102015.1	s__Limimaricola pyoseonensis	77.6041	354	1036	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhodobacterales;f__Rhodobacteraceae;g__Limimaricola	95.0	N/A	N/A	N/A	N/A	1	-
GCF_003003095.1	s__Hasllibacter halocynthiae	77.5658	333	1036	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhodobacterales;f__Rhodobacteraceae;g__Hasllibacter	95.0	N/A	N/A	N/A	N/A	1	-
GCF_003385555.1	s__Rhodosalinus sediminis	77.3428	348	1036	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhodobacterales;f__Rhodobacteraceae;g__Rhodosalinus	95.0	N/A	N/A	N/A	N/A	1	-
GCF_003340555.1	s__Jannaschia formosa	77.1783	344	1036	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhodobacterales;f__Rhodobacteraceae;g__Jannaschia	95.0	N/A	N/A	N/A	N/A	1	-
GCF_016820245.1	s__Jannaschia sp016820245	77.1401	350	1036	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhodobacterales;f__Rhodobacteraceae;g__Jannaschia	95.0	N/A	N/A	N/A	N/A	1	-
GCA_016918935.1	s__MCCB-386 sp016918935	77.0878	260	1036	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhodobacterales;f__Rhodobacteraceae;g__MCCB-386	95.0	N/A	N/A	N/A	N/A	1	-
GCF_003789055.1	s__Oceanicola lentulus	77.0585	295	1036	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhodobacterales;f__Rhodobacteraceae;g__Oceanicola	95.0	N/A	N/A	N/A	N/A	1	-
GCF_003340565.1	s__HLUCCA09 sp003340565	77.0057	349	1036	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhodobacterales;f__Rhodobacteraceae;g__HLUCCA09	95.0	N/A	N/A	N/A	N/A	1	-
GCF_000153305.1	s__Oceanicola granulosus	77.0029	359	1036	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhodobacterales;f__Rhodobacteraceae;g__Oceanicola	95.0	N/A	N/A	N/A	N/A	1	-
GCA_001314685.1	s__HLUCCA09 sp001314685	76.9249	320	1036	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhodobacterales;f__Rhodobacteraceae;g__HLUCCA09	95.0	N/A	N/A	N/A	N/A	1	-
GCF_000600335.2	s__Rubellimicrobium mesophilum	76.9039	295	1036	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhodobacterales;f__Rhodobacteraceae;g__Rubellimicrobium	95.0	N/A	N/A	N/A	N/A	1	-
GCF_900115595.1	s__Tranquillimonas alkanivorans	76.8871	294	1036	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhodobacterales;f__Rhodobacteraceae;g__Tranquillimonas	95.0	N/A	N/A	N/A	N/A	1	-
GCA_004005435.1	s__CCMM004 sp004005435	76.8804	315	1036	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhodobacterales;f__Rhodobacteraceae;g__CCMM004	95.0	N/A	N/A	N/A	N/A	1	-
GCF_900102905.1	s__Meinhardsimonia xiamenensis	76.7428	219	1036	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhodobacterales;f__Rhodobacteraceae;g__Meinhardsimonia	95.0	100.00	100.00	1.00	1.00	2	-
GCF_004011175.1	s__Tranquillimonas littorinae	76.5523	233	1036	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhodobacterales;f__Rhodobacteraceae;g__Tranquillimonas	95.0	N/A	N/A	N/A	N/A	1	-
GCF_005870025.1	s__Mangrovicoccus sp005870025	76.5474	321	1036	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhodobacterales;f__Rhodobacteraceae;g__Mangrovicoccus	95.0	N/A	N/A	N/A	N/A	1	-
GCF_003054175.1	s__Allosediminivita pacifica	76.4709	217	1036	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhodobacterales;f__Rhodobacteraceae;g__Allosediminivita	95.0	100.00	100.00	1.00	1.00	2	-
GCF_003148765.1	s__Silicimonas algicola	76.4455	210	1036	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhodobacterales;f__Rhodobacteraceae;g__Silicimonas	95.0	100.00	100.00	1.00	1.00	2	-
GCA_014762635.1	s__JABURR01 sp014762635	76.437	180	1036	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhodobacterales;f__Rhodobacteraceae;g__JABURR01	95.0	N/A	N/A	N/A	N/A	1	-
GCF_013184745.1	s__Mangrovicoccus sp013184745	76.3078	301	1036	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhodobacterales;f__Rhodobacteraceae;g__Mangrovicoccus	95.0	N/A	N/A	N/A	N/A	1	-
GCA_003511785.1	s__UBA7951 sp003511785	76.2607	159	1036	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhodobacterales;f__Rhodobacteraceae;g__UBA7951	95.0	N/A	N/A	N/A	N/A	1	-
GCA_007131685.1	s__Rhodobaculum sp007131685	76.2595	178	1036	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhodobacterales;f__Rhodobacteraceae;g__Rhodobaculum	95.0	99.44	99.41	0.86	0.86	3	-
GCF_900156695.1	s__Bieblia_A vinaykumarii	76.1938	173	1036	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhodobacterales;f__Rhodobacteraceae;g__Bieblia_A	95.0	100.00	100.00	1.00	1.00	2	-
GCA_003551925.1	s__PUOA01 sp003551925	76.1859	165	1036	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhodobacterales;f__Rhodobacteraceae;g__PUOA01	95.0	N/A	N/A	N/A	N/A	1	-
GCA_003254465.1	s__Fluviibacterium sp003254465	76.1326	182	1036	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhodobacterales;f__Rhodobacteraceae;g__Fluviibacterium	95.0	N/A	N/A	N/A	N/A	1	-
GCF_004358185.1	s__Antarcticimicrobium luteum	76.0635	173	1036	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhodobacterales;f__Rhodobacteraceae;g__Antarcticimicrobium	95.0	N/A	N/A	N/A	N/A	1	-
--------------------------------------------------------------------------------
[2023-03-16 23:37:02,801] [INFO] GTDB search result was written to OceanDNA-b29385/result_gtdb.tsv
[2023-03-16 23:37:02,801] [INFO] ===== GTDB Search completed =====
[2023-03-16 23:37:02,804] [INFO] DFAST_QC result json was written to OceanDNA-b29385/dqc_result.json
[2023-03-16 23:37:02,804] [INFO] DFAST_QC completed!
[2023-03-16 23:37:02,804] [INFO] Total running time: 0h1m50s
