[2023-03-19 05:01:20,090] [INFO] DFAST_QC pipeline started.
[2023-03-19 05:01:20,090] [INFO] DFAST_QC version: 0.5.7
[2023-03-19 05:01:20,090] [INFO] DQC Reference Directory: /var/lib/cwl/stga22d84e0-bc6a-4538-bec3-7997e265049c/dqc_reference
[2023-03-19 05:01:21,233] [INFO] ===== Start taxonomy check using ANI =====
[2023-03-19 05:01:21,233] [INFO] Task started: Prodigal
[2023-03-19 05:01:21,233] [INFO] Running command: cat /var/lib/cwl/stg107b4686-51fa-4c31-9d5f-0d1305d3b69c/OceanDNA-b30073.fa | prodigal -d OceanDNA-b30073/cds.fna -a OceanDNA-b30073/protein.faa -g 11 -q > /dev/null
[2023-03-19 05:01:44,298] [INFO] Task succeeded: Prodigal
[2023-03-19 05:01:44,298] [INFO] Task started: HMMsearch
[2023-03-19 05:01:44,299] [INFO] Running command: hmmsearch --tblout OceanDNA-b30073/hmmer_result.tsv -E 1E-50 /var/lib/cwl/stga22d84e0-bc6a-4538-bec3-7997e265049c/dqc_reference/reference_markers.hmm OceanDNA-b30073/protein.faa > /dev/null
[2023-03-19 05:01:44,508] [INFO] Task succeeded: HMMsearch
[2023-03-19 05:01:44,509] [INFO] Found 6/6 markers.
[2023-03-19 05:01:44,532] [INFO] Query marker FASTA was written to OceanDNA-b30073/markers.fasta
[2023-03-19 05:01:44,533] [INFO] Task started: Blastn
[2023-03-19 05:01:44,533] [INFO] Running command: blastn -query OceanDNA-b30073/markers.fasta -db /var/lib/cwl/stga22d84e0-bc6a-4538-bec3-7997e265049c/dqc_reference/reference_markers.fasta -out OceanDNA-b30073/blast.markers.tsv -outfmt 6 -max_hsps 1 -num_alignments 5
[2023-03-19 05:01:45,296] [INFO] Task succeeded: Blastn
[2023-03-19 05:01:45,296] [INFO] Selected 23 target genomes.
[2023-03-19 05:01:45,297] [INFO] Target genome list was written to OceanDNA-b30073/target_genomes.txt
[2023-03-19 05:01:45,311] [INFO] Task started: fastANI
[2023-03-19 05:01:45,311] [INFO] Running command: fastANI --query /var/lib/cwl/stg107b4686-51fa-4c31-9d5f-0d1305d3b69c/OceanDNA-b30073.fa --refList OceanDNA-b30073/target_genomes.txt --output OceanDNA-b30073/fastani_result.tsv --threads 1
[2023-03-19 05:02:00,838] [INFO] Task succeeded: fastANI
[2023-03-19 05:02:00,838] [INFO] Loading species specific ANI threshold from /var/lib/cwl/stga22d84e0-bc6a-4538-bec3-7997e265049c/dqc_reference/prokaryote_ANI_species_specific_threshold.txt
[2023-03-19 05:02:00,838] [WARNING] Species-specific ANI threshold file not found. Will use the default threshold for all species. [/var/lib/cwl/stga22d84e0-bc6a-4538-bec3-7997e265049c/dqc_reference/prokaryote_ANI_species_specific_threshold.txt]
[2023-03-19 05:02:00,851] [INFO] Found 23 fastANI hits (0 hits with ANI > threshold)
[2023-03-19 05:02:00,851] [INFO] The taxonomy check result is classified as 'below_threshold'.
[2023-03-19 05:02:00,851] [INFO] DFAST Taxonomy check final result
--------------------------------------------------------------------------------
organism_name	strain	accession	taxid	species_taxid	relation_to_type	validated	ani	matched_fragments	total_fragments	ani_threshold	status
Roseovarius indicus	strain=B108	GCA_001441635.1	540747	540747	type	True	77.5418	376	1170	95	below_threshold
Roseovarius atlanticus	strain=R12B	GCA_001441615.1	1641875	1641875	type	True	77.5399	331	1170	95	below_threshold
Roseovarius indicus	strain=DSM 26383	GCA_008728195.1	540747	540747	type	True	77.5318	380	1170	95	below_threshold
Roseovarius indicus	strain=DSM 26383	GCA_900112725.1	540747	540747	type	True	77.5006	386	1170	95	below_threshold
Roseovarius faecimaris	strain=MME-070	GCA_009762325.1	2494550	2494550	type	True	77.4831	346	1170	95	below_threshold
Roseovarius confluentis	strain=SAG6	GCA_002917925.1	1852027	1852027	type	True	77.4631	332	1170	95	below_threshold
Roseovarius aestuariivivens	strain=GHTF-24	GCA_004761875.1	1888910	1888910	type	True	77.2795	269	1170	95	below_threshold
Roseovarius litoreus	strain=DSM 28249	GCA_900142765.1	1155722	1155722	type	True	77.2771	305	1170	95	below_threshold
Aestuariivita boseongensis	strain=BS-B2	GCA_001262635.1	1470562	1470562	type	True	77.2674	264	1170	95	below_threshold
Pseudooceanicola onchidii	strain=XY-99	GCA_004959925.1	2562279	2562279	type	True	77.0173	220	1170	95	below_threshold
Aliishimia ponticola	strain=MYP11	GCA_004803475.1	2499833	2499833	type	True	76.9626	214	1170	95	below_threshold
Thalassobius mangrovi	strain=GS-10	GCA_009857745.1	2692236	2692236	type	True	76.922	276	1170	95	below_threshold
Pelagivirga sediminicola	strain=BH-SD19	GCA_003072125.1	2170575	2170575	type	True	76.8979	240	1170	95	below_threshold
Roseovarius nitratireducens	strain=TFZ	GCA_002925845.1	2044597	2044597	type	True	76.8734	254	1170	95	below_threshold
Rhodobacter amnigenus	strain=HSP-20	GCA_009908265.2	2852097	2852097	type	True	76.8056	173	1170	95	below_threshold
Pelagivirga dicentrarchi	strain=YLY04	GCA_003316635.1	2250573	2250573	type	True	76.7622	185	1170	95	below_threshold
Poseidonocella pacifica	strain=DSM 29316	GCA_900111875.1	871651	871651	type	True	76.7492	122	1170	95	below_threshold
Litoreibacter ponti	strain=DSM 100977	GCA_003054285.1	1510457	1510457	type	True	76.7033	189	1170	95	below_threshold
Salipiger pallidus	strain=CGMCC 1.15762	GCA_014643635.1	1775170	1775170	type	True	76.4336	175	1170	95	below_threshold
Phaeobacter porticola	strain=P97	GCA_001888185.1	1844006	1844006	type	True	76.4278	147	1170	95	below_threshold
Rhabdonatronobacter sediminivivens	strain=IM2376	GCA_013415485.1	2743469	2743469	type	True	76.4173	150	1170	95	below_threshold
Cereibacter sediminicola	strain=JA983	GCA_007668225.1	2584941	2584941	type	True	76.2401	127	1170	95	below_threshold
Cereibacter azotoformans	strain=KA25	GCA_003050905.1	43057	43057	type	True	76.2163	133	1170	95	below_threshold
--------------------------------------------------------------------------------
[2023-03-19 05:02:00,851] [INFO] DFAST Taxonomy check result was written to OceanDNA-b30073/tc_result.tsv
[2023-03-19 05:02:00,852] [INFO] ===== Taxonomy check completed =====
[2023-03-19 05:02:00,852] [INFO] ===== Start completeness check using CheckM =====
[2023-03-19 05:02:00,852] [INFO] Setting CHECKM_DATA_PATH to /var/lib/cwl/stga22d84e0-bc6a-4538-bec3-7997e265049c/dqc_reference/checkm_data
[2023-03-19 05:02:00,852] [INFO] Selected 'Prokaryote' markers (life, taxid=0) for CheckM
[2023-03-19 05:02:00,930] [INFO] Task started: CheckM
[2023-03-19 05:02:00,930] [INFO] Running command: checkm taxonomy_wf --tab_table -f OceanDNA-b30073/cc_result.tsv -t 1 life "Prokaryote" OceanDNA-b30073/checkm_input OceanDNA-b30073/checkm_result
[2023-03-19 05:03:01,658] [INFO] Task succeeded: CheckM
[2023-03-19 05:03:01,658] [INFO] Completeness check finished.
--------------------------------------------------------------------------------
Completeness: 89.58%
Contamination: 0.00%
Strain heterogeneity: 0.00%
--------------------------------------------------------------------------------
[2023-03-19 05:03:01,660] [INFO] ===== Completeness check finished =====
[2023-03-19 05:03:01,661] [INFO] ===== Start GTDB Search =====
[2023-03-19 05:03:01,661] [INFO] Query marker FASTA already exists. Will reuse it. (OceanDNA-b30073/markers.fasta)
[2023-03-19 05:03:01,662] [INFO] Task started: Blastn
[2023-03-19 05:03:01,662] [INFO] Running command: blastn -query OceanDNA-b30073/markers.fasta -db /var/lib/cwl/stga22d84e0-bc6a-4538-bec3-7997e265049c/dqc_reference/reference_markers_gtdb.fasta -out OceanDNA-b30073/blast.markers.gtdb.tsv -outfmt 6 -max_hsps 1 -num_alignments 5
[2023-03-19 05:03:03,034] [INFO] Task succeeded: Blastn
[2023-03-19 05:03:03,035] [INFO] Selected 26 target genomes.
[2023-03-19 05:03:03,035] [INFO] Target genome list was written to OceanDNA-b30073/target_genomes_gtdb.txt
[2023-03-19 05:03:03,052] [INFO] Task started: fastANI
[2023-03-19 05:03:03,052] [INFO] Running command: fastANI --query /var/lib/cwl/stg107b4686-51fa-4c31-9d5f-0d1305d3b69c/OceanDNA-b30073.fa --refList OceanDNA-b30073/target_genomes_gtdb.txt --output OceanDNA-b30073/fastani_result_gtdb.tsv --threads 1
[2023-03-19 05:03:19,915] [INFO] Task succeeded: fastANI
[2023-03-19 05:03:19,930] [INFO] Found 26 fastANI hits (0 hits with ANI > circumscription radius)
[2023-03-19 05:03:19,930] [INFO] GTDB search result
--------------------------------------------------------------------------------
accession	gtdb_species	ani	matched_fragments	total_fragments	gtdb_taxonomy	ani_circumscription_radius	mean_intra_species_ani	min_intra_species_ani	mean_intra_species_af	min_intra_species_af	num_clustered_genomes	status
GCF_009363715.1	s__Roseovarius sp009363715	77.6605	315	1170	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhodobacterales;f__Rhodobacteraceae;g__Roseovarius	95.0	N/A	N/A	N/A	N/A	1	-
GCF_009363655.1	s__Roseovarius sp009363655	77.541	322	1170	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhodobacterales;f__Rhodobacteraceae;g__Roseovarius	95.0	98.13	98.13	0.93	0.93	2	-
GCF_001441615.1	s__Roseovarius atlanticus	77.5362	330	1170	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhodobacterales;f__Rhodobacteraceae;g__Roseovarius	95.0	N/A	N/A	N/A	N/A	1	-
GCF_008728195.1	s__Roseovarius indicus	77.5087	383	1170	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhodobacterales;f__Rhodobacteraceae;g__Roseovarius	95.0	99.18	97.54	0.97	0.92	4	-
GCF_009762325.1	s__Roseovarius faecimaris	77.4785	346	1170	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhodobacterales;f__Rhodobacteraceae;g__Roseovarius	95.0	N/A	N/A	N/A	N/A	1	-
GCF_002917925.1	s__Roseovarius confluentis	77.4631	332	1170	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhodobacterales;f__Rhodobacteraceae;g__Roseovarius	95.0	98.93	98.93	0.93	0.93	2	-
GCF_004761875.1	s__Roseovarius aestuariivivens	77.2748	268	1170	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhodobacterales;f__Rhodobacteraceae;g__Roseovarius	95.0	N/A	N/A	N/A	N/A	1	-
GCF_003260265.1	s__Roseovarius sp003260265	77.2706	262	1170	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhodobacterales;f__Rhodobacteraceae;g__Roseovarius	95.0	N/A	N/A	N/A	N/A	1	-
GCF_001262635.1	s__Aestuariivita boseongensis	77.2565	265	1170	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhodobacterales;f__Rhodobacteraceae;g__Aestuariivita	95.0	N/A	N/A	N/A	N/A	1	-
GCF_001205715.1	s__Pseudaestuariivita atlantica	77.0773	266	1170	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhodobacterales;f__Rhodobacteraceae;g__Pseudaestuariivita	95.0	N/A	N/A	N/A	N/A	1	-
GCF_900109855.1	s__Roseovarius tolerans	77.0581	300	1170	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhodobacterales;f__Rhodobacteraceae;g__Roseovarius	95.0	96.73	96.73	0.86	0.86	2	-
GCF_007995245.1	s__Tateyamaria sp007995245	77.0489	223	1170	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhodobacterales;f__Rhodobacteraceae;g__Tateyamaria	95.0	N/A	N/A	N/A	N/A	1	-
GCF_004803475.1	s__Aliishimia ponticola	76.9635	213	1170	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhodobacterales;f__Rhodobacteraceae;g__Aliishimia	95.0	N/A	N/A	N/A	N/A	1	-
GCF_009857745.1	s__Thalassobius mangrovi	76.9211	275	1170	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhodobacterales;f__Rhodobacteraceae;g__Thalassobius	95.0	N/A	N/A	N/A	N/A	1	-
GCF_002925845.1	s__Roseovarius nitratireducens	76.8921	252	1170	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhodobacterales;f__Rhodobacteraceae;g__Roseovarius	95.0	96.60	96.60	0.90	0.90	2	-
GCF_003072125.1	s__Roseovarius sediminicola	76.891	239	1170	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhodobacterales;f__Rhodobacteraceae;g__Roseovarius	95.0	N/A	N/A	N/A	N/A	1	-
GCF_016428585.1	s__Sedimentitalea sp016428585	76.8153	210	1170	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhodobacterales;f__Rhodobacteraceae;g__Sedimentitalea	95.0	N/A	N/A	N/A	N/A	1	-
GCF_900177545.1	s__Muriiphilus sp900177545	76.7426	130	1170	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhodobacterales;f__Rhodobacteraceae;g__Muriiphilus	95.0	N/A	N/A	N/A	N/A	1	-
GCA_002708925.1	s__Thalassobius sp002708925	76.7254	177	1170	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhodobacterales;f__Rhodobacteraceae;g__Thalassobius	95.0	99.91	99.91	0.84	0.84	2	-
GCF_013155465.1	s__Alterinioella nitratireducens	76.6396	192	1170	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhodobacterales;f__Rhodobacteraceae;g__Alterinioella	95.0	97.91	97.83	0.91	0.91	3	-
GCF_900184945.1	s__Maliponia aquimaris	76.6196	220	1170	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhodobacterales;f__Rhodobacteraceae;g__Maliponia	95.0	N/A	N/A	N/A	N/A	1	-
GCA_002706605.1	s__SAT37 sp002706605	76.569	146	1170	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhodobacterales;f__Rhodobacteraceae;g__SAT37	95.0	99.85	99.85	0.95	0.95	2	-
GCF_014640115.1	s__Muriiphilus lacisalsi	76.4883	158	1170	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhodobacterales;f__Rhodobacteraceae;g__Muriiphilus	95.0	N/A	N/A	N/A	N/A	1	-
GCA_019061145.1	s__ASV31 sp019061145	76.3877	183	1170	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhodobacterales;f__Rhodobacteraceae;g__ASV31	95.0	N/A	N/A	N/A	N/A	1	-
GCA_018401965.1	s__Roseicyclus sp018401965	76.3743	153	1170	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhodobacterales;f__Rhodobacteraceae;g__Roseicyclus	95.0	N/A	N/A	N/A	N/A	1	-
GCF_003789055.1	s__Oceanicola lentulus	76.1684	158	1170	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhodobacterales;f__Rhodobacteraceae;g__Oceanicola	95.0	N/A	N/A	N/A	N/A	1	-
--------------------------------------------------------------------------------
[2023-03-19 05:03:19,930] [INFO] GTDB search result was written to OceanDNA-b30073/result_gtdb.tsv
[2023-03-19 05:03:19,930] [INFO] ===== GTDB Search completed =====
[2023-03-19 05:03:19,933] [INFO] DFAST_QC result json was written to OceanDNA-b30073/dqc_result.json
[2023-03-19 05:03:19,933] [INFO] DFAST_QC completed!
[2023-03-19 05:03:19,933] [INFO] Total running time: 0h2m0s
