[2023-03-19 04:59:58,320] [INFO] DFAST_QC pipeline started.
[2023-03-19 04:59:58,320] [INFO] DFAST_QC version: 0.5.7
[2023-03-19 04:59:58,320] [INFO] DQC Reference Directory: /var/lib/cwl/stg2c04f41a-b1da-4001-98ac-d7e20db3db8e/dqc_reference
[2023-03-19 04:59:59,425] [INFO] ===== Start taxonomy check using ANI =====
[2023-03-19 04:59:59,425] [INFO] Task started: Prodigal
[2023-03-19 04:59:59,425] [INFO] Running command: cat /var/lib/cwl/stg9a9f5b0c-99ba-4c2e-af65-a85461f8d9c8/OceanDNA-b13345.fa | prodigal -d OceanDNA-b13345/cds.fna -a OceanDNA-b13345/protein.faa -g 11 -q > /dev/null
[2023-03-19 05:00:11,064] [INFO] Task succeeded: Prodigal
[2023-03-19 05:00:11,065] [INFO] Task started: HMMsearch
[2023-03-19 05:00:11,065] [INFO] Running command: hmmsearch --tblout OceanDNA-b13345/hmmer_result.tsv -E 1E-50 /var/lib/cwl/stg2c04f41a-b1da-4001-98ac-d7e20db3db8e/dqc_reference/reference_markers.hmm OceanDNA-b13345/protein.faa > /dev/null
[2023-03-19 05:00:11,268] [INFO] Task succeeded: HMMsearch
[2023-03-19 05:00:11,269] [INFO] Found 6/6 markers.
[2023-03-19 05:00:11,288] [INFO] Query marker FASTA was written to OceanDNA-b13345/markers.fasta
[2023-03-19 05:00:11,289] [INFO] Task started: Blastn
[2023-03-19 05:00:11,289] [INFO] Running command: blastn -query OceanDNA-b13345/markers.fasta -db /var/lib/cwl/stg2c04f41a-b1da-4001-98ac-d7e20db3db8e/dqc_reference/reference_markers.fasta -out OceanDNA-b13345/blast.markers.tsv -outfmt 6 -max_hsps 1 -num_alignments 5
[2023-03-19 05:00:11,846] [INFO] Task succeeded: Blastn
[2023-03-19 05:00:11,847] [INFO] Selected 24 target genomes.
[2023-03-19 05:00:11,847] [INFO] Target genome list was written to OceanDNA-b13345/target_genomes.txt
[2023-03-19 05:00:11,860] [INFO] Task started: fastANI
[2023-03-19 05:00:11,861] [INFO] Running command: fastANI --query /var/lib/cwl/stg9a9f5b0c-99ba-4c2e-af65-a85461f8d9c8/OceanDNA-b13345.fa --refList OceanDNA-b13345/target_genomes.txt --output OceanDNA-b13345/fastani_result.tsv --threads 1
[2023-03-19 05:00:23,082] [INFO] Task succeeded: fastANI
[2023-03-19 05:00:23,083] [INFO] Loading species specific ANI threshold from /var/lib/cwl/stg2c04f41a-b1da-4001-98ac-d7e20db3db8e/dqc_reference/prokaryote_ANI_species_specific_threshold.txt
[2023-03-19 05:00:23,083] [WARNING] Species-specific ANI threshold file not found. Will use the default threshold for all species. [/var/lib/cwl/stg2c04f41a-b1da-4001-98ac-d7e20db3db8e/dqc_reference/prokaryote_ANI_species_specific_threshold.txt]
[2023-03-19 05:00:23,096] [INFO] Found 24 fastANI hits (0 hits with ANI > threshold)
[2023-03-19 05:00:23,096] [INFO] The taxonomy check result is classified as 'below_threshold'.
[2023-03-19 05:00:23,096] [INFO] DFAST Taxonomy check final result
--------------------------------------------------------------------------------
organism_name	strain	accession	taxid	species_taxid	relation_to_type	validated	ani	matched_fragments	total_fragments	ani_threshold	status
Arcobacter nitrofigilis	strain=DSM 7299	GCA_000092245.1	28199	28199	type	True	91.9267	689	788	95	below_threshold
Arcobacter venerupis	strain=LMG 26156	GCA_013201665.1	1054033	1054033	type	True	79.4909	380	788	95	below_threshold
Arcobacter venerupis	strain=CECT7836	GCA_004023405.1	1054033	1054033	type	True	79.4731	369	788	95	below_threshold
Arcobacter ellisii	strain=CECT 7837	GCA_004115815.1	913109	913109	type	True	79.341	318	788	95	below_threshold
Arcobacter ellisii	strain=LMG 26155	GCA_003544915.1	913109	913109	type	True	79.2985	324	788	95	below_threshold
Arcobacter suis	strain=CECT 7833	GCA_003544815.1	1278212	1278212	type	True	79.0853	316	788	95	below_threshold
Arcobacter caeni	strain=RW17-10	GCA_003063245.1	1912877	1912877	type	True	79.0309	303	788	95	below_threshold
Malaciobacter molluscorum	strain=CECT 7696	GCA_003544935.1	1032072	1032072	type	True	79.0254	355	788	95	below_threshold
Arcobacter cloacae	strain=CECT 7834	GCA_004115805.1	1054034	1054034	type	True	79.0158	308	788	95	below_threshold
Malaciobacter molluscorum	strain=F98-3	GCA_002701265.1	1032072	1032072	type	True	79.0093	348	788	95	below_threshold
Arcobacter suis	strain=CECT7833	GCA_004023465.1	1278212	1278212	type	True	78.9915	314	788	95	below_threshold
Malaciobacter mytili	strain=CECT 7386	GCA_004116555.1	603050	603050	type	True	78.9004	335	788	95	below_threshold
Halarcobacter bivalviorum	strain=LMG 26154	GCA_003346815.1	663364	663364	type	True	78.8552	324	788	95	below_threshold
Malaciobacter mytili	strain=LMG 24559	GCA_003346775.1	603050	603050	type	True	78.8372	342	788	95	below_threshold
Halarcobacter bivalviorum	strain=CECT 7835	GCA_004116675.1	663364	663364	type	True	78.8214	323	788	95	below_threshold
[Halarcobacter] arenosus	strain=CAU 1517	GCA_005771535.1	2576037	2576037	type	True	78.7826	351	788	95	below_threshold
Malaciobacter canalis	strain=F138-33	GCA_002723485.1	1912871	1912871	type	True	78.7564	336	788	95	below_threshold
Arcobacter defluvii	strain=CECT 7697	GCA_004115775.1	873191	873191	type	True	78.7369	321	788	95	below_threshold
Arcobacter defluvii	strain=LMG 25694	GCA_013201725.1	873191	873191	type	True	78.7209	329	788	95	below_threshold
Aliarcobacter butzleri	strain=NCTC 12481	GCA_900187115.1	28197	28197	type	True	78.6457	296	788	95	below_threshold
Aliarcobacter butzleri	strain=LMG 10828	GCA_024584145.1	28197	28197	type	True	78.6369	296	788	95	below_threshold
Aliarcobacter butzleri	strain=RM4018	GCA_000014025.1	28197	28197	type	True	78.5912	300	788	95	below_threshold
[Arcobacter] porcinus	strain=CCUG 56899	GCA_004299785.2	1935204	1935204	type	True	78.0557	203	788	95	below_threshold
[Arcobacter] porcinus	strain=LMG 24487	GCA_024584075.1	1935204	1935204	type	True	77.9969	206	788	95	below_threshold
--------------------------------------------------------------------------------
[2023-03-19 05:00:23,096] [INFO] DFAST Taxonomy check result was written to OceanDNA-b13345/tc_result.tsv
[2023-03-19 05:00:23,096] [INFO] ===== Taxonomy check completed =====
[2023-03-19 05:00:23,096] [INFO] ===== Start completeness check using CheckM =====
[2023-03-19 05:00:23,096] [INFO] Setting CHECKM_DATA_PATH to /var/lib/cwl/stg2c04f41a-b1da-4001-98ac-d7e20db3db8e/dqc_reference/checkm_data
[2023-03-19 05:00:23,097] [INFO] Selected 'Prokaryote' markers (life, taxid=0) for CheckM
[2023-03-19 05:00:23,101] [INFO] Task started: CheckM
[2023-03-19 05:00:23,101] [INFO] Running command: checkm taxonomy_wf --tab_table -f OceanDNA-b13345/cc_result.tsv -t 1 life "Prokaryote" OceanDNA-b13345/checkm_input OceanDNA-b13345/checkm_result
[2023-03-19 05:00:55,608] [INFO] Task succeeded: CheckM
[2023-03-19 05:00:55,608] [INFO] Completeness check finished.
--------------------------------------------------------------------------------
Completeness: 75.93%
Contamination: 0.00%
Strain heterogeneity: 0.00%
--------------------------------------------------------------------------------
[2023-03-19 05:00:55,611] [INFO] ===== Completeness check finished =====
[2023-03-19 05:00:55,611] [INFO] ===== Start GTDB Search =====
[2023-03-19 05:00:55,611] [INFO] Query marker FASTA already exists. Will reuse it. (OceanDNA-b13345/markers.fasta)
[2023-03-19 05:00:55,612] [INFO] Task started: Blastn
[2023-03-19 05:00:55,612] [INFO] Running command: blastn -query OceanDNA-b13345/markers.fasta -db /var/lib/cwl/stg2c04f41a-b1da-4001-98ac-d7e20db3db8e/dqc_reference/reference_markers_gtdb.fasta -out OceanDNA-b13345/blast.markers.gtdb.tsv -outfmt 6 -max_hsps 1 -num_alignments 5
[2023-03-19 05:00:56,408] [INFO] Task succeeded: Blastn
[2023-03-19 05:00:56,409] [INFO] Selected 17 target genomes.
[2023-03-19 05:00:56,409] [INFO] Target genome list was written to OceanDNA-b13345/target_genomes_gtdb.txt
[2023-03-19 05:00:56,420] [INFO] Task started: fastANI
[2023-03-19 05:00:56,420] [INFO] Running command: fastANI --query /var/lib/cwl/stg9a9f5b0c-99ba-4c2e-af65-a85461f8d9c8/OceanDNA-b13345.fa --refList OceanDNA-b13345/target_genomes_gtdb.txt --output OceanDNA-b13345/fastani_result_gtdb.tsv --threads 1
[2023-03-19 05:01:05,634] [INFO] Task succeeded: fastANI
[2023-03-19 05:01:05,644] [INFO] Found 17 fastANI hits (1 hit with ANI > circumscription radius)
[2023-03-19 05:01:05,644] [INFO] GTDB search result
--------------------------------------------------------------------------------
accession	gtdb_species	ani	matched_fragments	total_fragments	gtdb_taxonomy	ani_circumscription_radius	mean_intra_species_ani	min_intra_species_ani	mean_intra_species_af	min_intra_species_af	num_clustered_genomes	status
GCA_004116465.1	s__Arcobacter sp004116465	97.7938	736	788	d__Bacteria;p__Campylobacterota;c__Campylobacteria;o__Campylobacterales;f__Arcobacteraceae;g__Arcobacter	95.0	N/A	N/A	N/A	N/A	1	conclusive
GCF_000092245.1	s__Arcobacter nitrofigilis	91.9267	689	788	d__Bacteria;p__Campylobacterota;c__Campylobacteria;o__Campylobacterales;f__Arcobacteraceae;g__Arcobacter	95.0	N/A	N/A	N/A	N/A	1	-
GCA_002869535.1	s__Arcobacter sp002869535	84.2533	550	788	d__Bacteria;p__Campylobacterota;c__Campylobacteria;o__Campylobacterales;f__Arcobacteraceae;g__Arcobacter	95.0	N/A	N/A	N/A	N/A	1	-
GCA_002869565.1	s__Halarcobacter sp002869565	79.8198	405	788	d__Bacteria;p__Campylobacterota;c__Campylobacteria;o__Campylobacterales;f__Arcobacteraceae;g__Halarcobacter	95.0	N/A	N/A	N/A	N/A	1	-
GCF_013201665.1	s__Aliarcobacter venerupis	79.4913	380	788	d__Bacteria;p__Campylobacterota;c__Campylobacteria;o__Campylobacterales;f__Arcobacteraceae;g__Aliarcobacter	95.0	100.00	100.00	0.99	0.99	2	-
GCF_003544815.1	s__Aliarcobacter suis	79.0999	315	788	d__Bacteria;p__Campylobacterota;c__Campylobacteria;o__Campylobacterales;f__Arcobacteraceae;g__Aliarcobacter	95.0	100.00	100.00	1.00	1.00	2	-
GCF_003544935.1	s__Malaciobacter molluscorum	79.0345	354	788	d__Bacteria;p__Campylobacterota;c__Campylobacteria;o__Campylobacterales;f__Arcobacteraceae;g__Malaciobacter	95.0	99.44	98.88	0.96	0.91	3	-
GCF_003063245.1	s__Aliarcobacter caeni	79.022	303	788	d__Bacteria;p__Campylobacterota;c__Campylobacteria;o__Campylobacterales;f__Arcobacteraceae;g__Aliarcobacter	95.0	N/A	N/A	N/A	N/A	1	-
GCA_001895145.1	s__Halarcobacter sp001895145	79.012	336	788	d__Bacteria;p__Campylobacterota;c__Campylobacteria;o__Campylobacterales;f__Arcobacteraceae;g__Halarcobacter	95.0	N/A	N/A	N/A	N/A	1	-
GCA_003252105.1	s__Halarcobacter sp003252105	78.9902	318	788	d__Bacteria;p__Campylobacterota;c__Campylobacteria;o__Campylobacterales;f__Arcobacteraceae;g__Halarcobacter	95.0	N/A	N/A	N/A	N/A	1	-
GCF_003346815.1	s__Halarcobacter bivalviorum	78.8608	323	788	d__Bacteria;p__Campylobacterota;c__Campylobacteria;o__Campylobacterales;f__Arcobacteraceae;g__Halarcobacter	95.0	98.23	96.46	0.96	0.92	3	-
GCF_001956695.1	s__Poseidonibacter parvus	78.7691	305	788	d__Bacteria;p__Campylobacterota;c__Campylobacteria;o__Campylobacterales;f__Arcobacteraceae;g__Poseidonibacter	95.0	98.33	98.33	0.91	0.91	2	-
GCF_009208075.1	s__Poseidonibacter sp009208075	78.7586	321	788	d__Bacteria;p__Campylobacterota;c__Campylobacteria;o__Campylobacterales;f__Arcobacteraceae;g__Poseidonibacter	95.0	99.63	99.63	0.93	0.93	3	-
GCF_900187115.1	s__Aliarcobacter butzleri	78.6321	297	788	d__Bacteria;p__Campylobacterota;c__Campylobacteria;o__Campylobacterales;f__Arcobacteraceae;g__Aliarcobacter	95.0	97.53	96.98	0.89	0.81	50	-
GCA_001655195.1	s__Arcobacter_A sp001655195	78.2089	220	788	d__Bacteria;p__Campylobacterota;c__Campylobacteria;o__Campylobacterales;f__Arcobacteraceae;g__Arcobacter_A	95.0	N/A	N/A	N/A	N/A	1	-
GCF_004299785.2	s__Aliarcobacter porcinus	78.0399	204	788	d__Bacteria;p__Campylobacterota;c__Campylobacteria;o__Campylobacterales;f__Arcobacteraceae;g__Aliarcobacter	95.0	98.76	98.06	0.91	0.88	6	-
GCA_003316695.1	s__Aliarcobacter sp003316695	77.9379	237	788	d__Bacteria;p__Campylobacterota;c__Campylobacteria;o__Campylobacterales;f__Arcobacteraceae;g__Aliarcobacter	95.0	97.67	97.67	0.88	0.88	2	-
--------------------------------------------------------------------------------
[2023-03-19 05:01:05,644] [INFO] GTDB search result was written to OceanDNA-b13345/result_gtdb.tsv
[2023-03-19 05:01:05,645] [INFO] ===== GTDB Search completed =====
[2023-03-19 05:01:05,647] [INFO] DFAST_QC result json was written to OceanDNA-b13345/dqc_result.json
[2023-03-19 05:01:05,647] [INFO] DFAST_QC completed!
[2023-03-19 05:01:05,647] [INFO] Total running time: 0h1m7s
