[2023-03-19 02:16:59,735] [INFO] DFAST_QC pipeline started.
[2023-03-19 02:16:59,735] [INFO] DFAST_QC version: 0.5.7
[2023-03-19 02:16:59,735] [INFO] DQC Reference Directory: /var/lib/cwl/stgfc7f883f-bea9-4b21-ae6d-bdcbd41afbb5/dqc_reference
[2023-03-19 02:17:00,811] [INFO] ===== Start taxonomy check using ANI =====
[2023-03-19 02:17:00,812] [INFO] Task started: Prodigal
[2023-03-19 02:17:00,812] [INFO] Running command: cat /var/lib/cwl/stg82e1cde6-a6af-4103-8b3c-16669aca43fd/OceanDNA-b3301.fa | prodigal -d OceanDNA-b3301/cds.fna -a OceanDNA-b3301/protein.faa -g 11 -q > /dev/null
[2023-03-19 02:17:12,999] [INFO] Task succeeded: Prodigal
[2023-03-19 02:17:12,999] [INFO] Task started: HMMsearch
[2023-03-19 02:17:12,999] [INFO] Running command: hmmsearch --tblout OceanDNA-b3301/hmmer_result.tsv -E 1E-50 /var/lib/cwl/stgfc7f883f-bea9-4b21-ae6d-bdcbd41afbb5/dqc_reference/reference_markers.hmm OceanDNA-b3301/protein.faa > /dev/null
[2023-03-19 02:17:13,171] [INFO] Task succeeded: HMMsearch
[2023-03-19 02:17:13,171] [WARNING] Found 4/6 markers. [/var/lib/cwl/stg82e1cde6-a6af-4103-8b3c-16669aca43fd/OceanDNA-b3301.fa]
[2023-03-19 02:17:13,193] [INFO] Query marker FASTA was written to OceanDNA-b3301/markers.fasta
[2023-03-19 02:17:13,195] [INFO] Task started: Blastn
[2023-03-19 02:17:13,195] [INFO] Running command: blastn -query OceanDNA-b3301/markers.fasta -db /var/lib/cwl/stgfc7f883f-bea9-4b21-ae6d-bdcbd41afbb5/dqc_reference/reference_markers.fasta -out OceanDNA-b3301/blast.markers.tsv -outfmt 6 -max_hsps 1 -num_alignments 5
[2023-03-19 02:17:14,154] [INFO] Task succeeded: Blastn
[2023-03-19 02:17:14,154] [INFO] Selected 21 target genomes.
[2023-03-19 02:17:14,155] [INFO] Target genome list was written to OceanDNA-b3301/target_genomes.txt
[2023-03-19 02:17:14,165] [INFO] Task started: fastANI
[2023-03-19 02:17:14,165] [INFO] Running command: fastANI --query /var/lib/cwl/stg82e1cde6-a6af-4103-8b3c-16669aca43fd/OceanDNA-b3301.fa --refList OceanDNA-b3301/target_genomes.txt --output OceanDNA-b3301/fastani_result.tsv --threads 1
[2023-03-19 02:17:28,592] [INFO] Task succeeded: fastANI
[2023-03-19 02:17:28,592] [INFO] Loading species specific ANI threshold from /var/lib/cwl/stgfc7f883f-bea9-4b21-ae6d-bdcbd41afbb5/dqc_reference/prokaryote_ANI_species_specific_threshold.txt
[2023-03-19 02:17:28,592] [WARNING] Species-specific ANI threshold file not found. Will use the default threshold for all species. [/var/lib/cwl/stgfc7f883f-bea9-4b21-ae6d-bdcbd41afbb5/dqc_reference/prokaryote_ANI_species_specific_threshold.txt]
[2023-03-19 02:17:28,604] [INFO] Found 21 fastANI hits (0 hits with ANI > threshold)
[2023-03-19 02:17:28,604] [INFO] The taxonomy check result is classified as 'below_threshold'.
[2023-03-19 02:17:28,604] [INFO] DFAST Taxonomy check final result
--------------------------------------------------------------------------------
organism_name	strain	accession	taxid	species_taxid	relation_to_type	validated	ani	matched_fragments	total_fragments	ani_threshold	status
Nocardioides salarius	strain=DSM 18239	GCA_015339545.1	374513	374513	type	True	79.1759	329	635	95	below_threshold
Nocardioides marinisabuli	strain=DSM 18965	GCA_013409965.1	419476	419476	type	True	79.0622	334	635	95	below_threshold
Nocardioides marinisabuli	strain=DSM 18965	GCA_013466785.1	419476	419476	type	True	79.0591	327	635	95	below_threshold
Nocardioides perillae	strain=DSM 24552	GCA_013409425.1	1119534	1119534	type	True	79.0237	319	635	95	below_threshold
Marmoricola scoriae	strain=DSM 22127	GCA_900104965.1	642780	642780	type	True	78.8479	315	635	95	below_threshold
Nocardioides ginsengisegetis	strain=DSM 21349	GCA_014138045.1	661491	661491	type	True	78.79	291	635	95	below_threshold
Nocardioides lacusdianchii	strain=JXJ CY 38	GCA_020102855.1	2783664	2783664	type	True	78.739	293	635	95	below_threshold
Marmoricola aequoreus	strain=NRRL B-24464	GCA_000720335.1	397278	397278	type	True	78.722	334	635	95	below_threshold
Marmoricola aurantiacus	strain=DSM 12652	GCA_003752505.1	86796	86796	type	True	78.7057	321	635	95	below_threshold
Nocardioides okcheonensis	strain=MMS20-HV4-12	GCA_020991065.1	2894081	2894081	type	True	78.6733	308	635	95	below_threshold
Nocardioides kribbensis	strain=DSM 16314	GCA_015070375.1	305517	305517	type	True	78.6491	332	635	95	below_threshold
Nocardioides iriomotensis	strain=NBRC 105384	GCA_004168035.1	715784	715784	type	True	78.603	311	635	95	below_threshold
Nocardioides furvisabuli	strain=JCM 13813	GCA_021083185.1	375542	375542	type	True	78.5956	285	635	95	below_threshold
Nocardioides ferulae	strain=SZ4R5S7	GCA_003660455.1	2340821	2340821	type	True	78.5881	312	635	95	below_threshold
Nocardioides donggukensis	strain=MJB4	GCA_014842875.1	2774019	2774019	type	True	78.5281	275	635	95	below_threshold
Nocardioides panaciterrulae	strain=DSM 21350	GCA_013409645.1	661492	661492	type	True	78.4833	295	635	95	below_threshold
Nocardioides solisilvae	strain=Ka25	GCA_003194625.1	1542435	1542435	type	True	78.4448	261	635	95	below_threshold
Nocardioides panacis	strain=G188	GCA_019039255.1	2849501	2849501	type	True	78.4345	296	635	95	below_threshold
Nocardioides lijunqiniae	strain=S-531	GCA_015024225.1	2760832	2760832	type	True	78.4308	298	635	95	below_threshold
Nocardioides litoris	strain=DSM 103718	GCA_006346315.1	1926648	1926648	type	True	78.3351	335	635	95	below_threshold
Nocardioides cavernae	strain=KCTC 39551	GCA_014779675.1	1921566	1921566	type	True	78.1562	288	635	95	below_threshold
--------------------------------------------------------------------------------
[2023-03-19 02:17:28,604] [INFO] DFAST Taxonomy check result was written to OceanDNA-b3301/tc_result.tsv
[2023-03-19 02:17:28,604] [INFO] ===== Taxonomy check completed =====
[2023-03-19 02:17:28,604] [INFO] ===== Start completeness check using CheckM =====
[2023-03-19 02:17:28,604] [INFO] Setting CHECKM_DATA_PATH to /var/lib/cwl/stgfc7f883f-bea9-4b21-ae6d-bdcbd41afbb5/dqc_reference/checkm_data
[2023-03-19 02:17:28,605] [INFO] Selected 'Prokaryote' markers (life, taxid=0) for CheckM
[2023-03-19 02:17:28,610] [INFO] Task started: CheckM
[2023-03-19 02:17:28,610] [INFO] Running command: checkm taxonomy_wf --tab_table -f OceanDNA-b3301/cc_result.tsv -t 1 life "Prokaryote" OceanDNA-b3301/checkm_input OceanDNA-b3301/checkm_result
[2023-03-19 02:18:02,300] [INFO] Task succeeded: CheckM
[2023-03-19 02:18:02,300] [INFO] Completeness check finished.
--------------------------------------------------------------------------------
Completeness: 65.28%
Contamination: 6.25%
Strain heterogeneity: 100.00%
--------------------------------------------------------------------------------
[2023-03-19 02:18:02,304] [INFO] ===== Completeness check finished =====
[2023-03-19 02:18:02,304] [INFO] ===== Start GTDB Search =====
[2023-03-19 02:18:02,304] [INFO] Query marker FASTA already exists. Will reuse it. (OceanDNA-b3301/markers.fasta)
[2023-03-19 02:18:02,305] [INFO] Task started: Blastn
[2023-03-19 02:18:02,305] [INFO] Running command: blastn -query OceanDNA-b3301/markers.fasta -db /var/lib/cwl/stgfc7f883f-bea9-4b21-ae6d-bdcbd41afbb5/dqc_reference/reference_markers_gtdb.fasta -out OceanDNA-b3301/blast.markers.gtdb.tsv -outfmt 6 -max_hsps 1 -num_alignments 5
[2023-03-19 02:18:04,011] [INFO] Task succeeded: Blastn
[2023-03-19 02:18:04,013] [INFO] Selected 18 target genomes.
[2023-03-19 02:18:04,013] [INFO] Target genome list was written to OceanDNA-b3301/target_genomes_gtdb.txt
[2023-03-19 02:18:04,035] [INFO] Task started: fastANI
[2023-03-19 02:18:04,035] [INFO] Running command: fastANI --query /var/lib/cwl/stg82e1cde6-a6af-4103-8b3c-16669aca43fd/OceanDNA-b3301.fa --refList OceanDNA-b3301/target_genomes_gtdb.txt --output OceanDNA-b3301/fastani_result_gtdb.tsv --threads 1
[2023-03-19 02:18:16,866] [INFO] Task succeeded: fastANI
[2023-03-19 02:18:16,877] [INFO] Found 18 fastANI hits (0 hits with ANI > circumscription radius)
[2023-03-19 02:18:16,877] [INFO] GTDB search result
--------------------------------------------------------------------------------
accession	gtdb_species	ani	matched_fragments	total_fragments	gtdb_taxonomy	ani_circumscription_radius	mean_intra_species_ani	min_intra_species_ani	mean_intra_species_af	min_intra_species_af	num_clustered_genomes	status
GCA_018672315.1	s__Nocardioides_A sp018672315	83.9758	515	635	d__Bacteria;p__Actinobacteriota;c__Actinomycetia;o__Propionibacteriales;f__Nocardioidaceae;g__Nocardioides_A	95.0	N/A	N/A	N/A	N/A	1	-
GCF_013466785.1	s__Nocardioides marinisabuli	79.0503	327	635	d__Bacteria;p__Actinobacteriota;c__Actinomycetia;o__Propionibacteriales;f__Nocardioidaceae;g__Nocardioides	95.0	99.99	99.99	1.00	1.00	2	-
GCF_013409425.1	s__Nocardioides perillae	78.9887	322	635	d__Bacteria;p__Actinobacteriota;c__Actinomycetia;o__Propionibacteriales;f__Nocardioidaceae;g__Nocardioides	95.0	N/A	N/A	N/A	N/A	1	-
GCF_014138045.1	s__Nocardioides ginsengisegetis	78.7605	293	635	d__Bacteria;p__Actinobacteriota;c__Actinomycetia;o__Propionibacteriales;f__Nocardioidaceae;g__Nocardioides	95.0	96.18	96.18	0.84	0.84	2	-
GCF_000720335.1	s__Marmoricola aequoreus	78.689	336	635	d__Bacteria;p__Actinobacteriota;c__Actinomycetia;o__Propionibacteriales;f__Nocardioidaceae;g__Marmoricola	95.0	N/A	N/A	N/A	N/A	1	-
GCF_014266025.1	s__Nocardioides deserti	78.6537	314	635	d__Bacteria;p__Actinobacteriota;c__Actinomycetia;o__Propionibacteriales;f__Nocardioidaceae;g__Nocardioides	95.0	100.00	100.00	0.99	0.99	2	-
GCF_004168035.1	s__Nocardioides_B iriomotensis	78.6287	309	635	d__Bacteria;p__Actinobacteriota;c__Actinomycetia;o__Propionibacteriales;f__Nocardioidaceae;g__Nocardioides_B	95.0	N/A	N/A	N/A	N/A	1	-
GCF_011250565.1	s__Nocardioides sp011250565	78.5442	316	635	d__Bacteria;p__Actinobacteriota;c__Actinomycetia;o__Propionibacteriales;f__Nocardioidaceae;g__Nocardioides	95.0	N/A	N/A	N/A	N/A	1	-
GCF_000620645.1	s__Nocardioides sp000620645	78.4987	285	635	d__Bacteria;p__Actinobacteriota;c__Actinomycetia;o__Propionibacteriales;f__Nocardioidaceae;g__Nocardioides	95.0	N/A	N/A	N/A	N/A	1	-
GCF_013177455.1	s__Nocardioides sp013177455	78.4597	322	635	d__Bacteria;p__Actinobacteriota;c__Actinomycetia;o__Propionibacteriales;f__Nocardioidaceae;g__Nocardioides	95.0	99.28	98.92	0.97	0.95	4	-
GCF_015024225.1	s__Nocardioides sp015024225	78.4566	296	635	d__Bacteria;p__Actinobacteriota;c__Actinomycetia;o__Propionibacteriales;f__Nocardioidaceae;g__Nocardioides	95.0	97.48	97.48	0.91	0.91	2	-
GCF_013778305.1	s__Nocardioides sp013778305	78.4508	306	635	d__Bacteria;p__Actinobacteriota;c__Actinomycetia;o__Propionibacteriales;f__Nocardioidaceae;g__Nocardioides	95.0	100.00	100.00	1.00	1.00	2	-
GCF_001425175.1	s__Nocardioides sp001425175	78.4248	312	635	d__Bacteria;p__Actinobacteriota;c__Actinomycetia;o__Propionibacteriales;f__Nocardioidaceae;g__Nocardioides	95.0	100.00	100.00	1.00	1.00	2	-
GCF_006346315.1	s__Nocardioides litoris	78.3706	332	635	d__Bacteria;p__Actinobacteriota;c__Actinomycetia;o__Propionibacteriales;f__Nocardioidaceae;g__Nocardioides	95.0	N/A	N/A	N/A	N/A	1	-
GCA_013361955.1	s__JABFXA01 sp013361955	78.3095	267	635	d__Bacteria;p__Actinobacteriota;c__Actinomycetia;o__Propionibacteriales;f__Nocardioidaceae;g__JABFXA01	95.0	99.75	99.75	0.92	0.92	2	-
GCF_013624435.2	s__Nocardioides sp013624415	78.2374	262	635	d__Bacteria;p__Actinobacteriota;c__Actinomycetia;o__Propionibacteriales;f__Nocardioidaceae;g__Nocardioides	95.0	99.04	99.04	0.95	0.95	2	-
GCA_902805865.1	s__Marmoricola_A sp902805865	77.8142	204	635	d__Bacteria;p__Actinobacteriota;c__Actinomycetia;o__Propionibacteriales;f__Nocardioidaceae;g__Marmoricola_A	95.0	N/A	N/A	N/A	N/A	1	-
GCF_013055795.1	s__Streptomyces sp013055795	76.1188	175	635	d__Bacteria;p__Actinobacteriota;c__Actinomycetia;o__Streptomycetales;f__Streptomycetaceae;g__Streptomyces	95.0	N/A	N/A	N/A	N/A	1	-
--------------------------------------------------------------------------------
[2023-03-19 02:18:16,877] [INFO] GTDB search result was written to OceanDNA-b3301/result_gtdb.tsv
[2023-03-19 02:18:16,877] [INFO] ===== GTDB Search completed =====
[2023-03-19 02:18:16,880] [INFO] DFAST_QC result json was written to OceanDNA-b3301/dqc_result.json
[2023-03-19 02:18:16,880] [INFO] DFAST_QC completed!
[2023-03-19 02:18:16,880] [INFO] Total running time: 0h1m17s
