[2023-06-07 22:49:22,668] [INFO] DFAST_QC pipeline started.
[2023-06-07 22:49:22,671] [INFO] DFAST_QC version: 0.5.7
[2023-06-07 22:49:22,671] [INFO] DQC Reference Directory: /var/lib/cwl/stgf49c5325-bf03-40fd-99f7-9d935bfba9c2/dqc_reference
[2023-06-07 22:49:25,113] [INFO] ===== Start taxonomy check using ANI =====
[2023-06-07 22:49:25,114] [INFO] Task started: Prodigal
[2023-06-07 22:49:25,115] [INFO] Running command: gunzip -c /var/lib/cwl/stg30943bff-9e5e-4f91-a184-62530088b800/GCA_945861925.1_MaH-09apr19-236_genomic.fna.gz | prodigal -d GCA_945861925.1_MaH-09apr19-236_genomic.fna/cds.fna -a GCA_945861925.1_MaH-09apr19-236_genomic.fna/protein.faa -g 11 -q > /dev/null
[2023-06-07 22:49:35,259] [INFO] Task succeeded: Prodigal
[2023-06-07 22:49:35,259] [INFO] Task started: HMMsearch
[2023-06-07 22:49:35,259] [INFO] Running command: hmmsearch --tblout GCA_945861925.1_MaH-09apr19-236_genomic.fna/hmmer_result.tsv -E 1E-50 /var/lib/cwl/stgf49c5325-bf03-40fd-99f7-9d935bfba9c2/dqc_reference/reference_markers.hmm GCA_945861925.1_MaH-09apr19-236_genomic.fna/protein.faa > /dev/null
[2023-06-07 22:49:35,536] [INFO] Task succeeded: HMMsearch
[2023-06-07 22:49:35,537] [WARNING] Found 5/6 markers. [/var/lib/cwl/stg30943bff-9e5e-4f91-a184-62530088b800/GCA_945861925.1_MaH-09apr19-236_genomic.fna.gz]
[2023-06-07 22:49:35,574] [INFO] Query marker FASTA was written to GCA_945861925.1_MaH-09apr19-236_genomic.fna/markers.fasta
[2023-06-07 22:49:35,574] [INFO] Task started: Blastn
[2023-06-07 22:49:35,574] [INFO] Running command: blastn -query GCA_945861925.1_MaH-09apr19-236_genomic.fna/markers.fasta -db /var/lib/cwl/stgf49c5325-bf03-40fd-99f7-9d935bfba9c2/dqc_reference/reference_markers.fasta -out GCA_945861925.1_MaH-09apr19-236_genomic.fna/blast.markers.tsv -outfmt 6 -max_hsps 1 -num_alignments 5
[2023-06-07 22:49:36,554] [INFO] Task succeeded: Blastn
[2023-06-07 22:49:36,559] [INFO] Selected 29 target genomes.
[2023-06-07 22:49:36,559] [INFO] Target genome list was writen to GCA_945861925.1_MaH-09apr19-236_genomic.fna/target_genomes.txt
[2023-06-07 22:49:36,567] [INFO] Task started: fastANI
[2023-06-07 22:49:36,568] [INFO] Running command: fastANI --query /var/lib/cwl/stg30943bff-9e5e-4f91-a184-62530088b800/GCA_945861925.1_MaH-09apr19-236_genomic.fna.gz --refList GCA_945861925.1_MaH-09apr19-236_genomic.fna/target_genomes.txt --output GCA_945861925.1_MaH-09apr19-236_genomic.fna/fastani_result.tsv --threads 1
[2023-06-07 22:49:58,100] [INFO] Task succeeded: fastANI
[2023-06-07 22:49:58,101] [INFO] Loading species specific ANI threshold from /var/lib/cwl/stgf49c5325-bf03-40fd-99f7-9d935bfba9c2/dqc_reference/prokaryote_ANI_species_specific_threshold.txt
[2023-06-07 22:49:58,102] [WARNING] Species-specific ANI threshold file not found. Will use the default threshold for all species. [/var/lib/cwl/stgf49c5325-bf03-40fd-99f7-9d935bfba9c2/dqc_reference/prokaryote_ANI_species_specific_threshold.txt]
[2023-06-07 22:49:58,124] [INFO] Found 27 fastANI hits (0 hits with ANI > threshold)
[2023-06-07 22:49:58,124] [INFO] The taxonomy check result is classified as 'below_threshold'.
[2023-06-07 22:49:58,125] [INFO] DFAST Taxonomy check final result
--------------------------------------------------------------------------------
organism_name	strain	accession	taxid	species_taxid	relation_to_type	validated	ani	matched_fragments	total_fragments	ani_threshold	status
Nocardioides seonyuensis	strain=MMS17-SY207-3	GCA_004683965.1	2518371	2518371	type	True	76.4014	72	1069	95	below_threshold
Janibacter anophelis	strain=NBRC 107843	GCA_001570945.1	319054	319054	type	True	76.3085	70	1069	95	below_threshold
Knoellia aerolata	strain=DSM 18566	GCA_000768695.1	442954	442954	type	True	76.1913	73	1069	95	below_threshold
Janibacter hoylei	strain=PVAS-1	GCA_004025845.1	364298	364298	type	True	76.1122	84	1069	95	below_threshold
Nocardioides okcheonensis	strain=MMS20-HV4-12	GCA_020991065.1	2894081	2894081	type	True	76.0688	85	1069	95	below_threshold
Streptomyces thermodiastaticus	strain=DSM 40573	GCA_021394575.1	44061	44061	type	True	76.0614	84	1069	95	below_threshold
Jiangella rhizosphaerae	strain=NEAU-YY265	GCA_003579925.1	2293569	2293569	type	True	76.0549	103	1069	95	below_threshold
Janibacter hoylei	strain=PVAS-1	GCA_000297495.1	364298	364298	type	True	76.0474	83	1069	95	below_threshold
Intrasporangium chromatireducens	strain=Q5-1	GCA_000576575.1	1386088	1386088	type	True	75.9968	79	1069	95	below_threshold
Jiangella alkaliphila	strain=DSM 45079	GCA_900105925.1	419479	419479	type	True	75.9929	107	1069	95	below_threshold
Knoellia subterranea	strain=KCTC 19937	GCA_000768685.1	184882	184882	type	True	75.9679	79	1069	95	below_threshold
Jiangella muralis	strain=DSM 45357	GCA_001270745.1	702383	702383	type	True	75.951	104	1069	95	below_threshold
Streptomyces albireticuli	strain=NRRL B1670	GCA_021228125.1	1940	1940	type	True	75.9196	103	1069	95	below_threshold
Jiangella alba	strain=DSM 45237	GCA_900106035.1	561176	561176	type	True	75.9124	119	1069	95	below_threshold
Jiangella alba	strain=YIM 61503	GCA_001708125.1	561176	561176	type	True	75.9012	120	1069	95	below_threshold
Oerskovia merdavium	strain=Sa2CUA9	GCA_014836755.1	2762227	2762227	type	True	75.703	58	1069	95	below_threshold
Cnuibacter physcomitrellae	strain=CGMCC 1.15041	GCA_014640535.1	1619308	1619308	type	True	75.6642	60	1069	95	below_threshold
Streptomyces rubrisoli	strain=DSM 42083	GCA_024436055.1	1387313	1387313	type	True	75.6246	78	1069	95	below_threshold
Actinophytocola algeriensis	strain=DSM 46746	GCA_014874055.1	1768010	1768010	type	True	75.6243	78	1069	95	below_threshold
Agromyces hippuratus	strain=DSM 8598	GCA_013410355.1	286438	286438	type	True	75.6097	69	1069	95	below_threshold
Agromyces flavus	strain=CCM 7623	GCA_014635805.1	589382	589382	type	True	75.6082	51	1069	95	below_threshold
Actinophytocola algeriensis	strain=CECT 8960	GCA_014203735.1	1768010	1768010	type	True	75.5841	78	1069	95	below_threshold
Agromyces flavus	strain=CPCC 202695	GCA_900104685.1	589382	589382	type	True	75.5835	53	1069	95	below_threshold
Agromyces flavus	strain=CPCC 202695	GCA_004366335.2	589382	589382	type	True	75.5625	53	1069	95	below_threshold
Amycolatopsis xylanica	strain=CPCC 202699	GCA_900107045.1	589385	589385	type	True	75.509	85	1069	95	below_threshold
Occultella glacieicola	strain=T3246-1	GCA_004353825.1	2518684	2518684	type	True	75.495	61	1069	95	below_threshold
Nocardia lijiangensis	strain=NBRC 108240	GCA_001613045.1	299618	299618	type	True	75.4405	67	1069	95	below_threshold
--------------------------------------------------------------------------------
[2023-06-07 22:49:58,126] [INFO] DFAST Taxonomy check result was written to GCA_945861925.1_MaH-09apr19-236_genomic.fna/tc_result.tsv
[2023-06-07 22:49:58,127] [INFO] ===== Taxonomy check completed =====
[2023-06-07 22:49:58,127] [INFO] ===== Start completeness check using CheckM =====
[2023-06-07 22:49:58,127] [INFO] Setting CHECKM_DATA_PATH to /var/lib/cwl/stgf49c5325-bf03-40fd-99f7-9d935bfba9c2/dqc_reference/checkm_data
[2023-06-07 22:49:58,128] [INFO] Selected 'Prokaryote' markers (life, taxid=0) for CheckM
[2023-06-07 22:49:58,165] [INFO] Task started: CheckM
[2023-06-07 22:49:58,166] [INFO] Running command: checkm taxonomy_wf --tab_table -f GCA_945861925.1_MaH-09apr19-236_genomic.fna/cc_result.tsv -t 1 life "Prokaryote" GCA_945861925.1_MaH-09apr19-236_genomic.fna/checkm_input GCA_945861925.1_MaH-09apr19-236_genomic.fna/checkm_result
[2023-06-07 22:50:31,312] [INFO] Task succeeded: CheckM
[2023-06-07 22:50:31,313] [INFO] Completeness check finished.
--------------------------------------------------------------------------------
Completeness: 74.31%
Contamintation: 4.17%
Strain heterogeneity: 0.00%
--------------------------------------------------------------------------------
[2023-06-07 22:50:31,335] [INFO] ===== Completeness check finished =====
[2023-06-07 22:50:31,336] [INFO] ===== Start GTDB Search =====
[2023-06-07 22:50:31,336] [INFO] Query marker FASTA already exists. Will reuse it. (GCA_945861925.1_MaH-09apr19-236_genomic.fna/markers.fasta)
[2023-06-07 22:50:31,337] [INFO] Task started: Blastn
[2023-06-07 22:50:31,337] [INFO] Running command: blastn -query GCA_945861925.1_MaH-09apr19-236_genomic.fna/markers.fasta -db /var/lib/cwl/stgf49c5325-bf03-40fd-99f7-9d935bfba9c2/dqc_reference/reference_markers_gtdb.fasta -out GCA_945861925.1_MaH-09apr19-236_genomic.fna/blast.markers.gtdb.tsv -outfmt 6 -max_hsps 1 -num_alignments 5
[2023-06-07 22:50:32,624] [INFO] Task succeeded: Blastn
[2023-06-07 22:50:32,628] [INFO] Selected 20 target genomes.
[2023-06-07 22:50:32,629] [INFO] Target genome list was writen to GCA_945861925.1_MaH-09apr19-236_genomic.fna/target_genomes_gtdb.txt
[2023-06-07 22:50:32,764] [INFO] Task started: fastANI
[2023-06-07 22:50:32,764] [INFO] Running command: fastANI --query /var/lib/cwl/stg30943bff-9e5e-4f91-a184-62530088b800/GCA_945861925.1_MaH-09apr19-236_genomic.fna.gz --refList GCA_945861925.1_MaH-09apr19-236_genomic.fna/target_genomes_gtdb.txt --output GCA_945861925.1_MaH-09apr19-236_genomic.fna/fastani_result_gtdb.tsv --threads 1
[2023-06-07 22:50:47,039] [INFO] Task succeeded: fastANI
[2023-06-07 22:50:47,055] [INFO] Found 20 fastANI hits (0 hits with ANI > circumscription radius)
[2023-06-07 22:50:47,056] [INFO] GTDB search result
--------------------------------------------------------------------------------
accession	gtdb_species	ani	matched_fragments	total_fragments	gtdb_taxonomy	ani_circumscription_radius	mean_intra_species_ani	min_intra_species_ani	mean_intra_species_af	min_intra_species_af	num_clustered_genomes	status
GCA_009699015.1	s__WLND01 sp009699015	85.2986	634	1069	d__Bacteria;p__Actinobacteriota;c__Actinomycetia;o__Nanopelagicales;f__S36-B12;g__WLND01	95.0	98.95	98.95	0.81	0.81	2	-
GCA_009726075.1	s__WLND01 sp009726075	78.9959	242	1069	d__Bacteria;p__Actinobacteriota;c__Actinomycetia;o__Nanopelagicales;f__S36-B12;g__WLND01	95.0	N/A	N/A	N/A	N/A	1	-
GCA_018970025.1	s__REEB460 sp018970025	77.4208	157	1069	d__Bacteria;p__Actinobacteriota;c__Actinomycetia;o__Nanopelagicales;f__S36-B12;g__REEB460	95.0	N/A	N/A	N/A	N/A	1	-
GCA_903908025.1	s__UBA10649 sp903908025	77.3511	172	1069	d__Bacteria;p__Actinobacteriota;c__Actinomycetia;o__Nanopelagicales;f__S36-B12;g__UBA10649	95.0	98.87	97.91	0.84	0.77	3	-
GCA_903940375.1	s__UBA10649 sp903940375	77.3222	153	1069	d__Bacteria;p__Actinobacteriota;c__Actinomycetia;o__Nanopelagicales;f__S36-B12;g__UBA10649	95.0	98.93	98.93	0.81	0.81	2	-
GCA_016125355.1	s__UBA10649 sp016125355	77.1733	168	1069	d__Bacteria;p__Actinobacteriota;c__Actinomycetia;o__Nanopelagicales;f__S36-B12;g__UBA10649	95.0	N/A	N/A	N/A	N/A	1	-
GCA_016717115.1	s__UBA10649 sp016717115	77.1538	200	1069	d__Bacteria;p__Actinobacteriota;c__Actinomycetia;o__Nanopelagicales;f__S36-B12;g__UBA10649	95.0	N/A	N/A	N/A	N/A	1	-
GCA_017856385.1	s__UBA10649 sp017856385	76.8838	174	1069	d__Bacteria;p__Actinobacteriota;c__Actinomycetia;o__Nanopelagicales;f__S36-B12;g__UBA10649	95.0	N/A	N/A	N/A	N/A	1	-
GCA_004379115.1	s__Mxb001 sp004379115	76.8701	91	1069	d__Bacteria;p__Actinobacteriota;c__Actinomycetia;o__Nanopelagicales;f__S36-B12;g__Mxb001	95.0	N/A	N/A	N/A	N/A	1	-
GCA_903848435.1	s__Mxb001 sp903848435	76.8542	78	1069	d__Bacteria;p__Actinobacteriota;c__Actinomycetia;o__Nanopelagicales;f__S36-B12;g__Mxb001	95.0	96.93	95.89	0.80	0.74	3	-
GCA_017849335.1	s__UBA10649 sp017849335	76.8288	115	1069	d__Bacteria;p__Actinobacteriota;c__Actinomycetia;o__Nanopelagicales;f__S36-B12;g__UBA10649	95.0	N/A	N/A	N/A	N/A	1	-
GCF_003594885.1	s__Vallicoccus soli	76.2759	106	1069	d__Bacteria;p__Actinobacteriota;c__Actinomycetia;o__Motilibacterales;f__Motilibacteraceae;g__Vallicoccus	95.0	N/A	N/A	N/A	N/A	1	-
GCA_017848795.1	s__JAFIHG01 sp017848795	76.2586	83	1069	d__Bacteria;p__Actinobacteriota;c__Actinomycetia;o__Nanopelagicales;f__CAIYMF01;g__JAFIHG01	95.0	N/A	N/A	N/A	N/A	1	-
GCF_002286695.1	s__Streptomyces albireticuli	75.9496	100	1069	d__Bacteria;p__Actinobacteriota;c__Actinomycetia;o__Streptomycetales;f__Streptomycetaceae;g__Streptomyces	95.0	N/A	N/A	N/A	N/A	1	-
GCF_013363995.1	s__Nonomuraea rhodomycinica	75.9417	88	1069	d__Bacteria;p__Actinobacteriota;c__Actinomycetia;o__Streptosporangiales;f__Streptosporangiaceae;g__Nonomuraea	95.0	N/A	N/A	N/A	N/A	1	-
GCF_000312005.1	s__Cellulomonas massiliensis	75.8201	56	1069	d__Bacteria;p__Actinobacteriota;c__Actinomycetia;o__Actinomycetales;f__Cellulomonadaceae;g__Cellulomonas	95.0	99.99	99.99	0.99	0.99	2	-
GCF_014836755.1	s__Oerskovia sp014836755	75.703	58	1069	d__Bacteria;p__Actinobacteriota;c__Actinomycetia;o__Actinomycetales;f__Cellulomonadaceae;g__Oerskovia	95.0	N/A	N/A	N/A	N/A	1	-
GCF_001425175.1	s__Nocardioides sp001425175	75.6866	92	1069	d__Bacteria;p__Actinobacteriota;c__Actinomycetia;o__Propionibacteriales;f__Nocardioidaceae;g__Nocardioides	95.0	100.00	100.00	1.00	1.00	2	-
GCF_000414115.1	s__Streptomyces aurantiacus_A	75.5879	88	1069	d__Bacteria;p__Actinobacteriota;c__Actinomycetia;o__Streptomycetales;f__Streptomycetaceae;g__Streptomyces	95.0	N/A	N/A	N/A	N/A	1	-
GCF_003946865.1	s__Antribacter gilvus	75.3792	58	1069	d__Bacteria;p__Actinobacteriota;c__Actinomycetia;o__Actinomycetales;f__Cellulomonadaceae;g__Antribacter	95.0	N/A	N/A	N/A	N/A	1	-
--------------------------------------------------------------------------------
[2023-06-07 22:50:47,058] [INFO] GTDB search result was written to GCA_945861925.1_MaH-09apr19-236_genomic.fna/result_gtdb.tsv
[2023-06-07 22:50:47,058] [INFO] ===== GTDB Search completed =====
[2023-06-07 22:50:47,063] [INFO] DFAST_QC result json was written to GCA_945861925.1_MaH-09apr19-236_genomic.fna/dqc_result.json
[2023-06-07 22:50:47,064] [INFO] DFAST_QC completed!
[2023-06-07 22:50:47,064] [INFO] Total running time: 0h1m24s