[2023-06-08 07:54:35,480] [INFO] DFAST_QC pipeline started.
[2023-06-08 07:54:35,487] [INFO] DFAST_QC version: 0.5.7
[2023-06-08 07:54:35,488] [INFO] DQC Reference Directory: /var/lib/cwl/stgf743db59-25f3-4a5a-937e-d78dd7f10bd6/dqc_reference
[2023-06-08 07:54:37,178] [INFO] ===== Start taxonomy check using ANI =====
[2023-06-08 07:54:37,179] [INFO] Task started: Prodigal
[2023-06-08 07:54:37,180] [INFO] Running command: gunzip -c /var/lib/cwl/stg6ff4cb7c-d102-4ae4-8453-9d8a754fb706/GCA_945877545.1_MoE-02may19-65_genomic.fna.gz | prodigal -d GCA_945877545.1_MoE-02may19-65_genomic.fna/cds.fna -a GCA_945877545.1_MoE-02may19-65_genomic.fna/protein.faa -g 11 -q > /dev/null
[2023-06-08 07:54:44,915] [INFO] Task succeeded: Prodigal
[2023-06-08 07:54:44,915] [INFO] Task started: HMMsearch
[2023-06-08 07:54:44,915] [INFO] Running command: hmmsearch --tblout GCA_945877545.1_MoE-02may19-65_genomic.fna/hmmer_result.tsv -E 1E-50 /var/lib/cwl/stgf743db59-25f3-4a5a-937e-d78dd7f10bd6/dqc_reference/reference_markers.hmm GCA_945877545.1_MoE-02may19-65_genomic.fna/protein.faa > /dev/null
[2023-06-08 07:54:45,099] [INFO] Task succeeded: HMMsearch
[2023-06-08 07:54:45,100] [INFO] Found 6/6 markers.
[2023-06-08 07:54:45,130] [INFO] Query marker FASTA was written to GCA_945877545.1_MoE-02may19-65_genomic.fna/markers.fasta
[2023-06-08 07:54:45,130] [INFO] Task started: Blastn
[2023-06-08 07:54:45,131] [INFO] Running command: blastn -query GCA_945877545.1_MoE-02may19-65_genomic.fna/markers.fasta -db /var/lib/cwl/stgf743db59-25f3-4a5a-937e-d78dd7f10bd6/dqc_reference/reference_markers.fasta -out GCA_945877545.1_MoE-02may19-65_genomic.fna/blast.markers.tsv -outfmt 6 -max_hsps 1 -num_alignments 5
[2023-06-08 07:54:45,936] [INFO] Task succeeded: Blastn
[2023-06-08 07:54:45,940] [INFO] Selected 35 target genomes.
[2023-06-08 07:54:45,940] [INFO] Target genome list was writen to GCA_945877545.1_MoE-02may19-65_genomic.fna/target_genomes.txt
[2023-06-08 07:54:45,962] [INFO] Task started: fastANI
[2023-06-08 07:54:45,962] [INFO] Running command: fastANI --query /var/lib/cwl/stg6ff4cb7c-d102-4ae4-8453-9d8a754fb706/GCA_945877545.1_MoE-02may19-65_genomic.fna.gz --refList GCA_945877545.1_MoE-02may19-65_genomic.fna/target_genomes.txt --output GCA_945877545.1_MoE-02may19-65_genomic.fna/fastani_result.tsv --threads 1
[2023-06-08 07:55:10,993] [INFO] Task succeeded: fastANI
[2023-06-08 07:55:10,994] [INFO] Loading species specific ANI threshold from /var/lib/cwl/stgf743db59-25f3-4a5a-937e-d78dd7f10bd6/dqc_reference/prokaryote_ANI_species_specific_threshold.txt
[2023-06-08 07:55:10,994] [WARNING] Species-specific ANI threshold file not found. Will use the default threshold for all species. [/var/lib/cwl/stgf743db59-25f3-4a5a-937e-d78dd7f10bd6/dqc_reference/prokaryote_ANI_species_specific_threshold.txt]
[2023-06-08 07:55:11,015] [INFO] Found 31 fastANI hits (0 hits with ANI > threshold)
[2023-06-08 07:55:11,015] [INFO] The taxonomy check result is classified as 'below_threshold'.
[2023-06-08 07:55:11,015] [INFO] DFAST Taxonomy check final result
--------------------------------------------------------------------------------
organism_name	strain	accession	taxid	species_taxid	relation_to_type	validated	ani	matched_fragments	total_fragments	ani_threshold	status
Knoellia remsis	strain=ATCC BAA-1496	GCA_003002895.1	407159	407159	type	True	76.7241	67	852	95	below_threshold
Cellulomonas humilata	strain=ATCC 25174	GCA_013359775.1	144055	144055	type	True	76.6761	52	852	95	below_threshold
Streptomyces albireticuli	strain=NRRL B1670	GCA_021228125.1	1940	1940	type	True	76.5038	77	852	95	below_threshold
Cellulomonas cellasea	strain=DSM 20118	GCA_000767135.1	43670	43670	type	True	76.4144	57	852	95	below_threshold
Knoellia aerolata	strain=DSM 18566	GCA_000768695.1	442954	442954	type	True	76.4142	73	852	95	below_threshold
Cellulomonas iranensis	strain=NBRC 101100	GCA_001552375.1	76862	76862	type	True	76.3792	57	852	95	below_threshold
Streptomyces tirandamycinicus	strain=HNM0039	GCA_003097515.1	2174846	2174846	type	True	76.3753	65	852	95	below_threshold
Janibacter melonis	strain=NBRC107855	GCA_020567375.1	262209	262209	type	True	76.3725	76	852	95	below_threshold
Actinotalea subterranea	strain=HO-Ch2	GCA_008364845.1	2607497	2607497	type	True	76.3684	58	852	95	below_threshold
Knoellia subterranea	strain=KCTC 19937	GCA_000768685.1	184882	184882	type	True	76.3517	63	852	95	below_threshold
Nocardioides salarius	strain=DSM 18239	GCA_015070765.1	374513	374513	type	True	76.303	69	852	95	below_threshold
Nocardioides salarius	strain=DSM 18239	GCA_015339545.1	374513	374513	type	True	76.303	69	852	95	below_threshold
Nocardioides seonyuensis	strain=MMS17-SY207-3	GCA_004683965.1	2518371	2518371	type	True	76.296	70	852	95	below_threshold
Nocardioides salarius	strain=DSM 18239	GCA_016907435.1	374513	374513	type	True	76.2919	70	852	95	below_threshold
Nocardioides dokdonensis	strain=FR1436	GCA_001653335.1	450734	450734	type	True	76.1801	69	852	95	below_threshold
Jiangella endophytica	strain=KE2-3	GCA_003427025.1	1623398	1623398	type	True	76.1768	78	852	95	below_threshold
Arsenicicoccus cauae	strain=MKL-02	GCA_009707125.1	2663847	2663847	type	True	76.0813	58	852	95	below_threshold
Agromyces archimandritae	strain=G127AT	GCA_018024495.1	2781962	2781962	type	True	76.0539	58	852	95	below_threshold
Arsenicicoccus bolidensis	strain=DSM 15745	GCA_000426385.1	229480	229480	type	True	76.0446	66	852	95	below_threshold
Jiangella alba	strain=DSM 45237	GCA_900106035.1	561176	561176	type	True	75.9769	87	852	95	below_threshold
Sanguibacter suarezii	strain=NBRC 16159	GCA_001552735.1	60921	60921	type	True	75.9758	50	852	95	below_threshold
Clavibacter michiganensis subsp. tessellarius	strain=ATCC 33566	GCA_002240635.1	31965	28447	type	True	75.9156	51	852	95	below_threshold
Actinoplanes brasiliensis	strain=DSM 43805	GCA_004362215.1	52695	52695	type	True	75.9123	65	852	95	below_threshold
Actinomadura geliboluensis	strain=A8036	GCA_005889745.1	882440	882440	type	True	75.8861	74	852	95	below_threshold
Actinomycetospora soli	strain=SF1	GCA_021026295.1	2893887	2893887	type	True	75.8751	82	852	95	below_threshold
Clavibacter michiganensis	strain=LMG7333	GCA_021216655.1	28447	28447	suspected-type	True	75.8346	54	852	95	below_threshold
Actinoplanes brasiliensis	strain=NBRC 13938	GCA_016862015.1	52695	52695	type	True	75.8329	64	852	95	below_threshold
Amycolatopsis thailandensis	strain=JCM 16380	GCA_002234405.1	589330	589330	type	True	75.7411	59	852	95	below_threshold
Actinoplanes deccanensis	strain=NBRC 13994	GCA_016862115.1	113561	113561	type	True	75.6724	84	852	95	below_threshold
Amycolatopsis azurea	strain=DSM 43854	GCA_001995215.1	36819	36819	type	True	75.6272	59	852	95	below_threshold
Amycolatopsis azurea	strain=DSM 43854	GCA_000340415.1	36819	36819	type	True	75.5174	60	852	95	below_threshold
--------------------------------------------------------------------------------
[2023-06-08 07:55:11,017] [INFO] DFAST Taxonomy check result was written to GCA_945877545.1_MoE-02may19-65_genomic.fna/tc_result.tsv
[2023-06-08 07:55:11,018] [INFO] ===== Taxonomy check completed =====
[2023-06-08 07:55:11,018] [INFO] ===== Start completeness check using CheckM =====
[2023-06-08 07:55:11,018] [INFO] Setting CHECKM_DATA_PATH to /var/lib/cwl/stgf743db59-25f3-4a5a-937e-d78dd7f10bd6/dqc_reference/checkm_data
[2023-06-08 07:55:11,019] [INFO] Selected 'Prokaryote' markers (life, taxid=0) for CheckM
[2023-06-08 07:55:11,047] [INFO] Task started: CheckM
[2023-06-08 07:55:11,047] [INFO] Running command: checkm taxonomy_wf --tab_table -f GCA_945877545.1_MoE-02may19-65_genomic.fna/cc_result.tsv -t 1 life "Prokaryote" GCA_945877545.1_MoE-02may19-65_genomic.fna/checkm_input GCA_945877545.1_MoE-02may19-65_genomic.fna/checkm_result
[2023-06-08 07:55:38,213] [INFO] Task succeeded: CheckM
[2023-06-08 07:55:38,214] [INFO] Completeness check finished.
--------------------------------------------------------------------------------
Completeness: 94.91%
Contamintation: 0.00%
Strain heterogeneity: 0.00%
--------------------------------------------------------------------------------
[2023-06-08 07:55:38,230] [INFO] ===== Completeness check finished =====
[2023-06-08 07:55:38,231] [INFO] ===== Start GTDB Search =====
[2023-06-08 07:55:38,231] [INFO] Query marker FASTA already exists. Will reuse it. (GCA_945877545.1_MoE-02may19-65_genomic.fna/markers.fasta)
[2023-06-08 07:55:38,231] [INFO] Task started: Blastn
[2023-06-08 07:55:38,231] [INFO] Running command: blastn -query GCA_945877545.1_MoE-02may19-65_genomic.fna/markers.fasta -db /var/lib/cwl/stgf743db59-25f3-4a5a-937e-d78dd7f10bd6/dqc_reference/reference_markers_gtdb.fasta -out GCA_945877545.1_MoE-02may19-65_genomic.fna/blast.markers.gtdb.tsv -outfmt 6 -max_hsps 1 -num_alignments 5
[2023-06-08 07:55:39,369] [INFO] Task succeeded: Blastn
[2023-06-08 07:55:39,372] [INFO] Selected 22 target genomes.
[2023-06-08 07:55:39,372] [INFO] Target genome list was writen to GCA_945877545.1_MoE-02may19-65_genomic.fna/target_genomes_gtdb.txt
[2023-06-08 07:55:39,380] [INFO] Task started: fastANI
[2023-06-08 07:55:39,380] [INFO] Running command: fastANI --query /var/lib/cwl/stg6ff4cb7c-d102-4ae4-8453-9d8a754fb706/GCA_945877545.1_MoE-02may19-65_genomic.fna.gz --refList GCA_945877545.1_MoE-02may19-65_genomic.fna/target_genomes_gtdb.txt --output GCA_945877545.1_MoE-02may19-65_genomic.fna/fastani_result_gtdb.tsv --threads 1
[2023-06-08 07:55:52,399] [INFO] Task succeeded: fastANI
[2023-06-08 07:55:52,414] [INFO] Found 20 fastANI hits (0 hits with ANI > circumscription radius)
[2023-06-08 07:55:52,414] [INFO] GTDB search result
--------------------------------------------------------------------------------
accession	gtdb_species	ani	matched_fragments	total_fragments	gtdb_taxonomy	ani_circumscription_radius	mean_intra_species_ani	min_intra_species_ani	mean_intra_species_af	min_intra_species_af	num_clustered_genomes	status
GCA_009699015.1	s__WLND01 sp009699015	84.9857	594	852	d__Bacteria;p__Actinobacteriota;c__Actinomycetia;o__Nanopelagicales;f__S36-B12;g__WLND01	95.0	98.95	98.95	0.81	0.81	2	-
GCA_009726075.1	s__WLND01 sp009726075	78.5946	237	852	d__Bacteria;p__Actinobacteriota;c__Actinomycetia;o__Nanopelagicales;f__S36-B12;g__WLND01	95.0	N/A	N/A	N/A	N/A	1	-
GCA_016717115.1	s__UBA10649 sp016717115	77.7444	172	852	d__Bacteria;p__Actinobacteriota;c__Actinomycetia;o__Nanopelagicales;f__S36-B12;g__UBA10649	95.0	N/A	N/A	N/A	N/A	1	-
GCA_903940375.1	s__UBA10649 sp903940375	77.6117	135	852	d__Bacteria;p__Actinobacteriota;c__Actinomycetia;o__Nanopelagicales;f__S36-B12;g__UBA10649	95.0	98.93	98.93	0.81	0.81	2	-
GCA_016125355.1	s__UBA10649 sp016125355	77.4112	149	852	d__Bacteria;p__Actinobacteriota;c__Actinomycetia;o__Nanopelagicales;f__S36-B12;g__UBA10649	95.0	N/A	N/A	N/A	N/A	1	-
GCA_018970025.1	s__REEB460 sp018970025	77.2954	150	852	d__Bacteria;p__Actinobacteriota;c__Actinomycetia;o__Nanopelagicales;f__S36-B12;g__REEB460	95.0	N/A	N/A	N/A	N/A	1	-
GCA_903908025.1	s__UBA10649 sp903908025	77.2889	151	852	d__Bacteria;p__Actinobacteriota;c__Actinomycetia;o__Nanopelagicales;f__S36-B12;g__UBA10649	95.0	98.87	97.91	0.84	0.77	3	-
GCA_017856385.1	s__UBA10649 sp017856385	77.2734	158	852	d__Bacteria;p__Actinobacteriota;c__Actinomycetia;o__Nanopelagicales;f__S36-B12;g__UBA10649	95.0	N/A	N/A	N/A	N/A	1	-
GCA_016125335.1	s__WLRQ01 sp016125335	76.9807	122	852	d__Bacteria;p__Actinobacteriota;c__Actinomycetia;o__Nanopelagicales;f__CAIYMF01;g__WLRQ01	95.0	N/A	N/A	N/A	N/A	1	-
GCA_017849335.1	s__UBA10649 sp017849335	76.9114	99	852	d__Bacteria;p__Actinobacteriota;c__Actinomycetia;o__Nanopelagicales;f__S36-B12;g__UBA10649	95.0	N/A	N/A	N/A	N/A	1	-
GCF_003002895.1	s__Knoellia remsis	76.6593	69	852	d__Bacteria;p__Actinobacteriota;c__Actinomycetia;o__Actinomycetales;f__Dermatophilaceae;g__Knoellia	95.0	N/A	N/A	N/A	N/A	1	-
GCA_016870195.1	s__VGCC01 sp016870195	76.6455	84	852	d__Bacteria;p__Actinobacteriota;c__Actinomycetia;o__Nanopelagicales;f__S36-B12;g__VGCC01	95.0	N/A	N/A	N/A	N/A	1	-
GCA_903909555.1	s__Mxb001 sp903909555	76.4361	103	852	d__Bacteria;p__Actinobacteriota;c__Actinomycetia;o__Nanopelagicales;f__S36-B12;g__Mxb001	95.0	99.15	99.05	0.83	0.81	4	-
GCF_003047295.1	s__Nocardioides sediminis	76.3639	66	852	d__Bacteria;p__Actinobacteriota;c__Actinomycetia;o__Propionibacteriales;f__Nocardioidaceae;g__Nocardioides	95.0	N/A	N/A	N/A	N/A	1	-
GCF_003097515.1	s__Streptomyces tirandamycinicus	76.3459	66	852	d__Bacteria;p__Actinobacteriota;c__Actinomycetia;o__Streptomycetales;f__Streptomycetaceae;g__Streptomyces	95.0	98.65	98.27	0.90	0.88	6	-
GCF_006346315.1	s__Nocardioides litoris	76.3226	82	852	d__Bacteria;p__Actinobacteriota;c__Actinomycetia;o__Propionibacteriales;f__Nocardioidaceae;g__Nocardioides	95.0	N/A	N/A	N/A	N/A	1	-
GCF_000718255.1	s__Spirillospora albida	76.3025	60	852	d__Bacteria;p__Actinobacteriota;c__Actinomycetia;o__Streptosporangiales;f__Streptosporangiaceae;g__Spirillospora	95.0	N/A	N/A	N/A	N/A	1	-
GCF_902506375.1	s__Microbacterium sp902506375	76.0635	54	852	d__Bacteria;p__Actinobacteriota;c__Actinomycetia;o__Actinomycetales;f__Microbacteriaceae;g__Microbacterium	95.0	N/A	N/A	N/A	N/A	1	-
GCF_013364275.1	s__Spirillospora sp013364275	75.8485	74	852	d__Bacteria;p__Actinobacteriota;c__Actinomycetia;o__Streptosporangiales;f__Streptosporangiaceae;g__Spirillospora	95.0	N/A	N/A	N/A	N/A	1	-
GCF_014648675.1	s__Spirillospora livida	75.6948	73	852	d__Bacteria;p__Actinobacteriota;c__Actinomycetia;o__Streptosporangiales;f__Streptosporangiaceae;g__Spirillospora	95.0	98.54	97.09	0.95	0.91	3	-
--------------------------------------------------------------------------------
[2023-06-08 07:55:52,416] [INFO] GTDB search result was written to GCA_945877545.1_MoE-02may19-65_genomic.fna/result_gtdb.tsv
[2023-06-08 07:55:52,416] [INFO] ===== GTDB Search completed =====
[2023-06-08 07:55:52,421] [INFO] DFAST_QC result json was written to GCA_945877545.1_MoE-02may19-65_genomic.fna/dqc_result.json
[2023-06-08 07:55:52,421] [INFO] DFAST_QC completed!
[2023-06-08 07:55:52,421] [INFO] Total running time: 0h1m17s