[2024-01-25 18:09:05,609] [INFO] DFAST_QC pipeline started.
[2024-01-25 18:09:05,614] [INFO] DFAST_QC version: 0.5.7
[2024-01-25 18:09:05,614] [INFO] DQC Reference Directory: /var/lib/cwl/stga907a8eb-6b33-4215-b6dd-d656c94255b4/dqc_reference
[2024-01-25 18:09:06,793] [INFO] ===== Start taxonomy check using ANI =====
[2024-01-25 18:09:06,793] [INFO] Task started: Prodigal
[2024-01-25 18:09:06,794] [INFO] Running command: gunzip -c /var/lib/cwl/stg2b323aab-4bc2-414f-ad9b-1ba8fcfd37f1/GCF_022430545.2_ASM2243054v2_genomic.fna.gz | prodigal -d GCF_022430545.2_ASM2243054v2_genomic.fna/cds.fna -a GCF_022430545.2_ASM2243054v2_genomic.fna/protein.faa -g 11 -q > /dev/null
[2024-01-25 18:09:23,575] [INFO] Task succeeded: Prodigal
[2024-01-25 18:09:23,576] [INFO] Task started: HMMsearch
[2024-01-25 18:09:23,576] [INFO] Running command: hmmsearch --tblout GCF_022430545.2_ASM2243054v2_genomic.fna/hmmer_result.tsv -E 1E-50 /var/lib/cwl/stga907a8eb-6b33-4215-b6dd-d656c94255b4/dqc_reference/reference_markers.hmm GCF_022430545.2_ASM2243054v2_genomic.fna/protein.faa > /dev/null
[2024-01-25 18:09:23,817] [INFO] Task succeeded: HMMsearch
[2024-01-25 18:09:23,818] [INFO] Found 6/6 markers.
[2024-01-25 18:09:23,862] [INFO] Query marker FASTA was written to GCF_022430545.2_ASM2243054v2_genomic.fna/markers.fasta
[2024-01-25 18:09:23,863] [INFO] Task started: Blastn
[2024-01-25 18:09:23,863] [INFO] Running command: blastn -query GCF_022430545.2_ASM2243054v2_genomic.fna/markers.fasta -db /var/lib/cwl/stga907a8eb-6b33-4215-b6dd-d656c94255b4/dqc_reference/reference_markers.fasta -out GCF_022430545.2_ASM2243054v2_genomic.fna/blast.markers.tsv -outfmt 6 -max_hsps 1 -num_alignments 5
[2024-01-25 18:09:24,933] [INFO] Task succeeded: Blastn
[2024-01-25 18:09:24,936] [INFO] Selected 25 target genomes.
[2024-01-25 18:09:24,937] [INFO] Target genome list was writen to GCF_022430545.2_ASM2243054v2_genomic.fna/target_genomes.txt
[2024-01-25 18:09:24,955] [INFO] Task started: fastANI
[2024-01-25 18:09:24,955] [INFO] Running command: fastANI --query /var/lib/cwl/stg2b323aab-4bc2-414f-ad9b-1ba8fcfd37f1/GCF_022430545.2_ASM2243054v2_genomic.fna.gz --refList GCF_022430545.2_ASM2243054v2_genomic.fna/target_genomes.txt --output GCF_022430545.2_ASM2243054v2_genomic.fna/fastani_result.tsv --threads 1
[2024-01-25 18:09:51,601] [INFO] Task succeeded: fastANI
[2024-01-25 18:09:51,602] [INFO] Loading species specific ANI threshold from /var/lib/cwl/stga907a8eb-6b33-4215-b6dd-d656c94255b4/dqc_reference/prokaryote_ANI_species_specific_threshold.txt
[2024-01-25 18:09:51,603] [WARNING] Species-specific ANI threshold file not found. Will use the default threshold for all species. [/var/lib/cwl/stga907a8eb-6b33-4215-b6dd-d656c94255b4/dqc_reference/prokaryote_ANI_species_specific_threshold.txt]
[2024-01-25 18:09:51,619] [INFO] Found 25 fastANI hits (0 hits with ANI > threshold)
[2024-01-25 18:09:51,620] [INFO] The taxonomy check result is classified as 'below_threshold'.
[2024-01-25 18:09:51,620] [INFO] DFAST Taxonomy check final result
--------------------------------------------------------------------------------
organism_name strain accession taxid species_taxid relation_to_type validated ani matched_fragments total_fragments ani_threshold status
Mycobacterium cookii strain=JCM 12404 GCA_010727945.1 1775 1775 type True 81.7851 1137 1840 95 below_threshold
Mycobacterium avium subsp. avium strain=ATCC 25291 GCA_000174035.1 44454 1764 type True 80.1836 822 1840 95 below_threshold
Mycobacterium avium subsp. avium strain=DSM 44156 GCA_009741445.1 44454 1764 type True 80.1816 888 1840 95 below_threshold
Mycobacterium paraense strain=IEC26 GCA_002101815.1 767916 767916 type True 80.0321 869 1840 95 below_threshold
Mycobacterium paraintracellulare strain=KCTC 29084 GCA_002104735.1 1138383 1138383 suspected-type True 79.8165 820 1840 95 below_threshold
Mycobacterium paraintracellulare strain=JCM 30622 GCA_010731935.1 1138383 1138383 suspected-type True 79.7992 836 1840 95 below_threshold
Mycobacterium fragae strain=DSM 45731 GCA_002102185.1 1260918 1260918 type True 79.7712 828 1840 95 below_threshold
Mycobacterium paraintracellulare strain=MOTT64 GCA_000276825.1 1138383 1138383 suspected-type True 79.7644 842 1840 95 below_threshold
Mycobacterium mantenii strain=DSM 45255 GCA_002086335.1 560555 560555 type True 79.7641 806 1840 95 below_threshold
Mycobacterium parmense strain=DSM 44553 GCA_002102335.1 185642 185642 type True 79.7481 808 1840 95 below_threshold
Mycobacterium mantenii strain=JCM 18113 GCA_010731775.1 560555 560555 type True 79.7143 822 1840 95 below_threshold
Mycobacterium bohemicum strain=DSM 44277 GCA_001053185.1 56425 56425 type True 79.6953 797 1840 95 below_threshold
Mycobacterium bohemicum strain=DSM 44277 GCA_002102025.1 56425 56425 type True 79.6792 824 1840 95 below_threshold
Mycobacterium parmense strain=JCM 14742 GCA_010730575.1 185642 185642 type True 79.6766 836 1840 95 below_threshold
Mycobacterium shigaense strain=UN-152 GCA_002983495.1 722731 722731 type True 79.6512 773 1840 95 below_threshold
Mycobacterium shigaense strain=JCM 32072 GCA_002356315.1 722731 722731 type True 79.5942 783 1840 95 below_threshold
Mycobacterium ahvazicum strain=AFP003 GCA_900176255.2 1964395 1964395 type True 79.577 799 1840 95 below_threshold
Mycobacterium saskatchewanense strain=JCM 13016 GCA_010729105.1 220927 220927 type True 79.3933 830 1840 95 below_threshold
Mycobacterium saskatchewanense strain=DSM 44616 GCA_002101875.1 220927 220927 type True 79.3914 809 1840 95 below_threshold
Mycolicibacterium rutilum strain=DSM 45405 GCA_900108565.1 370526 370526 type True 79.2683 753 1840 95 below_threshold
Mycobacterium persicum strain=AFPC-000227 GCA_002086675.1 1487726 1487726 type True 78.8155 698 1840 95 below_threshold
Mycolicibacterium houstonense strain=type strain: ATCC 49403 GCA_900078665.2 146021 146021 type True 78.8117 649 1840 95 below_threshold
Mycolicibacterium fortuitum subsp. acetamidolyticum strain=JCM6368 GCA_001570465.1 144550 1766 type True 78.7051 634 1840 95 below_threshold
Mycobacterium heckeshornense strain=JCM 15655 GCA_016592155.1 110505 110505 type True 78.6719 712 1840 95 below_threshold
Mycolicibacterium setense strain=DSM 45070 GCA_000805385.1 431269 431269 type True 78.498 640 1840 95 below_threshold
--------------------------------------------------------------------------------
[2024-01-25 18:09:51,623] [INFO] DFAST Taxonomy check result was written to GCF_022430545.2_ASM2243054v2_genomic.fna/tc_result.tsv
[2024-01-25 18:09:51,624] [INFO] ===== Taxonomy check completed =====
[2024-01-25 18:09:51,624] [INFO] ===== Start completeness check using CheckM =====
[2024-01-25 18:09:51,624] [INFO] Setting CHECKM_DATA_PATH to /var/lib/cwl/stga907a8eb-6b33-4215-b6dd-d656c94255b4/dqc_reference/checkm_data
[2024-01-25 18:09:51,625] [INFO] Selected 'Prokaryote' markers (life, taxid=0) for CheckM
[2024-01-25 18:09:51,691] [INFO] Task started: CheckM
[2024-01-25 18:09:51,691] [INFO] Running command: checkm taxonomy_wf --tab_table -f GCF_022430545.2_ASM2243054v2_genomic.fna/cc_result.tsv -t 1 life "Prokaryote" GCF_022430545.2_ASM2243054v2_genomic.fna/checkm_input GCF_022430545.2_ASM2243054v2_genomic.fna/checkm_result
[2024-01-25 18:10:45,012] [INFO] Task succeeded: CheckM
[2024-01-25 18:10:45,014] [INFO] Completeness check finished.
--------------------------------------------------------------------------------
Completeness: 100.00%
Contamintation: 0.38%
Strain heterogeneity: 0.00%
--------------------------------------------------------------------------------
[2024-01-25 18:10:45,033] [INFO] ===== Completeness check finished =====
[2024-01-25 18:10:45,034] [INFO] ===== Start GTDB Search =====
[2024-01-25 18:10:45,035] [INFO] Query marker FASTA already exists. Will reuse it. (GCF_022430545.2_ASM2243054v2_genomic.fna/markers.fasta)
[2024-01-25 18:10:45,036] [INFO] Task started: Blastn
[2024-01-25 18:10:45,036] [INFO] Running command: blastn -query GCF_022430545.2_ASM2243054v2_genomic.fna/markers.fasta -db /var/lib/cwl/stga907a8eb-6b33-4215-b6dd-d656c94255b4/dqc_reference/reference_markers_gtdb.fasta -out GCF_022430545.2_ASM2243054v2_genomic.fna/blast.markers.gtdb.tsv -outfmt 6 -max_hsps 1 -num_alignments 5
[2024-01-25 18:10:46,622] [INFO] Task succeeded: Blastn
[2024-01-25 18:10:46,624] [INFO] Selected 19 target genomes.
[2024-01-25 18:10:46,624] [INFO] Target genome list was writen to GCF_022430545.2_ASM2243054v2_genomic.fna/target_genomes_gtdb.txt
[2024-01-25 18:10:46,639] [INFO] Task started: fastANI
[2024-01-25 18:10:46,639] [INFO] Running command: fastANI --query /var/lib/cwl/stg2b323aab-4bc2-414f-ad9b-1ba8fcfd37f1/GCF_022430545.2_ASM2243054v2_genomic.fna.gz --refList GCF_022430545.2_ASM2243054v2_genomic.fna/target_genomes_gtdb.txt --output GCF_022430545.2_ASM2243054v2_genomic.fna/fastani_result_gtdb.tsv --threads 1
[2024-01-25 18:11:09,289] [INFO] Task succeeded: fastANI
[2024-01-25 18:11:09,300] [INFO] Found 19 fastANI hits (0 hits with ANI > circumscription radius)
[2024-01-25 18:11:09,300] [INFO] GTDB search result
--------------------------------------------------------------------------------
accession gtdb_species ani matched_fragments total_fragments gtdb_taxonomy ani_circumscription_radius mean_intra_species_ani min_intra_species_ani mean_intra_species_af min_intra_species_af num_clustered_genomes status
GCF_001673405.1 s__Mycobacterium sp001673405 84.7905 1360 1840 d__Bacteria;p__Actinobacteriota;c__Actinomycetia;o__Mycobacteriales;f__Mycobacteriaceae;g__Mycobacterium 95.0 N/A N/A N/A N/A 1 -
GCA_002863225.1 s__Mycobacterium sp002863225 82.8334 1216 1840 d__Bacteria;p__Actinobacteriota;c__Actinomycetia;o__Mycobacteriales;f__Mycobacteriaceae;g__Mycobacterium 95.0 N/A N/A N/A N/A 1 -
GCF_010727945.1 s__Mycobacterium cookii 81.7886 1133 1840 d__Bacteria;p__Actinobacteriota;c__Actinomycetia;o__Mycobacteriales;f__Mycobacteriaceae;g__Mycobacterium 95.0 N/A N/A N/A N/A 1 -
GCF_001666755.1 s__Mycobacterium sp001666755 80.3568 814 1840 d__Bacteria;p__Actinobacteriota;c__Actinomycetia;o__Mycobacteriales;f__Mycobacteriaceae;g__Mycobacterium 95.0 99.50 99.03 0.96 0.95 3 -
GCF_001673535.1 s__Mycobacterium sp001673535 79.9233 837 1840 d__Bacteria;p__Actinobacteriota;c__Actinomycetia;o__Mycobacteriales;f__Mycobacteriaceae;g__Mycobacterium 95.0 N/A N/A N/A N/A 1 -
GCF_002086635.1 s__Mycobacterium alsense 79.911 823 1840 d__Bacteria;p__Actinobacteriota;c__Actinomycetia;o__Mycobacteriales;f__Mycobacteriaceae;g__Mycobacterium 95.0 96.76 96.71 0.91 0.90 3 -
GCF_001667885.1 s__Mycobacterium scrofulaceum_C 79.8271 823 1840 d__Bacteria;p__Actinobacteriota;c__Actinomycetia;o__Mycobacteriales;f__Mycobacteriaceae;g__Mycobacterium 95.0 98.31 98.31 0.93 0.93 2 -
GCF_001668725.1 s__Mycobacterium sp001668725 79.7974 850 1840 d__Bacteria;p__Actinobacteriota;c__Actinomycetia;o__Mycobacteriales;f__Mycobacteriaceae;g__Mycobacterium 95.0 99.46 99.46 0.97 0.97 2 -
GCF_002102185.1 s__Mycobacterium fragae 79.7813 826 1840 d__Bacteria;p__Actinobacteriota;c__Actinomycetia;o__Mycobacteriales;f__Mycobacteriaceae;g__Mycobacterium 95.0 N/A N/A N/A N/A 1 -
GCF_010731775.1 s__Mycobacterium mantenii 79.7143 820 1840 d__Bacteria;p__Actinobacteriota;c__Actinomycetia;o__Mycobacteriales;f__Mycobacteriaceae;g__Mycobacterium 95.0 96.19 95.22 0.92 0.90 6 -
GCF_001053185.1 s__Mycobacterium bohemicum 79.7142 794 1840 d__Bacteria;p__Actinobacteriota;c__Actinomycetia;o__Mycobacteriales;f__Mycobacteriaceae;g__Mycobacterium 95.0 99.81 99.81 0.97 0.97 2 -
GCF_002102155.1 s__Mycobacterium europaeum 79.6334 840 1840 d__Bacteria;p__Actinobacteriota;c__Actinomycetia;o__Mycobacteriales;f__Mycobacteriaceae;g__Mycobacterium 95.0 99.26 99.26 0.97 0.97 2 -
GCF_001667035.1 s__Mycobacterium sp001667035 79.5839 792 1840 d__Bacteria;p__Actinobacteriota;c__Actinomycetia;o__Mycobacteriales;f__Mycobacteriaceae;g__Mycobacterium 95.0 N/A N/A N/A N/A 1 -
GCF_900176255.2 s__Mycobacterium ahvazicum 79.562 801 1840 d__Bacteria;p__Actinobacteriota;c__Actinomycetia;o__Mycobacteriales;f__Mycobacteriaceae;g__Mycobacterium 95.0 N/A N/A N/A N/A 1 -
GCF_900240975.1 s__Mycobacterium sp900240975 79.5496 818 1840 d__Bacteria;p__Actinobacteriota;c__Actinomycetia;o__Mycobacteriales;f__Mycobacteriaceae;g__Mycobacterium 95.0 97.65 95.35 0.92 0.88 4 -
GCF_010729105.1 s__Mycobacterium saskatchewanense 79.4 829 1840 d__Bacteria;p__Actinobacteriota;c__Actinomycetia;o__Mycobacteriales;f__Mycobacteriaceae;g__Mycobacterium 95.0 99.98 99.98 1.00 1.00 2 -
GCA_019243155.1 s__Mycobacterium sp019243155 79.3146 667 1840 d__Bacteria;p__Actinobacteriota;c__Actinomycetia;o__Mycobacteriales;f__Mycobacteriaceae;g__Mycobacterium 95.0 N/A N/A N/A N/A 1 -
GCF_000426065.1 s__Mycobacterium sp000426065 78.5128 640 1840 d__Bacteria;p__Actinobacteriota;c__Actinomycetia;o__Mycobacteriales;f__Mycobacteriaceae;g__Mycobacterium 95.0 N/A N/A N/A N/A 1 -
GCF_000805385.1 s__Mycobacterium setense 78.4913 641 1840 d__Bacteria;p__Actinobacteriota;c__Actinomycetia;o__Mycobacteriales;f__Mycobacteriaceae;g__Mycobacterium 95.0 99.09 98.47 0.95 0.89 4 -
--------------------------------------------------------------------------------
[2024-01-25 18:11:09,301] [INFO] GTDB search result was written to GCF_022430545.2_ASM2243054v2_genomic.fna/result_gtdb.tsv
[2024-01-25 18:11:09,302] [INFO] ===== GTDB Search completed =====
[2024-01-25 18:11:09,306] [INFO] DFAST_QC result json was written to GCF_022430545.2_ASM2243054v2_genomic.fna/dqc_result.json
[2024-01-25 18:11:09,306] [INFO] DFAST_QC completed!
[2024-01-25 18:11:09,306] [INFO] Total running time: 0h2m4s