[2023-06-28 01:45:51,964] [INFO] DFAST_QC pipeline started.
[2023-06-28 01:45:51,967] [INFO] DFAST_QC version: 0.5.7
[2023-06-28 01:45:51,967] [INFO] DQC Reference Directory: /var/lib/cwl/stgbd621306-c623-48d6-a15b-078f51b3d510/dqc_reference
[2023-06-28 01:45:53,193] [INFO] ===== Start taxonomy check using ANI =====
[2023-06-28 01:45:53,194] [INFO] Task started: Prodigal
[2023-06-28 01:45:53,194] [INFO] Running command: gunzip -c /var/lib/cwl/stg0c3ab729-e129-4285-a9a8-d466ae2f2cc7/GCA_913777975.1_SP92_3_metabat2_genome_mining.29_genomic.fna.gz | prodigal -d GCA_913777975.1_SP92_3_metabat2_genome_mining.29_genomic.fna/cds.fna -a GCA_913777975.1_SP92_3_metabat2_genome_mining.29_genomic.fna/protein.faa -g 11 -q > /dev/null
[2023-06-28 01:46:00,740] [INFO] Task succeeded: Prodigal
[2023-06-28 01:46:00,741] [INFO] Task started: HMMsearch
[2023-06-28 01:46:00,741] [INFO] Running command: hmmsearch --tblout GCA_913777975.1_SP92_3_metabat2_genome_mining.29_genomic.fna/hmmer_result.tsv -E 1E-50 /var/lib/cwl/stgbd621306-c623-48d6-a15b-078f51b3d510/dqc_reference/reference_markers.hmm GCA_913777975.1_SP92_3_metabat2_genome_mining.29_genomic.fna/protein.faa > /dev/null
[2023-06-28 01:46:00,927] [INFO] Task succeeded: HMMsearch
[2023-06-28 01:46:00,929] [INFO] Found 6/6 markers.
[2023-06-28 01:46:00,959] [INFO] Query marker FASTA was written to GCA_913777975.1_SP92_3_metabat2_genome_mining.29_genomic.fna/markers.fasta
[2023-06-28 01:46:00,959] [INFO] Task started: Blastn
[2023-06-28 01:46:00,959] [INFO] Running command: blastn -query GCA_913777975.1_SP92_3_metabat2_genome_mining.29_genomic.fna/markers.fasta -db /var/lib/cwl/stgbd621306-c623-48d6-a15b-078f51b3d510/dqc_reference/reference_markers.fasta -out GCA_913777975.1_SP92_3_metabat2_genome_mining.29_genomic.fna/blast.markers.tsv -outfmt 6 -max_hsps 1 -num_alignments 5
[2023-06-28 01:46:02,093] [INFO] Task succeeded: Blastn
[2023-06-28 01:46:02,097] [INFO] Selected 14 target genomes.
[2023-06-28 01:46:02,097] [INFO] Target genome list was writen to GCA_913777975.1_SP92_3_metabat2_genome_mining.29_genomic.fna/target_genomes.txt
[2023-06-28 01:46:02,098] [INFO] Task started: fastANI
[2023-06-28 01:46:02,099] [INFO] Running command: fastANI --query /var/lib/cwl/stg0c3ab729-e129-4285-a9a8-d466ae2f2cc7/GCA_913777975.1_SP92_3_metabat2_genome_mining.29_genomic.fna.gz --refList GCA_913777975.1_SP92_3_metabat2_genome_mining.29_genomic.fna/target_genomes.txt --output GCA_913777975.1_SP92_3_metabat2_genome_mining.29_genomic.fna/fastani_result.tsv --threads 1
[2023-06-28 01:46:10,950] [INFO] Task succeeded: fastANI
[2023-06-28 01:46:10,950] [INFO] Loading species specific ANI threshold from /var/lib/cwl/stgbd621306-c623-48d6-a15b-078f51b3d510/dqc_reference/prokaryote_ANI_species_specific_threshold.txt
[2023-06-28 01:46:10,951] [WARNING] Species-specific ANI threshold file not found. Will use the default threshold for all species. [/var/lib/cwl/stgbd621306-c623-48d6-a15b-078f51b3d510/dqc_reference/prokaryote_ANI_species_specific_threshold.txt]
[2023-06-28 01:46:10,961] [INFO] Found 12 fastANI hits (0 hits with ANI > threshold)
[2023-06-28 01:46:10,961] [INFO] The taxonomy check result is classified as 'below_threshold'.
[2023-06-28 01:46:10,961] [INFO] DFAST Taxonomy check final result
--------------------------------------------------------------------------------
organism_name	strain	accession	taxid	species_taxid	relation_to_type	validated	ani	matched_fragments	total_fragments	ani_threshold	status
Microbacterium testaceum	strain=NBRC 12675	GCA_006539145.1	2033	2033	suspected-type	True	84.5616	514	826	95	below_threshold
Microbacterium enclense	strain=NIO-1002	GCA_900096885.1	993073	993073	type	True	84.2936	492	826	95	below_threshold
Microbacterium enclense	strain=NIO-1002	GCA_001456955.1	993073	993073	type	True	84.2731	493	826	95	below_threshold
Microbacterium proteolyticum	strain=CECT 8356	GCA_014192415.1	1572644	1572644	type	True	83.7728	471	826	95	below_threshold
Microbacterium imperiale	strain=DSM 20530	GCA_017876655.1	33884	33884	type	True	80.141	304	826	95	below_threshold
Microbacterium flavescens	strain=JCM 3877	GCA_018588945.1	69366	69366	type	True	79.9848	306	826	95	below_threshold
Microbacterium hibisci	strain=CCTCC AB 2016180	GCA_015278255.1	2036000	2036000	type	True	79.8229	325	826	95	below_threshold
Microbacterium sulfonylureivorans	strain=LAM7116	GCA_003999995.1	2486854	2486854	type	True	79.7098	321	826	95	below_threshold
Microbacterium hominis	strain=NBRC 15708	GCA_001592125.1	162426	162426	type	True	79.5794	334	826	95	below_threshold
Sphingomonas folli	strain=RHCKR7	GCA_019429525.1	2862497	2862497	type	True	74.9721	78	826	95	below_threshold
Sphingomonas yunnanensis	strain=YIM 3	GCA_019898765.1	310400	310400	type	True	74.8423	73	826	95	below_threshold
Sphingomonas citri	strain=RRHST34	GCA_019429485.1	2862499	2862499	type	True	74.737	85	826	95	below_threshold
--------------------------------------------------------------------------------
[2023-06-28 01:46:10,962] [INFO] DFAST Taxonomy check result was written to GCA_913777975.1_SP92_3_metabat2_genome_mining.29_genomic.fna/tc_result.tsv
[2023-06-28 01:46:10,963] [INFO] ===== Taxonomy check completed =====
[2023-06-28 01:46:10,963] [INFO] ===== Start completeness check using CheckM =====
[2023-06-28 01:46:10,963] [INFO] Setting CHECKM_DATA_PATH to /var/lib/cwl/stgbd621306-c623-48d6-a15b-078f51b3d510/dqc_reference/checkm_data
[2023-06-28 01:46:10,964] [INFO] Selected 'Prokaryote' markers (life, taxid=0) for CheckM
[2023-06-28 01:46:10,998] [INFO] Task started: CheckM
[2023-06-28 01:46:10,998] [INFO] Running command: checkm taxonomy_wf --tab_table -f GCA_913777975.1_SP92_3_metabat2_genome_mining.29_genomic.fna/cc_result.tsv -t 1 life "Prokaryote" GCA_913777975.1_SP92_3_metabat2_genome_mining.29_genomic.fna/checkm_input GCA_913777975.1_SP92_3_metabat2_genome_mining.29_genomic.fna/checkm_result
[2023-06-28 01:46:38,620] [INFO] Task succeeded: CheckM
[2023-06-28 01:46:38,621] [INFO] Completeness check finished.
--------------------------------------------------------------------------------
Completeness: 62.44%
Contamintation: 8.33%
Strain heterogeneity: 50.00%
--------------------------------------------------------------------------------
[2023-06-28 01:46:38,638] [INFO] ===== Completeness check finished =====
[2023-06-28 01:46:38,638] [INFO] ===== Start GTDB Search =====
[2023-06-28 01:46:38,638] [INFO] Query marker FASTA already exists. Will reuse it. (GCA_913777975.1_SP92_3_metabat2_genome_mining.29_genomic.fna/markers.fasta)
[2023-06-28 01:46:38,638] [INFO] Task started: Blastn
[2023-06-28 01:46:38,638] [INFO] Running command: blastn -query GCA_913777975.1_SP92_3_metabat2_genome_mining.29_genomic.fna/markers.fasta -db /var/lib/cwl/stgbd621306-c623-48d6-a15b-078f51b3d510/dqc_reference/reference_markers_gtdb.fasta -out GCA_913777975.1_SP92_3_metabat2_genome_mining.29_genomic.fna/blast.markers.gtdb.tsv -outfmt 6 -max_hsps 1 -num_alignments 5
[2023-06-28 01:46:40,436] [INFO] Task succeeded: Blastn
[2023-06-28 01:46:40,439] [INFO] Selected 18 target genomes.
[2023-06-28 01:46:40,439] [INFO] Target genome list was writen to GCA_913777975.1_SP92_3_metabat2_genome_mining.29_genomic.fna/target_genomes_gtdb.txt
[2023-06-28 01:46:40,493] [INFO] Task started: fastANI
[2023-06-28 01:46:40,493] [INFO] Running command: fastANI --query /var/lib/cwl/stg0c3ab729-e129-4285-a9a8-d466ae2f2cc7/GCA_913777975.1_SP92_3_metabat2_genome_mining.29_genomic.fna.gz --refList GCA_913777975.1_SP92_3_metabat2_genome_mining.29_genomic.fna/target_genomes_gtdb.txt --output GCA_913777975.1_SP92_3_metabat2_genome_mining.29_genomic.fna/fastani_result_gtdb.tsv --threads 1
[2023-06-28 01:46:52,415] [INFO] Task succeeded: fastANI
[2023-06-28 01:46:52,428] [INFO] Found 16 fastANI hits (1 hits with ANI > circumscription radius)
[2023-06-28 01:46:52,429] [INFO] GTDB search result
--------------------------------------------------------------------------------
accession	gtdb_species	ani	matched_fragments	total_fragments	gtdb_taxonomy	ani_circumscription_radius	mean_intra_species_ani	min_intra_species_ani	mean_intra_species_af	min_intra_species_af	num_clustered_genomes	status
GCF_001476655.1	s__Microbacterium testaceum_C	98.0364	699	826	d__Bacteria;p__Actinobacteriota;c__Actinomycetia;o__Actinomycetales;f__Microbacteriaceae;g__Microbacterium	95.0	N/A	N/A	N/A	N/A	1	conclusive
GCF_000202635.1	s__Microbacterium testaceum_F	89.3801	632	826	d__Bacteria;p__Actinobacteriota;c__Actinomycetia;o__Actinomycetales;f__Microbacteriaceae;g__Microbacterium	95.0	N/A	N/A	N/A	N/A	1	-
GCF_001476285.1	s__Microbacterium testaceum_B	87.6967	608	826	d__Bacteria;p__Actinobacteriota;c__Actinomycetia;o__Actinomycetales;f__Microbacteriaceae;g__Microbacterium	95.0	98.43	98.38	0.88	0.88	4	-
GCF_004854025.1	s__Microbacterium hydrothermale	85.5274	529	826	d__Bacteria;p__Actinobacteriota;c__Actinomycetia;o__Actinomycetales;f__Microbacteriaceae;g__Microbacterium	95.0	N/A	N/A	N/A	N/A	1	-
GCF_900104345.1	s__Microbacterium testaceum_A	84.6169	484	826	d__Bacteria;p__Actinobacteriota;c__Actinomycetia;o__Actinomycetales;f__Microbacteriaceae;g__Microbacterium	95.0	97.34	95.96	0.91	0.90	3	-
GCF_006539145.1	s__Microbacterium testaceum	84.5227	516	826	d__Bacteria;p__Actinobacteriota;c__Actinomycetia;o__Actinomycetales;f__Microbacteriaceae;g__Microbacterium	95.0	98.70	98.67	0.94	0.93	3	-
GCF_900156435.1	s__Microbacterium sp900156435	84.3588	485	826	d__Bacteria;p__Actinobacteriota;c__Actinomycetia;o__Actinomycetales;f__Microbacteriaceae;g__Microbacterium	95.0	100.00	100.00	1.00	1.00	2	-
GCF_007679045.1	s__Microbacterium sp007679045	84.3153	510	826	d__Bacteria;p__Actinobacteriota;c__Actinomycetia;o__Actinomycetales;f__Microbacteriaceae;g__Microbacterium	95.0	99.00	99.00	0.94	0.94	2	-
GCF_017833635.1	s__Microbacterium sp017833635	84.3075	484	826	d__Bacteria;p__Actinobacteriota;c__Actinomycetia;o__Actinomycetales;f__Microbacteriaceae;g__Microbacterium	95.0	N/A	N/A	N/A	N/A	1	-
GCF_001792815.1	s__Microbacterium sp001792815	84.2889	515	826	d__Bacteria;p__Actinobacteriota;c__Actinomycetia;o__Actinomycetales;f__Microbacteriaceae;g__Microbacterium	95.0	97.83	97.73	0.90	0.88	7	-
GCF_003075375.1	s__Microbacterium testaceum_E	83.9946	463	826	d__Bacteria;p__Actinobacteriota;c__Actinomycetia;o__Actinomycetales;f__Microbacteriaceae;g__Microbacterium	95.0	96.83	96.83	0.84	0.84	2	-
GCF_001424225.1	s__Microbacterium sp001424225	83.9742	475	826	d__Bacteria;p__Actinobacteriota;c__Actinomycetia;o__Actinomycetales;f__Microbacteriaceae;g__Microbacterium	95.0	N/A	N/A	N/A	N/A	1	-
GCF_001984105.1	s__Microbacterium sp001984105	83.7784	497	826	d__Bacteria;p__Actinobacteriota;c__Actinomycetia;o__Actinomycetales;f__Microbacteriaceae;g__Microbacterium	95.0	98.34	96.71	0.91	0.85	3	-
GCF_014193845.1	s__Sphingomonas sp014193845	74.8689	74	826	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Sphingomonadales;f__Sphingomonadaceae;g__Sphingomonas	95.0	98.90	98.90	0.87	0.87	2	-
GCF_004345855.1	s__Sphingomonas sp004345855	74.8476	86	826	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Sphingomonadales;f__Sphingomonadaceae;g__Sphingomonas	95.0	N/A	N/A	N/A	N/A	1	-
GCF_014194975.1	s__Sphingomonas sp014194975	74.8267	95	826	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Sphingomonadales;f__Sphingomonadaceae;g__Sphingomonas	95.0	N/A	N/A	N/A	N/A	1	-
--------------------------------------------------------------------------------
[2023-06-28 01:46:52,430] [INFO] GTDB search result was written to GCA_913777975.1_SP92_3_metabat2_genome_mining.29_genomic.fna/result_gtdb.tsv
[2023-06-28 01:46:52,431] [INFO] ===== GTDB Search completed =====
[2023-06-28 01:46:52,434] [INFO] DFAST_QC result json was written to GCA_913777975.1_SP92_3_metabat2_genome_mining.29_genomic.fna/dqc_result.json
[2023-06-28 01:46:52,434] [INFO] DFAST_QC completed!
[2023-06-28 01:46:52,434] [INFO] Total running time: 0h1m0s