[2024-01-24 14:30:59,793] [INFO] DFAST_QC pipeline started.
[2024-01-24 14:30:59,794] [INFO] DFAST_QC version: 0.5.7
[2024-01-24 14:30:59,795] [INFO] DQC Reference Directory: /var/lib/cwl/stg096a7489-45de-4704-bf57-1a112dc275a5/dqc_reference
[2024-01-24 14:31:01,213] [INFO] ===== Start taxonomy check using ANI =====
[2024-01-24 14:31:01,214] [INFO] Task started: Prodigal
[2024-01-24 14:31:01,215] [INFO] Running command: gunzip -c /var/lib/cwl/stg50a47bfa-4533-41ff-834b-6e256692760f/GCF_014192015.1_ASM1419201v1_genomic.fna.gz | prodigal -d GCF_014192015.1_ASM1419201v1_genomic.fna/cds.fna -a GCF_014192015.1_ASM1419201v1_genomic.fna/protein.faa -g 11 -q > /dev/null
[2024-01-24 14:31:23,398] [INFO] Task succeeded: Prodigal
[2024-01-24 14:31:23,398] [INFO] Task started: HMMsearch
[2024-01-24 14:31:23,398] [INFO] Running command: hmmsearch --tblout GCF_014192015.1_ASM1419201v1_genomic.fna/hmmer_result.tsv -E 1E-50 /var/lib/cwl/stg096a7489-45de-4704-bf57-1a112dc275a5/dqc_reference/reference_markers.hmm GCF_014192015.1_ASM1419201v1_genomic.fna/protein.faa > /dev/null
[2024-01-24 14:31:23,871] [INFO] Task succeeded: HMMsearch
[2024-01-24 14:31:23,873] [INFO] Found 6/6 markers.
[2024-01-24 14:31:23,934] [INFO] Query marker FASTA was written to GCF_014192015.1_ASM1419201v1_genomic.fna/markers.fasta
[2024-01-24 14:31:23,935] [INFO] Task started: Blastn
[2024-01-24 14:31:23,935] [INFO] Running command: blastn -query GCF_014192015.1_ASM1419201v1_genomic.fna/markers.fasta -db /var/lib/cwl/stg096a7489-45de-4704-bf57-1a112dc275a5/dqc_reference/reference_markers.fasta -out GCF_014192015.1_ASM1419201v1_genomic.fna/blast.markers.tsv -outfmt 6 -max_hsps 1 -num_alignments 5
[2024-01-24 14:31:24,604] [INFO] Task succeeded: Blastn
[2024-01-24 14:31:24,607] [INFO] Selected 22 target genomes.
[2024-01-24 14:31:24,608] [INFO] Target genome list was writen to GCF_014192015.1_ASM1419201v1_genomic.fna/target_genomes.txt
[2024-01-24 14:31:24,647] [INFO] Task started: fastANI
[2024-01-24 14:31:24,648] [INFO] Running command: fastANI --query /var/lib/cwl/stg50a47bfa-4533-41ff-834b-6e256692760f/GCF_014192015.1_ASM1419201v1_genomic.fna.gz --refList GCF_014192015.1_ASM1419201v1_genomic.fna/target_genomes.txt --output GCF_014192015.1_ASM1419201v1_genomic.fna/fastani_result.tsv --threads 1
[2024-01-24 14:31:51,067] [INFO] Task succeeded: fastANI
[2024-01-24 14:31:51,067] [INFO] Loading species specific ANI threshold from /var/lib/cwl/stg096a7489-45de-4704-bf57-1a112dc275a5/dqc_reference/prokaryote_ANI_species_specific_threshold.txt
[2024-01-24 14:31:51,068] [WARNING] Species-specific ANI threshold file not found. Will use the default threshold for all species. [/var/lib/cwl/stg096a7489-45de-4704-bf57-1a112dc275a5/dqc_reference/prokaryote_ANI_species_specific_threshold.txt]
[2024-01-24 14:31:51,091] [INFO] Found 22 fastANI hits (1 hits with ANI > threshold)
[2024-01-24 14:31:51,091] [INFO] The taxonomy check result is classified as 'conclusive'.
[2024-01-24 14:31:51,091] [INFO] DFAST Taxonomy check final result
--------------------------------------------------------------------------------
organism_name	strain	accession	taxid	species_taxid	relation_to_type	validated	ani	matched_fragments	total_fragments	ani_threshold	status
Paenibacillus rhizosphaerae	strain=CECT 5831	GCA_014192015.1	297318	297318	type	True	100.0	2507	2509	95	conclusive
Paenibacillus chibensis	strain=NBRC 15958	GCA_004001045.1	59846	59846	type	True	79.3702	682	2509	95	below_threshold
Paenibacillus dokdonensis	strain=YH-JAE5	GCA_004916975.1	2567944	2567944	type	True	79.0156	600	2509	95	below_threshold
Paenibacillus lactis	strain=DSM 15596	GCA_017873605.1	228574	228574	type	True	77.8477	426	2509	95	below_threshold
Paenibacillus ihbetae	strain=IHBB 9852	GCA_002741055.1	1870820	1870820	suspected-type	True	77.8444	416	2509	95	below_threshold
Paenibacillus rhizophilus	strain=7197	GCA_003854965.1	1850366	1850366	type	True	77.7837	234	2509	95	below_threshold
Paenibacillus faecalis	strain=Marseille-P3787	GCA_900289175.1	2079532	2079532	type	True	77.6708	174	2509	95	below_threshold
Paenibacillus uliginis	strain=N3/975	GCA_900177425.1	683737	683737	type	True	77.476	238	2509	95	below_threshold
Paenibacillus zanthoxyli	strain=JH29	GCA_000520715.1	369399	369399	type	True	77.3769	193	2509	95	below_threshold
Paenibacillus algicola	strain=HB172198	GCA_005577435.1	2565926	2565926	type	True	77.354	208	2509	95	below_threshold
Paenibacillus oralis	strain=KCOM 3021	GCA_003863965.1	2490856	2490856	type	True	77.2495	338	2509	95	below_threshold
Paenibacillus agri	strain=JW14	GCA_013359945.1	2744309	2744309	type	True	77.225	183	2509	95	below_threshold
Paenibacillus phocaensis	strain=mt24	GCA_900021165.1	1776378	1776378	type	True	77.2163	294	2509	95	below_threshold
Paenibacillus macerans	strain=ATCC 8244	GCA_000746875.1	44252	44252	type	True	77.2126	349	2509	95	below_threshold
Paenibacillus macerans	strain=NBRC 15307	GCA_004000965.1	44252	44252	type	True	77.2088	325	2509	95	below_threshold
Paenibacillus brevis	strain=MSJ-6	GCA_018919145.1	2841508	2841508	type	True	77.1176	172	2509	95	below_threshold
Paenibacillus tianjinensis	strain=TB2019	GCA_017086365.1	2810347	2810347	type	True	77.0507	227	2509	95	below_threshold
Paenibacillus typhae	strain=CGMCC 1.11012	GCA_900099765.1	1174501	1174501	type	True	76.9617	277	2509	95	below_threshold
Paenibacillus artemisiicola	strain=MWE-103	GCA_017652985.1	1172618	1172618	type	True	76.8647	276	2509	95	below_threshold
Paenibacillus lycopersici	strain=12200R-189	GCA_010119935.1	2704462	2704462	type	True	76.6716	250	2509	95	below_threshold
Paenibacillus rhizovicinus	strain=14171R-81	GCA_010365285.1	2704463	2704463	type	True	76.604	191	2509	95	below_threshold
Cohnella panacarvi	strain=Gsoil 349	GCA_000515335.1	400776	400776	type	True	76.1959	118	2509	95	below_threshold
--------------------------------------------------------------------------------
[2024-01-24 14:31:51,093] [INFO] DFAST Taxonomy check result was written to GCF_014192015.1_ASM1419201v1_genomic.fna/tc_result.tsv
[2024-01-24 14:31:51,093] [INFO] ===== Taxonomy check completed =====
[2024-01-24 14:31:51,094] [INFO] ===== Start completeness check using CheckM =====
[2024-01-24 14:31:51,094] [INFO] Setting CHECKM_DATA_PATH to /var/lib/cwl/stg096a7489-45de-4704-bf57-1a112dc275a5/dqc_reference/checkm_data
[2024-01-24 14:31:51,095] [INFO] Selected 'Prokaryote' markers (life, taxid=0) for CheckM
[2024-01-24 14:31:51,173] [INFO] Task started: CheckM
[2024-01-24 14:31:51,174] [INFO] Running command: checkm taxonomy_wf --tab_table -f GCF_014192015.1_ASM1419201v1_genomic.fna/cc_result.tsv -t 1 life "Prokaryote" GCF_014192015.1_ASM1419201v1_genomic.fna/checkm_input GCF_014192015.1_ASM1419201v1_genomic.fna/checkm_result
[2024-01-24 14:32:52,799] [INFO] Task succeeded: CheckM
[2024-01-24 14:32:52,800] [INFO] Completeness check finished.
--------------------------------------------------------------------------------
Completeness: 98.96%
Contamintation: 0.38%
Strain heterogeneity: 0.00%
--------------------------------------------------------------------------------
[2024-01-24 14:32:52,842] [INFO] ===== Completeness check finished =====
[2024-01-24 14:32:52,842] [INFO] ===== Start GTDB Search =====
[2024-01-24 14:32:52,843] [INFO] Query marker FASTA already exists. Will reuse it. (GCF_014192015.1_ASM1419201v1_genomic.fna/markers.fasta)
[2024-01-24 14:32:52,843] [INFO] Task started: Blastn
[2024-01-24 14:32:52,843] [INFO] Running command: blastn -query GCF_014192015.1_ASM1419201v1_genomic.fna/markers.fasta -db /var/lib/cwl/stg096a7489-45de-4704-bf57-1a112dc275a5/dqc_reference/reference_markers_gtdb.fasta -out GCF_014192015.1_ASM1419201v1_genomic.fna/blast.markers.gtdb.tsv -outfmt 6 -max_hsps 1 -num_alignments 5
[2024-01-24 14:32:53,663] [INFO] Task succeeded: Blastn
[2024-01-24 14:32:53,667] [INFO] Selected 10 target genomes.
[2024-01-24 14:32:53,667] [INFO] Target genome list was writen to GCF_014192015.1_ASM1419201v1_genomic.fna/target_genomes_gtdb.txt
[2024-01-24 14:32:53,675] [INFO] Task started: fastANI
[2024-01-24 14:32:53,676] [INFO] Running command: fastANI --query /var/lib/cwl/stg50a47bfa-4533-41ff-834b-6e256692760f/GCF_014192015.1_ASM1419201v1_genomic.fna.gz --refList GCF_014192015.1_ASM1419201v1_genomic.fna/target_genomes_gtdb.txt --output GCF_014192015.1_ASM1419201v1_genomic.fna/fastani_result_gtdb.tsv --threads 1
[2024-01-24 14:33:07,608] [INFO] Task succeeded: fastANI
[2024-01-24 14:33:07,617] [INFO] Found 9 fastANI hits (1 hits with ANI > circumscription radius)
[2024-01-24 14:33:07,617] [INFO] GTDB search result
--------------------------------------------------------------------------------
accession	gtdb_species	ani	matched_fragments	total_fragments	gtdb_taxonomy	ani_circumscription_radius	mean_intra_species_ani	min_intra_species_ani	mean_intra_species_af	min_intra_species_af	num_clustered_genomes	status
GCF_014192015.1	s__Paenibacillus rhizosphaerae	100.0	2507	2509	d__Bacteria;p__Firmicutes;c__Bacilli;o__Paenibacillales;f__Paenibacillaceae;g__Paenibacillus	95.0	100.00	100.00	1.00	1.00	2	conclusive
GCF_009363095.1	s__Paenibacillus cellulositrophicus_A	92.8817	2046	2509	d__Bacteria;p__Firmicutes;c__Bacilli;o__Paenibacillales;f__Paenibacillaceae;g__Paenibacillus	95.0	96.73	96.59	0.92	0.91	14	-
GCF_002257645.1	s__Paenibacillus sp002257645	90.5689	2069	2509	d__Bacteria;p__Firmicutes;c__Bacilli;o__Paenibacillales;f__Paenibacillaceae;g__Paenibacillus	95.0	N/A	N/A	N/A	N/A	1	-
GCF_009870825.1	s__Paenibacillus albilobatus	79.5999	787	2509	d__Bacteria;p__Firmicutes;c__Bacilli;o__Paenibacillales;f__Paenibacillaceae;g__Paenibacillus	95.0	99.80	99.80	0.98	0.98	2	-
GCF_018333375.1	s__Paenibacillus cookii	79.5117	748	2509	d__Bacteria;p__Firmicutes;c__Bacilli;o__Paenibacillales;f__Paenibacillaceae;g__Paenibacillus	95.0	99.09	99.09	0.93	0.93	2	-
GCF_009807035.1	s__Paenibacillus sp009807035	79.4151	583	2509	d__Bacteria;p__Firmicutes;c__Bacilli;o__Paenibacillales;f__Paenibacillaceae;g__Paenibacillus	95.0	N/A	N/A	N/A	N/A	1	-
GCF_004001045.1	s__Paenibacillus chibensis	79.3837	680	2509	d__Bacteria;p__Firmicutes;c__Bacilli;o__Paenibacillales;f__Paenibacillaceae;g__Paenibacillus	95.0	N/A	N/A	N/A	N/A	1	-
GCF_003854965.1	s__Paenibacillus sp003854965	77.7837	234	2509	d__Bacteria;p__Firmicutes;c__Bacilli;o__Paenibacillales;f__Paenibacillaceae;g__Paenibacillus	95.0	95.02	95.02	0.86	0.86	2	-
GCF_017086365.1	s__Paenibacillus sp017086365	77.053	229	2509	d__Bacteria;p__Firmicutes;c__Bacilli;o__Paenibacillales;f__Paenibacillaceae;g__Paenibacillus	95.0	N/A	N/A	N/A	N/A	1	-
--------------------------------------------------------------------------------
[2024-01-24 14:33:07,618] [INFO] GTDB search result was written to GCF_014192015.1_ASM1419201v1_genomic.fna/result_gtdb.tsv
[2024-01-24 14:33:07,619] [INFO] ===== GTDB Search completed =====
[2024-01-24 14:33:07,629] [INFO] DFAST_QC result json was written to GCF_014192015.1_ASM1419201v1_genomic.fna/dqc_result.json
[2024-01-24 14:33:07,629] [INFO] DFAST_QC completed!
[2024-01-24 14:33:07,629] [INFO] Total running time: 0h2m8s