[2024-01-24 14:15:32,316] [INFO] DFAST_QC pipeline started.
[2024-01-24 14:15:32,318] [INFO] DFAST_QC version: 0.5.7
[2024-01-24 14:15:32,318] [INFO] DQC Reference Directory: /var/lib/cwl/stg97374f27-1ae8-45fb-a653-a2f031d601c5/dqc_reference
[2024-01-24 14:15:33,607] [INFO] ===== Start taxonomy check using ANI =====
[2024-01-24 14:15:33,609] [INFO] Task started: Prodigal
[2024-01-24 14:15:33,611] [INFO] Running command: gunzip -c /var/lib/cwl/stg03c5b3db-4cb1-4773-8f5a-33b49dcb76c5/GCF_024722015.1_ASM2472201v1_genomic.fna.gz | prodigal -d GCF_024722015.1_ASM2472201v1_genomic.fna/cds.fna -a GCF_024722015.1_ASM2472201v1_genomic.fna/protein.faa -g 11 -q > /dev/null
[2024-01-24 14:15:56,615] [INFO] Task succeeded: Prodigal
[2024-01-24 14:15:56,616] [INFO] Task started: HMMsearch
[2024-01-24 14:15:56,616] [INFO] Running command: hmmsearch --tblout GCF_024722015.1_ASM2472201v1_genomic.fna/hmmer_result.tsv -E 1E-50 /var/lib/cwl/stg97374f27-1ae8-45fb-a653-a2f031d601c5/dqc_reference/reference_markers.hmm GCF_024722015.1_ASM2472201v1_genomic.fna/protein.faa > /dev/null
[2024-01-24 14:15:57,141] [INFO] Task succeeded: HMMsearch
[2024-01-24 14:15:57,143] [INFO] Found 6/6 markers.
[2024-01-24 14:15:57,454] [INFO] Query marker FASTA was written to GCF_024722015.1_ASM2472201v1_genomic.fna/markers.fasta
[2024-01-24 14:15:57,454] [INFO] Task started: Blastn
[2024-01-24 14:15:57,454] [INFO] Running command: blastn -query GCF_024722015.1_ASM2472201v1_genomic.fna/markers.fasta -db /var/lib/cwl/stg97374f27-1ae8-45fb-a653-a2f031d601c5/dqc_reference/reference_markers.fasta -out GCF_024722015.1_ASM2472201v1_genomic.fna/blast.markers.tsv -outfmt 6 -max_hsps 1 -num_alignments 5
[2024-01-24 14:15:58,060] [INFO] Task succeeded: Blastn
[2024-01-24 14:15:58,064] [INFO] Selected 25 target genomes.
[2024-01-24 14:15:58,064] [INFO] Target genome list was writen to GCF_024722015.1_ASM2472201v1_genomic.fna/target_genomes.txt
[2024-01-24 14:15:58,074] [INFO] Task started: fastANI
[2024-01-24 14:15:58,074] [INFO] Running command: fastANI --query /var/lib/cwl/stg03c5b3db-4cb1-4773-8f5a-33b49dcb76c5/GCF_024722015.1_ASM2472201v1_genomic.fna.gz --refList GCF_024722015.1_ASM2472201v1_genomic.fna/target_genomes.txt --output GCF_024722015.1_ASM2472201v1_genomic.fna/fastani_result.tsv --threads 1
[2024-01-24 14:16:29,700] [INFO] Task succeeded: fastANI
[2024-01-24 14:16:29,700] [INFO] Loading species specific ANI threshold from /var/lib/cwl/stg97374f27-1ae8-45fb-a653-a2f031d601c5/dqc_reference/prokaryote_ANI_species_specific_threshold.txt
[2024-01-24 14:16:29,701] [WARNING] Species-specific ANI threshold file not found. Will use the default threshold for all species. [/var/lib/cwl/stg97374f27-1ae8-45fb-a653-a2f031d601c5/dqc_reference/prokaryote_ANI_species_specific_threshold.txt]
[2024-01-24 14:16:29,717] [INFO] Found 21 fastANI hits (0 hits with ANI > threshold)
[2024-01-24 14:16:29,717] [INFO] The taxonomy check result is classified as 'below_threshold'.
[2024-01-24 14:16:29,717] [INFO] DFAST Taxonomy check final result
--------------------------------------------------------------------------------
organism_name	strain	accession	taxid	species_taxid	relation_to_type	validated	ani	matched_fragments	total_fragments	ani_threshold	status
Paenibacillus foliorum	strain=LMG 31456	GCA_013141765.1	2654974	2654974	type	True	86.158	1961	2896	95	below_threshold
Paenibacillus periandrae	strain=PM10	GCA_022458865.1	1761741	1761741	type	True	78.9617	485	2896	95	below_threshold
Paenibacillus planticolens	strain=LMG 31457	GCA_013141725.1	2654976	2654976	type	True	78.869	201	2896	95	below_threshold
Paenibacillus piri	strain=MS74	GCA_004354045.1	2547395	2547395	type	True	78.8116	770	2896	95	below_threshold
Paenibacillus rigui	strain=JCM 16352	GCA_002234615.1	554312	554312	type	True	78.3398	396	2896	95	below_threshold
Paenibacillus xerothermodurans	strain=ATCC 27380	GCA_002220865.2	1977292	1977292	type	True	77.8879	229	2896	95	below_threshold
Paenibacillus doosanensis	strain=CAU 1055	GCA_025060755.1	1229154	1229154	type	True	77.6836	380	2896	95	below_threshold
Paenibacillus plantarum	strain=LMG 31461	GCA_013141695.1	2654975	2654975	type	True	77.4412	177	2896	95	below_threshold
Paenibacillus eucommiae	strain=DSM 26048	GCA_017874215.1	1355755	1355755	type	True	77.3871	173	2896	95	below_threshold
Paenibacillus nasutitermitis	strain=CGMCC 1.15178	GCA_014641075.1	1652958	1652958	type	True	77.1677	115	2896	95	below_threshold
Paenibacillus validus	strain=NBRC 15382	GCA_004000985.1	44253	44253	type	True	77.0778	150	2896	95	below_threshold
Paenibacillus guangzhouensis	strain=KCTC 33171	GCA_009363075.1	1473112	1473112	type	True	76.925	94	2896	95	below_threshold
Paenibacillus naphthalenovorans	strain=PR-N1	GCA_900099895.1	162209	162209	type	True	76.8379	162	2896	95	below_threshold
Paenibacillus algorifonticola	strain=CGMCC 1.10223	GCA_900112925.1	684063	684063	type	True	76.743	115	2896	95	below_threshold
Cohnella herbarum	strain=MFER-1	GCA_012849095.1	2728023	2728023	type	True	76.741	71	2896	95	below_threshold
Paenibacillus crassostreae	strain=LPB0068	GCA_001857945.1	1763538	1763538	type	True	76.7332	60	2896	95	below_threshold
Paenibacillus algorifonticola	strain=XJ259	GCA_000971975.1	684063	684063	type	True	76.6892	112	2896	95	below_threshold
Paenibacillus crassostreae	strain=LPB0068	GCA_001637335.1	1763538	1763538	type	True	76.6842	57	2896	95	below_threshold
Paenibacillus roseus	strain=MAHUQ-46	GCA_016424485.1	2798579	2798579	type	True	76.4053	59	2896	95	below_threshold
Paenibacillus oceani	strain=IB182363	GCA_014705615.1	2772510	2772510	type	True	75.9675	84	2896	95	below_threshold
Paenibacillus turpanensis	strain=YIM B00363	GCA_011421635.1	2689078	2689078	type	True	75.6921	59	2896	95	below_threshold
--------------------------------------------------------------------------------
[2024-01-24 14:16:29,719] [INFO] DFAST Taxonomy check result was written to GCF_024722015.1_ASM2472201v1_genomic.fna/tc_result.tsv
[2024-01-24 14:16:29,719] [INFO] ===== Taxonomy check completed =====
[2024-01-24 14:16:29,719] [INFO] ===== Start completeness check using CheckM =====
[2024-01-24 14:16:29,720] [INFO] Setting CHECKM_DATA_PATH to /var/lib/cwl/stg97374f27-1ae8-45fb-a653-a2f031d601c5/dqc_reference/checkm_data
[2024-01-24 14:16:29,721] [INFO] Selected 'Prokaryote' markers (life, taxid=0) for CheckM
[2024-01-24 14:16:29,800] [INFO] Task started: CheckM
[2024-01-24 14:16:29,800] [INFO] Running command: checkm taxonomy_wf --tab_table -f GCF_024722015.1_ASM2472201v1_genomic.fna/cc_result.tsv -t 1 life "Prokaryote" GCF_024722015.1_ASM2472201v1_genomic.fna/checkm_input GCF_024722015.1_ASM2472201v1_genomic.fna/checkm_result
[2024-01-24 14:17:36,115] [INFO] Task succeeded: CheckM
[2024-01-24 14:17:36,117] [INFO] Completeness check finished.
--------------------------------------------------------------------------------
Completeness: 100.00%
Contamintation: 4.17%
Strain heterogeneity: 0.00%
--------------------------------------------------------------------------------
[2024-01-24 14:17:36,143] [INFO] ===== Completeness check finished =====
[2024-01-24 14:17:36,144] [INFO] ===== Start GTDB Search =====
[2024-01-24 14:17:36,144] [INFO] Query marker FASTA already exists. Will reuse it. (GCF_024722015.1_ASM2472201v1_genomic.fna/markers.fasta)
[2024-01-24 14:17:36,145] [INFO] Task started: Blastn
[2024-01-24 14:17:36,145] [INFO] Running command: blastn -query GCF_024722015.1_ASM2472201v1_genomic.fna/markers.fasta -db /var/lib/cwl/stg97374f27-1ae8-45fb-a653-a2f031d601c5/dqc_reference/reference_markers_gtdb.fasta -out GCF_024722015.1_ASM2472201v1_genomic.fna/blast.markers.gtdb.tsv -outfmt 6 -max_hsps 1 -num_alignments 5
[2024-01-24 14:17:36,950] [INFO] Task succeeded: Blastn
[2024-01-24 14:17:36,955] [INFO] Selected 27 target genomes.
[2024-01-24 14:17:36,955] [INFO] Target genome list was writen to GCF_024722015.1_ASM2472201v1_genomic.fna/target_genomes_gtdb.txt
[2024-01-24 14:17:37,011] [INFO] Task started: fastANI
[2024-01-24 14:17:37,011] [INFO] Running command: fastANI --query /var/lib/cwl/stg03c5b3db-4cb1-4773-8f5a-33b49dcb76c5/GCF_024722015.1_ASM2472201v1_genomic.fna.gz --refList GCF_024722015.1_ASM2472201v1_genomic.fna/target_genomes_gtdb.txt --output GCF_024722015.1_ASM2472201v1_genomic.fna/fastani_result_gtdb.tsv --threads 1
[2024-01-24 14:18:13,822] [INFO] Task succeeded: fastANI
[2024-01-24 14:18:13,842] [INFO] Found 25 fastANI hits (0 hits with ANI > circumscription radius)
[2024-01-24 14:18:13,842] [INFO] GTDB search result
--------------------------------------------------------------------------------
accession	gtdb_species	ani	matched_fragments	total_fragments	gtdb_taxonomy	ani_circumscription_radius	mean_intra_species_ani	min_intra_species_ani	mean_intra_species_af	min_intra_species_af	num_clustered_genomes	status
GCF_013141765.1	s__Paenibacillus_S foliorum	86.1701	1959	2896	d__Bacteria;p__Firmicutes;c__Bacilli;o__Paenibacillales;f__NBRC-103111;g__Paenibacillus_S	95.0	N/A	N/A	N/A	N/A	1	-
GCF_900114475.1	s__Paenibacillus_S sp900114475	79.1207	475	2896	d__Bacteria;p__Firmicutes;c__Bacilli;o__Paenibacillales;f__NBRC-103111;g__Paenibacillus_S	95.0	N/A	N/A	N/A	N/A	1	-
GCF_001956045.1	s__Paenibacillus_S sp001956045	78.8753	527	2896	d__Bacteria;p__Firmicutes;c__Bacilli;o__Paenibacillales;f__NBRC-103111;g__Paenibacillus_S	95.0	N/A	N/A	N/A	N/A	1	-
GCF_004354045.1	s__Paenibacillus_S piri	78.8009	767	2896	d__Bacteria;p__Firmicutes;c__Bacilli;o__Paenibacillales;f__NBRC-103111;g__Paenibacillus_S	95.0	N/A	N/A	N/A	N/A	1	-
GCF_002234615.1	s__Paenibacillus_S rigui	78.3362	395	2896	d__Bacteria;p__Firmicutes;c__Bacilli;o__Paenibacillales;f__NBRC-103111;g__Paenibacillus_S	95.0	N/A	N/A	N/A	N/A	1	-
GCF_000686845.1	s__Paenibacillus_S sp000686845	78.2538	419	2896	d__Bacteria;p__Firmicutes;c__Bacilli;o__Paenibacillales;f__NBRC-103111;g__Paenibacillus_S	95.0	N/A	N/A	N/A	N/A	1	-
GCF_011440305.1	s__Paenibacillus_S sp011440305	78.2015	398	2896	d__Bacteria;p__Firmicutes;c__Bacilli;o__Paenibacillales;f__NBRC-103111;g__Paenibacillus_S	95.0	N/A	N/A	N/A	N/A	1	-
GCF_002220865.2	s__Paenibacillus_S xerothermodurans	77.8838	230	2896	d__Bacteria;p__Firmicutes;c__Bacilli;o__Paenibacillales;f__NBRC-103111;g__Paenibacillus_S	95.0	N/A	N/A	N/A	N/A	1	-
GCF_002042965.1	s__Paenibacillus_S sp002042965	77.8528	374	2896	d__Bacteria;p__Firmicutes;c__Bacilli;o__Paenibacillales;f__NBRC-103111;g__Paenibacillus_S	95.0	97.63	97.63	0.93	0.93	2	-
GCF_000722545.1	s__Paenibacillus_G tyrfis	77.3287	186	2896	d__Bacteria;p__Firmicutes;c__Bacilli;o__Paenibacillales;f__NBRC-103111;g__Paenibacillus_G	95.0	98.07	98.07	0.93	0.93	2	-
GCF_014641075.1	s__Paenibacillus_Z nasutitermitis	77.1058	114	2896	d__Bacteria;p__Firmicutes;c__Bacilli;o__Paenibacillales;f__Paenibacillaceae;g__Paenibacillus_Z	95.0	N/A	N/A	N/A	N/A	1	-
GCF_004000985.1	s__Paenibacillus_G validus	77.0778	150	2896	d__Bacteria;p__Firmicutes;c__Bacilli;o__Paenibacillales;f__NBRC-103111;g__Paenibacillus_G	95.0	99.29	99.15	0.90	0.89	6	-
GCF_017874275.1	s__Paenibacillus_Z sepulcri	77.0372	135	2896	d__Bacteria;p__Firmicutes;c__Bacilli;o__Paenibacillales;f__Paenibacillaceae;g__Paenibacillus_Z	95.0	N/A	N/A	N/A	N/A	1	-
GCF_001956095.1	s__Paenibacillus_C sp001956095	77.0334	97	2896	d__Bacteria;p__Firmicutes;c__Bacilli;o__Paenibacillales;f__Paenibacillaceae;g__Paenibacillus_C	95.0	N/A	N/A	N/A	N/A	1	-
GCF_900101595.1	s__Paenibacillus_G sp900101595	76.9935	143	2896	d__Bacteria;p__Firmicutes;c__Bacilli;o__Paenibacillales;f__NBRC-103111;g__Paenibacillus_G	95.0	100.00	100.00	0.99	0.99	2	-
GCF_018998565.1	s__Paenibacillus_G sp018998565	76.9377	200	2896	d__Bacteria;p__Firmicutes;c__Bacilli;o__Paenibacillales;f__NBRC-103111;g__Paenibacillus_G	95.0	N/A	N/A	N/A	N/A	1	-
GCF_009363075.1	s__Paenibacillus_K guangzhouensis	76.929	91	2896	d__Bacteria;p__Firmicutes;c__Bacilli;o__Paenibacillales;f__Paenibacillaceae;g__Paenibacillus_K	95.0	N/A	N/A	N/A	N/A	1	-
GCF_900099895.1	s__Paenibacillus_G naphthalenovorans	76.8782	162	2896	d__Bacteria;p__Firmicutes;c__Bacilli;o__Paenibacillales;f__NBRC-103111;g__Paenibacillus_G	95.0	99.27	98.99	0.91	0.86	5	-
GCF_900110075.1	s__Paenibacillus_Z sp900110075	76.7239	88	2896	d__Bacteria;p__Firmicutes;c__Bacilli;o__Paenibacillales;f__Paenibacillaceae;g__Paenibacillus_Z	95.0	N/A	N/A	N/A	N/A	1	-
GCF_001857945.1	s__Paenibacillus crassostreae	76.7097	59	2896	d__Bacteria;p__Firmicutes;c__Bacilli;o__Paenibacillales;f__Paenibacillaceae;g__Paenibacillus	95.0	100.00	100.00	1.00	1.00	2	-
GCF_000971975.1	s__Paenibacillus_C algorifonticola	76.7006	113	2896	d__Bacteria;p__Firmicutes;c__Bacilli;o__Paenibacillales;f__Paenibacillaceae;g__Paenibacillus_C	95.0	99.98	99.98	0.99	0.99	2	-
GCF_000380965.1	s__Paenibacillus_M ginsengihumi	76.1063	100	2896	d__Bacteria;p__Firmicutes;c__Bacilli;o__Paenibacillales;f__NBRC-103111;g__Paenibacillus_M	95.0	N/A	N/A	N/A	N/A	1	-
GCF_014705615.1	s__Paenibacillus_AB sp014705615	75.9719	82	2896	d__Bacteria;p__Firmicutes;c__Bacilli;o__Paenibacillales;f__NBRC-103111;g__Paenibacillus_AB	95.0	N/A	N/A	N/A	N/A	1	-
GCF_003583765.1	s__Paenibacillus_C nanensis	75.8173	67	2896	d__Bacteria;p__Firmicutes;c__Bacilli;o__Paenibacillales;f__Paenibacillaceae;g__Paenibacillus_C	95.0	N/A	N/A	N/A	N/A	1	-
GCF_011421635.1	s__Paenibacillus_AK turpanensis	75.6921	59	2896	d__Bacteria;p__Firmicutes;c__Bacilli;o__Paenibacillales;f__YIM-B00363;g__Paenibacillus_AK	95.0	N/A	N/A	N/A	N/A	1	-
--------------------------------------------------------------------------------
[2024-01-24 14:18:13,847] [INFO] GTDB search result was written to GCF_024722015.1_ASM2472201v1_genomic.fna/result_gtdb.tsv
[2024-01-24 14:18:13,848] [INFO] ===== GTDB Search completed =====
[2024-01-24 14:18:13,855] [INFO] DFAST_QC result json was written to GCF_024722015.1_ASM2472201v1_genomic.fna/dqc_result.json
[2024-01-24 14:18:13,856] [INFO] DFAST_QC completed!
[2024-01-24 14:18:13,856] [INFO] Total running time: 0h2m42s