[2024-01-24 12:44:43,711] [INFO] DFAST_QC pipeline started.
[2024-01-24 12:44:43,716] [INFO] DFAST_QC version: 0.5.7
[2024-01-24 12:44:43,717] [INFO] DQC Reference Directory: /var/lib/cwl/stg6db7b510-dabb-42ca-9891-c7af31aee59e/dqc_reference
[2024-01-24 12:44:45,115] [INFO] ===== Start taxonomy check using ANI =====
[2024-01-24 12:44:45,116] [INFO] Task started: Prodigal
[2024-01-24 12:44:45,116] [INFO] Running command: gunzip -c /var/lib/cwl/stg09245fec-694f-4849-9c20-f22d622f87c9/GCF_900446005.1_58116_B01_genomic.fna.gz | prodigal -d GCF_900446005.1_58116_B01_genomic.fna/cds.fna -a GCF_900446005.1_58116_B01_genomic.fna/protein.faa -g 11 -q > /dev/null
[2024-01-24 12:44:55,907] [INFO] Task succeeded: Prodigal
[2024-01-24 12:44:55,908] [INFO] Task started: HMMsearch
[2024-01-24 12:44:55,908] [INFO] Running command: hmmsearch --tblout GCF_900446005.1_58116_B01_genomic.fna/hmmer_result.tsv -E 1E-50 /var/lib/cwl/stg6db7b510-dabb-42ca-9891-c7af31aee59e/dqc_reference/reference_markers.hmm GCF_900446005.1_58116_B01_genomic.fna/protein.faa > /dev/null
[2024-01-24 12:44:56,209] [INFO] Task succeeded: HMMsearch
[2024-01-24 12:44:56,211] [INFO] Found 6/6 markers.
[2024-01-24 12:44:56,248] [INFO] Query marker FASTA was written to GCF_900446005.1_58116_B01_genomic.fna/markers.fasta
[2024-01-24 12:44:56,248] [INFO] Task started: Blastn
[2024-01-24 12:44:56,248] [INFO] Running command: blastn -query GCF_900446005.1_58116_B01_genomic.fna/markers.fasta -db /var/lib/cwl/stg6db7b510-dabb-42ca-9891-c7af31aee59e/dqc_reference/reference_markers.fasta -out GCF_900446005.1_58116_B01_genomic.fna/blast.markers.tsv -outfmt 6 -max_hsps 1 -num_alignments 5
[2024-01-24 12:44:57,152] [INFO] Task succeeded: Blastn
[2024-01-24 12:44:57,156] [INFO] Selected 15 target genomes.
[2024-01-24 12:44:57,156] [INFO] Target genome list was writen to GCF_900446005.1_58116_B01_genomic.fna/target_genomes.txt
[2024-01-24 12:44:57,164] [INFO] Task started: fastANI
[2024-01-24 12:44:57,164] [INFO] Running command: fastANI --query /var/lib/cwl/stg09245fec-694f-4849-9c20-f22d622f87c9/GCF_900446005.1_58116_B01_genomic.fna.gz --refList GCF_900446005.1_58116_B01_genomic.fna/target_genomes.txt --output GCF_900446005.1_58116_B01_genomic.fna/fastani_result.tsv --threads 1
[2024-01-24 12:45:09,068] [INFO] Task succeeded: fastANI
[2024-01-24 12:45:09,068] [INFO] Loading species specific ANI threshold from /var/lib/cwl/stg6db7b510-dabb-42ca-9891-c7af31aee59e/dqc_reference/prokaryote_ANI_species_specific_threshold.txt
[2024-01-24 12:45:09,069] [WARNING] Species-specific ANI threshold file not found. Will use the default threshold for all species. [/var/lib/cwl/stg6db7b510-dabb-42ca-9891-c7af31aee59e/dqc_reference/prokaryote_ANI_species_specific_threshold.txt]
[2024-01-24 12:45:09,081] [WARNING] The ANI hits belong to more than one indistinguishable-group. The ANI hits will be classified as 'inconclusive,indistinguishable'. {}, {29459: 'Brucella melitensis', 236: 'Brucella ovis', 29460: 'Brucella neotomae', 29461: 'Brucella suis', 36855: 'Brucella canis'}
[2024-01-24 12:45:09,082] [INFO] Found 15 fastANI hits (10 hits with ANI > threshold)
[2024-01-24 12:45:09,082] [INFO] The taxonomy check result is classified as 'inconclusive'.
[2024-01-24 12:45:09,082] [INFO] DFAST Taxonomy check final result
--------------------------------------------------------------------------------
organism_name strain accession taxid species_taxid relation_to_type validated ani matched_fragments total_fragments ani_threshold status
Brucella abortus strain=NCTC10093 GCA_900446005.1 235 235 type True 100.0 1093 1093 95 inconclusive
Brucella abortus strain=544 GCA_000369945.1 235 235 type True 99.9861 1091 1093 95 inconclusive
Brucella melitensis strain=16M GCA_000007125.1 29459 29459 suspected-type True 99.7065 1078 1093 95 inconclusive
Brucella suis strain=NCTC10316 GCA_900460605.1 29461 29461 suspected-type True 99.6852 1078 1093 95 inconclusive
Brucella melitensis strain=16M GCA_000160295.1 29459 29459 suspected-type True 99.6776 1074 1093 95 inconclusive
Brucella suis strain=1330 GCA_000223195.1 29461 29461 suspected-type True 99.6769 1080 1093 95 inconclusive
Brucella suis strain=1330 GCA_000007505.1 29461 29461 suspected-type True 99.6766 1081 1093 95 inconclusive
Brucella melitensis strain=16M GCA_000250795.2 29459 29459 suspected-type True 99.6733 1040 1093 95 inconclusive
Brucella ceti strain=B1/94 GCA_000158775.1 120577 120577 suspected-type True 99.6384 1076 1093 95 inconclusive
Brucella ovis strain=ATCC 25840 GCA_000016845.1 236 236 type True 99.5124 1062 1093 95 inconclusive
Pseudochrobactrum algeriensis GCA_018436245.1 2834768 2834768 type True 77.8324 263 1093 95 below_threshold
Pseudochrobactrum algeriensis GCA_907164595.1 2834768 2834768 type True 77.7003 263 1093 95 below_threshold
Mesorhizobium onobrychidis strain=OM4 GCA_024707545.1 2775404 2775404 type True 77.3889 254 1093 95 below_threshold
Nitratireductor arenosus strain=CAU 1489 GCA_009742725.1 2682096 2682096 type True 77.0762 232 1093 95 below_threshold
Nitratireductor alexandrii strain=Z3-1 GCA_004000215.1 2448161 2448161 type True 77.0589 243 1093 95 below_threshold
--------------------------------------------------------------------------------
[2024-01-24 12:45:09,084] [INFO] DFAST Taxonomy check result was written to GCF_900446005.1_58116_B01_genomic.fna/tc_result.tsv
[2024-01-24 12:45:09,084] [INFO] ===== Taxonomy check completed =====
[2024-01-24 12:45:09,085] [INFO] ===== Start completeness check using CheckM =====
[2024-01-24 12:45:09,085] [INFO] Setting CHECKM_DATA_PATH to /var/lib/cwl/stg6db7b510-dabb-42ca-9891-c7af31aee59e/dqc_reference/checkm_data
[2024-01-24 12:45:09,086] [INFO] Selected 'Prokaryote' markers (life, taxid=0) for CheckM
[2024-01-24 12:45:09,120] [INFO] Task started: CheckM
[2024-01-24 12:45:09,120] [INFO] Running command: checkm taxonomy_wf --tab_table -f GCF_900446005.1_58116_B01_genomic.fna/cc_result.tsv -t 1 life "Prokaryote" GCF_900446005.1_58116_B01_genomic.fna/checkm_input GCF_900446005.1_58116_B01_genomic.fna/checkm_result
[2024-01-24 12:45:46,025] [INFO] Task succeeded: CheckM
[2024-01-24 12:45:46,027] [INFO] Completeness check finished.
--------------------------------------------------------------------------------
Completeness: 100.00%
Contamintation: 0.00%
Strain heterogeneity: 0.00%
--------------------------------------------------------------------------------
[2024-01-24 12:45:46,046] [INFO] ===== Completeness check finished =====
[2024-01-24 12:45:46,046] [INFO] ===== Start GTDB Search =====
[2024-01-24 12:45:46,046] [INFO] Query marker FASTA already exists. Will reuse it. (GCF_900446005.1_58116_B01_genomic.fna/markers.fasta)
[2024-01-24 12:45:46,047] [INFO] Task started: Blastn
[2024-01-24 12:45:46,047] [INFO] Running command: blastn -query GCF_900446005.1_58116_B01_genomic.fna/markers.fasta -db /var/lib/cwl/stg6db7b510-dabb-42ca-9891-c7af31aee59e/dqc_reference/reference_markers_gtdb.fasta -out GCF_900446005.1_58116_B01_genomic.fna/blast.markers.gtdb.tsv -outfmt 6 -max_hsps 1 -num_alignments 5
[2024-01-24 12:45:47,461] [INFO] Task succeeded: Blastn
[2024-01-24 12:45:47,465] [INFO] Selected 14 target genomes.
[2024-01-24 12:45:47,465] [INFO] Target genome list was writen to GCF_900446005.1_58116_B01_genomic.fna/target_genomes_gtdb.txt
[2024-01-24 12:45:47,492] [INFO] Task started: fastANI
[2024-01-24 12:45:47,492] [INFO] Running command: fastANI --query /var/lib/cwl/stg09245fec-694f-4849-9c20-f22d622f87c9/GCF_900446005.1_58116_B01_genomic.fna.gz --refList GCF_900446005.1_58116_B01_genomic.fna/target_genomes_gtdb.txt --output GCF_900446005.1_58116_B01_genomic.fna/fastani_result_gtdb.tsv --threads 1
[2024-01-24 12:45:59,980] [INFO] Task succeeded: fastANI
[2024-01-24 12:45:59,995] [INFO] Found 14 fastANI hits (1 hits with ANI > circumscription radius)
[2024-01-24 12:45:59,995] [INFO] GTDB search result
--------------------------------------------------------------------------------
accession gtdb_species ani matched_fragments total_fragments gtdb_taxonomy ani_circumscription_radius mean_intra_species_ani min_intra_species_ani mean_intra_species_af min_intra_species_af num_clustered_genomes status
GCF_000007125.1 s__Brucella melitensis 99.7065 1078 1093 d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhizobiales_A;f__Rhizobiaceae_A;g__Brucella 95.0 99.71 97.51 0.99 0.88 877 conclusive
GCF_000182645.1 s__Ochrobactrum intermedium 83.404 775 1093 d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhizobiales_A;f__Rhizobiaceae_A;g__Ochrobactrum 95.0 97.99 97.16 0.92 0.84 56 -
GCA_900473915.1 s__Ochrobactrum sp900473915 83.3307 772 1093 d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhizobiales_A;f__Rhizobiaceae_A;g__Ochrobactrum 95.0 N/A N/A N/A N/A 1 -
GCF_003550135.1 s__Ochrobactrum_B haematophila_B 83.3288 771 1093 d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhizobiales_A;f__Rhizobiaceae_A;g__Ochrobactrum_B 95.0 N/A N/A N/A N/A 1 -
GCF_008932435.1 s__Ochrobactrum pseudintermedium 83.2283 763 1093 d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhizobiales_A;f__Rhizobiaceae_A;g__Ochrobactrum 95.0 98.90 97.85 0.90 0.89 7 -
GCA_900470195.1 s__Ochrobactrum sp900470195 83.1346 761 1093 d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhizobiales_A;f__Rhizobiaceae_A;g__Ochrobactrum 95.0 99.22 97.63 0.95 0.85 7 -
GCF_014199265.1 s__Ochrobactrum_C daejeonensis 82.9996 761 1093 d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhizobiales_A;f__Rhizobiaceae_A;g__Ochrobactrum_C 95.0 99.74 99.74 0.99 0.99 2 -
GCF_000017405.1 s__Ochrobactrum anthropi 82.9966 770 1093 d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhizobiales_A;f__Rhizobiaceae_A;g__Ochrobactrum 95.518 98.11 96.94 0.89 0.83 55 -
GCF_008932295.1 s__Ochrobactrum tritici 82.8543 762 1093 d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhizobiales_A;f__Rhizobiaceae_A;g__Ochrobactrum 95.0 98.18 96.73 0.91 0.84 6 -
GCF_902825325.1 s__Ochrobactrum sp003176975 82.8312 769 1093 d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhizobiales_A;f__Rhizobiaceae_A;g__Ochrobactrum 95.0 97.64 97.64 0.88 0.88 2 -
GCF_005938105.1 s__Ochrobactrum_B haematophila 82.288 726 1093 d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhizobiales_A;f__Rhizobiaceae_A;g__Ochrobactrum_B 95.0 98.61 97.38 0.92 0.84 3 -
GCF_014253075.1 s__JACLZJ01 sp014253075 81.3965 727 1093 d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhizobiales_A;f__Rhizobiaceae_A;g__JACLZJ01 95.0 N/A N/A N/A N/A 1 -
GCF_019100495.1 s__Falsochrobactrum sp019100495 80.7623 618 1093 d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhizobiales_A;f__Rhizobiaceae_A;g__Falsochrobactrum 95.0 N/A N/A N/A N/A 1 -
GCA_900465915.1 s__63-22 sp900465915 80.6809 588 1093 d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhizobiales_A;f__Rhizobiaceae_A;g__63-22 95.0 97.55 97.55 0.92 0.92 2 -
--------------------------------------------------------------------------------
[2024-01-24 12:45:59,997] [INFO] GTDB search result was written to GCF_900446005.1_58116_B01_genomic.fna/result_gtdb.tsv
[2024-01-24 12:45:59,997] [INFO] ===== GTDB Search completed =====
[2024-01-24 12:46:00,001] [INFO] DFAST_QC result json was written to GCF_900446005.1_58116_B01_genomic.fna/dqc_result.json
[2024-01-24 12:46:00,001] [INFO] DFAST_QC completed!
[2024-01-24 12:46:00,001] [INFO] Total running time: 0h1m16s