[2024-01-24 11:52:04,662] [INFO] DFAST_QC pipeline started.
[2024-01-24 11:52:04,664] [INFO] DFAST_QC version: 0.5.7
[2024-01-24 11:52:04,664] [INFO] DQC Reference Directory: /var/lib/cwl/stg692835f1-b6b5-4e54-8747-d1cf0a2774ac/dqc_reference
[2024-01-24 11:52:05,896] [INFO] ===== Start taxonomy check using ANI =====
[2024-01-24 11:52:05,897] [INFO] Task started: Prodigal
[2024-01-24 11:52:05,898] [INFO] Running command: gunzip -c /var/lib/cwl/stgea274bc9-7599-494d-b5ae-9360df5b76aa/GCF_014197105.1_ASM1419710v1_genomic.fna.gz | prodigal -d GCF_014197105.1_ASM1419710v1_genomic.fna/cds.fna -a GCF_014197105.1_ASM1419710v1_genomic.fna/protein.faa -g 11 -q > /dev/null
[2024-01-24 11:52:16,361] [INFO] Task succeeded: Prodigal
[2024-01-24 11:52:16,361] [INFO] Task started: HMMsearch
[2024-01-24 11:52:16,362] [INFO] Running command: hmmsearch --tblout GCF_014197105.1_ASM1419710v1_genomic.fna/hmmer_result.tsv -E 1E-50 /var/lib/cwl/stg692835f1-b6b5-4e54-8747-d1cf0a2774ac/dqc_reference/reference_markers.hmm GCF_014197105.1_ASM1419710v1_genomic.fna/protein.faa > /dev/null
[2024-01-24 11:52:16,617] [INFO] Task succeeded: HMMsearch
[2024-01-24 11:52:16,618] [INFO] Found 6/6 markers.
[2024-01-24 11:52:16,651] [INFO] Query marker FASTA was written to GCF_014197105.1_ASM1419710v1_genomic.fna/markers.fasta
[2024-01-24 11:52:16,651] [INFO] Task started: Blastn
[2024-01-24 11:52:16,651] [INFO] Running command: blastn -query GCF_014197105.1_ASM1419710v1_genomic.fna/markers.fasta -db /var/lib/cwl/stg692835f1-b6b5-4e54-8747-d1cf0a2774ac/dqc_reference/reference_markers.fasta -out GCF_014197105.1_ASM1419710v1_genomic.fna/blast.markers.tsv -outfmt 6 -max_hsps 1 -num_alignments 5
[2024-01-24 11:52:17,696] [INFO] Task succeeded: Blastn
[2024-01-24 11:52:17,700] [INFO] Selected 21 target genomes.
[2024-01-24 11:52:17,700] [INFO] Target genome list was writen to GCF_014197105.1_ASM1419710v1_genomic.fna/target_genomes.txt
[2024-01-24 11:52:17,708] [INFO] Task started: fastANI
[2024-01-24 11:52:17,708] [INFO] Running command: fastANI --query /var/lib/cwl/stgea274bc9-7599-494d-b5ae-9360df5b76aa/GCF_014197105.1_ASM1419710v1_genomic.fna.gz --refList GCF_014197105.1_ASM1419710v1_genomic.fna/target_genomes.txt --output GCF_014197105.1_ASM1419710v1_genomic.fna/fastani_result.tsv --threads 1
[2024-01-24 11:52:34,867] [INFO] Task succeeded: fastANI
[2024-01-24 11:52:34,868] [INFO] Loading species specific ANI threshold from /var/lib/cwl/stg692835f1-b6b5-4e54-8747-d1cf0a2774ac/dqc_reference/prokaryote_ANI_species_specific_threshold.txt
[2024-01-24 11:52:34,869] [WARNING] Species-specific ANI threshold file not found. Will use the default threshold for all species. [/var/lib/cwl/stg692835f1-b6b5-4e54-8747-d1cf0a2774ac/dqc_reference/prokaryote_ANI_species_specific_threshold.txt]
[2024-01-24 11:52:34,888] [INFO] Found 21 fastANI hits (1 hits with ANI > threshold)
[2024-01-24 11:52:34,888] [INFO] The taxonomy check result is classified as 'conclusive'.
[2024-01-24 11:52:34,888] [INFO] DFAST Taxonomy check final result
--------------------------------------------------------------------------------
organism_name	strain	accession	taxid	species_taxid	relation_to_type	validated	ani	matched_fragments	total_fragments	ani_threshold	status
Sphingomonas jinjuensis	strain=YC6723	GCA_014197105.1	535907	535907	type	True	100.0	1270	1270	95	conclusive
Sphingomonas taxi	strain=ATCC 55669	GCA_000764535.1	1549858	1549858	type	True	81.6423	743	1270	95	below_threshold
Sphingomonas aquatilis	strain=NBRC 16722	GCA_007990915.1	93063	93063	type	True	81.5061	637	1270	95	below_threshold
Sphingomonas metalli	strain=CGMCC 1.15330	GCA_014641735.1	1779358	1779358	type	True	81.5006	655	1270	95	below_threshold
Sphingomonas aquatilis	strain=DSM 15581	GCA_014196115.1	93063	93063	type	True	81.4972	689	1270	95	below_threshold
Sphingomonas melonis	strain=DAPP-PG 224	GCA_000379045.1	152682	152682	type	True	81.434	696	1270	95	below_threshold
Sphingomonas insulae	strain=KCTC 12872	GCA_010450875.1	424800	424800	type	True	81.3648	672	1270	95	below_threshold
Sphingomonas insulae	strain=DSM 21792	GCA_011762035.1	424800	424800	type	True	81.2761	677	1270	95	below_threshold
Sphingomonas adhaesiva	strain=NBRC 15099	GCA_001592345.1	28212	28212	type	True	81.0888	536	1270	95	below_threshold
Sphingomonas adhaesiva	strain=DSM 7418	GCA_002374855.1	28212	28212	type	True	80.9898	641	1270	95	below_threshold
Sphingomonas ginsenosidimutans	strain=KACC 14949	GCA_002374835.1	862134	862134	type	True	80.7556	644	1270	95	below_threshold
Sphingomonas endophytica	strain=DSM 101535	GCA_014199415.1	869719	869719	type	True	80.7541	619	1270	95	below_threshold
Sphingomonas folli	strain=RHCKR7	GCA_019429525.1	2862497	2862497	type	True	80.7484	679	1270	95	below_threshold
Sphingomonas citri	strain=RRHST34	GCA_019429485.1	2862499	2862499	type	True	80.6842	668	1270	95	below_threshold
Sphingomonas yunnanensis	strain=YIM 3	GCA_019898765.1	310400	310400	type	True	80.508	657	1270	95	below_threshold
Sphingomonas corticis	strain=36D10-4-7	GCA_012035195.1	2722791	2722791	type	True	80.5054	650	1270	95	below_threshold
Sphingomonas phyllosphaerae	strain=FA2	GCA_000427645.1	257003	257003	type	True	80.4812	627	1270	95	below_threshold
Sphingomonas pseudosanguinis	strain=DSM 19512	GCA_014196255.1	413712	413712	type	True	80.4792	649	1270	95	below_threshold
Sphingomonas gellani	strain=S6-262	GCA_900110035.1	1166340	1166340	type	True	80.226	547	1270	95	below_threshold
Sphingomonas aracearum	strain=WZY 27	GCA_003345355.1	2283317	2283317	type	True	79.9615	555	1270	95	below_threshold
Sphingomonas radiodurans	strain=S9-5	GCA_020866845.1	2890321	2890321	type	True	79.5479	532	1270	95	below_threshold
--------------------------------------------------------------------------------
[2024-01-24 11:52:34,890] [INFO] DFAST Taxonomy check result was written to GCF_014197105.1_ASM1419710v1_genomic.fna/tc_result.tsv
[2024-01-24 11:52:34,890] [INFO] ===== Taxonomy check completed =====
[2024-01-24 11:52:34,891] [INFO] ===== Start completeness check using CheckM =====
[2024-01-24 11:52:34,891] [INFO] Setting CHECKM_DATA_PATH to /var/lib/cwl/stg692835f1-b6b5-4e54-8747-d1cf0a2774ac/dqc_reference/checkm_data
[2024-01-24 11:52:34,892] [INFO] Selected 'Prokaryote' markers (life, taxid=0) for CheckM
[2024-01-24 11:52:34,934] [INFO] Task started: CheckM
[2024-01-24 11:52:34,934] [INFO] Running command: checkm taxonomy_wf --tab_table -f GCF_014197105.1_ASM1419710v1_genomic.fna/cc_result.tsv -t 1 life "Prokaryote" GCF_014197105.1_ASM1419710v1_genomic.fna/checkm_input GCF_014197105.1_ASM1419710v1_genomic.fna/checkm_result
[2024-01-24 11:53:08,102] [INFO] Task succeeded: CheckM
[2024-01-24 11:53:08,103] [INFO] Completeness check finished.
--------------------------------------------------------------------------------
Completeness: 100.00%
Contamintation: 0.00%
Strain heterogeneity: 0.00%
--------------------------------------------------------------------------------
[2024-01-24 11:53:08,129] [INFO] ===== Completeness check finished =====
[2024-01-24 11:53:08,129] [INFO] ===== Start GTDB Search =====
[2024-01-24 11:53:08,130] [INFO] Query marker FASTA already exists. Will reuse it. (GCF_014197105.1_ASM1419710v1_genomic.fna/markers.fasta)
[2024-01-24 11:53:08,130] [INFO] Task started: Blastn
[2024-01-24 11:53:08,130] [INFO] Running command: blastn -query GCF_014197105.1_ASM1419710v1_genomic.fna/markers.fasta -db /var/lib/cwl/stg692835f1-b6b5-4e54-8747-d1cf0a2774ac/dqc_reference/reference_markers_gtdb.fasta -out GCF_014197105.1_ASM1419710v1_genomic.fna/blast.markers.gtdb.tsv -outfmt 6 -max_hsps 1 -num_alignments 5
[2024-01-24 11:53:10,125] [INFO] Task succeeded: Blastn
[2024-01-24 11:53:10,129] [INFO] Selected 22 target genomes.
[2024-01-24 11:53:10,130] [INFO] Target genome list was writen to GCF_014197105.1_ASM1419710v1_genomic.fna/target_genomes_gtdb.txt
[2024-01-24 11:53:10,152] [INFO] Task started: fastANI
[2024-01-24 11:53:10,152] [INFO] Running command: fastANI --query /var/lib/cwl/stgea274bc9-7599-494d-b5ae-9360df5b76aa/GCF_014197105.1_ASM1419710v1_genomic.fna.gz --refList GCF_014197105.1_ASM1419710v1_genomic.fna/target_genomes_gtdb.txt --output GCF_014197105.1_ASM1419710v1_genomic.fna/fastani_result_gtdb.tsv --threads 1
[2024-01-24 11:53:28,553] [INFO] Task succeeded: fastANI
[2024-01-24 11:53:28,583] [INFO] Found 22 fastANI hits (1 hits with ANI > circumscription radius)
[2024-01-24 11:53:28,583] [INFO] GTDB search result
--------------------------------------------------------------------------------
accession	gtdb_species	ani	matched_fragments	total_fragments	gtdb_taxonomy	ani_circumscription_radius	mean_intra_species_ani	min_intra_species_ani	mean_intra_species_af	min_intra_species_af	num_clustered_genomes	status
GCF_014197105.1	s__Sphingomonas jinjuensis	100.0	1270	1270	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Sphingomonadales;f__Sphingomonadaceae;g__Sphingomonas	95.0	N/A	N/A	N/A	N/A	1	conclusive
GCF_013409985.1	s__Sphingomonas melonis_A	81.8562	711	1270	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Sphingomonadales;f__Sphingomonadaceae;g__Sphingomonas	95.0	N/A	N/A	N/A	N/A	1	-
GCA_002292295.1	s__Sphingomonas sp002292295	81.798	716	1270	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Sphingomonadales;f__Sphingomonadaceae;g__Sphingomonas	95.0	N/A	N/A	N/A	N/A	1	-
GCF_000764535.1	s__Sphingomonas taxi	81.5923	749	1270	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Sphingomonadales;f__Sphingomonadaceae;g__Sphingomonas	95.0	97.55	97.55	0.88	0.88	2	-
GCF_000632225.1	s__Sphingomonas sp000632225	81.5506	712	1270	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Sphingomonadales;f__Sphingomonadaceae;g__Sphingomonas	95.0	N/A	N/A	N/A	N/A	1	-
GCF_016107325.1	s__Sphingomonas sp016107325	81.5379	732	1270	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Sphingomonadales;f__Sphingomonadaceae;g__Sphingomonas	95.0	N/A	N/A	N/A	N/A	1	-
GCF_014196115.1	s__Sphingomonas aquatilis	81.5153	687	1270	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Sphingomonadales;f__Sphingomonadaceae;g__Sphingomonas	95.0	96.36	95.86	0.87	0.82	11	-
GCF_003355005.1	s__Sphingomonas sp003355005	81.5049	692	1270	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Sphingomonadales;f__Sphingomonadaceae;g__Sphingomonas	95.0	N/A	N/A	N/A	N/A	1	-
GCF_014641735.1	s__Sphingomonas metalli	81.4694	658	1270	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Sphingomonadales;f__Sphingomonadaceae;g__Sphingomonas	95.0	N/A	N/A	N/A	N/A	1	-
GCF_010450875.1	s__Sphingomonas insulae	81.3256	676	1270	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Sphingomonadales;f__Sphingomonadaceae;g__Sphingomonas	95.0	99.99	99.99	0.99	0.99	2	-
GCA_003075315.1	s__Sphingomonas sp003075315	81.318	693	1270	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Sphingomonadales;f__Sphingomonadaceae;g__Sphingomonas	95.0	98.84	96.53	0.96	0.89	4	-
GCF_016820445.1	s__Sphingomonas sp016820445	81.0777	666	1270	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Sphingomonadales;f__Sphingomonadaceae;g__Sphingomonas	95.0	97.21	97.21	0.90	0.90	2	-
GCF_012035195.1	s__Sphingomonas sp012035195	80.5736	643	1270	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Sphingomonadales;f__Sphingomonadaceae;g__Sphingomonas	95.0	N/A	N/A	N/A	N/A	1	-
GCF_001476895.1	s__Sphingomonas endophytica_A	80.5047	587	1270	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Sphingomonadales;f__Sphingomonadaceae;g__Sphingomonas	95.0	N/A	N/A	N/A	N/A	1	-
GCF_014196255.1	s__Sphingomonas pseudosanguinis	80.476	649	1270	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Sphingomonadales;f__Sphingomonadaceae;g__Sphingomonas	95.0	99.99	99.99	1.00	1.00	2	-
GCF_013344685.1	s__Sphingomonas sp013344685	80.4656	574	1270	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Sphingomonadales;f__Sphingomonadaceae;g__Sphingomonas	95.0	N/A	N/A	N/A	N/A	1	-
GCF_006438955.1	s__Sphingomonas oligophenolica	80.4589	577	1270	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Sphingomonadales;f__Sphingomonadaceae;g__Sphingomonas	95.0	N/A	N/A	N/A	N/A	1	-
GCA_001897375.1	s__Sphingomonas sp001897375	80.3294	570	1270	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Sphingomonadales;f__Sphingomonadaceae;g__Sphingomonas	95.0	99.01	98.04	0.97	0.96	3	-
GCF_902498785.1	s__Sphingomonas sp902498785	80.3112	587	1270	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Sphingomonadales;f__Sphingomonadaceae;g__Sphingomonas	95.0	N/A	N/A	N/A	N/A	1	-
GCF_900110035.1	s__Sphingomonas gellani	80.2751	542	1270	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Sphingomonadales;f__Sphingomonadaceae;g__Sphingomonas	95.0	N/A	N/A	N/A	N/A	1	-
GCF_003345355.1	s__Sphingomonas sp003345355	79.9573	556	1270	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Sphingomonadales;f__Sphingomonadaceae;g__Sphingomonas	95.0	N/A	N/A	N/A	N/A	1	-
GCF_000980895.1	s__Sphingomonas olei	79.9149	523	1270	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Sphingomonadales;f__Sphingomonadaceae;g__Sphingomonas	95.0	97.98	97.88	0.87	0.83	3	-
--------------------------------------------------------------------------------
[2024-01-24 11:53:28,585] [INFO] GTDB search result was written to GCF_014197105.1_ASM1419710v1_genomic.fna/result_gtdb.tsv
[2024-01-24 11:53:28,586] [INFO] ===== GTDB Search completed =====
[2024-01-24 11:53:28,591] [INFO] DFAST_QC result json was written to GCF_014197105.1_ASM1419710v1_genomic.fna/dqc_result.json
[2024-01-24 11:53:28,591] [INFO] DFAST_QC completed!
[2024-01-24 11:53:28,591] [INFO] Total running time: 0h1m24s
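
Note on reading the timings: the log reports only the total running time, but the per-task durations can be recovered from the paired "Task started" / "Task succeeded" entries. The following is a minimal Python sketch of one way to do that. It is not part of DFAST_QC. The function names `parse_entries` and `task_durations` are hypothetical, and the regular expression assumes the `[timestamp] [LEVEL] message` layout seen in this log.

```python
import re
from datetime import datetime

# Matches the prefix of entries such as
# "[2024-01-24 11:52:05,897] [INFO] Task started: Prodigal".
# Layout assumed from this log, not taken from DFAST_QC documentation.
ENTRY_RE = re.compile(
    r"\[(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3})\] \[([A-Z]+)\] "
)


def parse_entries(text):
    """Return (timestamp, level, message) tuples.

    Works whether entries sit one per line or several share a line,
    because each message runs up to the next timestamp prefix.
    """
    matches = list(ENTRY_RE.finditer(text))
    entries = []
    for i, m in enumerate(matches):
        end = matches[i + 1].start() if i + 1 < len(matches) else len(text)
        stamp = datetime.strptime(m.group(1), "%Y-%m-%d %H:%M:%S,%f")
        entries.append((stamp, m.group(2), text[m.end():end].strip()))
    return entries


def task_durations(entries):
    """Pair each 'Task started: X' with the next 'Task succeeded: X'.

    Returns a list of (task_name, seconds) in completion order.
    """
    started = {}
    durations = []
    for stamp, _level, message in entries:
        if message.startswith("Task started: "):
            started[message[len("Task started: "):]] = stamp
        elif message.startswith("Task succeeded: "):
            name = message[len("Task succeeded: "):]
            if name in started:
                elapsed = (stamp - started.pop(name)).total_seconds()
                durations.append((name, elapsed))
    return durations


if __name__ == "__main__":
    # Three entries copied from the Prodigal step of this log
    # (the command line is shortened here for readability).
    sample = (
        "[2024-01-24 11:52:05,897] [INFO] Task started: Prodigal "
        "[2024-01-24 11:52:05,898] [INFO] Running command: gunzip -c input.fna.gz "
        "[2024-01-24 11:52:16,361] [INFO] Task succeeded: Prodigal"
    )
    for name, seconds in task_durations(parse_entries(sample)):
        print(f"{name}: {seconds:.3f} s")
```

Because tasks are keyed by name, a task that runs twice (Blastn and fastANI each run once for the ANI check and once for the GTDB search) yields two separate entries in the result.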