[2023-06-08 00:55:00,490] [INFO] DFAST_QC pipeline started.
[2023-06-08 00:55:00,493] [INFO] DFAST_QC version: 0.5.7
[2023-06-08 00:55:00,493] [INFO] DQC Reference Directory: /var/lib/cwl/stge56f6bac-7429-4a0e-8695-10a9ef925162/dqc_reference
[2023-06-08 00:55:01,843] [INFO] ===== Start taxonomy check using ANI =====
[2023-06-08 00:55:01,845] [INFO] Task started: Prodigal
[2023-06-08 00:55:01,845] [INFO] Running command: gunzip -c /var/lib/cwl/stg336a968f-e074-4a17-9229-de2c2cd0fe87/GCA_902751295.1_P1994_104_bin73_mag_fasta_genomic.fna.gz | prodigal -d GCA_902751295.1_P1994_104_bin73_mag_fasta_genomic.fna/cds.fna -a GCA_902751295.1_P1994_104_bin73_mag_fasta_genomic.fna/protein.faa -g 11 -q > /dev/null
[2023-06-08 00:55:09,500] [INFO] Task succeeded: Prodigal
[2023-06-08 00:55:09,500] [INFO] Task started: HMMsearch
[2023-06-08 00:55:09,500] [INFO] Running command: hmmsearch --tblout GCA_902751295.1_P1994_104_bin73_mag_fasta_genomic.fna/hmmer_result.tsv -E 1E-50 /var/lib/cwl/stge56f6bac-7429-4a0e-8695-10a9ef925162/dqc_reference/reference_markers.hmm GCA_902751295.1_P1994_104_bin73_mag_fasta_genomic.fna/protein.faa > /dev/null
[2023-06-08 00:55:09,769] [INFO] Task succeeded: HMMsearch
[2023-06-08 00:55:09,771] [WARNING] Found 5/6 markers. [/var/lib/cwl/stg336a968f-e074-4a17-9229-de2c2cd0fe87/GCA_902751295.1_P1994_104_bin73_mag_fasta_genomic.fna.gz]
[2023-06-08 00:55:09,799] [INFO] Query marker FASTA was written to GCA_902751295.1_P1994_104_bin73_mag_fasta_genomic.fna/markers.fasta
[2023-06-08 00:55:09,800] [INFO] Task started: Blastn
[2023-06-08 00:55:09,800] [INFO] Running command: blastn -query GCA_902751295.1_P1994_104_bin73_mag_fasta_genomic.fna/markers.fasta -db /var/lib/cwl/stge56f6bac-7429-4a0e-8695-10a9ef925162/dqc_reference/reference_markers.fasta -out GCA_902751295.1_P1994_104_bin73_mag_fasta_genomic.fna/blast.markers.tsv -outfmt 6 -max_hsps 1 -num_alignments 5
[2023-06-08 00:55:10,491] [INFO] Task succeeded: Blastn
[2023-06-08 00:55:10,495] [INFO] Selected 23 target genomes.
[2023-06-08 00:55:10,495] [INFO] Target genome list was writen to GCA_902751295.1_P1994_104_bin73_mag_fasta_genomic.fna/target_genomes.txt
[2023-06-08 00:55:10,500] [INFO] Task started: fastANI
[2023-06-08 00:55:10,500] [INFO] Running command: fastANI --query /var/lib/cwl/stg336a968f-e074-4a17-9229-de2c2cd0fe87/GCA_902751295.1_P1994_104_bin73_mag_fasta_genomic.fna.gz --refList GCA_902751295.1_P1994_104_bin73_mag_fasta_genomic.fna/target_genomes.txt --output GCA_902751295.1_P1994_104_bin73_mag_fasta_genomic.fna/fastani_result.tsv --threads 1
[2023-06-08 00:55:25,090] [INFO] Task succeeded: fastANI
[2023-06-08 00:55:25,091] [INFO] Loading species specific ANI threshold from /var/lib/cwl/stge56f6bac-7429-4a0e-8695-10a9ef925162/dqc_reference/prokaryote_ANI_species_specific_threshold.txt
[2023-06-08 00:55:25,092] [WARNING] Species-specific ANI threshold file not found. Will use the default threshold for all species. [/var/lib/cwl/stge56f6bac-7429-4a0e-8695-10a9ef925162/dqc_reference/prokaryote_ANI_species_specific_threshold.txt]
[2023-06-08 00:55:25,118] [INFO] Found 23 fastANI hits (0 hits with ANI > threshold)
[2023-06-08 00:55:25,118] [INFO] The taxonomy check result is classified as 'below_threshold'.
[2023-06-08 00:55:25,119] [INFO] DFAST Taxonomy check final result
--------------------------------------------------------------------------------
organism_name	strain	accession	taxid	species_taxid	relation_to_type	validated	ani	matched_fragments	total_fragments	ani_threshold	status
Loktanella ponticola	strain=DSM 101064	GCA_014199395.1	1524255	1524255	type	True	77.4715	221	777	95	below_threshold
Yoonia maritima	strain=DSM 101533	GCA_003003285.1	1435347	1435347	type	True	77.3692	172	777	95	below_threshold
Yoonia rosea	strain=DSM 29591	GCA_900156505.1	287098	287098	type	True	77.2418	201	777	95	below_threshold
Yoonia litorea	strain=DSM 29433	GCA_900114675.1	1123755	1123755	type	True	77.0916	142	777	95	below_threshold
Yoonia maricola	strain=DSM 29128	GCA_002797915.1	420999	420999	type	True	77.0128	184	777	95	below_threshold
Sulfitobacter pontiacus	strain=DSM 10014	GCA_900106935.1	60137	60137	type	True	76.8932	122	777	95	below_threshold
Loktanella salsilacus	strain=DSM 16199	GCA_900114485.1	195913	195913	type	True	76.7981	214	777	95	below_threshold
Cognatiyoonia koreensis	strain=DSM 17925	GCA_900109295.1	364200	364200	type	True	76.6528	143	777	95	below_threshold
Yoonia vestfoldensis	strain=DSM 16212	GCA_000382265.1	245188	245188	type	True	76.6481	176	777	95	below_threshold
Sulfitobacter indolifex	strain=DSM 14862	GCA_022788655.1	225422	225422	type	True	76.5585	125	777	95	below_threshold
Sulfitobacter sabulilitoris	strain=HSMS-29	GCA_005887615.1	2562655	2562655	type	True	76.4262	98	777	95	below_threshold
Pseudosulfitobacter pseudonitzschiae	strain=DSM 26824	GCA_900129395.1	1402135	1402135	type	True	76.3709	137	777	95	below_threshold
Flavimaricola marinus	strain=CECT 8899	GCA_900184895.1	1819565	1819565	type	True	76.3363	126	777	95	below_threshold
Pseudosulfitobacter pseudonitzschiae	strain=H3	GCA_000712315.1	1402135	1402135	type	True	76.3233	140	777	95	below_threshold
Limimaricola soesokkakensis	strain=DSM 29956	GCA_003014435.1	1343159	1343159	type	True	76.2758	80	777	95	below_threshold
Loktanella atrilutea	strain=DSM 29326	GCA_900128995.1	366533	366533	type	True	76.2622	126	777	95	below_threshold
Marivivens aquimaris	strain=GSB7	GCA_015220045.1	2774876	2774876	type	True	76.2264	121	777	95	below_threshold
Leisingera caerulea	strain=DSM 24564	GCA_000473325.1	506591	506591	type	True	76.182	97	777	95	below_threshold
Leisingera methylohalidivorans	strain=DSM 14336; MB2	GCA_000511355.1	133924	133924	type	True	76.1742	78	777	95	below_threshold
Roseibacterium elongatum	strain=DFL-43	GCA_000590925.1	159346	159346	type	True	75.8752	65	777	95	below_threshold
Rhodobacter amnigenus	strain=HSP-20	GCA_019130055.1	2852097	2852097	type	True	75.8345	75	777	95	below_threshold
Rhodobacter amnigenus	strain=HSP-20	GCA_009908265.2	2852097	2852097	type	True	75.8345	75	777	95	below_threshold
Cereibacter ovatus	strain=JA234	GCA_900207575.1	439529	439529	type	True	75.7329	55	777	95	below_threshold
--------------------------------------------------------------------------------
[2023-06-08 00:55:25,121] [INFO] DFAST Taxonomy check result was written to GCA_902751295.1_P1994_104_bin73_mag_fasta_genomic.fna/tc_result.tsv
[2023-06-08 00:55:25,122] [INFO] ===== Taxonomy check completed =====
[2023-06-08 00:55:25,122] [INFO] ===== Start completeness check using CheckM =====
[2023-06-08 00:55:25,122] [INFO] Setting CHECKM_DATA_PATH to /var/lib/cwl/stge56f6bac-7429-4a0e-8695-10a9ef925162/dqc_reference/checkm_data
[2023-06-08 00:55:25,124] [INFO] Selected 'Prokaryote' markers (life, taxid=0) for CheckM
[2023-06-08 00:55:25,157] [INFO] Task started: CheckM
[2023-06-08 00:55:25,158] [INFO] Running command: checkm taxonomy_wf --tab_table -f GCA_902751295.1_P1994_104_bin73_mag_fasta_genomic.fna/cc_result.tsv -t 1 life "Prokaryote" GCA_902751295.1_P1994_104_bin73_mag_fasta_genomic.fna/checkm_input GCA_902751295.1_P1994_104_bin73_mag_fasta_genomic.fna/checkm_result
[2023-06-08 00:55:53,682] [INFO] Task succeeded: CheckM
[2023-06-08 00:55:53,684] [INFO] Completeness check finished.
--------------------------------------------------------------------------------
Completeness: 86.36%
Contamintation: 4.69%
Strain heterogeneity: 100.00%
--------------------------------------------------------------------------------
[2023-06-08 00:55:53,708] [INFO] ===== Completeness check finished =====
[2023-06-08 00:55:53,709] [INFO] ===== Start GTDB Search =====
[2023-06-08 00:55:53,709] [INFO] Query marker FASTA already exists. Will reuse it. (GCA_902751295.1_P1994_104_bin73_mag_fasta_genomic.fna/markers.fasta)
[2023-06-08 00:55:53,709] [INFO] Task started: Blastn
[2023-06-08 00:55:53,710] [INFO] Running command: blastn -query GCA_902751295.1_P1994_104_bin73_mag_fasta_genomic.fna/markers.fasta -db /var/lib/cwl/stge56f6bac-7429-4a0e-8695-10a9ef925162/dqc_reference/reference_markers_gtdb.fasta -out GCA_902751295.1_P1994_104_bin73_mag_fasta_genomic.fna/blast.markers.gtdb.tsv -outfmt 6 -max_hsps 1 -num_alignments 5
[2023-06-08 00:55:54,760] [INFO] Task succeeded: Blastn
[2023-06-08 00:55:54,764] [INFO] Selected 17 target genomes.
[2023-06-08 00:55:54,764] [INFO] Target genome list was writen to GCA_902751295.1_P1994_104_bin73_mag_fasta_genomic.fna/target_genomes_gtdb.txt
[2023-06-08 00:55:55,058] [INFO] Task started: fastANI
[2023-06-08 00:55:55,059] [INFO] Running command: fastANI --query /var/lib/cwl/stg336a968f-e074-4a17-9229-de2c2cd0fe87/GCA_902751295.1_P1994_104_bin73_mag_fasta_genomic.fna.gz --refList GCA_902751295.1_P1994_104_bin73_mag_fasta_genomic.fna/target_genomes_gtdb.txt --output GCA_902751295.1_P1994_104_bin73_mag_fasta_genomic.fna/fastani_result_gtdb.tsv --threads 1
[2023-06-08 00:56:05,049] [INFO] Task succeeded: fastANI
[2023-06-08 00:56:05,066] [INFO] Found 17 fastANI hits (1 hits with ANI > circumscription radius)
[2023-06-08 00:56:05,067] [INFO] GTDB search result
--------------------------------------------------------------------------------
accession	gtdb_species	ani	matched_fragments	total_fragments	gtdb_taxonomy	ani_circumscription_radius	mean_intra_species_ani	min_intra_species_ani	mean_intra_species_af	min_intra_species_af	num_clustered_genomes	status
GCA_002378665.1	s__Yoonia sp002378665	99.1936	699	777	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhodobacterales;f__Rhodobacteraceae;g__Yoonia	95.0	99.59	99.15	0.91	0.87	4	conclusive
GCA_905182645.1	s__Yoonia sp905182645	78.6266	279	777	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhodobacterales;f__Rhodobacteraceae;g__Yoonia	95.0	N/A	N/A	N/A	N/A	1	-
GCF_014199395.1	s__Yoonia ponticola	77.4715	221	777	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhodobacterales;f__Rhodobacteraceae;g__Yoonia	95.0	N/A	N/A	N/A	N/A	1	-
GCF_000967725.1	s__Yoonia sp000967725	77.4128	145	777	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhodobacterales;f__Rhodobacteraceae;g__Yoonia	95.0	N/A	N/A	N/A	N/A	1	-
GCF_003003285.1	s__Yoonia maritima	77.3692	172	777	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhodobacterales;f__Rhodobacteraceae;g__Yoonia	95.0	N/A	N/A	N/A	N/A	1	-
GCF_003201935.1	s__Yoonia sp003201935	77.2667	209	777	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhodobacterales;f__Rhodobacteraceae;g__Yoonia	95.0	N/A	N/A	N/A	N/A	1	-
GCF_900156505.1	s__Yoonia rosea	77.2418	201	777	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhodobacterales;f__Rhodobacteraceae;g__Yoonia	95.0	96.70	96.70	0.95	0.95	2	-
GCF_001419985.1	s__Yoonia sp001419985	77.1313	211	777	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhodobacterales;f__Rhodobacteraceae;g__Yoonia	95.0	N/A	N/A	N/A	N/A	1	-
GCF_002933395.1	s__Yoonia maritima_A	77.1101	167	777	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhodobacterales;f__Rhodobacteraceae;g__Yoonia	95.0	N/A	N/A	N/A	N/A	1	-
GCA_017854565.1	s__Yoonia sp017854565	77.095	189	777	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhodobacterales;f__Rhodobacteraceae;g__Yoonia	95.0	N/A	N/A	N/A	N/A	1	-
GCF_000169435.1	s__Yoonia sp000169435	77.0646	191	777	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhodobacterales;f__Rhodobacteraceae;g__Yoonia	95.0	N/A	N/A	N/A	N/A	1	-
GCF_002797915.1	s__Yoonia maricola	77.0263	183	777	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhodobacterales;f__Rhodobacteraceae;g__Yoonia	95.0	N/A	N/A	N/A	N/A	1	-
GCF_900143545.1	s__Yoonia sp900143545	76.9602	206	777	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhodobacterales;f__Rhodobacteraceae;g__Yoonia	95.0	N/A	N/A	N/A	N/A	1	-
GCA_014859355.1	s__Yoonia sp014859355	76.8816	155	777	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhodobacterales;f__Rhodobacteraceae;g__Yoonia	95.0	N/A	N/A	N/A	N/A	1	-
GCF_900114485.1	s__Loktanella salsilacus	76.7973	214	777	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhodobacterales;f__Rhodobacteraceae;g__Loktanella	95.0	97.76	97.75	0.91	0.90	3	-
GCA_001650895.1	s__EhC02 sp001650895	76.191	94	777	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhodobacterales;f__Rhodobacteraceae;g__EhC02	95.0	N/A	N/A	N/A	N/A	1	-
GCF_000153305.1	s__Oceanicola granulosus	75.6949	67	777	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhodobacterales;f__Rhodobacteraceae;g__Oceanicola	95.0	N/A	N/A	N/A	N/A	1	-
--------------------------------------------------------------------------------
[2023-06-08 00:56:05,069] [INFO] GTDB search result was written to GCA_902751295.1_P1994_104_bin73_mag_fasta_genomic.fna/result_gtdb.tsv
[2023-06-08 00:56:05,070] [INFO] ===== GTDB Search completed =====
[2023-06-08 00:56:05,077] [INFO] DFAST_QC result json was written to GCA_902751295.1_P1994_104_bin73_mag_fasta_genomic.fna/dqc_result.json
[2023-06-08 00:56:05,078] [INFO] DFAST_QC completed!
[2023-06-08 00:56:05,078] [INFO] Total running time: 0h1m5s