[2024-01-24 13:37:19,779] [INFO] DFAST_QC pipeline started.
[2024-01-24 13:37:19,780] [INFO] DFAST_QC version: 0.5.7
[2024-01-24 13:37:19,781] [INFO] DQC Reference Directory: /var/lib/cwl/stg4d5550bd-17fe-4f26-b660-afd58193a09f/dqc_reference
[2024-01-24 13:37:21,114] [INFO] ===== Start taxonomy check using ANI =====
[2024-01-24 13:37:21,115] [INFO] Task started: Prodigal
[2024-01-24 13:37:21,116] [INFO] Running command: gunzip -c /var/lib/cwl/stg9a578690-179a-4b1f-a917-1234350b357d/GCF_014763045.1_ASM1476304v1_genomic.fna.gz | prodigal -d GCF_014763045.1_ASM1476304v1_genomic.fna/cds.fna -a GCF_014763045.1_ASM1476304v1_genomic.fna/protein.faa -g 11 -q > /dev/null
[2024-01-24 13:37:36,178] [INFO] Task succeeded: Prodigal
[2024-01-24 13:37:36,178] [INFO] Task started: HMMsearch
[2024-01-24 13:37:36,178] [INFO] Running command: hmmsearch --tblout GCF_014763045.1_ASM1476304v1_genomic.fna/hmmer_result.tsv -E 1E-50 /var/lib/cwl/stg4d5550bd-17fe-4f26-b660-afd58193a09f/dqc_reference/reference_markers.hmm GCF_014763045.1_ASM1476304v1_genomic.fna/protein.faa > /dev/null
[2024-01-24 13:37:36,525] [INFO] Task succeeded: HMMsearch
[2024-01-24 13:37:36,526] [INFO] Found 6/6 markers.
[2024-01-24 13:37:36,571] [INFO] Query marker FASTA was written to GCF_014763045.1_ASM1476304v1_genomic.fna/markers.fasta
[2024-01-24 13:37:36,572] [INFO] Task started: Blastn
[2024-01-24 13:37:36,572] [INFO] Running command: blastn -query GCF_014763045.1_ASM1476304v1_genomic.fna/markers.fasta -db /var/lib/cwl/stg4d5550bd-17fe-4f26-b660-afd58193a09f/dqc_reference/reference_markers.fasta -out GCF_014763045.1_ASM1476304v1_genomic.fna/blast.markers.tsv -outfmt 6 -max_hsps 1 -num_alignments 5
[2024-01-24 13:37:37,346] [INFO] Task succeeded: Blastn
[2024-01-24 13:37:37,354] [INFO] Selected 24 target genomes.
[2024-01-24 13:37:37,354] [INFO] Target genome list was writen to GCF_014763045.1_ASM1476304v1_genomic.fna/target_genomes.txt
[2024-01-24 13:37:37,367] [INFO] Task started: fastANI
[2024-01-24 13:37:37,367] [INFO] Running command: fastANI --query /var/lib/cwl/stg9a578690-179a-4b1f-a917-1234350b357d/GCF_014763045.1_ASM1476304v1_genomic.fna.gz --refList GCF_014763045.1_ASM1476304v1_genomic.fna/target_genomes.txt --output GCF_014763045.1_ASM1476304v1_genomic.fna/fastani_result.tsv --threads 1
[2024-01-24 13:37:55,131] [INFO] Task succeeded: fastANI
[2024-01-24 13:37:55,131] [INFO] Loading species specific ANI threshold from /var/lib/cwl/stg4d5550bd-17fe-4f26-b660-afd58193a09f/dqc_reference/prokaryote_ANI_species_specific_threshold.txt
[2024-01-24 13:37:55,132] [WARNING] Species-specific ANI threshold file not found. Will use the default threshold for all species. [/var/lib/cwl/stg4d5550bd-17fe-4f26-b660-afd58193a09f/dqc_reference/prokaryote_ANI_species_specific_threshold.txt]
[2024-01-24 13:37:55,151] [INFO] Found 24 fastANI hits (1 hits with ANI > threshold)
[2024-01-24 13:37:55,152] [INFO] The taxonomy check result is classified as 'conclusive'.
[2024-01-24 13:37:55,152] [INFO] DFAST Taxonomy check final result
--------------------------------------------------------------------------------
organism_name	strain	accession	taxid	species_taxid	relation_to_type	validated	ani	matched_fragments	total_fragments	ani_threshold	status
Sulfitobacter aestuariivivens	strain=TSTF-M16	GCA_014763045.1	2766981	2766981	type	True	100.0	1391	1391	95	conclusive
Sulfitobacter maritimus	strain=S0837	GCA_013346665.1	2741719	2741719	type	True	78.0739	318	1391	95	below_threshold
Sulfitobacter sediminilitoris	strain=JBTF-M27	GCA_010667575.1	2698830	2698830	type	True	77.9449	456	1391	95	below_threshold
Sulfitobacter mediterraneus	strain=DSM 12244	GCA_003054325.1	83219	83219	type	True	77.879	457	1391	95	below_threshold
Sulfitobacter indolifex	strain=DSM 14862	GCA_022788655.1	225422	225422	type	True	77.8246	345	1391	95	below_threshold
Sulfitobacter geojensis	strain=MM-124	GCA_000622325.1	1342299	1342299	type	True	77.723	414	1391	95	below_threshold
Sulfitobacter geojensis	strain=DSM 101063	GCA_013407805.1	1342299	1342299	type	True	77.7102	414	1391	95	below_threshold
Sulfitobacter noctilucae	strain=NB-68	GCA_000622365.1	1342302	1342302	type	True	77.6772	405	1391	95	below_threshold
Pseudosulfitobacter pseudonitzschiae	strain=H3	GCA_000712315.1	1402135	1402135	type	True	77.6664	401	1391	95	below_threshold
Pseudosulfitobacter pseudonitzschiae	strain=DSM 26824	GCA_900129395.1	1402135	1402135	type	True	77.6646	404	1391	95	below_threshold
Sulfitobacter dubius	strain=DSM 16472	GCA_900113435.1	218673	218673	type	True	77.6538	344	1391	95	below_threshold
Sulfitobacter noctilucicola	strain=NB-77	GCA_000622385.1	1342301	1342301	type	True	77.4307	376	1391	95	below_threshold
Sulfitobacter noctilucicola	strain=DSM 101015	GCA_014197555.1	1342301	1342301	type	True	77.3836	375	1391	95	below_threshold
Tateyamaria omphalii	strain=KCTC 12333	GCA_014651375.1	299262	299262	type	True	77.1955	292	1391	95	below_threshold
Ruegeria pomeroyi	strain=DSS-3	GCA_000011965.2	89184	89184	suspected-type	True	77.1805	335	1391	95	below_threshold
Leisingera methylohalidivorans	strain=DSM 14336; MB2	GCA_000511355.1	133924	133924	type	True	77.1513	283	1391	95	below_threshold
Sulfitobacter litoralis	strain=DSM 17584	GCA_900103185.1	335975	335975	type	True	77.1116	298	1391	95	below_threshold
Antarcticimicrobium luteum	strain=318-1	GCA_004358185.1	2547397	2547397	type	True	77.0337	313	1391	95	below_threshold
Sulfitobacter marinus	strain=DSM 23422	GCA_900116285.1	394264	394264	type	True	76.9106	293	1391	95	below_threshold
Roseibacterium elongatum	strain=DFL-43	GCA_000590925.1	159346	159346	type	True	76.8761	187	1391	95	below_threshold
Zongyanglinia huanghaiensis	strain=CY05	GCA_009753675.1	2682100	2682100	type	True	76.8554	139	1391	95	below_threshold
Ruegeria marina	strain=CGMCC 1.9108	GCA_900101475.1	639004	639004	type	True	76.8173	280	1391	95	below_threshold
Pseudophaeobacter flagellatus	strain=MA21411-1	GCA_021228235.1	2899119	2899119	type	True	76.6931	203	1391	95	below_threshold
Salipiger pallidus	strain=CGMCC 1.15762	GCA_014643635.1	1775170	1775170	type	True	76.5431	189	1391	95	below_threshold
--------------------------------------------------------------------------------
[2024-01-24 13:37:55,154] [INFO] DFAST Taxonomy check result was written to GCF_014763045.1_ASM1476304v1_genomic.fna/tc_result.tsv
[2024-01-24 13:37:55,155] [INFO] ===== Taxonomy check completed =====
[2024-01-24 13:37:55,155] [INFO] ===== Start completeness check using CheckM =====
[2024-01-24 13:37:55,156] [INFO] Setting CHECKM_DATA_PATH to /var/lib/cwl/stg4d5550bd-17fe-4f26-b660-afd58193a09f/dqc_reference/checkm_data
[2024-01-24 13:37:55,158] [INFO] Selected 'Prokaryote' markers (life, taxid=0) for CheckM
[2024-01-24 13:37:55,208] [INFO] Task started: CheckM
[2024-01-24 13:37:55,208] [INFO] Running command: checkm taxonomy_wf --tab_table -f GCF_014763045.1_ASM1476304v1_genomic.fna/cc_result.tsv -t 1 life "Prokaryote" GCF_014763045.1_ASM1476304v1_genomic.fna/checkm_input GCF_014763045.1_ASM1476304v1_genomic.fna/checkm_result
[2024-01-24 13:38:39,583] [INFO] Task succeeded: CheckM
[2024-01-24 13:38:39,585] [INFO] Completeness check finished.
--------------------------------------------------------------------------------
Completeness: 100.00%
Contamintation: 0.00%
Strain heterogeneity: 0.00%
--------------------------------------------------------------------------------
[2024-01-24 13:38:39,604] [INFO] ===== Completeness check finished =====
[2024-01-24 13:38:39,605] [INFO] ===== Start GTDB Search =====
[2024-01-24 13:38:39,605] [INFO] Query marker FASTA already exists. Will reuse it. (GCF_014763045.1_ASM1476304v1_genomic.fna/markers.fasta)
[2024-01-24 13:38:39,606] [INFO] Task started: Blastn
[2024-01-24 13:38:39,606] [INFO] Running command: blastn -query GCF_014763045.1_ASM1476304v1_genomic.fna/markers.fasta -db /var/lib/cwl/stg4d5550bd-17fe-4f26-b660-afd58193a09f/dqc_reference/reference_markers_gtdb.fasta -out GCF_014763045.1_ASM1476304v1_genomic.fna/blast.markers.gtdb.tsv -outfmt 6 -max_hsps 1 -num_alignments 5
[2024-01-24 13:38:40,650] [INFO] Task succeeded: Blastn
[2024-01-24 13:38:40,654] [INFO] Selected 20 target genomes.
[2024-01-24 13:38:40,654] [INFO] Target genome list was writen to GCF_014763045.1_ASM1476304v1_genomic.fna/target_genomes_gtdb.txt
[2024-01-24 13:38:40,698] [INFO] Task started: fastANI
[2024-01-24 13:38:40,698] [INFO] Running command: fastANI --query /var/lib/cwl/stg9a578690-179a-4b1f-a917-1234350b357d/GCF_014763045.1_ASM1476304v1_genomic.fna.gz --refList GCF_014763045.1_ASM1476304v1_genomic.fna/target_genomes_gtdb.txt --output GCF_014763045.1_ASM1476304v1_genomic.fna/fastani_result_gtdb.tsv --threads 1
[2024-01-24 13:38:56,108] [INFO] Task succeeded: fastANI
[2024-01-24 13:38:56,127] [INFO] Found 20 fastANI hits (1 hits with ANI > circumscription radius)
[2024-01-24 13:38:56,127] [INFO] GTDB search result
--------------------------------------------------------------------------------
accession	gtdb_species	ani	matched_fragments	total_fragments	gtdb_taxonomy	ani_circumscription_radius	mean_intra_species_ani	min_intra_species_ani	mean_intra_species_af	min_intra_species_af	num_clustered_genomes	status
GCF_014763045.1	s__Sulfitobacter aestuariivivens	100.0	1391	1391	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhodobacterales;f__Rhodobacteraceae;g__Sulfitobacter	95.0	N/A	N/A	N/A	N/A	1	conclusive
GCF_003335585.1	s__Sulfitobacter sp003335585	78.3389	564	1391	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhodobacterales;f__Rhodobacteraceae;g__Sulfitobacter	95.0	N/A	N/A	N/A	N/A	1	-
GCF_001650875.1	s__Sulfitobacter sp001650875	78.2257	532	1391	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhodobacterales;f__Rhodobacteraceae;g__Sulfitobacter	95.0	N/A	N/A	N/A	N/A	1	-
GCF_009363555.1	s__Sulfitobacter sp009363555	78.146	495	1391	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhodobacterales;f__Rhodobacteraceae;g__Sulfitobacter	95.0	N/A	N/A	N/A	N/A	1	-
GCF_010667575.1	s__Sulfitobacter sediminilitoris	77.9286	455	1391	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhodobacterales;f__Rhodobacteraceae;g__Sulfitobacter	95.0	N/A	N/A	N/A	N/A	1	-
GCF_016801775.1	s__Sulfitobacter mediterraneus_A	77.8967	470	1391	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhodobacterales;f__Rhodobacteraceae;g__Sulfitobacter	95.0	100.00	99.99	1.00	1.00	16	-
GCF_001886735.1	s__Sulfitobacter alexandrii	77.8719	452	1391	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhodobacterales;f__Rhodobacteraceae;g__Sulfitobacter	95.0	N/A	N/A	N/A	N/A	1	-
GCF_003335545.1	s__Ascidiaceihabitans sp003335545	77.8565	377	1391	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhodobacterales;f__Rhodobacteraceae;g__Ascidiaceihabitans	95.0	N/A	N/A	N/A	N/A	1	-
GCF_003352065.1	s__Sulfitobacter sp003352065	77.833	387	1391	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhodobacterales;f__Rhodobacteraceae;g__Sulfitobacter	95.0	N/A	N/A	N/A	N/A	1	-
GCF_017743885.1	s__Sulfitobacter sp001635605	77.8173	395	1391	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhodobacterales;f__Rhodobacteraceae;g__Sulfitobacter	95.0	98.54	95.97	0.90	0.86	4	-
GCA_017854855.1	s__Sulfitobacter sp017854855	77.8084	474	1391	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhodobacterales;f__Rhodobacteraceae;g__Sulfitobacter	95.0	N/A	N/A	N/A	N/A	1	-
GCA_013002405.1	s__Sulfitobacter sp013002405	77.6977	386	1391	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhodobacterales;f__Rhodobacteraceae;g__Sulfitobacter	95.0	N/A	N/A	N/A	N/A	1	-
GCF_000622365.1	s__Sulfitobacter noctilucae	77.6693	406	1391	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhodobacterales;f__Rhodobacteraceae;g__Sulfitobacter	95.0	N/A	N/A	N/A	N/A	1	-
GCF_015644665.1	s__Roseobacter sp015644665	77.5741	387	1391	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhodobacterales;f__Rhodobacteraceae;g__Roseobacter	95.0	N/A	N/A	N/A	N/A	1	-
GCF_013282155.1	s__Sulfitobacter sp013282155	77.4994	353	1391	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhodobacterales;f__Rhodobacteraceae;g__Sulfitobacter	95.0	99.41	99.36	0.93	0.90	3	-
GCF_002222635.1	s__Ascidiaceihabitans pseudonitzschiae_A	77.3545	331	1391	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhodobacterales;f__Rhodobacteraceae;g__Ascidiaceihabitans	95.0	99.92	99.92	0.96	0.96	2	-
GCF_000511355.1	s__Leisingera methylohalidivorans	77.141	282	1391	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhodobacterales;f__Rhodobacteraceae;g__Leisingera	95.0	N/A	N/A	N/A	N/A	1	-
GCF_004358185.1	s__Antarcticimicrobium luteum	77.0337	313	1391	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhodobacterales;f__Rhodobacteraceae;g__Antarcticimicrobium	95.0	N/A	N/A	N/A	N/A	1	-
GCA_013139555.1	s__Sulfitobacter sp013139555	77.031	249	1391	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhodobacterales;f__Rhodobacteraceae;g__Sulfitobacter	95.0	N/A	N/A	N/A	N/A	1	-
GCF_900101475.1	s__Ruegeria_B marina	76.8173	280	1391	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhodobacterales;f__Rhodobacteraceae;g__Ruegeria_B	95.0	N/A	N/A	N/A	N/A	1	-
--------------------------------------------------------------------------------
[2024-01-24 13:38:56,130] [INFO] GTDB search result was written to GCF_014763045.1_ASM1476304v1_genomic.fna/result_gtdb.tsv
[2024-01-24 13:38:56,130] [INFO] ===== GTDB Search completed =====
[2024-01-24 13:38:56,138] [INFO] DFAST_QC result json was written to GCF_014763045.1_ASM1476304v1_genomic.fna/dqc_result.json
[2024-01-24 13:38:56,138] [INFO] DFAST_QC completed!
[2024-01-24 13:38:56,139] [INFO] Total running time: 0h1m36s