[2023-06-27 09:21:33,150] [INFO] DFAST_QC pipeline started.
[2023-06-27 09:21:33,153] [INFO] DFAST_QC version: 0.5.7
[2023-06-27 09:21:33,153] [INFO] DQC Reference Directory: /var/lib/cwl/stga740ce00-180c-4b18-9ff0-a50ebd6ad20a/dqc_reference
[2023-06-27 09:21:34,345] [INFO] ===== Start taxonomy check using ANI =====
[2023-06-27 09:21:34,346] [INFO] Task started: Prodigal
[2023-06-27 09:21:34,346] [INFO] Running command: gunzip -c /var/lib/cwl/stg6970c943-51d1-444b-8e6f-8f0b043a834a/GCA_026647775.1_ASM2664777v1_genomic.fna.gz | prodigal -d GCA_026647775.1_ASM2664777v1_genomic.fna/cds.fna -a GCA_026647775.1_ASM2664777v1_genomic.fna/protein.faa -g 11 -q > /dev/null
[2023-06-27 09:21:53,060] [INFO] Task succeeded: Prodigal
[2023-06-27 09:21:53,061] [INFO] Task started: HMMsearch
[2023-06-27 09:21:53,061] [INFO] Running command: hmmsearch --tblout GCA_026647775.1_ASM2664777v1_genomic.fna/hmmer_result.tsv -E 1E-50 /var/lib/cwl/stga740ce00-180c-4b18-9ff0-a50ebd6ad20a/dqc_reference/reference_markers.hmm GCA_026647775.1_ASM2664777v1_genomic.fna/protein.faa > /dev/null
[2023-06-27 09:21:53,429] [INFO] Task succeeded: HMMsearch
[2023-06-27 09:21:53,431] [INFO] Found 6/6 markers.
[2023-06-27 09:21:53,515] [INFO] Query marker FASTA was written to GCA_026647775.1_ASM2664777v1_genomic.fna/markers.fasta
[2023-06-27 09:21:53,516] [INFO] Task started: Blastn
[2023-06-27 09:21:53,516] [INFO] Running command: blastn -query GCA_026647775.1_ASM2664777v1_genomic.fna/markers.fasta -db /var/lib/cwl/stga740ce00-180c-4b18-9ff0-a50ebd6ad20a/dqc_reference/reference_markers.fasta -out GCA_026647775.1_ASM2664777v1_genomic.fna/blast.markers.tsv -outfmt 6 -max_hsps 1 -num_alignments 5
[2023-06-27 09:21:54,317] [INFO] Task succeeded: Blastn
[2023-06-27 09:21:54,344] [INFO] Selected 27 target genomes.
[2023-06-27 09:21:54,344] [INFO] Target genome list was writen to GCA_026647775.1_ASM2664777v1_genomic.fna/target_genomes.txt
[2023-06-27 09:21:54,349] [INFO] Task started: fastANI
[2023-06-27 09:21:54,350] [INFO] Running command: fastANI --query /var/lib/cwl/stg6970c943-51d1-444b-8e6f-8f0b043a834a/GCA_026647775.1_ASM2664777v1_genomic.fna.gz --refList GCA_026647775.1_ASM2664777v1_genomic.fna/target_genomes.txt --output GCA_026647775.1_ASM2664777v1_genomic.fna/fastani_result.tsv --threads 1
[2023-06-27 09:22:26,415] [INFO] Task succeeded: fastANI
[2023-06-27 09:22:26,415] [INFO] Loading species specific ANI threshold from /var/lib/cwl/stga740ce00-180c-4b18-9ff0-a50ebd6ad20a/dqc_reference/prokaryote_ANI_species_specific_threshold.txt
[2023-06-27 09:22:26,416] [WARNING] Species-specific ANI threshold file not found. Will use the default threshold for all species. [/var/lib/cwl/stga740ce00-180c-4b18-9ff0-a50ebd6ad20a/dqc_reference/prokaryote_ANI_species_specific_threshold.txt]
[2023-06-27 09:22:26,436] [INFO] Found 26 fastANI hits (0 hits with ANI > threshold)
[2023-06-27 09:22:26,436] [INFO] The taxonomy check result is classified as 'below_threshold'.
[2023-06-27 09:22:26,436] [INFO] DFAST Taxonomy check final result
--------------------------------------------------------------------------------
organism_name	strain	accession	taxid	species_taxid	relation_to_type	validated	ani	matched_fragments	total_fragments	ani_threshold	status
Sandaracinus amylolyticus	strain=DSM 53668	GCA_000737325.2	927083	927083	type	True	76.7331	1269	2437	95	below_threshold
Polyangium spumosum	strain=DSM 14734	GCA_009649845.1	889282	889282	type	True	75.2883	830	2437	95	below_threshold
Labilithrix luteola	strain=DSM 27648	GCA_001263205.1	1391654	1391654	type	True	75.2251	505	2437	95	below_threshold
Polyangium aurulentum	strain=SDU3-1	GCA_005144635.2	2567896	2567896	type	True	75.2213	863	2437	95	below_threshold
Polyangium fumosum	strain=DSM 14668	GCA_005144585.1	889272	889272	neotype	True	75.2004	825	2437	95	below_threshold
Geobacter pickeringii	strain=G13	GCA_000817955.1	345632	345632	type	True	75.1671	60	2437	95	below_threshold
Chondromyces crocatus	strain=Cm c5	GCA_001189295.1	52	52	type	True	75.0607	460	2437	95	below_threshold
Thauera phenylacetica	strain=B4P	GCA_000310225.1	164400	164400	type	True	75.0164	334	2437	95	below_threshold
Malikia granosa	strain=P1	GCA_002980595.1	263067	263067	type	True	74.9062	153	2437	95	below_threshold
Fulvimonas soli	strain=LMG 19981	GCA_006352285.1	155197	155197	type	True	74.9023	333	2437	95	below_threshold
Fulvimonas soli	strain=DSM 14263	GCA_003148905.1	155197	155197	type	True	74.8995	360	2437	95	below_threshold
Conexibacter arvalis	strain=DSM 23288	GCA_014199525.1	912552	912552	type	True	74.8944	647	2437	95	below_threshold
Salinarimonas rosea	strain=DSM 21201	GCA_000429045.1	552063	552063	type	True	74.8839	533	2437	95	below_threshold
Rhizomicrobium electricum	strain=DSM 21034	GCA_011762045.1	480070	480070	type	True	74.8669	85	2437	95	below_threshold
Bauldia litoralis	strain=ATCC 35022	GCA_900104485.1	665467	665467	type	True	74.863	157	2437	95	below_threshold
Pseudomonas tohonis	strain=TUM18999	GCA_012767755.2	2725477	2725477	type	True	74.8598	164	2437	95	below_threshold
Aeromicrobium massiliense	strain=JC14	GCA_000312105.1	1464554	1464554	type	True	74.8396	292	2437	95	below_threshold
Pseudoxanthomonas sangjuensis	strain=DSM 28345	GCA_010211755.1	1503750	1503750	type	True	74.8329	197	2437	95	below_threshold
Pseudomonas carbonaria	strain=CIP 111764	GCA_904061905.1	2762745	2762745	type	True	74.8321	153	2437	95	below_threshold
Shinella pollutisoli	strain=KCTC 52677	GCA_024609765.1	2250594	2250594	type	True	74.781	295	2437	95	below_threshold
Nocardioides carbamazepini	strain=CBZ_1	GCA_024614185.1	2854259	2854259	type	True	74.7516	402	2437	95	below_threshold
Oceanithermus profundus	strain=DSM 14977	GCA_000183745.1	187137	187137	type	True	74.7468	164	2437	95	below_threshold
Pimelobacter simplex	strain=ATCC 6946	GCA_900114845.1	2045	2045	type	True	74.7184	451	2437	95	below_threshold
Pimelobacter simplex	strain=NBRC 12069	GCA_006538965.1	2045	2045	type	True	74.7153	458	2437	95	below_threshold
Luteimonas abyssi	strain=XH031	GCA_001482195.1	1247514	1247514	type	True	74.707	187	2437	95	below_threshold
Saccharothrix saharensis	strain=DSM 45456	GCA_006716745.1	571190	571190	type	True	74.6833	396	2437	95	below_threshold
--------------------------------------------------------------------------------
[2023-06-27 09:22:26,438] [INFO] DFAST Taxonomy check result was written to GCA_026647775.1_ASM2664777v1_genomic.fna/tc_result.tsv
[2023-06-27 09:22:26,439] [INFO] ===== Taxonomy check completed =====
[2023-06-27 09:22:26,439] [INFO] ===== Start completeness check using CheckM =====
[2023-06-27 09:22:26,439] [INFO] Setting CHECKM_DATA_PATH to /var/lib/cwl/stga740ce00-180c-4b18-9ff0-a50ebd6ad20a/dqc_reference/checkm_data
[2023-06-27 09:22:26,441] [INFO] Selected 'Prokaryote' markers (life, taxid=0) for CheckM
[2023-06-27 09:22:26,518] [INFO] Task started: CheckM
[2023-06-27 09:22:26,519] [INFO] Running command: checkm taxonomy_wf --tab_table -f GCA_026647775.1_ASM2664777v1_genomic.fna/cc_result.tsv -t 1 life "Prokaryote" GCA_026647775.1_ASM2664777v1_genomic.fna/checkm_input GCA_026647775.1_ASM2664777v1_genomic.fna/checkm_result
[2023-06-27 09:23:52,639] [INFO] Task succeeded: CheckM
[2023-06-27 09:23:52,640] [INFO] Completeness check finished.
--------------------------------------------------------------------------------
Completeness: 58.71%
Contamintation: 0.00%
Strain heterogeneity: 0.00%
--------------------------------------------------------------------------------
[2023-06-27 09:23:52,667] [INFO] ===== Completeness check finished =====
[2023-06-27 09:23:52,667] [INFO] ===== Start GTDB Search =====
[2023-06-27 09:23:52,668] [INFO] Query marker FASTA already exists. Will reuse it. (GCA_026647775.1_ASM2664777v1_genomic.fna/markers.fasta)
[2023-06-27 09:23:52,668] [INFO] Task started: Blastn
[2023-06-27 09:23:52,668] [INFO] Running command: blastn -query GCA_026647775.1_ASM2664777v1_genomic.fna/markers.fasta -db /var/lib/cwl/stga740ce00-180c-4b18-9ff0-a50ebd6ad20a/dqc_reference/reference_markers_gtdb.fasta -out GCA_026647775.1_ASM2664777v1_genomic.fna/blast.markers.gtdb.tsv -outfmt 6 -max_hsps 1 -num_alignments 5
[2023-06-27 09:23:53,945] [INFO] Task succeeded: Blastn
[2023-06-27 09:23:53,950] [INFO] Selected 19 target genomes.
[2023-06-27 09:23:53,950] [INFO] Target genome list was writen to GCA_026647775.1_ASM2664777v1_genomic.fna/target_genomes_gtdb.txt
[2023-06-27 09:23:53,958] [INFO] Task started: fastANI
[2023-06-27 09:23:53,959] [INFO] Running command: fastANI --query /var/lib/cwl/stg6970c943-51d1-444b-8e6f-8f0b043a834a/GCA_026647775.1_ASM2664777v1_genomic.fna.gz --refList GCA_026647775.1_ASM2664777v1_genomic.fna/target_genomes_gtdb.txt --output GCA_026647775.1_ASM2664777v1_genomic.fna/fastani_result_gtdb.tsv --threads 1
[2023-06-27 09:24:24,339] [INFO] Task succeeded: fastANI
[2023-06-27 09:24:24,361] [INFO] Found 19 fastANI hits (0 hits with ANI > circumscription radius)
[2023-06-27 09:24:24,361] [INFO] GTDB search result
--------------------------------------------------------------------------------
accession	gtdb_species	ani	matched_fragments	total_fragments	gtdb_taxonomy	ani_circumscription_radius	mean_intra_species_ani	min_intra_species_ani	mean_intra_species_af	min_intra_species_af	num_clustered_genomes	status
GCA_004213565.1	s__MED-G138 sp004213565	77.945	1121	2437	d__Bacteria;p__Myxococcota;c__Polyangia;o__Polyangiales;f__Sandaracinaceae;g__MED-G138	95.0	N/A	N/A	N/A	N/A	1	-
GCF_000737325.1	s__Sandaracinus amylolyticus	76.7685	1247	2437	d__Bacteria;p__Myxococcota;c__Polyangia;o__Polyangiales;f__Sandaracinaceae;g__Sandaracinus	95.0	N/A	N/A	N/A	N/A	1	-
GCA_002699025.1	s__GCA-2699025 sp002699025	76.2257	1022	2437	d__Bacteria;p__Myxococcota;c__Polyangia;o__Polyangiales;f__SG8-38;g__GCA-2699025	95.0	99.90	99.85	0.97	0.97	3	-
GCA_017303575.1	s__JAFLCL01 sp017303575	76.197	904	2437	d__Bacteria;p__Myxococcota;c__Polyangia;o__Polyangiales;f__Sandaracinaceae;g__JAFLCL01	95.0	N/A	N/A	N/A	N/A	1	-
GCA_009992505.1	s__JAADHO01 sp009992505	76.1927	303	2437	d__Bacteria;p__Myxococcota;c__Polyangia;o__Polyangiales;f__SG8-38;g__JAADHO01	95.0	N/A	N/A	N/A	N/A	1	-
GCA_903828805.1	s__Sandaracinus sp903828805	76.149	552	2437	d__Bacteria;p__Myxococcota;c__Polyangia;o__Polyangiales;f__Sandaracinaceae;g__Sandaracinus	95.0	N/A	N/A	N/A	N/A	1	-
GCA_017644045.1	s__UBA1660 sp017644045	76.0419	772	2437	d__Bacteria;p__Myxococcota;c__Polyangia;o__Polyangiales;f__SG8-38;g__UBA1660	95.0	N/A	N/A	N/A	N/A	1	-
GCA_012744445.1	s__JAAYBZ01 sp012744445	75.9586	517	2437	d__Bacteria;p__Myxococcota;c__Polyangia;o__Polyangiales;f__SG8-38;g__JAAYBZ01	95.0	N/A	N/A	N/A	N/A	1	-
GCA_002320815.1	s__UBA1660 sp002320815	75.9023	673	2437	d__Bacteria;p__Myxococcota;c__Polyangia;o__Polyangiales;f__SG8-38;g__UBA1660	95.0	99.96	99.96	0.96	0.96	2	-
GCA_013151775.1	s__JAADHO01 sp013151775	75.8233	467	2437	d__Bacteria;p__Myxococcota;c__Polyangia;o__Polyangiales;f__SG8-38;g__JAADHO01	95.0	N/A	N/A	N/A	N/A	1	-
GCA_016706685.1	s__JADJJE01 sp016706685	75.7731	557	2437	d__Bacteria;p__Myxococcota;c__Polyangia;o__Polyangiales;f__SG8-38;g__JADJJE01	95.0	99.15	99.06	0.93	0.90	7	-
GCA_003647065.1	s__B10-G4 sp003647065	75.7011	339	2437	d__Bacteria;p__Myxococcota;c__Polyangia;o__Polyangiales;f__SG8-38;g__B10-G4	95.0	N/A	N/A	N/A	N/A	1	-
GCA_016704445.1	s__SCUS01 sp016704445	75.3482	699	2437	d__Bacteria;p__Myxococcota;c__Polyangia;o__Polyangiales;f__Ga0077539;g__SCUS01	95.0	98.47	98.28	0.90	0.89	4	-
GCA_016212965.1	s__JACRDA01 sp016212965	75.2947	912	2437	d__Bacteria;p__Myxococcota;c__Polyangia;o__Polyangiales;f__Polyangiaceae;g__JACRDA01	95.0	N/A	N/A	N/A	N/A	1	-
GCF_005144635.1	s__Polyangium sp005144635	75.2289	857	2437	d__Bacteria;p__Myxococcota;c__Polyangia;o__Polyangiales;f__Polyangiaceae;g__Polyangium	95.0	N/A	N/A	N/A	N/A	1	-
GCA_016190375.1	s__JACPQR01 sp016190375	75.0861	558	2437	d__Bacteria;p__Myxococcota;c__Polyangia;o__JACPQR01;f__JACPQR01;g__JACPQR01	95.0	N/A	N/A	N/A	N/A	1	-
GCA_016792845.1	s__JAEUKH01 sp016792845	75.0423	466	2437	d__Bacteria;p__Myxococcota;c__Polyangia;o__Polyangiales;f__Ga0077539;g__JAEUKH01	95.0	N/A	N/A	N/A	N/A	1	-
GCA_016200545.1	s__DP-19 sp016200545	74.9063	349	2437	d__Bacteria;p__Desulfobacterota_B;c__Binatia;o__UTPRO1;f__UTPRO1;g__DP-19	95.0	N/A	N/A	N/A	N/A	1	-
GCA_007121785.1	s__SKTG01 sp007121785	74.6857	171	2437	d__Bacteria;p__Actinobacteriota;c__Actinomycetia;o__Nitriliruptorales;f__Nitriliruptoraceae;g__SKTG01	95.0	99.65	99.65	0.87	0.87	2	-
--------------------------------------------------------------------------------
[2023-06-27 09:24:24,368] [INFO] GTDB search result was written to GCA_026647775.1_ASM2664777v1_genomic.fna/result_gtdb.tsv
[2023-06-27 09:24:24,369] [INFO] ===== GTDB Search completed =====
[2023-06-27 09:24:24,376] [INFO] DFAST_QC result json was written to GCA_026647775.1_ASM2664777v1_genomic.fna/dqc_result.json
[2023-06-27 09:24:24,376] [INFO] DFAST_QC completed!
[2023-06-27 09:24:24,376] [INFO] Total running time: 0h2m51s