[2023-06-18 11:44:54,622] [INFO] DFAST_QC pipeline started.
[2023-06-18 11:44:54,626] [INFO] DFAST_QC version: 0.5.7
[2023-06-18 11:44:54,626] [INFO] DQC Reference Directory: /var/lib/cwl/stga8b0fa1c-431f-45ac-9e83-950856ed0959/dqc_reference
[2023-06-18 11:44:57,947] [INFO] ===== Start taxonomy check using ANI =====
[2023-06-18 11:44:57,948] [INFO] Task started: Prodigal
[2023-06-18 11:44:57,949] [INFO] Running command: gunzip -c /var/lib/cwl/stg776b0fcc-e7d7-4273-b887-6e93e45a49ba/GCA_018268545.1_ASM1826854v1_genomic.fna.gz | prodigal -d GCA_018268545.1_ASM1826854v1_genomic.fna/cds.fna -a GCA_018268545.1_ASM1826854v1_genomic.fna/protein.faa -g 11 -q > /dev/null
[2023-06-18 11:45:04,812] [INFO] Task succeeded: Prodigal
[2023-06-18 11:45:04,813] [INFO] Task started: HMMsearch
[2023-06-18 11:45:04,813] [INFO] Running command: hmmsearch --tblout GCA_018268545.1_ASM1826854v1_genomic.fna/hmmer_result.tsv -E 1E-50 /var/lib/cwl/stga8b0fa1c-431f-45ac-9e83-950856ed0959/dqc_reference/reference_markers.hmm GCA_018268545.1_ASM1826854v1_genomic.fna/protein.faa > /dev/null
[2023-06-18 11:45:05,037] [INFO] Task succeeded: HMMsearch
[2023-06-18 11:45:05,039] [WARNING] Found 5/6 markers. [/var/lib/cwl/stg776b0fcc-e7d7-4273-b887-6e93e45a49ba/GCA_018268545.1_ASM1826854v1_genomic.fna.gz]
[2023-06-18 11:45:05,072] [INFO] Query marker FASTA was written to GCA_018268545.1_ASM1826854v1_genomic.fna/markers.fasta
[2023-06-18 11:45:05,072] [INFO] Task started: Blastn
[2023-06-18 11:45:05,072] [INFO] Running command: blastn -query GCA_018268545.1_ASM1826854v1_genomic.fna/markers.fasta -db /var/lib/cwl/stga8b0fa1c-431f-45ac-9e83-950856ed0959/dqc_reference/reference_markers.fasta -out GCA_018268545.1_ASM1826854v1_genomic.fna/blast.markers.tsv -outfmt 6 -max_hsps 1 -num_alignments 5
[2023-06-18 11:45:05,769] [INFO] Task succeeded: Blastn
[2023-06-18 11:45:05,774] [INFO] Selected 13 target genomes.
[2023-06-18 11:45:05,774] [INFO] Target genome list was writen to GCA_018268545.1_ASM1826854v1_genomic.fna/target_genomes.txt
[2023-06-18 11:45:05,778] [INFO] Task started: fastANI
[2023-06-18 11:45:05,779] [INFO] Running command: fastANI --query /var/lib/cwl/stg776b0fcc-e7d7-4273-b887-6e93e45a49ba/GCA_018268545.1_ASM1826854v1_genomic.fna.gz --refList GCA_018268545.1_ASM1826854v1_genomic.fna/target_genomes.txt --output GCA_018268545.1_ASM1826854v1_genomic.fna/fastani_result.tsv --threads 1
[2023-06-18 11:45:15,020] [INFO] Task succeeded: fastANI
[2023-06-18 11:45:15,021] [INFO] Loading species specific ANI threshold from /var/lib/cwl/stga8b0fa1c-431f-45ac-9e83-950856ed0959/dqc_reference/prokaryote_ANI_species_specific_threshold.txt
[2023-06-18 11:45:15,021] [WARNING] Species-specific ANI threshold file not found. Will use the default threshold for all species. [/var/lib/cwl/stga8b0fa1c-431f-45ac-9e83-950856ed0959/dqc_reference/prokaryote_ANI_species_specific_threshold.txt]
[2023-06-18 11:45:15,033] [INFO] Found 9 fastANI hits (0 hits with ANI > threshold)
[2023-06-18 11:45:15,033] [INFO] The taxonomy check result is classified as 'below_threshold'.
[2023-06-18 11:45:15,034] [INFO] DFAST Taxonomy check final result
--------------------------------------------------------------------------------
organism_name	strain	accession	taxid	species_taxid	relation_to_type	validated	ani	matched_fragments	total_fragments	ani_threshold	status
Rhabdothermincola sediminis	strain=SYSU G02662	GCA_014805525.1	2751370	2751370	type	True	76.5766	81	677	95	below_threshold
Rhabdothermincola salaria	strain=EGI L10124	GCA_021246445.1	2903142	2903142	type	True	76.47	104	677	95	below_threshold
Actinomarinicola tropica	strain=SCSIO 58843	GCA_009650215.1	2789776	2789776	type	True	76.3878	114	677	95	below_threshold
Ilumatobacter fluminis	strain=DSM 18936	GCA_004364865.1	467091	467091	type	True	76.0433	82	677	95	below_threshold
Desertimonas flava	strain=SYSU D60003	GCA_003426815.1	2064846	2064846	type	True	75.6785	111	677	95	below_threshold
Ilumatobacter coccineus	strain=YM16-304	GCA_000348785.1	467094	467094	type	True	75.4602	92	677	95	below_threshold
Amycolatopsis jiangsuensis	strain=DSM 45859	GCA_014204865.1	1181879	1181879	type	True	75.2849	71	677	95	below_threshold
Amycolatopsis endophytica	strain=DSM 104006	GCA_013410405.1	860233	860233	type	True	75.1492	81	677	95	below_threshold
Amycolatopsis jejuensis	strain=NRRL B-24427	GCA_000717335.1	330084	330084	type	True	75.1147	72	677	95	below_threshold
--------------------------------------------------------------------------------
[2023-06-18 11:45:15,035] [INFO] DFAST Taxonomy check result was written to GCA_018268545.1_ASM1826854v1_genomic.fna/tc_result.tsv
[2023-06-18 11:45:15,036] [INFO] ===== Taxonomy check completed =====
[2023-06-18 11:45:15,036] [INFO] ===== Start completeness check using CheckM =====
[2023-06-18 11:45:15,036] [INFO] Setting CHECKM_DATA_PATH to /var/lib/cwl/stga8b0fa1c-431f-45ac-9e83-950856ed0959/dqc_reference/checkm_data
[2023-06-18 11:45:15,037] [INFO] Selected 'Prokaryote' markers (life, taxid=0) for CheckM
[2023-06-18 11:45:15,068] [INFO] Task started: CheckM
[2023-06-18 11:45:15,068] [INFO] Running command: checkm taxonomy_wf --tab_table -f GCA_018268545.1_ASM1826854v1_genomic.fna/cc_result.tsv -t 1 life "Prokaryote" GCA_018268545.1_ASM1826854v1_genomic.fna/checkm_input GCA_018268545.1_ASM1826854v1_genomic.fna/checkm_result
[2023-06-18 11:45:53,791] [INFO] Task succeeded: CheckM
[2023-06-18 11:45:53,793] [INFO] Completeness check finished.
--------------------------------------------------------------------------------
Completeness: 82.29%
Contamintation: 0.46%
Strain heterogeneity: 0.00%
--------------------------------------------------------------------------------
[2023-06-18 11:45:53,810] [INFO] ===== Completeness check finished =====
[2023-06-18 11:45:53,810] [INFO] ===== Start GTDB Search =====
[2023-06-18 11:45:53,811] [INFO] Query marker FASTA already exists. Will reuse it. (GCA_018268545.1_ASM1826854v1_genomic.fna/markers.fasta)
[2023-06-18 11:45:53,811] [INFO] Task started: Blastn
[2023-06-18 11:45:53,811] [INFO] Running command: blastn -query GCA_018268545.1_ASM1826854v1_genomic.fna/markers.fasta -db /var/lib/cwl/stga8b0fa1c-431f-45ac-9e83-950856ed0959/dqc_reference/reference_markers_gtdb.fasta -out GCA_018268545.1_ASM1826854v1_genomic.fna/blast.markers.gtdb.tsv -outfmt 6 -max_hsps 1 -num_alignments 5
[2023-06-18 11:45:54,797] [INFO] Task succeeded: Blastn
[2023-06-18 11:45:54,801] [INFO] Selected 17 target genomes.
[2023-06-18 11:45:54,802] [INFO] Target genome list was writen to GCA_018268545.1_ASM1826854v1_genomic.fna/target_genomes_gtdb.txt
[2023-06-18 11:45:54,813] [INFO] Task started: fastANI
[2023-06-18 11:45:54,813] [INFO] Running command: fastANI --query /var/lib/cwl/stg776b0fcc-e7d7-4273-b887-6e93e45a49ba/GCA_018268545.1_ASM1826854v1_genomic.fna.gz --refList GCA_018268545.1_ASM1826854v1_genomic.fna/target_genomes_gtdb.txt --output GCA_018268545.1_ASM1826854v1_genomic.fna/fastani_result_gtdb.tsv --threads 1
[2023-06-18 11:46:03,603] [INFO] Task succeeded: fastANI
[2023-06-18 11:46:03,620] [INFO] Found 17 fastANI hits (1 hits with ANI > circumscription radius)
[2023-06-18 11:46:03,621] [INFO] GTDB search result
--------------------------------------------------------------------------------
accession	gtdb_species	ani	matched_fragments	total_fragments	gtdb_taxonomy	ani_circumscription_radius	mean_intra_species_ani	min_intra_species_ani	mean_intra_species_af	min_intra_species_af	num_clustered_genomes	status
GCA_018268545.1	s__JAFDWI01 sp018268545	100.0	669	677	d__Bacteria;p__Actinobacteriota;c__Acidimicrobiia;o__Acidimicrobiales;f__JAAYBP01;g__JAFDWI01	95.0	N/A	N/A	N/A	N/A	1	conclusive
GCA_017882935.1	s__JADGOP01 sp017882935	76.6714	107	677	d__Bacteria;p__Actinobacteriota;c__Acidimicrobiia;o__Acidimicrobiales;f__JAAYBP01;g__JADGOP01	95.0	N/A	N/A	N/A	N/A	1	-
GCA_004366205.1	s__SIRW01 sp004366205	76.4623	79	677	d__Bacteria;p__Actinobacteriota;c__Acidimicrobiia;o__Acidimicrobiales;f__SIRW01;g__SIRW01	95.0	N/A	N/A	N/A	N/A	1	-
GCA_018268735.1	s__AWTP1-35 sp018268735	76.3103	69	677	d__Bacteria;p__Actinobacteriota;c__Acidimicrobiia;o__Acidimicrobiales;f__Microtrichaceae;g__AWTP1-35	95.0	N/A	N/A	N/A	N/A	1	-
GCA_017577565.1	s__ZC4RG19 sp017577565	76.2847	116	677	d__Bacteria;p__Actinobacteriota;c__Acidimicrobiia;o__Acidimicrobiales;f__JACDCH01;g__ZC4RG19	95.0	N/A	N/A	N/A	N/A	1	-
GCA_013694495.1	s__JACDCH01 sp013694495	76.1971	80	677	d__Bacteria;p__Actinobacteriota;c__Acidimicrobiia;o__Acidimicrobiales;f__JACDCH01;g__JACDCH01	95.0	N/A	N/A	N/A	N/A	1	-
GCA_019247975.1	s__JAFBAD01 sp019247975	76.1164	78	677	d__Bacteria;p__Actinobacteriota;c__Acidimicrobiia;o__Acidimicrobiales;f__JAAYBP01;g__JAFBAD01	95.0	N/A	N/A	N/A	N/A	1	-
GCA_016185275.1	s__JACPNX01 sp016185275	76.1012	116	677	d__Bacteria;p__Actinobacteriota;c__Acidimicrobiia;o__Acidimicrobiales;f__JAAYBP01;g__JACPNX01	95.0	N/A	N/A	N/A	N/A	1	-
GCA_903868545.1	s__CAIPVR01 sp903868545	76.0866	74	677	d__Bacteria;p__Actinobacteriota;c__Acidimicrobiia;o__Acidimicrobiales;f__Microtrichaceae;g__CAIPVR01	95.0	99.51	99.50	0.87	0.86	3	-
GCA_012513855.1	s__JAAZBK01 sp012513855	76.0465	77	677	d__Bacteria;p__Actinobacteriota;c__Acidimicrobiia;o__Acidimicrobiales;f__JAAYBP01;g__JAAZBK01	95.0	N/A	N/A	N/A	N/A	1	-
GCA_016794585.1	s__JAEUJM01 sp016794585	76.0291	109	677	d__Bacteria;p__Actinobacteriota;c__Acidimicrobiia;o__Acidimicrobiales;f__JAEUJM01;g__JAEUJM01	95.0	N/A	N/A	N/A	N/A	1	-
GCA_012730075.1	s__JAAYBP01 sp012730075	76.0261	116	677	d__Bacteria;p__Actinobacteriota;c__Acidimicrobiia;o__Acidimicrobiales;f__JAAYBP01;g__JAAYBP01	95.0	N/A	N/A	N/A	N/A	1	-
GCA_003962875.1	s__AWTP1-35 sp003962875	75.9756	97	677	d__Bacteria;p__Actinobacteriota;c__Acidimicrobiia;o__Acidimicrobiales;f__Microtrichaceae;g__AWTP1-35	95.0	N/A	N/A	N/A	N/A	1	-
GCA_016781105.1	s__JADDRA01 sp016781105	75.8751	54	677	d__Bacteria;p__Actinobacteriota;c__Acidimicrobiia;o__Acidimicrobiales;f__CADCSY01;g__JADDRA01	95.0	99.25	99.25	0.80	0.80	2	-
GCA_016870245.1	s__CAIUKV01 sp016870245	75.841	87	677	d__Bacteria;p__Actinobacteriota;c__Acidimicrobiia;o__IMCC26256;f__PALSA-555;g__CAIUKV01	95.0	N/A	N/A	N/A	N/A	1	-
GCA_903899845.1	s__CAIUKV01 sp903899845	75.717	92	677	d__Bacteria;p__Actinobacteriota;c__Acidimicrobiia;o__IMCC26256;f__PALSA-555;g__CAIUKV01	95.0	99.48	99.41	0.90	0.89	4	-
GCA_903921205.1	s__CAIXPF01 sp903921205	75.3356	78	677	d__Bacteria;p__Actinobacteriota;c__Acidimicrobiia;o__Acidimicrobiales;f__CAIXPF01;g__CAIXPF01	95.0	N/A	N/A	N/A	N/A	1	-
--------------------------------------------------------------------------------
[2023-06-18 11:46:03,623] [INFO] GTDB search result was written to GCA_018268545.1_ASM1826854v1_genomic.fna/result_gtdb.tsv
[2023-06-18 11:46:03,623] [INFO] ===== GTDB Search completed =====
[2023-06-18 11:46:03,627] [INFO] DFAST_QC result json was written to GCA_018268545.1_ASM1826854v1_genomic.fna/dqc_result.json
[2023-06-18 11:46:03,627] [INFO] DFAST_QC completed!
[2023-06-18 11:46:03,628] [INFO] Total running time: 0h1m9s