[2023-06-16 21:40:33,394] [INFO] DFAST_QC pipeline started.
[2023-06-16 21:40:33,397] [INFO] DFAST_QC version: 0.5.7
[2023-06-16 21:40:33,397] [INFO] DQC Reference Directory: /var/lib/cwl/stg8da8d5a7-a103-4bf5-bddc-216db8161a97/dqc_reference
[2023-06-16 21:40:34,695] [INFO] ===== Start taxonomy check using ANI =====
[2023-06-16 21:40:34,696] [INFO] Task started: Prodigal
[2023-06-16 21:40:34,696] [INFO] Running command: gunzip -c /var/lib/cwl/stg1f03fd7e-d205-4873-95dd-2183905a32f7/GCA_028067765.1_ASM2806776v1_genomic.fna.gz | prodigal -d GCA_028067765.1_ASM2806776v1_genomic.fna/cds.fna -a GCA_028067765.1_ASM2806776v1_genomic.fna/protein.faa -g 11 -q > /dev/null
[2023-06-16 21:40:46,720] [INFO] Task succeeded: Prodigal
[2023-06-16 21:40:46,720] [INFO] Task started: HMMsearch
[2023-06-16 21:40:46,720] [INFO] Running command: hmmsearch --tblout GCA_028067765.1_ASM2806776v1_genomic.fna/hmmer_result.tsv -E 1E-50 /var/lib/cwl/stg8da8d5a7-a103-4bf5-bddc-216db8161a97/dqc_reference/reference_markers.hmm GCA_028067765.1_ASM2806776v1_genomic.fna/protein.faa > /dev/null
[2023-06-16 21:40:47,025] [INFO] Task succeeded: HMMsearch
[2023-06-16 21:40:47,026] [INFO] Found 6/6 markers.
[2023-06-16 21:40:47,069] [INFO] Query marker FASTA was written to GCA_028067765.1_ASM2806776v1_genomic.fna/markers.fasta
[2023-06-16 21:40:47,069] [INFO] Task started: Blastn
[2023-06-16 21:40:47,070] [INFO] Running command: blastn -query GCA_028067765.1_ASM2806776v1_genomic.fna/markers.fasta -db /var/lib/cwl/stg8da8d5a7-a103-4bf5-bddc-216db8161a97/dqc_reference/reference_markers.fasta -out GCA_028067765.1_ASM2806776v1_genomic.fna/blast.markers.tsv -outfmt 6 -max_hsps 1 -num_alignments 5
[2023-06-16 21:40:47,766] [INFO] Task succeeded: Blastn
[2023-06-16 21:40:47,771] [INFO] Selected 28 target genomes.
[2023-06-16 21:40:47,771] [INFO] Target genome list was writen to GCA_028067765.1_ASM2806776v1_genomic.fna/target_genomes.txt
[2023-06-16 21:40:47,777] [INFO] Task started: fastANI
[2023-06-16 21:40:47,777] [INFO] Running command: fastANI --query /var/lib/cwl/stg1f03fd7e-d205-4873-95dd-2183905a32f7/GCA_028067765.1_ASM2806776v1_genomic.fna.gz --refList GCA_028067765.1_ASM2806776v1_genomic.fna/target_genomes.txt --output GCA_028067765.1_ASM2806776v1_genomic.fna/fastani_result.tsv --threads 1
[2023-06-16 21:41:11,371] [INFO] Task succeeded: fastANI
[2023-06-16 21:41:11,372] [INFO] Loading species specific ANI threshold from /var/lib/cwl/stg8da8d5a7-a103-4bf5-bddc-216db8161a97/dqc_reference/prokaryote_ANI_species_specific_threshold.txt
[2023-06-16 21:41:11,372] [WARNING] Species-specific ANI threshold file not found. Will use the default threshold for all species. [/var/lib/cwl/stg8da8d5a7-a103-4bf5-bddc-216db8161a97/dqc_reference/prokaryote_ANI_species_specific_threshold.txt]
[2023-06-16 21:41:11,400] [INFO] Found 23 fastANI hits (0 hits with ANI > threshold)
[2023-06-16 21:41:11,400] [INFO] The taxonomy check result is classified as 'below_threshold'.
[2023-06-16 21:41:11,400] [INFO] DFAST Taxonomy check final result
--------------------------------------------------------------------------------
organism_name	strain	accession	taxid	species_taxid	relation_to_type	validated	ani	matched_fragments	total_fragments	ani_threshold	status
Rhabdothermincola sediminis	strain=SYSU G02662	GCA_014805525.1	2751370	2751370	type	True	76.0052	93	1303	95	below_threshold
Actinomarinicola tropica	strain=SCSIO 58843	GCA_009650215.1	2789776	2789776	type	True	75.8446	175	1303	95	below_threshold
Rhabdothermincola salaria	strain=EGI L10124	GCA_021246445.1	2903142	2903142	type	True	75.82	154	1303	95	below_threshold
Desertimonas flava	strain=SYSU D60003	GCA_003426815.1	2064846	2064846	type	True	75.4293	189	1303	95	below_threshold
Janibacter anophelis	strain=NBRC 107843	GCA_001570945.1	319054	319054	type	True	75.2526	96	1303	95	below_threshold
Pedococcus cremeus	strain=CGMCC 1.6963	GCA_900111375.1	587636	587636	type	True	75.2265	118	1303	95	below_threshold
Barrientosiimonas humi	strain=type strain: 39	GCA_910573815.1	999931	999931	type	True	75.2008	124	1303	95	below_threshold
Barrientosiimonas humi	strain=DSM 24617	GCA_006716095.1	999931	999931	type	True	75.1945	125	1303	95	below_threshold
Actinomadura geliboluensis	strain=A8036	GCA_005889745.1	882440	882440	type	True	75.0722	187	1303	95	below_threshold
Actinomadura formosensis	strain=NBRC 14204	GCA_001552155.1	60706	60706	type	True	75.0396	142	1303	95	below_threshold
Rhodococcus corynebacterioides	strain=NBRC 14404	GCA_001894765.1	53972	53972	suspected-type	True	75.0336	74	1303	95	below_threshold
Actinomadura rifamycini	strain=DSM 43936	GCA_000425065.1	31962	31962	type	True	75.0246	198	1303	95	below_threshold
Catenulispora acidiphila	strain=DSM 44928	GCA_000024025.1	304895	304895	type	True	75.0227	140	1303	95	below_threshold
Goekera deserti	strain=CPCC 205119	GCA_010685995.1	2497753	2497753	type	True	75.013	134	1303	95	below_threshold
Saccharopolyspora gloriosae	strain=DSM 45582	GCA_014203325.1	455344	455344	type	True	74.988	111	1303	95	below_threshold
Conexibacter arvalis	strain=DSM 23288	GCA_014199525.1	912552	912552	type	True	74.9847	194	1303	95	below_threshold
Kibdelosporangium banguiense	strain=DSM 46670	GCA_017876405.1	1365924	1365924	type	True	74.9521	96	1303	95	below_threshold
Catenulispora pinistramenti	strain=NL8	GCA_018274385.1	2705254	2705254	type	True	74.8924	126	1303	95	below_threshold
Bifidobacterium italicum	strain=Rab10A	GCA_002286915.1	1960968	1960968	type	True	74.89	54	1303	95	below_threshold
Protaetiibacter larvae	strain=KACC 19322	GCA_008365275.1	2592654	2592654	type	True	74.8725	77	1303	95	below_threshold
Tenggerimyces flavus	strain=DSM 28944	GCA_016907715.1	1708749	1708749	type	True	74.8656	149	1303	95	below_threshold
Catenulispora pinisilvae	strain=NH11	GCA_015356865.1	2705253	2705253	type	True	74.848	121	1303	95	below_threshold
Catenulispora rubra	strain=DSM 44948	GCA_015356825.1	280293	280293	type	True	74.8374	162	1303	95	below_threshold
--------------------------------------------------------------------------------
[2023-06-16 21:41:11,402] [INFO] DFAST Taxonomy check result was written to GCA_028067765.1_ASM2806776v1_genomic.fna/tc_result.tsv
[2023-06-16 21:41:11,403] [INFO] ===== Taxonomy check completed =====
[2023-06-16 21:41:11,403] [INFO] ===== Start completeness check using CheckM =====
[2023-06-16 21:41:11,403] [INFO] Setting CHECKM_DATA_PATH to /var/lib/cwl/stg8da8d5a7-a103-4bf5-bddc-216db8161a97/dqc_reference/checkm_data
[2023-06-16 21:41:11,404] [INFO] Selected 'Prokaryote' markers (life, taxid=0) for CheckM
[2023-06-16 21:41:11,449] [INFO] Task started: CheckM
[2023-06-16 21:41:11,449] [INFO] Running command: checkm taxonomy_wf --tab_table -f GCA_028067765.1_ASM2806776v1_genomic.fna/cc_result.tsv -t 1 life "Prokaryote" GCA_028067765.1_ASM2806776v1_genomic.fna/checkm_input GCA_028067765.1_ASM2806776v1_genomic.fna/checkm_result
[2023-06-16 21:41:55,157] [INFO] Task succeeded: CheckM
[2023-06-16 21:41:55,158] [INFO] Completeness check finished.
--------------------------------------------------------------------------------
Completeness: 78.03%
Contamintation: 0.46%
Strain heterogeneity: 0.00%
--------------------------------------------------------------------------------
[2023-06-16 21:41:55,182] [INFO] ===== Completeness check finished =====
[2023-06-16 21:41:55,183] [INFO] ===== Start GTDB Search =====
[2023-06-16 21:41:55,183] [INFO] Query marker FASTA already exists. Will reuse it. (GCA_028067765.1_ASM2806776v1_genomic.fna/markers.fasta)
[2023-06-16 21:41:55,184] [INFO] Task started: Blastn
[2023-06-16 21:41:55,184] [INFO] Running command: blastn -query GCA_028067765.1_ASM2806776v1_genomic.fna/markers.fasta -db /var/lib/cwl/stg8da8d5a7-a103-4bf5-bddc-216db8161a97/dqc_reference/reference_markers_gtdb.fasta -out GCA_028067765.1_ASM2806776v1_genomic.fna/blast.markers.gtdb.tsv -outfmt 6 -max_hsps 1 -num_alignments 5
[2023-06-16 21:41:56,143] [INFO] Task succeeded: Blastn
[2023-06-16 21:41:56,147] [INFO] Selected 25 target genomes.
[2023-06-16 21:41:56,147] [INFO] Target genome list was writen to GCA_028067765.1_ASM2806776v1_genomic.fna/target_genomes_gtdb.txt
[2023-06-16 21:41:56,155] [INFO] Task started: fastANI
[2023-06-16 21:41:56,155] [INFO] Running command: fastANI --query /var/lib/cwl/stg1f03fd7e-d205-4873-95dd-2183905a32f7/GCA_028067765.1_ASM2806776v1_genomic.fna.gz --refList GCA_028067765.1_ASM2806776v1_genomic.fna/target_genomes_gtdb.txt --output GCA_028067765.1_ASM2806776v1_genomic.fna/fastani_result_gtdb.tsv --threads 1
[2023-06-16 21:42:09,588] [INFO] Task succeeded: fastANI
[2023-06-16 21:42:09,609] [INFO] Found 20 fastANI hits (0 hits with ANI > circumscription radius)
[2023-06-16 21:42:09,609] [INFO] GTDB search result
--------------------------------------------------------------------------------
accession	gtdb_species	ani	matched_fragments	total_fragments	gtdb_taxonomy	ani_circumscription_radius	mean_intra_species_ani	min_intra_species_ani	mean_intra_species_af	min_intra_species_af	num_clustered_genomes	status
GCA_003155135.1	s__Bog-515 sp003155135	77.3202	188	1303	d__Bacteria;p__Actinobacteriota;c__Acidimicrobiia;o__Acidimicrobiales;f__RAAP-2;g__Bog-515	95.0	99.89	99.76	0.97	0.95	17	-
GCA_003151235.1	s__Palsa-461 sp003151235	77.1277	164	1303	d__Bacteria;p__Actinobacteriota;c__Acidimicrobiia;o__Acidimicrobiales;f__RAAP-2;g__Palsa-461	95.0	N/A	N/A	N/A	N/A	1	-
GCA_015478655.1	s__Bog-756 sp015478655	77.0405	115	1303	d__Bacteria;p__Actinobacteriota;c__Acidimicrobiia;o__Acidimicrobiales;f__RAAP-2;g__Bog-756	95.0	N/A	N/A	N/A	N/A	1	-
GCA_003164095.1	s__Bog-756 sp003164095	76.9506	139	1303	d__Bacteria;p__Actinobacteriota;c__Acidimicrobiia;o__Acidimicrobiales;f__RAAP-2;g__Bog-756	95.0	N/A	N/A	N/A	N/A	1	-
GCA_003161575.1	s__PALSA-457 sp003161575	76.9122	147	1303	d__Bacteria;p__Actinobacteriota;c__Acidimicrobiia;o__Acidimicrobiales;f__RAAP-2;g__PALSA-457	95.0	N/A	N/A	N/A	N/A	1	-
GCA_003161255.1	s__Palsa-461 sp003161255	76.873	157	1303	d__Bacteria;p__Actinobacteriota;c__Acidimicrobiia;o__Acidimicrobiales;f__RAAP-2;g__Palsa-461	95.0	N/A	N/A	N/A	N/A	1	-
GCA_003138855.1	s__Bog-756 sp003138855	76.705	133	1303	d__Bacteria;p__Actinobacteriota;c__Acidimicrobiia;o__Acidimicrobiales;f__RAAP-2;g__Bog-756	95.0	99.82	99.61	0.95	0.93	14	-
GCA_017882995.1	s__Chersky-840 sp017882995	76.5898	120	1303	d__Bacteria;p__Actinobacteriota;c__Acidimicrobiia;o__Acidimicrobiales;f__UBA8190;g__Chersky-840	95.0	N/A	N/A	N/A	N/A	1	-
GCA_903894295.1	s__Bog-756 sp903894295	76.567	76	1303	d__Bacteria;p__Actinobacteriota;c__Acidimicrobiia;o__Acidimicrobiales;f__RAAP-2;g__Bog-756	95.0	99.67	99.55	0.93	0.92	5	-
GCA_003453695.1	s__UBA8190 sp003453695	76.5269	129	1303	d__Bacteria;p__Actinobacteriota;c__Acidimicrobiia;o__Acidimicrobiales;f__UBA8190;g__UBA8190	95.0	N/A	N/A	N/A	N/A	1	-
GCA_003169235.1	s__Bog-473 sp003169235	76.5145	147	1303	d__Bacteria;p__Actinobacteriota;c__Acidimicrobiia;o__Acidimicrobiales;f__RAAP-2;g__Bog-473	95.0	99.17	99.17	0.82	0.82	2	-
GCA_017883065.1	s__Bog-756 sp017883065	76.4356	123	1303	d__Bacteria;p__Actinobacteriota;c__Acidimicrobiia;o__Acidimicrobiales;f__RAAP-2;g__Bog-756	95.0	N/A	N/A	N/A	N/A	1	-
GCA_003151955.1	s__PALSA-743 sp003151955	76.3613	144	1303	d__Bacteria;p__Actinobacteriota;c__Acidimicrobiia;o__Acidimicrobiales;f__UBA8190;g__PALSA-743	95.0	N/A	N/A	N/A	N/A	1	-
GCA_017577565.1	s__ZC4RG19 sp017577565	76.1507	180	1303	d__Bacteria;p__Actinobacteriota;c__Acidimicrobiia;o__Acidimicrobiales;f__JACDCH01;g__ZC4RG19	95.0	N/A	N/A	N/A	N/A	1	-
GCA_003133665.1	s__RAAP-2 sp003133665	75.9736	68	1303	d__Bacteria;p__Actinobacteriota;c__Acidimicrobiia;o__Acidimicrobiales;f__RAAP-2;g__RAAP-2	95.0	N/A	N/A	N/A	N/A	1	-
GCA_005888295.1	s__AC-9 sp005888295	75.9357	151	1303	d__Bacteria;p__Actinobacteriota;c__Acidimicrobiia;o__Acidimicrobiales;f__QHCF01;g__AC-9	95.0	N/A	N/A	N/A	N/A	1	-
GCA_005884125.1	s__AC-35 sp005884125	75.9149	118	1303	d__Bacteria;p__Actinobacteriota;c__Acidimicrobiia;o__Acidimicrobiales;f__AC-35;g__AC-35	95.0	N/A	N/A	N/A	N/A	1	-
GCA_016700055.1	s__Kalu-18 sp016700055	75.6516	114	1303	d__Bacteria;p__Actinobacteriota;c__Acidimicrobiia;o__Acidimicrobiales;f__Ilumatobacteraceae;g__Kalu-18	95.0	N/A	N/A	N/A	N/A	1	-
GCA_016873275.1	s__VGXO01 sp016873275	74.7807	88	1303	d__Bacteria;p__Planctomycetota;c__Phycisphaerae;o__Phycisphaerales;f__SM1A02;g__VGXO01	95.0	N/A	N/A	N/A	N/A	1	-
GCA_016873255.1	s__UBA966 sp016873255	74.7337	60	1303	d__Bacteria;p__Planctomycetota;c__Phycisphaerae;o__Phycisphaerales;f__SM1A02;g__UBA966	95.0	N/A	N/A	N/A	N/A	1	-
--------------------------------------------------------------------------------
[2023-06-16 21:42:09,612] [INFO] GTDB search result was written to GCA_028067765.1_ASM2806776v1_genomic.fna/result_gtdb.tsv
[2023-06-16 21:42:09,613] [INFO] ===== GTDB Search completed =====
[2023-06-16 21:42:09,618] [INFO] DFAST_QC result json was written to GCA_028067765.1_ASM2806776v1_genomic.fna/dqc_result.json
[2023-06-16 21:42:09,618] [INFO] DFAST_QC completed!
[2023-06-16 21:42:09,618] [INFO] Total running time: 0h1m36s