[2023-06-27 14:02:59,224] [INFO] DFAST_QC pipeline started.
[2023-06-27 14:02:59,233] [INFO] DFAST_QC version: 0.5.7
[2023-06-27 14:02:59,233] [INFO] DQC Reference Directory: /var/lib/cwl/stg3062b4f5-abac-46c0-bf64-b1adad28ea74/dqc_reference
[2023-06-27 14:03:01,074] [INFO] ===== Start taxonomy check using ANI =====
[2023-06-27 14:03:01,075] [INFO] Task started: Prodigal
[2023-06-27 14:03:01,076] [INFO] Running command: gunzip -c /var/lib/cwl/stg0f8767d2-7bf7-45eb-85d7-48c22756feae/GCA_026708005.1_ASM2670800v1_genomic.fna.gz | prodigal -d GCA_026708005.1_ASM2670800v1_genomic.fna/cds.fna -a GCA_026708005.1_ASM2670800v1_genomic.fna/protein.faa -g 11 -q > /dev/null
[2023-06-27 14:03:10,474] [INFO] Task succeeded: Prodigal
[2023-06-27 14:03:10,475] [INFO] Task started: HMMsearch
[2023-06-27 14:03:10,475] [INFO] Running command: hmmsearch --tblout GCA_026708005.1_ASM2670800v1_genomic.fna/hmmer_result.tsv -E 1E-50 /var/lib/cwl/stg3062b4f5-abac-46c0-bf64-b1adad28ea74/dqc_reference/reference_markers.hmm GCA_026708005.1_ASM2670800v1_genomic.fna/protein.faa > /dev/null
[2023-06-27 14:03:10,660] [INFO] Task succeeded: HMMsearch
[2023-06-27 14:03:10,662] [INFO] Found 6/6 markers.
[2023-06-27 14:03:10,692] [INFO] Query marker FASTA was written to GCA_026708005.1_ASM2670800v1_genomic.fna/markers.fasta
[2023-06-27 14:03:10,692] [INFO] Task started: Blastn
[2023-06-27 14:03:10,692] [INFO] Running command: blastn -query GCA_026708005.1_ASM2670800v1_genomic.fna/markers.fasta -db /var/lib/cwl/stg3062b4f5-abac-46c0-bf64-b1adad28ea74/dqc_reference/reference_markers.fasta -out GCA_026708005.1_ASM2670800v1_genomic.fna/blast.markers.tsv -outfmt 6 -max_hsps 1 -num_alignments 5
[2023-06-27 14:03:11,320] [INFO] Task succeeded: Blastn
[2023-06-27 14:03:11,323] [INFO] Selected 19 target genomes.
[2023-06-27 14:03:11,324] [INFO] Target genome list was writen to GCA_026708005.1_ASM2670800v1_genomic.fna/target_genomes.txt
[2023-06-27 14:03:11,330] [INFO] Task started: fastANI
[2023-06-27 14:03:11,330] [INFO] Running command: fastANI --query /var/lib/cwl/stg0f8767d2-7bf7-45eb-85d7-48c22756feae/GCA_026708005.1_ASM2670800v1_genomic.fna.gz --refList GCA_026708005.1_ASM2670800v1_genomic.fna/target_genomes.txt --output GCA_026708005.1_ASM2670800v1_genomic.fna/fastani_result.tsv --threads 1
[2023-06-27 14:03:26,871] [INFO] Task succeeded: fastANI
[2023-06-27 14:03:26,871] [INFO] Loading species specific ANI threshold from /var/lib/cwl/stg3062b4f5-abac-46c0-bf64-b1adad28ea74/dqc_reference/prokaryote_ANI_species_specific_threshold.txt
[2023-06-27 14:03:26,871] [WARNING] Species-specific ANI threshold file not found. Will use the default threshold for all species. [/var/lib/cwl/stg3062b4f5-abac-46c0-bf64-b1adad28ea74/dqc_reference/prokaryote_ANI_species_specific_threshold.txt]
[2023-06-27 14:03:26,884] [INFO] Found 15 fastANI hits (0 hits with ANI > threshold)
[2023-06-27 14:03:26,884] [INFO] The taxonomy check result is classified as 'below_threshold'.
[2023-06-27 14:03:26,884] [INFO] DFAST Taxonomy check final result
--------------------------------------------------------------------------------
organism_name	strain	accession	taxid	species_taxid	relation_to_type	validated	ani	matched_fragments	total_fragments	ani_threshold	status
Gaiella occulta	strain=F2-233	GCA_003351045.1	1002870	1002870	type	True	76.3105	159	1146	95	below_threshold
Miltoncostaea oceani	strain=SCSIO 61214	GCA_018141545.1	2843216	2843216	type	True	75.648	108	1146	95	below_threshold
Capillimicrobium parvum	strain=0166_1	GCA_021172045.1	2884022	2884022	type	True	75.5929	170	1146	95	below_threshold
Patulibacter minatonensis	strain=DSM 18081	GCA_000519325.1	298163	298163	type	True	75.5705	95	1146	95	below_threshold
Solirubrobacter pauli	strain=DSM 14954	GCA_003633755.1	166793	166793	type	True	75.4006	150	1146	95	below_threshold
Conexibacter woesei	strain=DSM 14684	GCA_000025265.1	191495	191495	type	True	75.3826	155	1146	95	below_threshold
Paraconexibacter algicola	strain=Seoho-28	GCA_003044185.1	2133960	2133960	type	True	75.2808	122	1146	95	below_threshold
Micromonospora parathelypteridis	strain=DSM 103125	GCA_014201145.1	1839617	1839617	type	True	75.051	63	1146	95	below_threshold
Actinomycetospora corticicola	strain=DSM 45772	GCA_013409505.1	663602	663602	type	True	75.0468	108	1146	95	below_threshold
Actinomycetospora soli	strain=SF1	GCA_021026295.1	2893887	2893887	type	True	75.032	122	1146	95	below_threshold
Acrocarpospora pleiomorpha	strain=NBRC 16267	GCA_009687885.1	90975	90975	type	True	74.9913	74	1146	95	below_threshold
Acrocarpospora macrocephala	strain=NBRC 16266	GCA_009687865.1	150177	150177	type	True	74.9429	89	1146	95	below_threshold
Micromonospora parathelypteridis	strain=CGMCC 4.7347	GCA_014646315.1	1839617	1839617	type	True	74.9127	60	1146	95	below_threshold
Acrocarpospora phusangensis	strain=NBRC 108782	GCA_016862995.1	1070424	1070424	type	True	74.9023	85	1146	95	below_threshold
Micromonospora aurantiaca	strain=ATCC 27029	GCA_003721415.1	47850	47850	type	True	74.7866	88	1146	95	below_threshold
--------------------------------------------------------------------------------
[2023-06-27 14:03:26,886] [INFO] DFAST Taxonomy check result was written to GCA_026708005.1_ASM2670800v1_genomic.fna/tc_result.tsv
[2023-06-27 14:03:26,887] [INFO] ===== Taxonomy check completed =====
[2023-06-27 14:03:26,887] [INFO] ===== Start completeness check using CheckM =====
[2023-06-27 14:03:26,887] [INFO] Setting CHECKM_DATA_PATH to /var/lib/cwl/stg3062b4f5-abac-46c0-bf64-b1adad28ea74/dqc_reference/checkm_data
[2023-06-27 14:03:26,888] [INFO] Selected 'Prokaryote' markers (life, taxid=0) for CheckM
[2023-06-27 14:03:26,925] [INFO] Task started: CheckM
[2023-06-27 14:03:26,925] [INFO] Running command: checkm taxonomy_wf --tab_table -f GCA_026708005.1_ASM2670800v1_genomic.fna/cc_result.tsv -t 1 life "Prokaryote" GCA_026708005.1_ASM2670800v1_genomic.fna/checkm_input GCA_026708005.1_ASM2670800v1_genomic.fna/checkm_result
[2023-06-27 14:03:56,713] [INFO] Task succeeded: CheckM
[2023-06-27 14:03:56,714] [INFO] Completeness check finished.
--------------------------------------------------------------------------------
Completeness: 100.00%
Contamintation: 0.00%
Strain heterogeneity: 0.00%
--------------------------------------------------------------------------------
[2023-06-27 14:03:56,732] [INFO] ===== Completeness check finished =====
[2023-06-27 14:03:56,733] [INFO] ===== Start GTDB Search =====
[2023-06-27 14:03:56,733] [INFO] Query marker FASTA already exists. Will reuse it. (GCA_026708005.1_ASM2670800v1_genomic.fna/markers.fasta)
[2023-06-27 14:03:56,733] [INFO] Task started: Blastn
[2023-06-27 14:03:56,733] [INFO] Running command: blastn -query GCA_026708005.1_ASM2670800v1_genomic.fna/markers.fasta -db /var/lib/cwl/stg3062b4f5-abac-46c0-bf64-b1adad28ea74/dqc_reference/reference_markers_gtdb.fasta -out GCA_026708005.1_ASM2670800v1_genomic.fna/blast.markers.gtdb.tsv -outfmt 6 -max_hsps 1 -num_alignments 5
[2023-06-27 14:03:57,663] [INFO] Task succeeded: Blastn
[2023-06-27 14:03:57,680] [INFO] Selected 21 target genomes.
[2023-06-27 14:03:57,681] [INFO] Target genome list was writen to GCA_026708005.1_ASM2670800v1_genomic.fna/target_genomes_gtdb.txt
[2023-06-27 14:03:57,696] [INFO] Task started: fastANI
[2023-06-27 14:03:57,697] [INFO] Running command: fastANI --query /var/lib/cwl/stg0f8767d2-7bf7-45eb-85d7-48c22756feae/GCA_026708005.1_ASM2670800v1_genomic.fna.gz --refList GCA_026708005.1_ASM2670800v1_genomic.fna/target_genomes_gtdb.txt --output GCA_026708005.1_ASM2670800v1_genomic.fna/fastani_result_gtdb.tsv --threads 1
[2023-06-27 14:04:06,645] [INFO] Task succeeded: fastANI
[2023-06-27 14:04:06,660] [INFO] Found 19 fastANI hits (0 hits with ANI > circumscription radius)
[2023-06-27 14:04:06,660] [INFO] GTDB search result
--------------------------------------------------------------------------------
accession	gtdb_species	ani	matched_fragments	total_fragments	gtdb_taxonomy	ani_circumscription_radius	mean_intra_species_ani	min_intra_species_ani	mean_intra_species_af	min_intra_species_af	num_clustered_genomes	status
GCA_003139545.1	s__Palsa-739 sp003139545	76.5427	118	1146	d__Bacteria;p__Actinobacteriota;c__Thermoleophilia;o__Gaiellales;f__Gaiellaceae;g__Palsa-739	95.0	N/A	N/A	N/A	N/A	1	-
GCA_005883335.1	s__AC-32 sp005883335	76.5133	101	1146	d__Bacteria;p__Actinobacteriota;c__Thermoleophilia;o__Gaiellales;f__Gaiellaceae;g__AC-32	95.0	N/A	N/A	N/A	N/A	1	-
GCA_019236545.1	s__Palsa-739 sp019236545	76.4971	114	1146	d__Bacteria;p__Actinobacteriota;c__Thermoleophilia;o__Gaiellales;f__Gaiellaceae;g__Palsa-739	95.0	98.19	98.19	0.82	0.82	2	-
GCA_005883365.1	s__Palsa-739 sp005883365	76.4751	86	1146	d__Bacteria;p__Actinobacteriota;c__Thermoleophilia;o__Gaiellales;f__Gaiellaceae;g__Palsa-739	95.0	N/A	N/A	N/A	N/A	1	-
GCA_005883955.1	s__13-2-20CM-68-14 sp005883955	76.4693	106	1146	d__Bacteria;p__Actinobacteriota;c__Thermoleophilia;o__Gaiellales;f__Gaiellaceae;g__13-2-20CM-68-14	95.0	N/A	N/A	N/A	N/A	1	-
GCA_005884155.1	s__AC-32 sp005884155	76.4523	102	1146	d__Bacteria;p__Actinobacteriota;c__Thermoleophilia;o__Gaiellales;f__Gaiellaceae;g__AC-32	95.0	N/A	N/A	N/A	N/A	1	-
GCA_005884575.1	s__AC-16 sp005884575	76.4123	161	1146	d__Bacteria;p__Actinobacteriota;c__Thermoleophilia;o__Gaiellales;f__Gaiellaceae;g__AC-16	95.0	N/A	N/A	N/A	N/A	1	-
GCA_013696185.1	s__JACCZA01 sp013696185	76.3985	117	1146	d__Bacteria;p__Actinobacteriota;c__Thermoleophilia;o__Gaiellales;f__Gaiellaceae;g__JACCZA01	95.0	N/A	N/A	N/A	N/A	1	-
GCA_003161615.1	s__Palsa-739 sp003161615	76.3799	102	1146	d__Bacteria;p__Actinobacteriota;c__Thermoleophilia;o__Gaiellales;f__Gaiellaceae;g__Palsa-739	95.0	N/A	N/A	N/A	N/A	1	-
GCA_005885245.1	s__3-1-20CM-4-69-9 sp005885245	76.3655	106	1146	d__Bacteria;p__Actinobacteriota;c__Thermoleophilia;o__Gaiellales;f__Gaiellaceae;g__3-1-20CM-4-69-9	95.0	N/A	N/A	N/A	N/A	1	-
GCF_003351045.1	s__Gaiella occulta	76.3219	158	1146	d__Bacteria;p__Actinobacteriota;c__Thermoleophilia;o__Gaiellales;f__Gaiellaceae;g__Gaiella	95.0	N/A	N/A	N/A	N/A	1	-
GCA_013816275.1	s__JACDER01 sp013816275	76.3138	60	1146	d__Bacteria;p__Actinobacteriota;c__Thermoleophilia;o__Gaiellales;f__Gaiellaceae;g__JACDER01	95.0	N/A	N/A	N/A	N/A	1	-
GCA_016870235.1	s__Gaiella sp016870235	76.3106	128	1146	d__Bacteria;p__Actinobacteriota;c__Thermoleophilia;o__Gaiellales;f__Gaiellaceae;g__Gaiella	95.0	N/A	N/A	N/A	N/A	1	-
GCA_005885565.1	s__AC-50 sp005885565	76.238	132	1146	d__Bacteria;p__Actinobacteriota;c__Thermoleophilia;o__Gaiellales;f__Gaiellaceae;g__AC-50	95.0	N/A	N/A	N/A	N/A	1	-
GCA_013812265.1	s__JACCTQ01 sp013812265	76.0928	100	1146	d__Bacteria;p__Actinobacteriota;c__Thermoleophilia;o__Gaiellales;f__Gaiellaceae;g__JACCTQ01	95.0	N/A	N/A	N/A	N/A	1	-
GCA_001920085.1	s__3-1-20CM-4-69-9 sp001920085	76.0732	72	1146	d__Bacteria;p__Actinobacteriota;c__Thermoleophilia;o__Gaiellales;f__Gaiellaceae;g__3-1-20CM-4-69-9	95.0	N/A	N/A	N/A	N/A	1	-
GCA_005883235.1	s__AC-50 sp005883235	76.0253	77	1146	d__Bacteria;p__Actinobacteriota;c__Thermoleophilia;o__Gaiellales;f__Gaiellaceae;g__AC-50	95.0	N/A	N/A	N/A	N/A	1	-
GCA_014534155.1	s__JACVSB01 sp014534155	75.6664	76	1146	d__Bacteria;p__Actinobacteriota;c__Thermoleophilia;o__Gaiellales;f__Gaiellaceae;g__JACVSB01	95.0	N/A	N/A	N/A	N/A	1	-
GCA_003152275.1	s__Palsa-739 sp003152275	75.6482	86	1146	d__Bacteria;p__Actinobacteriota;c__Thermoleophilia;o__Gaiellales;f__Gaiellaceae;g__Palsa-739	95.0	96.63	96.63	0.81	0.81	2	-
--------------------------------------------------------------------------------
[2023-06-27 14:04:06,666] [INFO] GTDB search result was written to GCA_026708005.1_ASM2670800v1_genomic.fna/result_gtdb.tsv
[2023-06-27 14:04:06,666] [INFO] ===== GTDB Search completed =====
[2023-06-27 14:04:06,670] [INFO] DFAST_QC result json was written to GCA_026708005.1_ASM2670800v1_genomic.fna/dqc_result.json
[2023-06-27 14:04:06,670] [INFO] DFAST_QC completed!
[2023-06-27 14:04:06,670] [INFO] Total running time: 0h1m7s