[2023-06-29 16:38:52,412] [INFO] DFAST_QC pipeline started.
[2023-06-29 16:38:52,415] [INFO] DFAST_QC version: 0.5.7
[2023-06-29 16:38:52,415] [INFO] DQC Reference Directory: /var/lib/cwl/stge15700a6-bfaf-4984-912d-c7de7ae1af36/dqc_reference
[2023-06-29 16:38:53,788] [INFO] ===== Start taxonomy check using ANI =====
[2023-06-29 16:38:53,789] [INFO] Task started: Prodigal
[2023-06-29 16:38:53,790] [INFO] Running command: gunzip -c /var/lib/cwl/stg67adaad7-c266-4048-b6fe-a64b8b619ab1/GCA_011053145.1_ASM1105314v1_genomic.fna.gz | prodigal -d GCA_011053145.1_ASM1105314v1_genomic.fna/cds.fna -a GCA_011053145.1_ASM1105314v1_genomic.fna/protein.faa -g 11 -q > /dev/null
[2023-06-29 16:39:03,191] [INFO] Task succeeded: Prodigal
[2023-06-29 16:39:03,191] [INFO] Task started: HMMsearch
[2023-06-29 16:39:03,191] [INFO] Running command: hmmsearch --tblout GCA_011053145.1_ASM1105314v1_genomic.fna/hmmer_result.tsv -E 1E-50 /var/lib/cwl/stge15700a6-bfaf-4984-912d-c7de7ae1af36/dqc_reference/reference_markers.hmm GCA_011053145.1_ASM1105314v1_genomic.fna/protein.faa > /dev/null
[2023-06-29 16:39:03,458] [INFO] Task succeeded: HMMsearch
[2023-06-29 16:39:03,464] [INFO] Found 6/6 markers.
[2023-06-29 16:39:03,514] [INFO] Query marker FASTA was written to GCA_011053145.1_ASM1105314v1_genomic.fna/markers.fasta
[2023-06-29 16:39:03,515] [INFO] Task started: Blastn
[2023-06-29 16:39:03,515] [INFO] Running command: blastn -query GCA_011053145.1_ASM1105314v1_genomic.fna/markers.fasta -db /var/lib/cwl/stge15700a6-bfaf-4984-912d-c7de7ae1af36/dqc_reference/reference_markers.fasta -out GCA_011053145.1_ASM1105314v1_genomic.fna/blast.markers.tsv -outfmt 6 -max_hsps 1 -num_alignments 5
[2023-06-29 16:39:04,323] [INFO] Task succeeded: Blastn
[2023-06-29 16:39:04,328] [INFO] Selected 29 target genomes.
[2023-06-29 16:39:04,328] [INFO] Target genome list was writen to GCA_011053145.1_ASM1105314v1_genomic.fna/target_genomes.txt
[2023-06-29 16:39:04,331] [INFO] Task started: fastANI
[2023-06-29 16:39:04,331] [INFO] Running command: fastANI --query /var/lib/cwl/stg67adaad7-c266-4048-b6fe-a64b8b619ab1/GCA_011053145.1_ASM1105314v1_genomic.fna.gz --refList GCA_011053145.1_ASM1105314v1_genomic.fna/target_genomes.txt --output GCA_011053145.1_ASM1105314v1_genomic.fna/fastani_result.tsv --threads 1
[2023-06-29 16:39:26,651] [INFO] Task succeeded: fastANI
[2023-06-29 16:39:26,651] [INFO] Loading species specific ANI threshold from /var/lib/cwl/stge15700a6-bfaf-4984-912d-c7de7ae1af36/dqc_reference/prokaryote_ANI_species_specific_threshold.txt
[2023-06-29 16:39:26,652] [WARNING] Species-specific ANI threshold file not found. Will use the default threshold for all species. [/var/lib/cwl/stge15700a6-bfaf-4984-912d-c7de7ae1af36/dqc_reference/prokaryote_ANI_species_specific_threshold.txt]
[2023-06-29 16:39:26,666] [INFO] Found 15 fastANI hits (0 hits with ANI > threshold)
[2023-06-29 16:39:26,666] [INFO] The taxonomy check result is classified as 'below_threshold'.
[2023-06-29 16:39:26,666] [INFO] DFAST Taxonomy check final result
--------------------------------------------------------------------------------
organism_name	strain	accession	taxid	species_taxid	relation_to_type	validated	ani	matched_fragments	total_fragments	ani_threshold	status
Paludisphaera soli	strain=JC670	GCA_011064595.1	2712865	2712865	type	True	74.9883	88	1154	95	below_threshold
Phycisphaera mikurensis	strain=NBRC 102666	GCA_000284115.1	547188	547188	type	True	74.9506	70	1154	95	below_threshold
Posidoniimonas polymericola	strain=Pla123a	GCA_007859935.1	2528002	2528002	type	True	74.9396	56	1154	95	below_threshold
Phycisphaera mikurensis	strain=DSM 103959	GCA_014207395.1	547188	547188	type	True	74.9378	72	1154	95	below_threshold
Tautonia plasticadhaerens	strain=ElP	GCA_007752535.1	2527974	2527974	type	True	74.9114	94	1154	95	below_threshold
Methylobacterium tardum	strain=DSM 19566	GCA_023546765.1	374432	374432	type	True	74.7899	55	1154	95	below_threshold
Vulcaniibacterium tengchongense	strain=YIM 77520	GCA_008033455.1	1273429	1273429	type	True	74.7884	54	1154	95	below_threshold
Vulcaniibacterium tengchongense	strain=DSM 25623	GCA_003814555.1	1273429	1273429	type	True	74.7863	55	1154	95	below_threshold
Methylobrevis albus	strain=L22	GCA_015904235.1	2793297	2793297	type	True	74.776	67	1154	95	below_threshold
Actinoplanes globisporus	strain=DSM 43857	GCA_000379645.1	113565	113565	type	True	74.756	84	1154	95	below_threshold
Methylobacterium radiotolerans	strain=NBRC 15690	GCA_007991055.1	31998	31998	type	True	74.7477	73	1154	95	below_threshold
Methylobacterium radiotolerans	strain=JCM 2831	GCA_000019725.1	31998	31998	type	True	74.7337	76	1154	95	below_threshold
Paraconexibacter algicola	strain=Seoho-28	GCA_003044185.1	2133960	2133960	type	True	74.6943	100	1154	95	below_threshold
Methylobacterium indicum	strain=SE2.11	GCA_001043895.1	1775910	1775910	type	True	74.6894	78	1154	95	below_threshold
Methylobacterium nonmethylotrophicum	strain=6HR-1	GCA_004745635.1	1141884	1141884	type	True	74.662	76	1154	95	below_threshold
--------------------------------------------------------------------------------
[2023-06-29 16:39:26,668] [INFO] DFAST Taxonomy check result was written to GCA_011053145.1_ASM1105314v1_genomic.fna/tc_result.tsv
[2023-06-29 16:39:26,669] [INFO] ===== Taxonomy check completed =====
[2023-06-29 16:39:26,669] [INFO] ===== Start completeness check using CheckM =====
[2023-06-29 16:39:26,669] [INFO] Setting CHECKM_DATA_PATH to /var/lib/cwl/stge15700a6-bfaf-4984-912d-c7de7ae1af36/dqc_reference/checkm_data
[2023-06-29 16:39:26,671] [INFO] Selected 'Prokaryote' markers (life, taxid=0) for CheckM
[2023-06-29 16:39:26,712] [INFO] Task started: CheckM
[2023-06-29 16:39:26,713] [INFO] Running command: checkm taxonomy_wf --tab_table -f GCA_011053145.1_ASM1105314v1_genomic.fna/cc_result.tsv -t 1 life "Prokaryote" GCA_011053145.1_ASM1105314v1_genomic.fna/checkm_input GCA_011053145.1_ASM1105314v1_genomic.fna/checkm_result
[2023-06-29 16:39:58,195] [INFO] Task succeeded: CheckM
[2023-06-29 16:39:58,197] [INFO] Completeness check finished.
--------------------------------------------------------------------------------
Completeness: 91.67%
Contamintation: 0.00%
Strain heterogeneity: 0.00%
--------------------------------------------------------------------------------
[2023-06-29 16:39:58,221] [INFO] ===== Completeness check finished =====
[2023-06-29 16:39:58,221] [INFO] ===== Start GTDB Search =====
[2023-06-29 16:39:58,221] [INFO] Query marker FASTA already exists. Will reuse it. (GCA_011053145.1_ASM1105314v1_genomic.fna/markers.fasta)
[2023-06-29 16:39:58,222] [INFO] Task started: Blastn
[2023-06-29 16:39:58,222] [INFO] Running command: blastn -query GCA_011053145.1_ASM1105314v1_genomic.fna/markers.fasta -db /var/lib/cwl/stge15700a6-bfaf-4984-912d-c7de7ae1af36/dqc_reference/reference_markers_gtdb.fasta -out GCA_011053145.1_ASM1105314v1_genomic.fna/blast.markers.gtdb.tsv -outfmt 6 -max_hsps 1 -num_alignments 5
[2023-06-29 16:39:59,478] [INFO] Task succeeded: Blastn
[2023-06-29 16:39:59,483] [INFO] Selected 22 target genomes.
[2023-06-29 16:39:59,483] [INFO] Target genome list was writen to GCA_011053145.1_ASM1105314v1_genomic.fna/target_genomes_gtdb.txt
[2023-06-29 16:39:59,496] [INFO] Task started: fastANI
[2023-06-29 16:39:59,496] [INFO] Running command: fastANI --query /var/lib/cwl/stg67adaad7-c266-4048-b6fe-a64b8b619ab1/GCA_011053145.1_ASM1105314v1_genomic.fna.gz --refList GCA_011053145.1_ASM1105314v1_genomic.fna/target_genomes_gtdb.txt --output GCA_011053145.1_ASM1105314v1_genomic.fna/fastani_result_gtdb.tsv --threads 1
[2023-06-29 16:40:14,509] [INFO] Task succeeded: fastANI
[2023-06-29 16:40:14,527] [INFO] Found 15 fastANI hits (1 hits with ANI > circumscription radius)
[2023-06-29 16:40:14,527] [INFO] GTDB search result
--------------------------------------------------------------------------------
accession	gtdb_species	ani	matched_fragments	total_fragments	gtdb_taxonomy	ani_circumscription_radius	mean_intra_species_ani	min_intra_species_ani	mean_intra_species_af	min_intra_species_af	num_clustered_genomes	status
GCA_011053145.1	s__DRLC01 sp011053145	100.0	1135	1154	d__Bacteria;p__Planctomycetota;c__UBA1135;o__UBA1135;f__GCA-002686595;g__DRLC01	95.0	N/A	N/A	N/A	N/A	1	conclusive
GCA_013140785.1	s__JABFRZ01 sp013140785	76.936	167	1154	d__Bacteria;p__Planctomycetota;c__UBA1135;o__UBA1135;f__GCA-002686595;g__JABFRZ01	95.0	N/A	N/A	N/A	N/A	1	-
GCA_902826945.1	s__RBC036 sp902826945	76.9155	285	1154	d__Bacteria;p__Planctomycetota;c__UBA1135;o__UBA1135;f__GCA-002686595;g__RBC036	95.0	N/A	N/A	N/A	N/A	1	-
GCA_009691855.1	s__JABFRZ01 sp009691855	76.8512	202	1154	d__Bacteria;p__Planctomycetota;c__UBA1135;o__UBA1135;f__GCA-002686595;g__JABFRZ01	95.0	N/A	N/A	N/A	N/A	1	-
GCA_016125265.1	s__RI-242 sp016125265	76.5903	142	1154	d__Bacteria;p__Planctomycetota;c__UBA1135;o__UBA1135;f__GCA-002686595;g__RI-242	95.0	N/A	N/A	N/A	N/A	1	-
GCA_016872755.1	s__GW928-bin9 sp016872755	76.5732	158	1154	d__Bacteria;p__Planctomycetota;c__UBA1135;o__UBA1135;f__GCA-002686595;g__GW928-bin9	95.0	N/A	N/A	N/A	N/A	1	-
GCA_007750655.1	s__Pla163 sp007750655	76.5028	179	1154	d__Bacteria;p__Planctomycetota;c__UBA1135;o__UBA1135;f__GCA-002686595;g__Pla163	95.0	N/A	N/A	N/A	N/A	1	-
GCA_011525785.1	s__WYBT01 sp011525785	76.2968	192	1154	d__Bacteria;p__Planctomycetota;c__UBA1135;o__UBA1135;f__GCA-002686595;g__WYBT01	95.0	N/A	N/A	N/A	N/A	1	-
GCA_016795205.1	s__JAEUHZ01 sp016795205	76.2474	152	1154	d__Bacteria;p__Planctomycetota;c__UBA1135;o__UBA1135;f__GCA-002686595;g__JAEUHZ01	95.0	N/A	N/A	N/A	N/A	1	-
GCA_004296785.1	s__GW928-bin9 sp004296785	76.1856	146	1154	d__Bacteria;p__Planctomycetota;c__UBA1135;o__UBA1135;f__GCA-002686595;g__GW928-bin9	95.0	N/A	N/A	N/A	N/A	1	-
GCA_016220085.1	s__JACRIN01 sp016220085	76.0496	191	1154	d__Bacteria;p__Planctomycetota;c__UBA1135;o__UBA1135;f__GCA-002686595;g__JACRIN01	95.0	N/A	N/A	N/A	N/A	1	-
GCA_016220035.1	s__JADJNY01 sp016220035	75.9782	199	1154	d__Bacteria;p__Planctomycetota;c__UBA1135;o__UBA1135;f__GCA-002686595;g__JADJNY01	95.0	N/A	N/A	N/A	N/A	1	-
GCA_017569225.1	s__Cellulomonas sp017569225	74.8457	63	1154	d__Bacteria;p__Actinobacteriota;c__Actinomycetia;o__Actinomycetales;f__Cellulomonadaceae;g__Cellulomonas	95.0	N/A	N/A	N/A	N/A	1	-
GCF_000306785.1	s__Modestobacter marinus_A	74.7689	64	1154	d__Bacteria;p__Actinobacteriota;c__Actinomycetia;o__Mycobacteriales;f__Geodermatophilaceae;g__Modestobacter	95.0	N/A	N/A	N/A	N/A	1	-
GCF_900143215.1	s__Geodermatophilus obscurus_A	74.7664	64	1154	d__Bacteria;p__Actinobacteriota;c__Actinomycetia;o__Mycobacteriales;f__Geodermatophilaceae;g__Geodermatophilus	95.0	N/A	N/A	N/A	N/A	1	-
--------------------------------------------------------------------------------
[2023-06-29 16:40:14,529] [INFO] GTDB search result was written to GCA_011053145.1_ASM1105314v1_genomic.fna/result_gtdb.tsv
[2023-06-29 16:40:14,530] [INFO] ===== GTDB Search completed =====
[2023-06-29 16:40:14,534] [INFO] DFAST_QC result json was written to GCA_011053145.1_ASM1105314v1_genomic.fna/dqc_result.json
[2023-06-29 16:40:14,534] [INFO] DFAST_QC completed!
[2023-06-29 16:40:14,534] [INFO] Total running time: 0h1m22s