[2024-01-24 12:16:04,998] [INFO] DFAST_QC pipeline started.
[2024-01-24 12:16:05,000] [INFO] DFAST_QC version: 0.5.7
[2024-01-24 12:16:05,001] [INFO] DQC Reference Directory: /var/lib/cwl/stg5711c82e-f715-4ebd-9543-c451617b7a84/dqc_reference
[2024-01-24 12:16:06,206] [INFO] ===== Start taxonomy check using ANI =====
[2024-01-24 12:16:06,207] [INFO] Task started: Prodigal
[2024-01-24 12:16:06,207] [INFO] Running command: gunzip -c /var/lib/cwl/stg9ee4b3db-6773-4904-9352-c87cab1c2e92/GCF_025165815.1_ASM2516581v1_genomic.fna.gz | prodigal -d GCF_025165815.1_ASM2516581v1_genomic.fna/cds.fna -a GCF_025165815.1_ASM2516581v1_genomic.fna/protein.faa -g 11 -q > /dev/null
[2024-01-24 12:16:29,282] [INFO] Task succeeded: Prodigal
[2024-01-24 12:16:29,283] [INFO] Task started: HMMsearch
[2024-01-24 12:16:29,283] [INFO] Running command: hmmsearch --tblout GCF_025165815.1_ASM2516581v1_genomic.fna/hmmer_result.tsv -E 1E-50 /var/lib/cwl/stg5711c82e-f715-4ebd-9543-c451617b7a84/dqc_reference/reference_markers.hmm GCF_025165815.1_ASM2516581v1_genomic.fna/protein.faa > /dev/null
[2024-01-24 12:16:29,635] [INFO] Task succeeded: HMMsearch
[2024-01-24 12:16:29,636] [INFO] Found 6/6 markers.
[2024-01-24 12:16:29,695] [INFO] Query marker FASTA was written to GCF_025165815.1_ASM2516581v1_genomic.fna/markers.fasta
[2024-01-24 12:16:29,696] [INFO] Task started: Blastn
[2024-01-24 12:16:29,696] [INFO] Running command: blastn -query GCF_025165815.1_ASM2516581v1_genomic.fna/markers.fasta -db /var/lib/cwl/stg5711c82e-f715-4ebd-9543-c451617b7a84/dqc_reference/reference_markers.fasta -out GCF_025165815.1_ASM2516581v1_genomic.fna/blast.markers.tsv -outfmt 6 -max_hsps 1 -num_alignments 5
[2024-01-24 12:16:31,110] [INFO] Task succeeded: Blastn
[2024-01-24 12:16:31,113] [INFO] Selected 20 target genomes.
[2024-01-24 12:16:31,114] [INFO] Target genome list was writen to GCF_025165815.1_ASM2516581v1_genomic.fna/target_genomes.txt
[2024-01-24 12:16:31,122] [INFO] Task started: fastANI
[2024-01-24 12:16:31,122] [INFO] Running command: fastANI --query /var/lib/cwl/stg9ee4b3db-6773-4904-9352-c87cab1c2e92/GCF_025165815.1_ASM2516581v1_genomic.fna.gz --refList GCF_025165815.1_ASM2516581v1_genomic.fna/target_genomes.txt --output GCF_025165815.1_ASM2516581v1_genomic.fna/fastani_result.tsv --threads 1
[2024-01-24 12:17:07,259] [INFO] Task succeeded: fastANI
[2024-01-24 12:17:07,259] [INFO] Loading species specific ANI threshold from /var/lib/cwl/stg5711c82e-f715-4ebd-9543-c451617b7a84/dqc_reference/prokaryote_ANI_species_specific_threshold.txt
[2024-01-24 12:17:07,260] [WARNING] Species-specific ANI threshold file not found. Will use the default threshold for all species. [/var/lib/cwl/stg5711c82e-f715-4ebd-9543-c451617b7a84/dqc_reference/prokaryote_ANI_species_specific_threshold.txt]
[2024-01-24 12:17:07,275] [INFO] Found 20 fastANI hits (0 hits with ANI > threshold)
[2024-01-24 12:17:07,275] [INFO] The taxonomy check result is classified as 'below_threshold'.
[2024-01-24 12:17:07,275] [INFO] DFAST Taxonomy check final result
--------------------------------------------------------------------------------
organism_name	strain	accession	taxid	species_taxid	relation_to_type	validated	ani	matched_fragments	total_fragments	ani_threshold	status
Actinophytocola xinjiangensis	strain=CGMCC 4.4663	GCA_001921215.1	485602	485602	type	True	83.3074	1614	2434	95	below_threshold
Actinophytocola xanthii	strain=11-183	GCA_001921205.1	1912961	1912961	type	True	83.1456	1485	2434	95	below_threshold
Actinophytocola algeriensis	strain=DSM 46746	GCA_014874055.1	1768010	1768010	type	True	81.548	1533	2434	95	below_threshold
Actinophytocola algeriensis	strain=CECT 8960	GCA_014203735.1	1768010	1768010	type	True	81.4691	1546	2434	95	below_threshold
Actinophytocola oryzae	strain=DSM 45499	GCA_004364325.1	502181	502181	type	True	80.593	1373	2434	95	below_threshold
Actinokineospora iranica	strain=IBRC-M 10403	GCA_900101685.1	1271860	1271860	type	True	79.5202	922	2434	95	below_threshold
Actinokineospora spheciospongiae	strain=EG49	GCA_000564855.1	909613	909613	type	True	79.4139	982	2434	95	below_threshold
Actinokineospora globicatena	strain=DSM 44256	GCA_024171945.1	103729	103729	type	True	79.3248	901	2434	95	below_threshold
Actinokineospora terrae	strain=DSM 44260	GCA_900111175.1	155974	155974	type	True	79.2616	941	2434	95	below_threshold
Actinokineospora enzanensis	strain=DSM 44649	GCA_000374445.1	155975	155975	type	True	79.0894	980	2434	95	below_threshold
Amycolatopsis thermalba	strain=NRRL B-24845	GCA_003385215.1	944492	944492	type	True	78.8244	961	2434	95	below_threshold
Saccharopolyspora hordei	strain=DSM 44065	GCA_013410345.1	1838	1838	type	True	78.3328	812	2434	95	below_threshold
Actinosynnema mirum	strain=DSM 43827	GCA_000023245.1	40567	40567	type	True	78.3236	1042	2434	95	below_threshold
Prauserella cavernicola	strain=ASG 168	GCA_016595675.1	2800127	2800127	type	True	78.267	896	2434	95	below_threshold
Amycolatopsis camponoti	strain=A23	GCA_902497555.1	2606593	2606593	type	True	78.2571	1068	2434	95	below_threshold
Kibdelosporangium banguiense	strain=DSM 46670	GCA_017876405.1	1365924	1365924	type	True	78.1557	942	2434	95	below_threshold
Saccharopolyspora flava	strain=DSM 44771	GCA_900116135.1	95161	95161	type	True	78.1223	825	2434	95	below_threshold
Amycolatopsis aidingensis	strain=YIM 96748	GCA_018885265.1	2842453	2842453	type	True	78.1191	887	2434	95	below_threshold
Crossiella cryophila	strain=DSM 44230	GCA_014204915.1	43355	43355	type	True	78.0165	998	2434	95	below_threshold
Saccharopolyspora shandongensis	strain=CGMCC 4.3530	GCA_900106995.1	418495	418495	type	True	77.9936	896	2434	95	below_threshold
--------------------------------------------------------------------------------
[2024-01-24 12:17:07,277] [INFO] DFAST Taxonomy check result was written to GCF_025165815.1_ASM2516581v1_genomic.fna/tc_result.tsv
[2024-01-24 12:17:07,278] [INFO] ===== Taxonomy check completed =====
[2024-01-24 12:17:07,278] [INFO] ===== Start completeness check using CheckM =====
[2024-01-24 12:17:07,278] [INFO] Setting CHECKM_DATA_PATH to /var/lib/cwl/stg5711c82e-f715-4ebd-9543-c451617b7a84/dqc_reference/checkm_data
[2024-01-24 12:17:07,280] [INFO] Selected 'Prokaryote' markers (life, taxid=0) for CheckM
[2024-01-24 12:17:07,344] [INFO] Task started: CheckM
[2024-01-24 12:17:07,344] [INFO] Running command: checkm taxonomy_wf --tab_table -f GCF_025165815.1_ASM2516581v1_genomic.fna/cc_result.tsv -t 1 life "Prokaryote" GCF_025165815.1_ASM2516581v1_genomic.fna/checkm_input GCF_025165815.1_ASM2516581v1_genomic.fna/checkm_result
[2024-01-24 12:18:36,814] [INFO] Task succeeded: CheckM
[2024-01-24 12:18:36,816] [INFO] Completeness check finished.
--------------------------------------------------------------------------------
Completeness: 100.00%
Contamintation: 2.78%
Strain heterogeneity: 0.00%
--------------------------------------------------------------------------------
[2024-01-24 12:18:36,842] [INFO] ===== Completeness check finished =====
[2024-01-24 12:18:36,842] [INFO] ===== Start GTDB Search =====
[2024-01-24 12:18:36,843] [INFO] Query marker FASTA already exists. Will reuse it. (GCF_025165815.1_ASM2516581v1_genomic.fna/markers.fasta)
[2024-01-24 12:18:36,843] [INFO] Task started: Blastn
[2024-01-24 12:18:36,843] [INFO] Running command: blastn -query GCF_025165815.1_ASM2516581v1_genomic.fna/markers.fasta -db /var/lib/cwl/stg5711c82e-f715-4ebd-9543-c451617b7a84/dqc_reference/reference_markers_gtdb.fasta -out GCF_025165815.1_ASM2516581v1_genomic.fna/blast.markers.gtdb.tsv -outfmt 6 -max_hsps 1 -num_alignments 5
[2024-01-24 12:18:38,962] [INFO] Task succeeded: Blastn
[2024-01-24 12:18:38,966] [INFO] Selected 18 target genomes.
[2024-01-24 12:18:38,966] [INFO] Target genome list was writen to GCF_025165815.1_ASM2516581v1_genomic.fna/target_genomes_gtdb.txt
[2024-01-24 12:18:38,998] [INFO] Task started: fastANI
[2024-01-24 12:18:38,999] [INFO] Running command: fastANI --query /var/lib/cwl/stg9ee4b3db-6773-4904-9352-c87cab1c2e92/GCF_025165815.1_ASM2516581v1_genomic.fna.gz --refList GCF_025165815.1_ASM2516581v1_genomic.fna/target_genomes_gtdb.txt --output GCF_025165815.1_ASM2516581v1_genomic.fna/fastani_result_gtdb.tsv --threads 1
[2024-01-24 12:19:10,221] [INFO] Task succeeded: fastANI
[2024-01-24 12:19:10,237] [INFO] Found 18 fastANI hits (0 hits with ANI > circumscription radius)
[2024-01-24 12:19:10,238] [INFO] GTDB search result
--------------------------------------------------------------------------------
accession	gtdb_species	ani	matched_fragments	total_fragments	gtdb_taxonomy	ani_circumscription_radius	mean_intra_species_ani	min_intra_species_ani	mean_intra_species_af	min_intra_species_af	num_clustered_genomes	status
GCF_001921215.1	s__Actinophytocola xinjiangensis	83.2899	1618	2434	d__Bacteria;p__Actinobacteriota;c__Actinomycetia;o__Mycobacteriales;f__Pseudonocardiaceae;g__Actinophytocola	95.0	N/A	N/A	N/A	N/A	1	-
GCF_001921205.1	s__Actinophytocola xanthii	83.2232	1472	2434	d__Bacteria;p__Actinobacteriota;c__Actinomycetia;o__Mycobacteriales;f__Pseudonocardiaceae;g__Actinophytocola	95.0	N/A	N/A	N/A	N/A	1	-
GCA_009379765.1	s__Actinophytocola sp009379765	82.8951	1384	2434	d__Bacteria;p__Actinobacteriota;c__Actinomycetia;o__Mycobacteriales;f__Pseudonocardiaceae;g__Actinophytocola	95.0	N/A	N/A	N/A	N/A	1	-
GCF_014203735.1	s__Actinophytocola algeriensis	81.4965	1540	2434	d__Bacteria;p__Actinobacteriota;c__Actinomycetia;o__Mycobacteriales;f__Pseudonocardiaceae;g__Actinophytocola	95.0	100.00	100.00	1.00	1.00	2	-
GCF_004364325.1	s__Actinophytocola oryzae	80.5984	1372	2434	d__Bacteria;p__Actinobacteriota;c__Actinomycetia;o__Mycobacteriales;f__Pseudonocardiaceae;g__Actinophytocola	95.0	N/A	N/A	N/A	N/A	1	-
GCF_900101685.1	s__Actinokineospora iranica	79.5205	922	2434	d__Bacteria;p__Actinobacteriota;c__Actinomycetia;o__Mycobacteriales;f__Pseudonocardiaceae;g__Actinokineospora	95.0	N/A	N/A	N/A	N/A	1	-
GCF_009745975.1	s__Actinokineospora pegani	79.3518	924	2434	d__Bacteria;p__Actinobacteriota;c__Actinomycetia;o__Mycobacteriales;f__Pseudonocardiaceae;g__Actinokineospora	95.0	N/A	N/A	N/A	N/A	1	-
GCF_004216555.1	s__Herbihabitans rhizosphaerae	79.3303	964	2434	d__Bacteria;p__Actinobacteriota;c__Actinomycetia;o__Mycobacteriales;f__Pseudonocardiaceae;g__Herbihabitans	95.0	N/A	N/A	N/A	N/A	1	-
GCF_900070365.1	s__Actinokineospora sp900070365	79.1805	935	2434	d__Bacteria;p__Actinobacteriota;c__Actinomycetia;o__Mycobacteriales;f__Pseudonocardiaceae;g__Actinokineospora	95.0	N/A	N/A	N/A	N/A	1	-
GCF_000374445.1	s__Actinokineospora enzanensis	79.1268	972	2434	d__Bacteria;p__Actinobacteriota;c__Actinomycetia;o__Mycobacteriales;f__Pseudonocardiaceae;g__Actinokineospora	95.0	N/A	N/A	N/A	N/A	1	-
GCF_002934265.1	s__Actinokineospora auranticolor	79.0865	1012	2434	d__Bacteria;p__Actinobacteriota;c__Actinomycetia;o__Mycobacteriales;f__Pseudonocardiaceae;g__Actinokineospora	95.0	N/A	N/A	N/A	N/A	1	-
GCF_004362825.1	s__Labedaea rhizosphaerae	79.0462	1052	2434	d__Bacteria;p__Actinobacteriota;c__Actinomycetia;o__Mycobacteriales;f__Pseudonocardiaceae;g__Labedaea	95.0	N/A	N/A	N/A	N/A	1	-
GCF_011758765.1	s__Amycolatopsis viridis	78.6921	861	2434	d__Bacteria;p__Actinobacteriota;c__Actinomycetia;o__Mycobacteriales;f__Pseudonocardiaceae;g__Amycolatopsis	95.9842	N/A	N/A	N/A	N/A	1	-
GCF_008386585.1	s__AN110305 sp008386585	78.5287	1023	2434	d__Bacteria;p__Actinobacteriota;c__Actinomycetia;o__Mycobacteriales;f__Pseudonocardiaceae;g__AN110305	95.0	N/A	N/A	N/A	N/A	1	-
GCF_000023245.1	s__Actinosynnema mirum	78.3379	1037	2434	d__Bacteria;p__Actinobacteriota;c__Actinomycetia;o__Mycobacteriales;f__Pseudonocardiaceae;g__Actinosynnema	96.4703	99.51	99.51	0.95	0.95	3	-
GCF_002354875.1	s__Actinosynnema auranticum	78.2248	1029	2434	d__Bacteria;p__Actinobacteriota;c__Actinomycetia;o__Mycobacteriales;f__Pseudonocardiaceae;g__Actinosynnema	96.4703	100.00	100.00	1.00	1.00	2	-
GCF_017876405.1	s__Kibdelosporangium banguiense	78.1666	938	2434	d__Bacteria;p__Actinobacteriota;c__Actinomycetia;o__Mycobacteriales;f__Pseudonocardiaceae;g__Kibdelosporangium	95.0	N/A	N/A	N/A	N/A	1	-
GCF_900116135.1	s__Saccharopolyspora flava	78.1351	824	2434	d__Bacteria;p__Actinobacteriota;c__Actinomycetia;o__Mycobacteriales;f__Pseudonocardiaceae;g__Saccharopolyspora	95.0	N/A	N/A	N/A	N/A	1	-
--------------------------------------------------------------------------------
[2024-01-24 12:19:10,239] [INFO] GTDB search result was written to GCF_025165815.1_ASM2516581v1_genomic.fna/result_gtdb.tsv
[2024-01-24 12:19:10,240] [INFO] ===== GTDB Search completed =====
[2024-01-24 12:19:10,244] [INFO] DFAST_QC result json was written to GCF_025165815.1_ASM2516581v1_genomic.fna/dqc_result.json
[2024-01-24 12:19:10,244] [INFO] DFAST_QC completed!
[2024-01-24 12:19:10,244] [INFO] Total running time: 0h3m5s