[2024-01-24 12:29:37,428] [INFO] DFAST_QC pipeline started.
[2024-01-24 12:29:37,430] [INFO] DFAST_QC version: 0.5.7
[2024-01-24 12:29:37,431] [INFO] DQC Reference Directory: /var/lib/cwl/stgaeb6a96d-b302-458f-9628-66f99ec729a0/dqc_reference
[2024-01-24 12:29:38,708] [INFO] ===== Start taxonomy check using ANI =====
[2024-01-24 12:29:38,709] [INFO] Task started: Prodigal
[2024-01-24 12:29:38,710] [INFO] Running command: gunzip -c /var/lib/cwl/stg5530794d-9e7e-4c2a-a1d8-bee8a94406fb/GCF_019038455.1_ASM1903845v1_genomic.fna.gz | prodigal -d GCF_019038455.1_ASM1903845v1_genomic.fna/cds.fna -a GCF_019038455.1_ASM1903845v1_genomic.fna/protein.faa -g 11 -q > /dev/null
[2024-01-24 12:30:06,195] [INFO] Task succeeded: Prodigal
[2024-01-24 12:30:06,195] [INFO] Task started: HMMsearch
[2024-01-24 12:30:06,195] [INFO] Running command: hmmsearch --tblout GCF_019038455.1_ASM1903845v1_genomic.fna/hmmer_result.tsv -E 1E-50 /var/lib/cwl/stgaeb6a96d-b302-458f-9628-66f99ec729a0/dqc_reference/reference_markers.hmm GCF_019038455.1_ASM1903845v1_genomic.fna/protein.faa > /dev/null
[2024-01-24 12:30:06,558] [INFO] Task succeeded: HMMsearch
[2024-01-24 12:30:06,559] [INFO] Found 6/6 markers.
[2024-01-24 12:30:06,634] [INFO] Query marker FASTA was written to GCF_019038455.1_ASM1903845v1_genomic.fna/markers.fasta
[2024-01-24 12:30:06,635] [INFO] Task started: Blastn
[2024-01-24 12:30:06,635] [INFO] Running command: blastn -query GCF_019038455.1_ASM1903845v1_genomic.fna/markers.fasta -db /var/lib/cwl/stgaeb6a96d-b302-458f-9628-66f99ec729a0/dqc_reference/reference_markers.fasta -out GCF_019038455.1_ASM1903845v1_genomic.fna/blast.markers.tsv -outfmt 6 -max_hsps 1 -num_alignments 5
[2024-01-24 12:30:07,787] [INFO] Task succeeded: Blastn
[2024-01-24 12:30:07,792] [INFO] Selected 17 target genomes.
[2024-01-24 12:30:07,793] [INFO] Target genome list was writen to GCF_019038455.1_ASM1903845v1_genomic.fna/target_genomes.txt
[2024-01-24 12:30:07,809] [INFO] Task started: fastANI
[2024-01-24 12:30:07,809] [INFO] Running command: fastANI --query /var/lib/cwl/stg5530794d-9e7e-4c2a-a1d8-bee8a94406fb/GCF_019038455.1_ASM1903845v1_genomic.fna.gz --refList GCF_019038455.1_ASM1903845v1_genomic.fna/target_genomes.txt --output GCF_019038455.1_ASM1903845v1_genomic.fna/fastani_result.tsv --threads 1
[2024-01-24 12:30:38,558] [INFO] Task succeeded: fastANI
[2024-01-24 12:30:38,559] [INFO] Loading species specific ANI threshold from /var/lib/cwl/stgaeb6a96d-b302-458f-9628-66f99ec729a0/dqc_reference/prokaryote_ANI_species_specific_threshold.txt
[2024-01-24 12:30:38,559] [WARNING] Species-specific ANI threshold file not found. Will use the default threshold for all species. [/var/lib/cwl/stgaeb6a96d-b302-458f-9628-66f99ec729a0/dqc_reference/prokaryote_ANI_species_specific_threshold.txt]
[2024-01-24 12:30:38,572] [INFO] Found 17 fastANI hits (2 hits with ANI > threshold)
[2024-01-24 12:30:38,573] [INFO] The taxonomy check result is classified as 'inconclusive'.
[2024-01-24 12:30:38,573] [INFO] DFAST Taxonomy check final result
--------------------------------------------------------------------------------
organism_name	strain	accession	taxid	species_taxid	relation_to_type	validated	ani	matched_fragments	total_fragments	ani_threshold	status
Nocardia noduli	strain=ncl1	GCA_019038455.1	2815722	2815722	type	True	100.0	2815	2834	95	inconclusive
Nocardia aurea	strain=SYSU K10002	GCA_003123685.1	2144174	2144174	type	True	95.1538	2142	2834	95	inconclusive
Nocardia arizonensis	strain=NBRC 108935	GCA_001618405.1	1141647	1141647	type	True	82.7677	1385	2834	95	below_threshold
Nocardia arizonensis	strain=W9405	GCA_001310275.1	1141647	1141647	type	True	82.7458	1367	2834	95	below_threshold
Nocardia bovistercoris	strain=NEAU-351	GCA_015674855.1	2785916	2785916	type	True	82.6688	1538	2834	95	below_threshold
Nocardia takedensis	strain=NBRC 100417	GCA_000308695.1	259390	259390	type	True	82.4864	1492	2834	95	below_threshold
Nocardia lijiangensis	strain=NBRC 108240	GCA_001613045.1	299618	299618	type	True	81.7142	1291	2834	95	below_threshold
Nocardia amikacinitolerans	strain=DSM 45539	GCA_024172045.1	756689	756689	type	True	81.6919	1315	2834	95	below_threshold
Nocardia xishanensis	strain=NBRC 101358	GCA_001613365.1	238964	238964	type	True	81.6638	1266	2834	95	below_threshold
Nocardia amikacinitolerans	strain=NBRC 108937	GCA_001612615.1	756689	756689	type	True	81.6443	1302	2834	95	below_threshold
Nocardia bhagyanarayanae	strain=DSM 103495	GCA_006716565.1	1215925	1215925	type	True	81.5969	1327	2834	95	below_threshold
Nocardia suismassiliense	strain=S-137	GCA_900269665.1	2077092	2077092	type	True	80.9487	1234	2834	95	below_threshold
Nocardia vulneris	strain=W9851	GCA_000811985.1	1141657	1141657	type	True	80.924	1223	2834	95	below_threshold
Nocardia vulneris	strain=NBRC 108936	GCA_001613425.1	1141657	1141657	type	True	80.8926	1248	2834	95	below_threshold
Nocardia iowensis	strain=NRRL 5646	GCA_019222765.1	204891	204891	type	True	80.8521	1204	2834	95	below_threshold
Nocardia pseudovaccinii	strain=NBRC 100343	GCA_001613225.1	189540	189540	type	True	80.5947	1227	2834	95	below_threshold
Nocardia rhizosphaerihabitans	strain=CGMCC 4.7329	GCA_014646295.1	1691570	1691570	type	True	80.5092	1147	2834	95	below_threshold
--------------------------------------------------------------------------------
[2024-01-24 12:30:38,575] [INFO] DFAST Taxonomy check result was written to GCF_019038455.1_ASM1903845v1_genomic.fna/tc_result.tsv
[2024-01-24 12:30:38,576] [INFO] ===== Taxonomy check completed =====
[2024-01-24 12:30:38,576] [INFO] ===== Start completeness check using CheckM =====
[2024-01-24 12:30:38,577] [INFO] Setting CHECKM_DATA_PATH to /var/lib/cwl/stgaeb6a96d-b302-458f-9628-66f99ec729a0/dqc_reference/checkm_data
[2024-01-24 12:30:38,579] [INFO] Selected 'Prokaryote' markers (life, taxid=0) for CheckM
[2024-01-24 12:30:38,662] [INFO] Task started: CheckM
[2024-01-24 12:30:38,663] [INFO] Running command: checkm taxonomy_wf --tab_table -f GCF_019038455.1_ASM1903845v1_genomic.fna/cc_result.tsv -t 1 life "Prokaryote" GCF_019038455.1_ASM1903845v1_genomic.fna/checkm_input GCF_019038455.1_ASM1903845v1_genomic.fna/checkm_result
[2024-01-24 12:32:02,294] [INFO] Task succeeded: CheckM
[2024-01-24 12:32:02,295] [INFO] Completeness check finished.
--------------------------------------------------------------------------------
Completeness: 100.00%
Contamintation: 4.17%
Strain heterogeneity: 0.00%
--------------------------------------------------------------------------------
[2024-01-24 12:32:02,320] [INFO] ===== Completeness check finished =====
[2024-01-24 12:32:02,320] [INFO] ===== Start GTDB Search =====
[2024-01-24 12:32:02,321] [INFO] Query marker FASTA already exists. Will reuse it. (GCF_019038455.1_ASM1903845v1_genomic.fna/markers.fasta)
[2024-01-24 12:32:02,321] [INFO] Task started: Blastn
[2024-01-24 12:32:02,322] [INFO] Running command: blastn -query GCF_019038455.1_ASM1903845v1_genomic.fna/markers.fasta -db /var/lib/cwl/stgaeb6a96d-b302-458f-9628-66f99ec729a0/dqc_reference/reference_markers_gtdb.fasta -out GCF_019038455.1_ASM1903845v1_genomic.fna/blast.markers.gtdb.tsv -outfmt 6 -max_hsps 1 -num_alignments 5
[2024-01-24 12:32:03,999] [INFO] Task succeeded: Blastn
[2024-01-24 12:32:04,005] [INFO] Selected 20 target genomes.
[2024-01-24 12:32:04,005] [INFO] Target genome list was writen to GCF_019038455.1_ASM1903845v1_genomic.fna/target_genomes_gtdb.txt
[2024-01-24 12:32:04,030] [INFO] Task started: fastANI
[2024-01-24 12:32:04,030] [INFO] Running command: fastANI --query /var/lib/cwl/stg5530794d-9e7e-4c2a-a1d8-bee8a94406fb/GCF_019038455.1_ASM1903845v1_genomic.fna.gz --refList GCF_019038455.1_ASM1903845v1_genomic.fna/target_genomes_gtdb.txt --output GCF_019038455.1_ASM1903845v1_genomic.fna/fastani_result_gtdb.tsv --threads 1
[2024-01-24 12:32:35,560] [INFO] Task succeeded: fastANI
[2024-01-24 12:32:35,578] [INFO] Found 20 fastANI hits (1 hits with ANI > circumscription radius)
[2024-01-24 12:32:35,579] [INFO] GTDB search result
--------------------------------------------------------------------------------
accession	gtdb_species	ani	matched_fragments	total_fragments	gtdb_taxonomy	ani_circumscription_radius	mean_intra_species_ani	min_intra_species_ani	mean_intra_species_af	min_intra_species_af	num_clustered_genomes	status
GCF_003123685.1	s__Nocardia aurea	95.1476	2143	2834	d__Bacteria;p__Actinobacteriota;c__Actinomycetia;o__Mycobacteriales;f__Mycobacteriaceae;g__Nocardia	95.0	95.19	95.19	0.76	0.76	2	conclusive
GCF_001310275.1	s__Nocardia arizonensis	82.761	1365	2834	d__Bacteria;p__Actinobacteriota;c__Actinomycetia;o__Mycobacteriales;f__Mycobacteriaceae;g__Nocardia	95.0	99.98	99.98	0.99	0.99	2	-
GCF_015674855.1	s__Nocardia bovistercoris	82.6499	1541	2834	d__Bacteria;p__Actinobacteriota;c__Actinomycetia;o__Mycobacteriales;f__Mycobacteriaceae;g__Nocardia	95.0	N/A	N/A	N/A	N/A	1	-
GCF_000308695.1	s__Nocardia takedensis	82.5111	1487	2834	d__Bacteria;p__Actinobacteriota;c__Actinomycetia;o__Mycobacteriales;f__Mycobacteriaceae;g__Nocardia	95.0	N/A	N/A	N/A	N/A	1	-
GCF_001613045.1	s__Nocardia lijiangensis	81.7076	1292	2834	d__Bacteria;p__Actinobacteriota;c__Actinomycetia;o__Mycobacteriales;f__Mycobacteriaceae;g__Nocardia	95.0	N/A	N/A	N/A	N/A	1	-
GCF_001613365.1	s__Nocardia xishanensis	81.6529	1268	2834	d__Bacteria;p__Actinobacteriota;c__Actinomycetia;o__Mycobacteriales;f__Mycobacteriaceae;g__Nocardia	95.0	N/A	N/A	N/A	N/A	1	-
GCF_001612615.1	s__Nocardia amikacinitolerans	81.6255	1307	2834	d__Bacteria;p__Actinobacteriota;c__Actinomycetia;o__Mycobacteriales;f__Mycobacteriaceae;g__Nocardia	95.0	97.80	97.80	0.89	0.89	2	-
GCF_006716565.1	s__Nocardia bhagyanarayanae	81.5817	1330	2834	d__Bacteria;p__Actinobacteriota;c__Actinomycetia;o__Mycobacteriales;f__Mycobacteriaceae;g__Nocardia	95.0	N/A	N/A	N/A	N/A	1	-
GCF_900269665.1	s__Nocardia suismassiliense	80.9694	1230	2834	d__Bacteria;p__Actinobacteriota;c__Actinomycetia;o__Mycobacteriales;f__Mycobacteriaceae;g__Nocardia	95.0	95.20	95.20	0.87	0.87	2	-
GCF_000308475.2	s__Nocardia brasiliensis	80.9255	1206	2834	d__Bacteria;p__Actinobacteriota;c__Actinomycetia;o__Mycobacteriales;f__Mycobacteriaceae;g__Nocardia	95.8238	100.00	99.99	1.00	1.00	3	-
GCF_010858045.1	s__Nocardia cyriacigeorgica_E	80.9141	1050	2834	d__Bacteria;p__Actinobacteriota;c__Actinomycetia;o__Mycobacteriales;f__Mycobacteriaceae;g__Nocardia	95.0	98.94	98.91	0.92	0.91	4	-
GCF_000308715.1	s__Nocardia tenerifensis	80.8821	1133	2834	d__Bacteria;p__Actinobacteriota;c__Actinomycetia;o__Mycobacteriales;f__Mycobacteriaceae;g__Nocardia	95.0	99.99	99.99	1.00	1.00	2	-
GCF_001613425.1	s__Nocardia vulneris	80.8777	1251	2834	d__Bacteria;p__Actinobacteriota;c__Actinomycetia;o__Mycobacteriales;f__Mycobacteriaceae;g__Nocardia	95.8238	100.00	100.00	0.99	0.99	2	-
GCF_019222765.1	s__Nocardia iowensis	80.8575	1203	2834	d__Bacteria;p__Actinobacteriota;c__Actinomycetia;o__Mycobacteriales;f__Mycobacteriaceae;g__Nocardia	95.0	N/A	N/A	N/A	N/A	1	-
GCF_001613225.1	s__Nocardia pseudovaccinii	80.5923	1226	2834	d__Bacteria;p__Actinobacteriota;c__Actinomycetia;o__Mycobacteriales;f__Mycobacteriaceae;g__Nocardia	95.0	N/A	N/A	N/A	N/A	1	-
GCF_900637185.1	s__Nocardia asteroides	80.5667	1123	2834	d__Bacteria;p__Actinobacteriota;c__Actinomycetia;o__Mycobacteriales;f__Mycobacteriaceae;g__Nocardia	95.0	100.00	100.00	1.00	1.00	5	-
GCF_010868145.1	s__Nocardia cyriacigeorgica_D	80.5175	997	2834	d__Bacteria;p__Actinobacteriota;c__Actinomycetia;o__Mycobacteriales;f__Mycobacteriaceae;g__Nocardia	95.0	99.70	99.66	0.98	0.98	4	-
GCF_017168335.1	s__Rhodococcus_B sp017168335	77.4256	398	2834	d__Bacteria;p__Actinobacteriota;c__Actinomycetia;o__Mycobacteriales;f__Mycobacteriaceae;g__Rhodococcus_B	95.0	N/A	N/A	N/A	N/A	1	-
GCA_000620545.1	s__Terrabacter sp000620545	76.1999	197	2834	d__Bacteria;p__Actinobacteriota;c__Actinomycetia;o__Actinomycetales;f__Dermatophilaceae;g__Terrabacter	95.0	N/A	N/A	N/A	N/A	1	-
GCF_001046875.1	s__Tetrasphaera_A jenkinsii	76.0201	142	2834	d__Bacteria;p__Actinobacteriota;c__Actinomycetia;o__Actinomycetales;f__Dermatophilaceae;g__Tetrasphaera_A	95.0	N/A	N/A	N/A	N/A	1	-
--------------------------------------------------------------------------------
[2024-01-24 12:32:35,581] [INFO] GTDB search result was written to GCF_019038455.1_ASM1903845v1_genomic.fna/result_gtdb.tsv
[2024-01-24 12:32:35,581] [INFO] ===== GTDB Search completed =====
[2024-01-24 12:32:35,588] [INFO] DFAST_QC result json was written to GCF_019038455.1_ASM1903845v1_genomic.fna/dqc_result.json
[2024-01-24 12:32:35,588] [INFO] DFAST_QC completed!
[2024-01-24 12:32:35,588] [INFO] Total running time: 0h2m58s