[2023-03-15 17:24:01,306] [INFO] DFAST_QC pipeline started.
[2023-03-15 17:24:01,308] [INFO] DFAST_QC version: 0.5.7
[2023-03-15 17:24:01,308] [INFO] DQC Reference Directory: /var/lib/cwl/stgdffbd5a6-54a9-4458-9a84-bb9df26c35ef/dqc_reference
[2023-03-15 17:24:02,579] [INFO] ===== Start taxonomy check using ANI =====
[2023-03-15 17:24:02,579] [INFO] Task started: Prodigal
[2023-03-15 17:24:02,580] [INFO] Running command: cat /var/lib/cwl/stg64053398-e052-41af-854a-10048b76a8c5/OceanDNA-b2647.fa | prodigal -d OceanDNA-b2647/cds.fna -a OceanDNA-b2647/protein.faa -g 11 -q > /dev/null
[2023-03-15 17:24:23,096] [INFO] Task succeeded: Prodigal
[2023-03-15 17:24:23,097] [INFO] Task started: HMMsearch
[2023-03-15 17:24:23,097] [INFO] Running command: hmmsearch --tblout OceanDNA-b2647/hmmer_result.tsv -E 1E-50 /var/lib/cwl/stgdffbd5a6-54a9-4458-9a84-bb9df26c35ef/dqc_reference/reference_markers.hmm OceanDNA-b2647/protein.faa > /dev/null
[2023-03-15 17:24:23,311] [INFO] Task succeeded: HMMsearch
[2023-03-15 17:24:23,311] [INFO] Found 6/6 markers.
[2023-03-15 17:24:23,332] [INFO] Query marker FASTA was written to OceanDNA-b2647/markers.fasta
[2023-03-15 17:24:23,333] [INFO] Task started: Blastn
[2023-03-15 17:24:23,333] [INFO] Running command: blastn -query OceanDNA-b2647/markers.fasta -db /var/lib/cwl/stgdffbd5a6-54a9-4458-9a84-bb9df26c35ef/dqc_reference/reference_markers.fasta -out OceanDNA-b2647/blast.markers.tsv -outfmt 6 -max_hsps 1 -num_alignments 5
[2023-03-15 17:24:24,469] [INFO] Task succeeded: Blastn
[2023-03-15 17:24:24,470] [INFO] Selected 19 target genomes.
[2023-03-15 17:24:24,470] [INFO] Target genome list was writen to OceanDNA-b2647/target_genomes.txt
[2023-03-15 17:24:24,482] [INFO] Task started: fastANI
[2023-03-15 17:24:24,483] [INFO] Running command: fastANI --query /var/lib/cwl/stg64053398-e052-41af-854a-10048b76a8c5/OceanDNA-b2647.fa --refList OceanDNA-b2647/target_genomes.txt --output OceanDNA-b2647/fastani_result.tsv --threads 1
[2023-03-15 17:24:38,705] [INFO] Task succeeded: fastANI
[2023-03-15 17:24:38,705] [INFO] Loading species specific ANI threshold from /var/lib/cwl/stgdffbd5a6-54a9-4458-9a84-bb9df26c35ef/dqc_reference/prokaryote_ANI_species_specific_threshold.txt
[2023-03-15 17:24:38,705] [WARNING] Species-specific ANI threshold file not found. Will use the default threshold for all species. [/var/lib/cwl/stgdffbd5a6-54a9-4458-9a84-bb9df26c35ef/dqc_reference/prokaryote_ANI_species_specific_threshold.txt]
[2023-03-15 17:24:38,717] [INFO] Found 19 fastANI hits (1 hits with ANI > threshold)
[2023-03-15 17:24:38,717] [INFO] The taxonomy check result is classified as 'conclusive'.
[2023-03-15 17:24:38,717] [INFO] DFAST Taxonomy check final result
--------------------------------------------------------------------------------
organism_name	strain	accession	taxid	species_taxid	relation_to_type	validated	ani	matched_fragments	total_fragments	ani_threshold	status
Microbacterium luteum	strain=A18JL200	GCA_015277875.1	2782167	2782167	type	True	97.7006	937	1090	95	conclusive
Microbacterium thalassium	strain=DSM 12511	GCA_014208045.1	362649	362649	type	True	80.5188	551	1090	95	below_threshold
Microbacterium hibisci	strain=CCTCC AB 2016180	GCA_015278255.1	2036000	2036000	type	True	80.4657	531	1090	95	below_threshold
Microbacterium kyungheense	strain=DSM 105492	GCA_006783905.1	1263636	1263636	type	True	80.179	533	1090	95	below_threshold
Microbacterium helvum	strain=NEAU-LLC	GCA_014779795.1	2773713	2773713	type	True	80.155	558	1090	95	below_threshold
Microbacterium trichothecenolyticum	strain=DSM 8608	GCA_000956465.1	69370	69370	type	True	80.1203	554	1090	95	below_threshold
Microbacterium gallinarum	strain=Sa1CUA4	GCA_014837165.1	2762209	2762209	type	True	80.1192	512	1090	95	below_threshold
Microbacterium flavescens	strain=JCM 3877	GCA_018588945.1	69366	69366	type	True	80.1051	495	1090	95	below_threshold
Microbacterium atlanticum	strain=WY121	GCA_015277815.1	2782168	2782168	type	True	80.0929	548	1090	95	below_threshold
Microbacterium sulfonylureivorans	strain=LAM7116	GCA_003999995.1	2486854	2486854	type	True	80.0669	512	1090	95	below_threshold
Microbacterium immunditiarum	strain=DSM 24662	GCA_013409785.1	337480	337480	type	True	79.8828	497	1090	95	below_threshold
Microbacterium pullorum	strain=Sa4CUA7	GCA_014836535.1	2762236	2762236	type	True	79.8488	492	1090	95	below_threshold
Microbacterium yannicii	strain=DSM 23203	GCA_024055635.1	671622	671622	type	True	79.831	536	1090	95	below_threshold
Microbacterium hatanonis	strain=JCM14558	GCA_008017415.1	404366	404366	type	True	79.7695	490	1090	95	below_threshold
Microbacterium timonense	strain=Marseille-P5731	GCA_900292075.1	2086576	2086576	type	True	79.7384	515	1090	95	below_threshold
Microbacterium ginsengisoli	strain=DSM 18659	GCA_000956535.1	400772	400772	type	True	79.6778	455	1090	95	below_threshold
Microbacterium luticocti	strain=DSM 19459	GCA_000422405.1	451764	451764	type	True	79.3936	439	1090	95	below_threshold
Microbacterium profundi	strain=Shh49	GCA_000763375.1	450380	450380	type	True	79.3145	378	1090	95	below_threshold
Microbacterium resistens	strain=NBRC 103078	GCA_001552355.1	156977	156977	type	True	79.156	378	1090	95	below_threshold
--------------------------------------------------------------------------------
[2023-03-15 17:24:38,717] [INFO] DFAST Taxonomy check result was written to OceanDNA-b2647/tc_result.tsv
[2023-03-15 17:24:38,717] [INFO] ===== Taxonomy check completed =====
[2023-03-15 17:24:38,717] [INFO] ===== Start completeness check using CheckM =====
[2023-03-15 17:24:38,718] [INFO] Setting CHECKM_DATA_PATH to /var/lib/cwl/stgdffbd5a6-54a9-4458-9a84-bb9df26c35ef/dqc_reference/checkm_data
[2023-03-15 17:24:38,718] [INFO] Selected 'Prokaryote' markers (life, taxid=0) for CheckM
[2023-03-15 17:24:38,723] [INFO] Task started: CheckM
[2023-03-15 17:24:38,723] [INFO] Running command: checkm taxonomy_wf --tab_table -f OceanDNA-b2647/cc_result.tsv -t 1 life "Prokaryote" OceanDNA-b2647/checkm_input OceanDNA-b2647/checkm_result
[2023-03-15 17:25:39,603] [INFO] Task succeeded: CheckM
[2023-03-15 17:25:39,603] [INFO] Completeness check finished.
--------------------------------------------------------------------------------
Completeness: 100.00%
Contamintation: 0.52%
Strain heterogeneity: 100.00%
--------------------------------------------------------------------------------
[2023-03-15 17:25:39,606] [INFO] ===== Completeness check finished =====
[2023-03-15 17:25:39,606] [INFO] ===== Start GTDB Search =====
[2023-03-15 17:25:39,606] [INFO] Query marker FASTA already exists. Will reuse it. (OceanDNA-b2647/markers.fasta)
[2023-03-15 17:25:39,607] [INFO] Task started: Blastn
[2023-03-15 17:25:39,607] [INFO] Running command: blastn -query OceanDNA-b2647/markers.fasta -db /var/lib/cwl/stgdffbd5a6-54a9-4458-9a84-bb9df26c35ef/dqc_reference/reference_markers_gtdb.fasta -out OceanDNA-b2647/blast.markers.gtdb.tsv -outfmt 6 -max_hsps 1 -num_alignments 5
[2023-03-15 17:25:41,270] [INFO] Task succeeded: Blastn
[2023-03-15 17:25:41,271] [INFO] Selected 21 target genomes.
[2023-03-15 17:25:41,271] [INFO] Target genome list was writen to OceanDNA-b2647/target_genomes_gtdb.txt
[2023-03-15 17:25:41,296] [INFO] Task started: fastANI
[2023-03-15 17:25:41,296] [INFO] Running command: fastANI --query /var/lib/cwl/stg64053398-e052-41af-854a-10048b76a8c5/OceanDNA-b2647.fa --refList OceanDNA-b2647/target_genomes_gtdb.txt --output OceanDNA-b2647/fastani_result_gtdb.tsv --threads 1
[2023-03-15 17:25:56,285] [INFO] Task succeeded: fastANI
[2023-03-15 17:25:56,296] [INFO] Found 21 fastANI hits (1 hits with ANI > circumscription radius)
[2023-03-15 17:25:56,297] [INFO] GTDB search result
--------------------------------------------------------------------------------
accession	gtdb_species	ani	matched_fragments	total_fragments	gtdb_taxonomy	ani_circumscription_radius	mean_intra_species_ani	min_intra_species_ani	mean_intra_species_af	min_intra_species_af	num_clustered_genomes	status
GCF_015277875.1	s__Microbacterium sp900098805	97.6783	938	1090	d__Bacteria;p__Actinobacteriota;c__Actinomycetia;o__Actinomycetales;f__Microbacteriaceae;g__Microbacterium	95.0	97.54	97.00	0.83	0.71	9	conclusive
GCA_002703245.1	s__Microbacterium sp002703245	83.0611	601	1090	d__Bacteria;p__Actinobacteriota;c__Actinomycetia;o__Actinomycetales;f__Microbacteriaceae;g__Microbacterium	95.0	N/A	N/A	N/A	N/A	1	-
GCF_015278255.1	s__Microbacterium hibisci	80.4767	530	1090	d__Bacteria;p__Actinobacteriota;c__Actinomycetia;o__Actinomycetales;f__Microbacteriaceae;g__Microbacterium	95.0	N/A	N/A	N/A	N/A	1	-
GCF_000380605.1	s__Microbacterium sp000380605	80.222	538	1090	d__Bacteria;p__Actinobacteriota;c__Actinomycetia;o__Actinomycetales;f__Microbacteriaceae;g__Microbacterium	95.0	N/A	N/A	N/A	N/A	1	-
GCF_014779795.1	s__Microbacterium helvum	80.1856	555	1090	d__Bacteria;p__Actinobacteriota;c__Actinomycetia;o__Actinomycetales;f__Microbacteriaceae;g__Microbacterium	95.0	N/A	N/A	N/A	N/A	1	-
GCF_006783905.1	s__Microbacterium kyungheense	80.1792	533	1090	d__Bacteria;p__Actinobacteriota;c__Actinomycetia;o__Actinomycetales;f__Microbacteriaceae;g__Microbacterium	95.0	N/A	N/A	N/A	N/A	1	-
GCF_000956465.1	s__Microbacterium trichothecenolyticum	80.1716	549	1090	d__Bacteria;p__Actinobacteriota;c__Actinomycetia;o__Actinomycetales;f__Microbacteriaceae;g__Microbacterium	95.0	N/A	N/A	N/A	N/A	1	-
GCF_014837165.1	s__Microbacterium sp014837165	80.1566	509	1090	d__Bacteria;p__Actinobacteriota;c__Actinomycetia;o__Actinomycetales;f__Microbacteriaceae;g__Microbacterium	95.0	N/A	N/A	N/A	N/A	1	-
GCF_018588945.1	s__Microbacterium flavescens	80.1091	495	1090	d__Bacteria;p__Actinobacteriota;c__Actinomycetia;o__Actinomycetales;f__Microbacteriaceae;g__Microbacterium	95.0	N/A	N/A	N/A	N/A	1	-
GCF_017831975.1	s__Microbacterium terrae	80.0963	559	1090	d__Bacteria;p__Actinobacteriota;c__Actinomycetia;o__Actinomycetales;f__Microbacteriaceae;g__Microbacterium	95.0	100.00	100.00	0.99	0.99	2	-
GCF_003999995.1	s__Microbacterium sp003999995	80.0893	510	1090	d__Bacteria;p__Actinobacteriota;c__Actinomycetia;o__Actinomycetales;f__Microbacteriaceae;g__Microbacterium	95.0	N/A	N/A	N/A	N/A	1	-
GCF_009858275.1	s__Microbacterium sp009858275	80.0545	546	1090	d__Bacteria;p__Actinobacteriota;c__Actinomycetia;o__Actinomycetales;f__Microbacteriaceae;g__Microbacterium	95.0	N/A	N/A	N/A	N/A	1	-
GCF_012847295.1	s__Microbacterium sp012847295	80.0385	496	1090	d__Bacteria;p__Actinobacteriota;c__Actinomycetia;o__Actinomycetales;f__Microbacteriaceae;g__Microbacterium	95.0	100.00	100.00	1.00	1.00	2	-
GCF_900155915.1	s__Microbacterium sp900155915	79.931	486	1090	d__Bacteria;p__Actinobacteriota;c__Actinomycetia;o__Actinomycetales;f__Microbacteriaceae;g__Microbacterium	95.0	N/A	N/A	N/A	N/A	1	-
GCF_014836535.1	s__Microbacterium sp014836535	79.8379	493	1090	d__Bacteria;p__Actinobacteriota;c__Actinomycetia;o__Actinomycetales;f__Microbacteriaceae;g__Microbacterium	95.0	N/A	N/A	N/A	N/A	1	-
GCF_008017415.1	s__Microbacterium hatanonis	79.7587	491	1090	d__Bacteria;p__Actinobacteriota;c__Actinomycetia;o__Actinomycetales;f__Microbacteriaceae;g__Microbacterium	95.0	N/A	N/A	N/A	N/A	1	-
GCF_000799385.1	s__Microbacterium sp000799385	79.7501	482	1090	d__Bacteria;p__Actinobacteriota;c__Actinomycetia;o__Actinomycetales;f__Microbacteriaceae;g__Microbacterium	95.0	N/A	N/A	N/A	N/A	1	-
GCF_902706225.1	s__Microbacterium sp902706225	79.7372	471	1090	d__Bacteria;p__Actinobacteriota;c__Actinomycetia;o__Actinomycetales;f__Microbacteriaceae;g__Microbacterium	95.0	N/A	N/A	N/A	N/A	1	-
GCF_900292075.1	s__Microbacterium timonense	79.7339	516	1090	d__Bacteria;p__Actinobacteriota;c__Actinomycetia;o__Actinomycetales;f__Microbacteriaceae;g__Microbacterium	95.0	N/A	N/A	N/A	N/A	1	-
GCA_016870695.1	s__Microbacterium sp016870695	79.5783	347	1090	d__Bacteria;p__Actinobacteriota;c__Actinomycetia;o__Actinomycetales;f__Microbacteriaceae;g__Microbacterium	95.0	N/A	N/A	N/A	N/A	1	-
GCF_001552355.1	s__Microbacterium resistens	79.1365	379	1090	d__Bacteria;p__Actinobacteriota;c__Actinomycetia;o__Actinomycetales;f__Microbacteriaceae;g__Microbacterium	95.0	N/A	N/A	N/A	N/A	1	-
--------------------------------------------------------------------------------
[2023-03-15 17:25:56,297] [INFO] GTDB search result was written to OceanDNA-b2647/result_gtdb.tsv
[2023-03-15 17:25:56,297] [INFO] ===== GTDB Search completed =====
[2023-03-15 17:25:56,299] [INFO] DFAST_QC result json was written to OceanDNA-b2647/dqc_result.json
[2023-03-15 17:25:56,299] [INFO] DFAST_QC completed!
[2023-03-15 17:25:56,299] [INFO] Total running time: 0h1m55s