[2023-06-17 01:44:47,978] [INFO] DFAST_QC pipeline started.
[2023-06-17 01:44:47,982] [INFO] DFAST_QC version: 0.5.7
[2023-06-17 01:44:47,983] [INFO] DQC Reference Directory: /var/lib/cwl/stga9588ca4-2b43-475c-8246-faaa29ab6f85/dqc_reference
[2023-06-17 01:44:49,896] [INFO] ===== Start taxonomy check using ANI =====
[2023-06-17 01:44:49,897] [INFO] Task started: Prodigal
[2023-06-17 01:44:49,897] [INFO] Running command: gunzip -c /var/lib/cwl/stg78508d8a-fb80-4f70-a521-fab30172e8c7/GCA_028287355.1_ASM2828735v1_genomic.fna.gz | prodigal -d GCA_028287355.1_ASM2828735v1_genomic.fna/cds.fna -a GCA_028287355.1_ASM2828735v1_genomic.fna/protein.faa -g 11 -q > /dev/null
[2023-06-17 01:44:57,030] [INFO] Task succeeded: Prodigal
[2023-06-17 01:44:57,030] [INFO] Task started: HMMsearch
[2023-06-17 01:44:57,030] [INFO] Running command: hmmsearch --tblout GCA_028287355.1_ASM2828735v1_genomic.fna/hmmer_result.tsv -E 1E-50 /var/lib/cwl/stga9588ca4-2b43-475c-8246-faaa29ab6f85/dqc_reference/reference_markers.hmm GCA_028287355.1_ASM2828735v1_genomic.fna/protein.faa > /dev/null
[2023-06-17 01:44:57,254] [INFO] Task succeeded: HMMsearch
[2023-06-17 01:44:57,255] [WARNING] Found 5/6 markers. [/var/lib/cwl/stg78508d8a-fb80-4f70-a521-fab30172e8c7/GCA_028287355.1_ASM2828735v1_genomic.fna.gz]
[2023-06-17 01:44:57,286] [INFO] Query marker FASTA was written to GCA_028287355.1_ASM2828735v1_genomic.fna/markers.fasta
[2023-06-17 01:44:57,287] [INFO] Task started: Blastn
[2023-06-17 01:44:57,287] [INFO] Running command: blastn -query GCA_028287355.1_ASM2828735v1_genomic.fna/markers.fasta -db /var/lib/cwl/stga9588ca4-2b43-475c-8246-faaa29ab6f85/dqc_reference/reference_markers.fasta -out GCA_028287355.1_ASM2828735v1_genomic.fna/blast.markers.tsv -outfmt 6 -max_hsps 1 -num_alignments 5
[2023-06-17 01:44:58,020] [INFO] Task succeeded: Blastn
[2023-06-17 01:44:58,025] [INFO] Selected 15 target genomes.
[2023-06-17 01:44:58,026] [INFO] Target genome list was writen to GCA_028287355.1_ASM2828735v1_genomic.fna/target_genomes.txt
[2023-06-17 01:44:58,048] [INFO] Task started: fastANI
[2023-06-17 01:44:58,049] [INFO] Running command: fastANI --query /var/lib/cwl/stg78508d8a-fb80-4f70-a521-fab30172e8c7/GCA_028287355.1_ASM2828735v1_genomic.fna.gz --refList GCA_028287355.1_ASM2828735v1_genomic.fna/target_genomes.txt --output GCA_028287355.1_ASM2828735v1_genomic.fna/fastani_result.tsv --threads 1
[2023-06-17 01:45:10,125] [INFO] Task succeeded: fastANI
[2023-06-17 01:45:10,125] [INFO] Loading species specific ANI threshold from /var/lib/cwl/stga9588ca4-2b43-475c-8246-faaa29ab6f85/dqc_reference/prokaryote_ANI_species_specific_threshold.txt
[2023-06-17 01:45:10,126] [WARNING] Species-specific ANI threshold file not found. Will use the default threshold for all species. [/var/lib/cwl/stga9588ca4-2b43-475c-8246-faaa29ab6f85/dqc_reference/prokaryote_ANI_species_specific_threshold.txt]
[2023-06-17 01:45:10,136] [INFO] Found 11 fastANI hits (0 hits with ANI > threshold)
[2023-06-17 01:45:10,136] [INFO] The taxonomy check result is classified as 'below_threshold'.
[2023-06-17 01:45:10,136] [INFO] DFAST Taxonomy check final result
--------------------------------------------------------------------------------
organism_name	strain	accession	taxid	species_taxid	relation_to_type	validated	ani	matched_fragments	total_fragments	ani_threshold	status
Gemmatirosa kalamazoonensis	strain=KBS708	GCA_000522985.1	861299	861299	type	True	77.7515	397	800	95	below_threshold
Gemmatimonas groenlandica	strain=TET16	GCA_013004105.1	2732249	2732249	type	True	76.387	132	800	95	below_threshold
Gemmatimonas aurantiaca	strain=T-27	GCA_000010305.1	173480	173480	type	True	76.2567	99	800	95	below_threshold
Gemmatimonas phototrophica	strain=AP64	GCA_000695095.2	1379270	1379270	type	True	76.0081	93	800	95	below_threshold
Frateuria terrea	strain=CGMCC 1.7053	GCA_900115705.1	529704	529704	type	True	75.0922	59	800	95	below_threshold
Azospirillum rugosum	strain=IMMIB AFH-6	GCA_017876155.1	416170	416170	type	True	74.8922	117	800	95	below_threshold
Agromyces kandeliae	strain=Q22	GCA_009674665.1	2666141	2666141	type	True	74.8882	94	800	95	below_threshold
Kribbella sindirgiensis	strain=DSM 27082	GCA_004331435.1	1124744	1124744	type	True	74.8094	87	800	95	below_threshold
Cellulosimicrobium funkei	strain=NBRC 104118	GCA_001570825.1	264251	264251	suspected-type	True	74.775	135	800	95	below_threshold
Mangrovactinospora gilvigriseus	strain=MUSC 26	GCA_001879105.1	1428644	1428644	type	True	74.7647	113	800	95	below_threshold
Thioalbus denitrificans	strain=DSM 26407	GCA_003337735.1	547122	547122	type	True	74.7644	64	800	95	below_threshold
--------------------------------------------------------------------------------
[2023-06-17 01:45:10,139] [INFO] DFAST Taxonomy check result was written to GCA_028287355.1_ASM2828735v1_genomic.fna/tc_result.tsv
[2023-06-17 01:45:10,139] [INFO] ===== Taxonomy check completed =====
[2023-06-17 01:45:10,140] [INFO] ===== Start completeness check using CheckM =====
[2023-06-17 01:45:10,140] [INFO] Setting CHECKM_DATA_PATH to /var/lib/cwl/stga9588ca4-2b43-475c-8246-faaa29ab6f85/dqc_reference/checkm_data
[2023-06-17 01:45:10,142] [INFO] Selected 'Prokaryote' markers (life, taxid=0) for CheckM
[2023-06-17 01:45:10,173] [INFO] Task started: CheckM
[2023-06-17 01:45:10,174] [INFO] Running command: checkm taxonomy_wf --tab_table -f GCA_028287355.1_ASM2828735v1_genomic.fna/cc_result.tsv -t 1 life "Prokaryote" GCA_028287355.1_ASM2828735v1_genomic.fna/checkm_input GCA_028287355.1_ASM2828735v1_genomic.fna/checkm_result
[2023-06-17 01:45:37,740] [INFO] Task succeeded: CheckM
[2023-06-17 01:45:37,746] [INFO] Completeness check finished.
--------------------------------------------------------------------------------
Completeness: 67.71%
Contamintation: 0.00%
Strain heterogeneity: 0.00%
--------------------------------------------------------------------------------
[2023-06-17 01:45:37,768] [INFO] ===== Completeness check finished =====
[2023-06-17 01:45:37,769] [INFO] ===== Start GTDB Search =====
[2023-06-17 01:45:37,769] [INFO] Query marker FASTA already exists. Will reuse it. (GCA_028287355.1_ASM2828735v1_genomic.fna/markers.fasta)
[2023-06-17 01:45:37,770] [INFO] Task started: Blastn
[2023-06-17 01:45:37,770] [INFO] Running command: blastn -query GCA_028287355.1_ASM2828735v1_genomic.fna/markers.fasta -db /var/lib/cwl/stga9588ca4-2b43-475c-8246-faaa29ab6f85/dqc_reference/reference_markers_gtdb.fasta -out GCA_028287355.1_ASM2828735v1_genomic.fna/blast.markers.gtdb.tsv -outfmt 6 -max_hsps 1 -num_alignments 5
[2023-06-17 01:45:38,868] [INFO] Task succeeded: Blastn
[2023-06-17 01:45:38,872] [INFO] Selected 17 target genomes.
[2023-06-17 01:45:38,872] [INFO] Target genome list was writen to GCA_028287355.1_ASM2828735v1_genomic.fna/target_genomes_gtdb.txt
[2023-06-17 01:45:38,879] [INFO] Task started: fastANI
[2023-06-17 01:45:38,879] [INFO] Running command: fastANI --query /var/lib/cwl/stg78508d8a-fb80-4f70-a521-fab30172e8c7/GCA_028287355.1_ASM2828735v1_genomic.fna.gz --refList GCA_028287355.1_ASM2828735v1_genomic.fna/target_genomes_gtdb.txt --output GCA_028287355.1_ASM2828735v1_genomic.fna/fastani_result_gtdb.tsv --threads 1
[2023-06-17 01:45:49,278] [INFO] Task succeeded: fastANI
[2023-06-17 01:45:49,297] [INFO] Found 16 fastANI hits (0 hits with ANI > circumscription radius)
[2023-06-17 01:45:49,297] [INFO] GTDB search result
--------------------------------------------------------------------------------
accession	gtdb_species	ani	matched_fragments	total_fragments	gtdb_taxonomy	ani_circumscription_radius	mean_intra_species_ani	min_intra_species_ani	mean_intra_species_af	min_intra_species_af	num_clustered_genomes	status
GCA_013361335.1	s__AG11 sp013361335	80.5533	453	800	d__Bacteria;p__Gemmatimonadota;c__Gemmatimonadetes;o__Gemmatimonadales;f__Gemmatimonadaceae;g__AG11	95.0	99.60	99.48	0.90	0.83	7	-
GCA_003223455.1	s__AG11 sp003223455	79.1401	400	800	d__Bacteria;p__Gemmatimonadota;c__Gemmatimonadetes;o__Gemmatimonadales;f__Gemmatimonadaceae;g__AG11	95.0	N/A	N/A	N/A	N/A	1	-
GCA_014378185.1	s__AG11 sp014378185	78.6024	231	800	d__Bacteria;p__Gemmatimonadota;c__Gemmatimonadetes;o__Gemmatimonadales;f__Gemmatimonadaceae;g__AG11	95.0	N/A	N/A	N/A	N/A	1	-
GCA_002215645.1	s__AG11 sp002215645	78.4877	270	800	d__Bacteria;p__Gemmatimonadota;c__Gemmatimonadetes;o__Gemmatimonadales;f__Gemmatimonadaceae;g__AG11	95.0	N/A	N/A	N/A	N/A	1	-
GCA_013361745.1	s__JABFXC01 sp013361745	78.2868	185	800	d__Bacteria;p__Gemmatimonadota;c__Gemmatimonadetes;o__Gemmatimonadales;f__Gemmatimonadaceae;g__JABFXC01	95.0	N/A	N/A	N/A	N/A	1	-
GCA_013361935.1	s__JABFXC01 sp013361935	78.2543	299	800	d__Bacteria;p__Gemmatimonadota;c__Gemmatimonadetes;o__Gemmatimonadales;f__Gemmatimonadaceae;g__JABFXC01	95.0	99.88	99.88	0.94	0.92	3	-
GCF_000522985.1	s__Gemmatirosa kalamazoonesis	77.7368	398	800	d__Bacteria;p__Gemmatimonadota;c__Gemmatimonadetes;o__Gemmatimonadales;f__Gemmatimonadaceae;g__Gemmatirosa	95.0	N/A	N/A	N/A	N/A	1	-
GCA_011390885.1	s__JAABRT01 sp011390885	77.4006	237	800	d__Bacteria;p__Gemmatimonadota;c__Gemmatimonadetes;o__Gemmatimonadales;f__Gemmatimonadaceae;g__JAABRT01	95.0	N/A	N/A	N/A	N/A	1	-
GCA_001724275.1	s__SCN-70-22 sp001724275	77.2092	231	800	d__Bacteria;p__Gemmatimonadota;c__Gemmatimonadetes;o__Gemmatimonadales;f__Gemmatimonadaceae;g__SCN-70-22	95.0	N/A	N/A	N/A	N/A	1	-
GCA_013360595.1	s__JABWBJ01 sp013360595	77.1446	113	800	d__Bacteria;p__Gemmatimonadota;c__Gemmatimonadetes;o__Gemmatimonadales;f__Gemmatimonadaceae;g__JABWBJ01	95.0	N/A	N/A	N/A	N/A	1	-
GCA_016794735.1	s__SCN-70-22 sp016794735	76.865	128	800	d__Bacteria;p__Gemmatimonadota;c__Gemmatimonadetes;o__Gemmatimonadales;f__Gemmatimonadaceae;g__SCN-70-22	95.0	N/A	N/A	N/A	N/A	1	-
GCA_002483225.1	s__Gemmatimonas sp002483225	76.789	123	800	d__Bacteria;p__Gemmatimonadota;c__Gemmatimonadetes;o__Gemmatimonadales;f__Gemmatimonadaceae;g__Gemmatimonas	95.0	97.70	97.70	0.96	0.96	2	-
GCA_013697785.1	s__FEN-1250 sp013697785	76.684	144	800	d__Bacteria;p__Gemmatimonadota;c__Gemmatimonadetes;o__Gemmatimonadales;f__Gemmatimonadaceae;g__FEN-1250	95.0	N/A	N/A	N/A	N/A	1	-
GCA_003222205.1	s__40CM-2-70-7 sp003222205	76.0935	99	800	d__Bacteria;p__Gemmatimonadota;c__Gemmatimonadetes;o__Gemmatimonadales;f__GWC2-71-9;g__40CM-2-70-7	95.0	97.54	95.75	0.87	0.83	8	-
GCA_003221045.1	s__AG41 sp003221045	75.718	67	800	d__Bacteria;p__Gemmatimonadota;c__Gemmatimonadetes;o__Gemmatimonadales;f__GWC2-71-9;g__AG41	95.0	98.20	98.20	0.96	0.96	2	-
GCF_009674665.1	s__Agromyces kandeliae	74.8837	95	800	d__Bacteria;p__Actinobacteriota;c__Actinomycetia;o__Actinomycetales;f__Microbacteriaceae;g__Agromyces	95.0	N/A	N/A	N/A	N/A	1	-
--------------------------------------------------------------------------------
[2023-06-17 01:45:49,299] [INFO] GTDB search result was written to GCA_028287355.1_ASM2828735v1_genomic.fna/result_gtdb.tsv
[2023-06-17 01:45:49,299] [INFO] ===== GTDB Search completed =====
[2023-06-17 01:45:49,303] [INFO] DFAST_QC result json was written to GCA_028287355.1_ASM2828735v1_genomic.fna/dqc_result.json
[2023-06-17 01:45:49,303] [INFO] DFAST_QC completed!
[2023-06-17 01:45:49,304] [INFO] Total running time: 0h1m1s