[2023-06-30 10:57:35,550] [INFO] DFAST_QC pipeline started.
[2023-06-30 10:57:35,552] [INFO] DFAST_QC version: 0.5.7
[2023-06-30 10:57:35,553] [INFO] DQC Reference Directory: /var/lib/cwl/stg3ac2e57a-5701-4c9c-97ab-70ca36c9bd08/dqc_reference
[2023-06-30 10:57:36,910] [INFO] ===== Start taxonomy check using ANI =====
[2023-06-30 10:57:36,911] [INFO] Task started: Prodigal
[2023-06-30 10:57:36,911] [INFO] Running command: gunzip -c /var/lib/cwl/stg4a28c35c-cc2a-4b6a-acef-126219cabbc5/GCA_903845895.1_freshwater_MAG_---_MJ130617B_bin-284_genomic.fna.gz | prodigal -d GCA_903845895.1_freshwater_MAG_---_MJ130617B_bin-284_genomic.fna/cds.fna -a GCA_903845895.1_freshwater_MAG_---_MJ130617B_bin-284_genomic.fna/protein.faa -g 11 -q > /dev/null
[2023-06-30 10:57:42,458] [INFO] Task succeeded: Prodigal
[2023-06-30 10:57:42,459] [INFO] Task started: HMMsearch
[2023-06-30 10:57:42,459] [INFO] Running command: hmmsearch --tblout GCA_903845895.1_freshwater_MAG_---_MJ130617B_bin-284_genomic.fna/hmmer_result.tsv -E 1E-50 /var/lib/cwl/stg3ac2e57a-5701-4c9c-97ab-70ca36c9bd08/dqc_reference/reference_markers.hmm GCA_903845895.1_freshwater_MAG_---_MJ130617B_bin-284_genomic.fna/protein.faa > /dev/null
[2023-06-30 10:57:42,749] [INFO] Task succeeded: HMMsearch
[2023-06-30 10:57:42,751] [WARNING] Found 3/6 markers.
[/var/lib/cwl/stg4a28c35c-cc2a-4b6a-acef-126219cabbc5/GCA_903845895.1_freshwater_MAG_---_MJ130617B_bin-284_genomic.fna.gz]
[2023-06-30 10:57:42,791] [INFO] Query marker FASTA was written to GCA_903845895.1_freshwater_MAG_---_MJ130617B_bin-284_genomic.fna/markers.fasta
[2023-06-30 10:57:42,791] [INFO] Task started: Blastn
[2023-06-30 10:57:42,791] [INFO] Running command: blastn -query GCA_903845895.1_freshwater_MAG_---_MJ130617B_bin-284_genomic.fna/markers.fasta -db /var/lib/cwl/stg3ac2e57a-5701-4c9c-97ab-70ca36c9bd08/dqc_reference/reference_markers.fasta -out GCA_903845895.1_freshwater_MAG_---_MJ130617B_bin-284_genomic.fna/blast.markers.tsv -outfmt 6 -max_hsps 1 -num_alignments 5
[2023-06-30 10:57:43,472] [INFO] Task succeeded: Blastn
[2023-06-30 10:57:43,476] [INFO] Selected 17 target genomes.
[2023-06-30 10:57:43,476] [INFO] Target genome list was writen to GCA_903845895.1_freshwater_MAG_---_MJ130617B_bin-284_genomic.fna/target_genomes.txt
[2023-06-30 10:57:43,480] [INFO] Task started: fastANI
[2023-06-30 10:57:43,480] [INFO] Running command: fastANI --query /var/lib/cwl/stg4a28c35c-cc2a-4b6a-acef-126219cabbc5/GCA_903845895.1_freshwater_MAG_---_MJ130617B_bin-284_genomic.fna.gz --refList GCA_903845895.1_freshwater_MAG_---_MJ130617B_bin-284_genomic.fna/target_genomes.txt --output GCA_903845895.1_freshwater_MAG_---_MJ130617B_bin-284_genomic.fna/fastani_result.tsv --threads 1
[2023-06-30 10:57:57,124] [INFO] Task succeeded: fastANI
[2023-06-30 10:57:57,124] [INFO] Loading species specific ANI threshold from /var/lib/cwl/stg3ac2e57a-5701-4c9c-97ab-70ca36c9bd08/dqc_reference/prokaryote_ANI_species_specific_threshold.txt
[2023-06-30 10:57:57,124] [WARNING] Species-specific ANI threshold file not found. Will use the default threshold for all species.
[/var/lib/cwl/stg3ac2e57a-5701-4c9c-97ab-70ca36c9bd08/dqc_reference/prokaryote_ANI_species_specific_threshold.txt]
[2023-06-30 10:57:57,137] [INFO] Found 13 fastANI hits (0 hits with ANI > threshold)
[2023-06-30 10:57:57,137] [INFO] The taxonomy check result is classified as 'below_threshold'.
[2023-06-30 10:57:57,137] [INFO] DFAST Taxonomy check final result
--------------------------------------------------------------------------------
organism_name	strain	accession	taxid	species_taxid	relation_to_type	validated	ani	matched_fragments	total_fragments	ani_threshold	status
Rhodocyclus purpureus	strain=DSM 168	GCA_016653115.1	1067	1067	type	True	79.1951	75	279	95	below_threshold
Propionivibrio dicarboxylicus	strain=DSM 5885	GCA_900099695.1	83767	83767	type	True	78.8625	89	279	95	below_threshold
Rhodocyclus tenuis	strain=2761	GCA_014197755.1	1066	1066	type	True	78.5111	95	279	95	below_threshold
Rhodocyclus tenuis	strain=2761	GCA_009469755.1	1066	1066	type	True	78.4507	96	279	95	below_threshold
Rhodocyclus gracilis	strain=DSM 110	GCA_009617575.1	2929842	2929842	type	True	78.1195	67	279	95	below_threshold
Azonexus hydrophilus	strain=DSM 23864	GCA_000429605.1	418702	418702	type	True	78.1059	66	279	95	below_threshold
Dechloromonas denitrificans	strain=ATCC BAA-841	GCA_001551835.1	281362	281362	type	True	77.5398	66	279	95	below_threshold
Rubrivivax benzoatilyticus	strain=JA2	GCA_000420125.1	316997	316997	type	True	76.3603	50	279	95	below_threshold
Rubrivivax benzoatilyticus	strain=JA2	GCA_000190375.2	316997	316997	type	True	76.2273	51	279	95	below_threshold
Burkholderia perseverans	strain=INN12	GCA_022870505.1	2615214	2615214	type	True	76.2212	59	279	95	below_threshold
Burkholderia ubonensis	strain=LMG 20358	GCA_902833085.1	101571	101571	type	True	76.0784	59	279	95	below_threshold
Burkholderia ubonensis		GCA_902499185.1	101571	101571	type	True	76.0597	57	279	95	below_threshold
Burkholderia anthina	strain=DSM 16086	GCA_016836725.1	179879	179879	suspected-type	True	75.727	56	279	95	below_threshold
--------------------------------------------------------------------------------
[2023-06-30 10:57:57,141] [INFO] DFAST Taxonomy check result was written to GCA_903845895.1_freshwater_MAG_---_MJ130617B_bin-284_genomic.fna/tc_result.tsv
[2023-06-30 10:57:57,142] [INFO] ===== Taxonomy check completed =====
[2023-06-30 10:57:57,142] [INFO] ===== Start completeness check using CheckM =====
[2023-06-30 10:57:57,143] [INFO] Setting CHECKM_DATA_PATH to /var/lib/cwl/stg3ac2e57a-5701-4c9c-97ab-70ca36c9bd08/dqc_reference/checkm_data
[2023-06-30 10:57:57,144] [INFO] Selected 'Prokaryote' markers (life, taxid=0) for CheckM
[2023-06-30 10:57:57,177] [INFO] Task started: CheckM
[2023-06-30 10:57:57,178] [INFO] Running command: checkm taxonomy_wf --tab_table -f GCA_903845895.1_freshwater_MAG_---_MJ130617B_bin-284_genomic.fna/cc_result.tsv -t 1 life "Prokaryote" GCA_903845895.1_freshwater_MAG_---_MJ130617B_bin-284_genomic.fna/checkm_input GCA_903845895.1_freshwater_MAG_---_MJ130617B_bin-284_genomic.fna/checkm_result
[2023-06-30 10:58:20,939] [INFO] Task succeeded: CheckM
[2023-06-30 10:58:20,941] [INFO] Completeness check finished.
--------------------------------------------------------------------------------
Completeness: 48.06%
Contamintation: 9.72%
Strain heterogeneity: 16.67%
--------------------------------------------------------------------------------
[2023-06-30 10:58:20,974] [INFO] ===== Completeness check finished =====
[2023-06-30 10:58:20,974] [INFO] ===== Start GTDB Search =====
[2023-06-30 10:58:20,975] [INFO] Query marker FASTA already exists. Will reuse it.
(GCA_903845895.1_freshwater_MAG_---_MJ130617B_bin-284_genomic.fna/markers.fasta)
[2023-06-30 10:58:20,975] [INFO] Task started: Blastn
[2023-06-30 10:58:20,975] [INFO] Running command: blastn -query GCA_903845895.1_freshwater_MAG_---_MJ130617B_bin-284_genomic.fna/markers.fasta -db /var/lib/cwl/stg3ac2e57a-5701-4c9c-97ab-70ca36c9bd08/dqc_reference/reference_markers_gtdb.fasta -out GCA_903845895.1_freshwater_MAG_---_MJ130617B_bin-284_genomic.fna/blast.markers.gtdb.tsv -outfmt 6 -max_hsps 1 -num_alignments 5
[2023-06-30 10:58:21,861] [INFO] Task succeeded: Blastn
[2023-06-30 10:58:21,866] [INFO] Selected 8 target genomes.
[2023-06-30 10:58:21,866] [INFO] Target genome list was writen to GCA_903845895.1_freshwater_MAG_---_MJ130617B_bin-284_genomic.fna/target_genomes_gtdb.txt
[2023-06-30 10:58:21,872] [INFO] Task started: fastANI
[2023-06-30 10:58:21,873] [INFO] Running command: fastANI --query /var/lib/cwl/stg4a28c35c-cc2a-4b6a-acef-126219cabbc5/GCA_903845895.1_freshwater_MAG_---_MJ130617B_bin-284_genomic.fna.gz --refList GCA_903845895.1_freshwater_MAG_---_MJ130617B_bin-284_genomic.fna/target_genomes_gtdb.txt --output GCA_903845895.1_freshwater_MAG_---_MJ130617B_bin-284_genomic.fna/fastani_result_gtdb.tsv --threads 1
[2023-06-30 10:58:27,506] [INFO] Task succeeded: fastANI
[2023-06-30 10:58:27,514] [INFO] Found 8 fastANI hits (1 hits with ANI > circumscription radius)
[2023-06-30 10:58:27,514] [INFO] GTDB search result
--------------------------------------------------------------------------------
accession	gtdb_species	ani	matched_fragments	total_fragments	gtdb_taxonomy	ani_circumscription_radius	mean_intra_species_ani	min_intra_species_ani	mean_intra_species_af	min_intra_species_af	num_clustered_genomes	status
GCA_903886975.1	s__Propionivibrio sp903886975	99.7472	270	279	d__Bacteria;p__Proteobacteria;c__Gammaproteobacteria;o__Burkholderiales;f__Rhodocyclaceae;g__Propionivibrio	95.0	N/A	N/A	N/A	N/A	1	conclusive
GCA_903917505.1	s__Propionivibrio sp903917505	85.2673	188	279	d__Bacteria;p__Proteobacteria;c__Gammaproteobacteria;o__Burkholderiales;f__Rhodocyclaceae;g__Propionivibrio	95.0	N/A	N/A	N/A	N/A	1	-
GCA_018262095.1	s__Propionivibrio sp018262095	80.1738	144	279	d__Bacteria;p__Proteobacteria;c__Gammaproteobacteria;o__Burkholderiales;f__Rhodocyclaceae;g__Propionivibrio	95.0	N/A	N/A	N/A	N/A	1	-
GCA_001897745.1	s__66-26 sp001897745	80.0946	114	279	d__Bacteria;p__Proteobacteria;c__Gammaproteobacteria;o__Burkholderiales;f__Rhodocyclaceae;g__66-26	95.0	97.97	95.98	0.93	0.89	3	-
GCA_903910495.1	s__Propionivibrio sp903910495	79.6575	105	279	d__Bacteria;p__Proteobacteria;c__Gammaproteobacteria;o__Burkholderiales;f__Rhodocyclaceae;g__Propionivibrio	95.0	N/A	N/A	N/A	N/A	1	-
GCA_016709335.1	s__Propionivibrio sp016709335	79.3792	113	279	d__Bacteria;p__Proteobacteria;c__Gammaproteobacteria;o__Burkholderiales;f__Rhodocyclaceae;g__Propionivibrio	95.0	98.69	98.69	0.87	0.87	2	-
GCA_907163075.1	s__Propionivibrio sp907163075	79.3537	125	279	d__Bacteria;p__Proteobacteria;c__Gammaproteobacteria;o__Burkholderiales;f__Rhodocyclaceae;g__Propionivibrio	95.0	N/A	N/A	N/A	N/A	1	-
GCA_016208425.1	s__JACQYA01 sp016208425	78.4762	88	279	d__Bacteria;p__Proteobacteria;c__Gammaproteobacteria;o__Burkholderiales;f__Rhodocyclaceae;g__JACQYA01	95.0	N/A	N/A	N/A	N/A	1	-
--------------------------------------------------------------------------------
[2023-06-30 10:58:27,518] [INFO] GTDB search result was written to GCA_903845895.1_freshwater_MAG_---_MJ130617B_bin-284_genomic.fna/result_gtdb.tsv
[2023-06-30 10:58:27,522] [INFO] ===== GTDB Search completed =====
[2023-06-30 10:58:27,527] [INFO] DFAST_QC result json was written to GCA_903845895.1_freshwater_MAG_---_MJ130617B_bin-284_genomic.fna/dqc_result.json
[2023-06-30 10:58:27,527] [INFO] DFAST_QC completed!
[2023-06-30 10:58:27,527] [INFO] Total running time: 0h0m52s