[2023-06-29 22:59:17,953] [INFO] DFAST_QC pipeline started.
[2023-06-29 22:59:17,956] [INFO] DFAST_QC version: 0.5.7
[2023-06-29 22:59:17,956] [INFO] DQC Reference Directory: /var/lib/cwl/stg311b9925-e66b-4205-8f8d-e641cb9cafec/dqc_reference
[2023-06-29 22:59:19,275] [INFO] ===== Start taxonomy check using ANI =====
[2023-06-29 22:59:19,276] [INFO] Task started: Prodigal
[2023-06-29 22:59:19,276] [INFO] Running command: gunzip -c /var/lib/cwl/stgb2e66a37-3777-462b-bebd-4b0f77f6bf01/GCA_903852185.1_freshwater_MAG_---_Day6-6_bin-415_genomic.fna.gz | prodigal -d GCA_903852185.1_freshwater_MAG_---_Day6-6_bin-415_genomic.fna/cds.fna -a GCA_903852185.1_freshwater_MAG_---_Day6-6_bin-415_genomic.fna/protein.faa -g 11 -q > /dev/null
[2023-06-29 22:59:25,848] [INFO] Task succeeded: Prodigal
[2023-06-29 22:59:25,849] [INFO] Task started: HMMsearch
[2023-06-29 22:59:25,849] [INFO] Running command: hmmsearch --tblout GCA_903852185.1_freshwater_MAG_---_Day6-6_bin-415_genomic.fna/hmmer_result.tsv -E 1E-50 /var/lib/cwl/stg311b9925-e66b-4205-8f8d-e641cb9cafec/dqc_reference/reference_markers.hmm GCA_903852185.1_freshwater_MAG_---_Day6-6_bin-415_genomic.fna/protein.faa > /dev/null
[2023-06-29 22:59:26,114] [INFO] Task succeeded: HMMsearch
[2023-06-29 22:59:26,116] [WARNING] Found 5/6 markers. [/var/lib/cwl/stgb2e66a37-3777-462b-bebd-4b0f77f6bf01/GCA_903852185.1_freshwater_MAG_---_Day6-6_bin-415_genomic.fna.gz]
[2023-06-29 22:59:26,144] [INFO] Query marker FASTA was written to GCA_903852185.1_freshwater_MAG_---_Day6-6_bin-415_genomic.fna/markers.fasta
[2023-06-29 22:59:26,145] [INFO] Task started: Blastn
[2023-06-29 22:59:26,145] [INFO] Running command: blastn -query GCA_903852185.1_freshwater_MAG_---_Day6-6_bin-415_genomic.fna/markers.fasta -db /var/lib/cwl/stg311b9925-e66b-4205-8f8d-e641cb9cafec/dqc_reference/reference_markers.fasta -out GCA_903852185.1_freshwater_MAG_---_Day6-6_bin-415_genomic.fna/blast.markers.tsv -outfmt 6 -max_hsps 1 -num_alignments 5
[2023-06-29 22:59:26,819] [INFO] Task succeeded: Blastn
[2023-06-29 22:59:26,823] [INFO] Selected 10 target genomes.
[2023-06-29 22:59:26,823] [INFO] Target genome list was writen to GCA_903852185.1_freshwater_MAG_---_Day6-6_bin-415_genomic.fna/target_genomes.txt
[2023-06-29 22:59:26,824] [INFO] Task started: fastANI
[2023-06-29 22:59:26,824] [INFO] Running command: fastANI --query /var/lib/cwl/stgb2e66a37-3777-462b-bebd-4b0f77f6bf01/GCA_903852185.1_freshwater_MAG_---_Day6-6_bin-415_genomic.fna.gz --refList GCA_903852185.1_freshwater_MAG_---_Day6-6_bin-415_genomic.fna/target_genomes.txt --output GCA_903852185.1_freshwater_MAG_---_Day6-6_bin-415_genomic.fna/fastani_result.tsv --threads 1
[2023-06-29 22:59:31,383] [INFO] Task succeeded: fastANI
[2023-06-29 22:59:31,383] [INFO] Loading species specific ANI threshold from /var/lib/cwl/stg311b9925-e66b-4205-8f8d-e641cb9cafec/dqc_reference/prokaryote_ANI_species_specific_threshold.txt
[2023-06-29 22:59:31,383] [WARNING] Species-specific ANI threshold file not found. Will use the default threshold for all species. [/var/lib/cwl/stg311b9925-e66b-4205-8f8d-e641cb9cafec/dqc_reference/prokaryote_ANI_species_specific_threshold.txt]
[2023-06-29 22:59:31,390] [INFO] Found 5 fastANI hits (0 hits with ANI > threshold)
[2023-06-29 22:59:31,391] [INFO] The taxonomy check result is classified as 'below_threshold'.
[2023-06-29 22:59:31,391] [INFO] DFAST Taxonomy check final result
--------------------------------------------------------------------------------
organism_name	strain	accession	taxid	species_taxid	relation_to_type	validated	ani	matched_fragments	total_fragments	ani_threshold	status
Pelodictyon phaeoclathratiforme	strain=BU-1	GCA_000020645.1	34090	34090	type	True	78.9534	316	747	95	below_threshold
Chlorobium ferrooxidans	strain=DSM 13031	GCA_000168715.1	84205	84205	type	True	77.8607	183	747	95	below_threshold
Chlorobium phaeobacteroides	strain=DSM 266	GCA_000015125.1	1096	1096	type	True	77.7596	117	747	95	below_threshold
Chlorobium limicola	strain=DSM 245	GCA_000020465.1	1092	1092	type	True	76.906	84	747	95	below_threshold
Chlorobaculum thiosulfatiphilum	strain=DSM 249	GCA_006265165.1	115852	115852	type	True	76.2911	52	747	95	below_threshold
--------------------------------------------------------------------------------
[2023-06-29 22:59:31,393] [INFO] DFAST Taxonomy check result was written to GCA_903852185.1_freshwater_MAG_---_Day6-6_bin-415_genomic.fna/tc_result.tsv
[2023-06-29 22:59:31,394] [INFO] ===== Taxonomy check completed =====
[2023-06-29 22:59:31,394] [INFO] ===== Start completeness check using CheckM =====
[2023-06-29 22:59:31,394] [INFO] Setting CHECKM_DATA_PATH to /var/lib/cwl/stg311b9925-e66b-4205-8f8d-e641cb9cafec/dqc_reference/checkm_data
[2023-06-29 22:59:31,396] [INFO] Selected 'Prokaryote' markers (life, taxid=0) for CheckM
[2023-06-29 22:59:31,426] [INFO] Task started: CheckM
[2023-06-29 22:59:31,426] [INFO] Running command: checkm taxonomy_wf --tab_table -f GCA_903852185.1_freshwater_MAG_---_Day6-6_bin-415_genomic.fna/cc_result.tsv -t 1 life "Prokaryote" GCA_903852185.1_freshwater_MAG_---_Day6-6_bin-415_genomic.fna/checkm_input GCA_903852185.1_freshwater_MAG_---_Day6-6_bin-415_genomic.fna/checkm_result
[2023-06-29 22:59:57,752] [INFO] Task succeeded: CheckM
[2023-06-29 22:59:57,753] [INFO] Completeness check finished.
--------------------------------------------------------------------------------
Completeness: 95.83%
Contamintation: 0.00%
Strain heterogeneity: 0.00%
--------------------------------------------------------------------------------
[2023-06-29 22:59:57,773] [INFO] ===== Completeness check finished =====
[2023-06-29 22:59:57,774] [INFO] ===== Start GTDB Search =====
[2023-06-29 22:59:57,774] [INFO] Query marker FASTA already exists. Will reuse it. (GCA_903852185.1_freshwater_MAG_---_Day6-6_bin-415_genomic.fna/markers.fasta)
[2023-06-29 22:59:57,775] [INFO] Task started: Blastn
[2023-06-29 22:59:57,775] [INFO] Running command: blastn -query GCA_903852185.1_freshwater_MAG_---_Day6-6_bin-415_genomic.fna/markers.fasta -db /var/lib/cwl/stg311b9925-e66b-4205-8f8d-e641cb9cafec/dqc_reference/reference_markers_gtdb.fasta -out GCA_903852185.1_freshwater_MAG_---_Day6-6_bin-415_genomic.fna/blast.markers.gtdb.tsv -outfmt 6 -max_hsps 1 -num_alignments 5
[2023-06-29 22:59:58,547] [INFO] Task succeeded: Blastn
[2023-06-29 22:59:58,551] [INFO] Selected 9 target genomes.
[2023-06-29 22:59:58,551] [INFO] Target genome list was writen to GCA_903852185.1_freshwater_MAG_---_Day6-6_bin-415_genomic.fna/target_genomes_gtdb.txt
[2023-06-29 22:59:58,557] [INFO] Task started: fastANI
[2023-06-29 22:59:58,557] [INFO] Running command: fastANI --query /var/lib/cwl/stgb2e66a37-3777-462b-bebd-4b0f77f6bf01/GCA_903852185.1_freshwater_MAG_---_Day6-6_bin-415_genomic.fna.gz --refList GCA_903852185.1_freshwater_MAG_---_Day6-6_bin-415_genomic.fna/target_genomes_gtdb.txt --output GCA_903852185.1_freshwater_MAG_---_Day6-6_bin-415_genomic.fna/fastani_result_gtdb.tsv --threads 1
[2023-06-29 23:00:03,042] [INFO] Task succeeded: fastANI
[2023-06-29 23:00:03,059] [INFO] Found 9 fastANI hits (1 hits with ANI > circumscription radius)
[2023-06-29 23:00:03,059] [INFO] GTDB search result
--------------------------------------------------------------------------------
accession	gtdb_species	ani	matched_fragments	total_fragments	gtdb_taxonomy	ani_circumscription_radius	mean_intra_species_ani	min_intra_species_ani	mean_intra_species_af	min_intra_species_af	num_clustered_genomes	status
GCA_903994525.1	s__Chlorobium sp903994525	99.85	686	747	d__Bacteria;p__Bacteroidota;c__Chlorobia;o__Chlorobiales;f__Chlorobiaceae;g__Chlorobium	95.0	99.83	98.94	0.92	0.84	27	conclusive
GCA_903859055.1	s__Chlorobium sp903859055	88.926	531	747	d__Bacteria;p__Bacteroidota;c__Chlorobia;o__Chlorobiales;f__Chlorobiaceae;g__Chlorobium	95.0	99.70	99.65	0.94	0.92	3	-
GCA_903866255.1	s__Chlorobium sp903866255	85.1641	486	747	d__Bacteria;p__Bacteroidota;c__Chlorobia;o__Chlorobiales;f__Chlorobiaceae;g__Chlorobium	95.0	99.98	99.98	0.95	0.93	3	-
GCA_005843815.1	s__Chlorobium sp005843815	82.2465	368	747	d__Bacteria;p__Bacteroidota;c__Chlorobia;o__Chlorobiales;f__Chlorobiaceae;g__Chlorobium	95.0	N/A	N/A	N/A	N/A	1	-
GCA_903994365.1	s__Chlorobium sp903994365	81.9488	353	747	d__Bacteria;p__Bacteroidota;c__Chlorobia;o__Chlorobiales;f__Chlorobiaceae;g__Chlorobium	95.0	99.58	99.23	0.74	0.73	6	-
GCA_903915955.1	s__Chlorobium sp903915955	81.9158	422	747	d__Bacteria;p__Bacteroidota;c__Chlorobia;o__Chlorobiales;f__Chlorobiaceae;g__Chlorobium	95.0	98.74	96.45	0.88	0.78	5	-
GCA_903994635.1	s__Chlorobium sp903994635	81.8392	376	747	d__Bacteria;p__Bacteroidota;c__Chlorobia;o__Chlorobiales;f__Chlorobiaceae;g__Chlorobium	95.0	99.09	98.12	0.80	0.68	12	-
GCA_903838095.1	s__Chlorobium sp903838095	78.8117	213	747	d__Bacteria;p__Bacteroidota;c__Chlorobia;o__Chlorobiales;f__Chlorobiaceae;g__Chlorobium	95.0	N/A	N/A	N/A	N/A	1	-
GCA_013334755.1	s__Chlorobium sp013334755	77.9086	223	747	d__Bacteria;p__Bacteroidota;c__Chlorobia;o__Chlorobiales;f__Chlorobiaceae;g__Chlorobium	95.0	99.60	99.60	0.88	0.88	2	-
--------------------------------------------------------------------------------
[2023-06-29 23:00:03,061] [INFO] GTDB search result was written to GCA_903852185.1_freshwater_MAG_---_Day6-6_bin-415_genomic.fna/result_gtdb.tsv
[2023-06-29 23:00:03,062] [INFO] ===== GTDB Search completed =====
[2023-06-29 23:00:03,065] [INFO] DFAST_QC result json was written to GCA_903852185.1_freshwater_MAG_---_Day6-6_bin-415_genomic.fna/dqc_result.json
[2023-06-29 23:00:03,065] [INFO] DFAST_QC completed!
[2023-06-29 23:00:03,066] [INFO] Total running time: 0h0m45s