[2023-06-18 23:42:55,484] [INFO] DFAST_QC pipeline started.
[2023-06-18 23:42:55,506] [INFO] DFAST_QC version: 0.5.7
[2023-06-18 23:42:55,506] [INFO] DQC Reference Directory: /var/lib/cwl/stgf518b69a-873e-4f55-ad29-dc771ad831cf/dqc_reference
[2023-06-18 23:42:57,800] [INFO] ===== Start taxonomy check using ANI =====
[2023-06-18 23:42:57,801] [INFO] Task started: Prodigal
[2023-06-18 23:42:57,801] [INFO] Running command: gunzip -c /var/lib/cwl/stg28026250-03ee-4cca-9c91-3411fd676ac2/GCA_937925445.1_SRR3901701_bin.2_CONCOCT_v1.1_MAG_genomic.fna.gz | prodigal -d GCA_937925445.1_SRR3901701_bin.2_CONCOCT_v1.1_MAG_genomic.fna/cds.fna -a GCA_937925445.1_SRR3901701_bin.2_CONCOCT_v1.1_MAG_genomic.fna/protein.faa -g 11 -q > /dev/null
[2023-06-18 23:43:02,821] [INFO] Task succeeded: Prodigal
[2023-06-18 23:43:02,822] [INFO] Task started: HMMsearch
[2023-06-18 23:43:02,822] [INFO] Running command: hmmsearch --tblout GCA_937925445.1_SRR3901701_bin.2_CONCOCT_v1.1_MAG_genomic.fna/hmmer_result.tsv -E 1E-50 /var/lib/cwl/stgf518b69a-873e-4f55-ad29-dc771ad831cf/dqc_reference/reference_markers.hmm GCA_937925445.1_SRR3901701_bin.2_CONCOCT_v1.1_MAG_genomic.fna/protein.faa > /dev/null
[2023-06-18 23:43:02,971] [INFO] Task succeeded: HMMsearch
[2023-06-18 23:43:02,972] [WARNING] Found 3/6 markers.
[/var/lib/cwl/stg28026250-03ee-4cca-9c91-3411fd676ac2/GCA_937925445.1_SRR3901701_bin.2_CONCOCT_v1.1_MAG_genomic.fna.gz]
[2023-06-18 23:43:03,029] [INFO] Query marker FASTA was written to GCA_937925445.1_SRR3901701_bin.2_CONCOCT_v1.1_MAG_genomic.fna/markers.fasta
[2023-06-18 23:43:03,030] [INFO] Task started: Blastn
[2023-06-18 23:43:03,030] [INFO] Running command: blastn -query GCA_937925445.1_SRR3901701_bin.2_CONCOCT_v1.1_MAG_genomic.fna/markers.fasta -db /var/lib/cwl/stgf518b69a-873e-4f55-ad29-dc771ad831cf/dqc_reference/reference_markers.fasta -out GCA_937925445.1_SRR3901701_bin.2_CONCOCT_v1.1_MAG_genomic.fna/blast.markers.tsv -outfmt 6 -max_hsps 1 -num_alignments 5
[2023-06-18 23:43:05,299] [INFO] Task succeeded: Blastn
[2023-06-18 23:43:05,302] [INFO] Selected 19 target genomes.
[2023-06-18 23:43:05,303] [INFO] Target genome list was writen to GCA_937925445.1_SRR3901701_bin.2_CONCOCT_v1.1_MAG_genomic.fna/target_genomes.txt
[2023-06-18 23:43:05,379] [INFO] Task started: fastANI
[2023-06-18 23:43:05,380] [INFO] Running command: fastANI --query /var/lib/cwl/stg28026250-03ee-4cca-9c91-3411fd676ac2/GCA_937925445.1_SRR3901701_bin.2_CONCOCT_v1.1_MAG_genomic.fna.gz --refList GCA_937925445.1_SRR3901701_bin.2_CONCOCT_v1.1_MAG_genomic.fna/target_genomes.txt --output GCA_937925445.1_SRR3901701_bin.2_CONCOCT_v1.1_MAG_genomic.fna/fastani_result.tsv --threads 1
[2023-06-18 23:43:15,615] [INFO] Task succeeded: fastANI
[2023-06-18 23:43:15,616] [INFO] Loading species specific ANI threshold from /var/lib/cwl/stgf518b69a-873e-4f55-ad29-dc771ad831cf/dqc_reference/prokaryote_ANI_species_specific_threshold.txt
[2023-06-18 23:43:15,616] [WARNING] Species-specific ANI threshold file not found. Will use the default threshold for all species.
[/var/lib/cwl/stgf518b69a-873e-4f55-ad29-dc771ad831cf/dqc_reference/prokaryote_ANI_species_specific_threshold.txt]
[2023-06-18 23:43:15,630] [INFO] Found 18 fastANI hits (0 hits with ANI > threshold)
[2023-06-18 23:43:15,630] [INFO] The taxonomy check result is classified as 'below_threshold'.
[2023-06-18 23:43:15,630] [INFO] DFAST Taxonomy check final result
--------------------------------------------------------------------------------
organism_name	strain	accession	taxid	species_taxid	relation_to_type	validated	ani	matched_fragments	total_fragments	ani_threshold	status
Sphingobium xanthum	strain=NL9	GCA_019737615.1	1387165	1387165	type	True	79.5081	188	368	95	below_threshold
Sphingobium yanoikuyae	strain=ATCC 51230	GCA_000315525.1	13690	13690	suspected-type	True	77.7744	127	368	95	below_threshold
Sphingobium sufflavum	strain=HL-25	GCA_021403115.1	1129547	1129547	type	True	77.7239	97	368	95	below_threshold
Sphingobium cloacae	strain=JCM 10874	GCA_002355855.1	120107	120107	type	True	77.6368	86	368	95	below_threshold
Sphingobium cloacae	strain=NBRC 102517	GCA_001591285.1	120107	120107	type	True	77.5998	82	368	95	below_threshold
Sphingobium xenophagum	strain=NBRC 107872	GCA_000367345.1	121428	121428	type	True	77.5839	99	368	95	below_threshold
Sphingobium herbicidovorans	strain=NBRC 16415	GCA_001598535.1	76947	76947	type	True	77.5213	78	368	95	below_threshold
Sphingobium herbicidovorans	strain=MH	GCA_002080435.1	76947	76947	type	True	77.5213	78	368	95	below_threshold
Sphingobium herbicidovorans	strain=NBRC 16415	GCA_000632125.2	76947	76947	type	True	77.492	78	368	95	below_threshold
Sphingobium terrigena	strain=EO9	GCA_003591655.1	2304063	2304063	type	True	77.4603	100	368	95	below_threshold
Sphingomonas cavernae	strain=K2R01-6	GCA_003590775.1	2320861	2320861	type	True	76.9768	68	368	95	below_threshold
Sphingomonas flavalba	strain=ZLT-5	GCA_004796535.1	2559804	2559804	type	True	76.9326	61	368	95	below_threshold
Sphingomonas aerolata	strain=NW12	GCA_003046295.1	185951	185951	type	True	76.9277	67	368	95	below_threshold
Sphingomonas baiyangensis	strain=L-1-4 w-11	GCA_005144715.1	2572576	2572576	type	True	76.9142	68	368	95	below_threshold
Sphingomonas albertensis	strain=DOAB 1063	GCA_014358075.1	2762591	2762591	type	True	76.8998	56	368	95	below_threshold
Sphingopyxis terrae subsp. terrae	strain=203-1	GCA_001610975.1	2448440	33052	type	True	76.8814	65	368	95	below_threshold
Sphingomonas hylomeconis	strain=CCTCC AB 2013304	GCA_025370105.1	1395958	1395958	type	True	76.8789	67	368	95	below_threshold
Sphingomonas changnyeongensis	strain=C33	GCA_009913435.1	2698679	2698679	type	True	76.7665	59	368	95	below_threshold
--------------------------------------------------------------------------------
[2023-06-18 23:43:15,641] [INFO] DFAST Taxonomy check result was written to GCA_937925445.1_SRR3901701_bin.2_CONCOCT_v1.1_MAG_genomic.fna/tc_result.tsv
[2023-06-18 23:43:15,641] [INFO] ===== Taxonomy check completed =====
[2023-06-18 23:43:15,642] [INFO] ===== Start completeness check using CheckM =====
[2023-06-18 23:43:15,642] [INFO] Setting CHECKM_DATA_PATH to /var/lib/cwl/stgf518b69a-873e-4f55-ad29-dc771ad831cf/dqc_reference/checkm_data
[2023-06-18 23:43:15,643] [INFO] Selected 'Prokaryote' markers (life, taxid=0) for CheckM
[2023-06-18 23:43:15,666] [INFO] Task started: CheckM
[2023-06-18 23:43:15,666] [INFO] Running command: checkm taxonomy_wf --tab_table -f GCA_937925445.1_SRR3901701_bin.2_CONCOCT_v1.1_MAG_genomic.fna/cc_result.tsv -t 1 life "Prokaryote" GCA_937925445.1_SRR3901701_bin.2_CONCOCT_v1.1_MAG_genomic.fna/checkm_input GCA_937925445.1_SRR3901701_bin.2_CONCOCT_v1.1_MAG_genomic.fna/checkm_result
[2023-06-18 23:43:36,018] [INFO] Task succeeded: CheckM
[2023-06-18 23:43:36,019] [INFO] Completeness check finished.
--------------------------------------------------------------------------------
Completeness: 41.32%
Contamintation: 0.00%
Strain heterogeneity: 0.00%
--------------------------------------------------------------------------------
[2023-06-18 23:43:36,044] [INFO] ===== Completeness check finished =====
[2023-06-18 23:43:36,044] [INFO] ===== Start GTDB Search =====
[2023-06-18 23:43:36,044] [INFO] Query marker FASTA already exists. Will reuse it. (GCA_937925445.1_SRR3901701_bin.2_CONCOCT_v1.1_MAG_genomic.fna/markers.fasta)
[2023-06-18 23:43:36,045] [INFO] Task started: Blastn
[2023-06-18 23:43:36,045] [INFO] Running command: blastn -query GCA_937925445.1_SRR3901701_bin.2_CONCOCT_v1.1_MAG_genomic.fna/markers.fasta -db /var/lib/cwl/stgf518b69a-873e-4f55-ad29-dc771ad831cf/dqc_reference/reference_markers_gtdb.fasta -out GCA_937925445.1_SRR3901701_bin.2_CONCOCT_v1.1_MAG_genomic.fna/blast.markers.gtdb.tsv -outfmt 6 -max_hsps 1 -num_alignments 5
[2023-06-18 23:43:36,664] [INFO] Task succeeded: Blastn
[2023-06-18 23:43:36,668] [INFO] Selected 12 target genomes.
[2023-06-18 23:43:36,668] [INFO] Target genome list was writen to GCA_937925445.1_SRR3901701_bin.2_CONCOCT_v1.1_MAG_genomic.fna/target_genomes_gtdb.txt
[2023-06-18 23:43:36,701] [INFO] Task started: fastANI
[2023-06-18 23:43:36,701] [INFO] Running command: fastANI --query /var/lib/cwl/stg28026250-03ee-4cca-9c91-3411fd676ac2/GCA_937925445.1_SRR3901701_bin.2_CONCOCT_v1.1_MAG_genomic.fna.gz --refList GCA_937925445.1_SRR3901701_bin.2_CONCOCT_v1.1_MAG_genomic.fna/target_genomes_gtdb.txt --output GCA_937925445.1_SRR3901701_bin.2_CONCOCT_v1.1_MAG_genomic.fna/fastani_result_gtdb.tsv --threads 1
[2023-06-18 23:43:43,820] [INFO] Task succeeded: fastANI
[2023-06-18 23:43:43,830] [INFO] Found 12 fastANI hits (0 hits with ANI > circumscription radius)
[2023-06-18 23:43:43,831] [INFO] GTDB search result
--------------------------------------------------------------------------------
accession	gtdb_species	ani	matched_fragments	total_fragments	gtdb_taxonomy	ani_circumscription_radius	mean_intra_species_ani	min_intra_species_ani	mean_intra_species_af	min_intra_species_af	num_clustered_genomes	status
GCA_017304105.1	s__Sphingobium sp017304105	79.2862	159	368	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Sphingomonadales;f__Sphingomonadaceae;g__Sphingobium	95.0	N/A	N/A	N/A	N/A	1	-
GCF_014201265.1	s__Sphingobium sp014201265	79.1473	142	368	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Sphingomonadales;f__Sphingomonadaceae;g__Sphingobium	95.0	100.00	100.00	1.00	1.00	3	-
GCA_017304655.1	s__Sphingobium sp017304655	79.0045	187	368	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Sphingomonadales;f__Sphingomonadaceae;g__Sphingobium	95.0	N/A	N/A	N/A	N/A	1	-
GCF_018603885.1	s__Sphingobium sp018603885	78.7635	188	368	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Sphingomonadales;f__Sphingomonadaceae;g__Sphingobium	95.0	N/A	N/A	N/A	N/A	1	-
GCA_017744605.1	s__Sphingobium sp017744605	78.5501	152	368	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Sphingomonadales;f__Sphingomonadaceae;g__Sphingobium	95.0	N/A	N/A	N/A	N/A	1	-
GCA_001899715.1	s__Sphingobium sp001899715	78.2716	178	368	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Sphingomonadales;f__Sphingomonadaceae;g__Sphingobium	95.0	N/A	N/A	N/A	N/A	1	-
GCF_000283515.1	s__Sphingobium sp000283515	78.1525	165	368	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Sphingomonadales;f__Sphingomonadaceae;g__Sphingobium	95.0	96.32	96.27	0.90	0.89	5	-
GCA_018823325.1	s__Sphingobium sp018823325	78.1474	153	368	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Sphingomonadales;f__Sphingomonadaceae;g__Sphingobium	95.0	99.99	99.99	0.97	0.94	4	-
GCF_000315525.1	s__Sphingobium yanoikuyae	77.7226	129	368	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Sphingomonadales;f__Sphingomonadaceae;g__Sphingobium	95.0	96.25	95.31	0.82	0.72	34	-
GCF_003591655.1	s__Sphingobium sp003591655	77.4195	101	368	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Sphingomonadales;f__Sphingomonadaceae;g__Sphingobium	95.0	N/A	N/A	N/A	N/A	1	-
GCF_014203965.1	s__Sphingobium sp014203965	77.3653	113	368	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Sphingomonadales;f__Sphingomonadaceae;g__Sphingobium	95.0	100.00	100.00	1.00	1.00	2	-
GCF_003046295.1	s__Sphingomonas aerolata	76.9625	65	368	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Sphingomonadales;f__Sphingomonadaceae;g__Sphingomonas	95.0	96.52	96.36	0.90	0.73	15	-
--------------------------------------------------------------------------------
[2023-06-18 23:43:43,833] [INFO] GTDB search result was written to GCA_937925445.1_SRR3901701_bin.2_CONCOCT_v1.1_MAG_genomic.fna/result_gtdb.tsv
[2023-06-18 23:43:43,833] [INFO] ===== GTDB Search completed =====
[2023-06-18 23:43:43,837] [INFO] DFAST_QC result json was written to GCA_937925445.1_SRR3901701_bin.2_CONCOCT_v1.1_MAG_genomic.fna/dqc_result.json
[2023-06-18 23:43:43,837] [INFO] DFAST_QC completed!
[2023-06-18 23:43:43,837] [INFO] Total running time: 0h0m48s