[2023-06-14 01:25:48,599] [INFO] DFAST_QC pipeline started.
[2023-06-14 01:25:48,605] [INFO] DFAST_QC version: 0.5.7
[2023-06-14 01:25:48,605] [INFO] DQC Reference Directory: /var/lib/cwl/stgdc18dfeb-bc41-4828-b7b7-46dddf6b6c03/dqc_reference
[2023-06-14 01:25:50,347] [INFO] ===== Start taxonomy check using ANI =====
[2023-06-14 01:25:50,348] [INFO] Task started: Prodigal
[2023-06-14 01:25:50,348] [INFO] Running command: gunzip -c /var/lib/cwl/stgf8d6bb54-fed0-403a-b1ad-ac69b52f9acf/GCA_938048665.1_SRR413754_bin.94_CONCOCT_v1.1_MAG_genomic.fna.gz | prodigal -d GCA_938048665.1_SRR413754_bin.94_CONCOCT_v1.1_MAG_genomic.fna/cds.fna -a GCA_938048665.1_SRR413754_bin.94_CONCOCT_v1.1_MAG_genomic.fna/protein.faa -g 11 -q > /dev/null
[2023-06-14 01:25:55,998] [INFO] Task succeeded: Prodigal
[2023-06-14 01:25:55,998] [INFO] Task started: HMMsearch
[2023-06-14 01:25:55,998] [INFO] Running command: hmmsearch --tblout GCA_938048665.1_SRR413754_bin.94_CONCOCT_v1.1_MAG_genomic.fna/hmmer_result.tsv -E 1E-50 /var/lib/cwl/stgdc18dfeb-bc41-4828-b7b7-46dddf6b6c03/dqc_reference/reference_markers.hmm GCA_938048665.1_SRR413754_bin.94_CONCOCT_v1.1_MAG_genomic.fna/protein.faa > /dev/null
[2023-06-14 01:25:56,215] [INFO] Task succeeded: HMMsearch
[2023-06-14 01:25:56,217] [INFO] Found 6/6 markers.
[2023-06-14 01:25:56,250] [INFO] Query marker FASTA was written to GCA_938048665.1_SRR413754_bin.94_CONCOCT_v1.1_MAG_genomic.fna/markers.fasta
[2023-06-14 01:25:56,250] [INFO] Task started: Blastn
[2023-06-14 01:25:56,250] [INFO] Running command: blastn -query GCA_938048665.1_SRR413754_bin.94_CONCOCT_v1.1_MAG_genomic.fna/markers.fasta -db /var/lib/cwl/stgdc18dfeb-bc41-4828-b7b7-46dddf6b6c03/dqc_reference/reference_markers.fasta -out GCA_938048665.1_SRR413754_bin.94_CONCOCT_v1.1_MAG_genomic.fna/blast.markers.tsv -outfmt 6 -max_hsps 1 -num_alignments 5
[2023-06-14 01:25:56,854] [INFO] Task succeeded: Blastn
[2023-06-14 01:25:56,857] [INFO] Selected 22 target genomes.
[2023-06-14 01:25:56,857] [INFO] Target genome list was writen to GCA_938048665.1_SRR413754_bin.94_CONCOCT_v1.1_MAG_genomic.fna/target_genomes.txt
[2023-06-14 01:25:56,859] [INFO] Task started: fastANI
[2023-06-14 01:25:56,859] [INFO] Running command: fastANI --query /var/lib/cwl/stgf8d6bb54-fed0-403a-b1ad-ac69b52f9acf/GCA_938048665.1_SRR413754_bin.94_CONCOCT_v1.1_MAG_genomic.fna.gz --refList GCA_938048665.1_SRR413754_bin.94_CONCOCT_v1.1_MAG_genomic.fna/target_genomes.txt --output GCA_938048665.1_SRR413754_bin.94_CONCOCT_v1.1_MAG_genomic.fna/fastani_result.tsv --threads 1
[2023-06-14 01:26:08,036] [INFO] Task succeeded: fastANI
[2023-06-14 01:26:08,036] [INFO] Loading species specific ANI threshold from /var/lib/cwl/stgdc18dfeb-bc41-4828-b7b7-46dddf6b6c03/dqc_reference/prokaryote_ANI_species_specific_threshold.txt
[2023-06-14 01:26:08,036] [WARNING] Species-specific ANI threshold file not found. Will use the default threshold for all species. [/var/lib/cwl/stgdc18dfeb-bc41-4828-b7b7-46dddf6b6c03/dqc_reference/prokaryote_ANI_species_specific_threshold.txt]
[2023-06-14 01:26:08,043] [INFO] Found 6 fastANI hits (0 hits with ANI > threshold)
[2023-06-14 01:26:08,043] [INFO] The taxonomy check result is classified as 'below_threshold'.
[2023-06-14 01:26:08,043] [INFO] DFAST Taxonomy check final result
--------------------------------------------------------------------------------
organism_name	strain	accession	taxid	species_taxid	relation_to_type	validated	ani	matched_fragments	total_fragments	ani_threshold	status
Coprococcus eutactus	strain=ATCC 27759	GCA_025149915.1	33043	33043	suspected-type	True	79.6281	220	816	95	below_threshold
Coprococcus eutactus	strain=ATCC 27759	GCA_000154425.1	33043	33043	suspected-type	True	79.3921	219	816	95	below_threshold
Wujia chipingensis	strain=NSJ-4	GCA_014337155.1	2763670	2763670	type	True	79.0078	106	816	95	below_threshold
Lachnospira eligens	strain=ATCC 27750	GCA_000146185.1	39485	39485	suspected-type	True	77.5104	70	816	95	below_threshold
Eubacterium ruminantium	strain=ATCC 17233	GCA_900167085.1	42322	42322	type	True	77.4888	75	816	95	below_threshold
Roseburia inulinivorans	strain=DSM 16841	GCA_000174195.1	360807	360807	suspected-type	True	77.0484	50	816	95	below_threshold
--------------------------------------------------------------------------------
[2023-06-14 01:26:08,045] [INFO] DFAST Taxonomy check result was written to GCA_938048665.1_SRR413754_bin.94_CONCOCT_v1.1_MAG_genomic.fna/tc_result.tsv
[2023-06-14 01:26:08,046] [INFO] ===== Taxonomy check completed =====
[2023-06-14 01:26:08,046] [INFO] ===== Start completeness check using CheckM =====
[2023-06-14 01:26:08,046] [INFO] Setting CHECKM_DATA_PATH to /var/lib/cwl/stgdc18dfeb-bc41-4828-b7b7-46dddf6b6c03/dqc_reference/checkm_data
[2023-06-14 01:26:08,047] [INFO] Selected 'Prokaryote' markers (life, taxid=0) for CheckM
[2023-06-14 01:26:08,073] [INFO] Task started: CheckM
[2023-06-14 01:26:08,073] [INFO] Running command: checkm taxonomy_wf --tab_table -f GCA_938048665.1_SRR413754_bin.94_CONCOCT_v1.1_MAG_genomic.fna/cc_result.tsv -t 1 life "Prokaryote" GCA_938048665.1_SRR413754_bin.94_CONCOCT_v1.1_MAG_genomic.fna/checkm_input GCA_938048665.1_SRR413754_bin.94_CONCOCT_v1.1_MAG_genomic.fna/checkm_result
[2023-06-14 01:26:30,571] [INFO] Task succeeded: CheckM
[2023-06-14 01:26:30,572] [INFO] Completeness check finished.
--------------------------------------------------------------------------------
Completeness: 100.00%
Contamintation: 0.00%
Strain heterogeneity: 0.00%
--------------------------------------------------------------------------------
[2023-06-14 01:26:30,587] [INFO] ===== Completeness check finished =====
[2023-06-14 01:26:30,587] [INFO] ===== Start GTDB Search =====
[2023-06-14 01:26:30,588] [INFO] Query marker FASTA already exists. Will reuse it. (GCA_938048665.1_SRR413754_bin.94_CONCOCT_v1.1_MAG_genomic.fna/markers.fasta)
[2023-06-14 01:26:30,588] [INFO] Task started: Blastn
[2023-06-14 01:26:30,588] [INFO] Running command: blastn -query GCA_938048665.1_SRR413754_bin.94_CONCOCT_v1.1_MAG_genomic.fna/markers.fasta -db /var/lib/cwl/stgdc18dfeb-bc41-4828-b7b7-46dddf6b6c03/dqc_reference/reference_markers_gtdb.fasta -out GCA_938048665.1_SRR413754_bin.94_CONCOCT_v1.1_MAG_genomic.fna/blast.markers.gtdb.tsv -outfmt 6 -max_hsps 1 -num_alignments 5
[2023-06-14 01:26:31,554] [INFO] Task succeeded: Blastn
[2023-06-14 01:26:31,558] [INFO] Selected 24 target genomes.
[2023-06-14 01:26:31,558] [INFO] Target genome list was writen to GCA_938048665.1_SRR413754_bin.94_CONCOCT_v1.1_MAG_genomic.fna/target_genomes_gtdb.txt
[2023-06-14 01:26:31,585] [INFO] Task started: fastANI
[2023-06-14 01:26:31,585] [INFO] Running command: fastANI --query /var/lib/cwl/stgf8d6bb54-fed0-403a-b1ad-ac69b52f9acf/GCA_938048665.1_SRR413754_bin.94_CONCOCT_v1.1_MAG_genomic.fna.gz --refList GCA_938048665.1_SRR413754_bin.94_CONCOCT_v1.1_MAG_genomic.fna/target_genomes_gtdb.txt --output GCA_938048665.1_SRR413754_bin.94_CONCOCT_v1.1_MAG_genomic.fna/fastani_result_gtdb.tsv --threads 1
[2023-06-14 01:26:41,050] [INFO] Task succeeded: fastANI
[2023-06-14 01:26:41,065] [INFO] Found 19 fastANI hits (1 hits with ANI > circumscription radius)
[2023-06-14 01:26:41,066] [INFO] GTDB search result
--------------------------------------------------------------------------------
accession	gtdb_species	ani	matched_fragments	total_fragments	gtdb_taxonomy	ani_circumscription_radius	mean_intra_species_ani	min_intra_species_ani	mean_intra_species_af	min_intra_species_af	num_clustered_genomes	status
GCF_003482105.1	s__Coprococcus sp000433075	98.0874	734	816	d__Bacteria;p__Firmicutes_A;c__Clostridia;o__Lachnospirales;f__Lachnospiraceae;g__Coprococcus	95.0	99.05	98.02	0.96	0.92	5	conclusive
GCF_001404675.1	s__Coprococcus eutactus_A	79.7704	229	816	d__Bacteria;p__Firmicutes_A;c__Clostridia;o__Lachnospirales;f__Lachnospiraceae;g__Coprococcus	95.0	98.33	97.53	0.90	0.82	34	-
GCF_000154245.1	s__Coprococcus sp000154245	79.7438	192	816	d__Bacteria;p__Firmicutes_A;c__Clostridia;o__Lachnospirales;f__Lachnospiraceae;g__Coprococcus	95.0	98.94	98.58	0.93	0.88	6	-
GCF_003461625.1	s__Coprococcus sp900066115	79.5422	193	816	d__Bacteria;p__Firmicutes_A;c__Clostridia;o__Lachnospirales;f__Lachnospiraceae;g__Coprococcus	95.0	98.16	97.77	0.91	0.89	23	-
GCF_000154425.1	s__Coprococcus eutactus	79.3707	220	816	d__Bacteria;p__Firmicutes_A;c__Clostridia;o__Lachnospirales;f__Lachnospiraceae;g__Coprococcus	95.0	98.55	98.02	0.90	0.86	8	-
GCA_900557435.1	s__Coprococcus sp900557435	78.8936	130	816	d__Bacteria;p__Firmicutes_A;c__Clostridia;o__Lachnospirales;f__Lachnospiraceae;g__Coprococcus	95.0	N/A	N/A	N/A	N/A	1	-
GCA_900548315.1	s__Coprococcus sp900548315	78.8674	157	816	d__Bacteria;p__Firmicutes_A;c__Clostridia;o__Lachnospirales;f__Lachnospiraceae;g__Coprococcus	95.0	N/A	N/A	N/A	N/A	1	-
GCA_016297325.1	s__Coprococcus sp016297325	78.6592	166	816	d__Bacteria;p__Firmicutes_A;c__Clostridia;o__Lachnospirales;f__Lachnospiraceae;g__Coprococcus	95.0	N/A	N/A	N/A	N/A	1	-
GCA_017935345.1	s__Coprococcus sp017935345	78.6548	143	816	d__Bacteria;p__Firmicutes_A;c__Clostridia;o__Lachnospirales;f__Lachnospiraceae;g__Coprococcus	95.0	N/A	N/A	N/A	N/A	1	-
GCA_002437435.1	s__Coprococcus sp002437435	78.3866	160	816	d__Bacteria;p__Firmicutes_A;c__Clostridia;o__Lachnospirales;f__Lachnospiraceae;g__Coprococcus	95.0	N/A	N/A	N/A	N/A	1	-
GCA_016296785.1	s__Coprococcus sp016296785	78.2837	144	816	d__Bacteria;p__Firmicutes_A;c__Clostridia;o__Lachnospirales;f__Lachnospiraceae;g__Coprococcus	95.0	N/A	N/A	N/A	N/A	1	-
GCA_017624085.1	s__Coprococcus sp017624085	78.1175	75	816	d__Bacteria;p__Firmicutes_A;c__Clostridia;o__Lachnospirales;f__Lachnospiraceae;g__Coprococcus	95.0	N/A	N/A	N/A	N/A	1	-
GCF_014287955.1	s__Lachnospira sp900316325	77.7376	62	816	d__Bacteria;p__Firmicutes_A;c__Clostridia;o__Lachnospirales;f__Lachnospiraceae;g__Lachnospira	95.0	98.86	98.35	0.93	0.88	10	-
GCA_017420535.1	s__Eubacterium_Q sp017420535	77.2213	72	816	d__Bacteria;p__Firmicutes_A;c__Clostridia;o__Lachnospirales;f__Lachnospiraceae;g__Eubacterium_Q	95.0	N/A	N/A	N/A	N/A	1	-
GCA_902774725.1	s__Eubacterium_Q sp902774725	77.1211	58	816	d__Bacteria;p__Firmicutes_A;c__Clostridia;o__Lachnospirales;f__Lachnospiraceae;g__Eubacterium_Q	95.0	96.31	95.93	0.76	0.75	3	-
GCA_017938485.1	s__CAG-127 sp017938485	77.1203	72	816	d__Bacteria;p__Firmicutes_A;c__Clostridia;o__Lachnospirales;f__Lachnospiraceae;g__CAG-127	95.0	N/A	N/A	N/A	N/A	1	-
GCA_902800325.1	s__CAG-127 sp902800325	76.9516	75	816	d__Bacteria;p__Firmicutes_A;c__Clostridia;o__Lachnospirales;f__Lachnospiraceae;g__CAG-127	95.0	98.34	96.92	0.85	0.82	5	-
GCA_002314775.1	s__UBA1760 sp002314775	76.414	74	816	d__Bacteria;p__Firmicutes_A;c__Clostridia;o__Lachnospirales;f__Lachnospiraceae;g__UBA1760	95.0	N/A	N/A	N/A	N/A	1	-
GCA_900321215.1	s__Eubacterium_Q sp900321215	76.2625	60	816	d__Bacteria;p__Firmicutes_A;c__Clostridia;o__Lachnospirales;f__Lachnospiraceae;g__Eubacterium_Q	95.0	99.10	98.50	0.87	0.75	5	-
--------------------------------------------------------------------------------
[2023-06-14 01:26:41,068] [INFO] GTDB search result was written to GCA_938048665.1_SRR413754_bin.94_CONCOCT_v1.1_MAG_genomic.fna/result_gtdb.tsv
[2023-06-14 01:26:41,068] [INFO] ===== GTDB Search completed =====
[2023-06-14 01:26:41,072] [INFO] DFAST_QC result json was written to GCA_938048665.1_SRR413754_bin.94_CONCOCT_v1.1_MAG_genomic.fna/dqc_result.json
[2023-06-14 01:26:41,072] [INFO] DFAST_QC completed!
[2023-06-14 01:26:41,072] [INFO] Total running time: 0h0m52s