[2023-06-17 00:46:19,304] [INFO] DFAST_QC pipeline started.
[2023-06-17 00:46:19,307] [INFO] DFAST_QC version: 0.5.7
[2023-06-17 00:46:19,307] [INFO] DQC Reference Directory: /var/lib/cwl/stg80ade448-f421-4e5f-8343-47d9b054cfee/dqc_reference
[2023-06-17 00:46:20,594] [INFO] ===== Start taxonomy check using ANI =====
[2023-06-17 00:46:20,595] [INFO] Task started: Prodigal
[2023-06-17 00:46:20,595] [INFO] Running command: gunzip -c /var/lib/cwl/stgf74c26f5-ba85-4833-959f-d4d48d436668/GCA_013911885.1_ASM1391188v1_genomic.fna.gz | prodigal -d GCA_013911885.1_ASM1391188v1_genomic.fna/cds.fna -a GCA_013911885.1_ASM1391188v1_genomic.fna/protein.faa -g 11 -q > /dev/null
[2023-06-17 00:46:28,785] [INFO] Task succeeded: Prodigal
[2023-06-17 00:46:28,785] [INFO] Task started: HMMsearch
[2023-06-17 00:46:28,786] [INFO] Running command: hmmsearch --tblout GCA_013911885.1_ASM1391188v1_genomic.fna/hmmer_result.tsv -E 1E-50 /var/lib/cwl/stg80ade448-f421-4e5f-8343-47d9b054cfee/dqc_reference/reference_markers.hmm GCA_013911885.1_ASM1391188v1_genomic.fna/protein.faa > /dev/null
[2023-06-17 00:46:28,983] [INFO] Task succeeded: HMMsearch
[2023-06-17 00:46:28,984] [INFO] Found 6/6 markers.
[2023-06-17 00:46:29,009] [INFO] Query marker FASTA was written to GCA_013911885.1_ASM1391188v1_genomic.fna/markers.fasta
[2023-06-17 00:46:29,009] [INFO] Task started: Blastn
[2023-06-17 00:46:29,009] [INFO] Running command: blastn -query GCA_013911885.1_ASM1391188v1_genomic.fna/markers.fasta -db /var/lib/cwl/stg80ade448-f421-4e5f-8343-47d9b054cfee/dqc_reference/reference_markers.fasta -out GCA_013911885.1_ASM1391188v1_genomic.fna/blast.markers.tsv -outfmt 6 -max_hsps 1 -num_alignments 5
[2023-06-17 00:46:29,749] [INFO] Task succeeded: Blastn
[2023-06-17 00:46:29,753] [INFO] Selected 34 target genomes.
[2023-06-17 00:46:29,754] [INFO] Target genome list was written to GCA_013911885.1_ASM1391188v1_genomic.fna/target_genomes.txt
[2023-06-17 00:46:29,761] [INFO] Task started: fastANI
[2023-06-17 00:46:29,762] [INFO] Running command: fastANI --query /var/lib/cwl/stgf74c26f5-ba85-4833-959f-d4d48d436668/GCA_013911885.1_ASM1391188v1_genomic.fna.gz --refList GCA_013911885.1_ASM1391188v1_genomic.fna/target_genomes.txt --output GCA_013911885.1_ASM1391188v1_genomic.fna/fastani_result.tsv --threads 1
[2023-06-17 00:46:53,440] [INFO] Task succeeded: fastANI
[2023-06-17 00:46:53,441] [INFO] Loading species specific ANI threshold from /var/lib/cwl/stg80ade448-f421-4e5f-8343-47d9b054cfee/dqc_reference/prokaryote_ANI_species_specific_threshold.txt
[2023-06-17 00:46:53,441] [WARNING] Species-specific ANI threshold file not found. Will use the default threshold for all species. [/var/lib/cwl/stg80ade448-f421-4e5f-8343-47d9b054cfee/dqc_reference/prokaryote_ANI_species_specific_threshold.txt]
[2023-06-17 00:46:53,463] [INFO] Found 31 fastANI hits (0 hits with ANI > threshold)
[2023-06-17 00:46:53,464] [INFO] The taxonomy check result is classified as 'below_threshold'.
[2023-06-17 00:46:53,464] [INFO] DFAST Taxonomy check final result
--------------------------------------------------------------------------------
organism_name	strain	accession	taxid	species_taxid	relation_to_type	validated	ani	matched_fragments	total_fragments	ani_threshold	status
Methylobrevis pamukkalensis	strain=PK2	GCA_001720135.1	1439726	1439726	type	True	77.0857	83	908	95	below_threshold
Pannonibacter indicus	strain=DSM 23407	GCA_001517385.1	466044	466044	type	True	76.721	78	908	95	below_threshold
Pannonibacter indicus	strain=DSM 23407	GCA_001418225.1	466044	466044	type	True	76.721	78	908	95	below_threshold
Aurantimonas aggregata	strain=KCTC 52919	GCA_010500835.1	2047720	2047720	type	True	76.3728	95	908	95	below_threshold
Ensifer alkalisoli	strain=YIC4027	GCA_001723275.1	1752398	1752398	type	True	76.3264	85	908	95	below_threshold
Tepidamorphus gemmatus	strain=DSM 19345	GCA_004346195.1	747076	747076	type	True	76.297	67	908	95	below_threshold
Ensifer sojae	strain=CCBAU 05684	GCA_002288525.1	716925	716925	type	True	76.2421	85	908	95	below_threshold
Ensifer sojae	strain=CCBAU 05684	GCA_000261485.1	716925	716925	type	True	76.2343	87	908	95	below_threshold
Ensifer alkalisoli	strain=YIC4027	GCA_008932245.1	1752398	1752398	type	True	76.2123	91	908	95	below_threshold
Sinorhizobium kostiense	strain=DSM 13372	GCA_017874595.1	76747	76747	type	True	76.2077	87	908	95	below_threshold
Blastochloris tepida	strain=GI	GCA_003966715.1	2233851	2233851	type	True	76.2008	79	908	95	below_threshold
Jiella sonneratiae	strain=MQZ13P-4	GCA_017353515.1	2816856	2816856	type	True	76.1652	96	908	95	below_threshold
Roseibium marinum	strain=DSM 17023	GCA_002906165.1	281252	281252	type	True	76.1507	82	908	95	below_threshold
Kaistia granuli	strain=Ko04	GCA_000380505.1	363259	363259	type	True	76.1463	95	908	95	below_threshold
Xanthobacter oligotrophicus	strain=29k	GCA_008364685.1	2607286	2607286	type	True	76.0747	51	908	95	below_threshold
Bauldia litoralis	strain=ATCC 35022	GCA_900104485.1	665467	665467	type	True	76.0489	72	908	95	below_threshold
Stappia taiwanensis	strain=CCM 7757	GCA_014635285.1	992267	992267	type	True	76.0371	94	908	95	below_threshold
Rhizobium rosettiformans	strain=W3	GCA_004912135.1	1368430	1368430	type	True	76.0231	82	908	95	below_threshold
Afipia felis	strain=ATCC 53690	GCA_000314735.2	1035	1035	type	True	76.0043	64	908	95	below_threshold
Kaistia adipata	strain=DSM 17808	GCA_000423225.1	166954	166954	type	True	75.9937	91	908	95	below_threshold
Afipia felis	strain=NCTC12499	GCA_900445155.1	1035	1035	type	True	75.9909	63	908	95	below_threshold
Rhizobium rosettiformans	strain=DSM 26376	GCA_014202175.1	1368430	1368430	type	True	75.9836	83	908	95	below_threshold
Pannonibacter phragmitetus	strain=DSM 14782	GCA_000382365.1	121719	121719	suspected-type	True	75.973	83	908	95	below_threshold
Rhizobium wuzhouense	strain=W44	GCA_003205195.1	1986026	1986026	type	True	75.9391	90	908	95	below_threshold
Pannonibacter phragmitetus	strain=NCTC13350	GCA_900454465.1	121719	121719	suspected-type	True	75.9365	85	908	95	below_threshold
Stappia stellulata	strain=DSM 5886	GCA_000423705.1	71235	71235	type	True	75.8844	91	908	95	below_threshold
Oharaeibacter diazotrophicus	strain=SM30	GCA_011317485.1	1920512	1920512	type	True	75.7998	59	908	95	below_threshold
Oharaeibacter diazotrophicus	strain=DSM 102969	GCA_004362745.1	1920512	1920512	type	True	75.77	64	908	95	below_threshold
Methylobacterium nonmethylotrophicum	strain=6HR-1	GCA_004745635.1	1141884	1141884	type	True	75.7011	68	908	95	below_threshold
Methylobacterium terricola	strain=17Sr1-39	GCA_006151805.1	2583531	2583531	type	True	75.6342	72	908	95	below_threshold
Methylobacterium oryzihabitans	strain=TER-1	GCA_004004555.2	2499852	2499852	type	True	75.3219	62	908	95	below_threshold
--------------------------------------------------------------------------------
[2023-06-17 00:46:53,466] [INFO] DFAST Taxonomy check result was written to GCA_013911885.1_ASM1391188v1_genomic.fna/tc_result.tsv
[2023-06-17 00:46:53,466] [INFO] ===== Taxonomy check completed =====
[2023-06-17 00:46:53,467] [INFO] ===== Start completeness check using CheckM =====
[2023-06-17 00:46:53,467] [INFO] Setting CHECKM_DATA_PATH to /var/lib/cwl/stg80ade448-f421-4e5f-8343-47d9b054cfee/dqc_reference/checkm_data
[2023-06-17 00:46:53,468] [INFO] Selected 'Prokaryote' markers (life, taxid=0) for CheckM
[2023-06-17 00:46:53,496] [INFO] Task started: CheckM
[2023-06-17 00:46:53,496] [INFO] Running command: checkm taxonomy_wf --tab_table -f GCA_013911885.1_ASM1391188v1_genomic.fna/cc_result.tsv -t 1 life "Prokaryote" GCA_013911885.1_ASM1391188v1_genomic.fna/checkm_input GCA_013911885.1_ASM1391188v1_genomic.fna/checkm_result
[2023-06-17 00:47:21,824] [INFO] Task succeeded: CheckM
[2023-06-17 00:47:21,825] [INFO] Completeness check finished.
--------------------------------------------------------------------------------
Completeness: 95.83%
Contamination: 0.00%
Strain heterogeneity: 0.00%
--------------------------------------------------------------------------------
[2023-06-17 00:47:21,924] [INFO] ===== Completeness check finished =====
[2023-06-17 00:47:21,925] [INFO] ===== Start GTDB Search =====
[2023-06-17 00:47:21,925] [INFO] Query marker FASTA already exists. Will reuse it. (GCA_013911885.1_ASM1391188v1_genomic.fna/markers.fasta)
[2023-06-17 00:47:21,925] [INFO] Task started: Blastn
[2023-06-17 00:47:21,925] [INFO] Running command: blastn -query GCA_013911885.1_ASM1391188v1_genomic.fna/markers.fasta -db /var/lib/cwl/stg80ade448-f421-4e5f-8343-47d9b054cfee/dqc_reference/reference_markers_gtdb.fasta -out GCA_013911885.1_ASM1391188v1_genomic.fna/blast.markers.gtdb.tsv -outfmt 6 -max_hsps 1 -num_alignments 5
[2023-06-17 00:47:23,225] [INFO] Task succeeded: Blastn
[2023-06-17 00:47:23,229] [INFO] Selected 26 target genomes.
[2023-06-17 00:47:23,229] [INFO] Target genome list was written to GCA_013911885.1_ASM1391188v1_genomic.fna/target_genomes_gtdb.txt
[2023-06-17 00:47:23,248] [INFO] Task started: fastANI
[2023-06-17 00:47:23,248] [INFO] Running command: fastANI --query /var/lib/cwl/stgf74c26f5-ba85-4833-959f-d4d48d436668/GCA_013911885.1_ASM1391188v1_genomic.fna.gz --refList GCA_013911885.1_ASM1391188v1_genomic.fna/target_genomes_gtdb.txt --output GCA_013911885.1_ASM1391188v1_genomic.fna/fastani_result_gtdb.tsv --threads 1
[2023-06-17 00:47:43,262] [INFO] Task succeeded: fastANI
[2023-06-17 00:47:43,280] [INFO] Found 24 fastANI hits (1 hits with ANI > circumscription radius)
[2023-06-17 00:47:43,280] [INFO] GTDB search result
--------------------------------------------------------------------------------
accession	gtdb_species	ani	matched_fragments	total_fragments	gtdb_taxonomy	ani_circumscription_radius	mean_intra_species_ani	min_intra_species_ani	mean_intra_species_af	min_intra_species_af	num_clustered_genomes	status
GCA_013911885.1	s__JACESI01 sp013911885	100.0	908	908	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhizobiales;f__JACESI01;g__JACESI01	95.0	N/A	N/A	N/A	N/A	1	conclusive
GCF_001418225.1	s__Pannonibacter indicus	76.721	78	908	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhizobiales;f__Stappiaceae;g__Pannonibacter	95.0	96.01	95.55	0.94	0.89	12	-
GCF_001298265.1	s__Bosea vaviloviae_B	76.3993	80	908	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhizobiales;f__Beijerinckiaceae;g__Bosea	95.0	N/A	N/A	N/A	N/A	1	-
GCA_017643215.1	s__Roseibium sp017643215	76.2696	66	908	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhizobiales;f__Stappiaceae;g__Roseibium	95.0	N/A	N/A	N/A	N/A	1	-
GCF_002288525.1	s__Sinorhizobium sojae	76.2421	85	908	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhizobiales;f__Rhizobiaceae;g__Sinorhizobium	95.0	99.99	99.99	1.00	1.00	2	-
GCF_003966715.1	s__Blastochloris tepida	76.2008	79	908	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhizobiales;f__Xanthobacteraceae;g__Blastochloris	95.0	N/A	N/A	N/A	N/A	1	-
GCF_002906165.1	s__Roseibium marinum	76.1507	82	908	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhizobiales;f__Stappiaceae;g__Roseibium	95.0	N/A	N/A	N/A	N/A	1	-
GCF_007829155.1	s__Rhizobium sp900467885	76.1506	84	908	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhizobiales;f__Rhizobiaceae;g__Rhizobium	95.0	98.99	98.86	0.92	0.89	10	-
GCF_001651875.1	s__Sinorhizobium saheli	76.1505	94	908	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhizobiales;f__Rhizobiaceae;g__Sinorhizobium	95.0	99.97	99.97	0.96	0.96	2	-
GCF_012641405.1	s__Rhizobium sp012641405	76.1061	83	908	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhizobiales;f__Rhizobiaceae;g__Rhizobium	95.0	99.52	99.52	0.93	0.93	2	-
GCA_016716745.1	s__GCA-013693735 sp016716745	76.0947	91	908	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhizobiales;f__Beijerinckiaceae;g__GCA-013693735	95.0	N/A	N/A	N/A	N/A	1	-
GCF_001262505.1	s__Shinella sp001262505	76.0665	106	908	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhizobiales;f__Rhizobiaceae;g__Shinella	95.0	98.95	98.39	0.91	0.87	4	-
GCF_014635285.1	s__Stappia taiwanensis	76.0371	94	908	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhizobiales;f__Stappiaceae;g__Stappia	95.0	100.00	100.00	1.00	1.00	2	-
GCF_004912135.1	s__Allorhizobium rosettiformans	76.0231	82	908	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhizobiales;f__Rhizobiaceae;g__Allorhizobium	95.0	96.40	95.42	0.87	0.84	7	-
GCA_017307955.1	s__Kaistia sp017307955	76.022	99	908	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhizobiales;f__Kaistiaceae;g__Kaistia	95.0	N/A	N/A	N/A	N/A	1	-
GCF_002005205.3	s__Agrobacterium rhizogenes_A	76.0112	100	908	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhizobiales;f__Rhizobiaceae;g__Agrobacterium	95.0	98.60	97.88	0.94	0.88	11	-
GCF_000382365.1	s__Pannonibacter phragmitetus	75.973	83	908	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhizobiales;f__Stappiaceae;g__Pannonibacter	95.0	99.99	99.99	1.00	1.00	2	-
GCF_018760785.1	s__Rhizobium_A sp018760785	75.9587	104	908	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhizobiales;f__Rhizobiaceae;g__Rhizobium_A	95.0	N/A	N/A	N/A	N/A	1	-
GCF_003205195.1	s__Allorhizobium wuzhouense	75.9391	90	908	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhizobiales;f__Rhizobiaceae;g__Allorhizobium	95.0	97.59	97.59	0.92	0.92	2	-
GCF_016629525.1	s__Kaistia sp016629525	75.9339	104	908	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhizobiales;f__Kaistiaceae;g__Kaistia	95.0	N/A	N/A	N/A	N/A	1	-
GCF_000423705.1	s__Stappia stellulata	75.8844	91	908	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhizobiales;f__Stappiaceae;g__Stappia	95.0	N/A	N/A	N/A	N/A	1	-
GCF_001624695.1	s__Roseibium sp001624695	75.8416	81	908	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhizobiales;f__Stappiaceae;g__Roseibium	95.0	N/A	N/A	N/A	N/A	1	-
GCF_014692675.1	s__Roseibium aggregatum_C	75.8004	91	908	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhizobiales;f__Stappiaceae;g__Roseibium	95.0	N/A	N/A	N/A	N/A	1	-
GCF_003062295.1	s__Bradyrhizobium algeriense	75.6096	84	908	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhizobiales;f__Xanthobacteraceae;g__Bradyrhizobium	95.0	N/A	N/A	N/A	N/A	1	-
--------------------------------------------------------------------------------
[2023-06-17 00:47:43,283] [INFO] GTDB search result was written to GCA_013911885.1_ASM1391188v1_genomic.fna/result_gtdb.tsv
[2023-06-17 00:47:43,283] [INFO] ===== GTDB Search completed =====
[2023-06-17 00:47:43,289] [INFO] DFAST_QC result json was written to GCA_013911885.1_ASM1391188v1_genomic.fna/dqc_result.json
[2023-06-17 00:47:43,289] [INFO] DFAST_QC completed!
[2023-06-17 00:47:43,289] [INFO] Total running time: 0h1m24s