[2024-01-24 15:09:55,541] [INFO] DFAST_QC pipeline started.
[2024-01-24 15:09:55,543] [INFO] DFAST_QC version: 0.5.7
[2024-01-24 15:09:55,543] [INFO] DQC Reference Directory: /var/lib/cwl/stg43a9c35a-3967-42a3-a280-beb87614d687/dqc_reference
[2024-01-24 15:09:58,165] [INFO] ===== Start taxonomy check using ANI =====
[2024-01-24 15:09:58,166] [INFO] Task started: Prodigal
[2024-01-24 15:09:58,167] [INFO] Running command: gunzip -c /var/lib/cwl/stgc4226066-f972-4ff1-909e-680326a6f5f4/GCF_020447305.1_ASM2044730v2_genomic.fna.gz | prodigal -d GCF_020447305.1_ASM2044730v2_genomic.fna/cds.fna -a GCF_020447305.1_ASM2044730v2_genomic.fna/protein.faa -g 11 -q > /dev/null
[2024-01-24 15:10:10,249] [INFO] Task succeeded: Prodigal
[2024-01-24 15:10:10,249] [INFO] Task started: HMMsearch
[2024-01-24 15:10:10,249] [INFO] Running command: hmmsearch --tblout GCF_020447305.1_ASM2044730v2_genomic.fna/hmmer_result.tsv -E 1E-50 /var/lib/cwl/stg43a9c35a-3967-42a3-a280-beb87614d687/dqc_reference/reference_markers.hmm GCF_020447305.1_ASM2044730v2_genomic.fna/protein.faa > /dev/null
[2024-01-24 15:10:10,606] [INFO] Task succeeded: HMMsearch
[2024-01-24 15:10:10,608] [INFO] Found 6/6 markers.
[2024-01-24 15:10:10,643] [INFO] Query marker FASTA was written to GCF_020447305.1_ASM2044730v2_genomic.fna/markers.fasta
[2024-01-24 15:10:10,643] [INFO] Task started: Blastn
[2024-01-24 15:10:10,643] [INFO] Running command: blastn -query GCF_020447305.1_ASM2044730v2_genomic.fna/markers.fasta -db /var/lib/cwl/stg43a9c35a-3967-42a3-a280-beb87614d687/dqc_reference/reference_markers.fasta -out GCF_020447305.1_ASM2044730v2_genomic.fna/blast.markers.tsv -outfmt 6 -max_hsps 1 -num_alignments 5
[2024-01-24 15:10:11,520] [INFO] Task succeeded: Blastn
[2024-01-24 15:10:11,635] [INFO] Selected 32 target genomes.
[2024-01-24 15:10:11,636] [INFO] Target genome list was writen to GCF_020447305.1_ASM2044730v2_genomic.fna/target_genomes.txt
[2024-01-24 15:10:11,656] [INFO] Task started: fastANI
[2024-01-24 15:10:11,657] [INFO] Running command: fastANI --query /var/lib/cwl/stgc4226066-f972-4ff1-909e-680326a6f5f4/GCF_020447305.1_ASM2044730v2_genomic.fna.gz --refList GCF_020447305.1_ASM2044730v2_genomic.fna/target_genomes.txt --output GCF_020447305.1_ASM2044730v2_genomic.fna/fastani_result.tsv --threads 1
[2024-01-24 15:10:33,522] [INFO] Task succeeded: fastANI
[2024-01-24 15:10:33,522] [INFO] Loading species specific ANI threshold from /var/lib/cwl/stg43a9c35a-3967-42a3-a280-beb87614d687/dqc_reference/prokaryote_ANI_species_specific_threshold.txt
[2024-01-24 15:10:33,523] [WARNING] Species-specific ANI threshold file not found. Will use the default threshold for all species. [/var/lib/cwl/stg43a9c35a-3967-42a3-a280-beb87614d687/dqc_reference/prokaryote_ANI_species_specific_threshold.txt]
[2024-01-24 15:10:33,549] [INFO] Found 32 fastANI hits (0 hits with ANI > threshold)
[2024-01-24 15:10:33,549] [INFO] The taxonomy check result is classified as 'below_threshold'.
[2024-01-24 15:10:33,550] [INFO] DFAST Taxonomy check final result
--------------------------------------------------------------------------------
organism_name	strain	accession	taxid	species_taxid	relation_to_type	validated	ani	matched_fragments	total_fragments	ani_threshold	status
Gemmobacter caeruleus	strain=N8	GCA_008271655.1	2595004	2595004	type	True	77.8789	287	1069	95	below_threshold
Gemmobacter nectariphilus	strain=DSM 15620	GCA_000429765.1	220343	220343	type	True	77.834	256	1069	95	below_threshold
Rhodovulum bhavnagarense	strain=DSM 24766	GCA_004343505.1	992286	992286	type	True	77.5989	181	1069	95	below_threshold
Cereibacter johrii	strain=JA192	GCA_001720585.1	445629	445629	type	True	77.4633	162	1069	95	below_threshold
Palleronia rufa	strain=MOLA 401	GCA_000743715.1	1530186	1530186	type	True	77.355	142	1069	95	below_threshold
Cereibacter johrii	strain=JA192	GCA_003046325.1	445629	445629	type	True	77.1844	165	1069	95	below_threshold
Halovulum dunhuangense	strain=YYQ-30	GCA_013093415.1	1505036	1505036	type	True	77.1056	210	1069	95	below_threshold
Rhodophyticola porphyridii	strain=MA-7-27	GCA_003688285.1	1852017	1852017	type	True	77.0487	153	1069	95	below_threshold
Qingshengfaniella alkalisoli	strain=LN3S51	GCA_007855645.1	2599296	2599296	type	True	77.0443	101	1069	95	below_threshold
Paracoccus thiocyanatus	strain=ATCC 700171	GCA_900156255.1	34006	34006	type	True	77.0281	211	1069	95	below_threshold
Paracoccus tegillarcae	strain=BM15	GCA_002847305.1	1529068	1529068	type	True	77.0158	185	1069	95	below_threshold
Paracoccus siganidrum	strain=M26	GCA_003709565.1	1276757	1276757	type	True	77.012	243	1069	95	below_threshold
Rhodovulum tesquicola	strain=A-36s	GCA_024128855.1	540254	540254	type	True	76.9839	221	1069	95	below_threshold
Paracoccus siganidrum	strain=DSM 26381	GCA_003594835.1	1276757	1276757	type	True	76.974	241	1069	95	below_threshold
Rhodovulum robiginosum	strain=DSM 12329	GCA_003944755.1	68292	68292	type	True	76.9511	231	1069	95	below_threshold
Oceanomicrobium pacificus	strain=KN286	GCA_009833495.1	2692916	2692916	type	True	76.9241	224	1069	95	below_threshold
Roseovarius faecimaris	strain=MME-070	GCA_009762325.1	2494550	2494550	type	True	76.8863	186	1069	95	below_threshold
Paracoccus halophilus	strain=CGMCC 1.6117	GCA_900111785.1	376733	376733	type	True	76.8093	215	1069	95	below_threshold
Tabrizicola alkalilacus	strain=DJC	GCA_003443995.1	2305252	2305252	type	True	76.7982	232	1069	95	below_threshold
Thalassobius aquimarinus	strain=KMM 8518	GCA_018219815.1	2785917	2785917	type	True	76.7895	183	1069	95	below_threshold
Paracoccus halophilus	strain=JCM 14014	GCA_000763905.1	376733	376733	type	True	76.7694	213	1069	95	below_threshold
Paracoccus denitrificans	strain=NBRC 102528	GCA_007989485.1	266	266	type	True	76.72	192	1069	95	below_threshold
Tabrizicola sediminis	strain=DRYC-M-16	GCA_004745575.1	2486418	2486418	type	True	76.6752	224	1069	95	below_threshold
Cereibacter sphaeroides	strain=2.4.1	GCA_000273405.1	1063	1063	type	True	76.6601	158	1069	95	below_threshold
Paracoccus nototheniae	strain=I-41R45	GCA_004335005.1	2489002	2489002	type	True	76.623	208	1069	95	below_threshold
Pseudophaeobacter flagellatus	strain=MA21411-1	GCA_021228235.1	2899119	2899119	type	True	76.5683	145	1069	95	below_threshold
Cereibacter sphaeroides	strain=NBRC 12203	GCA_007991035.1	1063	1063	type	True	76.403	155	1069	95	below_threshold
Gemmobacter tilapiae	strain=KCTC 23310	GCA_014652215.1	875041	875041	type	True	76.4008	157	1069	95	below_threshold
Paracoccus laeviglucosivorans	strain=DSM 100094	GCA_900182695.1	1197861	1197861	type	True	76.3599	194	1069	95	below_threshold
Poseidonocella pacifica	strain=DSM 29316	GCA_900111875.1	871651	871651	type	True	76.1712	99	1069	95	below_threshold
Sagittula marina	strain=DSM 102235	GCA_014196795.1	943940	943940	type	True	76.1309	118	1069	95	below_threshold
Albimonas donghaensis	strain=DSM 17890	GCA_900106695.1	356660	356660	type	True	75.9735	160	1069	95	below_threshold
--------------------------------------------------------------------------------
[2024-01-24 15:10:33,551] [INFO] DFAST Taxonomy check result was written to GCF_020447305.1_ASM2044730v2_genomic.fna/tc_result.tsv
[2024-01-24 15:10:33,552] [INFO] ===== Taxonomy check completed =====
[2024-01-24 15:10:33,552] [INFO] ===== Start completeness check using CheckM =====
[2024-01-24 15:10:33,552] [INFO] Setting CHECKM_DATA_PATH to /var/lib/cwl/stg43a9c35a-3967-42a3-a280-beb87614d687/dqc_reference/checkm_data
[2024-01-24 15:10:33,553] [INFO] Selected 'Prokaryote' markers (life, taxid=0) for CheckM
[2024-01-24 15:10:33,585] [INFO] Task started: CheckM
[2024-01-24 15:10:33,586] [INFO] Running command: checkm taxonomy_wf --tab_table -f GCF_020447305.1_ASM2044730v2_genomic.fna/cc_result.tsv -t 1 life "Prokaryote" GCF_020447305.1_ASM2044730v2_genomic.fna/checkm_input GCF_020447305.1_ASM2044730v2_genomic.fna/checkm_result
[2024-01-24 15:11:11,141] [INFO] Task succeeded: CheckM
[2024-01-24 15:11:11,142] [INFO] Completeness check finished.
--------------------------------------------------------------------------------
Completeness: 100.00%
Contamintation: 0.00%
Strain heterogeneity: 0.00%
--------------------------------------------------------------------------------
[2024-01-24 15:11:11,167] [INFO] ===== Completeness check finished =====
[2024-01-24 15:11:11,168] [INFO] ===== Start GTDB Search =====
[2024-01-24 15:11:11,168] [INFO] Query marker FASTA already exists. Will reuse it. (GCF_020447305.1_ASM2044730v2_genomic.fna/markers.fasta)
[2024-01-24 15:11:11,169] [INFO] Task started: Blastn
[2024-01-24 15:11:11,169] [INFO] Running command: blastn -query GCF_020447305.1_ASM2044730v2_genomic.fna/markers.fasta -db /var/lib/cwl/stg43a9c35a-3967-42a3-a280-beb87614d687/dqc_reference/reference_markers_gtdb.fasta -out GCF_020447305.1_ASM2044730v2_genomic.fna/blast.markers.gtdb.tsv -outfmt 6 -max_hsps 1 -num_alignments 5
[2024-01-24 15:11:12,585] [INFO] Task succeeded: Blastn
[2024-01-24 15:11:12,588] [INFO] Selected 35 target genomes.
[2024-01-24 15:11:12,589] [INFO] Target genome list was writen to GCF_020447305.1_ASM2044730v2_genomic.fna/target_genomes_gtdb.txt
[2024-01-24 15:11:12,612] [INFO] Task started: fastANI
[2024-01-24 15:11:12,613] [INFO] Running command: fastANI --query /var/lib/cwl/stgc4226066-f972-4ff1-909e-680326a6f5f4/GCF_020447305.1_ASM2044730v2_genomic.fna.gz --refList GCF_020447305.1_ASM2044730v2_genomic.fna/target_genomes_gtdb.txt --output GCF_020447305.1_ASM2044730v2_genomic.fna/fastani_result_gtdb.tsv --threads 1
[2024-01-24 15:11:35,514] [INFO] Task succeeded: fastANI
[2024-01-24 15:11:35,546] [INFO] Found 35 fastANI hits (0 hits with ANI > circumscription radius)
[2024-01-24 15:11:35,546] [INFO] GTDB search result
--------------------------------------------------------------------------------
accession	gtdb_species	ani	matched_fragments	total_fragments	gtdb_taxonomy	ani_circumscription_radius	mean_intra_species_ani	min_intra_species_ani	mean_intra_species_af	min_intra_species_af	num_clustered_genomes	status
GCF_000429765.1	s__Wagnerdoeblera nectariphila	77.8337	256	1069	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhodobacterales;f__Rhodobacteraceae;g__Wagnerdoeblera	95.0	N/A	N/A	N/A	N/A	1	-
GCF_004343505.1	s__Rhodovulum bhavnagarense	77.5492	184	1069	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhodobacterales;f__Rhodobacteraceae;g__Rhodovulum	95.0	N/A	N/A	N/A	N/A	1	-
GCF_001720585.1	s__Cereibacter_A johrii	77.472	163	1069	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhodobacterales;f__Rhodobacteraceae;g__Cereibacter_A	95.1736	98.42	97.86	0.94	0.91	6	-
GCF_014196965.1	s__Actibacterium_A naphthalenivorans	77.4211	240	1069	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhodobacterales;f__Rhodobacteraceae;g__Actibacterium_A	95.0	98.28	97.38	0.93	0.89	4	-
GCF_009649175.1	s__Rhodovulum strictum	77.3755	236	1069	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhodobacterales;f__Rhodobacteraceae;g__Rhodovulum	95.0	N/A	N/A	N/A	N/A	1	-
GCA_016278325.1	s__JADQCB01 sp016278325	77.3663	219	1069	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhodobacterales;f__Rhodobacteraceae;g__JADQCB01	95.0	N/A	N/A	N/A	N/A	1	-
GCF_900177545.1	s__Muriiphilus sp900177545	77.1696	175	1069	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhodobacterales;f__Rhodobacteraceae;g__Muriiphilus	95.0	N/A	N/A	N/A	N/A	1	-
GCF_900142185.1	s__Lutimaribacter pacificus	77.157	267	1069	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhodobacterales;f__Rhodobacteraceae;g__Lutimaribacter	95.0	100.00	100.00	1.00	1.00	2	-
GCF_003688285.1	s__Rhodophyticola porphyridii	77.0487	153	1069	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhodobacterales;f__Rhodobacteraceae;g__Rhodophyticola	95.0	N/A	N/A	N/A	N/A	1	-
GCF_900156255.1	s__Paracoccus thiocyanatus	77.02	211	1069	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhodobacterales;f__Rhodobacteraceae;g__Paracoccus	95.0	97.14	97.14	0.86	0.86	2	-
GCF_003709565.1	s__Paracoccus siganidrum	77.0135	242	1069	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhodobacterales;f__Rhodobacteraceae;g__Paracoccus	95.0	99.99	99.99	0.98	0.98	2	-
GCF_900142935.1	s__Rhodovulum sp900142935	76.9984	178	1069	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhodobacterales;f__Rhodobacteraceae;g__Rhodovulum	95.0	N/A	N/A	N/A	N/A	1	-
GCA_016936395.1	s__ETT8 sp016936395	76.9958	199	1069	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhodobacterales;f__Rhodobacteraceae;g__ETT8	95.0	N/A	N/A	N/A	N/A	1	-
GCF_014640115.1	s__Muriiphilus lacisalsi	76.9884	169	1069	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhodobacterales;f__Rhodobacteraceae;g__Muriiphilus	95.0	N/A	N/A	N/A	N/A	1	-
GCA_015485375.1	s__UBA3077 sp015485375	76.9396	148	1069	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhodobacterales;f__Rhodobacteraceae;g__UBA3077	95.0	N/A	N/A	N/A	N/A	1	-
GCF_009833495.1	s__Oceanomicrobium pacificus	76.9241	224	1069	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhodobacterales;f__Rhodobacteraceae;g__Oceanomicrobium	95.0	N/A	N/A	N/A	N/A	1	-
GCF_009762325.1	s__Roseovarius faecimaris	76.8995	185	1069	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhodobacterales;f__Rhodobacteraceae;g__Roseovarius	95.0	N/A	N/A	N/A	N/A	1	-
GCA_011048875.1	s__Rhodovulum sp011048875	76.8449	178	1069	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhodobacterales;f__Rhodobacteraceae;g__Rhodovulum	95.0	N/A	N/A	N/A	N/A	1	-
GCA_017644125.1	s__Rhodophyticola sp017644125	76.8413	174	1069	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhodobacterales;f__Rhodobacteraceae;g__Rhodophyticola	95.0	99.99	99.99	0.99	0.98	3	-
GCF_002900965.1	s__ETT8 sp002900965	76.827	218	1069	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhodobacterales;f__Rhodobacteraceae;g__ETT8	95.0	N/A	N/A	N/A	N/A	1	-
GCF_018139985.1	s__JAGSOU01 sp018139985	76.808	241	1069	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhodobacterales;f__Rhodobacteraceae;g__JAGSOU01	95.0	N/A	N/A	N/A	N/A	1	-
GCA_007827395.1	s__Rhodophyticola sp007827395	76.6927	155	1069	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhodobacterales;f__Rhodobacteraceae;g__Rhodophyticola	95.0	95.40	95.35	0.93	0.92	216	-
GCF_003265325.1	s__Halovulum sediminis	76.685	200	1069	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhodobacterales;f__Rhodobacteraceae;g__Halovulum	95.0	N/A	N/A	N/A	N/A	1	-
GCA_002748285.1	s__Profundibacter sp002748285	76.6808	173	1069	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhodobacterales;f__Rhodobacteraceae;g__Profundibacter	95.0	N/A	N/A	N/A	N/A	1	-
GCA_016938875.1	s__Rhodovulum sp016938875	76.6561	195	1069	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhodobacterales;f__Rhodobacteraceae;g__Rhodovulum	95.0	N/A	N/A	N/A	N/A	1	-
GCF_900110775.1	s__Litorimicrobium taeanense	76.6353	181	1069	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhodobacterales;f__Rhodobacteraceae;g__Litorimicrobium	95.0	N/A	N/A	N/A	N/A	1	-
GCA_007120685.1	s__Pararhodobacter sp007120685	76.6259	167	1069	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhodobacterales;f__Rhodobacteraceae;g__Pararhodobacter	95.0	99.30	99.30	0.84	0.84	2	-
GCA_015689745.1	s__Roseicyclus sp015689745	76.6125	168	1069	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhodobacterales;f__Rhodobacteraceae;g__Roseicyclus	95.0	N/A	N/A	N/A	N/A	1	-
GCF_000012905.2	s__Cereibacter_A sphaeroides	76.5855	158	1069	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhodobacterales;f__Rhodobacteraceae;g__Cereibacter_A	95.1736	98.82	96.57	0.94	0.87	24	-
GCA_015487335.1	s__S012-89 sp015487335	76.5835	159	1069	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhodobacterales;f__Rhodobacteraceae;g__S012-89	95.0	N/A	N/A	N/A	N/A	1	-
GCF_004145845.1	s__Marivivens sp004145845	76.5831	193	1069	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhodobacterales;f__Rhodobacteraceae;g__Marivivens	95.0	N/A	N/A	N/A	N/A	1	-
GCA_007118445.1	s__Roseinatronobacter sp007118445	76.5398	151	1069	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhodobacterales;f__Rhodobacteraceae;g__Roseinatronobacter	95.0	99.03	98.90	0.77	0.72	3	-
GCA_012032275.1	s__JAAURK01 sp012032275	76.4986	147	1069	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhodobacterales;f__Rhodobacteraceae;g__JAAURK01	95.0	N/A	N/A	N/A	N/A	1	-
GCA_007121115.1	s__PUOA01 sp007121115	76.4485	157	1069	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhodobacterales;f__Rhodobacteraceae;g__PUOA01	95.0	99.30	99.15	0.84	0.78	4	-
GCA_013151335.1	s__UBA5972 sp013151335	76.1876	131	1069	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhodobacterales;f__Rhodobacteraceae;g__UBA5972	95.0	N/A	N/A	N/A	N/A	1	-
--------------------------------------------------------------------------------
[2024-01-24 15:11:35,549] [INFO] GTDB search result was written to GCF_020447305.1_ASM2044730v2_genomic.fna/result_gtdb.tsv
[2024-01-24 15:11:35,549] [INFO] ===== GTDB Search completed =====
[2024-01-24 15:11:35,556] [INFO] DFAST_QC result json was written to GCF_020447305.1_ASM2044730v2_genomic.fna/dqc_result.json
[2024-01-24 15:11:35,556] [INFO] DFAST_QC completed!
[2024-01-24 15:11:35,556] [INFO] Total running time: 0h1m40s