[2023-06-08 09:05:12,721] [INFO] DFAST_QC pipeline started.
[2023-06-08 09:05:12,724] [INFO] DFAST_QC version: 0.5.7
[2023-06-08 09:05:12,724] [INFO] DQC Reference Directory: /var/lib/cwl/stg4862f2f5-98a7-4ca7-85d6-5f9f1dc003b1/dqc_reference
[2023-06-08 09:05:14,669] [INFO] ===== Start taxonomy check using ANI =====
[2023-06-08 09:05:14,670] [INFO] Task started: Prodigal
[2023-06-08 09:05:14,670] [INFO] Running command: gunzip -c /var/lib/cwl/stg888c404b-ff77-42c0-9907-31e81632dd11/GCA_945903685.1_TH-11nov19-65_genomic.fna.gz | prodigal -d GCA_945903685.1_TH-11nov19-65_genomic.fna/cds.fna -a GCA_945903685.1_TH-11nov19-65_genomic.fna/protein.faa -g 11 -q > /dev/null
[2023-06-08 09:05:20,343] [INFO] Task succeeded: Prodigal
[2023-06-08 09:05:20,344] [INFO] Task started: HMMsearch
[2023-06-08 09:05:20,344] [INFO] Running command: hmmsearch --tblout GCA_945903685.1_TH-11nov19-65_genomic.fna/hmmer_result.tsv -E 1E-50 /var/lib/cwl/stg4862f2f5-98a7-4ca7-85d6-5f9f1dc003b1/dqc_reference/reference_markers.hmm GCA_945903685.1_TH-11nov19-65_genomic.fna/protein.faa > /dev/null
[2023-06-08 09:05:20,572] [INFO] Task succeeded: HMMsearch
[2023-06-08 09:05:20,573] [INFO] Found 6/6 markers.
[2023-06-08 09:05:20,601] [INFO] Query marker FASTA was written to GCA_945903685.1_TH-11nov19-65_genomic.fna/markers.fasta
[2023-06-08 09:05:20,601] [INFO] Task started: Blastn
[2023-06-08 09:05:20,601] [INFO] Running command: blastn -query GCA_945903685.1_TH-11nov19-65_genomic.fna/markers.fasta -db /var/lib/cwl/stg4862f2f5-98a7-4ca7-85d6-5f9f1dc003b1/dqc_reference/reference_markers.fasta -out GCA_945903685.1_TH-11nov19-65_genomic.fna/blast.markers.tsv -outfmt 6 -max_hsps 1 -num_alignments 5
[2023-06-08 09:05:21,440] [INFO] Task succeeded: Blastn
[2023-06-08 09:05:21,445] [INFO] Selected 34 target genomes.
[2023-06-08 09:05:21,446] [INFO] Target genome list was writen to GCA_945903685.1_TH-11nov19-65_genomic.fna/target_genomes.txt
[2023-06-08 09:05:21,456] [INFO] Task started: fastANI
[2023-06-08 09:05:21,457] [INFO] Running command: fastANI --query /var/lib/cwl/stg888c404b-ff77-42c0-9907-31e81632dd11/GCA_945903685.1_TH-11nov19-65_genomic.fna.gz --refList GCA_945903685.1_TH-11nov19-65_genomic.fna/target_genomes.txt --output GCA_945903685.1_TH-11nov19-65_genomic.fna/fastani_result.tsv --threads 1
[2023-06-08 09:05:47,266] [INFO] Task succeeded: fastANI
[2023-06-08 09:05:47,267] [INFO] Loading species specific ANI threshold from /var/lib/cwl/stg4862f2f5-98a7-4ca7-85d6-5f9f1dc003b1/dqc_reference/prokaryote_ANI_species_specific_threshold.txt
[2023-06-08 09:05:47,267] [WARNING] Species-specific ANI threshold file not found. Will use the default threshold for all species. [/var/lib/cwl/stg4862f2f5-98a7-4ca7-85d6-5f9f1dc003b1/dqc_reference/prokaryote_ANI_species_specific_threshold.txt]
[2023-06-08 09:05:47,289] [INFO] Found 30 fastANI hits (0 hits with ANI > threshold)
[2023-06-08 09:05:47,290] [INFO] The taxonomy check result is classified as 'below_threshold'.
[2023-06-08 09:05:47,290] [INFO] DFAST Taxonomy check final result
--------------------------------------------------------------------------------
organism_name	strain	accession	taxid	species_taxid	relation_to_type	validated	ani	matched_fragments	total_fragments	ani_threshold	status
Azospirillum ramasamyi	strain=M2T2B2	GCA_003233655.1	682998	682998	type	True	76.4488	121	689	95	below_threshold
Stella humosa	strain=ATCC 43930	GCA_006738645.1	94	94	type	True	76.4011	152	689	95	below_threshold
Hypericibacter terrae	strain=R5913	GCA_008728855.1	2602015	2602015	type	True	76.3325	134	689	95	below_threshold
Hypericibacter adhaerens	strain=R5959	GCA_008728835.1	2602016	2602016	type	True	76.3293	160	689	95	below_threshold
Stella humosa	strain=DSM 5900	GCA_003751345.1	94	94	type	True	76.3205	159	689	95	below_threshold
Azospirillum picis	strain=IMMIB TAR-3	GCA_017876115.1	488438	488438	type	True	76.177	145	689	95	below_threshold
Acidiphilium multivorum	strain=AIU301	GCA_000964345.1	62140	62140	type	True	76.1758	69	689	95	below_threshold
Azospirillum thiophilum	strain=BV-S	GCA_001305595.1	528244	528244	type	True	76.1586	125	689	95	below_threshold
Azospirillum thiophilum	strain=DSM 21654	GCA_000960825.1	528244	528244	type	True	76.1449	126	689	95	below_threshold
Thalassobaculum fulvum	strain=KCTC 42651	GCA_014652915.1	1633335	1633335	type	True	76.1394	162	689	95	below_threshold
Propylenella binzhouense	strain=L72	GCA_009866965.1	2555902	2555902	type	True	76.0981	116	689	95	below_threshold
Reyranella soli	strain=NBRC 108950	GCA_007992495.1	1230389	1230389	type	True	76.0734	134	689	95	below_threshold
Oharaeibacter diazotrophicus	strain=SM30	GCA_011317485.1	1920512	1920512	type	True	76.0278	131	689	95	below_threshold
Bradyrhizobium japonicum	strain=NBRC 14783	GCA_006539645.1	375	375	type	True	75.9813	115	689	95	below_threshold
Skermanella mucosa	strain=KEMB 2255-438	GCA_016765655.2	1789672	1789672	type	True	75.9753	139	689	95	below_threshold
Bradyrhizobium japonicum	strain=USDA 6	GCA_000472985.1	375	375	type	True	75.9713	116	689	95	below_threshold
Rhodovulum sulfidophilum	strain=DSM 1374	GCA_001633165.1	35806	35806	type	True	75.9318	57	689	95	below_threshold
Methylobacterium crusticola	strain=KCTC 52305	GCA_022179145.1	1697972	1697972	type	True	75.8631	133	689	95	below_threshold
Methylobacterium crusticola	strain=MIMD6	GCA_003574465.1	1697972	1697972	type	True	75.8277	122	689	95	below_threshold
Oharaeibacter diazotrophicus	strain=DSM 102969	GCA_004362745.1	1920512	1920512	type	True	75.8253	164	689	95	below_threshold
Shinella zoogloeoides	strain=ATCC 19623	GCA_020883495.1	352475	352475	type	True	75.7052	69	689	95	below_threshold
Acidiphilium iwatense	strain=KCTC 23505	GCA_021556475.1	768198	768198	type	True	75.7003	57	689	95	below_threshold
Rubrimonas cliftonensis	strain=DSM 15345	GCA_900107585.1	89524	89524	type	True	75.6947	93	689	95	below_threshold
Sphingosinicella humi	strain=QZX222	GCA_003129465.1	2068657	2068657	type	True	75.6496	60	689	95	below_threshold
Albimonas pacifica	strain=CGMCC 1.11030	GCA_900113695.1	1114924	1114924	type	True	75.609	118	689	95	below_threshold
Kaustia mangrovi	strain=R1DC25	GCA_015482775.1	2593653	2593653	type	True	75.5567	99	689	95	below_threshold
Sphingomonas ginsenosidivorax	strain=KHI67	GCA_007995065.1	862135	862135	type	True	75.4545	84	689	95	below_threshold
Novosphingobium percolationis	strain=c1	GCA_020179425.1	2871811	2871811	type	True	75.4274	52	689	95	below_threshold
Sphingomonas corticis	strain=36D10-4-7	GCA_012035195.1	2722791	2722791	type	True	75.1933	78	689	95	below_threshold
Agrococcus pavilionensis	strain=RW1	GCA_000400485.1	1346502	1346502	type	True	74.763	50	689	95	below_threshold
--------------------------------------------------------------------------------
[2023-06-08 09:05:47,292] [INFO] DFAST Taxonomy check result was written to GCA_945903685.1_TH-11nov19-65_genomic.fna/tc_result.tsv
[2023-06-08 09:05:47,293] [INFO] ===== Taxonomy check completed =====
[2023-06-08 09:05:47,293] [INFO] ===== Start completeness check using CheckM =====
[2023-06-08 09:05:47,293] [INFO] Setting CHECKM_DATA_PATH to /var/lib/cwl/stg4862f2f5-98a7-4ca7-85d6-5f9f1dc003b1/dqc_reference/checkm_data
[2023-06-08 09:05:47,295] [INFO] Selected 'Prokaryote' markers (life, taxid=0) for CheckM
[2023-06-08 09:05:47,328] [INFO] Task started: CheckM
[2023-06-08 09:05:47,329] [INFO] Running command: checkm taxonomy_wf --tab_table -f GCA_945903685.1_TH-11nov19-65_genomic.fna/cc_result.tsv -t 1 life "Prokaryote" GCA_945903685.1_TH-11nov19-65_genomic.fna/checkm_input GCA_945903685.1_TH-11nov19-65_genomic.fna/checkm_result
[2023-06-08 09:06:09,906] [INFO] Task succeeded: CheckM
[2023-06-08 09:06:09,908] [INFO] Completeness check finished.
--------------------------------------------------------------------------------
Completeness: 93.31%
Contamintation: 0.00%
Strain heterogeneity: 0.00%
--------------------------------------------------------------------------------
[2023-06-08 09:06:09,927] [INFO] ===== Completeness check finished =====
[2023-06-08 09:06:09,927] [INFO] ===== Start GTDB Search =====
[2023-06-08 09:06:09,928] [INFO] Query marker FASTA already exists. Will reuse it. (GCA_945903685.1_TH-11nov19-65_genomic.fna/markers.fasta)
[2023-06-08 09:06:09,928] [INFO] Task started: Blastn
[2023-06-08 09:06:09,928] [INFO] Running command: blastn -query GCA_945903685.1_TH-11nov19-65_genomic.fna/markers.fasta -db /var/lib/cwl/stg4862f2f5-98a7-4ca7-85d6-5f9f1dc003b1/dqc_reference/reference_markers_gtdb.fasta -out GCA_945903685.1_TH-11nov19-65_genomic.fna/blast.markers.gtdb.tsv -outfmt 6 -max_hsps 1 -num_alignments 5
[2023-06-08 09:06:11,333] [INFO] Task succeeded: Blastn
[2023-06-08 09:06:11,339] [INFO] Selected 28 target genomes.
[2023-06-08 09:06:11,339] [INFO] Target genome list was writen to GCA_945903685.1_TH-11nov19-65_genomic.fna/target_genomes_gtdb.txt
[2023-06-08 09:06:11,416] [INFO] Task started: fastANI
[2023-06-08 09:06:11,416] [INFO] Running command: fastANI --query /var/lib/cwl/stg888c404b-ff77-42c0-9907-31e81632dd11/GCA_945903685.1_TH-11nov19-65_genomic.fna.gz --refList GCA_945903685.1_TH-11nov19-65_genomic.fna/target_genomes_gtdb.txt --output GCA_945903685.1_TH-11nov19-65_genomic.fna/fastani_result_gtdb.tsv --threads 1
[2023-06-08 09:06:30,126] [INFO] Task succeeded: fastANI
[2023-06-08 09:06:30,155] [INFO] Found 28 fastANI hits (0 hits with ANI > circumscription radius)
[2023-06-08 09:06:30,156] [INFO] GTDB search result
--------------------------------------------------------------------------------
accession	gtdb_species	ani	matched_fragments	total_fragments	gtdb_taxonomy	ani_circumscription_radius	mean_intra_species_ani	min_intra_species_ani	mean_intra_species_af	min_intra_species_af	num_clustered_genomes	status
GCA_016869275.1	s__SHVV01 sp016869275	90.3561	575	689	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__SHVV01;f__SHVV01;g__SHVV01	95.0	N/A	N/A	N/A	N/A	1	-
GCA_009694095.1	s__SHVV01 sp009694095	77.8514	154	689	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__SHVV01;f__SHVV01;g__SHVV01	95.0	N/A	N/A	N/A	N/A	1	-
GCA_016869765.1	s__SHVP01 sp016869765	77.0702	93	689	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__SHVP01;f__SHVP01;g__SHVP01	95.0	N/A	N/A	N/A	N/A	1	-
GCF_902729435.1	s__Magnetospirillum sp902729435	76.6989	108	689	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhodospirillales;f__Magnetospirillaceae;g__Magnetospirillum	95.0	N/A	N/A	N/A	N/A	1	-
GCA_016869145.1	s__VGEU01 sp016869145	76.6903	198	689	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__MPNO01;f__VGEU01;g__VGEU01	95.0	N/A	N/A	N/A	N/A	1	-
GCA_009694165.1	s__SHVP01 sp009694165	76.6426	55	689	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__SHVP01;f__SHVP01;g__SHVP01	95.0	N/A	N/A	N/A	N/A	1	-
GCA_000518365.1	s__URHD0088 sp000518365	76.5531	158	689	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__ATCC43930;f__Stellaceae;g__URHD0088	95.0	N/A	N/A	N/A	N/A	1	-
GCA_018971215.1	s__REEB95 sp018971215	76.4541	108	689	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__ATCC43930;f__Stellaceae;g__REEB95	95.0	N/A	N/A	N/A	N/A	1	-
GCA_902826565.1	s__CADEFF01 sp902826565	76.3807	125	689	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Dongiales;f__Dongiaceae;g__CADEFF01	95.0	N/A	N/A	N/A	N/A	1	-
GCA_016869195.1	s__SHWC01 sp016869195	76.3485	92	689	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__UBA828;f__UBA828;g__SHWC01	95.0	N/A	N/A	N/A	N/A	1	-
GCA_009694025.1	s__SHVZ01 sp009694025	76.3477	97	689	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__SHVZ01;f__SHVZ01;g__SHVZ01	95.0	N/A	N/A	N/A	N/A	1	-
GCA_017305635.1	s__Bauldia sp017305635	76.3433	105	689	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhizobiales;f__Kaistiaceae;g__Bauldia	95.0	N/A	N/A	N/A	N/A	1	-
GCA_003576705.1	s__SYSU-D60015 sp003576705	76.3237	162	689	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Ferrovibrionales;f__Ferrovibrionaceae;g__SYSU-D60015	95.0	N/A	N/A	N/A	N/A	1	-
GCF_008728835.1	s__Hypericibacter adhaerens	76.3187	161	689	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Dongiales;f__Dongiaceae;g__Hypericibacter	95.0	N/A	N/A	N/A	N/A	1	-
GCA_009377525.1	s__Reyranella sp009377525	76.3108	134	689	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Reyranellales;f__Reyranellaceae;g__Reyranella	95.0	N/A	N/A	N/A	N/A	1	-
GCF_014197805.1	s__Rhodospirillum_A centenum	76.2474	89	689	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Azospirillales;f__Azospirillaceae;g__Rhodospirillum_A	95.0	100.00	100.00	1.00	1.00	2	-
GCF_900177515.1	s__Azospirillum oryzae_A	76.2397	134	689	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Azospirillales;f__Azospirillaceae;g__Azospirillum	95.0	N/A	N/A	N/A	N/A	1	-
GCF_001305595.1	s__Azospirillum thiophilum	76.1576	125	689	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Azospirillales;f__Azospirillaceae;g__Azospirillum	95.0	100.00	100.00	1.00	1.00	2	-
GCA_011523425.1	s__WTGU01 sp011523425	76.1497	111	689	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__UBA828;f__UBA828;g__WTGU01	95.0	N/A	N/A	N/A	N/A	1	-
GCA_017307275.1	s__Reyranella sp017307275	76.0982	101	689	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Reyranellales;f__Reyranellaceae;g__Reyranella	95.0	N/A	N/A	N/A	N/A	1	-
GCA_002690215.1	s__GCA-2690215 sp002690215	76.0858	85	689	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__UBA2966;f__UBA2966;g__GCA-2690215	95.0	N/A	N/A	N/A	N/A	1	-
GCA_903907895.1	s__CAIVPW01 sp903907895	76.0664	64	689	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__CAIVPW01;f__CAIVPW01;g__CAIVPW01	95.0	N/A	N/A	N/A	N/A	1	-
GCA_009694045.1	s__SHVY01 sp009694045	76.0661	94	689	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__SP197;f__SP197;g__SHVY01	95.0	N/A	N/A	N/A	N/A	1	-
GCF_003966125.1	s__Azospirillum griseum	75.9285	86	689	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Azospirillales;f__Azospirillaceae;g__Azospirillum	95.0	N/A	N/A	N/A	N/A	1	-
GCF_004362745.1	s__Oharaeibacter diazotrophicus	75.8255	164	689	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhizobiales;f__Pleomorphomonadaceae;g__Oharaeibacter	95.0	99.98	99.97	1.00	1.00	3	-
GCF_000964365.1	s__Acidisphaera rubrifaciens	75.817	65	689	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Acetobacterales;f__Acetobacteraceae;g__Acidisphaera	95.0	N/A	N/A	N/A	N/A	1	-
GCF_013359755.1	s__Tardiphaga robiniae	75.8101	73	689	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhizobiales;f__Xanthobacteraceae;g__Tardiphaga	95.0	96.54	96.42	0.88	0.87	5	-
GCA_017983635.1	s__SZUA-430 sp017983635	75.8088	61	689	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Micropepsales;f__Micropepsaceae;g__SZUA-430	95.0	N/A	N/A	N/A	N/A	1	-
--------------------------------------------------------------------------------
[2023-06-08 09:06:30,166] [INFO] GTDB search result was written to GCA_945903685.1_TH-11nov19-65_genomic.fna/result_gtdb.tsv
[2023-06-08 09:06:30,167] [INFO] ===== GTDB Search completed =====
[2023-06-08 09:06:30,173] [INFO] DFAST_QC result json was written to GCA_945903685.1_TH-11nov19-65_genomic.fna/dqc_result.json
[2023-06-08 09:06:30,173] [INFO] DFAST_QC completed!
[2023-06-08 09:06:30,174] [INFO] Total running time: 0h1m17s