[2023-06-12 23:45:30,452] [INFO] DFAST_QC pipeline started.
[2023-06-12 23:45:30,455] [INFO] DFAST_QC version: 0.5.7
[2023-06-12 23:45:30,455] [INFO] DQC Reference Directory: /var/lib/cwl/stg84596b16-9400-4ec4-839f-e6b376afeab7/dqc_reference
[2023-06-12 23:45:33,433] [INFO] ===== Start taxonomy check using ANI =====
[2023-06-12 23:45:33,434] [INFO] Task started: Prodigal
[2023-06-12 23:45:33,434] [INFO] Running command: gunzip -c /var/lib/cwl/stg368b46de-8d77-4fc7-bffa-94a442e6435a/GCA_022448845.1_ASM2244884v1_genomic.fna.gz | prodigal -d GCA_022448845.1_ASM2244884v1_genomic.fna/cds.fna -a GCA_022448845.1_ASM2244884v1_genomic.fna/protein.faa -g 11 -q > /dev/null
[2023-06-12 23:45:38,223] [INFO] Task succeeded: Prodigal
[2023-06-12 23:45:38,224] [INFO] Task started: HMMsearch
[2023-06-12 23:45:38,224] [INFO] Running command: hmmsearch --tblout GCA_022448845.1_ASM2244884v1_genomic.fna/hmmer_result.tsv -E 1E-50 /var/lib/cwl/stg84596b16-9400-4ec4-839f-e6b376afeab7/dqc_reference/reference_markers.hmm GCA_022448845.1_ASM2244884v1_genomic.fna/protein.faa > /dev/null
[2023-06-12 23:45:38,437] [INFO] Task succeeded: HMMsearch
[2023-06-12 23:45:38,439] [INFO] Found 6/6 markers.
[2023-06-12 23:45:38,470] [INFO] Query marker FASTA was written to GCA_022448845.1_ASM2244884v1_genomic.fna/markers.fasta
[2023-06-12 23:45:38,471] [INFO] Task started: Blastn
[2023-06-12 23:45:38,471] [INFO] Running command: blastn -query GCA_022448845.1_ASM2244884v1_genomic.fna/markers.fasta -db /var/lib/cwl/stg84596b16-9400-4ec4-839f-e6b376afeab7/dqc_reference/reference_markers.fasta -out GCA_022448845.1_ASM2244884v1_genomic.fna/blast.markers.tsv -outfmt 6 -max_hsps 1 -num_alignments 5
[2023-06-12 23:45:39,269] [INFO] Task succeeded: Blastn
[2023-06-12 23:45:39,274] [INFO] Selected 31 target genomes.
[2023-06-12 23:45:39,275] [INFO] Target genome list was writen to GCA_022448845.1_ASM2244884v1_genomic.fna/target_genomes.txt
[2023-06-12 23:45:39,284] [INFO] Task started: fastANI
[2023-06-12 23:45:39,284] [INFO] Running command: fastANI --query /var/lib/cwl/stg368b46de-8d77-4fc7-bffa-94a442e6435a/GCA_022448845.1_ASM2244884v1_genomic.fna.gz --refList GCA_022448845.1_ASM2244884v1_genomic.fna/target_genomes.txt --output GCA_022448845.1_ASM2244884v1_genomic.fna/fastani_result.tsv --threads 1
[2023-06-12 23:46:02,376] [INFO] Task succeeded: fastANI
[2023-06-12 23:46:02,377] [INFO] Loading species specific ANI threshold from /var/lib/cwl/stg84596b16-9400-4ec4-839f-e6b376afeab7/dqc_reference/prokaryote_ANI_species_specific_threshold.txt
[2023-06-12 23:46:02,378] [WARNING] Species-specific ANI threshold file not found. Will use the default threshold for all species. [/var/lib/cwl/stg84596b16-9400-4ec4-839f-e6b376afeab7/dqc_reference/prokaryote_ANI_species_specific_threshold.txt]
[2023-06-12 23:46:02,400] [INFO] Found 29 fastANI hits (0 hits with ANI > threshold)
[2023-06-12 23:46:02,401] [INFO] The taxonomy check result is classified as 'below_threshold'.
[2023-06-12 23:46:02,401] [INFO] DFAST Taxonomy check final result
--------------------------------------------------------------------------------
organism_name	strain	accession	taxid	species_taxid	relation_to_type	validated	ani	matched_fragments	total_fragments	ani_threshold	status
Pelagibius marinus	strain=NBU2595	GCA_014925385.1	2762760	2762760	type	True	76.9254	106	467	95	below_threshold
Magnetospirillum caucaseum	strain=SO-1	GCA_000342045.1	1244869	1244869	type	True	76.5827	102	467	95	below_threshold
Azospirillum ramasamyi	strain=M2T2B2	GCA_003233655.1	682998	682998	type	True	76.5394	93	467	95	below_threshold
Caenispirillum salinarum	strain=AK4	GCA_000315795.1	859058	859058	type	True	76.525	91	467	95	below_threshold
Azospirillum melinis	strain=TMCY 0552	GCA_017876055.1	328839	328839	type	True	76.4973	82	467	95	below_threshold
Azospirillum melinis	strain=TMCY0552	GCA_013340935.1	328839	328839	type	True	76.4973	82	467	95	below_threshold
Rhodothalassium salexigens	strain=DSM 2132	GCA_016583875.1	1086	1086	type	True	76.4413	61	467	95	below_threshold
Oceanibaculum nanhaiense	strain=L54-1-50	GCA_002148795.1	1909734	1909734	type	True	76.4375	65	467	95	below_threshold
Azospirillum griseum	strain=L-25-5 w-1	GCA_003966125.1	2496639	2496639	type	True	76.4029	67	467	95	below_threshold
Azospirillum humicireducens	strain=SgZ-5	GCA_001639105.2	1226968	1226968	type	True	76.3728	85	467	95	below_threshold
Stella humosa	strain=ATCC 43930	GCA_006738645.1	94	94	type	True	76.3643	82	467	95	below_threshold
Stella humosa	strain=DSM 5900	GCA_003751345.1	94	94	type	True	76.3578	81	467	95	below_threshold
Methylobacterium nonmethylotrophicum	strain=6HR-1	GCA_004745635.1	1141884	1141884	type	True	76.3184	51	467	95	below_threshold
Hypericibacter adhaerens	strain=R5959	GCA_008728835.1	2602016	2602016	type	True	76.315	87	467	95	below_threshold
Thalassobaculum fulvum	strain=KCTC 42651	GCA_014652915.1	1633335	1633335	type	True	76.2907	106	467	95	below_threshold
Tistlia consotensis	strain=DSM 21585	GCA_900188055.1	1321365	1321365	type	True	76.2875	107	467	95	below_threshold
Tistlia consotensis	strain=USBA 355	GCA_900177295.1	1321365	1321365	type	True	76.2547	108	467	95	below_threshold
Brevundimonas viscosa	strain=CGMCC 1.10683	GCA_900116065.1	871741	871741	type	True	76.2298	53	467	95	below_threshold
Azospirillum agricola	strain=CC-HIH038	GCA_017876095.1	1720247	1720247	type	True	76.1999	106	467	95	below_threshold
Rhodoplanes elegans	strain=DSM 11907	GCA_003258805.1	29408	29408	type	True	76.1891	54	467	95	below_threshold
Oceanibaculum pacificum	strain=MCCC 1A02656	GCA_001618175.1	580166	580166	type	True	76.1004	76	467	95	below_threshold
Sphingomonas flavalba	strain=ZLT-5	GCA_004796535.1	2559804	2559804	type	True	76.0945	51	467	95	below_threshold
Magnetospirillum kuznetsovii	strain=LBB-42	GCA_003284725.1	2053833	2053833	type	True	76.0696	78	467	95	below_threshold
Rhodoplanes elegans	strain=DSM 11907	GCA_016653355.1	29408	29408	type	True	76.0602	61	467	95	below_threshold
Magnetospirillum marisnigri	strain=SP-1	GCA_001650715.1	1285242	1285242	type	True	75.9617	79	467	95	below_threshold
Methylobacterium crusticola	strain=KCTC 52305	GCA_022179145.1	1697972	1697972	type	True	75.8715	67	467	95	below_threshold
Methylobacterium terricola	strain=17Sr1-39	GCA_006151805.1	2583531	2583531	type	True	75.8666	62	467	95	below_threshold
Reyranella aquatilis	strain=KCTC 52223	GCA_020880995.1	2035356	2035356	type	True	75.8568	73	467	95	below_threshold
Magnetospirillum magnetotacticum	strain=MS-1	GCA_000829825.1	188	188	type	True	75.7846	80	467	95	below_threshold
--------------------------------------------------------------------------------
[2023-06-12 23:46:02,404] [INFO] DFAST Taxonomy check result was written to GCA_022448845.1_ASM2244884v1_genomic.fna/tc_result.tsv
[2023-06-12 23:46:02,405] [INFO] ===== Taxonomy check completed =====
[2023-06-12 23:46:02,405] [INFO] ===== Start completeness check using CheckM =====
[2023-06-12 23:46:02,405] [INFO] Setting CHECKM_DATA_PATH to /var/lib/cwl/stg84596b16-9400-4ec4-839f-e6b376afeab7/dqc_reference/checkm_data
[2023-06-12 23:46:02,406] [INFO] Selected 'Prokaryote' markers (life, taxid=0) for CheckM
[2023-06-12 23:46:02,431] [INFO] Task started: CheckM
[2023-06-12 23:46:02,432] [INFO] Running command: checkm taxonomy_wf --tab_table -f GCA_022448845.1_ASM2244884v1_genomic.fna/cc_result.tsv -t 1 life "Prokaryote" GCA_022448845.1_ASM2244884v1_genomic.fna/checkm_input GCA_022448845.1_ASM2244884v1_genomic.fna/checkm_result
[2023-06-12 23:46:22,799] [INFO] Task succeeded: CheckM
[2023-06-12 23:46:22,800] [INFO] Completeness check finished.
--------------------------------------------------------------------------------
Completeness: 85.32%
Contamintation: 0.00%
Strain heterogeneity: 0.00%
--------------------------------------------------------------------------------
[2023-06-12 23:46:22,825] [INFO] ===== Completeness check finished =====
[2023-06-12 23:46:22,826] [INFO] ===== Start GTDB Search =====
[2023-06-12 23:46:22,826] [INFO] Query marker FASTA already exists. Will reuse it. (GCA_022448845.1_ASM2244884v1_genomic.fna/markers.fasta)
[2023-06-12 23:46:22,826] [INFO] Task started: Blastn
[2023-06-12 23:46:22,827] [INFO] Running command: blastn -query GCA_022448845.1_ASM2244884v1_genomic.fna/markers.fasta -db /var/lib/cwl/stg84596b16-9400-4ec4-839f-e6b376afeab7/dqc_reference/reference_markers_gtdb.fasta -out GCA_022448845.1_ASM2244884v1_genomic.fna/blast.markers.gtdb.tsv -outfmt 6 -max_hsps 1 -num_alignments 5
[2023-06-12 23:46:24,280] [INFO] Task succeeded: Blastn
[2023-06-12 23:46:24,284] [INFO] Selected 13 target genomes.
[2023-06-12 23:46:24,285] [INFO] Target genome list was writen to GCA_022448845.1_ASM2244884v1_genomic.fna/target_genomes_gtdb.txt
[2023-06-12 23:46:24,301] [INFO] Task started: fastANI
[2023-06-12 23:46:24,301] [INFO] Running command: fastANI --query /var/lib/cwl/stg368b46de-8d77-4fc7-bffa-94a442e6435a/GCA_022448845.1_ASM2244884v1_genomic.fna.gz --refList GCA_022448845.1_ASM2244884v1_genomic.fna/target_genomes_gtdb.txt --output GCA_022448845.1_ASM2244884v1_genomic.fna/fastani_result_gtdb.tsv --threads 1
[2023-06-12 23:46:30,679] [INFO] Task succeeded: fastANI
[2023-06-12 23:46:30,692] [INFO] Found 12 fastANI hits (1 hits with ANI > circumscription radius)
[2023-06-12 23:46:30,692] [INFO] GTDB search result
--------------------------------------------------------------------------------
accession	gtdb_species	ani	matched_fragments	total_fragments	gtdb_taxonomy	ani_circumscription_radius	mean_intra_species_ani	min_intra_species_ani	mean_intra_species_af	min_intra_species_af	num_clustered_genomes	status
GCA_002938075.1	s__Casp-alpha2 sp002938075	98.2129	349	467	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhodospirillales;f__Casp-alpha2;g__Casp-alpha2	95.0	98.70	98.63	0.78	0.77	3	conclusive
GCA_002686255.1	s__Casp-alpha2 sp002686255	91.3128	372	467	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhodospirillales;f__Casp-alpha2;g__Casp-alpha2	95.0	99.76	99.74	0.90	0.88	3	-
GCA_016763915.1	s__Casp-alpha2 sp016763915	79.7898	155	467	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhodospirillales;f__Casp-alpha2;g__Casp-alpha2	95.0	N/A	N/A	N/A	N/A	1	-
GCA_014382345.1	s__Casp-alpha2 sp014382345	78.5195	247	467	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhodospirillales;f__Casp-alpha2;g__Casp-alpha2	95.0	99.86	99.86	0.92	0.92	2	-
GCA_016187325.1	s__JACPJY01 sp016187325	78.4776	228	467	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhodospirillales;f__Casp-alpha2;g__JACPJY01	95.0	N/A	N/A	N/A	N/A	1	-
GCA_001510075.1	s__Casp-alpha2 sp001510075	78.4735	163	467	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhodospirillales;f__Casp-alpha2;g__Casp-alpha2	95.0	N/A	N/A	N/A	N/A	1	-
GCA_013203955.1	s__Casp-alpha2 sp013203955	78.4699	254	467	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhodospirillales;f__Casp-alpha2;g__Casp-alpha2	95.0	N/A	N/A	N/A	N/A	1	-
GCA_002938315.1	s__Casp-alpha2 sp002938315	78.2274	207	467	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhodospirillales;f__Casp-alpha2;g__Casp-alpha2	95.0	99.33	99.24	0.87	0.79	4	-
GCA_013204265.1	s__Casp-alpha2 sp013204265	77.4863	168	467	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhodospirillales;f__Casp-alpha2;g__Casp-alpha2	95.0	N/A	N/A	N/A	N/A	1	-
GCA_002325375.1	s__UBA1479 sp002325375	77.0873	146	467	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhodospirillales;f__Casp-alpha2;g__UBA1479	95.0	99.20	96.43	0.95	0.85	6	-
GCA_009694195.1	s__SHVQ01 sp009694195	76.4438	63	467	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhodospirillales;f__2-12-FULL-67-15;g__SHVQ01	95.0	N/A	N/A	N/A	N/A	1	-
GCF_001618175.1	s__Oceanibaculum pacificum	76.1004	76	467	d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Oceanibaculales;f__Oceanibaculaceae;g__Oceanibaculum	95.0	N/A	N/A	N/A	N/A	1	-
--------------------------------------------------------------------------------
[2023-06-12 23:46:30,694] [INFO] GTDB search result was written to GCA_022448845.1_ASM2244884v1_genomic.fna/result_gtdb.tsv
[2023-06-12 23:46:30,694] [INFO] ===== GTDB Search completed =====
[2023-06-12 23:46:30,699] [INFO] DFAST_QC result json was written to GCA_022448845.1_ASM2244884v1_genomic.fna/dqc_result.json
[2023-06-12 23:46:30,699] [INFO] DFAST_QC completed!
[2023-06-12 23:46:30,699] [INFO] Total running time: 0h1m0s