[2024-01-24 10:57:50,265] [INFO] DFAST_QC pipeline started.
[2024-01-24 10:57:50,266] [INFO] DFAST_QC version: 0.5.7
[2024-01-24 10:57:50,267] [INFO] DQC Reference Directory: /var/lib/cwl/stge1545c9d-2174-4572-af6b-bde0aacb42f5/dqc_reference
[2024-01-24 10:57:52,640] [INFO] ===== Start taxonomy check using ANI =====
[2024-01-24 10:57:52,641] [INFO] Task started: Prodigal
[2024-01-24 10:57:52,641] [INFO] Running command: gunzip -c /var/lib/cwl/stg695adc08-306d-4616-946f-50f843528009/GCF_017921975.2_ASM1792197v2_genomic.fna.gz | prodigal -d GCF_017921975.2_ASM1792197v2_genomic.fna/cds.fna -a GCF_017921975.2_ASM1792197v2_genomic.fna/protein.faa -g 11 -q > /dev/null
[2024-01-24 10:58:32,665] [INFO] Task succeeded: Prodigal
[2024-01-24 10:58:32,665] [INFO] Task started: HMMsearch
[2024-01-24 10:58:32,666] [INFO] Running command: hmmsearch --tblout GCF_017921975.2_ASM1792197v2_genomic.fna/hmmer_result.tsv -E 1E-50 /var/lib/cwl/stge1545c9d-2174-4572-af6b-bde0aacb42f5/dqc_reference/reference_markers.hmm GCF_017921975.2_ASM1792197v2_genomic.fna/protein.faa > /dev/null
[2024-01-24 10:58:32,984] [INFO] Task succeeded: HMMsearch
[2024-01-24 10:58:32,986] [INFO] Found 6/6 markers.
[2024-01-24 10:58:33,038] [INFO] Query marker FASTA was written to GCF_017921975.2_ASM1792197v2_genomic.fna/markers.fasta
[2024-01-24 10:58:33,038] [INFO] Task started: Blastn
[2024-01-24 10:58:33,038] [INFO] Running command: blastn -query GCF_017921975.2_ASM1792197v2_genomic.fna/markers.fasta -db /var/lib/cwl/stge1545c9d-2174-4572-af6b-bde0aacb42f5/dqc_reference/reference_markers.fasta -out GCF_017921975.2_ASM1792197v2_genomic.fna/blast.markers.tsv -outfmt 6 -max_hsps 1 -num_alignments 5
[2024-01-24 10:58:33,671] [INFO] Task succeeded: Blastn
[2024-01-24 10:58:33,675] [INFO] Selected 17 target genomes.
[2024-01-24 10:58:33,676] [INFO] Target genome list was writen to GCF_017921975.2_ASM1792197v2_genomic.fna/target_genomes.txt
[2024-01-24 10:58:33,684] [INFO] Task started: fastANI
[2024-01-24 10:58:33,684] [INFO] Running command: fastANI --query /var/lib/cwl/stg695adc08-306d-4616-946f-50f843528009/GCF_017921975.2_ASM1792197v2_genomic.fna.gz --refList GCF_017921975.2_ASM1792197v2_genomic.fna/target_genomes.txt --output GCF_017921975.2_ASM1792197v2_genomic.fna/fastani_result.tsv --threads 1
[2024-01-24 10:58:55,038] [INFO] Task succeeded: fastANI
[2024-01-24 10:58:55,038] [INFO] Loading species specific ANI threshold from /var/lib/cwl/stge1545c9d-2174-4572-af6b-bde0aacb42f5/dqc_reference/prokaryote_ANI_species_specific_threshold.txt
[2024-01-24 10:58:55,039] [WARNING] Species-specific ANI threshold file not found. Will use the default threshold for all species. [/var/lib/cwl/stge1545c9d-2174-4572-af6b-bde0aacb42f5/dqc_reference/prokaryote_ANI_species_specific_threshold.txt]
[2024-01-24 10:58:55,057] [INFO] Found 17 fastANI hits (1 hits with ANI > threshold)
[2024-01-24 10:58:55,058] [INFO] The taxonomy check result is classified as 'conclusive'.
[2024-01-24 10:58:55,058] [INFO] DFAST Taxonomy check final result
--------------------------------------------------------------------------------
organism_name	strain	accession	taxid	species_taxid	relation_to_type	validated	ani	matched_fragments	total_fragments	ani_threshold	status
Hymenobacter terricola	strain=3F2	GCA_017921975.2	2819236	2819236	type	True	100.0	2080	2089	95	conclusive
Hymenobacter artigasi	strain=1B	GCA_012275535.1	2719616	2719616	type	True	85.2225	1295	2089	95	below_threshold
Hymenobacter rubidus	strain=DG7B	GCA_016734815.1	1441626	1441626	type	True	84.777	1225	2089	95	below_threshold
Hymenobacter ruricola	strain=BT662	GCA_015694525.1	2791023	2791023	type	True	84.6825	1278	2089	95	below_threshold
Hymenobacter armeniacus	strain=BT189	GCA_014699055.1	2771358	2771358	type	True	84.551	1246	2089	95	below_threshold
Hymenobacter frigidus	strain=CGMCC 1.14966	GCA_014640435.1	1524095	1524095	type	True	84.0684	1105	2089	95	below_threshold
Hymenobacter properus	strain=BT439	GCA_015694735.1	2791026	2791026	type	True	83.8805	1237	2089	95	below_threshold
Hymenobacter sedentarius	strain=DG5B	GCA_001507645.1	1411621	1411621	type	True	83.8197	1153	2089	95	below_threshold
Hymenobacter glacialis	strain=CCM 8648	GCA_001816165.1	1908236	1908236	type	True	83.3756	995	2089	95	below_threshold
Hymenobacter lapidarius	strain=CCM 8643	GCA_001816145.1	1908237	1908237	type	True	83.3218	1063	2089	95	below_threshold
Hymenobacter jeongseonensis	strain=BT683	GCA_015694725.1	2791027	2791027	type	True	82.9485	1135	2089	95	below_threshold
Hymenobacter terrenus	strain=MIMtkLc17	GCA_000972495.1	1629124	1629124	type	True	82.1685	1112	2089	95	below_threshold
Hymenobacter montanus	strain=BT664	GCA_014699115.1	2771359	2771359	type	True	81.7855	972	2089	95	below_threshold
Hymenobacter guriensis	strain=BT594	GCA_015773195.1	2793065	2793065	type	True	78.9767	733	2089	95	below_threshold
Hymenobacter metallicola	strain=9PBR-1	GCA_004745645.1	2563114	2563114	type	True	78.6946	755	2089	95	below_threshold
Frankia canadensis		GCA_900197875.1	1836972	1836972	type	True	74.6588	94	2089	95	below_threshold
Lujinxingia vulgaris	strain=TMQ4	GCA_007997015.1	2600176	2600176	type	True	74.6322	64	2089	95	below_threshold
--------------------------------------------------------------------------------
[2024-01-24 10:58:55,060] [INFO] DFAST Taxonomy check result was written to GCF_017921975.2_ASM1792197v2_genomic.fna/tc_result.tsv
[2024-01-24 10:58:55,060] [INFO] ===== Taxonomy check completed =====
[2024-01-24 10:58:55,060] [INFO] ===== Start completeness check using CheckM =====
[2024-01-24 10:58:55,060] [INFO] Setting CHECKM_DATA_PATH to /var/lib/cwl/stge1545c9d-2174-4572-af6b-bde0aacb42f5/dqc_reference/checkm_data
[2024-01-24 10:58:55,061] [INFO] Selected 'Prokaryote' markers (life, taxid=0) for CheckM
[2024-01-24 10:58:55,121] [INFO] Task started: CheckM
[2024-01-24 10:58:55,122] [INFO] Running command: checkm taxonomy_wf --tab_table -f GCF_017921975.2_ASM1792197v2_genomic.fna/cc_result.tsv -t 1 life "Prokaryote" GCF_017921975.2_ASM1792197v2_genomic.fna/checkm_input GCF_017921975.2_ASM1792197v2_genomic.fna/checkm_result
[2024-01-24 11:00:33,993] [INFO] Task succeeded: CheckM
[2024-01-24 11:00:33,994] [INFO] Completeness check finished.
--------------------------------------------------------------------------------
Completeness: 100.00%
Contamintation: 0.00%
Strain heterogeneity: 0.00%
--------------------------------------------------------------------------------
[2024-01-24 11:00:34,018] [INFO] ===== Completeness check finished =====
[2024-01-24 11:00:34,018] [INFO] ===== Start GTDB Search =====
[2024-01-24 11:00:34,019] [INFO] Query marker FASTA already exists. Will reuse it. (GCF_017921975.2_ASM1792197v2_genomic.fna/markers.fasta)
[2024-01-24 11:00:34,019] [INFO] Task started: Blastn
[2024-01-24 11:00:34,019] [INFO] Running command: blastn -query GCF_017921975.2_ASM1792197v2_genomic.fna/markers.fasta -db /var/lib/cwl/stge1545c9d-2174-4572-af6b-bde0aacb42f5/dqc_reference/reference_markers_gtdb.fasta -out GCF_017921975.2_ASM1792197v2_genomic.fna/blast.markers.gtdb.tsv -outfmt 6 -max_hsps 1 -num_alignments 5
[2024-01-24 11:00:34,917] [INFO] Task succeeded: Blastn
[2024-01-24 11:00:34,923] [INFO] Selected 17 target genomes.
[2024-01-24 11:00:34,923] [INFO] Target genome list was writen to GCF_017921975.2_ASM1792197v2_genomic.fna/target_genomes_gtdb.txt
[2024-01-24 11:00:34,945] [INFO] Task started: fastANI
[2024-01-24 11:00:34,946] [INFO] Running command: fastANI --query /var/lib/cwl/stg695adc08-306d-4616-946f-50f843528009/GCF_017921975.2_ASM1792197v2_genomic.fna.gz --refList GCF_017921975.2_ASM1792197v2_genomic.fna/target_genomes_gtdb.txt --output GCF_017921975.2_ASM1792197v2_genomic.fna/fastani_result_gtdb.tsv --threads 1
[2024-01-24 11:00:57,013] [INFO] Task succeeded: fastANI
[2024-01-24 11:00:57,032] [INFO] Found 17 fastANI hits (0 hits with ANI > circumscription radius)
[2024-01-24 11:00:57,033] [INFO] GTDB search result
--------------------------------------------------------------------------------
accession	gtdb_species	ani	matched_fragments	total_fragments	gtdb_taxonomy	ani_circumscription_radius	mean_intra_species_ani	min_intra_species_ani	mean_intra_species_af	min_intra_species_af	num_clustered_genomes	status
GCF_012275535.1	s__Hymenobacter artigasi	85.2228	1295	2089	d__Bacteria;p__Bacteroidota;c__Bacteroidia;o__Cytophagales;f__Hymenobacteraceae;g__Hymenobacter	95.0	N/A	N/A	N/A	N/A	1	-
GCF_018967845.1	s__Hymenobacter sp018967845	84.8006	1230	2089	d__Bacteria;p__Bacteroidota;c__Bacteroidia;o__Cytophagales;f__Hymenobacteraceae;g__Hymenobacter	95.0	99.02	98.98	0.94	0.93	3	-
GCF_016734815.1	s__Hymenobacter rubidus	84.7794	1224	2089	d__Bacteria;p__Bacteroidota;c__Bacteroidia;o__Cytophagales;f__Hymenobacteraceae;g__Hymenobacter	95.0	N/A	N/A	N/A	N/A	1	-
GCF_016056375.1	s__Hymenobacter negativus_A	84.7261	1288	2089	d__Bacteria;p__Bacteroidota;c__Bacteroidia;o__Cytophagales;f__Hymenobacteraceae;g__Hymenobacter	95.0	97.78	97.78	0.95	0.95	3	-
GCF_015694525.1	s__Hymenobacter ruricola	84.6829	1278	2089	d__Bacteria;p__Bacteroidota;c__Bacteroidia;o__Cytophagales;f__Hymenobacteraceae;g__Hymenobacter	95.0	N/A	N/A	N/A	N/A	1	-
GCF_016427455.1	s__Hymenobacter sp016427455	84.4798	1275	2089	d__Bacteria;p__Bacteroidota;c__Bacteroidia;o__Cytophagales;f__Hymenobacteraceae;g__Hymenobacter	95.0	95.46	95.46	0.89	0.89	2	-
GCF_014640435.1	s__Hymenobacter frigidus	84.0819	1104	2089	d__Bacteria;p__Bacteroidota;c__Bacteroidia;o__Cytophagales;f__Hymenobacteraceae;g__Hymenobacter	95.0	N/A	N/A	N/A	N/A	1	-
GCF_015694735.1	s__Hymenobacter properus	83.871	1238	2089	d__Bacteria;p__Bacteroidota;c__Bacteroidia;o__Cytophagales;f__Hymenobacteraceae;g__Hymenobacter	95.0	100.00	100.00	1.00	1.00	2	-
GCF_001507645.1	s__Hymenobacter sedentarius	83.8198	1153	2089	d__Bacteria;p__Bacteroidota;c__Bacteroidia;o__Cytophagales;f__Hymenobacteraceae;g__Hymenobacter	95.0	N/A	N/A	N/A	N/A	1	-
GCF_001816145.1	s__Hymenobacter lapidarius	83.3597	1058	2089	d__Bacteria;p__Bacteroidota;c__Bacteroidia;o__Cytophagales;f__Hymenobacteraceae;g__Hymenobacter	95.0	N/A	N/A	N/A	N/A	1	-
GCF_001816165.1	s__Hymenobacter glacialis	83.3512	998	2089	d__Bacteria;p__Bacteroidota;c__Bacteroidia;o__Cytophagales;f__Hymenobacteraceae;g__Hymenobacter	95.0	N/A	N/A	N/A	N/A	1	-
GCF_000715495.1	s__Hymenobacter sp000715495	83.2561	1109	2089	d__Bacteria;p__Bacteroidota;c__Bacteroidia;o__Cytophagales;f__Hymenobacteraceae;g__Hymenobacter	95.0	N/A	N/A	N/A	N/A	1	-
GCF_017571495.1	s__Hymenobacter negativus	83.0476	1304	2089	d__Bacteria;p__Bacteroidota;c__Bacteroidia;o__Cytophagales;f__Hymenobacteraceae;g__Hymenobacter	95.0	N/A	N/A	N/A	N/A	1	-
GCF_015694725.1	s__Hymenobacter jeongseonensis	82.9171	1140	2089	d__Bacteria;p__Bacteroidota;c__Bacteroidia;o__Cytophagales;f__Hymenobacteraceae;g__Hymenobacter	95.0	N/A	N/A	N/A	N/A	1	-
GCF_000972495.1	s__Hymenobacter terrenus	82.172	1112	2089	d__Bacteria;p__Bacteroidota;c__Bacteroidia;o__Cytophagales;f__Hymenobacteraceae;g__Hymenobacter	95.0	N/A	N/A	N/A	N/A	1	-
GCF_014699115.1	s__Hymenobacter sp014699115	81.7664	975	2089	d__Bacteria;p__Bacteroidota;c__Bacteroidia;o__Cytophagales;f__Hymenobacteraceae;g__Hymenobacter	95.0	N/A	N/A	N/A	N/A	1	-
GCF_003583925.1	s__Hymenobacter rubripertinctus	79.3899	753	2089	d__Bacteria;p__Bacteroidota;c__Bacteroidia;o__Cytophagales;f__Hymenobacteraceae;g__Hymenobacter	95.0	N/A	N/A	N/A	N/A	1	-
--------------------------------------------------------------------------------
[2024-01-24 11:00:57,035] [INFO] GTDB search result was written to GCF_017921975.2_ASM1792197v2_genomic.fna/result_gtdb.tsv
[2024-01-24 11:00:57,035] [INFO] ===== GTDB Search completed =====
[2024-01-24 11:00:57,040] [INFO] DFAST_QC result json was written to GCF_017921975.2_ASM1792197v2_genomic.fna/dqc_result.json
[2024-01-24 11:00:57,040] [INFO] DFAST_QC completed!
[2024-01-24 11:00:57,040] [INFO] Total running time: 0h3m7s