[2023-03-16 11:01:17,064] [INFO] DFAST_QC pipeline started.
[2023-03-16 11:01:17,064] [INFO] DFAST_QC version: 0.5.7
[2023-03-16 11:01:17,064] [INFO] DQC Reference Directory: /var/lib/cwl/stgbcef1bd8-872b-4d8a-81e0-1a7358649d5d/dqc_reference
[2023-03-16 11:01:18,270] [INFO] ===== Start taxonomy check using ANI =====
[2023-03-16 11:01:18,271] [INFO] Task started: Prodigal
[2023-03-16 11:01:18,271] [INFO] Running command: cat /var/lib/cwl/stgfe07a68b-bb0f-4179-95aa-9e6aabdb6926/OceanDNA-b27545.fa | prodigal -d OceanDNA-b27545/cds.fna -a OceanDNA-b27545/protein.faa -g 11 -q > /dev/null
[2023-03-16 11:01:44,276] [INFO] Task succeeded: Prodigal
[2023-03-16 11:01:44,276] [INFO] Task started: HMMsearch
[2023-03-16 11:01:44,277] [INFO] Running command: hmmsearch --tblout OceanDNA-b27545/hmmer_result.tsv -E 1E-50 /var/lib/cwl/stgbcef1bd8-872b-4d8a-81e0-1a7358649d5d/dqc_reference/reference_markers.hmm OceanDNA-b27545/protein.faa > /dev/null
[2023-03-16 11:01:44,463] [INFO] Task succeeded: HMMsearch
[2023-03-16 11:01:44,464] [WARNING] Found 4/6 markers. [/var/lib/cwl/stgfe07a68b-bb0f-4179-95aa-9e6aabdb6926/OceanDNA-b27545.fa]
[2023-03-16 11:01:44,486] [INFO] Query marker FASTA was written to OceanDNA-b27545/markers.fasta
[2023-03-16 11:01:44,487] [INFO] Task started: Blastn
[2023-03-16 11:01:44,488] [INFO] Running command: blastn -query OceanDNA-b27545/markers.fasta -db /var/lib/cwl/stgbcef1bd8-872b-4d8a-81e0-1a7358649d5d/dqc_reference/reference_markers.fasta -out OceanDNA-b27545/blast.markers.tsv -outfmt 6 -max_hsps 1 -num_alignments 5
[2023-03-16 11:01:45,075] [INFO] Task succeeded: Blastn
[2023-03-16 11:01:45,077] [INFO] Selected 30 target genomes.
[2023-03-16 11:01:45,077] [INFO] Target genome list was writen to OceanDNA-b27545/target_genomes.txt
[2023-03-16 11:01:45,110] [INFO] Task started: fastANI
[2023-03-16 11:01:45,111] [INFO] Running command: fastANI --query /var/lib/cwl/stgfe07a68b-bb0f-4179-95aa-9e6aabdb6926/OceanDNA-b27545.fa --refList OceanDNA-b27545/target_genomes.txt --output OceanDNA-b27545/fastani_result.tsv --threads 1
[2023-03-16 11:02:04,072] [INFO] Task succeeded: fastANI
[2023-03-16 11:02:04,073] [INFO] Loading species specific ANI threshold from /var/lib/cwl/stgbcef1bd8-872b-4d8a-81e0-1a7358649d5d/dqc_reference/prokaryote_ANI_species_specific_threshold.txt
[2023-03-16 11:02:04,073] [WARNING] Species-specific ANI threshold file not found. Will use the default threshold for all species. [/var/lib/cwl/stgbcef1bd8-872b-4d8a-81e0-1a7358649d5d/dqc_reference/prokaryote_ANI_species_specific_threshold.txt]
[2023-03-16 11:02:04,090] [INFO] Found 30 fastANI hits (0 hits with ANI > threshold)
[2023-03-16 11:02:04,090] [INFO] The taxonomy check result is classified as 'below_threshold'.
[2023-03-16 11:02:04,090] [INFO] DFAST Taxonomy check final result
--------------------------------------------------------------------------------
organism_name strain accession taxid species_taxid relation_to_type validated ani matched_fragments total_fragments ani_threshold status
Nioella nitratireducens strain=SSW136 GCA_001879715.1 1287720 1287720 type True 76.5552 104 772 95 below_threshold
Pseudooceanicola antarcticus strain=Ar-45 GCA_002786285.1 1247613 1247613 type True 76.54 108 772 95 below_threshold
Cereibacter sediminicola strain=JA983 GCA_007668225.1 2584941 2584941 type True 76.5152 100 772 95 below_threshold
Qingshengfaniella alkalisoli strain=LN3S51 GCA_007855645.1 2599296 2599296 type True 76.5126 66 772 95 below_threshold
Alexandriicola marinus strain=LZ-14 GCA_004000435.1 2081710 2081710 type True 76.48 124 772 95 below_threshold
Brevirhabdus pacifica strain=DSM 27767 GCA_002797755.1 1267768 1267768 type True 76.4525 103 772 95 below_threshold
Brevirhabdus pacifica strain=22DY15 GCA_002094875.1 1267768 1267768 type True 76.4362 95 772 95 below_threshold
Roseovarius atlanticus strain=R12B GCA_001441615.1 1641875 1641875 type True 76.4117 106 772 95 below_threshold
Rhodovulum sulfidophilum strain=DSM 1374 GCA_001633165.1 35806 35806 type True 76.3942 117 772 95 below_threshold
Rhodovulum sulfidophilum strain=DSM 1374 GCA_000520135.2 35806 35806 type True 76.3933 117 772 95 below_threshold
Pseudopuniceibacterium sediminis strain=CY03 GCA_003575965.1 2211117 2211117 type True 76.383 85 772 95 below_threshold
Rhabdonatronobacter sediminivivens strain=IM2376 GCA_013415485.1 2743469 2743469 type True 76.2977 80 772 95 below_threshold
Pseudooceanicola algae strain=Lw-13e GCA_003590145.2 1537215 1537215 type True 76.2943 107 772 95 below_threshold
Phaeobacter italicus strain=DSM 26436 GCA_900113345.1 481446 481446 type True 76.227 98 772 95 below_threshold
Pseudogemmobacter bohemicus strain=Cd-10 GCA_003290025.1 2250708 2250708 type True 76.2237 61 772 95 below_threshold
Oceaniglobus trochenteri strain=G4 GCA_020529025.1 2763260 2763260 type True 76.2142 122 772 95 below_threshold
Phaeobacter italicus strain=CECT 7645 GCA_001404195.1 481446 481446 type True 76.2014 98 772 95 below_threshold
Phaeobacter italicus strain=CECT 7645 GCA_001258055.1 481446 481446 type True 76.2014 98 772 95 below_threshold
Pseudosulfitobacter pseudonitzschiae strain=H3 GCA_000712315.1 1402135 1402135 type True 76.1848 94 772 95 below_threshold
Pseudosulfitobacter pseudonitzschiae strain=DSM 26824 GCA_900129395.1 1402135 1402135 type True 76.1659 95 772 95 below_threshold
Mangrovicoccus algicola strain=HB182678 GCA_014903745.1 2771008 2771008 type True 76.1389 92 772 95 below_threshold
Sulfitobacter sabulilitoris strain=HSMS-29 GCA_005887615.1 2562655 2562655 type True 76.1258 99 772 95 below_threshold
Pseudooceanicola endophyticus strain=CBS1P-1 GCA_018760365.1 2841273 2841273 type True 76.1176 109 772 95 below_threshold
Pacificitalea manganoxidans strain=DY25 GCA_002504165.1 1411902 1411902 type True 76.104 96 772 95 below_threshold
Ruegeria sediminis strain=CAU 1488 GCA_005938215.1 2583820 2583820 type True 76.0942 119 772 95 below_threshold
Pararhodobacter aggregans strain=D1-19 GCA_003075525.1 404875 404875 type True 76.0615 86 772 95 below_threshold
Pararhodobacter aggregans strain=DSM 18938 GCA_003054005.1 404875 404875 type True 76.0608 85 772 95 below_threshold
Rubellimicrobium roseum strain=YIM 48858 GCA_006152145.1 687525 687525 type True 75.9904 80 772 95 below_threshold
Gemmobacter fulva strain=con5 GCA_018798885.1 2840474 2840474 type True 75.8608 105 772 95 below_threshold
Rubellimicrobium mesophilum strain=DSM 19309 GCA_000600335.2 1123067 1123067 type True 75.7065 76 772 95 below_threshold
--------------------------------------------------------------------------------
[2023-03-16 11:02:04,090] [INFO] DFAST Taxonomy check result was written to OceanDNA-b27545/tc_result.tsv
[2023-03-16 11:02:04,091] [INFO] ===== Taxonomy check completed =====
[2023-03-16 11:02:04,091] [INFO] ===== Start completeness check using CheckM =====
[2023-03-16 11:02:04,091] [INFO] Setting CHECKM_DATA_PATH to /var/lib/cwl/stgbcef1bd8-872b-4d8a-81e0-1a7358649d5d/dqc_reference/checkm_data
[2023-03-16 11:02:04,092] [INFO] Selected 'Prokaryote' markers (life, taxid=0) for CheckM
[2023-03-16 11:02:04,097] [INFO] Task started: CheckM
[2023-03-16 11:02:04,097] [INFO] Running command: checkm taxonomy_wf --tab_table -f OceanDNA-b27545/cc_result.tsv -t 1 life "Prokaryote" OceanDNA-b27545/checkm_input OceanDNA-b27545/checkm_result
[2023-03-16 11:02:45,667] [INFO] Task succeeded: CheckM
[2023-03-16 11:02:45,668] [INFO] Completeness check finished.
--------------------------------------------------------------------------------
Completeness: 60.37%
Contamintation: 0.00%
Strain heterogeneity: 0.00%
--------------------------------------------------------------------------------
[2023-03-16 11:02:45,673] [INFO] ===== Completeness check finished =====
[2023-03-16 11:02:45,674] [INFO] ===== Start GTDB Search =====
[2023-03-16 11:02:45,674] [INFO] Query marker FASTA already exists. Will reuse it. (OceanDNA-b27545/markers.fasta)
[2023-03-16 11:02:45,675] [INFO] Task started: Blastn
[2023-03-16 11:02:45,675] [INFO] Running command: blastn -query OceanDNA-b27545/markers.fasta -db /var/lib/cwl/stgbcef1bd8-872b-4d8a-81e0-1a7358649d5d/dqc_reference/reference_markers_gtdb.fasta -out OceanDNA-b27545/blast.markers.gtdb.tsv -outfmt 6 -max_hsps 1 -num_alignments 5
[2023-03-16 11:02:46,551] [INFO] Task succeeded: Blastn
[2023-03-16 11:02:46,554] [INFO] Selected 29 target genomes.
[2023-03-16 11:02:46,554] [INFO] Target genome list was writen to OceanDNA-b27545/target_genomes_gtdb.txt
[2023-03-16 11:02:46,590] [INFO] Task started: fastANI
[2023-03-16 11:02:46,590] [INFO] Running command: fastANI --query /var/lib/cwl/stgfe07a68b-bb0f-4179-95aa-9e6aabdb6926/OceanDNA-b27545.fa --refList OceanDNA-b27545/target_genomes_gtdb.txt --output OceanDNA-b27545/fastani_result_gtdb.tsv --threads 1
[2023-03-16 11:03:10,362] [INFO] Task succeeded: fastANI
[2023-03-16 11:03:10,378] [INFO] Found 29 fastANI hits (0 hits with ANI > circumscription radius)
[2023-03-16 11:03:10,379] [INFO] GTDB search result
--------------------------------------------------------------------------------
accession gtdb_species ani matched_fragments total_fragments gtdb_taxonomy ani_circumscription_radius mean_intra_species_ani min_intra_species_ani mean_intra_species_af min_intra_species_af num_clustered_genomes status
GCA_017792445.1 s__Rhodobacter_B sp017792445 76.8459 149 772 d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhodobacterales;f__Rhodobacteraceae;g__Rhodobacter_B 95.0 N/A N/A N/A N/A 1 -
GCF_016412875.1 s__Palleronia pontilimi 76.603 138 772 d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhodobacterales;f__Rhodobacteraceae;g__Palleronia 95.0 N/A N/A N/A N/A 1 -
GCF_001879715.1 s__Nioella nitratireducens 76.5552 104 772 d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhodobacterales;f__Rhodobacteraceae;g__Nioella 95.0 N/A N/A N/A N/A 1 -
GCF_002869745.1 s__Oceaniglobus roseus 76.518 126 772 d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhodobacterales;f__Rhodobacteraceae;g__Oceaniglobus 95.0 N/A N/A N/A N/A 1 -
GCF_007855645.1 s__Qingshengfaniella alkalisoli 76.5126 66 772 d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhodobacterales;f__Rhodobacteraceae;g__Qingshengfaniella 95.0 N/A N/A N/A N/A 1 -
GCF_002786325.1 s__Pseudooceanicola_C lipolyticus 76.506 104 772 d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhodobacterales;f__Rhodobacteraceae;g__Pseudooceanicola_C 95.0 N/A N/A N/A N/A 1 -
GCA_006227805.1 s__Aestuariivita sp006227805 76.4841 90 772 d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhodobacterales;f__Rhodobacteraceae;g__Aestuariivita 95.0 N/A N/A N/A N/A 1 -
GCF_003335585.1 s__Sulfitobacter sp003335585 76.4819 111 772 d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhodobacterales;f__Rhodobacteraceae;g__Sulfitobacter 95.0 N/A N/A N/A N/A 1 -
GCA_013002465.1 s__JABDJO01 sp013002465 76.4758 113 772 d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhodobacterales;f__Rhodobacteraceae;g__JABDJO01 95.0 N/A N/A N/A N/A 1 -
GCF_018224955.1 s__Aestuariicoccus sp018224955 76.4362 107 772 d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhodobacterales;f__Rhodobacteraceae;g__Aestuariicoccus 95.0 N/A N/A N/A N/A 1 -
GCF_001441615.1 s__Roseovarius atlanticus 76.4117 106 772 d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhodobacterales;f__Rhodobacteraceae;g__Roseovarius 95.0 N/A N/A N/A N/A 1 -
GCF_001633165.1 s__Rhodovulum sulfidophilum 76.3942 117 772 d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhodobacterales;f__Rhodobacteraceae;g__Rhodovulum 95.0 97.78 97.16 0.92 0.88 14 -
GCF_003111685.1 s__Salipiger pacificus_A 76.3794 108 772 d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhodobacterales;f__Rhodobacteraceae;g__Salipiger 95.0 97.69 97.64 0.90 0.89 3 -
GCF_004799285.1 s__Gemmobacter_D aestuarii 76.3606 119 772 d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhodobacterales;f__Rhodobacteraceae;g__Gemmobacter_D 95.0 N/A N/A N/A N/A 1 -
GCF_011806405.1 s__Pseudooceanicola sp011806405 76.3551 92 772 d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhodobacterales;f__Rhodobacteraceae;g__Pseudooceanicola 95.0 N/A N/A N/A N/A 1 -
GCF_009835145.1 s__Profundibacterium mesophilum 76.3141 95 772 d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhodobacterales;f__Rhodobacteraceae;g__Profundibacterium 95.0 N/A N/A N/A N/A 1 -
GCF_003590145.2 s__Pseudooceanicola algae 76.3116 106 772 d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhodobacterales;f__Rhodobacteraceae;g__Pseudooceanicola 95.0 N/A N/A N/A N/A 1 -
GCF_002814095.1 s__Sagittula sp002814095 76.2526 116 772 d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhodobacterales;f__Rhodobacteraceae;g__Sagittula 95.0 N/A N/A N/A N/A 1 -
GCF_001258055.1 s__Phaeobacter italicus 76.2014 98 772 d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhodobacterales;f__Rhodobacteraceae;g__Phaeobacter 95.0 99.00 98.06 0.96 0.92 7 -
GCA_009992615.1 s__Oceaniglobus sp009992615 76.1822 60 772 d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhodobacterales;f__Rhodobacteraceae;g__Oceaniglobus 95.0 N/A N/A N/A N/A 1 -
GCF_900129395.1 s__Ascidiaceihabitans pseudonitzschiae 76.1659 95 772 d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhodobacterales;f__Rhodobacteraceae;g__Ascidiaceihabitans 95.0 99.99 99.97 0.98 0.96 4 -
GCF_009827095.1 s__Pseudooceanicola sp009827095 76.1466 111 772 d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhodobacterales;f__Rhodobacteraceae;g__Pseudooceanicola 95.0 99.99 99.99 0.98 0.98 2 -
GCF_900156535.1 s__Roseovarius nanhaiticus 76.1217 87 772 d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhodobacterales;f__Rhodobacteraceae;g__Roseovarius 95.0 99.99 99.99 1.00 1.00 2 -
GCA_001730095.1 s__Leisingera sp001730095 76.1047 87 772 d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhodobacterales;f__Rhodobacteraceae;g__Leisingera 95.0 N/A N/A N/A N/A 1 -
GCF_003075525.1 s__Pararhodobacter aggregans 76.0615 86 772 d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhodobacterales;f__Rhodobacteraceae;g__Pararhodobacter 95.0 100.00 100.00 1.00 1.00 2 -
GCF_006152145.1 s__Rubellimicrobium roseum 75.9904 80 772 d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhodobacterales;f__Rhodobacteraceae;g__Rubellimicrobium 95.0 N/A N/A N/A N/A 1 -
GCF_017916275.1 s__Rubellimicrobium sp017916275 75.8927 72 772 d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhodobacterales;f__Rhodobacteraceae;g__Rubellimicrobium 95.0 100.00 100.00 1.00 1.00 2 -
GCA_002783205.1 s__Roseovarius sp002783205 75.7445 71 772 d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhodobacterales;f__Rhodobacteraceae;g__Roseovarius 95.0 99.90 99.90 0.93 0.93 2 -
GCF_000600335.2 s__Rubellimicrobium mesophilum 75.7065 76 772 d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhodobacterales;f__Rhodobacteraceae;g__Rubellimicrobium 95.0 N/A N/A N/A N/A 1 -
--------------------------------------------------------------------------------
[2023-03-16 11:03:10,380] [INFO] GTDB search result was written to OceanDNA-b27545/result_gtdb.tsv
[2023-03-16 11:03:10,380] [INFO] ===== GTDB Search completed =====
[2023-03-16 11:03:10,383] [INFO] DFAST_QC result json was written to OceanDNA-b27545/dqc_result.json
[2023-03-16 11:03:10,383] [INFO] DFAST_QC completed!
[2023-03-16 11:03:10,383] [INFO] Total running time: 0h1m53s