[2023-03-15 05:51:32,071] [INFO] DFAST_QC pipeline started.
[2023-03-15 05:51:32,071] [INFO] DFAST_QC version: 0.5.7
[2023-03-15 05:51:32,071] [INFO] DQC Reference Directory: /var/lib/cwl/stg7e9962fa-429d-4112-af13-a28fde190581/dqc_reference
[2023-03-15 05:51:33,647] [INFO] ===== Start taxonomy check using ANI =====
[2023-03-15 05:51:33,648] [INFO] Task started: Prodigal
[2023-03-15 05:51:33,648] [INFO] Running command: cat /var/lib/cwl/stg2d66afb6-c0d4-4412-92f8-b495d9bdc629/OceanDNA-b6759.fa | prodigal -d OceanDNA-b6759/cds.fna -a OceanDNA-b6759/protein.faa -g 11 -q > /dev/null
[2023-03-15 05:51:58,395] [INFO] Task succeeded: Prodigal
[2023-03-15 05:51:58,395] [INFO] Task started: HMMsearch
[2023-03-15 05:51:58,395] [INFO] Running command: hmmsearch --tblout OceanDNA-b6759/hmmer_result.tsv -E 1E-50 /var/lib/cwl/stg7e9962fa-429d-4112-af13-a28fde190581/dqc_reference/reference_markers.hmm OceanDNA-b6759/protein.faa > /dev/null
[2023-03-15 05:51:58,619] [INFO] Task succeeded: HMMsearch
[2023-03-15 05:51:58,620] [INFO] Found 6/6 markers.
[2023-03-15 05:51:58,650] [INFO] Query marker FASTA was written to OceanDNA-b6759/markers.fasta
[2023-03-15 05:51:58,651] [INFO] Task started: Blastn
[2023-03-15 05:51:58,652] [INFO] Running command: blastn -query OceanDNA-b6759/markers.fasta -db /var/lib/cwl/stg7e9962fa-429d-4112-af13-a28fde190581/dqc_reference/reference_markers.fasta -out OceanDNA-b6759/blast.markers.tsv -outfmt 6 -max_hsps 1 -num_alignments 5
[2023-03-15 05:51:59,280] [INFO] Task succeeded: Blastn
[2023-03-15 05:51:59,285] [INFO] Selected 22 target genomes.
[2023-03-15 05:51:59,285] [INFO] Target genome list was writen to OceanDNA-b6759/target_genomes.txt
[2023-03-15 05:51:59,299] [INFO] Task started: fastANI
[2023-03-15 05:51:59,299] [INFO] Running command: fastANI --query /var/lib/cwl/stg2d66afb6-c0d4-4412-92f8-b495d9bdc629/OceanDNA-b6759.fa --refList OceanDNA-b6759/target_genomes.txt --output OceanDNA-b6759/fastani_result.tsv --threads 1
[2023-03-15 05:52:15,496] [INFO] Task succeeded: fastANI
[2023-03-15 05:52:15,496] [INFO] Loading species specific ANI threshold from /var/lib/cwl/stg7e9962fa-429d-4112-af13-a28fde190581/dqc_reference/prokaryote_ANI_species_specific_threshold.txt
[2023-03-15 05:52:15,496] [WARNING] Species-specific ANI threshold file not found. Will use the default threshold for all species. [/var/lib/cwl/stg7e9962fa-429d-4112-af13-a28fde190581/dqc_reference/prokaryote_ANI_species_specific_threshold.txt]
[2023-03-15 05:52:15,509] [INFO] Found 22 fastANI hits (0 hits with ANI > threshold)
[2023-03-15 05:52:15,509] [INFO] The taxonomy check result is classified as 'below_threshold'.
[2023-03-15 05:52:15,510] [INFO] DFAST Taxonomy check final result
--------------------------------------------------------------------------------
organism_name	strain	accession	taxid	species_taxid	relation_to_type	validated	ani	matched_fragments	total_fragments	ani_threshold	status
Changchengzhania lutea	strain=SM1355	GCA_006974145.1	2049305	2049305	type	True	79.3971	479	1274	95	below_threshold
Algibacter marinivivus	strain=ZY111	GCA_003143755.1	2100723	2100723	type	True	78.9718	501	1274	95	below_threshold
Arenitalea lutea	strain=P7-3-5	GCA_000283015.1	1178825	1178825	type	True	78.6955	471	1274	95	below_threshold
Arenitalea lutea	strain=CGMCC 1.12213	GCA_900141715.1	1178825	1178825	type	True	78.6935	468	1274	95	below_threshold
Mariniflexile fucanivorans	strain=DSM 18792	GCA_004341235.1	264023	264023	type	True	78.6906	487	1274	95	below_threshold
Algibacter amylolyticus	strain=RU-4-M-4	GCA_008630605.1	1608400	1608400	type	True	78.614	486	1274	95	below_threshold
Algibacter amylolyticus	strain=DSM 29199	GCA_014202225.1	1608400	1608400	type	True	78.6028	489	1274	95	below_threshold
Algibacter amylolyticus	strain=RU-4-M-4	GCA_007559325.1	1608400	1608400	type	True	78.5962	488	1274	95	below_threshold
Flavivirga rizhaonensis	strain=RZ03	GCA_004791695.1	2559571	2559571	type	True	78.5606	463	1274	95	below_threshold
Algibacter pectinivorans	strain=DSM 25730	GCA_900112595.1	870482	870482	type	True	78.5302	467	1274	95	below_threshold
Mariniflexile gromovii	strain=KCTC 12570	GCA_017814435.1	362523	362523	type	True	78.4911	462	1274	95	below_threshold
Algibacter pacificus	strain=H164	GCA_008033385.1	2599389	2599389	type	True	78.4186	399	1274	95	below_threshold
Flavivirga algicola	strain=Y03	GCA_012910715.1	2729136	2729136	type	True	78.2561	425	1274	95	below_threshold
Algibacter alginicilyticus	strain=HZ22	GCA_001310225.1	1736674	1736674	type	True	77.9543	408	1274	95	below_threshold
Confluentibacter sediminis	strain=DSL-48	GCA_003258355.1	2219045	2219045	type	True	77.9242	365	1274	95	below_threshold
Seonamhaeicola maritimus	strain=1505	GCA_008056315.1	2591822	2591822	type	True	77.9087	392	1274	95	below_threshold
Confluentibacter flavum	strain=3B	GCA_002843175.1	1909700	1909700	type	True	77.8917	383	1274	95	below_threshold
Flavivirga eckloniae	strain=ECD14	GCA_002886045.1	1803846	1803846	type	True	77.8332	441	1274	95	below_threshold
Tamlana crocina	strain=HST1-43	GCA_012037625.1	393006	393006	type	True	77.3898	300	1274	95	below_threshold
Pontimicrobium aquaticum	strain=CAU 1491	GCA_005047595.1	2565367	2565367	type	True	77.2989	225	1274	95	below_threshold
Hanstruepera marina	strain=NBU2968	GCA_019880635.1	2873265	2873265	type	True	77.2371	222	1274	95	below_threshold
Hanstruepera flava	strain=NBU2984	GCA_023634025.1	2930218	2930218	type	True	77.1706	235	1274	95	below_threshold
--------------------------------------------------------------------------------
[2023-03-15 05:52:15,510] [INFO] DFAST Taxonomy check result was written to OceanDNA-b6759/tc_result.tsv
[2023-03-15 05:52:15,511] [INFO] ===== Taxonomy check completed =====
[2023-03-15 05:52:15,511] [INFO] ===== Start completeness check using CheckM =====
[2023-03-15 05:52:15,511] [INFO] Setting CHECKM_DATA_PATH to /var/lib/cwl/stg7e9962fa-429d-4112-af13-a28fde190581/dqc_reference/checkm_data
[2023-03-15 05:52:15,512] [INFO] Selected 'Prokaryote' markers (life, taxid=0) for CheckM
[2023-03-15 05:52:15,523] [INFO] Task started: CheckM
[2023-03-15 05:52:15,523] [INFO] Running command: checkm taxonomy_wf --tab_table -f OceanDNA-b6759/cc_result.tsv -t 1 life "Prokaryote" OceanDNA-b6759/checkm_input OceanDNA-b6759/checkm_result
[2023-03-15 05:53:17,366] [INFO] Task succeeded: CheckM
[2023-03-15 05:53:17,367] [INFO] Completeness check finished.
--------------------------------------------------------------------------------
Completeness: 79.17%
Contamintation: 0.00%
Strain heterogeneity: 0.00%
--------------------------------------------------------------------------------
[2023-03-15 05:53:17,377] [INFO] ===== Completeness check finished =====
[2023-03-15 05:53:17,377] [INFO] ===== Start GTDB Search =====
[2023-03-15 05:53:17,377] [INFO] Query marker FASTA already exists. Will reuse it. (OceanDNA-b6759/markers.fasta)
[2023-03-15 05:53:17,379] [INFO] Task started: Blastn
[2023-03-15 05:53:17,379] [INFO] Running command: blastn -query OceanDNA-b6759/markers.fasta -db /var/lib/cwl/stg7e9962fa-429d-4112-af13-a28fde190581/dqc_reference/reference_markers_gtdb.fasta -out OceanDNA-b6759/blast.markers.gtdb.tsv -outfmt 6 -max_hsps 1 -num_alignments 5
[2023-03-15 05:53:18,309] [INFO] Task succeeded: Blastn
[2023-03-15 05:53:18,313] [INFO] Selected 20 target genomes.
[2023-03-15 05:53:18,313] [INFO] Target genome list was writen to OceanDNA-b6759/target_genomes_gtdb.txt
[2023-03-15 05:53:18,328] [INFO] Task started: fastANI
[2023-03-15 05:53:18,328] [INFO] Running command: fastANI --query /var/lib/cwl/stg2d66afb6-c0d4-4412-92f8-b495d9bdc629/OceanDNA-b6759.fa --refList OceanDNA-b6759/target_genomes_gtdb.txt --output OceanDNA-b6759/fastani_result_gtdb.tsv --threads 1
[2023-03-15 05:53:33,127] [INFO] Task succeeded: fastANI
[2023-03-15 05:53:33,138] [INFO] Found 19 fastANI hits (0 hits with ANI > circumscription radius)
[2023-03-15 05:53:33,138] [INFO] GTDB search result
--------------------------------------------------------------------------------
accession	gtdb_species	ani	matched_fragments	total_fragments	gtdb_taxonomy	ani_circumscription_radius	mean_intra_species_ani	min_intra_species_ani	mean_intra_species_af	min_intra_species_af	num_clustered_genomes	status
GCF_900114265.1	s__Flaviramulus basaltis	79.6004	552	1274	d__Bacteria;p__Bacteroidota;c__Bacteroidia;o__Flavobacteriales;f__Flavobacteriaceae;g__Flaviramulus	95.0	N/A	N/A	N/A	N/A	1	-
GCF_006974145.1	s__Changchengzhania lutea	79.3915	480	1274	d__Bacteria;p__Bacteroidota;c__Bacteroidia;o__Flavobacteriales;f__Flavobacteriaceae;g__Changchengzhania	95.0	N/A	N/A	N/A	N/A	1	-
GCF_000789235.1	s__Wocania ichthyoenteri	79.3571	554	1274	d__Bacteria;p__Bacteroidota;c__Bacteroidia;o__Flavobacteriales;f__Flavobacteriaceae;g__Wocania	95.0	N/A	N/A	N/A	N/A	1	-
GCA_905480415.1	s__Algibacter_B sp905480415	79.0534	432	1274	d__Bacteria;p__Bacteroidota;c__Bacteroidia;o__Flavobacteriales;f__Flavobacteriaceae;g__Algibacter_B	95.0	N/A	N/A	N/A	N/A	1	-
GCF_003143755.1	s__Algibacter_B sp003143755	78.976	500	1274	d__Bacteria;p__Bacteroidota;c__Bacteroidia;o__Flavobacteriales;f__Flavobacteriaceae;g__Algibacter_B	95.0	N/A	N/A	N/A	N/A	1	-
GCA_013001275.1	s__Flaviramulus sp013001275	78.9012	405	1274	d__Bacteria;p__Bacteroidota;c__Bacteroidia;o__Flavobacteriales;f__Flavobacteriaceae;g__Flaviramulus	95.0	N/A	N/A	N/A	N/A	1	-
GCF_002741945.1	s__Wocania sp002741945	78.7784	460	1274	d__Bacteria;p__Bacteroidota;c__Bacteroidia;o__Flavobacteriales;f__Flavobacteriaceae;g__Wocania	95.0	N/A	N/A	N/A	N/A	1	-
GCF_900141715.1	s__Arenitalea lutea	78.6877	468	1274	d__Bacteria;p__Bacteroidota;c__Bacteroidia;o__Flavobacteriales;f__Flavobacteriaceae;g__Arenitalea	95.0	100.00	100.00	1.00	1.00	2	-
GCF_004341235.1	s__Mariniflexile fucanivorans	78.6675	489	1274	d__Bacteria;p__Bacteroidota;c__Bacteroidia;o__Flavobacteriales;f__Flavobacteriaceae;g__Mariniflexile	95.0	N/A	N/A	N/A	N/A	1	-
GCF_001747085.1	s__Algibacter_C aquaticus	78.6072	436	1274	d__Bacteria;p__Bacteroidota;c__Bacteroidia;o__Flavobacteriales;f__Flavobacteriaceae;g__Algibacter_C	95.0	N/A	N/A	N/A	N/A	1	-
GCF_014202225.1	s__Algibacter_B amylolyticus	78.595	490	1274	d__Bacteria;p__Bacteroidota;c__Bacteroidia;o__Flavobacteriales;f__Flavobacteriaceae;g__Algibacter_B	95.0	100.00	100.00	1.00	1.00	3	-
GCF_017814435.1	s__Mariniflexile gromovii	78.5345	457	1274	d__Bacteria;p__Bacteroidota;c__Bacteroidia;o__Flavobacteriales;f__Flavobacteriaceae;g__Mariniflexile	95.0	N/A	N/A	N/A	N/A	1	-
GCF_008033385.1	s__Algibacter pacificus	78.4146	400	1274	d__Bacteria;p__Bacteroidota;c__Bacteroidia;o__Flavobacteriales;f__Flavobacteriaceae;g__Algibacter	95.0	N/A	N/A	N/A	N/A	1	-
GCF_015355625.1	s__Tamlana_A sp015355625	78.3579	412	1274	d__Bacteria;p__Bacteroidota;c__Bacteroidia;o__Flavobacteriales;f__Flavobacteriaceae;g__Tamlana_A	95.0	N/A	N/A	N/A	N/A	1	-
GCF_009796805.1	s__Algibacter sp009796805	78.3095	442	1274	d__Bacteria;p__Bacteroidota;c__Bacteroidia;o__Flavobacteriales;f__Flavobacteriaceae;g__Algibacter	95.0	N/A	N/A	N/A	N/A	1	-
GCF_018860245.1	s__Tamlana_A agarivorans_A	78.1818	361	1274	d__Bacteria;p__Bacteroidota;c__Bacteroidia;o__Flavobacteriales;f__Flavobacteriaceae;g__Tamlana_A	95.0	N/A	N/A	N/A	N/A	1	-
GCF_002973595.1	s__Jejuia pallidilutea	77.9499	398	1274	d__Bacteria;p__Bacteroidota;c__Bacteroidia;o__Flavobacteriales;f__Flavobacteriaceae;g__Jejuia	95.0	98.03	97.97	0.83	0.83	4	-
GCF_001310225.1	s__Algibacter_A alginicilyticus	77.9484	410	1274	d__Bacteria;p__Bacteroidota;c__Bacteroidia;o__Flavobacteriales;f__Flavobacteriaceae;g__Algibacter_A	95.0	N/A	N/A	N/A	N/A	1	-
GCF_008056315.1	s__Seonamhaeicola maritimus	77.9178	391	1274	d__Bacteria;p__Bacteroidota;c__Bacteroidia;o__Flavobacteriales;f__Flavobacteriaceae;g__Seonamhaeicola	95.0	N/A	N/A	N/A	N/A	1	-
--------------------------------------------------------------------------------
[2023-03-15 05:53:33,141] [INFO] GTDB search result was written to OceanDNA-b6759/result_gtdb.tsv
[2023-03-15 05:53:33,144] [INFO] ===== GTDB Search completed =====
[2023-03-15 05:53:33,148] [INFO] DFAST_QC result json was written to OceanDNA-b6759/dqc_result.json
[2023-03-15 05:53:33,148] [INFO] DFAST_QC completed!
[2023-03-15 05:53:33,148] [INFO] Total running time: 0h2m1s