Summary: | Among other motif finding algorithms Planted Motif Search (PMS) or (ℓ, d) motif search algorithms are considered as next-generation motif finding algorithms. Here, n strings and two integer ℓ and d are provided as input in PMS. It gives outputs of all sequences of length M in each input string, where each occurrence varies from Min at most d points. This paper proposes a new algorithm, for the (ℓ, d) motif discovery problem in which we find all strings of length ℓ that seems in every string of a provided set of strings with at most d mismatches has been. After proper investigation of the PMS problem, it has been shown that this is an NP-hard exact algorithm. Usually, in the worst-case scenarios, all the known exact algorithms for PMS need exponential time in some of the underlying parameters. In this paper, we proposed a faster approach that reduces the searching time in the sample-driven part of the algorithm. In particular, we used dynamic programming techniques to eliminate the recalculation of the values of some common subtree and introduced some new speedup techniques like using a linked list, reduced ℓ-mers for making the algorithm faster. Due to the use of Linked List as a main speedup technique, we named our proposed algorithm as Linked List Implementation technique of PMS8 (LL-PMS8).
|