我想玩一下实现Rabin-Karp算法。我看到了这段伪代码:
RABIN -KARP -MATCHER (T, P, d, q)
1 n = T.length
2 m = P.length
3 h = d^(m-1) mod q
4 p=0
5 t= 0
6 for i = 1 to m
/ preprocessing
/
7 p = (dp + P [i]) mod q
8 t = (dt + T [i]) mod q
9 for s = 0 to n-m
/ matching
/
10 if p == t
11 if P [1... m] == T [s + 1...s + m]
12 print “Pattern occurs with shift” s
13 if s < n-m
14 t = (d(t-T[s + 1]h) + T [s + m + 1]) mod q
我使用Python 2.7实现的代码如下:
def Rabin_Karp_Matcher(text, pattern, d, q):
n = len(text)
m = len(pattern)
h = pow(d,m-1)%q
p = 0
t =0
result = []
for i in range(m): # preprocessing
p = (d*p+ord(pattern[i]))%q
t = (d*t+ord(text[i]))%q
for s in range(n-m):
if p == t: # check character by character
match = True
for i in range(m):
if pattern[i] != text[s+i]:
match = False
break
if match:
result = result + [s]
if s < n-m:
t = (d*(t-ord(text[s+1])*h)+ord(text[s+m]))%q #index out of bounds here
return result
结果(result)是一个列表,其中包含模式(pattern)在文本(text)中的索引。
我的代码无法在3141592653589793中找到26,我怀疑这与伪代码第14行定义的哈希码有关。请有人能帮帮我吗?
我传入了以下参数:
P =“26” T =“3141592653589793” d = 257 q = 11
P和T必须是字符(数组)串。
pow(num, exp, modulo)
函数(不知道从什么时候开始),它比pow(num, exp) % modulo
更快。 - undefined