5.15 Suppose we want to find the first occurrence of a string P1P2 ··· Pk in a l
ID: 3875430 • Letter: 5
Question
5.15 Suppose we want to find the first occurrence of a string P1P2 ··· Pk in a long input string A1A2 ··· AN. We can solve this problem by hashing the pattern string, obtaining a hash value HP, and comparing this value with the hash value formed from A1A2 ··· Ak, A2A3 ··· Ak+1, A3A4 ··· Ak+2, and so on until ANk+1ANk+2 ··· AN. If we have a match of hash values, we compare the strings character by character to verify the match. We return the position (in A) if the strings actually do match, and we continue in the unlikely event that the match is false.
*a. Show that if the hash value of AiAi+1 ··· Ai+k1 is known, then the hash value of Ai+1Ai+2 ··· Ai+k can be computed in constant time.
b. Show that the running time is O(k + N) plus the time spent refuting false matches.
*c. Show that the expected number of false matches is negligible.
d. Write a program to implement this algorithm.
**e. Describe an algorithm that runs in O(k + N) worst-case time.
**f. Describe an algorithm that runs in O(N/k) average time.
Data Structures and Algorithm Analysis in Java (3rd Edition) Chapter 5, 15E - Missing Textbook Solution
Explanation / Answer
I'm Solved A and B
We will take the ascii values of letter in computation of hash value of a string and we choose a prime number .
our hase function will be like
h = ((Ascii(A[i..i+k-1])) mod p where p is a prime number
so since summation function requires k addition so computation of h will take O(k) time.
Note . A[i] denote the ith character and A[i..i+k-1] denotes the character sequence from A[i] to A[i+k-1] of length k.
A).
Suppose we are given the hash value h of A[i]..A[i+k-1] .
then we can compute the hash h' of A[i+1] .. A[i+k]
h' = h - (Ascii(A[i]) mod p ) + (Ascii(A[i+k]) mod p)
which is similar to
h' = ((Ascii(A[i..i+k-1] mod p )) - (Ascii(A[i]) mod p ) + (Ascii(A[i+k]) mod p)
which is similar to
h' = ((Ascii(A[i+1..i+k] mod p ))
Since we are performing one subtraction and one addition ,this takes constant time.
B)
create an array H of length N-k+1 to store hash value and k is given
1 ) for i = 1 to N-k+1
2 ) compute hash value of A[i..i+k-1] and store in H[i].
Now given S and its hash value h we have to find the index where it matched and refute if match is false.
Match(h,S,A,k)
1 ) found = false;
2 ) for i = 1 to N-k+1 ; do
if ( h == H[i] ) ; then // if hash value is equal
found = stringcmp(S,A[i..i+k-1]);
/*compare two string the string S whose h is given and and A[i..i+k-1]. if the strings are
equal the then return index return true otherwise return false. */
if ( found == true )
return i;
done
3 ) /* if we reach here the we haven;t found the true occurence of string */
return -1
bool stringcmp ( S,T ) {
for i = 1 to k {
if ( S[i] != T[i] ) {
return false;
}
}
return true; // if we reach here we havn't found a mismatch.
}
Related Questions
drjack9650@gmail.com
Navigate
Integrity-first tutoring: explanations and feedback only — we do not complete graded work. Learn more.