You are given n boxes that can not be opened. Each box contains a secret code. I
ID: 3752747 • Letter: Y
Question
You are given n boxes that can not be opened. Each box contains a secret code. It is possible that two different boxes contain the same secret code. You have a scanner with the following capability: When we place two boxes in the scanner, it tells if the boxes contain the same secret code or not. Each time you do the scanning it costs you a dollar. The goal is to determine if there is a secret message that appears in at least n/2 +1 boxes. Write an algorithm to solve this problem. The cost of the algorithm is the number of times you uses the scanner. You must use divide and conquer to solve this problem. State the recurrence relation for the cost and solve the recurrence relation.Explanation / Answer
You are given n boxes that can not be opened. Each box contains a secret code. It is
possible that two different boxes contain the same secret code. You have a scanner with the
following capability: When we place two boxes in the scanner, it tells if the boxes contain
the same secret code or not. Each time you do the scanning it costs you a dollar. The goal
is to determine if there is a secret message that appears in at least n=2 + 1 boxes. Write an
algorithm to solve this problem. The cost of the algorithm is the number of times you uses
the scanner. You must use divide and conquer to solve this problem. State the recurrence
relation for the cost and solve the recurrence relation.
The problem is similar to identifying a majority element in a array.
Before coming on to the solution ,let us proof a recurrence relation:-
Say we have a recurrence relation as:-
T(n)=2T(n/2)+O(n)
We will prove that above equation evaluates to O(nlogn)
Solution:-
T(n) = 2 T(n/2) + n
= 2 [2 T(n/4) + n/2] + n
= 4 T(n/4) + 2n
= 4 [2 T(n/8) + n/4] + 2n
= 8 T(n/8) + 3n
= 16 T(n/16) + 4n
= 2k T(n/2k) + k n [this is the Eureka! line]
Note that the last line is derived by seeing a pattern --- this is the Eureka/leap of faith/practice with generalizing mathematical patterns part of the problem.
We know that T(1) = 1 and this is a way to end the derivation above. In particular we want T(1) to appear on the right hand side of the = sign. This means we want:
n/2k = 1 OR n = 2k OR log2 n = k
Continuing with the previous derivation we get the following since k = log2 n:
= 2k T(n/2k) + k n
= pow(2,log2n)T(1) + (log2n) n
= n + n log2 n [remember that T(1) = 1]
= O(n log n)
Hence proved.
Algorithm goes like this:-
Let the boxes with secret codes are all given in array A, and we need to identify a secret code or number which appears atleast n/2+1 times. We will be giving this solution in 2 parts:-
1) So for this We split the array A into 2 subarrays A1 and A2 of half the size. We choose the majority element of A1 and A2. After that we do a linear time equality operation to decide whether it is possible to find a majority element
The recurrence relation comes to
T(n)=2T(n/2)+O(n)
which on solving comes to O(nLogn) (proved above)
Pseudocode:-
procedure GetMajorityElement(a[1...n])
Input: Array a of objects
Output: Majority element of a
if n = 1: return a[1]
k = floor(n/2)
elemlsub = GetMajorityElement(a[1...k])
elemrsub = GetMajorityElement(a[k+1...n]
if elemlsub = elemrsub:
return elemlsub
lcount = GetFrequency(a[1...n],elemlsub)
rcount = GetFrequency(a[1...n],elemrsub)
if lcount > k+1:
return elemlsub
else if rcount > k+1:
return elemrsub
else return NO-MAJORITY-ELEMENT
The above method GetFrequency computes the number of times an element (elemlsub or elemrsub) appears in the given array a[1....n].Two calls to GetFrequency is O(n). After that comparisons are done to validate the existence of majority element. GetFrequency is the linear time equality operation.
2) Using the proposed divide-and-conquer operation, indeed it is possible to give a linear time algorithm.Idea is to pair up the elements arbitrarily to get n/2 pairs. In each pair if the two elements are different we discard both of them. If they are same only one of them is kept.
Pseudocode:-
procedure GetMajorityElementLinear(a[1...n])
Input: Array a of objects
Output: Majority element of a
if n = 2:
if a[1] = a[2] return a[1]
else return NO-MAJORITY-ELEMENT
Create a temporary array temp
for i = 1 to n:
if a[i] = a[i+1]:
Insert a[i] into temp
i = i+1
return GetMajorityElementLinear(temp)
procedure CheckSanity(a[1...n])
Input: Array a of objects
Output: Majority element of a
m = GetMajorityElementLinear(a[1...n])
freq = GetFrequency(a[1...n],m)
if freq > floor(n/2)+1:
return m
else return NO-MAJORITY-ELEMENT
The recurrence of the above algorithms comes as T(n)=T(n/2)+O(n)
as we can see the processing of the array in the recursive array is done in O(n) time.
Hence Complexity of the function GetMajorityElementLinear is O(n). CheckSanity is also O(n). Therefore the whole algorithm is linear in terms of n.
Related Questions
drjack9650@gmail.com
Navigate
Integrity-first tutoring: explanations and feedback only — we do not complete graded work. Learn more.