Academic Integrity: tutoring, explanations, and feedback — we don’t complete graded work or submit on a student’s behalf.

Objective: The objective of this homework is familiarize you with the performanc

ID: 3551947 • Letter: O

Question

Objective: The objective of this homework is familiarize you with the performance characteristics of Quicksort under normal and worst case conditions. The assignment will require some programming and interpretation of the results.

Assignment: Using the pseudocode provided in the Cormen text for Quicksort, implement a working program that uses Quicksort to sort the elements in an array. In order to make meaningful comparisons, your array should be fairly large. (Try n=10,000 and see if you observe measurable changes in performance. You might also try a larger array if the time measurements do not vary significantly.) Do not use the library function, as it may include features that will prevent you from observing the differences between expected and worst case performance.

I suggest that your program be implemented in c, or c++, so that you can use debugging tools, such as gdb and ddd, if necessary.

Part I

Generate a main program which initializes your data array to be sorted

Explanation / Answer

Analysis of Quick sort



Worst-case


Let T(n) be the worst-case time for QUICK SORT on input size n. We have a recurrence


T(n) = max1?q?n-1 (T(q) + T(n-q)) + (n) --------- 1


where q runs from 1 to n-1, since the partition produces two regions, each having size at least 1.


Now we guess that T(n) ? cn2 for some constant c.


Substituting our guess in equation 1.We get



T(n) = max1?q?n-1 (cq2 ) + c(n - q2)) + (n)

= c max (q2 + (n - q)2) + (n)


Since the second derivative of expression q2 + (n-q)2 with respect to q is positive. Therefore, expression achieves a maximum over the range 1? q ? n -1 at one of the endpoints. This gives the bound max (q2 + (n - q)2)) 1 + (n -1)2 = n2 + 2(n -1).


Continuing with our bounding of T(n) we get


T(n) ? c [n2 - 2(n-1)] + (n)

= cn2 - 2c(n-1) + (n)


Since we can pick the constant so that the 2c(n -1) term dominates the (n) term we have


T(n) ? cn2


Thus the worst-case running time of quick sort is (n2).



Average-case Analysis


If the split induced by RANDOMIZED_PARTITION puts constant fraction of elements on one side of the partition, then the recurrence tree has depth (lgn) and (n) work is performed at (lg n) of these level. This is an intuitive argument why the average-case running time of RANDOMIZED_QUICKSORT is (n lg n).


Let T(n) denotes the average time required to sort an array of n elements. A call to RANDOMIZED_QUICKSORT with a 1 element array takes a constant time, so we have T(1) = (1).


After the split RANDOMIZED_QUICKSORT calls itself to sort two subarrays. The average time to sort an array A[1 . . q] is T[q] and the average time to sort an array A[q+1 . . n] is T[n-q]. We have


T(n) = 1/n (T(1) + T(n-1) + n-1?q=1 T(q) + T(n-q))) + (n) ----- 1


We know from worst-case analysis


T(1) = (1) and T(n -1) = O(n2)

T(n) = 1/n ((1) + O(n2)) + 1/n n-1?q=1 (r(q) + T(n - q)) + (n)

= 1/n n-1?q=1(T(q) + T(n - q)) + (n) ------- 2

= 1/n[2 n-1?k=1(T(k)] + (n)

= 2/n n-1?k=1(T(k) + (n) --------- 3


Solve the above recurrence using substitution method. Assume inductively that T(n) ? anlgn + b for some constants a > 0 and b > 0.


If we can pick 'a' and 'b' large enough so that n lg n + b > T(1). Then for n > 1, we have


T(n) ? n-1?k=1 2/n (aklgk + b) + (n)

= 2a/n n-1?k=1 klgk - 1/8(n2) + 2b/n (n -1) + (n) ------- 4


At this point we are claiming that


n-1?k=1 klgk ? 1/2 n2 lgn - 1/8(n2)


Stick this claim in the equation 4 above and we get


T(n) ? 2a/n [1/2 n2 lgn - 1/8(n2)] + 2/n b(n -1) + (n)

? anlgn - an/4 + 2b + (n) ---------- 5


In the above equation, we see that (n) + b and an/4 are polynomials and we certainly can choose 'a' large enough so that an/4 dominates (n) + b.


We conclude that QUICKSORT's average running time is (n lg(n)).




Conclusion


Quick sort is an in place sorting algorithm whose worst-case running time is (n2) and expected running time is (n lg n) where constants hidden in (n lg n) are small.