Some of your friends have gotten into the burgeoning field of time-series data m
Question
Some of your friends have gotten into the burgeoning field of time-series data mining, in which one looks for patterns in sequences of events that occur over time. Purchases at stock exchanges (what’s being bought) are one source of data with a natural ordering in time. Given a long sequence S of such events, your friends want an efficient way to detect certain “patterns” in them. For example, they may want to know if the four events buy Yahoo, buy eBay, buy Yahoo, buy Oracle occur in this sequence S, in order but not necessarily consecutively.
They begin with a collection of possible events (e.g., the possible transactions) and a sequence S of n of these events. A given event may occur multiple times in S (e.g., Yahoo stock may be bought many times in a single sequence S). We will say that a sequence S' is a subsequence of S if there is a way to delete certain of the events from S so that the remaining events, in order, are equal to the sequence S'. So, for example, the sequence of four events above is a subsequence of the sequence buy Amazon, buy Yahoo, buy eBay, buy Yahoo, buy Yahoo, buy Oracle.
Their goal is to be able to dream up short sequences and quickly detect whether they are subsequences of S. So this is the problem they pose to you: give an algorithm with pseudocode that takes two sequences of events, S' of length m and S of length n, each possibly containing an event more than once, and decides in time O(m + n) whether S' is a subsequence of S.
Please also analyse the running time of the algorithm and prove its correctness.
Explanation / Answer
Treat each event as a single character (for example, represent "buy Yahoo" by the single character e).
Here is the algorithm:
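def isSubSequence(string1, string2, m, n):
    # string1 is S' (length m), string2 is S (length n)
    # Base cases: an empty S' always matches; a non-empty S' cannot match an empty S
    if m == 0: return True
    if n == 0: return False
    # If the last events of the two sequences match, consume both
    if string1[m-1] == string2[n-1]:
        return isSubSequence(string1, string2, m-1, n-1)
    # If the last events do not match, discard the last event of S
    return isSubSequence(string1, string2, m, n-1)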
Note that string1 (S', of length m) and string2 (S, of length n) do not have to be strings: each can be a list in which every element is an event.
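Running time: each call does a constant amount of work and reduces n by one, so there are at most n + 1 calls in total. Each event of S is examined at most once, and the running time is O(n), which is within the required O(m + n).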
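The recursive function above can nest up to n calls deep, which may exceed Python's default recursion limit when S is long. As a minimal sketch of the same greedy idea written as a loop, here is a version that scans S from the front instead of from the back. The name is_subsequence and the variable names are illustrative assumptions, not part of the original answer.

```python
def is_subsequence(pattern, events):
    """Return True if pattern (S') is a subsequence of events (S).

    Hypothetical helper: a forward-scanning, iterative variant of the
    recursive algorithm. Works on strings or on lists of events.
    """
    i = 0  # index of the next event of pattern that still needs a match
    for event in events:
        # Greedily match the earliest possible occurrence in events
        if i < len(pattern) and pattern[i] == event:
            i += 1
    # Every event of pattern was matched, in order
    return i == len(pattern)


# Example from the question
S = ["buy Amazon", "buy Yahoo", "buy eBay", "buy Yahoo", "buy Yahoo", "buy Oracle"]
S_prime = ["buy Yahoo", "buy eBay", "buy Yahoo", "buy Oracle"]
found = is_subsequence(S_prime, S)
```

The loop visits each event of S once and advances i at most m times, so the work is O(m + n) with O(1) extra space.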
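Correctness: the algorithm matches S' against S greedily from the end, and we argue by induction on n. If S' is empty it is trivially a subsequence, and if S is empty but S' is not, it cannot be one. Otherwise, compare the last events. If they differ, the last event of S cannot be matched to anything in S' (only the last event of S' could be matched to it, since matches must stay in order), so deleting it from S does not change the answer. If they are equal, suppose S' is a subsequence of S through some matching. The last event of S' is matched to some position in S; we can move that match to the last position of S without disturbing the order of the other matches, because they all lie strictly before it. So S' is a subsequence of S exactly when S' without its last event is a subsequence of S without its last event. In both cases the recursive call answers the same question on a shorter S, so by induction the algorithm returns True exactly when S' is a subsequence of S.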
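To see which events are kept and which are deleted, the following sketch records the positions in S that the from-the-end matching picks. The function name match_from_end is a hypothetical helper introduced for illustration; it follows the same last-event comparison as the answer's algorithm but uses a loop.

```python
def match_from_end(pattern, events):
    """Return the positions in events (S) matched to pattern (S'),
    or None if pattern is not a subsequence of events.

    Hypothetical helper for illustration only.
    """
    positions = []
    j = len(pattern) - 1  # last event of pattern not yet matched
    # Walk S from its last event to its first
    for k in range(len(events) - 1, -1, -1):
        if j >= 0 and events[k] == pattern[j]:
            positions.append(k)  # keep this event of S
            j -= 1
        # otherwise events[k] is one of the deleted events
    if j >= 0:
        return None  # some event of pattern was never matched
    positions.reverse()  # report positions in increasing order
    return positions


S = ["buy Amazon", "buy Yahoo", "buy eBay", "buy Yahoo", "buy Yahoo", "buy Oracle"]
S_prime = ["buy Yahoo", "buy eBay", "buy Yahoo", "buy Oracle"]
kept = match_from_end(S_prime, S)  # [1, 2, 4, 5]
```

On the example from the question, the kept positions are 1, 2, 4 and 5 (0-indexed), so the deleted events are buy Amazon at position 0 and the buy Yahoo at position 3.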