Some of your friends have gotten into the burgeoning field of time-series data m
Question
Some of your friends have gotten into the burgeoning field of time-series data mining, in which one looks for patterns in sequences of events that occur over time. Purchases at stock exchanges (what's being bought) are one source of data with a natural ordering in time. Given a long sequence S of such events, your friends want an efficient way to detect certain "patterns" in them; for example, they may want to know if the four events
buy Yahoo, buy eBay, buy Yahoo, buy Oracle
occur in this sequence S, in order but not necessarily consecutively.
They begin with a collection of possible events (e.g., the possible transactions) and a sequence S of n of these events. A given event may occur multiple times in S (e.g., Yahoo stock may be bought many times in a single sequence S). We will say that a sequence S' is a subsequence of S if there is a way to delete certain of the events from S so that the remaining events, in order, are equal to the sequence S'. So, for example, the sequence of four events above is a subsequence of the sequence
buy Amazon, buy Yahoo, buy eBay, buy Yahoo, buy Yahoo, buy Oracle
Their goal is to be able to dream up short sequences and quickly detect whether they are subsequences of S. So, this is the problem they pose to you: Give an algorithm that takes two sequences of events (S' of length m and S of length n, each possibly containing an event more than once) and decides in time O(m + n) whether S' is a subsequence of S.
Explanation / Answer
Subsequence
A subsequence is a sequence that can be derived from another sequence by deleting some elements without changing the order of the remaining elements.
More formally, a subsequence of a sequence A = (a_1, a_2, ..., a_n) is any sequence (a_{i_1}, a_{i_2}, ..., a_{i_k}) whose indices satisfy 1 <= i_1 < i_2 < ... < i_k <= n; that is, a selection of elements of A kept in the order in which they appear in A.
Naive Algorithm O(2^n * (n+m))
There are a total of 2^n different subsequences of a sequence S of length n (for each of the n elements we have 2 options: either include it in the subsequence or not).
We can generate all 2^n possible subsequences of S.
Compare each subsequence with S' and check whether it is equal to S'. Generating one subsequence costs O(n) and comparing it with S' costs O(m), which gives the O(2^n * (n+m)) bound.
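The brute-force idea above can be sketched in Python as follows. This is only an illustration, not part of the original answer: the function names `all_subsequences` and `is_subsequence_naive` are made up for this example, and events are assumed to be plain strings. It is exponential in n, so it is useful only for checking small cases.

```python
from itertools import combinations


def all_subsequences(events):
    """Yield every subsequence of `events` as a tuple (2^n of them in total)."""
    for k in range(len(events) + 1):
        # combinations() keeps elements in their original order,
        # so every tuple it yields is a subsequence of `events`.
        for kept in combinations(events, k):
            yield kept


def is_subsequence_naive(pattern, events):
    """Return True if `pattern` equals some subsequence of `events`."""
    target = tuple(pattern)
    return any(candidate == target for candidate in all_subsequences(events))


S = ["buy Amazon", "buy Yahoo", "buy eBay", "buy Yahoo", "buy Yahoo", "buy Oracle"]
S_prime = ["buy Yahoo", "buy eBay", "buy Yahoo", "buy Oracle"]
print(is_subsequence_naive(S_prime, S))
```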
Optimized Algorithm O(n+m)
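The idea of the algorithm is to walk through both sequences simultaneously using two pointers, one into S and one into S'. When the current elements match, advance both pointers; when they do not match, it is always safe to advance only the pointer into S.
This greedy choice is safe because matching each element of S' to its earliest possible occurrence in S never rules out a match for the later elements of S'.
If all the elements of S' have been matched when the scan ends, then S' is a subsequence of S; otherwise it is not. Every step advances the pointer into S, so the loop runs at most n times.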
Algorithm
if length of S' is greater than length of S (m > n):
    return false (a longer sequence cannot be a subsequence of a shorter one)
i = 0 (index of first element of S)
j = 0 (index of first element of S')
while i < n and j < m: (stop once S is exhausted or all of S' is matched)
    if S[i] == S'[j]:
        // match: move both pointers
        i += 1
        j += 1
    else:
        // mismatch: move only the pointer into S
        i += 1
// check whether we have matched every element of S'
if j == m (length of S'):
    return true
else:
    return false
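To see which positions of S the greedy scan actually picks, here is a small Python sketch of the same loop that records the matched indices. The name `match_positions` and its return convention (a list of indices, or `None` when there is no match) are assumptions made for this illustration.

```python
def match_positions(pattern, events):
    """Return the indices of `events` matched greedily to `pattern`, or None."""
    positions = []
    j = 0  # index of the next unmatched element of `pattern`
    for i, event in enumerate(events):
        if j < len(pattern) and event == pattern[j]:
            positions.append(i)  # earliest possible match for pattern[j]
            j += 1
    # Every element of `pattern` must have been matched.
    return positions if j == len(pattern) else None


S = ["buy Amazon", "buy Yahoo", "buy eBay", "buy Yahoo", "buy Yahoo", "buy Oracle"]
S_prime = ["buy Yahoo", "buy eBay", "buy Yahoo", "buy Oracle"]
print(match_positions(S_prime, S))  # [1, 2, 3, 5]
```

On the example from the question, the second "buy Yahoo" of S' is matched to index 3 rather than index 4, which is the earliest-occurrence choice the algorithm relies on.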
Code in Java
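public class Solution {
    // Returns true if s (playing the role of S') is a subsequence of t (playing the role of S).
    public boolean isSubsequence(String s, String t) {
        // The empty sequence is a subsequence of every sequence.
        if (s.length() == 0) return true;
        int indexS = 0, indexT = 0;
        while (indexT < t.length()) {
            if (t.charAt(indexT) == s.charAt(indexS)) {
                // Match: advance the pointer into s and stop once all of s is matched.
                indexS++;
                if (indexS == s.length()) return true;
            }
            // The pointer into t advances on every step, match or not.
            indexT++;
        }
        // t was exhausted before all of s was matched.
        return false;
    }
}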
Code in C#
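// Returns true if s is a subsequence of t.
// Note the argument order: the longer text t comes first here, unlike the Java version.
bool isSubsequence(string t, string s) {
    int a = 0; // index of the next unmatched character of s
    if (s == "")
        return true;  // the empty string is a subsequence of any string
    if (t == "")
        return false; // a non-empty s cannot be found in an empty t
    for (int i = 0; i < t.Length; i++) {
        if (t[i] == s[a])
            a++;
        if (a >= s.Length)
            return true; // every character of s has been matched
    }
    return false;
}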
Code in Python
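The original answer gives no code under this heading. The following is a minimal sketch of one way the same two-pointer scan could be written in Python, using a single iterator over S in place of the explicit index i. The function name `is_subsequence` and the argument order (pattern first) are assumptions made for this example.

```python
def is_subsequence(pattern, events):
    """Return True if `pattern` is a subsequence of `events`, in O(m + n) time."""
    remaining = iter(events)
    # `e in remaining` consumes the iterator up to and including the first
    # occurrence of e, so later elements of `pattern` are only searched for
    # after that position. This is the "advance the pointer into S" step.
    return all(e in remaining for e in pattern)


S = ["buy Amazon", "buy Yahoo", "buy eBay", "buy Yahoo", "buy Yahoo", "buy Oracle"]
S_prime = ["buy Yahoo", "buy eBay", "buy Yahoo", "buy Oracle"]
print(is_subsequence(S_prime, S))  # True
```

Because the iterator is shared across all membership tests, each element of S is examined at most once, which keeps the total work linear.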