Academic Integrity: tutoring, explanations, and feedback — we don’t complete graded work or submit on a student’s behalf.

c programming. standard c gcc compiler. Superstring Assembly Suppose that multip

ID: 3748258 • Letter: C

Question

c programming. standard c gcc compiler.

Superstring Assembly Suppose that multiple copies of a long string are cut up into much smaller the available fragments, is it possible to rebuild the larger string? For example, if the original string was "algorithmsarefunfunfun", and you got given a list of fragments like "refun" and "unfun" "rith" and "1go" and so on, plus lots more small fragments, could you rebuild the original string? Probably you could, right? But what about if the original string was millions of characters long, like War and Peace? Or billions of characters long, like your genetic sequence? pieces. If you are given Let's ask a slightly different question now: if you got given a set of small string fragments each from a larger string that you didn't know anything about, could you at least find the shortest string in which every supplied fragment occurs at least once? This problem is known as the shortest superstring problem, and can be thought of as having parameters n, the number of string fragments provided, and m, the total length of those string fragments. The shortest superstring problem is very important in bioinformatics, where genome reassembly from short fragments call understanding of the behavior of organisms. ed reads helps contribute to our Unfortunately, finding optimal solutions to the shortest superstring problem has been shown to be very challenging, and falls into a family of intractable problems that we'll talk about later in the semester. In this project you will implement some approximation techniques that generate non-optimal superstrings from a collection of string fragments. Input Data Input to your program will (always) come from stdin, and will (always) consist of strictly lower-case alphabetic strings, one per input line, each line terminated by a single newline 'n' character. Be sure to read the message on the FAQ page about newlines. For example, the following file test0.txt containing six fragments is a valid input: accgtcgatg gcctag gtacctacta cgatgcc tcgatgccgca atgagaccgtc As you develop your program according to the stages listed below, the output will evolve. Full output examples for this and two other test files are linked from the FAQ page. You should also check your program against other inputs too; and can easily generate your own tests using a text editor. Testing and debugging is your responsibility

Explanation / Answer

#include <bits/stdc++.h>
using namespace std;

// Utility function to calculate minimum of two numbers
int min(int a, int b)
{
return (a < b) ? a : b;
}

// Function to calculate maximum overlap in two given strings
int overlappingPair(string st1, string st2, string &str)
{
// max will store maximum overlap i.e maximum
// length of the matching prefix and suffix
int max = INT_MIN;
int le1 = st1.length();
int le2 = st2.length();

// check suffix of str1 matches with prefix of str2
for (int i = 1; i <= min(le1, le2); i++)
{
// compare last i characters in str1 with first i
// characters in str2
if (st1.compare(le1-i, i, st2, 0, i) == 0)
{
if (max < i)
{
//update max and str
max = i;
str = st1 + st2.substr(i);
}
}
}

// check prefix of str1 matches with suffix of str2
for (int i = 1; i <= min(le1, le2); i++)
{
// compare first i characters in str1 with last i
// characters in str2
if (st1.compare(0, i, st2, le2-i, i) == 0)
{
if (max < i)
{
//update max and str
max = i;
str = st2 + st1.substr(i);
}
}
}

return max;
}

// Function to calculate smallest string that contains
// each string in the given set as substring.
string shortestSuperstring(string ar[], int le)
{
// run len-1 times to consider every pair
while(le != 1)
{
int max = INT_MIN; // to store maximum overlap
int l, r; // to store array index of strings
// involved in maximum overlap
string reStr; // to store resultant string after
// maximum overlap

for (int i = 0; i < le; i++)
{
for (int j = i + 1; j < le; j++)
{
string str;

// res will store maximum length of the matching
// prefix and suffix str is passed by reference and
// will store the resultant string after maximum
// overlap of arr[i] and arr[j], if any.
int re = overlappingPair(ar[i], ar[j], str);

// check for maximum overlap
if (max < re)
{
max = re;
reStr.assign(str);
l = i, r = j;
}
}
}

le--; //ignore last element in next cycle

// if no overlap, append arr[len] to arr[0]
if (max == INT_MIN)
ar[0] += ar[le];
else
{
ar[l] = reStr; // copy resultant string to index l
ar[r] = ar[le]; // copy string at last index to index r
}
}
return ar[0];
}

// Driver program
int main()
{
string arr[] = {"catgc", "ctaagt", "gcta", "ttca", "atgcatc"};
int len = sizeof(arr)/sizeof(arr[0]);

cout << "The Shortest Superstring is "
<< shortestSuperstring(arr, len);

return 0;
}

Hire Me For All Your Tutoring Needs
Integrity-first tutoring: clear explanations, guidance, and feedback.
Drop an Email at
drjack9650@gmail.com
Chat Now And Get Quote