Question
DNA sequencers produce output in a variety of formats. A common task that biotechnicians encounter is writing a program to collect the data from these sequencers and put it in a common format for analysis. In this homework, you will start with a shell program that is provided to you and write two file parsers.

In each provided data file, you will see that there are long sequences of characters that form the DNA sequence. The letters in the sequence (ACGT) represent the nucleic acid bases that make up DNA. The parsers should extract only those sequences, remove all the spaces in them if necessary, and change them to uppercase letters so that the results from different file formats can be compared.

There are three files:

1. (embl_data.txt)

    ID   AB000263 standard; RNA; PRI; 368 BP.
    XX
    AC   AB000263;
    XX
    DE   Homo sapiens mRNA for prepro cortistatin like peptide, complete cds.
    XX
    SQ   Sequence 368 BP;
         acaagatgcc attgtccccc ggcctcctgc tgctgctgct ctccggggcc acggccaccg        60
         ctgccctgcc cctggagggt ggccccaccg gccgagacag cgagcatatg caggaagcgg       120
         caggaataag gaaaagcagc ctcctgactt tcctcgcttg gtggtttgag tggacctccc       180
         aggccagtgc cgggcccctc ataggagagg aagctcggga ggtggccagg cggcaggaag       240
         gcgcaccccc ccagcaatcc gcgcgccggg acagaatgcc ctgcaggaac ttcttctgga       300
         agaccttctc ctcctgcaaa taaaacctca cccatgaatg ctcacgcaag tttaattaca       360
         gacctgaa                                                                368

2. (plain_data.txt)

    ACAAGATGCCATTGTCCCCCGGCCTCCTGCTGCTGCTGCTCTCCGGGGCCACGGCCACCGCTGCCCTGCC
    CCTGGAGGGTGGCCCCACCGGCCGAGACAGCGAGCATATGCAGGAAGCGGCAGGAATAAGGAAAAGCAGC
    CTCCTGACTTTCCTCGCTTGGTGGTTTGAGTGGACCTCCCAGGCCAGTGCCGGGCCCCTCATAGGAGAGG
    AAGCTCGGGAGGTGGCCAGGCGGCAGGAAGGCGCACCCCCCCAGCAATCCGCGCGCCGGGACAGAATGCC
    CTGCAGGAACTTCTTCTGGAAGACCTTCTCCTCCTGCAAATAAAACCTCACCCATGAATGCTCACGCAAG
    TTTAATTACAGACCTGAA

3. (main.cpp)

    #include <iostream>
    #include <string>
    using namespace std;

    class SequenceParser {
    protected:
        string sequence_data;
    public:
        virtual string parse_file(string filename) = 0;
    };

    class PlainParser : public SequenceParser {
    public:
        virtual string parse_file(string filename);
    };

    class EMBLParser : public SequenceParser {
    public:
        virtual string parse_file(string filename);
    };

    int main(int argc, const char * argv[]) {
        // Parse the file "plain_data.txt" using PlainParser

        // Parse the file "embl_data.txt" using the EMBLParser

        // Compare the two results. Print out a message saying if the two data strings are the
        // same or different. HINT: use the string class compare() method!

        // Sequence data formats can be found at http://www.genomatix.de/online_help/help/sequence_formats.html
        return 0;
    }

    string PlainParser::parse_file(string filename) {
        // Implement the parser, storing the resulting sequence in the sequence_data
        // member variable
    }

    string EMBLParser::parse_file(string filename) {
        // Implement the parser, storing the resulting sequence in the sequence_data
        // member variable
    }

Explanation / Answer
The data is not clear. Please repost the complete question with the proper data so that I can solve it and give you the best possible answer.
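For reference, the approach the question describes (keep only the sequence letters, drop spaces, convert to uppercase) could be sketched roughly as follows. This is only one possible reading of the assignment, not the instructor's solution. The helper names keep_letters_upper, read_plain, and read_embl are hypothetical and not part of the provided shell. The sketch assumes the real embl_data.txt has one record per file, with the sequence lines following the line that starts with "SQ", and that a line starting with "//" (the usual EMBL record terminator, which the sample in the question does not show) ends the record if present.

```cpp
#include <cctype>
#include <fstream>
#include <istream>
#include <sstream>  // std::istringstream, handy for trying the helpers on in-memory text
#include <string>

// Hypothetical helper: keep only the alphabetic characters of `text`, uppercased.
// Spaces, digits (such as the EMBL position counts), and punctuation are dropped.
std::string keep_letters_upper(const std::string& text) {
    std::string out;
    for (unsigned char c : text) {
        if (std::isalpha(c)) out += static_cast<char>(std::toupper(c));
    }
    return out;
}

// Plain format: every line is treated as sequence data.
std::string read_plain(std::istream& in) {
    std::string line, sequence;
    while (std::getline(in, line)) sequence += keep_letters_upper(line);
    return sequence;
}

// EMBL format (assumed layout): the sequence starts on the line after the
// one beginning with "SQ" and runs until a "//" line or the end of input.
std::string read_embl(std::istream& in) {
    std::string line, sequence;
    bool in_sequence = false;
    while (std::getline(in, line)) {
        if (line.compare(0, 2, "SQ") == 0) { in_sequence = true; continue; }
        if (line.compare(0, 2, "//") == 0) break;
        if (in_sequence) sequence += keep_letters_upper(line);
    }
    return sequence;
}

// Same class layout as the provided shell, with the parsers filled in.
class SequenceParser {
protected:
    std::string sequence_data;
public:
    virtual ~SequenceParser() = default;
    virtual std::string parse_file(std::string filename) = 0;
};

class PlainParser : public SequenceParser {
public:
    std::string parse_file(std::string filename) override {
        std::ifstream in(filename);  // if the file cannot be opened, the result is empty
        sequence_data = read_plain(in);
        return sequence_data;
    }
};

class EMBLParser : public SequenceParser {
public:
    std::string parse_file(std::string filename) override {
        std::ifstream in(filename);  // if the file cannot be opened, the result is empty
        sequence_data = read_embl(in);
        return sequence_data;
    }
};
```

In main, the two results could then be compared as the hint suggests: call parse_file on a PlainParser and an EMBLParser, and print "same" when plain_result.compare(embl_result) == 0 and "different" otherwise. Splitting the stream-based helpers from the file-opening wrappers is a design choice made here so the parsing logic can be tried on in-memory text as well as on the data files.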