Question
A DNA sequence is a sequence of some combination of the characters A (adenine), C (cytosine), G (guanine), and T (thymine), which correspond to the four nucleobases that make up DNA. Given a long DNA sequence, it is often necessary to compute the number of instances of a certain subsequence. For this exercise, you will develop a program that processes a DNA sequence from a file and, given a subsequence s, searches the DNA sequence and counts the number of times s appears. As an example, consider the following sequence:

GGAAGTAGCAGGCCGCATGCTTGGAGGTAAAGTTCATGGTTCCCTGGCCC

If we were to search for the subsequence GTA, we would find that it appears twice.

You will write a program (place your source in a file named dnaSearch.c) that takes, as command line inputs, an input file name and a valid DNA (sub)sequence. That is, it should be callable from the command line as follows:

./dnaSearch dna01.txt GTA

What you will submit via handin: dnaSearch.c.

Explanation / Answer
Hi, let me know if you need more information.
==============================================
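// Note: the assignment asks for dnaSearch.c; this solution is C++ and would
// need porting to C if a C compiler is required.
#include <fstream>
#include <iostream>
#include <string>
using namespace std;

int main(int argc, char **argv) {
    // Expect exactly two arguments: the input file name and a non-empty subsequence.
    if (argc != 3 || string(argv[2]).empty()) {
        cerr << "Usage: " << argv[0] << " <file> <subsequence>" << endl;
        return 1;
    }
    string search(argv[2]);
    ifstream fileInput(argv[1]);
    if (!fileInput.is_open()) {
        cerr << "Could not open " << argv[1] << endl;
        return 1;
    }
    string line;
    int counter = 0;
    // Loop on getline itself rather than eof(), so a failed read ends the loop.
    while (getline(fileInput, line)) {
        // Count non-overlapping matches within this line.
        string::size_type offset = line.find(search);
        while (offset != string::npos) {
            counter++;
            // Resume the search just past the match that was found.
            offset = line.find(search, offset + search.size());
        }
    }
    cout << search << " appears " << counter << " times" << endl;
    return 0;
}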
============================================================
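One thing to be aware of: the program above moves past each match by the full length of the search string, so overlapping occurrences are counted only once (for example, AA is found twice in AAAA, not three times). The question does not say which behaviour is wanted. Below is a rough sketch of the counting step as a separate function with a switch for overlapping matches. The names countOccurrences, allowOverlap and checkCountOccurrences are made up for this illustration and are not part of the assignment.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper, not part of the original answer.
// Counts occurrences of `pattern` in `text`.
//   allowOverlap == false: resume searching after the end of each match.
//   allowOverlap == true:  resume one character after the start of each match.
std::size_t countOccurrences(const std::string& text,
                             const std::string& pattern,
                             bool allowOverlap) {
    if (pattern.empty()) {
        return 0;  // an empty pattern would otherwise never advance
    }
    const std::size_t step = allowOverlap ? 1 : pattern.size();
    std::size_t count = 0;
    std::size_t pos = text.find(pattern);
    while (pos != std::string::npos) {
        ++count;
        pos = text.find(pattern, pos + step);
    }
    return count;
}

// Small self-check using the first sample sequence and an overlapping case.
void checkCountOccurrences() {
    const std::string dna = "GGAAGTAGCAGGCCGCATGCTTGGAGGTAAAGTTCATGGTTCCCTGGCCC";
    assert(countOccurrences(dna, "GTA", false) == 2);
    assert(countOccurrences("AAAA", "AA", false) == 2);
    assert(countOccurrences("AAAA", "AA", true) == 3);
}
```

For a pattern like GTA the two modes always agree, because GTA cannot overlap with itself. The switch only matters for patterns such as AA or ATA.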
INPUT 1: GGAAGTAGCAGGCCGCATGCTTGGAGGTAAAGTTCATGGTTCCCTGGCCC
OUTPUT:-
GTA appears 2 times
====================================
INPUT 2: GGAAGTAGCAGGCCGCATGCTTGGAGGTAAAGTTCATGGTTCCCTGTAGCCC
OUTPUT:-
GTA appears 3 times
=======================
INPUT 3: ACAAGATGCCATTGTCCCCCGGCCTCCTGCTGCTGCTGCTCTGTACCGGGGCCACGGCCACCGCTGCCCTGCC
CCTGGAGGGTGGCCCCACCGGCCGAGACAGCGAGCATATGCAGGAAGCGGCAGGAATAAGGAAAAGCAGC
CTCCTGACTTTCCTCGCTTGGTGGTTTGAGTGGACCTCCCAGGCCAGTGCCGGGCCCCTCATAGGAGAGG
AAGCTCGGGAGGTGGCCAGGCGGCAGGAAGGCGCACCCCCCCAGCAATCCGCGCGCCGGGACAGAATGCC
CTGCAGGAACTTCTTCTGGAAGACCTTCTCCTCCTGCAAATAAAACCTCACCCATGAATGCTCACGCAAG
TTTAATTACAGACCTGAA
OUTPUT:-
GTA appears 1 times
==============================
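INPUT 3 is wrapped over several lines. Because the program searches each line separately, it cannot see a match that straddles a line break. If the file is meant to hold one sequence that is merely wrapped (an assumption, the question does not say), one option is to read everything into a single string first and drop the whitespace. A minimal sketch, where readSequence and checkReadSequence are hypothetical names used only for this illustration:

```cpp
#include <cassert>
#include <cctype>
#include <istream>
#include <sstream>
#include <string>

// Hypothetical helper, not part of the original answer.
// Reads the whole stream and drops whitespace, including line breaks,
// so a sequence wrapped over several lines becomes one string.
std::string readSequence(std::istream& in) {
    std::string sequence;
    char c;
    while (in.get(c)) {
        if (!std::isspace(static_cast<unsigned char>(c))) {
            sequence.push_back(c);
        }
    }
    return sequence;
}

// Small self-check: "GTA" is split across the line break in this input.
void checkReadSequence() {
    std::istringstream wrapped("ACG\nTAC\n");
    const std::string sequence = readSequence(wrapped);
    assert(sequence == "ACGTAC");
    assert(sequence.find("GTA") == 2u);
}
```

Since std::ifstream is also a std::istream, the same function can be handed the opened file, and the search then runs once over the returned string instead of once per line.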
Thanks