
Please, I need help with this assignment: Follow the steps that we have outlined


Question

Please, I need help with this assignment:
Follow the steps that we have outlined in class for algorithm development to generate a program that reads in DNA sequences from a file and determines the content of A, T, C, and G in each sequence. Specifically, I am interested in the GC content (the percentage of the sequence that is G or C). The first line of the file will be an integer that tells you how many sequences there are in the file. Each following line will contain a single sequence. You will need to store the percentages of A, T, C, and G in a 2D array, because you need to know the average GC content of the genome to determine whether a bacterial gene is, or is not, pathogenic. If a bacterial gene has a higher GC content than the genome as a whole, then it is likely that the gene is pathogenic.
The Wikipedia page on GC content gives additional explanation: https://en.wikipedia.org/wiki/GC-content
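
The per-sequence percentages described above can be computed in a single pass over the string. What follows is only a minimal sketch, not the assignment's reference solution: the helper name baseContent is hypothetical, the column order A, T, C, G is assumed from the example output, and any character other than A, T, C, or G is assumed to be ignored.

```c
#include <ctype.h>
#include <stddef.h>

/* Hypothetical helper (not one of the assignment's required functions):
 * fills out[0..3] with the percentage of A, T, C, and G in seq.
 * Assumption: characters other than A/T/C/G (for example a trailing
 * newline) are skipped and do not count toward the total. */
void baseContent(const char *seq, float out[4]) {
    int counts[4] = {0, 0, 0, 0};
    int total = 0;

    for (size_t i = 0; seq[i] != '\0'; i++) {
        switch (toupper((unsigned char)seq[i])) {
            case 'A': counts[0]++; total++; break;
            case 'T': counts[1]++; total++; break;
            case 'C': counts[2]++; total++; break;
            case 'G': counts[3]++; total++; break;
            default: break; /* ignore anything else */
        }
    }

    for (int k = 0; k < 4; k++) {
        /* Guard against division by zero for an empty sequence. */
        out[k] = (total > 0) ? 100.0f * (float)counts[k] / (float)total : 0.0f;
    }
}
```

With this layout, the GC content of a sequence is out[2] + out[3], and each filled array can be copied into one row of the 2D content array.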
Specifications:
Inputs:
- File called sequences.txt (contains a plasmid of Yersinia pestis)

http://www.filedropper.com/sequences
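
The input format described above (a count on the first line, then one sequence per line) could be loaded roughly as follows. This is a sketch under assumptions: the helper names readLine and readSequences are hypothetical, sequences may be of any length, and the caller is responsible for freeing the returned strings.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: reads one line of any length from fp into a
 * heap-allocated string, without the trailing newline.
 * Returns NULL at end of file or if memory cannot be allocated. */
static char *readLine(FILE *fp) {
    size_t cap = 128, len = 0;
    char *buf = malloc(cap);
    int ch;

    if (buf == NULL) return NULL;
    while ((ch = fgetc(fp)) != EOF && ch != '\n') {
        if (ch == '\r') continue;       /* tolerate Windows line endings */
        if (len + 1 >= cap) {           /* keep room for the terminator */
            char *tmp = realloc(buf, cap * 2);
            if (tmp == NULL) { free(buf); return NULL; }
            buf = tmp;
            cap *= 2;
        }
        buf[len++] = (char)ch;
    }
    if (ch == EOF && len == 0) { free(buf); return NULL; }
    buf[len] = '\0';
    return buf;
}

/* Hypothetical helper: opens path, reads the sequence count from the
 * first line, then reads that many lines. On success returns an array
 * of strings and stores the number actually read in *count; returns
 * NULL if the file cannot be opened or the count is missing/invalid. */
char **readSequences(const char *path, int *count) {
    FILE *fp = fopen(path, "r");
    char **seqs;
    int n, i, ch;

    if (fp == NULL) return NULL;
    if (fscanf(fp, "%d", &n) != 1 || n <= 0) { fclose(fp); return NULL; }
    /* Discard whatever is left of the first line. */
    while ((ch = fgetc(fp)) != EOF && ch != '\n') { }

    seqs = malloc((size_t)n * sizeof *seqs);
    if (seqs == NULL) { fclose(fp); return NULL; }
    for (i = 0; i < n; i++) {
        seqs[i] = readLine(fp);
        if (seqs[i] == NULL) break;     /* fewer lines than promised */
    }
    fclose(fp);
    *count = i;
    return seqs;
}
```

A caller would use readSequences("sequences.txt", &n), process each seqs[i], and then free every string followed by the array itself.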

Outputs:
- File called content.txt containing A, T, C, G, and GC content of each sequence along with a pathogenicity prediction

http://www.filedropper.com/content_2

EX:
%A %T %C %G %GC pathogenic?
10 20 40 30 70 Y
20 50 10 20 30 N

Functions:
1. void printToFile(int seq, float content[seq][4], float avgGC)
a. prints the results out to a file
b. You should open and close your file in this function
2. float averageGC(int seq, float content[seq][4])
a. calculates the average GC content for the whole genome
3. char isPathogenic(float avgGC, float seqGC)
a. returns Y if pathogenic, N if not
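
The three required functions could be sketched as below, using the signatures given above. Several details are assumptions rather than part of the specification: each row of content is taken to hold the percentages in the order A, T, C, G; the genome-wide GC content is approximated by the unweighted mean of the per-sequence GC percentages; "higher" is read as strictly greater; and values are printed with two decimal places.

```c
#include <stdio.h>

/* Assumption: columns of content are ordered A, T, C, G,
 * matching the example output. */
enum { COL_A = 0, COL_T = 1, COL_C = 2, COL_G = 3 };

/* Returns 'Y' if the sequence's GC content is strictly above the
 * genome average, 'N' otherwise. */
char isPathogenic(float avgGC, float seqGC) {
    return (seqGC > avgGC) ? 'Y' : 'N';
}

/* Assumption: the genome average is the unweighted mean of the
 * per-sequence GC percentages (C column plus G column). */
float averageGC(int seq, float content[seq][4]) {
    float sum = 0.0f;

    if (seq <= 0) return 0.0f;  /* avoid dividing by zero */
    for (int i = 0; i < seq; i++)
        sum += content[i][COL_C] + content[i][COL_G];
    return sum / (float)seq;
}

/* Writes a header and one line per sequence to content.txt.
 * The file is opened and closed inside this function, as required. */
void printToFile(int seq, float content[seq][4], float avgGC) {
    FILE *fp = fopen("content.txt", "w");

    if (fp == NULL) {
        perror("content.txt");
        return;
    }
    fprintf(fp, "%%A %%T %%C %%G %%GC pathogenic?\n");
    for (int i = 0; i < seq; i++) {
        float gc = content[i][COL_C] + content[i][COL_G];
        fprintf(fp, "%.2f %.2f %.2f %.2f %.2f %c\n",
                content[i][COL_A], content[i][COL_T],
                content[i][COL_C], content[i][COL_G],
                gc, isPathogenic(avgGC, gc));
    }
    fclose(fp);
}
```

For the two example rows (10, 20, 40, 30) and (20, 50, 10, 20), the GC contents are 70 and 30, the average is 50, and the predictions are Y and N, which agrees with the example output. The array parameter float content[seq][4] is a C99 variable-length array parameter, so seq must come before it in the parameter list.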

Explanation / Answer

#include <stdio.h>
#include <stdlib.h>
#include <stdbool.h>
#include <string.h>
void input_sequence(int length,char input[]);
void search(char C[],char DNA[],int length);

int main(void) {
    //Given array
    char DNA[] = {'A', 'G', 'C', 'G', 'G', 'G', 'A', 'C', 'C', 'G', 'T', 'C',
          'C', 'C', 'G', 'A', 'C', 'A', 'T', 'T', 'G', 'A', 'T', 'G',
          'A', 'A', 'G', 'G', 'G', 'T', 'C', 'A', 'T', 'A', 'G', 'A',
          'C', 'C', 'C', 'A', 'A', 'T', 'A', 'C', 'G', 'C', 'C', 'A',
          'C', 'C', 'A', 'C', 'C', 'C', 'C', 'A', 'A', 'G', 'T', 'T',
          'T', 'T', 'C', 'C', 'T', 'G', 'T', 'G', 'T', 'C', 'T', 'T',
          'C', 'C', 'A', 'T', 'T', 'G', 'A', 'G', 'T', 'A', 'G', 'A',
          'T', 'T', 'G', 'A', 'C', 'A', 'C', 'T', 'C', 'C', 'C', 'A',
          'G', 'A', 'T', 'G', '\0'}; //NUL terminator marks the end of the array
    int length,i=0,k;
    /*Program should repeatedly ask the user for two things: the length of a search sequence,
    and the search sequence itself*/
    /*The program should terminate when the length of the input sequence is zero or less*/
    do{
        printf("Enter length of DNA sequence to match: ");
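        //Stop on invalid (non-numeric) input instead of looping forever
        if(scanf("%d",&length)!=1)
            break;
        //input sequence length has to be >0
        if(length>0){
            //Search sequence array, declared only once length is known to be positive
            char input[length];
            input_sequence(length,input);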
            /*The elements of the search sequence may take on one of five characters: A,G,T,C and *. The
            meaning of the ‘*’ character is that it matches all four nucleotides: A,G,T and C.*/
            for(i=0; i<length; i++){
                k=0;
                if(input[i]!='A'&&input[i]!='G'&&input[i]!='T'&&input[i]!='C'&&input[i]!='*'){
                    printf("Erroneous character input '%c' exiting\n",input[i]);
                    k=1;
                }
                if(k==1)
                    break;           
            }
            if(k==0){
                search(input,DNA,length);
            }
            k=0;
        }
    }
    while(length>0);
    printf("Goodbye\n");

    return (EXIT_SUCCESS);
}

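//Function to search for input sequence in the given array.
//DNA must be NUL-terminated; '*' in the search sequence matches any nucleotide.
void search(char C[],char DNA[],int length){
    int dnaLength = (int)strlen(DNA);
    int start,i,foundIndex = -1;
    bool found = false;
    //Try every starting position where the whole search sequence still fits
    for(start=0; start+length<=dnaLength && !found; start++) {
        //Compare the search sequence against DNA[start..start+length-1]
        for(i=0; i<length; i++) {
            if (C[i]!='*' && C[i]!=DNA[start+i])
                break;
        }
        if (i == length) {
            found = true;
            foundIndex = start;
        }
    }
    if (found)
        printf("Match of search sequence found at element %d\n",foundIndex);
}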

void input_sequence(int length,char input[]){
    int i;
    printf("Enter %d characters (one of AGTC*) as a search sequence: ",length);
    for(i=0; i<length; i++){
        scanf(" %c", &input[i]);
    }
}