


Question

Program in Python

Biologists use a sequence of letters A, C, T, and G to model a genome. A gene is a substring of a genome that starts after a triplet ATG and ends before a triplet TAG, TAA, or TGA. Furthermore, the length of a gene string is a multiple of 3 and the gene does not contain any of the triplets ATG, TAG, TAA, and TGA. Write a program that prompts the user to enter a genome and displays all genes in the genome. If no gene is found in the input sequence, the program displays "No gene is found".

Here are the sample runs:

Enter a genome string: TTATGTTTTAAGGATGGGGCGTTAGTT [ENTER]
TTT
GGGCGT

Enter a genome string: TGTGTGTATAT [ENTER]
No gene is found

Explanation / Answer

# Prompt the user for the genome (Python 3: input, not raw_input)
genome = input("Enter a genome string: ")

# Triplets that end a gene
stops = ("TAG", "TAA", "TGA")
# found records whether at least one gene was printed
found = False

# pos is the index of the current start triplet ATG (-1 when none is left)
pos = genome.find("ATG")
while pos != -1:
    # A candidate gene starts right after the ATG
    start = pos + 3
    # Index of each end triplet after start; find() returns -1 when absent
    ends = [e for e in (genome.find(s, start) for s in stops) if e != -1]
    if not ends:
        # No end triplet remains, so no further gene is possible
        break
    # The end triplet that comes first closes the candidate
    end = min(ends)
    gene = genome[start:end]
    # A gene is non-empty, a multiple of 3 long, and contains no ATG
    if gene and len(gene) % 3 == 0 and "ATG" not in gene:
        print(gene)
        found = True
        # Continue the search after the end triplet
        pos = genome.find("ATG", end + 3)
    else:
        # Rejected candidate: try the next ATG
        pos = genome.find("ATG", pos + 1)

if not found:
    print("No gene is found")
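
To check the logic against the two sample runs without typing input each time, the scan from ATG to the first end triplet can be wrapped in a function that returns the genes as a list. This is only a sketch under the definition given in the question; the function name find_genes and the STOP_TRIPLETS constant are hypothetical names chosen for illustration.

```python
STOP_TRIPLETS = ("TAG", "TAA", "TGA")  # hypothetical constant name


def find_genes(genome):
    """Return the genes in genome, in order of appearance.

    Hypothetical helper: a gene starts after ATG, ends before the first
    TAG, TAA, or TGA, has a length that is a multiple of 3, and does not
    contain ATG.
    """
    genes = []
    pos = genome.find("ATG")
    while pos != -1:
        start = pos + 3
        # Positions of each end triplet after the start; -1 means absent.
        ends = [e for e in (genome.find(s, start) for s in STOP_TRIPLETS)
                if e != -1]
        if not ends:
            break  # no end triplet left, so no further gene can exist
        end = min(ends)
        gene = genome[start:end]
        if gene and len(gene) % 3 == 0 and "ATG" not in gene:
            genes.append(gene)
            pos = genome.find("ATG", end + 3)  # resume after the end triplet
        else:
            pos = genome.find("ATG", pos + 1)  # rejected, try the next ATG
    return genes


# The two sample runs from the question
for sample in ("TTATGTTTTAAGGATGGGGCGTTAGTT", "TGTGTGTATAT"):
    result = find_genes(sample)
    print("\n".join(result) if result else "No gene is found")
```

Returning a list instead of printing inside the loop keeps the "No gene is found" decision in one place: the caller prints that message only when the list is empty.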