
(Bioinformatics: find genes) Biologists use a sequence of letters A, C, T, and G


Question

(Bioinformatics: find genes) Biologists use a sequence of letters A, C, T, and G to model a genome. A gene is a substring of a genome that starts after a triplet ATG and ends before a triplet TAG, TAA, or TGA. Furthermore, the length of a gene string is a multiple of 3 and the gene does not contain any of the triplets ATG, TAG, TAA, and TGA. Write a program that prompts the user to enter a genome and displays all genes in the genome. If no gene is found in the input sequence, the program displays "no gene is found". Here are the sample runs: Enter a genome string: TTATGTTTTAAGGATCCCCTTAGTT GGGCGT

Explanation / Answer

def isGene(gene):
    # A gene is non-empty, its length is a multiple of 3,
    # and it contains none of the triplets ATG, TAG, TAA, or TGA.
    if len(gene) == 0 or len(gene) % 3 != 0:
        return False
    return not any(triplet in gene for triplet in ("ATG", "TAG", "TAA", "TGA"))

def printGenes(genome):
    # Every piece after the first one follows an ATG start triplet.
    # If the genome itself starts with ATG, the first piece is just
    # the empty string, so it can always be dropped.
    segments = genome.split("ATG")[1:]

    genes = []
    for segment in segments:
        for stop in ("TAA", "TAG", "TGA"):
            if stop in segment:
                # Candidate gene: everything before the first occurrence of this stop triplet.
                gene = segment.split(stop)[0]
                if gene not in genes and isGene(gene):
                    genes.append(gene)

    if genes:
        for gene in genes:
            print(gene)
    else:
        # The problem statement asks for this message when nothing is found.
        print("no gene is found")

genome = input("Enter a genome string: ")
printGenes(genome)

# copy pastable code: https://paste.ee/p/BdXrA
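
If you want to check the logic without typing input each time, here is a minimal sketch of the same idea that returns the genes as a list. The name find_genes and the constants START and STOPS are my own (hypothetical, not from the problem or the answer above). It scans with str.find instead of split, and unlike the answer above it does not remove duplicate genes.

```python
# Hypothetical helper, not part of the original answer: returns the genes
# as a list instead of printing them, so the result can be checked.
START = "ATG"
STOPS = ("TAG", "TAA", "TGA")


def find_genes(genome):
    genes = []
    pos = genome.find(START)
    while pos != -1:
        body_start = pos + len(START)
        # Positions of each stop triplet at or after the end of this ATG.
        stop_positions = [
            p for p in (genome.find(stop, body_start) for stop in STOPS) if p != -1
        ]
        if stop_positions:
            # The candidate ends before the earliest stop triplet,
            # so it cannot contain a stop triplet itself.
            candidate = genome[body_start:min(stop_positions)]
            if candidate and len(candidate) % 3 == 0 and START not in candidate:
                genes.append(candidate)
        # Move on to the next ATG, if there is one.
        pos = genome.find(START, pos + 1)
    return genes


if __name__ == "__main__":
    # Example input chosen for illustration only.
    print(find_genes("TTATGTTTTAAGGATGGGGCGTTAGTT"))  # ['TTT', 'GGGCGT']
```

An empty list means "no gene is found", so the caller can decide what to print.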
