Question
Program Set 4 (10 points extra credit)

Biologists use a sequence of letters A, C, T, and G to model a genome. A gene is a substring of a genome that starts after a triplet ATG and ends before a triplet TAG, TAA, or TGA. Furthermore, the length of a gene string is a multiple of 3, and the gene does not contain any of the triplets ATG, TAG, TAA, and TGA. Write a program that prompts the user to enter a genome and displays all genes in the genome. If no gene is found in the input sequence, the program displays "no gene is found".

Here are the sample runs:

    RESTART: E:/HW3/HW3_4_genes.py
    Enter a genome string: TTATGTTTTAAGGATGGGGCGTTAGTT
    GGGCGT
    Enter a genome string: TGTGTGTATAT
    no gene is found

Test with 4 more genome strings:

    TGATGCTCTAAGGATGCGCCGTTGATT
    TGATGCTCTAGAGATGCGCCGTTGAATAT

Explanation / Answer
import re

def findGene(genome):
    # ATG matches the start triplet literally.
    # ((?:[ACTG]{3})+?) lazily captures the gene body, one triplet at a time.
    # (?:TAG|TAA|TGA) is a non-capturing group for the end triplet.
    pattern = re.compile(r'ATG((?:[ACTG]{3})+?)(?:TAG|TAA|TGA)')
    genes = pattern.findall(genome)  # run the search once and reuse the result
    if not genes:
        print("no gene is found")
    for part in genes:
        print(part)

def main():
    genome = input("Enter a genome string: ")
    findGene(genome)

main()
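
The regular expression above only looks at triplets in the reading frame that starts right after ATG, so a captured string can still contain ATG, or a stop triplet that sits across two triplets. As a cross-check, here is a minimal sketch that scans the genome one position at a time without `re`. It assumes the strict reading of the question: a gene must not contain any of the four triplets at any offset. The names `find_genes`, `display_genes`, `STOP_TRIPLETS`, and `FORBIDDEN` are made up for this sketch and are not part of the original answer.

```python
STOP_TRIPLETS = ("TAG", "TAA", "TGA")
FORBIDDEN = ("ATG",) + STOP_TRIPLETS


def find_genes(genome):
    """Return the genes found in genome, in order of appearance."""
    genes = []
    start = -1  # index just past the most recent ATG, or -1 if none is open
    for i in range(len(genome) - 2):
        triplet = genome[i:i + 3]
        if triplet == "ATG":
            # A new ATG always restarts the candidate gene.
            start = i + 3
        elif triplet in STOP_TRIPLETS and start != -1:
            candidate = genome[start:i]
            # Assumption: the forbidden triplets are rejected at any offset,
            # not only on triplet boundaries.
            if (candidate
                    and len(candidate) % 3 == 0
                    and not any(t in candidate for t in FORBIDDEN)):
                genes.append(candidate)
                start = -1  # wait for the next ATG
    return genes


def display_genes(genome):
    """Print each gene on its own line, or the fallback message."""
    genes = find_genes(genome)
    if not genes:
        print("no gene is found")
    for gene in genes:
        print(gene)


if __name__ == "__main__":
    # Strings taken from the question; swap in input() for interactive use.
    for sample in ("TGTGTGTATAT", "TGATGCTCTAAGGATGCGCCGTTGATT"):
        display_genes(sample)
```

Keeping the search in `find_genes` and the printing in `display_genes` makes it easy to run both versions on the test strings from the question and compare the results. If your instructor only wants stop triplets checked on triplet boundaries, drop the `FORBIDDEN` check and test `i` against the frame that begins at `start` instead.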