Question

Learning Objectives

• Working with strings in Python

• Basic functions in Python

• Working with GenBank

• Understanding nucleotide composition in DNA

Strings in GenBank

Access GenBank at http://www.ncbi.nlm.nih.gov/genbank/. Find a nucleotide sequence for the rpoA gene from an organism of your choice (e.g., plant, animal, bacteria, archaea). To do this, you can search for "rpoA" in the search bar at the top of the page. Make sure it says "Nucleotide" in the pull-down menu to the left of the search bar. You can navigate to different organisms on the left- and right-hand sides of the page. You can also just search for "rpoA+chicken", as an example. Under each result, you should see "GenBank FASTA Graphics"; look at the GenBank record for information and the FASTA format to copy the DNA sequence.

1. Write a short paragraph explaining what this gene product (i.e., RNA polymerase) is/does. Be sure to note your gene’s accession ID. Hint: look in the right panel on GenBank, or use the web. (You can write your answer in the same file as your Python code by commenting out the text.)

2. Write a function that takes one input parameter, where the input is expected to be a string containing a DNA sequence, and call that function on the DNA sequence of the gene you found in part 1. Your function should analyze the DNA passed to it and print out the following information:

(a) Print the total number of DNA base pairs in the input DNA in the form of "The total length of the gene is [insert length here]" where the bracketed part is replaced by the actual length.


(b) Find the number of A’s, G’s, C’s and T’s in this sequence and print the result with each base on its own line:

"There are [insert number] A’s"

"There are [insert number] T’s"

"There are [insert number] G’s"

"There are [insert number] C’s"

(c) Calculate the percentage of each base in your DNA string and print the output to 1 decimal:

"%A = [insert percentage]"

"%T = [insert percentage]"

"%G = [insert percentage]"

"%C = [insert percentage]"

(d) Given that three nucleotides encode one amino acid, print the total number of amino acids produced by the DNA sequence. Round down to the lowest whole number.

Add the answers you got from running your program on this gene to the end of the commented material that was your answer to number 1.

Explanation / Answer

The function below handles part 2. Running it on the sequence from GenBank accession LC127062.1 prints:

$ python gene.py
The total length of the gene is 672
There are 189 A's
There are 219 T's
There are 116 G's
There are 148 C's
%A = 28.1
%T = 32.6
%G = 17.3
%C = 22.0
total number of amino acids produced by the DNA sequence 224

Code:

# Keeps "/" as true division on Python 2; has no effect on Python 3.
from __future__ import division

# source: https://www.ncbi.nlm.nih.gov/nuccore/LC127062.1?report=fasta
dna = "TTTTAGACAGTTATATGAACGAACAGATAAGTCTAATTCTTCGATTGTCATTTCTAACATTTTTTCTTTTTGTGTTTCTTCTTTTTCAATCATGATTTCAGCATTTTTAGCCTCATCCGTTAGATTCACAAAAATATCCAAATGTTCAGTCATGATTTTCGCAGCTAAACTCATCGCTTCCATCGGCATGATAGAACCGTCAGTCCAAATTTCCATCGTCAATTTATCAAAATCGTCACGACGGCCAACCCGCGTGTTTTCTACTTGGTAGTTAACACGATTTACTGGTGTGTAAATTGAATCAACTGGAATCACGCCAATTGGCATATCTTCCCGTTTGTTTTCATCAGCTTGCACATAACCACGACCAGGTCGAACAGTCAAGCGGGCATGAAAAGTTGCCCCTTCAGAAACGCTACAAATATACATATCTTTGTTTAAAATTTCTACATCACTGTCAACGATAATGTCACCAGCAGTAACAGTCGCTGGACCGGTAATGTCGATTTCAAGGGTTTTTTCTTCTTGCGTGTACATCTTAAGAGCAAGACCTTTGATATTCAAAATGATTTGTGCGACATCTTCGCGCACACCTTTGACGGTGGAGAATTCGTGTAAGACGCCATCAATTTGAATACTTGTGATCGCTGCCCCAGGCAATGAAGATAATAA"

def analyze_dna(dna_seq):
    # (a) total length of the sequence
    total = len(dna_seq)
    print("The total length of the gene is " + str(total))

    # (b) count each base
    count_a = dna_seq.count('A')
    count_t = dna_seq.count('T')
    count_g = dna_seq.count('G')
    count_c = dna_seq.count('C')
    print("There are " + str(count_a) + " A's")
    print("There are " + str(count_t) + " T's")
    print("There are " + str(count_g) + " G's")
    print("There are " + str(count_c) + " C's")

    # (c) percentage of each base, printed to 1 decimal place
    print("%A = " + "{:.1f}".format(100 * count_a / total))
    print("%T = " + "{:.1f}".format(100 * count_t / total))
    print("%G = " + "{:.1f}".format(100 * count_g / total))
    print("%C = " + "{:.1f}".format(100 * count_c / total))

    # (d) three nucleotides per amino acid; floor division rounds down
    amino_acids = total // 3
    print("total number of amino acids produced by the DNA sequence " + str(amino_acids))

analyze_dna(dna)
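
The counts and percentages can be cross-checked with a variant that returns the numbers instead of printing them. This is only a sketch and not part of the original answer: the name `base_composition` is hypothetical, and it assumes the input should be upper-cased first and that any character other than A, T, G or C (for example the ambiguity code N) should be reported separately rather than silently ignored.

```python
from collections import Counter


def base_composition(dna_seq):
    """Return (counts, percentages, other) for a DNA string.

    Hypothetical helper for illustration; not part of the original answer.
    """
    seq = dna_seq.upper()
    total = len(seq)
    tally = Counter(seq)
    counts = {base: tally[base] for base in "ATGC"}
    # Characters that are not A, T, G or C (e.g. N or whitespace) end up here.
    other = total - sum(counts.values())
    percentages = {
        base: (100 * n / total if total else 0.0) for base, n in counts.items()
    }
    return counts, percentages, other


counts, percentages, other = base_composition("ATGCatgcNN")
print(counts)       # {'A': 2, 'T': 2, 'G': 2, 'C': 2}
print(percentages)  # every base is 20.0
print(other)        # 2
```

If `other` is nonzero, the four percentages will not add up to 100, which is a quick sign that the pasted sequence contains something besides the four bases.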

# link for code: https://paste.ee/p/O3bJK
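
Copying the FASTA view from GenBank gives a header line beginning with ">" followed by the sequence wrapped over several lines, so it usually needs a little cleanup before it can be passed to the function as one string. The following is a minimal sketch under assumptions: `fasta_to_sequence` is a hypothetical name, the example record is made up for illustration, and only a single-record FASTA text is handled.

```python
def fasta_to_sequence(fasta_text):
    """Join the sequence lines of a single-record FASTA text into one string."""
    lines = fasta_text.strip().splitlines()
    # Skip header lines (those starting with '>') and trim each sequence line.
    seq_lines = [line.strip() for line in lines if not line.startswith(">")]
    return "".join(seq_lines).upper()


# Made-up record for illustration only; not a real GenBank entry.
example = """>example_id hypothetical record
ATGCGT
acgtaa
"""
sequence = fasta_to_sequence(example)
print(sequence)            # ATGCGTACGTAA
print(len(sequence) // 3)  # 4
```

The returned string can then be handed to the analysis function in place of a hard-coded sequence.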
