Python In the following code cell, write a function named get_ec() that takes xm
Question
Python
In the following code cell, write a function named get_ec() that takes xml, a string in xml format containing protein entries, and returns a list of EC numbers (text of the "ecNumber" element) for each of these entries.
Example
For the string in the code cell below:
ec_string1 = '''<?xml version="1.0" encoding="UTF-8"?>
<entries>
<protein>
<recommendedName>
<fullName>Nucleoside diphosphate kinase</fullName>
<shortName>NDK</shortName>
<shortName>NDP kinase</shortName>
<ecNumber>2.7.4.6</ecNumber>
</recommendedName>
</protein>
<protein>
<recommendedName>
<fullName>Tyrosine--tRNA ligase</fullName>
<ecNumber>6.1.1.1</ecNumber>
</recommendedName>
<alternativeName>
<fullName>Tyrosyl-tRNA synthetase</fullName>
<shortName>TyrRS</shortName>
</alternativeName>
</protein>
</entries>
'''.strip()
def get_ec(xml):
    '''
    Takes entries of proteins and returns the text from the element "ecNumber" as a list.

    Parameters
    ----------
    xml: String. An XML format string.

    Returns
    -------
    ec_list: List. Contains text of the element "ecNumber" for each protein entry.
    '''
    ec_list = []
    # YOUR CODE HERE
    return ec_list
Explanation / Answer
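We build two lists of indexes: the positions where each <ecNumber> opening tag starts, and the positions where each </ecNumber> closing tag starts.
For each pair of indexes, the substring between the end of the opening tag and the start of the closing tag is the EC number.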
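def get_ec(xml):
    open_tag = '<ecNumber>'
    close_tag = '</ecNumber>'
    # Indexes at which an opening / closing tag starts (range, not xrange, in Python 3)
    ec_starting = [n for n in range(len(xml)) if xml.startswith(open_tag, n)]
    ec_end = [n for n in range(len(xml)) if xml.startswith(close_tag, n)]
    ec_list = []
    for f, b in zip(ec_starting, ec_end):
        # Skip the 10 characters of '<ecNumber>' to reach the start of the text
        ec_list.append(xml[f + len(open_tag):b])
    return ec_list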
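As an alternative to scanning for tags by index, the same result can be sketched with the standard library's XML parser. This is only one possible approach, not necessarily the one the exercise expects. The function name get_ec_etree and the shortened sample string are made up for illustration. It assumes the input is well-formed XML with no leading whitespace before the XML declaration, which is why the question's string is passed through .strip().

```python
import xml.etree.ElementTree as ET

# Hypothetical shortened sample, modeled on the string in the question.
sample = '''<?xml version="1.0" encoding="UTF-8"?>
<entries>
<protein>
<recommendedName>
<fullName>Nucleoside diphosphate kinase</fullName>
<ecNumber>2.7.4.6</ecNumber>
</recommendedName>
</protein>
<protein>
<recommendedName>
<fullName>Tyrosine--tRNA ligase</fullName>
<ecNumber>6.1.1.1</ecNumber>
</recommendedName>
</protein>
</entries>
'''.strip()


def get_ec_etree(xml):
    # Parse the whole document; raises ParseError if the XML is malformed.
    root = ET.fromstring(xml)
    # iter() walks all descendants in document order and keeps matching tags.
    return [element.text for element in root.iter('ecNumber')]


print(get_ec_etree(sample))
```

Like the index-based version, this collects every ecNumber element in document order, so a protein entry without an EC number simply contributes nothing to the list.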