Python In the following code cell, write a function named get_sub_seq() that tak
ID: 3601778 • Letter: P
Question
Python
In the following code cell, write a function named get_sub_seq() that takes xml, an amino acid sequence entry in xml format (as shown in the code cell below), start and end, two integers specifying the starting and ending indices of the sub-sequence (slice of the original sequence) being returned. The sub-sequence must include the amino acid letter at the end index.
Hint:
Use replace() to remove newline characters in the "sequence" element text.
There will always be one sequence entry, in xml format, supplied as an argument to get_sub_seq(). Therefore, a for loop wouldn't be necessary.
Example
For the string in the code cell below:
seq_string1 = '''<?xml version="1.0" encoding="UTF-8"?>
<sequence length="1263" mass="143449" checksum="411533D3BB51B502" modified="2004-12-07" version="1">
MSKTSKSNKNPKSIEEKYQKKNLHEHILHSPDTYIGSIEEKTCNMWIFNESAGEDDAKII
FKEITYVPGLYKIYDEVIVNAADHNKRCQTCNIIKVDIDQKTGQISVWNNGDGIDVAIHK
EHNIWVPSMIFGELLTSTNYDKNEEKTVGGKNGFGAKLANIYSVEFTIETVDANKGKKFF
QRFTNNMYDKEEPKISSFKKSSYTKITFIPDFKKFGLKCLDDDTLALFKKRVFDLAMTTN
AKVYFNDKQIVQNNFKKYVGLYFPEGSQHKVVIDDTTHERWKVGVIYDPTDQLEHQNISF
VNSICTSRGGTHVEQVVGQIVNGLKTAIVKKAKNVQIKPAMIKENLIFFVDATIVNPDFD
TQTKEYLTKKAANFGSKFEVTEKFIKGVIKTGVCDQIIANAKAREEANLSKTDGKGRGPV
RYEKLYNAHKAGTKEGYKCTLILTEGDSAKTFAMSGLNVIGRDYYGVFPLRGKLLNVRDA
SPKKIADNEEITAIKKIVGLEQGKVYDDLKGLRYGSIMILADQDVDGYHIKGLIMNFIHC
FWPSLVKYEGFIQSFATPLLKATKGKGKTKQVVAFTSPQSFEEWKKENNDGKGWSIKYYK
GLGTSDPAEAQECFADLNDKLVKYFWEPKKKNLESESNSKSVDSNKSKTTNKKKIESEFI
EEESDIISDTYKPKNKDISEDAMTLAFAGGREDDRKIWINTYNPDNYLDPSKKRISYYDF
IHKELITFSVDDVLRSVPNLMDGFKPSHRKVFYGSVEKNIYKQEIKVSDLTGFVSNMTKY
HHGDQSLSSTIVGMAQNYVGSNNLNLLMPLGMFGSRLTGGKDSASPRYLNTKLDDLAKKI
FIDYDFDILQHQSEDNCRIEPVYYAPIIPMILVNGAEGIGTGYSTKIYPCNPRDIIANIK
RLLTNENPKTMKPWFRHLTGTIEKIDGAKYISRAKYEIIGKDTIHITDLPVGIWTDNYKA
FLDNLIVQGTAQNAEEKKASKAVSSAKNTKTTTKAGSKTGSRTRKNPALAKKSQKSVTAK
VAKKNPVASSIKTYSEDCTDIRISFTIVFHPGKLDTLIKSGKLDTGLKLVKPLNLTNMHL
FNEKGKIKKYDTYGAILRNFVKVRLNLYQKRKDYLLGKWKKEMDILKWKVKFIEYVIEGK
IVIFKNGKSKKKEEVLKALEDLKFPKFIVGNESYPSYGYITSIGLFNLTLEEVEKLKKQL
ADKKQELAILEAKSPEEIWEEELDEFVEAYDIWEKEVDENYNDLLNKKKGSTGKKSRKTS
TQK
</sequence>
'''.strip()
def get_sub_seq(xml, start, end):
'''
Takes a sequence entry in xml format and returns a sub-sequence with indices specified by the `start` and `end`.
Character at `end` index must be included in the sequence.
Paramters
---------
xml: String. An XML format string.
start: int. starting index of the sub-sequence.
end: int. end index of the sub-sequence (character at this index must be included in the sub-sequence).
Returns
-------
sub_seq: String. Sub-sequence.
'''
sub_seq = ''
# YOUR CODE HERE
return sub_seq
Explanation / Answer
The code is so simple I think there is no need for documentation.If you want something to be changed please comment.
Thank You
Related Questions
Navigate
Integrity-first tutoring: explanations and feedback only — we do not complete graded work. Learn more.