ScientistsperformingDNAsequencealignmentoftendesireamoreparametrized way of alig

ID: 3817478 • Letter: S

Question

ScientistsperformingDNAsequencealignmentoftendesireamoreparametrized way of aligning two strings than simply computing a longest common subse- quence between these two strings. One way to achieve such flexibility is to use the Smith-Waterman algorithm, which is generalization of the dynamic pro- gramming algorithm for computing a longest common subsequence so that it can handle weighted scoring functions. In particular, for two strings, X and Y , sup- pose we are given functions, defined on characters, a, from X, and b from Y , as follows: M (a, b) = the positive benefit for a match, if a = b M (a, b) = the negative cost for a mismatch, if a = b I(a) = the negative cost for inserting a at a position in X D(b) = the negative cost of deleting b from some position in Y . Given a string, X, of length n and a string, Y , of length m, describe an algorithm running in O(nm) time for finding the maximum-weight way of transforming X into Y , according to the above weight functions, using the operations of match- ing, substitution, insertion in X, and deletion from Y .

Explanation / Answer

We want to convert the string X into string Y. Let a is in X and b is in Y.

M (a, b) = the positive benefit for a match, if a==b

M(a, b) = the negative weight for a mismatch, i a!=b

I(a)=the negative weight for inserting a at a position in X

D(b) = the negative weight of deleting b from some position in Y

Suppose we need to convert a string X of size m to a string Y of size n with maximum weight.
First we will consider a smaller subproblem and define an optimal structure of the solution.
Let weight ( i , j ) denotes the maximum weight to convert first ' i ' characters of X to first ' j ' characters of Y.
There are 2 possibilities :
1 ) if X ( i ) == Y ( j ) , problem is reduced to finding weight ( i - 1 , j - 1 ) ( no edit required )
2 ) if X ( i ) != Y ( j ) , then we can choose any of the following edit operations :
    Insert : problem reduces to finding weight ( i , j - 1 ) + INSERT_weight
                i.e we insert the mismatched character and add the weight of insertion.
    Delete : problem reduces to finding weight ( i - 1 , j ) + DELETE_weight
    Replace : problem reduces to finding weight ( i - 1 , j - 1 ) + REPLACE_weight
Now, our objective is to find maximum edit weight i.e to maximize weight ( i , j ) . So, we will consider the maximum of the values of above subproblems or possibilities and the optimal structure of the solution is defined as :
weight ( i , j ) = max [ weight ( i , j - 1 ) + I , weight ( i - 1 , j ) + D , weight ( i - 1 , j - 1 ) + R ] if X ( i ) != Y ( j ) and
weight ( i , j ) = weight ( i - 1 , j - 1 ) if X ( i ) == Y ( j )
Now, we can find weight ( i , j ) for any values of i and j.
Finally, weight ( m , n ) gives the required maximum edit weight to convert X to Y.

Navigate

ScientistsperformingDNAsequencealignmentoftendesireamoreparametrized way of alig

Scipio Quiz 1 Below was the active trading information of US Treasury Bills in M

Integrity-first tutoring: explanations and feedback only — we do not complete graded work. Learn more.

ScientistsperformingDNAsequencealignmentoftendesireamoreparametrized way of alig

Question

Explanation / Answer

Related Questions

Navigate