Academic Integrity: tutoring, explanations, and feedback — we don’t complete graded work or submit on a student’s behalf.

Hello; I’m trying to modify an R-script (below) to calculate the Viterbi path fo

ID: 3924627 • Letter: H

Question

Hello; I’m trying to modify an R-script (below) to calculate the Viterbi path for the following nucleotide sequence: CGCTTCAGGCAAGT

for sequence 1 (coding, c) and sequence 2 (noncoding, n)

Initial transition probablities:

a0c= a0n =0.5

ann= anc =0.5

acc =0.55 acn= 0.45

where, aij is the transition probability)

The emission probabilities for the 4 nucleotides (A, C, G, T) are:

pA=0.20 pC=0.30 pG=0.36 pT=0.14 for the coding sequence (c), and

pA=0.19 pC=0.28 pG=0.33, and pT=0.20 for the non-coding sequence (n)

I got this example R-script from the CRAN project, and need help populating the correct values in each variable and/or otherwise modifying the script.

Here is the example script:

#Initialise HMM

hmm = initHMM(c("A","B"), c("L","R"), transProbs=matrix(c(.6,.4,.4,.6),2),

emissionProbs=matrix(c(.6,.4,.4,.6),2))

print(hmm)

# Sequence of observations

observations = c("L","L","R","R")

# Calculate Viterbi path

viterbi = viterbi(hmm,observations)

print(viterbi)

Here is what I have so far:

# Initialise HMM

hmm = initHMM(c("1", "2", "3", "4"), c("A", "C", "G", "T"), startProbs=matrix(c(0.25, 0.25, 0.275, 0.225),transProbs=matrix(c(0.20,0.30,0.36,0.14),emissionProbs=matrix(c(0.20,0.30,0.36,0.14))

print(hmm)

# Sequence of observations

observations=c("C","G","C","T","T","C","A","G","G","C","A","A","G")

# Calculate Viterbi path

viterbi = viterbi(hmm,observations)

print(viterbi)

Here is the console output:

# Initialise HMM

> hmm = initHMM(c("1", "2", "3", "4")), c("A", "C", "G", "T")), startProbs=matrix(c(0.25, 0.25, 0.275, 0.225)),transProbs=matrix(c(0.20,0.30,0.36,0.14)),emissionProbs=matrix(c(0.20,0.30,0.36,0.14))

Error: unexpected ',' in "hmm = initHMM(c("1", "2", "3", "4")),"

>

> print(hmm)

$States

[1] "A" "B"

$Symbols

[1] "L" "R"

$startProbs

A   B

0.5 0.5

$transProbs

    to

from   A   B

   A 0.6 0.4

   B 0.4 0.6

$emissionProbs

      symbols

states   L   R

     A 0.6 0.4

     B 0.4 0.6

> # Sequence of observations

> observations=c("C","G","C","T","T","C","A","G","G","C","A","A","G")

> # Calculate Viterbi path

> viterbi = viterbi(hmm,observations)

Error in hmm$emissionProbs[state, observation[1]] :

subscript out of bounds

> print(viterbi)

[1] "A" "A" "B" "B"

>

Explanation / Answer

you have made a small mistake. the first line will be

hmm = initHMM( c("1", "2", "3", "4"), c("A", "C", "G", "T"), startProbs=matrix(c(0.25, 0.25, 0.275, 0.225),2),transProbs=matrix(c(0.20,0.30,0.36,0.14),2),emissionProbs=matrix(c(0.20,0.30,0.36,0.14),2))

here is the entire code and output

code

require(HMM)
# Initialise HMM
hmm = initHMM( c("1", "2", "3", "4"), c("A", "C", "G", "T"), startProbs=matrix(c(0.25, 0.25, 0.275, 0.225),2),transProbs=matrix(c(0.20,0.30,0.36,0.14),2),emissionProbs=matrix(c(0.20,0.30,0.36,0.14),2))
print(hmm)
# Sequence of observations
observations=c("C","G","C","T","T","C","A","G","G","C","A","A","G")
# Calculate Viterbi path
viterbi = viterbi(hmm,observations)
print(viterbi)

OUTPUT