Academic Integrity: tutoring, explanations, and feedback — we don’t complete graded work or submit on a student’s behalf.

(b) Consider the multi-layer perceptron shown in Fig. 4.2. Use back propagation

ID: 3853232 • Letter: #

Question

(b) Consider the multi-layer perceptron shown in Fig. 4.2. Use back propagation algorithm to find updated values for weights w_4 and w_8, given the inputs (x_1 = 0.5, x_2 = 0) and the corresponding desired outputs (d_1 = 0, d_2 = 1). y_o1 and y_o2 are the outputs from the two neurons in the output layer. Assume that the error function is, E = 1/2 sigma^2_i = 1 [e_i]^2, where e_1 = d_1 - y_o1 and e_2 = d_2 - y_o2, the learning rate parameter is, eta = 1, and, the activation function is, phi = 1/1 + e^-x.

Explanation / Answer

Answer: Back-propagation algorithm works in two passes - forward pass, then backward pass.

---------------------------------

(Matlab code. See comments for explanation.)

%Given:
%inputs
x1=0.5;
x2=0.0;
x=[x1 x2];
%desired outputs
d1=0;
d2=1;
%learning rate parameter
eta=1;
%activation function
si=1./(1.+exp(x));

%weights
w1=2;w2=-1;w3=0.75;w4=-2;w5=-2;w6=4;w7=1;w8=2;
w=[w1 w2 w3 w4 w5 w6 w7 w8];

%forward pass
%The given perceptron is a 1-1-1 nn. Bias is not given, hence assuming 0.
%See the calculation below:

%Calculation for hidden layer
disp("Forward pass");
net_h1=w1*x1+w3*x2;
net_h2=w2*x1+w4*x2;
disp("Input for hidden layer:");
printf("net_h1:%f, net_h2:%f ",net_h1,net_h2);
out_h1=1/(1+exp(net_h1));
out_h2=1/(1+exp(net_h2));
disp("Ouput of hidden layer:");
printf("out_h1:%f, out_h2:%f ",out_h1,out_h2);

%Calculation for output layer
net_y1=w5*out_h1+w7*out_h2;
net_y2=w6*out_h1+w8*out_h2;
disp("Input for output layer:");
printf("net_y1:%f, net_y2:%f ",net_y1,net_y2);
y1=1/(1+exp(net_y1));
y2=1/(1+exp(net_y2));
disp("Ouput of output layer:");
printf("y1:%f, y2:%f ",y1,y2);

%Calculation of error
e1=d1-y1;
e2=d2-y2;
disp("Error:");
printf("e1:%f, e2:%f ",e1,e2);
E=(1/2)*(e1^e1+e2^e2);
printf("E:%f ",E);

%backward pass
%In backward pass, weights are updated.
%See the calculation below:

%First between output layer and hidden layer
%first, calculate change in total error w.r.t. output
%For w8, only y2 needs to be considered.
gradient_E_y1=-(d1-y1);
gradient_E_y2=-(d2-y2);
%then, change in output w.r.t net input
%For w8, only y2 needs to be considered.
gradient_y1_nety1=y1*(1-y1);
gradient_y2_nety2=y2*(1-y2);
%Now, total change in net input of y2 w.r.t. w8
gradient_nety2_w8=out_h2;
%overall change in error, E w.r.t w8
gradient_E_w8=gradient_E_y2*gradient_y2_nety2*gradient_nety2_w8;
%Now calculate updated value of w8
w8_updated=w8-eta*gradient_E_w8;
printf("Updated w8=%f ",w8_updated);

%Second, between hidden layer and input layer
%Now same calculation as done above will be repeated for w5 also

%first, calculate change in total error w.r.t. out_h1
%For w5, only h2 needs to be considered.
gradient_e1_y2=gradient_E_y2;
gradient_e1_nety2=gradient_e1_y2*gradient_y2_nety2;
gradient_nety2_outh2=w8;
gradient_e1_outh2=gradient_e1_nety2*gradient_nety2_outh2;
gradient_e2_y2=gradient_E_y2;
gradient_e2_nety2=gradient_e2_y2*gradient_y2_nety2;
gradient_e2_outh2=gradient_e2_nety2*gradient_nety2_outh2;
gradient_E_outh2=gradient_e1_outh2+gradient_e2_outh2;
%then, change in output w.r.t net input
%For w5, only h2 needs to be considered.
gradient_outh2_neth2=out_h2*(1-out_h2);
%Now, total change in net input of h2 w.r.t. w5
gradient_neth2_w5=x2;;
%overall change in error, E w.r.t w5
gradient_E_w5=gradient_E_outh2*gradient_outh2_neth2*gradient_neth2_w5;
%Now calculate updated value of w8
w5_updated=w5-eta*gradient_E_w5;
printf("Updated w5=%f ",w5_updated);

-------------------------------------

Output:

------------------------------

Input for hidden layer:
net_h1:1.000000, net_h2:-0.500000
Ouput of hidden layer:
out_h1:0.268941, out_h2:0.622459
Input for output layer:
net_y1:0.084576, net_y2:2.320684
Ouput of output layer:
y1:0.478868, y2:0.089424
Error:
e1:-0.478868, e2:0.910576
E:0.506310
Updated w8=2.046153
Updated w5=-2.000000

------------------------------------------