For the backward phase (figure 7), neuron j in the output layer calculates the error between its actual output value $o_j$, known from the forward phase, and the expected target value $t_j$:
$\delta_j := (t_j - o_j) \cdot f'(a_j) = (t_j - o_j) \cdot o_j \cdot (1 - o_j)$
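Assuming the logistic (sigmoid) activation function, for which $f'(a_j) = o_j(1 - o_j)$, the output-layer error term above can be sketched as follows; the concrete values are hypothetical:

```python
def output_delta(t_j, o_j):
    """Error term of output neuron j: (t_j - o_j) * f'(a_j),
    where f'(a_j) = o_j * (1 - o_j) for the logistic activation."""
    return (t_j - o_j) * o_j * (1.0 - o_j)

# hypothetical example: target t_j = 1.0, actual output o_j = 0.7
delta_j = output_delta(1.0, 0.7)  # 0.3 * 0.7 * 0.3 = 0.063
```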
The error $\delta $_{j} is propagated backwards to the previous hidden layer.
Neuron i in a hidden layer calculates an error $\delta'_i$ that is again propagated backwards to its previous layer. For this, a column of the weight matrix is used.
$\delta'_i := \left(\sum_j w_{ji} \cdot \delta_j\right) \cdot f'(a_i) = \left(\sum_j w_{ji} \cdot \delta_j\right) \cdot o_i \cdot (1 - o_i)$
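A minimal sketch of this hidden-layer step, again assuming the logistic activation; `w_column` stands for the column of the weight matrix holding the weights $w_{ji}$ from hidden neuron i to every neuron j of the next layer:

```python
def hidden_delta(w_column, deltas_next, o_i):
    """Error term of hidden neuron i:
    (sum_j w_ji * delta_j) * o_i * (1 - o_i),
    where w_column are the weights from neuron i to the next layer
    and deltas_next are the error terms delta_j of that layer."""
    weighted_sum = sum(w * d for w, d in zip(w_column, deltas_next))
    return weighted_sum * o_i * (1.0 - o_i)

# hypothetical values for two downstream neurons
d_i = hidden_delta([0.5, -0.2], [0.1, 0.4], 0.6)
```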
To minimize the error, the weights of the projective edges of neuron i and the bias values in the receptive layer have to be changed. The old values are increased by:
$\Delta w_{ji} = \eta \cdot \delta_j \cdot o_i$

$\Delta \theta_j = \eta \cdot \delta_j$
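The two update rules can be sketched as one function; the function name and arguments are illustrative, not part of the original text:

```python
def update(w_ji, theta_j, delta_j, o_i, eta=1.0):
    """One gradient-descent step: the weight w_ji grows by
    eta * delta_j * o_i, the bias theta_j by eta * delta_j."""
    new_w = w_ji + eta * delta_j * o_i
    new_theta = theta_j + eta * delta_j
    return new_w, new_theta

# hypothetical values: weight 0.5, bias 0.1, delta 0.2, output 0.5
w, theta = update(0.5, 0.1, 0.2, 0.5)
```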

$\eta$ is the training rate and has an empirical value: $\eta \approx 1$.
The backpropagation algorithm minimizes the error by the method of gradient descent, where $\eta$ is the length of each step.