On-line training updates all weights during each backward pass, i.e. after every item of the training set. Here the parallelization is very fine-grained: the matrix-vector operations themselves must be computed in parallel, which requires a lot of communication.
To reduce the communication between the processors we adopt the idea of [6]: for each neuron that is computed in parallel, both its receptive (incoming) and its projective (outgoing) weights are stored on the processor responsible for that neuron. Figure 3 shows the distribution of the neurons and the weight matrices over three processors.
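The partitioning can be sketched as follows. This is a minimal illustration, not the implementation of [6]: the function name, the sizes, and the block-wise assignment of neurons to processors are assumptions made for the example.

```python
import numpy as np

def distribute_layer(W_in, W_out, n_procs):
    """Assign each neuron of a hidden layer to a processor.
    For its block of neurons, processor p stores the receptive weights
    (rows of W_in, needed in the forward pass) and the projective
    weights (columns of W_out, needed to back-propagate the error),
    so no weights have to be communicated during an update."""
    blocks = np.array_split(np.arange(W_in.shape[0]), n_procs)
    return [{"neurons": b,
             "receptive": W_in[b, :],     # shape (|b|, n_inputs)
             "projective": W_out[:, b]}   # shape (n_outputs, |b|)
            for b in blocks]

# Illustrative sizes: 4 inputs, 7 hidden neurons, 2 outputs, 3 processors.
rng = np.random.default_rng(0)
W_in = rng.standard_normal((7, 4))
W_out = rng.standard_normal((2, 7))
for p, part in enumerate(distribute_layer(W_in, W_out, 3)):
    print(f"processor {p}: neurons {part['neurons'].tolist()}, "
          f"receptive {part['receptive'].shape}, "
          f"projective {part['projective'].shape}")
```

Each processor can then compute the forward activations and the weight updates of its own neurons locally, exchanging only the activation and error vectors.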