Efficient preprocessing of the data is necessary before it can be fed into the net. All information must be normalized to fit into the interval $[0,\; 1]$. We assume that the necessary information is given for T weeks in the past. With the following definitions
$ADV_{i}^{t} := \text{number of advertising days for article } i \text{ within week } t$

$SAL_{i}^{t} := \text{sale of article } i \text{ within week } t$

$MAXSAL_{i} := \max_{1 \le t \le T} SAL_{i}^{t}$

we have decided to use the following inputs for each article i and week t:
For each article i and each recent week t we use a three-dimensional vector:
$vec_{i}^{t} := (adv_{i}^{t},\; pri_{i}^{t},\; sal_{i}^{t})$

For a week t in the future the vector is reduced by the unknown sale:
$vec_{i}^{t} := (adv_{i}^{t},\; pri_{i}^{t})$

To predict the sale for one article within a week t, we use a window of the last n weeks. So we have the following input vector for each article i:
$input_{i}^{t} := (vec_{i}^{t-n},\; vec_{i}^{t-n+1},\; \ldots,\; vec_{i}^{t-1},\; vec_{i}^{t})$
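
The window construction can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function name `window_input` is hypothetical, the arrays are assumed to hold values already normalized to $[0,\; 1]$, weeks are indexed from 0, and week t itself is assumed to contribute only $(adv, pri)$ because its sale is the quantity to be predicted.

```python
import numpy as np

def window_input(adv, pri, sal, t, n):
    """Build input_i^t for one article from weeks t-n .. t (0-based indices).

    adv, pri, sal: sequences of already-normalized values in [0, 1].
    Weeks t-n .. t-1 contribute (adv, pri, sal); week t contributes only
    (adv, pri), on the assumption that its sale is the value to predict.
    """
    if t - n < 0:
        raise ValueError("window reaches before the first recorded week")
    parts = []
    for s in range(t - n, t):
        parts.extend([adv[s], pri[s], sal[s]])   # full vector for past weeks
    parts.extend([adv[t], pri[t]])               # reduced vector for week t
    return np.asarray(parts, dtype=float)
```

Under these assumptions the resulting vector has 3n + 2 components per article.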

Because all the articles considered belong to one product group, the total sales volume across all products is fairly constant. An increase in the sale of one article therefore generally leads to a decrease in the sales of the others. For this reason, we concatenate the input vectors of all p articles to get the vector given to the input layer:
$INPUT^{t} := (input_{1}^{t},\; input_{2}^{t},\; \ldots,\; input_{p}^{t})$

The sale of article i within week t, $sal_{i}^{t}$, is the target value in the output layer that one net has to learn for this $INPUT^{t}$ vector. We therefore have p nets, and the i-th net learns the sales behaviour of article i. This gives a training set with the following pairs (see figure 4):
$(INPUT^{t},\; sal_{i}^{t})$ with $n \le t \le T$
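
As a rough sketch of how such a training set could be assembled, the following builds the concatenated vectors and their targets for the i-th net. The function name `build_training_set` and the `(p, T)` array layout are assumptions made for illustration. Weeks are indexed from 0, so the loop starts at the first week that has n full weeks of history, and week t is again assumed to contribute only $(adv, pri)$.

```python
import numpy as np

def build_training_set(adv, pri, sal, i, n):
    """Return (X, y) for the i-th net.

    adv, pri, sal: arrays of shape (p, T), normalized to [0, 1], weeks 0-based.
    Each row of X is one INPUT^t: the windows of all p articles, concatenated.
    The matching entry of y is the target sal_i^t for the same week t.
    """
    p, T = sal.shape
    X, y = [], []
    for t in range(n, T):                 # weeks with n full weeks of history
        row = []
        for a in range(p):                # concatenate over all p articles
            for s in range(t - n, t):
                row.extend([adv[a, s], pri[a, s], sal[a, s]])
            row.extend([adv[a, t], pri[a, t]])   # sale of week t is withheld
        X.append(row)
        y.append(sal[i, t])
    return np.array(X, dtype=float), np.array(y, dtype=float)
```

Each of the p nets would receive the same matrix `X` but a different target vector `y`, which reflects the idea that the sale of one article depends on the advertising and prices of all articles in the group.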

To forecast the unknown sale $sal_{i}^{T+1}$ for any article i within a future week T+1, we give the following input vector to the trained i-th net:
$INPUT^{T+1}$

The output value of this net is expected to be the value $sal_{i}^{T+1}$, which has to be rescaled to the value for the sale of article i within week T+1:
$SAL_{i}^{T+1} = sal_{i}^{T+1} \cdot MAXSAL_{i}$
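
The rescaling step can be illustrated with a short sketch. It assumes that the normalized sale is obtained by dividing by $MAXSAL_{i}$, which is inferred here from the rescaling formula rather than stated explicitly. The function names `normalize_sales` and `rescale_forecast` are hypothetical.

```python
import numpy as np

def normalize_sales(SAL):
    """SAL: raw weekly sales, shape (p, T). Returns (sal, MAXSAL).

    Assumes sal_i^t = SAL_i^t / MAXSAL_i, the inverse of the rescaling step.
    """
    SAL = np.asarray(SAL, dtype=float)
    MAXSAL = SAL.max(axis=1)              # per-article maximum over all weeks
    if np.any(MAXSAL <= 0):
        raise ValueError("every article needs at least one positive sale")
    return SAL / MAXSAL[:, None], MAXSAL

def rescale_forecast(sal_pred, MAXSAL, i):
    """Map a net output in [0, 1] back to a sale figure for article i."""
    return sal_pred * MAXSAL[i]
```

For example, an article whose largest weekly sale was 40 units and whose net outputs 0.5 for week T+1 would be forecast at 20 units.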
