Welcome to Ars-Informatica  


If you want to build a ship don't herd people together to collect wood and don't assign them tasks and work, but rather teach them to long for the endless immensity of the sea. (Antoine-Marie-Roger de Saint-Exupéry)

Handwritten letter recognition system

A commonly used method for recognizing handwritten characters is the backpropagation algorithm, applied to artificial neural networks.

Backpropagation, short for backward propagation of errors, is a common method of teaching artificial neural networks how to perform a given task.

It is a supervised learning method, and is a generalization of the delta rule to multilayer networks. It requires a teacher that knows, or can calculate, the desired output for any given input.

It is most useful for feed-forward networks (networks that have no feedback, or simply, that have no connections that loop).

Backpropagation requires that the activation function used by the artificial neurons (or "nodes") is differentiable.
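To illustrate the differentiability requirement, here is a minimal sketch of the logistic function and its derivative, next to a hard threshold that provides no usable gradient. The function names are illustrative and are not taken from the attached project.

```python
import math

def logistic(x):
    """Logistic (sigmoid) activation: maps any real input into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))

def logistic_derivative(x):
    """Derivative of the logistic function, expressed through its own output."""
    y = logistic(x)
    return y * (1.0 - y)

def step(x):
    """Hard threshold: its slope is 0 everywhere except at the jump,
    so an error signal cannot be propagated through it."""
    return 1.0 if x >= 0 else 0.0
```

The derivative is largest at x = 0 and shrinks towards 0 for large positive or negative inputs, which is why learning slows down when neurons saturate.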

Since handwritten characters vary from writer to writer, the algorithm used is the following:

  1. Present a training sample to the neural network: the user enters a set of handwritten characters.
  2. The user tells the network which characters were written, so that the algorithm can compare the network's output with the desired output for that sample.
  3. Calculate the error in each output neuron.
  4. For each neuron, calculate what the output should have been, and a scaling factor, that is, how much lower or higher the output must be adjusted to match the desired output. This is the local error.
  5. Adjust the weights of each neuron to lower the local error.
  6. Assign "blame" for the local error to neurons at the previous level, giving greater responsibility to neurons connected by stronger weights.
  7. Repeat from step 4 on the neurons at the previous level, using each one's "blame" as its error.
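The steps above can be sketched for a network with one hidden layer as follows. This is a minimal illustration in plain Python, assuming logistic neurons and a squared-error measure; the class name TinyNetwork, the learning rate and the layer sizes are hypothetical choices, and this is not the code of the attached example.

```python
import math
import random

def logistic(x):
    return 1.0 / (1.0 + math.exp(-x))

class TinyNetwork:
    """One hidden layer of logistic neurons; the last weight of each neuron is its bias."""

    def __init__(self, n_in, n_hidden, n_out, seed=0):
        rng = random.Random(seed)
        self.hidden = [[rng.uniform(-1, 1) for _ in range(n_in + 1)]
                       for _ in range(n_hidden)]
        self.output = [[rng.uniform(-1, 1) for _ in range(n_hidden + 1)]
                       for _ in range(n_out)]

    def forward(self, inputs):
        # A constant 1.0 is appended so the bias is treated like any other weight.
        h = [logistic(sum(w * x for w, x in zip(ws, inputs + [1.0])))
             for ws in self.hidden]
        o = [logistic(sum(w * x for w, x in zip(ws, h + [1.0])))
             for ws in self.output]
        return h, o

    def train_sample(self, inputs, targets, rate=0.5):
        # Steps 1-2: present the sample and obtain the network's output.
        h, o = self.forward(inputs)
        # Steps 3-4: local error of each output neuron
        # (difference from the target, scaled by the activation slope).
        delta_o = [(t - y) * y * (1.0 - y) for t, y in zip(targets, o)]
        # Step 6: blame for each hidden neuron, weighted by connection strength.
        # It is computed here from the weights as they were before adjustment.
        delta_h = [hj * (1.0 - hj) * sum(d * ws[j] for d, ws in zip(delta_o, self.output))
                   for j, hj in enumerate(h)]
        # Step 5: adjust the output weights to lower the local error.
        for ws, d in zip(self.output, delta_o):
            for j, x in enumerate(h + [1.0]):
                ws[j] += rate * d * x
        # Step 7: repeat the adjustment one level back, using the blame as the error.
        for ws, d in zip(self.hidden, delta_h):
            for j, x in enumerate(inputs + [1.0]):
                ws[j] += rate * d * x
        # Squared error measured before this update.
        return sum((t - y) ** 2 for t, y in zip(targets, o))
```

Calling train_sample repeatedly with the same sample, for example net.train_sample([0.0, 1.0], [1.0]), should make the returned error shrink. A real character recognizer would use one input per pixel or feature and one output per character class.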

As the algorithm's name implies, the errors (and therefore the learning) propagate backwards from the output nodes to the inner nodes. So technically speaking, backpropagation is used to calculate the gradient of the error of the network with respect to the network's modifiable weights.
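One way to see that the backward pass really yields a gradient is to compare it with a finite-difference estimate. The sketch below does this for a single logistic neuron with a squared error; the setup and the function names are assumptions made for illustration only.

```python
import math

def logistic(x):
    return 1.0 / (1.0 + math.exp(-x))

def error(weights, inputs, target):
    """Half the squared difference between the neuron's output and the target."""
    y = logistic(sum(w * x for w, x in zip(weights, inputs)))
    return 0.5 * (y - target) ** 2

def analytic_gradient(weights, inputs, target):
    """Gradient of the error obtained with the chain rule, as in backpropagation."""
    y = logistic(sum(w * x for w, x in zip(weights, inputs)))
    delta = (y - target) * y * (1.0 - y)  # derivative of the error w.r.t. the net input
    return [delta * x for x in inputs]

def numeric_gradient(weights, inputs, target, eps=1e-6):
    """Central finite differences: nudge each weight up and down and measure the error."""
    grads = []
    for i in range(len(weights)):
        up = list(weights)
        up[i] += eps
        down = list(weights)
        down[i] -= eps
        grads.append((error(up, inputs, target) - error(down, inputs, target)) / (2 * eps))
    return grads
```

The two gradients should agree to several decimal places. The numeric version needs two error evaluations per weight, whereas the chain-rule version obtains all components from a single forward and backward pass.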

This gradient is then almost always used in a simple stochastic gradient descent algorithm to find weights that minimize the error. Often the term "backpropagation" is used in a more general sense, to refer to the entire procedure encompassing both the calculation of the gradient and its use in stochastic gradient descent.
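As a rough sketch of how the gradient is used, the following trains a single logistic neuron on the logical OR function with stochastic gradient descent: the samples are visited in random order and the weights are moved against the gradient after each one. The data set, learning rate, epoch count and function names are illustrative assumptions.

```python
import math
import random

def logistic(x):
    return 1.0 / (1.0 + math.exp(-x))

def predict(weights, inputs):
    # The last weight is a bias, fed by a constant input of 1.
    return logistic(sum(w * x for w, x in zip(weights, list(inputs) + [1.0])))

def sgd_train(samples, epochs=2000, rate=0.5, seed=0):
    rng = random.Random(seed)
    weights = [0.0] * (len(samples[0][0]) + 1)
    order = list(samples)
    for _ in range(epochs):
        rng.shuffle(order)  # "stochastic": visit the samples in random order
        for inputs, target in order:
            y = predict(weights, inputs)
            # Derivative of E = (y - target)^2 / 2 with respect to the net input.
            delta = (y - target) * y * (1.0 - y)
            for i, x in enumerate(list(inputs) + [1.0]):
                weights[i] -= rate * delta * x  # step against the gradient
    return weights

OR_SAMPLES = [([0.0, 0.0], 0.0), ([0.0, 1.0], 1.0),
              ([1.0, 0.0], 1.0), ([1.0, 1.0], 1.0)]
```

After sgd_train(OR_SAMPLES), predict should return a value below 0.5 for [0.0, 0.0] and above 0.5 for the other three inputs.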

Backpropagation usually allows quick convergence on satisfactory local minima for error in the kind of networks to which it is suited.

Backpropagation networks are necessarily multilayer perceptrons (usually with one input, one hidden, and one output layer).

In order for the hidden layer to serve any useful function, multilayer networks must have non-linear activation functions for the multiple layers: a multilayer network using only linear activation functions is equivalent to some single-layer, linear network. Non-linear activation functions that are commonly used include the logistic function, the softmax function, and the Gaussian function.
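The collapse of purely linear layers can be checked numerically. In this small sketch the weight matrices W1 and W2 and the input x are made-up values, and biases are left out for brevity; applying the two layers in sequence gives the same result as applying the single matrix W2 * W1.

```python
def matvec(m, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(a * b for a, b in zip(row, v)) for row in m]

def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    cols = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols] for row in a]

W1 = [[1.0, 2.0], [0.0, -1.0], [3.0, 1.0]]  # 2 inputs -> 3 hidden units (linear)
W2 = [[1.0, 0.5, -2.0]]                     # 3 hidden units -> 1 output (linear)
x = [2.0, -1.0]

two_layer = matvec(W2, matvec(W1, x))       # pass through both layers
single_layer = matvec(matmul(W2, W1), x)    # one equivalent layer
```

Both computations yield the same output vector, so the hidden layer adds nothing. Inserting a non-linear function such as the logistic function between the two layers breaks this equivalence.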

The backpropagation algorithm for calculating a gradient has been rediscovered a number of times, and is a special case of a more general technique called automatic differentiation in the reverse accumulation mode.
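To make the link with reverse accumulation concrete, here is a toy sketch of reverse-mode automatic differentiation for scalars that supports only addition and multiplication. The Var class and the backward function are hypothetical names written for this illustration and do not belong to any particular library.

```python
class Var:
    """A scalar that remembers which values it was computed from."""

    def __init__(self, value, parents=()):
        self.value = value
        self.parents = parents  # pairs of (parent Var, local derivative)
        self.grad = 0.0

    def __add__(self, other):
        return Var(self.value + other.value, ((self, 1.0), (other, 1.0)))

    def __mul__(self, other):
        return Var(self.value * other.value,
                   ((self, other.value), (other, self.value)))

def backward(output):
    """Accumulate d(output)/d(node) into node.grad, from the output back to the inputs."""
    order, seen = [], set()

    def visit(node):
        # Depth-first walk so that every node is listed after its parents.
        if id(node) not in seen:
            seen.add(id(node))
            for parent, _ in node.parents:
                visit(parent)
            order.append(node)

    visit(output)
    output.grad = 1.0
    for node in reversed(order):
        for parent, local in node.parents:
            parent.grad += node.grad * local  # chain rule, applied backwards
```

For z = x * y + x with x = Var(3.0) and y = Var(4.0), calling backward(z) leaves x.grad equal to y + 1 and y.grad equal to x. Backpropagation applies the same backward sweep to the computation performed by a neural network.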

Attached to this article is a simple example written in C++ .NET that uses this technique. The core library is taken from an open source project: the FANN (Fast Artificial Neural Network) library.

A detailed description of the implementation and the results (both in Italian) can be found here and here.