Learning with Neural Methods Computational Intelligence - Learning with Neural Methods on Structured Data

Research of Marc Strickert

(Context) Neural Gas for Sequences

Image of Neural Gas Animation Neural Gas is an unsupervised learning algorithm which provides a density based prototype representation of given data.
The key idea is to calculate a proximity ranking of all prototypes to a given training pattern, and to adopt the prototype locations into the direction of this pattern by steps proportional to an amount of exponential rank: the closest prototype is adapted most down to the farthest which is changed least. This update is performed many times for all training patterns. Unlike the SOM above, there is no topology restriction, like a reduction to a two-dimensional grid.

A 2D toy example is demonstrated by the animation on the left, for which the 3841 letter pixels are approximated by 50 prototypes (some animation frames are skipped).

But what, if data live in non-Euclidean space, thus proximity has to be defined differently. Or what, if data have a discrete structure without intrinsic natural order, like nominal symbol sequences?

My approach is to instead of representing the data to learn the data context: convert symbols into a unary code, which can be interpreted as the stimulation of specialized neurons. Now classify this activity in the presence of previously trained internal activations of the neurons, which is a recursive evaluation scheme: for each external stimulus exists an however good or bad ranking of internal neural activity. This activity is then taken as context to be again classified by the neurons, leading to a new internal activation to be classified, and so on.

One benefit is a conversion of data structure into an activation space like [0;1]N to which the Euclidean norm or more sophisticated measures like the Kullback Leibler divergence can be applied. Another benefit relates to the kind of data representation: prototypes cannot take impossible intermediate interpolation states between the training data, because not the structure itself but its elements' activation contexts are learned.

Questions of interests are: What's a good proportion of external and internal activations? Which complexity of sequences can be represented? How deep has recursion to be carried out? And there are many more, like efficient coding of large nets and reduction of runtime complexity.


LNM - Home: Computer Science - University of Osnabrück  | 12 August 2002