Scientists understand how neurons, the smallest computational units of the brain, behave during tasks, but how brains learn to make efficient choices remains unknown, particularly when working memory is involved.
In his thesis, Jaldert Rombouts, a PhD student in the life sciences research group at Centrum Wiskunde & Informatica (CWI) in Amsterdam, investigates neural networks, mathematical models inspired by the brain. In particular, he studies how such networks can be trained through reinforcement learning. He developed a biologically plausible neural network model that can learn to remember past events in order to use them later. The results of this research are relevant for the development of self-learning systems.
To build his model, Rombouts combined insights from neuroscience with theoretical principles from machine learning, such as ‘Temporal Difference Learning’. Since the late nineties, this theory has attracted great interest because it connects mathematical principles with observations in the brain and sheds light on how animals may learn from reward and punishment.
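For readers unfamiliar with temporal difference learning, the sketch below shows the textbook TD(0) value update on a toy task. It is not Rombouts's model: the five-state chain, the function name `td0_chain`, and the parameters `alpha` (learning rate) and `gamma` (discount factor) are all assumptions chosen for illustration. The agent walks along the chain and receives a reward of 1 only at the end. Each state's value estimate is nudged toward the reward plus the discounted value of the next state.

```python
def td0_chain(n_states=5, episodes=500, alpha=0.1, gamma=0.9):
    """Textbook TD(0) on a hypothetical chain: always step right, reward 1 at the end."""
    # One extra slot for the terminal state, whose value stays 0.
    values = [0.0] * (n_states + 1)
    for _ in range(episodes):
        for state in range(n_states):
            next_state = state + 1
            # Reward is given only on the transition into the terminal state.
            reward = 1.0 if next_state == n_states else 0.0
            # Prediction error: what happened versus what was expected.
            td_error = reward + gamma * values[next_state] - values[state]
            # Move the estimate a small step in the direction of the error.
            values[state] += alpha * td_error
    return values[:n_states]


values = td0_chain()
print([round(v, 3) for v in values])
```

In this toy setting, states closer to the reward end up with higher values, so the reward signal propagates backward through the chain over repeated episodes.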
He shows that neural networks can learn complex behavior and remember which information is relevant and which is not, just by rewarding and punishing the model appropriately. “According to this principle, a telephone might be trained to change the volume for calls under certain circumstances”, he explains. “The telephone is ‘rewarded’ when the call is answered and ‘punished’ when it is not. In this way, behavior can be adjusted in the right direction.”
According to Rombouts, self-learning systems can be applied in many process and product innovations. “Programming is one of the most expensive components in product development, and self-learning systems may yield significant savings.”
Rombouts will defend his dissertation, “Biologically Plausible Reinforcement Learning”, on 4 September 2015 at VU University of Amsterdam. Promotors are Prof.dr. Pieter Roelfsema (Netherlands Institute for Neuroscience) and Dr. Sander Bohte (CWI).