20-27 February 2019
HSE Study Center “Voronovo”
Europe/Moscow timezone

The master equation for the reinforcement learning

26 Feb 2019, 20:36
HSE Study Center “Voronovo”


Voronovskoe, Moscow, Russian Federation
Talk [10+2 min] Young Scientist Forum


Edgar Vardanyan (Yerevan Physics Institute, Yerevan State University)


We study the dynamics of reinforcement learning. Because these dynamics are a stochastic process, the appropriate mathematical tool is the master equation. We introduce probability distributions for the actions and value functions, and obtain a master equation that describes the reinforcement learning process. From this master equation we derive a Hamilton-Jacobi equation (HJE). We verify a feature that distinguishes the model from master equations for chemical reactions with few molecules or for evolution models with finite populations: the variance of the distribution vanishes at the steady state, which supports the use of the moment closure approximation. Our method, based on recursive equations, gives accurate expressions for both the mean and the variance of the variables, whereas the HJE gives correct results only for the mean values. Using the recursive equations, we express the value function distribution through the solution of a system of ordinary differential equations.
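As a rough illustration of the vanishing steady-state variance, the following minimal sketch simulates an ensemble of independent learners. It is not the model from the talk: the single-state, two-action setting, the deterministic rewards, the softmax policy, the update rule Q[a] += alpha * (r[a] - Q[a]), and all names and parameter values (simulate_ensemble, alpha, beta, rewards) are assumptions chosen for illustration. In this toy the only randomness is the choice of action, so the spread of value estimates across learners grows during the transient and then shrinks toward zero as every estimate approaches its fixed reward.

```python
import numpy as np


def simulate_ensemble(n_learners=2000, n_steps=600, alpha=0.1, beta=2.0,
                      rewards=(1.0, 0.5), seed=0):
    """Toy ensemble of independent value learners (illustrative assumption,
    not the model of the talk). Returns the per-step mean and variance of
    the value estimates across the ensemble, each of shape (n_steps, n_actions).
    """
    rng = np.random.default_rng(seed)
    r = np.asarray(rewards, dtype=float)
    q = np.zeros((n_learners, r.size))      # value estimates, one row per learner
    idx = np.arange(n_learners)
    means = np.empty((n_steps, r.size))
    variances = np.empty((n_steps, r.size))

    for t in range(n_steps):
        # Softmax policy over the current value estimates.
        logits = beta * q
        logits -= logits.max(axis=1, keepdims=True)   # numerical stability
        p = np.exp(logits)
        p /= p.sum(axis=1, keepdims=True)

        # Sample one action per learner by inverting the cumulative distribution.
        u = rng.random(n_learners)
        a = (u[:, None] > np.cumsum(p, axis=1)).sum(axis=1)
        a = np.minimum(a, r.size - 1)                 # guard against rounding

        # Deterministic reward: move the chosen estimate toward r[a].
        q[idx, a] += alpha * (r[a] - q[idx, a])

        means[t] = q.mean(axis=0)
        variances[t] = q.var(axis=0)

    return means, variances


if __name__ == "__main__":
    means, variances = simulate_ensemble()
    print("final mean of value estimates:", means[-1])
    print("peak variance during transient:", variances.max(axis=0))
    print("final variance:", variances[-1])
```

With noisy rewards the same update would retain a nonzero steady-state variance, so the effect seen here depends on the assumptions of the toy and should not be read as a reproduction of the result stated in the abstract.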

Primary authors

Edgar Vardanyan (Yerevan Physics Institute, Yerevan State University)
Dr David Saakian (Yerevan Physics Institute)
Dr Ricard Sole (Universitat Pompeu Fabra)
