20–27 Feb 2019
HSE Study Center “Voronovo”
Europe/Moscow timezone

The master equation for the reinforcement learning

26 Feb 2019, 20:36
12m
HSE Study Center “Voronovo”

Voronovskoe, Moscow, Russian Federation
Talk [10+2 min] Young Scientist Forum

Speaker

Edgar Vardanyan (Yerevan Physics Institute, Yerevan State University)

Description

We study the dynamics of reinforcement learning. Since the dynamics is a stochastic process, the appropriate mathematical tool is the master equation. We introduce probability distributions for the actions and the value functions, and obtain a master equation that describes the reinforcement learning process. From this master equation we derive a Hamilton-Jacobi equation (HJE). We find a feature that distinguishes the model from master equations for chemical reactions with few molecules or for evolution models with finite populations: the variance of the distribution vanishes at the steady state, which supports the use of the moment-closure approximation. Our method (recursive equations) gives accurate expressions for both the mean and the variance of the variables, whereas the HJE gives correct results only for the mean values. Using the recursive equations, we express the value function distribution through the solution of a system of ordinary differential equations.
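The claim that the variance vanishes at the steady state can be illustrated with a toy simulation. The sketch below is not the model from the talk: it assumes a single-state, two-action problem with deterministic rewards, a softmax policy, and a constant learning rate, and all names and parameter values (`simulate`, `alpha`, `beta`, `rewards`) are hypothetical choices made for illustration. It tracks the mean and variance of one value estimate across an ensemble of independent learners.

```python
import math
import random
import statistics


def simulate(n_agents=200, n_steps=1000, alpha=0.2, beta=2.0,
             rewards=(1.0, 0.5), seed=0):
    """Ensemble of independent learners on a toy two-action problem.

    Assumptions (not taken from the abstract): deterministic rewards,
    softmax action selection, constant learning rate alpha.
    Returns a list of (mean, variance) of the value of action 0 per step.
    """
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(n_agents)]
    history = []
    for _ in range(n_steps):
        for agent_q in q:
            # Softmax (Boltzmann) probability of choosing action 0.
            p0 = 1.0 / (1.0 + math.exp(beta * (agent_q[1] - agent_q[0])))
            a = 0 if rng.random() < p0 else 1
            # Move the chosen action's value toward its reward.
            agent_q[a] += alpha * (rewards[a] - agent_q[a])
        values = [agent_q[0] for agent_q in q]
        history.append((statistics.fmean(values),
                        statistics.pvariance(values)))
    return history


history = simulate()
peak_variance = max(v for _, v in history)
final_mean, final_variance = history[-1]
print(f"peak variance:  {peak_variance:.3e}")
print(f"final mean:     {final_mean:.6f}")
print(f"final variance: {final_variance:.3e}")
```

In this toy setting the ensemble variance is nonzero during the transient, because learners choose different actions at random, and decays toward zero as every value estimate approaches its deterministic reward. With noisy rewards and a constant learning rate the variance would generally not vanish, so the sketch only illustrates the qualitative statement under the stated assumptions.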

Primary authors

Edgar Vardanyan (Yerevan Physics Institute, Yerevan State University); Dr David Saakian (Yerevan Physics Institute); Dr Ricard Sole (Universitat Pompeu Fabra)
