Sample efficiency in data-driven Model Predictive Control and Reinforcement Learning
Sebastian Peitz (a), Katharina Bieker (b), Jan Stenner (a), Vikas Chidananda (a), Oliver Wallscheid (c), Steven L. Brunton (d), Kunihiko Taira (e)
(a) Department of Computer Science, Paderborn University, Paderborn, Germany
(b) Department of Computer Science, LMU Munich, Munich, Germany
(c) Department of Electrical Engineering, Paderborn University, Paderborn, Germany
(d) University of Washington, Seattle, WA, USA
(e) UCLA, Los Angeles, CA, USA
As in almost every other branch of science, advances in data science and machine learning have improved the modeling, simulation and control of nonlinear dynamical systems, prominent examples being autonomous driving and the control of complex chemical processes. However, many of these approaches (1) lack strong performance guarantees and (2) tend to be very data hungry. In this presentation, we discuss different approaches to improving sample efficiency in data-driven feedback control. We address both model predictive control (MPC) and reinforcement learning (RL). In MPC, learning an accurate surrogate model is paramount for performance. Exploiting techniques from mixed-integer control, we show that one can leverage performance guarantees for autonomous systems, which have been studied much more extensively, to obtain related error bounds for control problems. In RL, we address both the use of surrogate models and the exploitation of system symmetries to improve sample efficiency. We demonstrate our findings using several example systems governed by partial differential equations.
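To illustrate how a system symmetry can stretch a limited data set in RL, the following minimal sketch assumes a one-dimensional system on a periodic grid whose dynamics and reward are invariant under spatial translation. Under that assumption, every recorded transition yields one additional valid transition per grid shift at no extra simulation cost. The toy dynamics `step`, the cost `reward`, and the helper `augment_by_shifts` are hypothetical names introduced here for illustration only; they are not taken from the presentation and do not represent the authors' implementation.

```python
import numpy as np


def step(state, action, nu=0.1):
    """Toy periodic diffusion step with distributed additive forcing (assumed model)."""
    laplacian = np.roll(state, 1) - 2.0 * state + np.roll(state, -1)
    return state + nu * laplacian + action


def reward(state, action):
    """Negative quadratic cost; invariant under spatial shifts (assumed reward)."""
    return -float(np.sum(state**2) + 0.1 * np.sum(action**2))


def augment_by_shifts(transition, shifts):
    """Return shifted copies of one (state, action, reward, next_state) tuple.

    Valid only if dynamics and reward are translation invariant on a
    periodic domain, and state and action live on the same grid.
    """
    s, a, r, s_next = transition
    return [(np.roll(s, k), np.roll(a, k), r, np.roll(s_next, k)) for k in shifts]


rng = np.random.default_rng(0)
n = 16  # number of grid points (arbitrary choice)
s = rng.standard_normal(n)
a = rng.standard_normal(n)

# One simulated transition becomes n training samples.
transition = (s, a, reward(s, a), step(s, a))
augmented = augment_by_shifts(transition, shifts=range(n))
```

Each tuple in `augmented` could be added to a replay buffer alongside the original sample. The same idea carries over to other symmetries, such as reflections, provided the reward respects them as well.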