On Reinforcement Learning, Effect Handlers, and the State Monad
We study algebraic effects and handlers as a way to support decision-making abstractions in functional programs, whereby a user can ask a learning algorithm to resolve choices without implementing the underlying selection mechanism, and can give feedback by way of rewards. Unlike some recently proposed approaches to the problem based on the selection monad, we express the underlying intelligence as a reinforcement learning algorithm implemented as a set of handlers for some of these algebraic operations, including those for choices and rewards. We show how algebraic operations and handlers — as available in the programming language EFF — can be used in practice to cleanly separate the learning algorithm from its environment, thus allowing for a good level of modularity. We then show how the host language can be taken to be a 𝜆-calculus with handlers, thereby highlighting the essential linguistic features. We conclude by hinting at how type and effect systems could ensure safety properties, while pointing out some directions for further work.
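The abstract above does not include code, but the architecture it describes — choice and reward as algebraic operations performed by the client program, with the learner implemented as a handler for them — can be sketched concretely. Below is a minimal sketch using OCaml 5's Effect module in place of EFF itself; the operation names Choose and Reward, the epsilon-greedy bandit learner, and all identifiers are illustrative assumptions, not the authors' implementation.

```ocaml
(* Illustrative sketch only: Choose, Reward, and the epsilon-greedy
   bandit learner below are our assumptions, not the authors' EFF
   code. OCaml 5's Effect module stands in for EFF's handlers. *)

open Effect
open Effect.Deep

(* The two algebraic operations the client program may perform:
   ask the learner to pick one of n actions, or report a reward. *)
type _ Effect.t += Choose : int -> int Effect.t
type _ Effect.t += Reward : float -> unit Effect.t

(* Learner state: per-action value estimates and visit counts. *)
type learner = {
  mutable last : int;      (* last action chosen *)
  values : float array;    (* running value estimate per action *)
  counts : int array;      (* times each action was rewarded *)
}

let epsilon = 0.1

(* The handler is the "intelligence": it interprets Choose with an
   epsilon-greedy rule and Reward with an incremental mean update,
   keeping the learning algorithm separate from the client program. *)
let with_learner (st : learner) (body : unit -> 'a) : 'a =
  match_with body ()
    { retc = (fun v -> v);
      exnc = raise;
      effc = (fun (type c) (eff : c Effect.t) ->
        match eff with
        | Choose n ->
            Some (fun (k : (c, _) continuation) ->
              let a =
                if Random.float 1.0 < epsilon then Random.int n
                else begin                       (* greedy action *)
                  let best = ref 0 in
                  for i = 1 to n - 1 do
                    if st.values.(i) > st.values.(!best) then best := i
                  done;
                  !best
                end
              in
              st.last <- a;
              continue k a)
        | Reward r ->
            Some (fun (k : (c, _) continuation) ->
              let a = st.last in
              st.counts.(a) <- st.counts.(a) + 1;
              st.values.(a) <-
                st.values.(a)
                +. (r -. st.values.(a)) /. float_of_int st.counts.(a);
              continue k ())
        | _ -> None) }

(* The environment: client code performs choices and rewards without
   knowing how they are resolved. *)
let episode () =
  for _ = 1 to 1_000 do
    let a = perform (Choose 2) in
    perform (Reward (if a = 0 then 1.0 else 0.0))  (* action 0 pays off *)
  done

let () =
  Random.self_init ();
  let st = { last = 0; values = [| 0.; 0. |]; counts = [| 0; 0 |] } in
  with_learner st episode;
  Printf.printf "estimates: %.2f %.2f\n" st.values.(0) st.values.(1)
```

Note how episode mentions no learning machinery at all: it only performs Choose and Reward. Swapping in a different handler changes the learner without touching the client, which is the kind of modularity the abstract claims.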
Sun 11 Sep (displayed time zone: Belgrade/Bratislava/Budapest/Ljubljana/Prague)
11:00 - 12:30 (HOPE)

11:00 (30 min) Talk: Relative Monads in CBPV for Stack-based Effects
               Max S. New (University of Michigan)

11:30 (30 min) Talk: Temporal refinements for Call-By-Push-Value with fixpoint
               Guilhem Jaber (University of Nantes), Kenji Maillard (Inria Nantes & University of Chile), Colin Riba (LIP - ENS de Lyon)

12:00 (30 min) Talk: On Reinforcement Learning, Effect Handlers, and the State Monad
               Ugo Dal Lago (University of Bologna; Inria), Alexis Ghyselen (University of Bologna), Francesco Gavazzo (University of Bologna & INRIA Sophia Antipolis)