C-Learning: No reward function needed for Goal-Conditioned RL

Introduction

Typically, reinforcement learning is viewed as transforming a reward function into useful behavior. However, designing a reward function can be quite tedious. Can we learn useful behaviors without a reward function?

Fig. from the original authors’ presentation.

Problem Set-Up

Framing GCRL as Density Estimation

C-learning frames goal-conditioned RL as a density estimation problem built around three quantities:

- Future state density p(s_t+ | s_t, a_t): the distribution over states the policy will visit in the (discounted) future after taking action a_t in state s_t.
- Marginal state density p(s_t+): the distribution over states the policy visits overall, independent of the current state and action.
- Learned classifier C(s_t, a_t, s_t+): trained to predict whether a candidate state s_t+ was drawn from the future state density (label 1) or the marginal state density (label 0). With balanced sampling, the optimal classifier satisfies C / (1 − C) = p(s_t+ | s_t, a_t) / p(s_t+), so the density ratio needed for goal-reaching can be read directly off the classifier, without ever specifying a reward function.
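To make the classification step concrete, here is a minimal sketch (not the authors' code) of the Monte Carlo variant of this idea: a classifier takes (state, action, goal) and is trained with binary cross-entropy to separate goals sampled from the future-state distribution from goals sampled from the marginal. The names `Classifier` and `classifier_loss` are illustrative placeholders, and the network sizes are arbitrary assumptions.

```python
import torch
import torch.nn as nn

class Classifier(nn.Module):
    """C(s, a, g): logit that g was drawn from the future-state
    distribution of (s, a) rather than the marginal state distribution."""
    def __init__(self, state_dim, action_dim):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(2 * state_dim + action_dim, 256), nn.ReLU(),
            nn.Linear(256, 256), nn.ReLU(),
            nn.Linear(256, 1),
        )

    def forward(self, state, action, goal):
        return self.net(torch.cat([state, action, goal], dim=-1)).squeeze(-1)

def classifier_loss(classifier, state, action, future_state, marginal_state):
    """Binary cross-entropy: label 1 for goals sampled from the future-state
    density p(s_t+ | s_t, a_t), label 0 for goals from the marginal p(s_t+)."""
    bce = nn.BCEWithLogitsLoss()
    pos_logits = classifier(state, action, future_state)    # label 1
    neg_logits = classifier(state, action, marginal_state)  # label 0
    return (bce(pos_logits, torch.ones_like(pos_logits))
            + bce(neg_logits, torch.zeros_like(neg_logits)))
```

At the optimum, sigmoid(logit) / (1 − sigmoid(logit)) approximates the density ratio above, which is what the goal-conditioned policy maximizes; the full method in the paper replaces the Monte Carlo targets with recursive (bootstrapped) classification.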

Experiment and Results

Conclusion

References and Citations

@inproceedings{eysenbach2021clearning,
  title={C-Learning: Learning to Achieve Goals via Recursive Classification},
  author={Benjamin Eysenbach and Ruslan Salakhutdinov and Sergey Levine},
  booktitle={International Conference on Learning Representations},
  year={2021},
  url={https://openreview.net/forum?id=tc5qisoB-C}
}
