C-Learning: No reward function needed for Goal-Conditioned RL


Reinforcement learning is typically viewed as transforming a reward function into useful behavior. However, designing a reward function can be quite tedious. Can we learn useful behaviors without one?

Fig. from Original Author’s Presentation

Problem Set-Up

Framing GCRL as Density Estimation

future state density
marginal state density
learned classifier
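The three quantities named above fit together through a standard fact about binary classifiers. This is a minimal sketch, not the author's implementation. It assumes the classifier is trained on equal numbers of samples from the future state density (label 1) and from the marginal state density (label 0). Under that assumption, the Bayes-optimal output is C = p_future / (p_future + p_marginal), so the density ratio can be read off as C / (1 - C). The 1D Gaussians and all function names below are hypothetical stand-ins chosen for illustration.

```python
import numpy as np


def gaussian_pdf(x, mean, std):
    """Density of a 1D Gaussian, used here as a toy stand-in for a state density."""
    return np.exp(-0.5 * ((x - mean) / std) ** 2) / (std * np.sqrt(2.0 * np.pi))


def bayes_optimal_classifier(p_future, p_marginal):
    """Optimal P(label = 1 | state) when both classes are sampled equally often."""
    return p_future / (p_future + p_marginal)


def ratio_from_classifier(c):
    """Recover the density ratio p_future / p_marginal from classifier outputs."""
    return c / (1.0 - c)


# Hypothetical toy densities over a handful of 1D "states".
states = np.linspace(-2.0, 2.0, 9)
p_future = gaussian_pdf(states, mean=1.0, std=0.5)    # assumed future state density
p_marginal = gaussian_pdf(states, mean=0.0, std=1.5)  # assumed marginal state density

c = bayes_optimal_classifier(p_future, p_marginal)
recovered_ratio = ratio_from_classifier(c)

# Multiplying the ratio by the marginal density gives back the future state density.
recovered_future = recovered_ratio * p_marginal
```

In practice the classifier would be a learned function of the state, action, and candidate future state rather than a closed-form expression. The sketch only shows why a well-trained classifier carries the same information as the density ratio.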

Experiment and Results


References and Citations

Benjamin Eysenbach, Ruslan Salakhutdinov, and Sergey Levine. "C-Learning: Learning to Achieve Goals via Recursive Classification." International Conference on Learning Representations.


