We propose an actor-critic architecture for 3D molecular design that exploits the symmetries of the design process using spherical harmonics.
We present a reinforcement learning formulation that enables molecular design directly in Cartesian coordinates.
We propose a novel Bayesian batch active learning approach motivated by approximating the complete-data posterior of the model parameters.
We factor contexts for contextual policy search into environment and target components, such that experience can be directly generalized over target contexts.
We incorporate bi-perspective reward learning from human preferences into a general hierarchical reinforcement learning framework for robotic grasping.