My research interests lie primarily in reinforcement learning (RL) and in integrated learning and planning. More specifically, I am interested in formal assurances for RL algorithms. This interest spans a number of questions, in particular:

1) Which RL algorithms are guaranteed to converge to an optimal policy or the optimal value function under reasonable conditions?

2) How can we approach exploration in a way that is provably efficient?

3) Can we use planners with well-known properties to create RL agents with these same properties?