Orals

"How hard is my MDP?" The distribution-norm to the rescue

In Reinforcement Learning (RL), state-of-the-art algorithms require a large number of samples per state-action pair to estimate the transition kernel $p$. In many problems, a good approximation o