Stochastic Reinforcement Learning