Reinforcement learning

=References=

MDP

Markov Decision Processes (French: Processus de Décision de Markov)

Q-learning
Value iteration
Policy iteration