What is a Multi-armed Bandit Experiment?
Problem Intuition
Imagine you’ve decided to hit up the casino and try your luck at the slots. You don’t know which machines pay out best, but you have reason to believe that some are better than others. You have only a limited amount of money to play with, so you want to play each machine enough to learn which one is best (explore) and then, as you learn, play the best machines most often to maximize your return (exploit). Deciding how to balance these two goals each time you choose a machine is known as the exploration/exploitation dilemma.
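To make the trade-off concrete, here is a minimal simulation sketch. It uses epsilon-greedy, which is only one of several common bandit strategies and is not necessarily the one a real experiment would use. The function name `epsilon_greedy`, the payout probabilities, and the value of `epsilon` are all hypothetical, chosen purely for illustration.

```python
import random


def epsilon_greedy(payout_probs, pulls=10_000, epsilon=0.1, seed=0):
    """Simulate playing slot machines with an epsilon-greedy strategy.

    payout_probs: hypothetical win probability of each machine (unknown to the player).
    epsilon: fraction of pulls spent exploring a random machine.
    """
    rng = random.Random(seed)
    n = len(payout_probs)
    counts = [0] * n      # how many times each machine was played
    values = [0.0] * n    # running average reward observed per machine
    total_reward = 0

    for _ in range(pulls):
        if rng.random() < epsilon:
            # Explore: try a machine chosen uniformly at random.
            arm = rng.randrange(n)
        else:
            # Exploit: play the machine with the best average so far.
            arm = max(range(n), key=lambda i: values[i])

        # The machine pays 1 with its (hidden) probability, else 0.
        reward = 1 if rng.random() < payout_probs[arm] else 0

        # Update the running average for the machine just played.
        counts[arm] += 1
        values[arm] += (reward - values[arm]) / counts[arm]
        total_reward += reward

    return counts, values, total_reward


if __name__ == "__main__":
    # Assumed payout rates for three machines; the third is the best.
    counts, values, total_reward = epsilon_greedy([0.2, 0.5, 0.7])
    print("plays per machine:", counts)
    print("estimated payout rates:", [round(v, 2) for v in values])
    print("total reward:", total_reward)
```

With a small `epsilon`, most pulls go to whichever machine currently looks best, while the occasional random pull keeps refining the estimates for the others. A larger `epsilon` learns faster but spends more of the budget on machines that are likely worse.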