- Contextual bandits, a dynamic approach to treatment personalization
- How contextual bandits differ from multi-armed bandits (MABs), A/B testing, multiple parallel MABs, multi-step reinforcement learning, and uplift modeling
- Exploration–exploitation strategies, including ε-greedy, upper confidence bound (UCB), and Thompson sampling
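To make the exploration–exploitation idea concrete, here is a minimal sketch of ε-greedy action selection on a two-arm bandit. The arm reward probabilities, function names, and parameter values are illustrative assumptions, not taken from the text:

```python
import random

def epsilon_greedy(values, epsilon=0.1):
    """With probability epsilon explore a random arm; otherwise exploit the best estimate."""
    if random.random() < epsilon:
        return random.randrange(len(values))               # explore
    return max(range(len(values)), key=lambda a: values[a])  # exploit

def update(values, counts, arm, reward):
    """Incremental mean update of the chosen arm's estimated value."""
    counts[arm] += 1
    values[arm] += (reward - values[arm]) / counts[arm]

# Toy simulation: arm 1 has the higher (hypothetical) success probability.
random.seed(0)
true_p = [0.3, 0.7]
values, counts = [0.0, 0.0], [0, 0]
for _ in range(2000):
    arm = epsilon_greedy(values)
    reward = 1.0 if random.random() < true_p[arm] else 0.0
    update(values, counts, arm, reward)
```

After enough rounds, the agent pulls the better arm most of the time while still sampling the other occasionally; UCB and Thompson sampling replace the fixed ε with uncertainty-aware exploration.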