Leveraging Similarities in Multi-Armed Bandits
This study investigates online learning with similarity-structured action sets encoded by rooted trees, demonstrating that standard one-point feedback cannot exploit these similarities. The authors propose unified algorithms for richer feedback models that replace the number of actions with a similarity-aware effective count to improve regret bounds.