Actor-Critic vs. Value-Based: Empirical Trade-offs

August 17, 2025 · Foad Hassanlou, Mohammad Hossien Jamshidi Goharrizi
Price-based control for constrained contextual bandits.

Learning Safely on a Shoestring: Small-Budget Contextual Bandits with Knapsacks

June 24, 2025 · Arian Aghamohseni, Kia Joolai

Three Dogmas of Reinforcement Learning

August 6, 2024 · Arash Alikhani

ExpGen: Explore to Generalize in Zero-Shot RL

July 22, 2024 · Arash Alikhani