Learning Agents

[Paper Review] Reinforcement Learning with Verifiable Rewards Incentivizes Correct Reasoning in Base LLMs

Author: 김민경 | 2026, Feb 05

[Paper Review] Offline Reinforcement Learning with Implicit Q-Learning

Author: 김동민 | 2026, Jan 08

[Paper Review] Horizon Reduction Makes RL Scalable

Author: 이동진 | 2025, Dec 11

[Paper Review] SimpleVLA-RL: Scaling VLA Training via Reinforcement Learning

Author: 김재훈 | 2025, Nov 27

[Paper Review] Dual Goal Representations

Author: 민예린 | 2025, Nov 20

[Paper Review] Direct Preference Optimization: Your Language Model is Secretly a Reward Model

Author: 김민경 | 2025, Nov 13

[Paper Review] Temporal Difference Flows

Author: 홍준형 | 2025, Oct 30

[Paper Review] Dual RL: Unification and New Methods for Reinforcement and Imitation Learning

Author: 김동민 | 2025, Oct 23