[논문 리뷰] Reinforcement Learning with Verifiable Rewards Incentivizes Correct Reasoning in Base LLMs

[논문 리뷰] Reinforcement Learning with Verifiable Rewards Incentivizes Correct Reasoning in Base LLMs

작성자: 김민경

2026, Feb 05    

논문 정보

제목: Reinforcement Learning with Verifiable Rewards Implicitly Incentivizes Correct Reasoning in Base LLMs

저자: Xumeng Wen, Zihan Liu, Shun Zheng, Shengyu Ye, Zhirong Wu, Yang Wang, Zhijian Xu, Xiao Liang, Junjie Li, Ziming Miao, Jiang Bian, Mao Yang.

학회: ICLR 2026

링크: https://arxiv.org/abs/2506.14245


발표 자료