[Paper Review] Reinforcement Learning with Verifiable Rewards Implicitly Incentivizes Correct Reasoning in Base LLMs
Written by: 김민경
2026, Feb 05
Paper Information
Title: Reinforcement Learning with Verifiable Rewards Implicitly Incentivizes Correct Reasoning in Base LLMs
Authors: Xumeng Wen, Zihan Liu, Shun Zheng, Shengyu Ye, Zhirong Wu, Yang Wang, Zhijian Xu, Xiao Liang, Junjie Li, Ziming Miao, Jiang Bian, Mao Yang
Venue: ICLR 2026