Do Not Let Low-Probability Tokens Over-Dominate in RL for LLMs
Zhihe Yang, Xufang Luo, Zilong Wang, Dongqi Han, Zhiyuan He, Dongsheng Li, Yunjian Xu
2026 International Conference on Learning Representations | April 2026
Nan Chen, Luna K. Qiu, Arran Zeyu Wang, Zilong Wang, Yuqing Yang
ArXiv | June 2025, Vol abs/2506.13270
Zilong Wang, Nan Chen, Luna K. Qiu, Ling Yue, Geli Guo, Yang Ou, Shiqi Jiang, Yuqing Yang, Lili Qiu
October 2024
Siyun Zhao, Yuqing Yang, Zilong Wang, Zhiyuan He, Luna K. Qiu, Lili Qiu
September 2024