Seed-Thinking-v1.5: Advancing Superb Reasoning Models with Reinforcement Learning

摘要

We introduce Seed-Thinking-v1.5, capable of reasoning through thinking before responding, resulting in improved performance on a wide range of benchmarks. Seed-Thinking-v1.5 achieves 86.7 on AIME 2024, 55.0 on Codeforces and 77.3 on GPQA, demonstrating excellent reasoning abilities in STEM and coding. Beyond reasoning tasks, the method demonstrates notable generalization across diverse domains. For instance, it surpasses DeepSeek R1 by 8% in win rate on non-reasoning tasks, indicating its broader applicability. Compared to other state-of-the-art reasoning models, Seed-Thinking-v1.5 is a Mixture-of-Experts (MoE) model with a relatively small size, featuring 20B activated and 200B total parameters. As part of our effort to assess generalized reasoning, we develop two internal benchmarks, BeyondAIME and Codeforces, both of which will be publicly released to support future research.

作者

Jiaze Chen, TianTian Fan, Xin Liu, Lingjun Liu, Zhiqi Lin, Mingxuan Wang, Chengyi Wang, Xiangpeng Wei, Wenyuan Xu,Yufeng Yuan, Yu Yue, Lin Yan, Qiying Yu, Xiaochen Zuo, Chi Zhang

期刊/会议

https://github.com/ByteDance-Seed/Seed-Thinking-v1.5

模型成果

Seed2.0 Seedance 2.0 Seedream 5.0 Lite Seeduplex Seed GR-RL

研究团队

LLM Infrastructures Vision Speech Multimodal Interaction & World Model AI for Science Robotics Responsible AI

了解更多

博客 Seed Edge 校园招聘