LLM

The Seed Large Language Model (LLM) team is dedicated to aggressively advancing the next generation of LLMs, tackling fundamental challenges in LLM development head-on. Our areas of focus include model pre-training, post-training, inference, memory capabilities, learning, interpretability, and other related directions. We dive deep into the latest technologies and create comprehensive solutions from concept to completion. In our effort to bring LLMs into real-life scenarios, we persistently seek ways to enhance applications through technological innovation.

Research topics

Horizon

Limits of the Long CoT Model

Explore the limits of long reasoning models, pushing them further through both inference-time scaling and model scaling, with the objective of solving complex problems that humans cannot yet solve.

O-model Architecture

Scaling laws along the inference dimension are a key to achieving ultimate intelligence. We aim to develop an intelligent system capable of lifelong learning, in which models reason with linear complexity.

Memory

Establish a streaming memory mechanism that can manage context of unlimited length and truly achieve online learning, such as learning to code by reading algorithms or learning a new language by reading a grammar book.

Agent's Reasoning and Planning Capabilities

Solve the core, fundamental problems in the field of agents and build a superintelligent system that accelerates leapfrog progress in natural sciences, economic production, and daily life, comprehensively improving societal efficiency and the quality of human life.

Pretrain

Next-Generation Pretraining Paradigm

Explore large-scale synthetic data to overcome the growth constraints of real-world data. Research autonomous data iteration by AI to ensure a seamless transition between pre-training and post-training.

Data Compression for Language Models

Compressing human civilization and world knowledge into a model relies on continuous data mining and repeated cycles of hypothesis and experiment.

Push the Limit of Model Performance and Efficiency

Push the boundaries of model intelligence with visionary ambition, while exploring the lower limit of parameter scale in a pragmatic way. The model's performance and efficiency must both be excellent.

Research on Training Dynamics and Mechanisms

Let scaling laws extend to every aspect of model optimization and enable explainability mechanisms to inspect neural network loads. Reveal the principles of model training from a physics perspective.

Enhanced Long Context Capabilities

Unlock the model's capacity to manage long contexts and increase token length limits for better content understanding and generation. The model should be able to grasp and generate extensive content within a single instance.

Posttrain

Large Scale Reinforcement Learning

Address the challenges of scaling RL to large scale, enhance model capabilities, and align models with human preferences.

Reward Model System

Integrate model, verifier, tool, and agent to provide accurate and generalized signals for data selection, synthesis, and RL training.

Superb Reasoning and Generalization

Further push the boundaries of reasoning, achieving human expert level across more domains.

Long Horizon Task / Agent

Address long-range, multi-turn modeling in long-horizon tasks and agents, enabling models to truly solve complex problems in the human world.

Next Generation of RM/RL Algorithms

Explore new RM/RL algorithms that can overcome the current limitations.

Data Quality Optimization

Continuously optimize post-training data to further raise the model's capability ceiling.

Code

Code Pre-training

Enhance the foundational coding abilities of the Doubao model through methods such as raw data filtering and data synthesis based on commit/issue/PR data.

Data Synthesis Based on Execution Feedback

A distinguishing characteristic of code data is that it can be "run", so computational power can be turned into supervision signals that go beyond internet data. Scaling up these methods enhances the coding and logic capabilities of next-generation large models.
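As a rough illustration of execution feedback, the sketch below runs candidate solutions against a small test script and keeps only the ones that pass. It is a minimal sketch under stated assumptions, not the team's pipeline: the names `passes_tests` and `filter_by_execution` and the toy `add` task are hypothetical, and a real system would run untrusted code inside a proper sandbox rather than a plain subprocess.

```python
import subprocess
import sys


def passes_tests(candidate_src: str, test_src: str, timeout_s: float = 5.0) -> bool:
    """Run candidate code followed by its tests in a fresh interpreter.

    Returns True only if the process exits with code 0, i.e. no assertion
    failed and nothing crashed. NOTE: this is not a sandbox (assumption:
    the candidates are trusted toy examples).
    """
    program = candidate_src + "\n" + test_src
    try:
        result = subprocess.run(
            [sys.executable, "-c", program],
            capture_output=True,
            timeout=timeout_s,
        )
    except subprocess.TimeoutExpired:
        # Treat non-terminating candidates as failures.
        return False
    return result.returncode == 0


def filter_by_execution(candidates, test_src):
    """Keep only the candidates whose execution passes the tests."""
    return [src for src in candidates if passes_tests(src, test_src)]


# Hypothetical toy task: two candidate implementations of `add`.
candidates = [
    "def add(a, b):\n    return a + b",  # correct
    "def add(a, b):\n    return a - b",  # buggy
]
tests = "assert add(2, 3) == 5\nassert add(-1, 1) == 0"

kept = filter_by_execution(candidates, tests)
```

The pass/fail result acts as a supervision signal obtained purely from computation: candidates that survive could be kept as synthetic training data, and the same boolean could serve as a reward in RL.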

Automatic Construction of Code Agent Data

Automate the creation of correct and diverse coding competition/engineering problems and automate engineering environment configuration to provide data support for large-scale reinforcement learning of code agents.

Learning to Learn

Research aimed at model self-evolution, enabling the model to learn to acquire and process training data to improve itself.

Model

Model Reliability

Research how to keep training stable and efficient as models scale up. Analyze and solve the stability and efficiency issues that arise in parameter optimization during scaling, so that training remains stable and continues to follow the scaling law.

Long Context

Research long context and combine it with deep research and reasoning to optimize the performance and efficiency of training and inference.

Model Structure

Study the structure of foundation models, such as MoE, residual connections, normalization, tokenization, and other algorithmic aspects to achieve higher efficiency in LLMs. Investigate how model structures impact the performance ceiling of large models.

Efficiency

Cover multiple aspects of efficiency, including computational efficiency (completing training and inference within limited computational resources and time), storage efficiency (occupying less GPU memory), and data utilization efficiency (learning more knowledge from limited data). Techniques include quantization, pruning, and engineering optimization of MFU (model FLOPs utilization).
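To make the MFU metric concrete, the sketch below estimates it for a training run. It assumes the common approximation of about 6 FLOPs per parameter per token for a dense transformer (forward plus backward, ignoring attention-specific terms); the function name `training_mfu` and all numbers are hypothetical and chosen only for illustration.

```python
def training_mfu(n_params: float, tokens_per_second: float,
                 peak_flops_per_second: float) -> float:
    """Estimate model FLOPs utilization for a training run.

    Assumption: a dense transformer costs roughly 6 * n_params FLOPs per
    trained token. MFU is the ratio of that achieved throughput to the
    hardware's theoretical peak.
    """
    achieved_flops_per_second = 6.0 * n_params * tokens_per_second
    return achieved_flops_per_second / peak_flops_per_second


# Hypothetical setup: a 7B-parameter model processing 1M tokens/s
# on 256 accelerators, each with an assumed peak of 312 TFLOP/s.
n_devices = 256
peak_per_device = 312e12
mfu = training_mfu(
    n_params=7e9,
    tokens_per_second=1e6,
    peak_flops_per_second=n_devices * peak_per_device,
)
print(f"Estimated MFU: {mfu:.1%}")
```

An estimate like this gives engineering optimizations a single number to target: at fixed hardware, any change that raises tokens per second raises MFU proportionally.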