Infrastructures
The Seed Infrastructures team oversees distributed training, reinforcement learning frameworks, high-performance inference, and heterogeneous-hardware compilation for AI foundation models.

Research topics

Ultra-large-scale training clusters
Study methods to improve the stability and model FLOPs utilization (MFU) of large-scale training clusters, including cross-cluster, low-precision, fault-tolerant, and elastic training techniques (a rough MFU estimate is sketched below).
Large-scale
Stability
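
As a rough illustration of the headline metric here, the sketch below computes the common MFU estimate from the "6N FLOPs per token" rule of thumb for dense transformers; the function and the example numbers are our own assumptions, not figures from any Seed cluster.

    def model_flops_utilization(n_params, tokens_per_sec, n_gpus, peak_flops_per_gpu):
        """Rough MFU estimate: achieved training FLOPs over hardware peak FLOPs."""
        # Dense-transformer rule of thumb: ~6 FLOPs per parameter per token
        # (forward + backward); MoE and attention corrections are omitted.
        achieved_flops = 6.0 * n_params * tokens_per_sec
        peak_flops = n_gpus * peak_flops_per_gpu
        return achieved_flops / peak_flops

    # Hypothetical example: a 70B-parameter dense model training at 4M tokens/s
    # on 8192 accelerators with ~989 TFLOPs of BF16 peak each.
    print(model_flops_utilization(70e9, 4.0e6, 8192, 989e12))  # ~0.21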

Reinforcement learning systems
Research on end-to-end reinforcement learning systems for large models, designing next-generation RL systems for dynamic loads, complex agent-environment interactions, heterogeneous resources, and multimodal scenarios (a toy rollout/learner skeleton follows below).
Reinforcement learning
Agent
Optimization
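
To make the moving parts concrete, here is a toy, runnable skeleton of the rollout/learner decoupling such systems manage, with threads standing in for distributed actors and random sleeps standing in for variable-length agent-environment interactions; every name here is illustrative, not a Seed API.

    import queue
    import random
    import threading
    import time

    rollout_queue = queue.Queue(maxsize=64)  # absorbs bursty, variable-length rollouts

    def actor_loop(actor_id):
        # Stand-in for agent/environment interaction: rollouts take variable time,
        # which is exactly the dynamic load an RL system must schedule around.
        for step in range(5):
            time.sleep(random.uniform(0.01, 0.05))
            rollout_queue.put((actor_id, step, [random.random() for _ in range(4)]))

    def learner_loop(n_updates):
        for _ in range(n_updates):
            batch = [rollout_queue.get() for _ in range(4)]  # blocks until rollouts arrive
            # Stand-in for an RL update (e.g. a PPO-style step) on the batch.
            print("update on rollouts from actors:", sorted({a for a, _, _ in batch}))

    actors = [threading.Thread(target=actor_loop, args=(i,)) for i in range(4)]
    for t in actors:
        t.start()
    learner_loop(n_updates=5)
    for t in actors:
        t.join()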

Inference parallelization solutions
Research on overcoming compute and memory-access bottlenecks during inference, including multi-node inference and parallel inference strategies on heterogeneous hardware (a tensor-parallelism sketch follows below).
Inference
Parallel
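
As one concrete instance of a parallel inference strategy, the NumPy sketch below simulates tensor parallelism for a single linear layer, with the final concatenation standing in for an all-gather across devices; it is our illustration of the general technique, not a description of Seed's inference stack.

    import numpy as np

    def tensor_parallel_matmul(x, w, n_devices):
        """Column-parallel linear layer: shard w's output dimension across devices."""
        shards = np.split(w, n_devices, axis=1)     # each device holds one weight shard
        partials = [x @ shard for shard in shards]  # local matmuls, no communication
        return np.concatenate(partials, axis=-1)    # stands in for an all-gather

    rng = np.random.default_rng(0)
    x = rng.standard_normal((2, 512))     # (batch, hidden)
    w = rng.standard_normal((512, 2048))  # (hidden, ffn)
    assert np.allclose(tensor_parallel_matmul(x, w, n_devices=4), x @ w)

Splitting the output dimension keeps each device's matmul independent, so the communication cost concentrates in the gather; that is where multi-node inference strategies differ most.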

Next-generation model and hardware co-optimization
Research on advanced model architectures and training and inference paradigms by co-designing next-generation hardware systems together with emerging generative and understanding model architectures.
Systems-algorithm co-design
Model architecture

Compiler optimization for heterogeneous hardware
Research on high-performance operator compilation and joint optimization of computation and communication for emerging hardware architectures (a compute-communication overlap sketch follows below).
Heterogeneous systems
Compiler
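
One classic target of such joint optimization is hiding communication behind computation. The toy sketch below pipelines a simulated collective against local matmuls in plain Python; the scheduling pattern is the point, and the "all-reduce" is a stand-in, not a real compiler pass or collective.

    import numpy as np
    from concurrent.futures import ThreadPoolExecutor

    def all_reduce_sim(chunk):
        # Stand-in for a collective; in a real system this is network time.
        return chunk * 2.0

    def compute(chunk):
        return chunk @ chunk.T

    rng = np.random.default_rng(0)
    chunks = [rng.standard_normal((256, 256)) for _ in range(8)]

    # Software pipeline: launch communication for chunk i+1 while computing on
    # chunk i, so network time hides behind the matmuls instead of serializing.
    results = []
    with ThreadPoolExecutor(max_workers=1) as comm:
        in_flight = comm.submit(all_reduce_sim, chunks[0])
        for i in range(len(chunks)):
            ready = in_flight.result()
            if i + 1 < len(chunks):
                in_flight = comm.submit(all_reduce_sim, chunks[i + 1])  # overlap
            results.append(compute(ready))
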
Selected Papers

May 13, 2025
Seed1.5-VL Technical Report
We present Seed1.5-VL, a vision-language foundation model designed to advance general-purpose multimodal understanding and reasoning. Seed1.5-VL is composed of a 532M-parameter vision encoder and a Mixture-of-Experts (MoE) LLM with 20B active parameters. Despite its relatively compact architecture, it delivers strong performance across a wide spectrum of public VLM benchmarks and internal evaluation suites, achieving state-of-the-art performance on 38 out of 60 public benchmarks. Moreover, in agent-centric tasks such as GUI control and gameplay, Seed1.5-VL outperforms leading multimodal systems, including OpenAI CUA and Claude 3.7. Beyond visual and video understanding, it also demonstrates strong reasoning abilities, making it particularly effective for multimodal reasoning challenges such as visual puzzles. We believe these capabilities will empower broader applications across diverse tasks. In this report, we provide a comprehensive review of our experience in building Seed1.5-VL across model design, data construction, and training at various stages, hoping that it can inspire further research. Seed1.5-VL is now accessible at this https URL (Volcano Engine Model ID: doubao-1-5-thinking-vision-pro-250428)
Seed Multimodal Team
LLM

Apr 24, 2025
Let the Code LLM Edit Itself When You Edit the Code
In this work, we investigate a typical scenario in code generation where a developer edits existing code in real time and requests a code assistant, e.g., a large language model, to re-predict the next token or next line on the fly. Naively, the LLM needs to re-encode the entire KV cache to provide an accurate prediction. However, this process is computationally expensive, especially when the sequence length is long. Simply encoding the edited subsequence and integrating it into the original KV cache runs into a temporal confusion problem, leading to significantly worse performance. We address this efficiency-accuracy trade-off by introducing Positional Integrity Encoding (PIE). Building upon rotary positional encoding, PIE first removes the rotary matrices in the Key cache that introduce temporal confusion and then reapplies the correct rotary matrices. This process ensures that positional relationships between tokens are correct and requires only a single round of matrix multiplication. We validate the effectiveness of PIE through extensive experiments on the RepoBench-C-8k dataset, using DeepSeek-Coder models with 1.3B, 6.7B, and 33B parameters. Our evaluation covers three real-world coding tasks: code insertion, code deletion, and multi-place code editing. Results demonstrate that PIE reduces computational overhead by over 85% compared to the standard full-recomputation approach across all model sizes and tasks while closely matching the model's performance. (A toy sketch of this positional correction follows the entry below.)
Zhenyu He, Jun Zhang, Shengjie Luo, Jingjing Xu, Zhi Zhang, Di He
NLP
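
To make PIE's mechanism concrete, here is a minimal NumPy sketch of the positional correction as we read it from the abstract: because rotary encodings compose additively, a cached key can be moved from its old position to a new one with a single extra rotation by the position delta. The interleaved-pair RoPE convention and all names below are our assumptions, not the paper's code.

    import numpy as np

    def rotate_pairs(x, angles):
        """Rotate consecutive (even, odd) feature pairs of x by the given angles."""
        x1, x2 = x[..., 0::2], x[..., 1::2]
        cos, sin = np.cos(angles), np.sin(angles)
        out = np.empty_like(x)
        out[..., 0::2] = x1 * cos - x2 * sin
        out[..., 1::2] = x1 * sin + x2 * cos
        return out

    def rope_angles(pos, dim, base=10000.0):
        inv_freq = base ** (-np.arange(dim // 2) / (dim // 2))
        return pos[:, None] * inv_freq

    def apply_rope(x, pos):
        return rotate_pairs(x, rope_angles(pos, x.shape[-1]))

    def reindex_keys(k_cache, old_pos, new_pos):
        # R(a) followed by R(b) equals R(a + b), so correcting a cached key only
        # needs one rotation by (new_pos - old_pos) -- no full re-encoding.
        return rotate_pairs(k_cache, rope_angles(new_pos - old_pos, k_cache.shape[-1]))

    # Sanity check: correcting stale keys matches re-encoding at the new positions.
    rng = np.random.default_rng(0)
    k = rng.standard_normal((5, 64))     # 5 cached tokens, head dim 64
    old_pos = np.arange(5)
    new_pos = np.array([0, 1, 4, 5, 6])  # e.g. two lines inserted after token 1
    assert np.allclose(reindex_keys(apply_rope(k, old_pos), old_pos, new_pos),
                       apply_rope(k, new_pos))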

Apr 15, 2025
Seedream 3.0 Technical Report
We present Seedream 3.0, a high-performance Chinese-English bilingual image generation foundation model. We develop several technical improvements to address existing challenges in Seedream 2.0, including alignment with complicated prompts, fine-grained typography generation, suboptimal visual aesthetics and fidelity, and limited image resolutions. Specifically, the advancements of Seedream 3.0 stem from improvements across the entire pipeline, from data construction to model deployment. At the data level, we double the dataset using a defect-aware training paradigm and a dual-axis collaborative data-sampling framework. Furthermore, we adopt several effective techniques such as mixed-resolution training, cross-modality RoPE, representation alignment loss, and resolution-aware timestep sampling in the pre-training phase. During the post-training stage, we use diversified aesthetic captions in SFT and a scaled VLM-based reward model, thereby achieving outputs that align well with human preferences. Seedream 3.0 also pioneers a novel acceleration paradigm: by employing consistent noise expectation and importance-aware timestep sampling, we achieve a 4x to 8x speedup while maintaining image quality. Seedream 3.0 demonstrates significant improvements over Seedream 2.0: it enhances overall capabilities, in particular text rendering of complicated Chinese characters, which is important for professional typography generation. In addition, it provides native high-resolution output (up to 2K), allowing it to generate images with high visual quality.
Seed Vision Team
Computer Vision
Featured Jobs
Research Scientist in ML Systems | Seattle / San Jose | Experienced Hiring
Software Engineer, ML System Architecture | Seattle / San Jose | Experienced Hiring
Research Scientist, Applied Machine Learning | Seattle / San Jose | Campus Recruitment
Software Engineer in Machine Learning Systems | Seattle / San Jose | Campus Recruitment
Software Engineer Intern (Seed - Machine Learning System) | Seattle / San Jose | Internship
Research Scientist Intern (Seed - Machine Learning System) | Seattle / San Jose | Internship