AI for Science
The Seed-AI for Science team applies AI to drive breakthroughs in the research paradigms of scientific computing, focusing on biological foundation models, quantum chemistry, and molecular dynamics.

Research topics

Multimodal Biological Foundation Model
Develop multimodal foundation models for natural sciences, focusing on the design, conformation generation, and structure prediction of biological molecules such as proteins, DNA, and RNA.
Multimodal Foundation Models
Natural Sciences

Quantum Chemistry
Focus on interdisciplinary research at the crossroads of machine learning, quantum physics, and quantum chemistry to enable large-scale, high-precision numerical simulations for scientific computing.
Machine Learning
Quantum Physics
Quantum Chemistry

Multimodal Foundation Model for Biomolecular Structure
Develop a structure-centric foundation model for biomolecules to support key tasks such as predicting complex structures and dynamics, functional modeling, and molecular design across all biomolecule types, including proteins, DNA, RNA, small molecules, ions, and post-translational modifications. The aim is to build Protenix, an open-source model series with global impact.
Biomolecular Structure
Foundation Model
Open-source Model

AI Molecular Dynamics
Explore the application of machine learning techniques in force field development, molecular dynamics simulations, enhanced sampling, and other computational methods, scaling their use to advance drug and material discovery.
Machine Learning
Molecular Dynamics
Drug
Material
Selected Papers

Jun 12, 2025
Elucidating the Design Space of Multimodal Protein Language Models
Multimodal protein language models (PLMs) integrate sequence and token-based structural information, serving as a powerful foundation for protein modeling, generation, and design. However, the reliance on tokenizing 3D structures into discrete tokens causes substantial loss of fidelity in fine-grained structural details and correlations. In this paper, we systematically elucidate the design space of multimodal PLMs to overcome their limitations. We identify tokenization loss and inaccurate structure token predictions by the PLMs as major bottlenecks. To address these, our proposed design space covers improved generative modeling, structure-aware architectures and representation learning, and data exploration. Our advancements approach finer-grained supervision, demonstrating that token-based multimodal PLMs can achieve robust structural modeling. The effective design methods dramatically improve structure generation diversity and, notably, the folding abilities of our 650M model, reducing the RMSD from 5.52 to 2.36 on the PDB test set, outperforming even 3B baselines and performing on par with specialized folding models.
Cheng-Yen Hsieh, Xinyou Wang, Daiheng Zhang, Dongyu Xue, Fei Ye, Shujian Huang, Zaixiang Zheng, Quanquan Gu
AI for Science
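The abstract above reports folding quality as RMSD (root-mean-square deviation) between predicted and reference structures. As a rough illustration of what that number measures, here is a minimal Python sketch. It assumes the two coordinate sets have the same atoms in the same order and are already superimposed; structural evaluations usually perform an optimal superposition first (for example with the Kabsch algorithm), and that step is omitted here. The function name `rmsd` and the toy coordinates are hypothetical and are not taken from the paper or its code.

```python
import math
from typing import Sequence, Tuple

Coord = Tuple[float, float, float]


def rmsd(predicted: Sequence[Coord], reference: Sequence[Coord]) -> float:
    """RMSD between two equally sized coordinate sets.

    Assumption: the sets are already superimposed and atoms are paired
    by index. No alignment is performed in this sketch.
    """
    if len(predicted) != len(reference):
        raise ValueError("coordinate sets must have the same number of atoms")
    if not predicted:
        raise ValueError("coordinate sets must not be empty")
    # Sum of squared Euclidean distances between paired atoms.
    squared = sum(
        (px - rx) ** 2 + (py - ry) ** 2 + (pz - rz) ** 2
        for (px, py, pz), (rx, ry, rz) in zip(predicted, reference)
    )
    # Mean over atoms, then square root.
    return math.sqrt(squared / len(predicted))


# Hypothetical toy coordinates, not data from the paper.
reference = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.0, 0.0)]
predicted = [(0.0, 0.0, 0.0), (1.0, 2.0, 0.0), (2.0, 0.0, 0.0)]
toy_rmsd = rmsd(predicted, reference)
```

In this toy case only the middle atom is displaced, by 2 units, so the result is the square root of 4/3. A lower RMSD means the predicted structure sits closer to the reference on average.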
Featured Jobs
Research Scientist in Foundation Models for Science
San Jose, Experienced Hiring

Research Scientist, Computational Biology
Seattle, Experienced Hiring

Machine Learning Research Scientist - Atomistic AI
Seattle, Experienced Hiring

Research Scientist in Generative AI for Science (ByteDance Seed) - 2026 Start (PhD)
San Jose, Campus Recruitment

Research Scientist Graduate, Biomolecular Structure Foundation Models, (Seed AI-for-Science) - 2026 Start (PhD)
Seattle, Campus Recruitment

Machine Learning Research Scientist – Atomistic AI - 2026 Start (PhD)
Seattle, Campus Recruitment