AI for Science
The Seed AI for Science team applies AI to drive breakthroughs in scientific computing research, focusing on biological foundation models, quantum chemistry, and molecular dynamics.
Research Topics
Multimodal biological foundation model
Develop multimodal foundation models for natural sciences, focusing on the design, conformation generation, and structure prediction of biological molecules such as proteins, DNA, and RNA.
Multimodal Foundation Models
Natural Sciences
Quantum chemistry
Focus on interdisciplinary research at the crossroads of machine learning, quantum physics, and quantum chemistry to enable large-scale, high-precision numerical simulations for scientific computing.
Machine Learning
Quantum Physics
Quantum Chemistry
Multimodal foundation model for biomolecular structure
Develop a structure-centric foundation model for biomolecules that supports key tasks such as complex structure and dynamics prediction, functional modeling, and molecular design across all biomolecule types, including proteins, DNA, RNA, small molecules, ions, and post-translational modifications. The aim is to build Protenix into a globally impactful open-source model series.
Biomolecular Structure
Foundation Model
Open-source Model
AI Molecular Dynamics
Explore the application of machine learning techniques in force field development, molecular dynamics simulations, enhanced sampling, and other computational methods, scaling their use to advance drug and material discovery.
Machine Learning
Molecular Dynamics
Drug Discovery
Material Discovery

Selected Papers

Sep 02, 2025
PXDesign: Fast, Modular, and Accurate De Novo Design of Protein Binders
PXDesign achieves nanomolar binder hit rates of 20–73% across five of six diverse protein targets, surpassing prior methods such as AlphaProteo. This experimental success rate is enabled by advances in both binder generation and filtering. We develop both a diffusion-based generative model (PXDesign-d) and a hallucination-based approach (PXDesign-h), each showing strong in silico performance that outperforms existing models. Beyond generation, we systematically analyze confidence-based filtering and ranking strategies from multiple structure predictors, comparing their accuracy, efficiency, and complementarity on datasets spanning de novo binders and mutagenesis. Finally, we validate the full design process experimentally, achieving high hit rates and multiple nanomolar binders. To support future work and community use, we release a unified benchmarking framework at https://github.com/bytedance/PXDesignBench, provide public access to PXDesign via a webserver at https://protenix-server.com, and share all designed binder sequences at https://protenix.github.io/pxdesign.
Milong Ren, Jinyuan Sun, Jiaqi Guan, Cong Liu, Chengyue Gong, Yuzhe Wang, Lan Wang, Qixu Cai, Xinshi Chen, Wenzhi Xiao, Protenix Team
Molecular Biology
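The PXDesign abstract describes filtering and ranking candidate binders by confidence scores from multiple structure predictors. A minimal sketch of how such a confidence-based filter might look, assuming hypothetical per-design scores in [0, 1]; the aggregation rule, threshold, and candidate names are illustrative, not the paper's actual pipeline:

```python
def rank_designs(designs, threshold=0.8, top_k=10):
    """Rank candidate binder designs by aggregated predictor confidence.

    `designs` maps a design ID to a list of confidence scores in [0, 1],
    one per structure predictor (scores here are illustrative).
    """
    # Aggregate the predictors by their mean confidence.
    scored = {d: sum(s) / len(s) for d, s in designs.items()}
    # Keep only designs whose aggregated confidence clears the threshold.
    passing = {d: c for d, c in scored.items() if c >= threshold}
    # Return up to top_k design IDs, highest confidence first.
    return sorted(passing, key=passing.get, reverse=True)[:top_k]

candidates = {
    "binder_A": [0.91, 0.88],  # high agreement across predictors
    "binder_B": [0.95, 0.60],  # predictors disagree; mean falls below 0.8
    "binder_C": [0.85, 0.82],
}
print(rank_designs(candidates))  # → ['binder_A', 'binder_C']
```

Averaging is only one possible aggregation; taking the minimum across predictors is a stricter alternative that penalizes disagreement more heavily.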
Jun 12, 2025
Elucidating the Design Space of Multimodal Protein Language Models
Multimodal protein language models (PLMs) integrate sequence and token-based structural information, serving as a powerful foundation for protein modeling, generation, and design. However, tokenizing 3D structures into discrete tokens causes substantial loss of fidelity in fine-grained structural details and correlations. In this paper, we systematically elucidate the design space of multimodal PLMs to overcome these limitations. We identify tokenization loss and inaccurate structure token predictions by the PLMs as the major bottlenecks. To address these, our proposed design space covers improved generative modeling, structure-aware architectures and representation learning, and data exploration. Our improvements enable finer-grained supervision, demonstrating that token-based multimodal PLMs can achieve robust structural modeling. These design methods dramatically improve structure generation diversity and, notably, the folding ability of our 650M model, reducing RMSD from 5.52 to 2.36 on the PDB test set, outperforming 3B baselines and performing on par with specialized folding models.
Cheng-Yen Hsieh, Xinyou Wang, Daiheng Zhang, Dongyu Xue, Fei Ye, Shujian Huang, Zaixiang Zheng, Quanquan Gu
AI for Science
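The folding gains above are reported as RMSD between predicted and reference structures. A minimal sketch of the standard coordinate RMSD computation; the toy coordinates are illustrative, and real evaluations typically superimpose the two structures first (e.g., via the Kabsch algorithm) before measuring deviation:

```python
import math

def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between two equal-length lists of
    (x, y, z) coordinates, assumed already superimposed."""
    assert len(coords_a) == len(coords_b)
    # Sum of squared per-atom deviations.
    sq = sum(
        (ax - bx) ** 2 + (ay - by) ** 2 + (az - bz) ** 2
        for (ax, ay, az), (bx, by, bz) in zip(coords_a, coords_b)
    )
    # Average over atoms, then take the square root.
    return math.sqrt(sq / len(coords_a))

# Toy example: two 3-atom chains offset by 1 Å along x.
ref  = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (3.0, 0.0, 0.0)]
pred = [(1.0, 0.0, 0.0), (2.5, 0.0, 0.0), (4.0, 0.0, 0.0)]
print(round(rmsd(ref, pred), 3))  # → 1.0
```

Since every atom deviates by exactly 1 Å here, the RMSD is 1.0; in practice lower RMSD against the experimental structure indicates a more accurate fold.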

Featured Jobs

Research Scientist in Foundation Models for Science
San Jose
Experienced Hiring
Research Scientist, Computational Biology
Seattle
Experienced Hiring
Machine Learning Research Scientist - Atomistic AI
Seattle
Experienced Hiring
Research Scientist in Generative AI for Science (ByteDance Seed) - 2026 Start (PhD)
San Jose
Campus Recruitment
Research Scientist Graduate, Biomolecular Structure Foundation Models (Seed AI-for-Science) - 2026 Start (PhD)
Seattle
Campus Recruitment
Machine Learning Research Scientist – Atomistic AI - 2026 Start (PhD)
Seattle
Campus Recruitment