AI for Science
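The Seed-AI for Science team focuses on exploring forward-looking technologies in scientific computing. Working across biological foundation models, quantum chemistry, molecular dynamics, and related areas, the team uses AI to drive breakthroughs in scientific research paradigms.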

Research Directions

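Multimodal Biological Foundation Models
Develop multimodal foundation models for the natural sciences, for the design, conformation generation, and structure prediction of biomolecules such as proteins, DNA, and RNA.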
Multimodal Foundation Models
Natural Sciences

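Quantum Chemistry
Focus on research at the intersection of machine learning with quantum physics and quantum chemistry, enabling large-scale, high-accuracy numerical simulation for scientific computing.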
Machine Learning
Quantum Physics
Quantum Chemistry

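Multimodal Biomolecular Structure Foundation Models
Build structure-centric biomolecular foundation models that support key tasks such as complex structure and dynamics prediction, function modeling, and molecular design across all biomolecule types (proteins, DNA, RNA, small molecules, ions, and post-translational modifications), and develop the globally influential Protenix open-source model series.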
Biomolecular Structure
Foundation Model
Open-source Model

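AI Molecular Dynamics
Explore the application of machine learning methods to force field development, molecular dynamics simulation, enhanced sampling, and other property calculation methods, and apply them at scale to drug and materials discovery.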
Machine Learning
Molecular Dynamics
Drug
Material
Selected Papers

2025.06.12
Elucidating the Design Space of Multimodal Protein Language Models
Multimodal protein language models (PLMs) integrate sequence and token-based structural information, serving as a powerful foundation for protein modeling, generation, and design. However, the reliance on tokenizing 3D structures into discrete tokens causes substantial loss of fidelity about fine-grained structural details and correlations. In this paper, we systematically elucidate the design space of multimodal PLMs to overcome their limitations. We identify tokenization loss and inaccurate structure token predictions by the PLMs as major bottlenecks. To address these, our proposed design space covers improved generative modeling, structure-aware architectures and representation learning, and data exploration. Our advancements approach finer-grained supervision, demonstrating that token-based multimodal PLMs can achieve robust structural modeling. The effective design methods dramatically improve the structure generation diversity and, notably, the folding abilities of our 650M model, reducing the RMSD from 5.52 to 2.36 on the PDB test set, even outperforming 3B baselines and performing on par with specialized folding models.
Cheng-Yen Hsieh, Xinyou Wang, Daiheng Zhang, Dongyu Xue, Fei Ye, Shujian Huang, Zaixiang Zheng, Quanquan Gu
AI for Science
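Open Positions
Cloud-Native Engineer, Scientific Computing - Seed
CADD / Structural Biology / Computational Biology Algorithm Researcher - Seed
Biomolecular Structure Foundation Model Algorithm Researcher - Seed
Machine Learning Algorithm Researcher - Seed
Quantum Chemistry and Machine Learning Researcher - Seed
Multimodal Biological Foundation Model Researcher - Seed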