Speech
The mission of the Seed Speech team is to enrich interactive and creative processes by applying multimodal speech technologies. The team works at the forefront of research and product development in speech and audio, music, natural language understanding, and multimodal deep learning.
Latest advancements

Seed LiveInterpret
Seed LiveInterpret is a real-time simultaneous interpretation model that delivers high-quality, low-latency speech-to-speech interpretation with real-time voice cloning support. Compared to traditional cascaded systems (S2T-T2S), its end-to-end architecture shows significant improvements in both interpretation quality and end-to-end latency.

Seed Realtime Voice Model
Seed Realtime Voice Model is a foundational real-time voice system that achieves human-like, end-to-end conversational interaction. Compared to traditional cascaded architectures, it demonstrates superior expressive capabilities, precise vocal control, and coherent emotional continuity. It also features low latency and can be interrupted at any point in a dialogue.

Seed-Music
Seed-Music is a collection of music generation models with flexible control capabilities, offering four core functions: controllable music generation, score-to-music conversion, lyric and music editing, and zero-shot vocal cloning. By combining the strengths of language models and diffusion models in the composition process, Seed-Music makes crafting songs accessible to everyone.

Selected papers

Jul 24, 2025
Seed LiveInterpret 2.0: End-to-end Simultaneous Speech-to-speech Translation with Your Voice
Speech & Audio
Feb 25, 2025
You Only Sample Once: Taming One-Step Text-to-Image Synthesis by Self-Cooperative Diffusion GANs
Computer Vision
Sep 13, 2024
Seed-Music: A Unified Framework for High Quality and Controlled Music Generation
Speech & Audio

Featured roles

Research Scientist, Multimodality
San Jose / Seattle
Experienced Hiring
Research Scientist, Foundation Model, Music Intelligence
San Jose
Experienced Hiring
Research Scientist in Foundation Model, Speech & Audio Generation - 2025 Start (PhD)
San Jose / Seattle
Campus Hiring
Research Scientist in Foundation Model, Music - 2025 Start (PhD)
San Jose
Campus Hiring
Student Researcher (Seed - Foundation Model - Speech Understanding) - 2025 Start (PhD)
San Jose / Seattle
Internship
Student Researcher (Seed - Music Foundation Model) - 2025 Start (PhD)
San Jose
Internship
Advancing the frontier of intelligence, in service of humanity
Copyright © 2026 ByteDance Seed
Contact us: seed.feedback@bytedance.com