Using AI to Translate Speech For a Primarily Oral Language

Media Release by Meta

AI-powered speech translation has mainly focused on written languages, yet nearly 3,500 living languages are primarily spoken and don’t have a widely used writing system. This makes it impossible to build machine translation tools using standard techniques, which require large amounts of written text in order to train an AI model.

To address this challenge, we’ve built the first AI-powered speech-to-speech translation system for Hokkien, a primarily oral language that’s widely spoken within the Chinese diaspora but lacks a standard written form. We’re open-sourcing our Hokkien translation models, evaluation datasets and research papers so that others can reproduce and build on our work.

The translation system is part of our Universal Speech Translator project, which is developing new AI methods that we hope will eventually allow real-time speech-to-speech translation across many languages. We believe spoken communication can bring people together wherever they are located — even in the metaverse.

A New Modeling Approach

Many speech translation systems rely on transcriptions. However, since primarily oral languages don’t have standard written forms, producing transcribed text as the translation output doesn’t work. So, we focused on speech-to-speech translation.

To do this, we developed a variety of methods, such as using speech-to-unit translation to translate input speech to a sequence of acoustic sounds, and generated waveforms from them or rely on text from a related language, in this case Mandarin.

Looking to the Future of Translation

While the Hokkien translation model is still a work in progress and can translate only one full sentence at a time, it’s a step toward a future where simultaneous translation between languages is possible. The techniques we pioneered can be extended to many other written and unwritten languages.

We’re also releasing SpeechMatrix, which is a large collection of speech-to-speech translations developed through our innovative natural language processing toolkit called LASER. These tools will enable other researchers to create their own speech-to-speech translation systems and build on our work. And our progress in what researchers refer to as unsupervised learning demonstrates the feasibility of building high-quality speech-to-speech translation models without any human annotations. This will help extend those models to work for languages where there isn’t any labeled training data available to train the system.

Our AI research is helping break down language barriers in both the physical world and the metaverse to encourage connection and mutual understanding. We look forward to expanding our research and bringing this technology to more people in the future.

Using AI to Translate Speech For a Primarily Oral Language

A New Modeling Approach

Looking to the Future of Translation

Latest News

How organisations can prepare for the next wave of AI and...

Good Design Australia opens entries for 2026 awards

Report: AI implementation falls short as APAC execs demand measurable ROI

Bentley launches global search for top infrastructure engineering innovations

Minitab acquisition highlights growing role of experimentation in modern manufacturing

Featured

The rise of AI chatbots: what are they and how to...

World’s First Space Hotel to Start Construction in 2025

The “deal with it” glasses were sold as NFT for $22,000

SpaceX Starship rocket lands successfully ..then it blows up

Google makes it easier to swap between user profiles in Chrome

More News

Bitcoin rises 5% to $50,942.58

IBM aims to produce net zero Greenhouse Gas Emissions by 2030

Apple unveils iPhone 14 Pro and iPhone 14 Pro Max

This Incredible Map Shows 25,000 Supermassive Black Holes

Latest News

How organisations can prepare for the next wave of AI and automation

Good Design Australia opens entries for 2026 awards

Report: AI implementation falls short as APAC execs demand measurable ROI

More News

Fuelled by hope and fear, cryptocurrency markets are primed for contagion

NVIDIA Wins NeurIPS Awards for Research on Generative AI, Generalist AI Agents

Australian Government announced a move to regulate cryptocurrency

Using AI to Translate Speech For a Primarily Oral Language

A New Modeling Approach

Looking to the Future of Translation

Latest News

Featured

More News

Recommended For You

Latest News

More News