Curated By: Shaurya Sharma
Last Updated: August 23, 2023, 08:23 IST
Menlo Park, California, USA
Meta has unveiled a new AI model called SeamlessM4T, which is designed to help users translate text and speech more efficiently across different languages.
The company says SeamlessM4T is the first all-in-one multimodal and multilingual AI translation model. It can recognize speech in nearly 100 languages and translate speech to text for nearly 100 input and output languages. It also supports text-to-text translation, text-to-speech translation, and even speech-to-speech translation.
Meta is making SeamlessM4T publicly available under a research license so that researchers can build on the existing work.
“Building a universal language translator, like the fictional Babel Fish in The Hitchhiker’s Guide to the Galaxy, is challenging because existing speech-to-speech and speech-to-text systems only cover a small fraction of the world’s languages. But we believe the work we’re announcing today is a significant step forward in this journey,” Meta notes.
The company also said that, compared with “approaches using separate models, SeamlessM4T’s single system approach reduces errors and delays, increasing the efficiency and quality of the translation process. This enables people who speak different languages to communicate with each other more effectively.”
Meta also stated that the model is a step toward creating a “universal translator,” and that it draws on some of the company’s existing models, such as No Language Left Behind and Massively Multilingual Speech.
“In the future, we want to explore how this foundational model can enable new communication capabilities, ultimately bringing us closer to a world where everyone can be understood,” Meta said.
In related news, Meta also recently unveiled its AudioCraft AI tool, which lets users create original audio tracks using text-based prompts. The tool is divided into three models: AudioGen, MusicGen, and EnCodec. AudioGen generates audio from text prompts and is based on public sound effects, while MusicGen does the same thing but with music licensed by Meta. The EnCodec decoder allows for higher-quality music generation with fewer artifacts.