Spotify and OpenAI are working together to give podcasters the ability to have their episodes translated into other languages. This happens almost completely automatically, and even in the podcaster's own voice. The AI translation tool is initially available to a select group of podcasters, who can have their episodes translated from English into Spanish, French, and German.
The translations are done using OpenAI's Whisper. Whisper specializes in transcribing audio; the resulting text can then be translated and converted back into speech. To reproduce a podcaster's original voice, a synthetic version of it must first be created, for which a few seconds of audio input are sufficient. For security reasons, this is not available to everyone, as OpenAI writes: "These options also pose new risks, such as misuse to impersonate known people or other fraud."
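The pipeline described above has three stages: transcribe the audio, translate the text, and re-synthesize it in a cloned voice. The following is a minimal Python sketch of that flow. All function bodies are hypothetical placeholders, not Spotify's or OpenAI's actual implementation; a real system would call a speech-recognition model such as Whisper, a translation model, and a voice-cloning text-to-speech model at the marked stages.

```python
# Sketch of a speech-to-speech translation pipeline as described in the
# article: transcribe -> translate -> synthesize in a cloned voice.
# Every stage below is a placeholder standing in for a real model call.

def transcribe(audio: bytes) -> str:
    """Stand-in for a Whisper-style speech-to-text stage."""
    # Placeholder: pretend the audio bytes simply contain their transcript.
    return audio.decode("utf-8")

def translate(text: str, target_lang: str) -> str:
    """Stand-in for a text-to-text translation stage."""
    # Placeholder: a tiny lookup table instead of a translation model.
    glossary = {("hello", "es"): "hola"}
    return glossary.get((text.lower(), target_lang), text)

def synthesize_speech(text: str, voice_profile: str) -> bytes:
    """Stand-in for a voice-cloning text-to-speech stage."""
    # Placeholder: tag the text with the voice profile instead of
    # generating audio in the cloned voice.
    return f"[{voice_profile}] {text}".encode("utf-8")

def translate_episode(audio: bytes, target_lang: str, voice_profile: str) -> bytes:
    """Chain the three stages: speech -> text -> translated text -> speech."""
    transcript = transcribe(audio)
    translated = translate(transcript, target_lang)
    return synthesize_speech(translated, voice_profile)
```

For example, `translate_episode(b"Hello", "es", "host-voice")` returns the mock audio `b"[host-voice] hola"`, i.e. the Spanish translation rendered in the host's (here, merely labeled) voice.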
That is why a different version of the voice technology is used in ChatGPT's voice assistant feature, which has also just been announced. For it, OpenAI worked with professional voice actors to synthesize their voices. Although these voices reportedly still sound a little tinny, they come close to the originals and thus enable a very different experience from previous voice assistants such as Alexa or Siri.
Spotify writes in its announcement about voice translation for podcasters: "With recent advances, we asked ourselves: Are there more ways to bridge the language barrier so these voices can be heard worldwide?" The first testers are Dax Shepard, Monica Padman, Lex Fridman, Bill Simmons, and Steven Bartlett, all English-speaking podcasters. Both selected past episodes and upcoming episodes will be translated. Other podcasters are expected to follow soon; Trevor Noah, for example, is already in the pipeline.
YouTube and HeyGen also offer video translators
YouTube has also already released an AI-supported translation function for videos, which lets YouTubers create alternative audio tracks to reach a wider audience. This function is currently only available for selected creators and languages. It works much like the collaboration between OpenAI and Spotify: first a transcript of the audio track is created, then it is translated and fed into a text-to-speech model. At Google, the Aloud team is responsible for the feature. However, there is currently no way to use a synthesized version of one's own voice; for now, a computer-generated voice delivers the translation. Google has already announced that this will change.
Google recently presented the current version of its Universal Translator at Google I/O, which makes exactly this kind of translation possible, including video with lip-synchronized mouth movements. It has not yet been released, for security reasons. Things are different with HeyGen, a freely available tool that can perform exactly such video translations and recently caused a stir on social networks.