OpenAI’s Voice Engine Creates Synthetic Voices from Just 15 Seconds of Audio

OpenAI, the company behind ChatGPT and Sora, has been cooking up something new: Voice Engine. The tool can create a synthetic voice from just 15 seconds of sample audio. Pretty cool, right?

According to its blog post, OpenAI has been working on Voice Engine since late 2022. The model already powers the Read Aloud feature of the ChatGPT app, where it reads answers out to you in an “emotive and realistic” voice.

Voice Engine has plenty of potential uses. It could help with education, translate podcasts, reach remote communities, and even support people who are non-verbal. Imagine that!

But hold your horses: Voice Engine is still in a limited preview. OpenAI wants to make sure it isn’t misused, especially to spread misinformation or to copy voices without permission.

“We hope to start a dialogue on the responsible deployment of synthetic voices,” says OpenAI. They want to make sure everyone’s on board before rolling this out widely.

With elections coming up in the US and UK, and AI tools getting smarter by the day, there’s concern about how far audio can still be trusted. Voice authentication could become easier to fool, and phone scams could become harder to spot.

It’s a tricky situation, but one thing’s for sure: OpenAI is on it, and we’ll have to find ways to navigate this new tech landscape together.