Microsoft’s text-to-speech AI can imitate anyone

Published January 12, 2023

Microsoft researchers have announced VALL-E, a text-to-speech AI model that can simulate a real person’s voice from an audio sample just three seconds long. Once it has the sample, it can read any text aloud as if that person had spoken it, while preserving the speaker’s characteristic intonation. Its creators envision it being used in advanced text-to-speech and speech-editing applications, including in combination with other generative AI models such as GPT-3, which would generate the text.

Microsoft describes VALL-E as a neural codec language model. It is built on EnCodec, a neural audio compression network that Meta announced last year. Unlike other text-to-speech methods, which work by manipulating waveforms, VALL-E generates discrete audio codec codes from the input text and the sample audio.

VALL-E essentially analyzes the characteristics of a given person’s speech and uses EnCodec to split that information into separate components, “phonetic codes,” from which the final waveform is created. In addition to imitating the speaker’s tone, it can also imitate the “acoustic environment” of the sample. For example, if the sample is cut from a phone call, the output reproduces the acoustics and frequency characteristics of a phone call.

The Microsoft researchers trained the model on LibriLight, an audio library assembled by Meta that contains more than 60,000 hours of English speech from more than 7,000 speakers. For VALL-E to generate high-quality, realistic output, the voice in the audio sample must closely match a voice in the training data, so the researchers plan to expand the dataset in the future.

Because of the potential for misuse, Microsoft is not making VALL-E or its code available for others to test at this time. According to its announcement, the company will follow its own guidelines for AI-related developments in the future, and a separate model is being prepared to determine whether an audio clip was generated with VALL-E. On the project’s GitHub page, you can listen to samples of what the algorithm produces: it’s not perfect yet, and some clips sound robotic, but some of the results are eerily realistic.
