Meta, the parent company of Facebook and Instagram, announced a few weeks ago that it had developed an AI-powered voice synthesizer that needs only a two-second audio sample. Now Google has created its own technology, AudioPaLM, which is also suitable for dubbing.
Google’s new AI technology, AudioPaLM, may revolutionize voice-over production and dubbing. Meta, the parent company of Facebook and Instagram, announced a few weeks ago that it had developed a new artificial intelligence that can synthesize the voice of a specific speaker from just a two-second sample. However, the company led by Mark Zuckerberg has declined to release the technology, called Voicebox, to the public because it considers it too dangerous.
Now, however, Google has also stepped on the accelerator. According to a report by The Decoder, the search giant recently unveiled a similar technology that could revolutionize the field of dubbing. Built on Google’s large language model PaLM 2, AudioPaLM requires a slightly longer audio sample, at least three seconds, to mimic a speaker’s voice. In return, it also transcribes spoken words into written text and translates that text into other languages. In this way, the algorithm can also generate simultaneous subtitles alongside speech in the speaker’s voice.
It can also produce subtitles as transcripts of audio files, and according to Google, AudioPaLM is well suited to speech recognition as well. The technology could be used in many areas, from multilingual voice assistants to automated transcription applications.
Google describes the technology in a research paper available on GitHub.
However, this is not the only technology of this kind that Google is developing. The company’s YouTube subsidiary recently announced that it will offer AI-generated dubbing on its platform. The feature is based on a tool developed by Aloud, a project from Google’s in-house incubator Area 120.
Competition in artificial intelligence seems to be intensifying among the tech giants. Google’s new technology, AudioPaLM, is certainly an exciting and promising development that could take voice imitation and language translation to a whole new level. Only time will tell how much it will transform dubbing and multilingual communication in our increasingly connected world.
Source: Computerworld
(Cover image: AI. Illustration: Getty Images)