On Monday night Hungarian time, OpenAI, the leading company in the field of artificial intelligence, presented a very impressive new model that is natively capable of conducting conversations in real time, practically without delay, based not only on written text but also on video, images, and audio. The model, called GPT-4o, will roll out gradually over the coming weeks, and the company announced that, unlike the current top model, it will be available to everyone for free.
Speculation about what exactly OpenAI would announce had already begun in the days before Monday's event. According to Reuters, much of the world's press treated it as fact that the company would take on Google's long-dominant search engine with an AI-based solution of its own, but CEO Sam Altman explained on X (formerly Twitter) that neither a search engine nor GPT-5 would be announced. He did add, though, that
they've been working hard on something new that they think people will love, and that it still feels like magic to him. From what we saw on Monday, that's not surprising at all.
They didn't waste much time at the event and covered the most important things first, namely that ChatGPT is getting a desktop version and a new UI, and that their new model, available to everyone for free, is called GPT-4o. On this point, Mira Murati, the company's CTO, said that an important element of their mission is to make their AI models freely available and easy to use, even as the models themselves become more complex.
They then jumped straight into GPT-4o, a GPT-4 level model that is much faster than the current top model (GPT-4 Turbo, introduced about half a year ago) and significantly improved in every input modality compared to its predecessor. The API that developers use is also being updated: according to Murati, it is twice as fast as the previous model and has five times higher rate limits, yet costs only half as much. The new model brings free users all the functionality that was previously available only to subscribers, although paying users will still get five times higher usage limits.
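For developers, the model is reached through the existing Chat Completions endpoint. Here is a minimal sketch, assuming the official `openai` Python package and the `gpt-4o` model identifier from the announcement (the prompt is illustrative):

```python
# Minimal sketch: calling GPT-4o through OpenAI's Chat Completions API.
# Assumes the `openai` package is installed and OPENAI_API_KEY is set.
from openai import OpenAI

client = OpenAI()  # reads the OPENAI_API_KEY environment variable

response = client.chat.completions.create(
    model="gpt-4o",  # the new model's API identifier
    messages=[
        {"role": "user", "content": "In one sentence: what is GPT-4o?"},
    ],
)
print(response.choices[0].message.content)
```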
Murati said the models have become better and better in recent years, but handling them has now been greatly simplified, which is a big step toward making interaction between people and machines feel natural. The most significant change is that voice input, which until now required three models working together (one transcribing speech to text, another interpreting the text and composing a reply, and a third reading the reply aloud), works natively in GPT-4o. This means that
The model can communicate in real time, without delay, based on camera images, written text, and live speech, and judging by the presentation, it looks like something straight out of science fiction.
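For contrast, here is a hedged sketch of the kind of three-step pipeline the old voice mode stitched together from OpenAI's separate speech-to-text, chat, and text-to-speech endpoints; the file names and prompt are illustrative, and each network round trip adds the latency that the natively multimodal GPT-4o avoids:

```python
# Hedged sketch of the old three-model voice pipeline (transcribe -> reason -> speak).
# Each of the three calls is a separate network round trip, hence the delay.
from openai import OpenAI

client = OpenAI()

# 1) Speech-to-text: transcribe the user's recorded question (file name illustrative).
with open("question.wav", "rb") as audio_file:
    transcript = client.audio.transcriptions.create(model="whisper-1", file=audio_file)

# 2) Text reasoning: a second model interprets the text and composes a reply.
chat = client.chat.completions.create(
    model="gpt-4-turbo",
    messages=[{"role": "user", "content": transcript.text}],
)
reply_text = chat.choices[0].message.content

# 3) Text-to-speech: a third model reads the reply aloud.
speech = client.audio.speech.create(model="tts-1", voice="alloy", input=reply_text)
speech.stream_to_file("reply.mp3")
```

Besides the seconds of waiting, the text hand-off between the steps discards tone, emphasis, and the possibility of interruption, which is exactly what handling audio natively in one model restores.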
That sci-fi comparison may seem like a stretch at first, but judging by the demos, they've actually put together something that a year ago you would have confidently called staged. Armed with the new model, you talk to ChatGPT the same way you would to Google Assistant or Siri, but unlike those, it converses in real time as if you were talking to another person. Not only because you don't have to wait seconds for a response, but because what it says and how it says it feels eerily natural.
The new model recognizes and reacts to human tone of voice in real time, and it doesn't get hung up on exact word choice; it even recognizes when someone is panting nervously or breathing calmly and can respond to it. You can interrupt it like a real person, and it can itself imitate different emotional styles. This was demonstrated by having it tell a story in an increasingly dramatic tone, and it did become more and more dramatic. It had to finish by singing the ending, and before doing so it sighed, exactly as a person put on the spot like that would.
Such elements, which aren't essential for conveying information but are common in human communication, came up with the model all the time. When they switched to video and ChatGPT started talking before it had even been shown what was wanted of it, it said things like, "Oops, I got a little too excited." The model then smoothly walked an OpenAI researcher through solving a linear equation written on paper, and when Murati interjected from the background, asking, "Okay, but what use is this in everyday life?", it even gave a little lecture on the topic.
And of course it was able to help with programming: in about a second it understood and summarized what the code shown to it on screen was doing, then did the same with the chart that appeared once the code was run. It also turned out it can translate in real time, and it could even tell from a researcher's face that he was in a good mood (before that, it cracked a joke because it was accidentally shown the table first). It all felt as if
we were just one step away from the holographic virtual intelligence of Mass Effect, or the AI system of the movie Her (it's no coincidence that Altman himself posted exactly that after the demo), and if GPT-4o really works that way, we might well be.
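For the curious, the screen-reading demos map onto the API's image input. A hedged sketch, with an illustrative screenshot path and prompt, of asking GPT-4o what a piece of code on screen does:

```python
# Hedged sketch: sending a screenshot (e.g. of code) to GPT-4o as image input.
# The file name and prompt are illustrative, not taken from the presentation.
import base64
from openai import OpenAI

client = OpenAI()

with open("screenshot.png", "rb") as f:
    image_b64 = base64.b64encode(f.read()).decode("utf-8")

response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{
        "role": "user",
        "content": [
            {"type": "text", "text": "What does the code in this screenshot do?"},
            {"type": "image_url",
             "image_url": {"url": f"data:image/png;base64,{image_b64}"}},
        ],
    }],
)
print(response.choices[0].message.content)
```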
You can watch the full broadcast below, and on OpenAI's channel you'll find more interesting clips beyond the demos shown here, including joke-telling, interview preparation, meeting a dog, and two GPT-4os talking to each other and singing together.