OpenAI’s Spring Update Is a More Natural Chatbot

For a week or so, it seemed like OpenAI was ready to take on Google and announce a ChatGPT-powered search engine. In this case, though, the rumor mill had it all wrong. Instead, during the company’s Spring Update event earlier today, OpenAI unveiled some modest upgrades to ChatGPT’s underlying model—but in a surprising and at times unsettling way.

Introducing GPT-4o, the new flagship model

OpenAI’s big announcement was a new model, GPT-4o. In a twist, the company revealed GPT-4o isn’t only for paying customers—it’s available to everyone, for free. The company sees GPT-4o as the first step towards making interactions with AI much more “natural,” a stance that made sense as the presentation went on.

GPT-4o works with voice, text, and vision, so you can interact with ChatGPT using any type of content you like. In addition, OpenAI is making many of its premium features free for all. Free users can access GPTs for the first time via the GPT Store, upload images (documents or pictures) and chat with ChatGPT about them, and access ChatGPT’s memory feature. That last one is especially useful: ChatGPT will remember what you talked about in past chats, so your future chats are informed by those conversations.

Paid users still get capacity limits up to 5x higher, so there’s something to justify spending that $20 per month.

OpenAI showed off the new model by demonstrating a breathing exercise. The demonstrator asked ChatGPT for some relaxation tips, which included an instruction to breathe in deeply. The demonstrator then breathed rapidly and loudly, in an attempt to check whether the model would identify the incorrect technique. Indeed, the model corrected the behavior, but the exchange was a bit choppy: The model kept cutting in and out as it gave feedback about the breathing technique. That said, you can “naturally” interrupt the model as it speaks, so it’s possible the demonstrator was accidentally interrupting it throughout.

From here, the demonstrators asked ChatGPT to come up with a story. It started off as you might expect ChatGPT to, but one demonstrator interrupted, asking for more emotion in the voice. Truth be told, it was impressive how the voice model started acting like a cartoon voiceover artist, especially once asked a second time to emphasize emotions. It even started talking like a stereotypical robot when prompted.

The part that rubbed me the wrong way was when demonstrators showed how you can give ChatGPT a live feed from your camera to analyze your surroundings. They used a simple math homework example, but I don’t know if I’m ready for ChatGPT to have constant access to my environment. If I want to ask it a question about something in front of me, a picture or video will do fine. To further my point, during this part of the demo, they tried to shut off the model, but it unexpectedly said something along the lines of “wow, that’s quite the outfit you have on.” Yeah, I’m really not here for the AI live feed.

It can also identify facial expressions from the live feed, which, again: creepy. One demonstrator put their face in the feed and asked what they looked like, and ChatGPT said something along the lines of “a piece of wood,” which the demonstrator quickly corrected, saying it was responding to an image he had submitted to the chatbot previously. (Sure, Jan.) Once he gave ChatGPT another shot, it did manage to identify his facial expression.

GPT-4o can also do live translations, which the team demonstrated on stage. One person pretended they only spoke Italian, while the other said they only spoke English. The live translation worked well, as far as I could tell: ChatGPT spoke in Italian, and I have to take OpenAI’s word that everything it said was correct.

Per the demo, GPT-4o will be rolling out over the next few weeks, and I’m looking forward to testing it out. Until then, I’m left feeling a bit unnerved by this experience. The voice effects are quite realistic, and at times it all feels fairly natural in a way that is entirely unnatural. ChatGPT will have “human” moments, such as saying something like “oh, silly me” or “well, that makes more sense” after being corrected. Sure, it’s impressive, but I’m not sure I want this tech in my life. What’s wrong with computers being distinctly computers? Why do I need to pretend my AI is actually alive? In any case, I’m not keeping that live feed open.

There’s a new ChatGPT desktop app too

While it was overshadowed by GPT-4o, OpenAI also announced a desktop app for ChatGPT, as well as a new UI, but didn’t dive too deeply into the changes.

The app seems similar to the web and mobile versions of ChatGPT, plus some new features. Demonstrators showed off a voice app built into this version of ChatGPT; it can’t see anything on your screen, but you can talk to it in the same conversational way. In the demo, they copied code over to the voice app, and ChatGPT analyzed and explained it, as you’d expect.


