What Can OpenAI’s GPT-4o Do?

In this post:

  • The latest model from OpenAI, GPT-4o, “feels like AI from the movies.”
  • It can accept and generate any combination of text, audio, and images.
  • It responds with a more empathetic voice than a typical AI model would.

OpenAI on Monday revealed its latest flagship model, GPT-4o (“o” for “omni”), and it is seemingly the closest we have come to an intelligent assistant like “Jarvis” in the Iron Man movies.

The selling point is that GPT-4o can handle multiple modalities in a single model, which most existing AI models cannot do. In practice, this means GPT-4o can accept and generate any combination of text, audio, and images.

The staged demo the team shared on X (formerly Twitter) impressed many viewers and generated considerable hype. One notable feat is that GPT-4o responds to audio inputs in as little as 232 milliseconds, which is similar to human response time in conversation.

“It feels like AI from the movies; and it’s still a bit surprising to me that it’s real,” OpenAI’s CEO Sam Altman wrote in a blog post Monday. “Getting to human-level response times and expressiveness turns out to be a big change.”

OpenAI has started rolling out GPT-4o’s text and image features to users. In the coming weeks, the audio and video capabilities will be released to “a small group of trusted partners in the API,” the company said. 

In the meantime, here are some of the things you can do with GPT-4o.

Things You Can Do With GPT-4o

Create Images with Legible Text

Until now, AI image generators such as Midjourney have struggled to produce images containing readable text. OpenAI said GPT-4o understands text descriptions much better and can render legible text in images.

Image Source: OpenAI

Real-Time Translation

When a translator is needed, GPT-4o can act as one. In a video demonstration, OpenAI’s team showed GPT-4o translating speech from English to Spanish and from Spanish back to English, and it can presumably do the same for other languages.

Look and Tell

For people who are visually impaired, or just for the fun of it, GPT-4o can look through the phone camera and describe what is happening around you. In one case, the model was able to tell that someone was having a birthday celebration when it noticed a cake and a candle in the room.

Solve Math Problems

GPT-4o can also look at math problems on a sheet of paper or a screen and give the answers. Beyond that, it can act as a tutor and guide you through how to solve the problem yourself.

AI in Video Meetings

GPT-4o can join video meetings and hold conversations with participants. It can also help users prepare for job interviews.
