5 new GPT-4o features making ChatGPT better than ever

Key Takeaways

  • GPT-4o promises real-time voice interaction with multiple tones and voices for a more human-like experience.
  • Vision capabilities allow GPT-4o to answer questions about photos and screenshots, and should ultimately support video.
  • May 13 sees GPT-4o joining all ChatGPT tiers, but with differences in prompt limits and voice function availability.

On May 13, OpenAI officially launched GPT-4o, its next AI model. Since GPT-4 is already the basis of much of the hype around generative AI, 4o could be poised to send shockwaves throughout the industry. Here’s everything that OpenAI revealed about the new AI technology, and why it’s a big step forward.

1 Real-time voice conversations

No keyboard required

There’s a strong focus on real-time voice exchanges with GPT-4o. The model can pick up on the tone in your voice, and will try to respond in an appropriate tone of its own. In some circumstances you can even ask it to add more or less drama to its response, or use a different voice — like a robotic one for a story being told by a robot, or singing for the end of a fairytale.

Perhaps more significantly, you can interrupt the AI at any time, say if it’s getting a request wrong, or you want to change its tone or voice mid-stream. 4o will do its best to correct itself, using the rest of a conversation as context. In a staged demonstration by OpenAI this all felt very natural, with the AI even apologizing when someone pointed out that it was missing some critical source data.

You’ll have to wait to try the new voice features, unfortunately. They’re initially deploying only to ChatGPT Plus subscribers, and only in an early alpha state sometime before the end of June.

2 Better vision capabilities and multilingual support

Words aren’t always enough

GPT-4o can also answer questions about photos and desktop screenshots. These may be similar to ones you’d ask the Ray-Ban Meta smart glasses or the Humane AI Pin, something like “What brand of pants are these?”, but they are potentially more complex, such as explaining a block of app code or translating a restaurant menu. OpenAI says that down the road, 4o may be capable of even more complicated tasks, such as watching live sports and explaining the rules involved. For now the focus appears to be on static images rather than video.

Alongside vision, multilingual support has improved. OpenAI claims better performance across 50 different languages, and says the GPT-4o API is twice as fast as the one for GPT-4 Turbo.

3 You can create images with readable text

Extending the possibilities of AI art

Generating images with legible text has long been a weak point of AI, but GPT-4o appears more capable in this regard. Text can be not only legible but also arranged in creative ways, such as on typewriter pages, on a movie poster, or in poetic typography. The model also appears to be adept at emulating handwriting, to the point that some prompts might create images indistinguishable from real human output.

You can even ask 4o to include doodles in the margins.

4 Native Mac and Windows apps

Quicker, more powerful access

Aside from the web version of ChatGPT, there’s now a dedicated Mac app with keyboard shortcut and screenshot support, currently restricted to Plus subscribers. A Windows app should be available by the end of 2024. It could be that OpenAI isn’t in a rush to put a first-party client in Windows 11 — GPT is, after all, the foundation of Copilot, and Microsoft probably doesn’t want its integrated Windows tech upstaged.

5 Everyone can access GPT-4o for free

Down with gatekeeping

In a way, this may actually be the biggest advancement. OpenAI has traditionally gated the most cutting-edge versions of GPT, but 4o is free to every ChatGPT user from the start. The main limitations are on real-time voice conversation, which will be restricted to Plus subscribers once it actually rolls out, and on the number of prompts you can use. ChatGPT Plus and Team subscribers get five times as many prompts, which matters a great deal, since conversations revert to GPT-3.5 once your prompt limit is hit. You may need Plus if you expect GPT-4o to behave like the computer on the Enterprise.

FAQ

Q: What is GPT-4o?

GPT-4o is an evolution of the GPT-4 AI model, currently used in services like OpenAI’s own ChatGPT. The “o” stands for “omni,” not because it’s omniscient, but because it unifies voice, text, and vision. That contrasts with GPT-4, which is mostly about typed text interactions, exceptions like image generation and speech-to-text transcription notwithstanding.

Q: How and when is GPT-4o going to be available?

The model is coming to all tiers of ChatGPT as of May 13, including free users. There are some catches here: ChatGPT Plus and Team subscribers get five times as many prompts, and for everyone, conversations fall back to GPT-3.5 once prompt limits are hit. Also, the new voice functions are initially deploying only to Plus subscribers, and only in an early alpha state sometime before the end of June. We’ll see 4o enterprise features introduced around the same time.

It’s not clear when we’ll see GPT-4o migrate outside of ChatGPT, for example to Microsoft Copilot. But OpenAI is opening the chatbots in the GPT Store to free users, and it would be odd if third parties didn’t leap on technology easily accessible through ChatGPT. The company is being cautious, however — for its voice and video tech, it’s beginning with “a small group of trusted partners,” citing the possibility of abuse.

About the Author: Early Bird