OpenAI now has an AI model with vision, and everyone else should be scared

What you need to know

  • One day before Google I/O 2024, OpenAI debuted a new AI model known as GPT-4o.
  • The “o” in GPT-4o stands for “omni,” referencing the model’s multimodal interaction capabilities. 
  • GPT-4o appears to bring the multimodal, vision-based functionality touted by companies like Humane and Rabbit to virtually any device.
  • OpenAI’s latest model has the potential to displace a handful of products and services, from the Humane AI Pin to the Google Assistant to Duolingo.

This is a big week for artificial intelligence: OpenAI held an event on Monday, May 13, and Google I/O 2024 follows on May 14 and 15. Although reports that OpenAI might be prepping a search competitor didn’t pan out, the company did launch GPT-4o. The latest model is multimodal and can process combinations of vision, text, and voice input. Though it’s still early, quick tests and demos of GPT-4o have left both users and AI researchers impressed.

Certain characteristics of GPT-4o make it more likely to displace existing products and services than any AI we’ve seen to date. Support for combined vision, text, and voice input takes the novelty factor away from hardware devices like the Humane AI Pin and the Rabbit R1. Voice response times that OpenAI claims are as quick as a human’s could make Google Assistant look outdated. Finally, rich translation and learning features could make apps like Duolingo redundant.
