VENDORiQ: OpenAI Announces GPT-4o, Its Latest Multimodal Model

Uncover the game-changing capabilities of OpenAI's GPT-4o model for revolutionising AI strategies in enterprises.

The Latest

On May 13, 2024, OpenAI announced the launch of GPT-4o, its latest AI model and a significant step towards multimodal generative AI. Unlike OpenAI's previous models, which focused on text, GPT-4o integrates audio, vision, and text inputs and outputs, enabling more natural and intuitive human-computer interactions.

Why It’s Important

The ability to process and generate information across multiple modalities is significant. IBRS has previously predicted that the future of enterprise search will leverage multimodal AI and that such services will also be heavily used in service, content creation, information classification, value extraction, and analysis. 

In addition, GPT-4o’s real-time audio response capabilities, with latencies as low as 232 milliseconds, open up new possibilities for voice assistants and conversational AI applications. They are exactly the type of service needed for next-generation workplaces, as detailed in How We’ll Work In The Year 2035.

Moreover, GPT-4o’s multilingual capabilities and enhanced performance on non-English languages broaden its potential applications in global markets. However, it should be noted that most of the large language models, including GPT-4o, still trail the quality of specialised machine translation services that leverage smaller, ‘genre-based’ models.

Significantly, GPT-4o is competitively priced, at a 50% lower cost than previous models. This makes it a far more accessible and cost-effective solution for enterprises and positions it well against the rising tide of competitive models.

IBRS believes that OpenAI, backed by ongoing heavy investment from Microsoft, will continue to aggressively ‘buy market share’ over the next two years. The company continues to operate at a loss: it lost nearly half a billion USD in 2022-2023, but saw 900% growth. Most financial analysts expect it to make a larger loss in the 2023-2024 timeframe, with slower but still considerable growth. This means that OpenAI will continue to represent exceptional value for AI services until it has established a dominant position in the market, closely aligned with Microsoft’s efforts.

However, it is essential that ICT strategists retain contestability and flexibility in AI models. Building the ability to quickly plug in and replace AI models within business processes is paramount, both to take advantage of the rapid pace of model innovation and to set up a longer-term approach to avoiding potential monopolistic practices later this decade.
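
One way to keep models replaceable is to have business processes depend on a small internal interface rather than on any vendor's SDK. The following is a minimal sketch of that idea, not a recommended implementation. All names in it (TextModel, StubModelA, StubModelB, MODEL_REGISTRY, load_model, summarise_ticket) are hypothetical, and the two stub classes stand in for real vendor adapters without calling any actual service.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Protocol


class TextModel(Protocol):
    """Hypothetical minimal interface that business processes depend on."""

    def complete(self, prompt: str) -> str: ...


@dataclass
class StubModelA:
    """Stand-in for one vendor's adapter; a real one would wrap that vendor's SDK."""

    name: str = "vendor-a"

    def complete(self, prompt: str) -> str:
        return f"[{self.name}] {prompt.upper()}"


@dataclass
class StubModelB:
    """Stand-in for a second vendor's adapter."""

    name: str = "vendor-b"

    def complete(self, prompt: str) -> str:
        return f"[{self.name}] {prompt.lower()}"


# Assumption: the active model is selected by a configuration value,
# so swapping vendors is a configuration change rather than a code change.
MODEL_REGISTRY: Dict[str, Callable[[], TextModel]] = {
    "vendor-a": StubModelA,
    "vendor-b": StubModelB,
}


def load_model(config_name: str) -> TextModel:
    """Return the adapter registered under config_name."""
    try:
        return MODEL_REGISTRY[config_name]()
    except KeyError:
        raise ValueError(f"Unknown model: {config_name!r}") from None


def summarise_ticket(model: TextModel, ticket_text: str) -> str:
    """Example business process: it sees only the TextModel interface."""
    return model.complete(f"Summarise: {ticket_text}")


if __name__ == "__main__":
    for configured in ("vendor-a", "vendor-b"):
        print(summarise_ticket(load_model(configured), "Printer offline"))
```

In this sketch, replacing a model means registering a new adapter and changing one configuration value, while summarise_ticket stays untouched. A real adapter would also need to handle vendor-specific concerns such as authentication, rate limits, and differences in output quality, which the stubs ignore.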

Who’s Impacted

  • CTO
  • AI developers
  • IT teams

What’s Next?

  • Enterprises should closely evaluate GPT-4o’s capabilities and limitations to determine its suitability for their specific use cases.
  • Establish robust policies and safeguards to mitigate risks associated with synthetic media generation and deep fakes.
  • Monitor legal and regulatory developments surrounding multimodal AI and adjust strategies accordingly.
  • Invest in upskilling and training programs that equip employees with the skills to leverage multimodal models, such as GPT-4o, effectively and responsibly.
  • Ensure that AI models are incorporated into existing processes and work practices in a manner that allows them to be replaced quickly and at a low cost.

As OpenAI continues to push the boundaries of AI capabilities, the introduction of GPT-4o marks a significant milestone in the evolution of human-computer interaction. However, it also underscores the need for proactive measures to address the ethical and societal implications of these powerful technologies.
