VENDORiQ: CustomGPT Levels Up Business Processes with Enhanced Image Analysis

CustomGPT.ai's new visual processing allows AI agents to incorporate images (diagrams, screenshots) as visual citations into responses, leveraging existing multimodal large language model (LLM) technology.

The Latest

CustomGPT.ai has introduced an update enabling its AI agents to process and incorporate visual information. This functionality allows images to be integrated directly into conversational AI responses, serving as visual citations. 

The system is designed to permit the upload of various image types, such as product photos, technical diagrams, charts, graphs, and software screenshots, which the AI then uses to augment its explanations. A related ‘document analyst’ feature, currently in beta, extends this capability by allowing end-users to upload their own images during interactions. This feature is intended to facilitate more dynamic and visual question-answering scenarios, where the AI can analyse user-provided visuals to inform its responses. The analysis function incurs an additional cost per use.

Why it Matters

Although framed as novel, the capabilities presented by CustomGPT.ai are a specific application of existing multimodal large language model technology, such as OpenAI’s GPT-4V, and multiple large multimodal models offer comparable capability. The specific innovation here lies in integrating visual understanding directly into user interactions and internal documentation systems.

The potential use cases are many and varied. For example:

  • Customer support: Integrating visual citations means agents could provide more explicit and less ambiguous answers when discussing products or troubleshooting, potentially reducing miscommunication and support call durations.
  • Technical support and training: The ability to incorporate diagrams, schematics, or software screenshots could simplify complex instructions, assisting users in understanding procedures more effectively. This could reduce training time and decrease errors during operations. 
  • Media: Analysing supplied images to support research or to detect ‘fakes’ (AI-generated items) could be automated.
  • Public sector: Citizen-provided images of council or national park assets could be analysed and maintenance actions scheduled.

But it Will Cost Ya’!

An important note here is that the new document analyst feature will incur an additional consumption-based cost. Almost all of the large North American AI vendors are bleeding cash and looking for new revenue streams. This will drive innovation not only in creating more sophisticated models, but also in how vendors leverage those models to create new ‘value-added’ services. These services, in turn, will drive up the overall costs of AI for many organisations.

Who’s Impacted?

  • Chief Technology Officer (CTO): Needs to assess the technical feasibility, scalability, and long-term implications of deploying multimodal AI, particularly concerning API integration and potential vendor lock-in. 
  • Head of Customer Service/Support: Directly impacted by the potential for improved customer interactions, faster resolution times, and reduced agent workload through visual assistance. 
  • Operations Managers: Could see benefits in process documentation, quality control, and field service through visual information processing, leading to operational efficiencies.

Next Steps

Perform a detailed assessment of the financial implications of emerging business features from the major AI vendors, including subscription costs, usage-based fees, and the potential return on investment from improved efficiency or customer satisfaction, but do not rush into adopting them. It is very likely that providers of core business platforms, especially those that manage customer and service interactions, will adopt similar capabilities over time.
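
The financial assessment described above largely comes down to simple arithmetic, which can be sketched in a few lines of Python. This is only an illustrative sketch: the class name, field names, and every figure in the example are hypothetical placeholders, not CustomGPT.ai pricing or measured benefits, and should be replaced with an organisation's own quotes and estimates.

```python
from dataclasses import dataclass


@dataclass
class FeatureCostModel:
    """Hypothetical monthly cost/benefit model for a usage-priced AI feature."""

    monthly_subscription: float  # fixed platform fee (assumed figure)
    fee_per_use: float           # usage-based charge per analysis (assumed figure)
    uses_per_month: int          # expected number of analyses per month
    monthly_benefit: float       # estimated value of time saved, etc. (assumed)

    def monthly_cost(self) -> float:
        # Fixed subscription plus consumption charges.
        return self.monthly_subscription + self.fee_per_use * self.uses_per_month

    def monthly_net(self) -> float:
        # Estimated benefit minus total cost.
        return self.monthly_benefit - self.monthly_cost()

    def roi(self) -> float:
        # Net return expressed as a fraction of total cost.
        cost = self.monthly_cost()
        if cost <= 0:
            raise ValueError("total cost must be positive to compute ROI")
        return self.monthly_net() / cost

    def break_even_uses(self) -> float:
        # Usage volume at which usage fees consume the entire benefit,
        # holding the estimated benefit constant.
        if self.fee_per_use <= 0:
            return float("inf")
        return (self.monthly_benefit - self.monthly_subscription) / self.fee_per_use


# Placeholder inputs for illustration only.
example = FeatureCostModel(
    monthly_subscription=500.0,
    fee_per_use=0.25,
    uses_per_month=4000,
    monthly_benefit=2500.0,
)
print(f"Monthly cost:    {example.monthly_cost():.2f}")
print(f"Monthly net:     {example.monthly_net():.2f}")
print(f"ROI:             {example.roi():.1%}")
print(f"Break-even uses: {example.break_even_uses():.0f}")
```

Re-running the model with pessimistic and optimistic usage volumes gives a quick sense of how sensitive the business case is to consumption-based fees.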
