Why it Matters
The capabilities presented by CustomGPT.ai, while framed as novel, are a specific application of existing large language model technology, particularly multimodal models such as OpenAI’s GPT-4V. Several other large multimodal models offer comparable capabilities. The specific innovation here lies in integrating visual understanding directly into user interactions and internal documentation systems.
The potential use cases are many and varied. For example:
- Customer support: Integrating visual citations means agents could provide more explicit and less ambiguous answers when discussing products or troubleshooting, potentially reducing miscommunication and support call durations.
- Technical support and training: The ability to incorporate diagrams, schematics, or software screenshots could simplify complex instructions, assisting users in understanding procedures more effectively. This could reduce training time and decrease errors during operations.
- Media: Analysing supplied images to support research or to detect ‘fakes’ (AI-generated items) could be automated.
- Public sector: Citizen-provided images of council or national park assets could be analysed and maintenance actions scheduled.
But it Will Cost Ya’!
An important note here is that the new document analyst feature will incur an additional consumption-based cost. Almost all of the large North American AI vendors are bleeding cash and looking for new revenue streams. This will drive innovation not only in creating more sophisticated models, but also in how vendors leverage those models to create new ‘value-added’ services. These services, in turn, will drive up the overall cost of AI for many organisations.
Who’s Impacted?
- Chief Technology Officer (CTO): Needs to assess the technical feasibility, scalability, and long-term implications of deploying multimodal AI, particularly concerning API integration and potential vendor lock-in.
- Head of Customer Service/Support: Directly impacted by the potential for improved customer interactions, faster resolution times, and reduced agent workload through visual assistance.
- Operations Managers: Could see benefits in process documentation, quality control, and field service through visual information processing, leading to operational efficiencies.
Next Steps
Perform a detailed assessment of the financial implications of emerging business features from the major AI vendors, including subscription costs, usage-based fees, and the potential return on investment from improved efficiency or customer satisfaction. Do not rush into adopting these features, however. It is very likely that providers of core business platforms, especially those that manage customer and service interactions, will adopt similar capabilities over time.


