VENDORiQ: Inception’s Mercury Diffusion Models: Is it time for the big game?

Inception’s updates to its Mercury models, including a larger context window, tool calling, and data privacy controls, make them a viable enterprise-grade alternative to established sequential LLMs.

The Latest

Inception has announced several updates to its Mercury and Mercury Coder models and the associated API platform. Key enhancements include an expanded context window, now supporting 128,000 tokens for both models. Mercury is a relatively new entrant in the AI space. It differentiates itself by using a ‘parallel diffusion process’ (similar to the approach used in image generation) to generate text faster than the better-known models from OpenAI, Anthropic and Meta, which rely on sequential, token-by-token generation.

The enhanced Mercury models have also gained tool calling and structured output generation, alongside the introduction of ‘non-zero temperature’ (i.e. creativity) settings for greater control over output variability.

From a platform perspective, Inception has introduced a free tier providing 10 million tokens upon account creation, new billing limit controls with an 80 per cent usage alert and hard blocking, and a user-configurable option to opt out of having data used for model training.

Why it Matters

The announced updates address several critical areas for organisations looking to integrate advanced text diffusion models into their operations. 

  • The increase to a 128K token context window positions Inception’s Mercury models alongside leading industry offerings. This capacity is particularly relevant for use cases requiring the processing of extensive documents, such as legal contracts, research papers, or entire code repositories, enabling more comprehensive analysis and summarisation within a single interaction. However, organisations should assess whether the larger context window translates directly into improved cost-performance for their specific tasks, as the effective utilisation of very long contexts remains a technical challenge across the industry.
  • The introduction of tool calling and structured output capabilities signifies a maturing of the Mercury API for enterprise deployment. Tool calling empowers AI systems to interact more dynamically with external applications and services, laying the groundwork for more complex, automated workflows. Structured outputs, in turn, are fundamental for integrating model responses reliably into existing software pipelines, reducing the need for post-processing and enhancing data integrity. These features move the models beyond mere conversational agents, enabling them to become integral components of automated business processes (a minimal request sketch illustrating these features follows this list).
  • The ability to set a non-zero temperature offers developers more granular control over model output, allowing for a balance between deterministic responses for accuracy-critical tasks and more creative or diverse outputs for generative applications. This flexibility is a standard expectation for advanced model APIs.
  • The new commercial model is noteworthy. The provision of a free tier enables development teams to experiment and prototype diffusion solutions without initial financial commitment, potentially lowering the barrier to entry for new projects. The implementation of billing limits with alerts and hard stops directly addresses a primary concern for IT leadership regarding cost predictability in consumption-based cloud services. 
  • The opt-out for data usage in model training provides enterprises with a crucial data privacy control. This feature is vital for organisations handling sensitive or proprietary data, offering a mechanism to mitigate the risk of unintended data exposure or proprietary information being absorbed into future model training. CIOs and CISOs will likely view this as a necessary control for ensuring compliance and maintaining data sovereignty when utilising third-party AI services.
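
To make these API-level capabilities concrete, the following is a minimal sketch of a single request combining tool calling, structured (JSON) output, and a non-zero temperature. It assumes an OpenAI-compatible chat completions endpoint; the base URL, model identifier, and the fetch_clause tool are illustrative assumptions, not confirmed details of Inception’s API.

```python
# Minimal sketch: tool calling, structured output and non-zero temperature in
# one request. The endpoint URL, model name and tool are assumptions based on
# the common OpenAI-compatible convention, not confirmed details of Mercury's API.
import os

import requests

API_URL = "https://api.inceptionlabs.ai/v1/chat/completions"  # assumed endpoint
headers = {"Authorization": f"Bearer {os.environ['INCEPTION_API_KEY']}"}

payload = {
    "model": "mercury",  # assumed model identifier
    "messages": [
        {"role": "user", "content": "Summarise clause 14 of the supply contract."}
    ],
    # Tool calling: the model may ask to invoke this (hypothetical) function
    # instead of answering directly, enabling automated workflows.
    "tools": [{
        "type": "function",
        "function": {
            "name": "fetch_clause",
            "description": "Retrieve a contract clause by number",
            "parameters": {
                "type": "object",
                "properties": {"clause_number": {"type": "integer"}},
                "required": ["clause_number"],
            },
        },
    }],
    # Structured output: constrain the final answer to valid JSON so downstream
    # systems can consume it without post-processing.
    "response_format": {"type": "json_object"},
    # Non-zero temperature: allow some output variability; values near 0 keep
    # responses close to deterministic for accuracy-critical tasks.
    "temperature": 0.3,
}

response = requests.post(API_URL, headers=headers, json=payload, timeout=60)
response.raise_for_status()
message = response.json()["choices"][0]["message"]
# In practice, check for a tool call before parsing the JSON content.
print(message.get("tool_calls") or message.get("content"))
```

If the response contains a tool call, the calling application executes the named function and returns the result in a follow-up message; this request-execute-respond loop is what allows the model to participate in multi-step, automated workflows.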

Taken together, Inception’s announcements bring Mercury into the ‘big game’, moving diffusion models from an intriguing alternative to a viable contender for production workloads.

Who’s Impacted?

  • AI/ML and Development Teams: Should explore how Inception’s diffusion models can be applied, and compare their performance against the more traditional sequential models from OpenAI and other major LLM vendors.
  • Finance & Procurement Teams: Should be aware that the free tier and billing limit functionalities, while useful for managing experimentation costs, should not be relied upon as the basis for a proof-of-concept.

Next Steps

  • Pilot the new models with tool calling and structured output to assess integration complexity and reliability for specific workflows.
  • Evaluate whether diffusion models provide performance or quality improvements over the sequential models currently in use.
  • Review the data usage opt-out policy and its implications for enterprise data governance and regulatory compliance before onboarding sensitive datasets to the platform.
  • Establish internal cost monitoring and allocation strategies, leveraging the AI vendor’s billing limit features to manage expenditure and prevent unexpected costs.
