VENDORiQ: Pinecone Launches Serverless Architecture

Unlock the potential of Pinecone's serverless architecture for faster and more accurate AI applications. Discover how it revolutionises data retrieval and indexing.

The Latest

February 2024: Pinecone launched Pinecone Serverless, a new serverless architecture for its vector database that the company says lets users build faster and more accurate AI applications. According to the company, the new architecture reduces cost through usage-based pricing and no minimum cost per index.

Users can still create pod-based indexes using the new API and access indexes created on the legacy API. Multi-tenant compute removes the need for users to provision and manage infrastructure, and new indexing and retrieval algorithms improve vector search speed over blob storage.
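As a sketch, creating a serverless index with the Pinecone Python client (v3 or later) looks like the following. The index name, dimension, metric, cloud, and region values here are illustrative assumptions, not recommendations, and the API call only runs if an API key is set:

```python
import os

def serverless_index_config(name: str, dimension: int) -> dict:
    """Collect illustrative arguments for a serverless index (hypothetical defaults)."""
    return {
        "name": name,
        "dimension": dimension,
        "metric": "cosine",
        "cloud": "aws",
        "region": "us-east-1",
    }

if os.environ.get("PINECONE_API_KEY"):
    # Only attempted when credentials are available.
    from pinecone import Pinecone, ServerlessSpec

    cfg = serverless_index_config("demo-index", 1536)
    pc = Pinecone(api_key=os.environ["PINECONE_API_KEY"])
    pc.create_index(
        name=cfg["name"],
        dimension=cfg["dimension"],
        metric=cfg["metric"],
        spec=ServerlessSpec(cloud=cfg["cloud"], region=cfg["region"]),
    )
```

Pod-based indexes remain available through the same client by passing a pod specification instead of a serverless one.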

Why It’s Important

With the growth of generative AI and the vast amounts of data being fed to its models, vector databases allow unstructured data such as text, images and audio to be converted into high-dimensional vectors. This makes similarity searches faster and more accurate.
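The similarity search underlying this reduces to comparing vectors. A minimal pure-Python illustration of cosine similarity, the most common measure, using toy vectors in place of real model embeddings:

```python
import math

def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine similarity: dot(a, b) / (|a| * |b|)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Toy "embeddings": near-duplicate vectors score close to 1.0,
# orthogonal (unrelated) vectors score 0.0.
print(cosine_similarity([1.0, 0.0, 1.0], [1.0, 0.1, 0.9]))  # high
print(cosine_similarity([1.0, 0.0, 0.0], [0.0, 1.0, 0.0]))  # 0.0
```

A vector database performs this comparison at scale, indexing millions of embeddings so the nearest neighbours of a query vector can be found without a brute-force scan.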

IBRS has explored the different approaches to deploying AI. Moving to serverless API components priced on consumption can be significantly cheaper than running the same components on enterprise servers, Cloud servers, or containers.

In one experiment, IBRS created a RAG (retrieval augmented generation) AI over 4,000 documents, each around 7,000 KB in size. Building and running the model on the ‘single platforms’ of all three major hyperscale Clouds was several times more costly than using OpenAI and Pinecone directly, with service orchestration handled by a serverless API capability (from Google Cloud). Pre-packaged AI solutions built from the same types of AI components (Copilot, Google Vertex AI Search, etc.) can be more expensive, but they can be delivered more quickly.
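The RAG pattern described above can be sketched with stubbed components. In this illustration, the embedding function (a trivial character-frequency vector) and the prompt assembly are placeholders for real services such as an OpenAI embedding model and LLM, and the in-memory document list stands in for a Pinecone index:

```python
def embed(text: str) -> list[float]:
    # Placeholder for a real embedding model: a character-frequency vector.
    vec = [0.0] * 26
    for ch in text.lower():
        if "a" <= ch <= "z":
            vec[ord(ch) - ord("a")] += 1.0
    return vec

def similarity(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    na = sum(x * x for x in a) ** 0.5
    nb = sum(x * x for x in b) ** 0.5
    return dot / (na * nb) if na and nb else 0.0

def retrieve(query: str, docs: list[str], k: int = 2) -> list[str]:
    # Placeholder for a vector database query: rank docs by similarity.
    q = embed(query)
    ranked = sorted(docs, key=lambda d: similarity(q, embed(d)), reverse=True)
    return ranked[:k]

def answer(query: str, docs: list[str]) -> str:
    # Placeholder for LLM generation: assemble the augmented prompt.
    context = "\n".join(retrieve(query, docs))
    return f"Context:\n{context}\n\nQuestion: {query}"

docs = [
    "serverless pricing is usage based",
    "pod indexes need provisioning",
    "cats sleep a lot",
]
print(answer("how is serverless priced?", docs))
```

The cost observation above follows from this structure: each step (embedding, retrieval, generation) can be a pay-per-call serverless API rather than always-on provisioned infrastructure.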

Who’s Impacted

  • CIOs
  • Developers
  • Data scientists

What’s Next?

  • Assess your company’s vector database needs by evaluating current generative AI projects. Identify areas where vector search and retrieval can improve performance and accuracy. To determine data volume and complexity, consider the amount and type of data. Next, assess the update frequency and required query speed.
  • Investigate alternative service providers offering serverless vector databases. Understand their cost implications and potential savings compared to pod-based or on-premise solutions. Then, evaluate multi-tenant compute models for infrastructure management benefits.
  • Before using serverless indexes in production, review Pinecone’s current documented limitations and test thoroughly against your workload.
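When weighing usage-based against pod-based pricing in the second step above, a back-of-envelope calculation is a useful start. All rates and volumes below are hypothetical placeholders, not Pinecone’s actual prices:

```python
def monthly_usage_cost(reads: int, writes: int,
                       read_price: float, write_price: float) -> float:
    """Usage-based model: pay only for operations performed."""
    return reads * read_price + writes * write_price

def monthly_pod_cost(pods: int, hourly_rate: float, hours: int = 730) -> float:
    """Pod-based model: pay for provisioned capacity whether used or not."""
    return pods * hourly_rate * hours

# Hypothetical workload and rates for illustration only.
usage = monthly_usage_cost(reads=1_000_000, writes=100_000,
                           read_price=0.00001, write_price=0.00002)
pods = monthly_pod_cost(pods=2, hourly_rate=0.10)
print(f"usage-based: ${usage:.2f}/month, pod-based: ${pods:.2f}/month")
```

The comparison flips as query volume grows, so the break-even point for your own workload is the figure worth calculating.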

Related IBRS Advisory

  1. Serverless Computing – Revolutionise Application Architecture
  2. Know The Serverless Computing Model
