Google BigQuery Revolutionizes Data Search with Vector Search Functionality


  • Vector search in BigQuery enables semantic search, similarity detection, and RAG with LLMs.
  • Automatic index updates and optimized lookups enhance vector search performance.
  • Integration with LangChain simplifies Python-based workflows and third-party framework support.

In a groundbreaking move, Google announced the integration of vector search functionality into its BigQuery platform, marking a significant advancement in data and AI capabilities. This latest feature enables users to conduct vector similarity searches, essential for a wide array of data and AI applications such as semantic search, similarity detection, and retrieval-augmented generation (RAG) using large language models (LLMs).

In its preview mode, BigQuery’s vector search enables approximate nearest-neighbor search, a crucial component for various data and AI use cases. The VECTOR_SEARCH function, supported by an optimized index, streamlines the identification of closely matching embeddings through efficient lookups and distance computations.

Automatic index updates and optimization

BigQuery’s vector indexes are automatically updated, ensuring seamless integration with the latest data. The initial implementation, dubbed IVF (Inverted File for Vectors), combines a clustering model with an inverted row locator, creating a two-piece index that optimizes performance.

Google has simplified Python-based integrations with open-source and third-party frameworks by leveraging LangChain. This integration enables developers to incorporate vector search capabilities into their existing workflows seamlessly.

Expanding textual data approaches

Max Ostapenko, a senior product manager at Opera, expressed his excitement about the new feature, stating, “Just got positively surprised trying out vector search with embeddings in BigQuery! We are diving into the world of enhancing product insights with Vertex AI now. It expands your approaches to working with textual data.”

To assist users in leveraging the power of vector search, Google has provided a comprehensive tutorial. Using the Google Patents public dataset as an example, the tutorial demonstrates three distinct use cases: patent search using pre-generated embeddings, patent search with BigQuery embedding generation, and RAG via integration with generative models.

Omid Fatemieh and Michael Kilberry, engineering lead and head of product at Google, respectively, highlight the advanced capabilities of BigQuery, enabling users to extend the search cases into full RAG journeys. Specifically, users can leverage the output from VECTOR_SEARCH queries as context for invoking Google’s natural language foundation (LLM) models via BigQuery’s ML.GENERATE_TEXT function.

Google’s commitment to enhancing BigQuery extends beyond vector search. The cloud provider has announced the availability of Gemini 1.0 Pro for BigQuery customers via Vertex AI. Furthermore, a new BigQuery integration with Vertex AI for text and speech has been introduced.

Billing and pricing

While the introduction of vector search brings enhanced functionality to BigQuery users, it’s essential to note that billing for the CREATE VECTOR INDEX statement and VECTOR_SEARCH function is based on BigQuery compute pricing. For the CREATE VECTOR INDEX statement, only the indexed column is considered for calculating processed bytes, ensuring transparent and predictable user billing.

With the integration of vector search, Google BigQuery continues to push the boundaries of data analytics and AI, empowering users with powerful tools to unlock insights and drive innovation.

Disclaimer. The information provided is not trading advice. Cryptopolitan.com holds no liability for any investments made based on the information provided on this page. We strongly recommend independent research and/or consultation with a qualified professional before making any investment decisions.

Share link:

Glory Kaburu

Glory is an extremely knowledgeable journalist proficient with AI tools and research. She is passionate about AI and has authored several articles on the subject. She keeps herself abreast of the latest developments in Artificial Intelligence, Machine Learning, and Deep Learning and writes about them regularly.

Most read

Loading Most Read articles...

Stay on top of crypto news, get daily updates in your inbox

Related News

Subscribe to CryptoPolitan