With the popularity of digital platforms such as social media and mobile devices, there has been an explosion of unstructured data such as images, audio, video, and documents. This trend presents businesses and organizations with an unprecedented challenge: extracting valuable insights from large volumes of unstructured data.

 

To help businesses gain valuable insights from their data, Google has integrated BigQuery with Vertex AI. This integration supports building a new generation of AI applications: businesses can use generative AI models such as Gemini, and AI services such as Document AI and Translation AI, to process unstructured data in BigQuery object tables efficiently. BigQuery already supports data analysis with a variety of large language models (LLMs) hosted in Vertex AI, including Gemini 1.0 Pro and Gemini 1.0 Pro Vision. These models perform particularly well on analytical tasks such as text summarization and sentiment analysis, and prompting alone is usually enough to reach the goal quickly.


If prompting alone does not meet your needs, BigQuery also supports fine-tuning models with LoRA (Low-Rank Adaptation). Fine-tuning is useful when the desired model behavior is hard to define precisely in a prompt, or when prompts do not consistently produce the expected results. With LoRA fine-tuning, models can learn specific responses, adopt new behaviors, and stay up to date with the latest information.
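As a rough illustration of what such a tuning job might look like, the statement below creates a tuned remote model from a table of prompt and label pairs. This is a sketch under stated assumptions, not a definitive recipe: the model name, the training table mydataset.tuning_examples, and its column names are hypothetical, and the endpoint version and option names (max_iterations, prompt_col, input_label_cols) reflect our understanding of the BigQuery ML CREATE MODEL options, so check them against the current documentation before relying on them.

```sql
-- Sketch only: model and table names are hypothetical.
-- Assumes mydataset.tuning_examples has a prompt column and a label column.
CREATE OR REPLACE MODEL `mydataset.gemini_1_0_pro_tuned`
REMOTE WITH CONNECTION `us.bqml_llm_connection`
OPTIONS (
  endpoint = 'gemini-1.0-pro-002',  -- assumed tunable Gemini 1.0 Pro version
  max_iterations = 300,             -- assumed number of tuning steps
  prompt_col = 'prompt',            -- column holding the input prompt
  input_label_cols = ['label']      -- column holding the expected response
) AS
SELECT prompt, label
FROM `mydataset.tuning_examples`;
```

Unlike creating a plain remote model, a tuning job trains on the supplied rows, so it can take considerably longer to complete.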


Recently, Google integrated the latest Gemini models into BigQuery, along with improvements for safety and grounding support:

  1. Enhanced Gemini 1.5 model support: The ML.GENERATE_TEXT SQL function now supports the Gemini 1.5 Pro and Gemini 1.5 Flash foundation models, enabling BigQuery users to perform natural language processing (NLP) tasks, visual tasks, audio analysis, and PDF file summarization at higher quality.
  2. Enhanced AI safety and grounded responses: Google Cloud has extended the ML.GENERATE_TEXT SQL function with Grounding with Google Search and customizable Responsible AI (RAI) safety settings, allowing users to define thresholds for hate speech, dangerous content, and more to ensure safe and accurate content generation.
  3. Tuning and evaluation of the Gemini 1.0 model: Google Cloud has extended the CREATE MODEL DDL statement and the ML.EVALUATE SQL function to allow BigQuery users to fine-tune and evaluate Gemini 1.0 Pro models to further customize AI capabilities.
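To make the second item more concrete, the query below sketches how grounding and safety thresholds might be passed through the STRUCT argument of ML.GENERATE_TEXT. It assumes a remote Gemini 1.5 model named mydataset.gemini_1_5_pro already exists; the prompt text is a placeholder. The option names (ground_with_google_search, safety_settings) and the category and threshold strings follow the BigQuery ML documentation as we understand it and should be verified against the current reference.

```sql
-- Sketch only: the model name and prompt are placeholders.
SELECT *
FROM
  ML.GENERATE_TEXT(
    MODEL `mydataset.gemini_1_5_pro`,
    (
      SELECT 'Summarize the most recent changes to BigQuery pricing.' AS prompt
    ),
    STRUCT(
      TRUE AS flatten_json_output,
      -- Ask the model to ground its answer in Google Search results.
      TRUE AS ground_with_google_search,
      -- One entry per harm category; each sets the blocking threshold.
      [
        STRUCT('HARM_CATEGORY_HATE_SPEECH' AS category,
               'BLOCK_LOW_AND_ABOVE' AS threshold),
        STRUCT('HARM_CATEGORY_DANGEROUS_CONTENT' AS category,
               'BLOCK_MEDIUM_AND_ABOVE' AS threshold)
      ] AS safety_settings));
```

In this sketch, hate speech is filtered more strictly than dangerous content, because BLOCK_LOW_AND_ABOVE blocks more responses than BLOCK_MEDIUM_AND_ABOVE.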

In the upcoming sections, we will explore these new features in greater detail.


BigQuery ML & Gemini 1.5

To use Gemini 1.5 Pro in BigQuery, first create a remote model that represents the hosted Vertex AI Gemini endpoint. This step usually takes only a few seconds. Once the model is created, you can use it to generate text directly from the data in your BigQuery tables.

 

-- Create a remote model that points at the hosted Gemini 1.5 Pro endpoint.
CREATE MODEL `mydataset.gemini_1_5_pro`
REMOTE WITH CONNECTION `us.bqml_llm_connection`
OPTIONS(endpoint = 'gemini-1.5-pro');

 

With Gemini 1.5, the ML.GENERATE_TEXT() function can take a BigQuery managed table, or a query over one, as input. The query builds a prompt column by combining your instruction text with the values of each record, so the prompt is tailored for every row. The “temperature” parameter adjusts the randomness of the generated output.

SELECT *
FROM
  ML.GENERATE_TEXT(
    MODEL `mydataset.gemini_1_5_pro`,
    (
      -- Build one prompt per row from the table's columns.
      SELECT CONCAT(
               'Create a descriptive paragraph of no more than 25 words for a product in a department named ', department,
               ', category named "', category, '"',
               ' and the following name: ', name
             ) AS prompt
      FROM mydataset.my_table
    ),
    -- Higher temperature values produce more varied output.
    STRUCT(0.8 AS temperature));
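Before sending every row to the model, it can help to preview the prompts the query will build. The sketch below runs a CONCAT expression of the same shape against a single made-up row, so it needs neither the remote model nor the real table. The sample values (Men, Jeans, Slim Fit Denim) are invented purely for illustration.

```sql
-- Sketch only: the sample row is made up for illustration.
WITH sample_rows AS (
  SELECT 'Men' AS department, 'Jeans' AS category, 'Slim Fit Denim' AS name
)
SELECT CONCAT(
         'Create a descriptive paragraph of no more than 25 words for a product in a department named ', department,
         ', category named "', category, '"',
         ' and the following name: ', name
       ) AS prompt
FROM sample_rows;
```

For this row, the prompt column reads: Create a descriptive paragraph of no more than 25 words for a product in a department named Men, category named "Jeans" and the following name: Slim Fit Denim. Checking a few prompts this way makes spacing and quoting mistakes in the literals easy to spot before any model calls are made.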

 

With Gemini 1.5 models, the ML.GENERATE_TEXT() function can now handle object tables as input, enabling you to work with unstructured data like images, videos, audio files, and documents. When using object tables, the prompt is a single string included in the STRUCT option, which is applied to each object in the table one at a time.
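A minimal sketch of the object table form is shown below. It assumes the remote model mydataset.gemini_1_5_pro exists and that mydataset.my_object_table is an object table over files in Cloud Storage; the table name and the prompt text are hypothetical.

```sql
-- Sketch only: the object table name and prompt are placeholders.
SELECT *
FROM
  ML.GENERATE_TEXT(
    MODEL `mydataset.gemini_1_5_pro`,
    -- Pass the object table itself rather than a query that builds prompts.
    TABLE `mydataset.my_object_table`,
    STRUCT(
      -- The same prompt is applied to each object in the table.
      'Describe the contents of this file in one sentence.' AS prompt,
      TRUE AS flatten_json_output));
```

Because the prompt is fixed for the whole call, per-object variation comes only from the files themselves, in contrast to the managed-table form where each row contributes its own prompt text.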

 

This article is translated and adapted from the official Google Cloud blog.

Microfusion is dedicated to helping businesses gain valuable insights from data through AI technologies. In this era of rapid advances in digitization and artificial intelligence, it is vital to keep abreast of the latest technology developments. Recently, Microfusion helped a well-known chain restaurant brand in Taiwan use Google Maps public sentiment analysis to examine in depth the sentiment toward, and profile of, surrounding merchants, so that the brand could formulate precise business strategies. If you are interested in these technologies or want to learn more about specific examples of how they can be applied to your business, please feel free to contact us.

We look forward to being your digital transformation partner and bringing you the latest and hottest technology topics to help your business stand out from the competition.