gcp_vertex_ai_embeddings

Beta

Generates vector embeddings to represent a text string, using the Vertex AI API.

# Configuration fields, showing default values
label: ""
gcp_vertex_ai_embeddings:
  project: "" # No default (required)
  credentials_json: "" # No default (optional)
  location: us-central1
  model: text-embedding-004 # No default (required)
  task_type: RETRIEVAL_DOCUMENT
  text: "" # No default (optional)
  output_dimensions: 0 # No default (optional)

This processor sends text strings to the Vertex AI API, which generates vector embeddings for them. By default, the processor submits the entire payload of each message as a string, unless you use the text field to customize it.

For more information, see the Vertex AI documentation.
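
As a minimal sketch of the default behavior described above, the following pipeline fragment submits each message's entire payload for embedding. The project ID `my-gcp-project` is a placeholder, not a real value, and the fields left out are assumed to fall back to the defaults shown in the configuration listing.

```yaml
pipeline:
  processors:
    - gcp_vertex_ai_embeddings:
        project: my-gcp-project   # placeholder: replace with your own project ID
        location: us-central1
        model: text-embedding-004
        # No text field is set, so the whole message payload is sent as a string.
```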

Fields

project

The ID of your Google Cloud project.

Type: string

credentials_json

Your Google service account credentials, supplied as a JSON string.

This field contains sensitive information. Review your cluster security before adding it to your configuration.

Type: string
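
Because this field is sensitive, one common approach is to keep the JSON out of the configuration file and inject it at run time. The sketch below assumes the credentials are available in an environment variable; the name `GCP_CREDENTIALS_JSON` and the project ID are hypothetical.

```yaml
pipeline:
  processors:
    - gcp_vertex_ai_embeddings:
        project: my-gcp-project   # placeholder project ID
        model: text-embedding-004
        # Assumption: the service account JSON is exported in this
        # (hypothetically named) environment variable before startup.
        credentials_json: "${GCP_CREDENTIALS_JSON}"
```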

location

The location of the Vertex AI large language model (LLM) that you want to use.

Type: string

Default: us-central1

model

The name of the LLM to use. For a full list of models, see the Vertex AI Model Garden.

Type: string

# Examples
model: text-embedding-004
model: text-multilingual-embedding-002

task_type

Use the following options to optimize embeddings that the model generates for specific use cases.

Type: string

Default: RETRIEVAL_DOCUMENT

| Option | Summary |
| --- | --- |
| CLASSIFICATION | Classify texts according to preset labels. |
| CLUSTERING | Cluster texts based on their similarities. |
| FACT_VERIFICATION | Optimize for queries that prove or disprove a fact, such as "apples grow underground". |
| QUESTION_ANSWERING | Optimize for proper questions, such as "Why is the sky blue?". |
| RETRIEVAL_DOCUMENT | Optimize for the documents being searched, also known as the corpus. |
| RETRIEVAL_QUERY | Optimize for search queries, such as "What is the best fish recipe?" or "best restaurant in Chicago". |
| SEMANTIC_SIMILARITY | Optimize for text similarity. |

For more information about task_type options, see Choose an embeddings task type.
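
To illustrate how the options pair up in a search use case, the fragment below is a sketch of a query-side pipeline: it assumes the corpus was embedded elsewhere with the default RETRIEVAL_DOCUMENT, and that incoming messages here are user search queries. The project ID is a placeholder.

```yaml
pipeline:
  processors:
    - gcp_vertex_ai_embeddings:
        project: my-gcp-project   # placeholder project ID
        model: text-embedding-004
        # Embed incoming messages as search queries rather than as documents.
        task_type: RETRIEVAL_QUERY
```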

text

The text you want to generate vector embeddings for. By default, the processor submits the entire payload as a string. This field supports interpolation functions.

Type: string
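
The following sketch shows one way to embed a single field instead of the whole payload. It assumes each message is a JSON document with a `description` field (a hypothetical name) and that the processor replaces the message content with the generated embedding. Under that assumption, wrapping the processor in a `branch` keeps the original document and stores the vector in a new `embedding` field, which is also a hypothetical name.

```yaml
pipeline:
  processors:
    - branch:
        processors:
          - gcp_vertex_ai_embeddings:
              project: my-gcp-project   # placeholder project ID
              model: text-embedding-004
              # Interpolation: embed only the (hypothetical) description field.
              text: '${! json("description") }'
        # Assumption: the branch result is the embedding; attach it to the
        # original message under a new field.
        result_map: |
          root.embedding = this
```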

output_dimensions

The maximum length (number of dimensions) of a generated vector embedding. If this value is set, generated embeddings are truncated to this size.

Type: int
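
As a short sketch, the fragment below caps each embedding at 256 dimensions, for example to match the vector size expected by a downstream index. The value 256 and the project ID are illustrative; whether a given model supports truncated output is something to confirm in the Vertex AI documentation.

```yaml
pipeline:
  processors:
    - gcp_vertex_ai_embeddings:
        project: my-gcp-project   # placeholder project ID
        model: text-embedding-004
        # Illustrative value: embeddings are truncated to at most 256 dimensions.
        output_dimensions: 256
```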