# openai_chat_completion

---
title: openai_chat_completion
latest-operator-version: v26.1.4
latest-console-tag: v3.7.3
latest-connect-version: 4.93.0
latest-redpanda-tag: v26.1.9
docname: connect/components/processors/openai_chat_completion
page-component-name: cloud-data-platform
page-version: master
page-component-version: master
page-component-title: Cloud
page-relative-src-path: connect/components/processors/openai_chat_completion.adoc
page-edit-url: https://github.com/redpanda-data/cloud-docs/edit/main/modules/develop/pages/connect/components/processors/openai_chat_completion.adoc
page-git-created-date: "2024-09-09"
page-git-modified-date: "2026-05-26"
---

<!-- Source: https://docs.redpanda.com/cloud-data-platform/develop/connect/components/processors/openai_chat_completion.md -->

**Available in:** Cloud, [Self-Managed](https://docs.redpanda.com/connect/components/processors/openai_chat_completion/ "View the Self-Managed version of this component")

Generates responses to messages in a chat conversation, using the OpenAI API and external tools.

#### Common

```yml
processors:
  label: ""
  openai_chat_completion:
    server_address: https://api.openai.com/v1
    api_key: "" # No default (required)
    model: "" # No default (required)
    prompt: "" # No default (optional)
    system_prompt: "" # No default (optional)
    history: "" # No default (optional)
    image: "" # No default (optional)
    max_tokens: "" # No default (optional)
    temperature: "" # No default (optional)
    user: "" # No default (optional)
    response_format: text
    json_schema:
      name: "" # No default (required)
      description: "" # No default (optional)
      schema: "" # No default (required)
    tools: [] # No default (required)
```

#### Advanced

```yml
processors:
  label: ""
  openai_chat_completion:
    server_address: https://api.openai.com/v1
    api_key: "" # No default (required)
    model: "" # No default (required)
    prompt: "" # No default (optional)
    system_prompt: "" # No default (optional)
    history: "" # No default (optional)
    image: "" # No default (optional)
    max_tokens: "" # No default (optional)
    temperature: "" # No default (optional)
    user: "" # No default (optional)
    response_format: text
    json_schema:
      name: "" # No default (required)
      description: "" # No default (optional)
      schema: "" # No default (required)
    schema_registry:
      url: "" # No default (required)
      name_prefix: schema_registry_id_
      subject: "" # No default (required)
      refresh_interval: "" # No default (optional)
      tls:
        skip_cert_verify: false
        enable_renegotiation: false
        root_cas: ""
        root_cas_file: ""
        client_certs: []
      oauth:
        enabled: false
        consumer_key: ""
        consumer_secret: ""
        access_token: ""
        access_token_secret: ""
      basic_auth:
        enabled: false
        username: ""
        password: ""
      jwt:
        enabled: false
        private_key_file: ""
        signing_method: ""
        claims: {}
        headers: {}
    top_p: "" # No default (optional)
    frequency_penalty: "" # No default (optional)
    presence_penalty: "" # No default (optional)
    seed: "" # No default (optional)
    stop: [] # No default (optional)
    tools: [] # No default (required)
```

This processor sends user prompts to the OpenAI API, and the specified large language model (LLM) generates responses using all available context, including supplementary data provided by [external tools](#tools). By default, the processor submits the entire payload of each message as a string, unless you use the `prompt` configuration field to customize it.

To learn more about chat completion, see the [OpenAI API documentation](https://platform.openai.com/docs/guides/chat-completions) and [Examples](#Examples).
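
As a minimal sketch of the behavior described above, the following pipeline sends each message's full payload as the user prompt, because `prompt` is left unset. The environment variable name `OPENAI_API_KEY` is a placeholder, not something this page defines; see Manage Secrets for how to reference secrets in your deployment.

```yaml
pipeline:
  processors:
    - openai_chat_completion:
        server_address: https://api.openai.com/v1
        api_key: "${OPENAI_API_KEY}" # placeholder variable name (assumption)
        model: gpt-4o
        # No prompt set: the entire message payload is submitted as a string.
        tools: [] # no external tools
```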

## [](#fields)Fields

### [](#api_key)`api_key`

The secret API key for the OpenAI API.

> ⚠️ **CAUTION**
>
> This field contains sensitive information that usually shouldn’t be added to a configuration directly. For more information, see [Manage Secrets](https://docs.redpanda.com/cloud-data-platform/develop/connect/configuration/secret-management/) before adding it to your configuration.

**Type**: `string`

### [](#frequency_penalty)`frequency_penalty`

Specify a number between `-2.0` and `2.0`. Positive values penalize new tokens based on the frequency of their appearance in the text so far. This decreases the model’s likelihood of repeating the same line verbatim.

**Type**: `float`

### [](#history)`history`

Include messages from a prior conversation. You must use a Bloblang query to create an array of objects in the form of `[{"role": "user", "content": "<text>"}, {"role":"assistant", "content":"<text>"}]` where:

-   `role` is the sender of the original messages, either `system`, `user`, or `assistant`.

-   `content` is the text of the original messages.


For more information, see [Examples](#Examples).

**Type**: `string`
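
The following sketch shows one way to pass prior conversation turns. It assumes each incoming message is a JSON document with a `question` field and a `history` field that already holds an array in the form shown above; both field names are hypothetical.

```yaml
pipeline:
  processors:
    - openai_chat_completion:
        api_key: "${OPENAI_API_KEY}" # placeholder variable name (assumption)
        model: gpt-4o
        prompt: "${! this.question }" # hypothetical field holding the new user turn
        # Bloblang query that must resolve to an array of {role, content} objects.
        history: 'root = this.history' # hypothetical field holding earlier turns
        tools: []
```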

### [](#image)`image`

An optional image to submit along with the prompt. The result of the Bloblang mapping must be a byte array.

**Type**: `string`

```yaml
# Examples:
image: root = this.image.decode("base64") # decode base64 encoded image
```

### [](#json_schema)`json_schema`

The JSON schema used by the model when generating responses in `json_schema` format. To learn more about supported JSON schema features, see the [OpenAI documentation](https://platform.openai.com/docs/guides/structured-outputs/supported-schemas).

**Type**: `object`

### [](#json_schema-description)`json_schema.description`

An optional description, which helps the model understand the schema’s purpose.

**Type**: `string`

### [](#json_schema-name)`json_schema.name`

The name of the JSON schema to use.

**Type**: `string`

### [](#json_schema-schema)`json_schema.schema`

The JSON schema for the model to use when generating the output.

**Type**: `string`
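
A sketch of an inline schema, assuming input messages carry a `review` field (a hypothetical name). The schema name, description, and contents are illustrative only. The `additionalProperties` and `required` entries follow the constraints described in the OpenAI structured outputs documentation linked above.

```yaml
pipeline:
  processors:
    - openai_chat_completion:
        api_key: "${OPENAI_API_KEY}" # placeholder variable name (assumption)
        model: gpt-4o
        prompt: "Classify the sentiment of this review: ${! this.review }"
        response_format: json_schema
        json_schema:
          name: review_sentiment # illustrative schema name
          description: Sentiment classification for a product review
          # The schema is supplied as a string containing JSON.
          schema: |
            {
              "type": "object",
              "properties": {
                "sentiment": {
                  "type": "string",
                  "enum": ["positive", "neutral", "negative"]
                }
              },
              "required": ["sentiment"],
              "additionalProperties": false
            }
        tools: []
```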

### [](#max_tokens)`max_tokens`

The maximum number of tokens to generate for chat completion.

**Type**: `int`

### [](#model)`model`

The name of the OpenAI model to use.

**Type**: `string`

```yaml
# Examples:
model: gpt-4o

# ---

model: gpt-4o-mini

# ---

model: gpt-4

# ---

model: gpt-4-turbo
```

### [](#presence_penalty)`presence_penalty`

Specify a number between `-2.0` and `2.0`. Positive values penalize new tokens if they have appeared in the text so far. This increases the model’s likelihood of talking about new topics.

**Type**: `float`

### [](#prompt)`prompt`

The user prompt for which a response is generated. By default, the processor sends the entire payload as a string unless customized using this field.

This field supports [interpolation functions](https://docs.redpanda.com/cloud-data-platform/develop/connect/configuration/interpolation/#bloblang-queries).

**Type**: `string`
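
As an illustration of interpolation in this field, the sketch below builds the prompt from a single field instead of the whole payload. The `question` field is a hypothetical name for a field in the incoming JSON message.

```yaml
pipeline:
  processors:
    - openai_chat_completion:
        api_key: "${OPENAI_API_KEY}" # placeholder variable name (assumption)
        model: gpt-4o-mini
        system_prompt: "You are a concise support assistant."
        # Only this.question is sent, not the entire message payload.
        prompt: "Answer in one paragraph: ${! this.question }"
        tools: []
```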

### [](#response_format)`response_format`

Specify the configured [model’s](#model) output format.

If you choose the `json_schema` option, you must also configure a `json_schema` or `schema_registry`.

**Type**: `string`

**Default**: `text`

**Options**: `text`, `json`, `json_schema`

### [](#schema_registry)`schema_registry`

The schema registry to dynamically load schemas for model responses in `json_schema` format. Schemas must be in JSON format. To learn more about supported JSON schema features, see the [OpenAI documentation](https://platform.openai.com/docs/guides/structured-outputs/supported-schemas).

**Type**: `object`
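
A sketch of loading the response schema from a registry rather than defining it inline. The registry URL, subject name, and credential variable names are hypothetical, and the `5m` refresh interval assumes the field accepts a duration string.

```yaml
pipeline:
  processors:
    - openai_chat_completion:
        api_key: "${OPENAI_API_KEY}" # placeholder variable name (assumption)
        model: gpt-4o
        response_format: json_schema
        schema_registry:
          url: https://schema-registry.example.com # hypothetical registry URL
          subject: chat-response-value # hypothetical subject holding a JSON schema
          refresh_interval: 5m # assumed duration format
          basic_auth:
            enabled: true
            username: "${SR_USERNAME}" # placeholder variable names (assumption)
            password: "${SR_PASSWORD}"
        tools: []
```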

### [](#schema_registry-basic_auth)`schema_registry.basic_auth`

Configure basic authentication for requests from this component to your schema registry.

**Type**: `object`

### [](#schema_registry-basic_auth-enabled)`schema_registry.basic_auth.enabled`

Whether to use basic authentication in requests.

**Type**: `bool`

**Default**: `false`

### [](#schema_registry-basic_auth-password)`schema_registry.basic_auth.password`

The password to use for authentication. Used together with `username` for basic authentication or with encrypted private keys for secure access.

> ⚠️ **CAUTION**
>
> This field contains sensitive information that usually shouldn’t be added to a configuration directly. For more information, see [Manage Secrets](https://docs.redpanda.com/cloud-data-platform/develop/connect/configuration/secret-management/) before adding it to your configuration.

**Type**: `string`

**Default**: `""`

### [](#schema_registry-basic_auth-username)`schema_registry.basic_auth.username`

The username of the account credentials to authenticate as. Used together with `password` for basic authentication.

**Type**: `string`

**Default**: `""`

### [](#schema_registry-jwt)`schema_registry.jwt`

Beta

Allows you to specify JWT authentication.

**Type**: `object`

### [](#schema_registry-jwt-claims)`schema_registry.jwt.claims`

Values used to pass the identity of the authenticated entity to the service provider, in this case from this component to the schema registry.

**Type**: `object`

**Default**: `{}`

### [](#schema_registry-jwt-enabled)`schema_registry.jwt.enabled`

Whether to use JWT authentication in requests.

**Type**: `bool`

**Default**: `false`

### [](#schema_registry-jwt-headers)`schema_registry.jwt.headers`

The key/value pairs that identify the type of token and signing algorithm (optional).

**Type**: `object`

**Default**: `{}`

### [](#schema_registry-jwt-private_key_file)`schema_registry.jwt.private_key_file`

Path to a file containing the PEM-encoded private key using PKCS#1 or PKCS#8 format. The private key must be compatible with the algorithm specified in the `signing_method` field.

**Type**: `string`

**Default**: `""`

### [](#schema_registry-jwt-signing_method)`schema_registry.jwt.signing_method`

The cryptographic algorithm used to sign the JWT token. Supported algorithms include RS256, RS384, RS512, and EdDSA. This algorithm must be compatible with the private key specified in the `private_key_file` field.

**Type**: `string`

**Default**: `""`
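
The JWT fields above might be combined as in the following fragment of the `schema_registry` block. The key path, claim, and header values are hypothetical; `RS256` is one of the signing methods listed above and requires a matching RSA private key.

```yaml
schema_registry:
  url: https://schema-registry.example.com # hypothetical registry URL
  subject: chat-response-value # hypothetical subject name
  jwt:
    enabled: true
    private_key_file: /etc/connect/jwt_private_key.pem # hypothetical path
    signing_method: RS256 # must be compatible with the private key
    claims:
      sub: redpanda-connect # illustrative claim
    headers:
      kid: example-key-id # illustrative header
```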

### [](#schema_registry-name_prefix)`schema_registry.name_prefix`

A prefix to add to the schema registry name. To form the complete schema registry name, the schema ID is appended as a suffix.

**Type**: `string`

**Default**: `schema_registry_id_`

### [](#schema_registry-oauth)`schema_registry.oauth`

Configure OAuth version 1.0 to give this component authorized access to your schema registry.

**Type**: `object`

### [](#schema_registry-oauth-access_token)`schema_registry.oauth.access_token`

The value this component can use to gain access to the data in the schema registry.

**Type**: `string`

**Default**: `""`

### [](#schema_registry-oauth-access_token_secret)`schema_registry.oauth.access_token_secret`

The secret that establishes ownership of the `oauth.access_token` in OAuth 1.0 authentication.

> ⚠️ **CAUTION**
>
> This field contains sensitive information that usually shouldn’t be added to a configuration directly. For more information, see [Manage Secrets](https://docs.redpanda.com/cloud-data-platform/develop/connect/configuration/secret-management/) before adding it to your configuration.

**Type**: `string`

**Default**: `""`

### [](#schema_registry-oauth-consumer_key)`schema_registry.oauth.consumer_key`

The value used to identify this component or client to your schema registry.

**Type**: `string`

**Default**: `""`

### [](#schema_registry-oauth-consumer_secret)`schema_registry.oauth.consumer_secret`

The secret that establishes ownership of the consumer key in OAuth 1.0 authentication.

> ⚠️ **CAUTION**
>
> This field contains sensitive information that usually shouldn’t be added to a configuration directly. For more information, see [Manage Secrets](https://docs.redpanda.com/cloud-data-platform/develop/connect/configuration/secret-management/) before adding it to your configuration.

**Type**: `string`

**Default**: `""`

### [](#schema_registry-oauth-enabled)`schema_registry.oauth.enabled`

Whether to enable OAuth version 1.0 authentication for requests to the schema registry.

**Type**: `bool`

**Default**: `false`

### [](#schema_registry-refresh_interval)`schema_registry.refresh_interval`

How frequently to poll the schema registry for updates. If not specified, the schema does not refresh automatically.

**Type**: `string`

### [](#schema_registry-subject)`schema_registry.subject`

The subject name used to fetch the schema from the schema registry.

**Type**: `string`

### [](#schema_registry-tls)`schema_registry.tls`

Configure Transport Layer Security (TLS) settings to secure network connections. This includes options for standard TLS as well as mutual TLS (mTLS) authentication where both client and server authenticate each other using certificates. Key configuration options include `enabled` to enable TLS, `client_certs` for mTLS authentication, `root_cas`/`root_cas_file` for custom certificate authorities, and `skip_cert_verify` for development environments.

**Type**: `object`

### [](#schema_registry-tls-client_certs)`schema_registry.tls.client_certs[]`

A list of client certificates for mutual TLS (mTLS) authentication. Configure this field to enable mTLS, authenticating the client to the server with these certificates.

You must set `tls.enabled: true` for the client certificates to take effect.

**Certificate pairing rules**: For each certificate item, provide either:

-   Inline PEM data using both `cert` **and** `key` or

-   File paths using both `cert_file` **and** `key_file`.


Mixing inline and file-based values within the same item is not supported.

**Type**: `object`

**Default**: `[]`

```yaml
# Examples:
client_certs:
  - cert: foo
    key: bar

# ---

client_certs:
  - cert_file: ./example.pem
    key_file: ./example.key
```

### [](#schema_registry-tls-client_certs-cert)`schema_registry.tls.client_certs[].cert`

A plain text certificate to use.

**Type**: `string`

**Default**: `""`

### [](#schema_registry-tls-client_certs-cert_file)`schema_registry.tls.client_certs[].cert_file`

The path of a certificate to use.

**Type**: `string`

**Default**: `""`

### [](#schema_registry-tls-client_certs-key)`schema_registry.tls.client_certs[].key`

A plain text certificate key to use.

> ⚠️ **CAUTION**
>
> This field contains sensitive information that usually shouldn’t be added to a configuration directly. For more information, see [Manage Secrets](https://docs.redpanda.com/cloud-data-platform/develop/connect/configuration/secret-management/) before adding it to your configuration.

**Type**: `string`

**Default**: `""`

### [](#schema_registry-tls-client_certs-key_file)`schema_registry.tls.client_certs[].key_file`

The path of a certificate key to use.

**Type**: `string`

**Default**: `""`

### [](#schema_registry-tls-client_certs-password)`schema_registry.tls.client_certs[].password`

A plain text password for when the private key is password encrypted in PKCS#1 or PKCS#8 format. The obsolete `pbeWithMD5AndDES-CBC` algorithm is not supported for the PKCS#8 format.

Because the obsolete `pbeWithMD5AndDES-CBC` algorithm does not authenticate the ciphertext, it is vulnerable to padding oracle attacks that can let an attacker recover the plaintext.

> ⚠️ **CAUTION**
>
> This field contains sensitive information that usually shouldn’t be added to a configuration directly. For more information, see [Manage Secrets](https://docs.redpanda.com/cloud-data-platform/develop/connect/configuration/secret-management/) before adding it to your configuration.

**Type**: `string`

**Default**: `""`

```yaml
# Examples:
password: foo

# ---

password: ${KEY_PASSWORD}
```

### [](#schema_registry-tls-enable_renegotiation)`schema_registry.tls.enable_renegotiation`

Whether to allow the remote server to request renegotiation. Enable this option if you’re seeing the error message `local error: tls: no renegotiation`.

**Type**: `bool`

**Default**: `false`

### [](#schema_registry-tls-root_cas)`schema_registry.tls.root_cas`

Specify a root certificate authority to use (optional). This is a string that represents a certificate chain from the parent-trusted root certificate, through possible intermediate signing certificates, to the host certificate. Use either this field for inline certificate data or `root_cas_file` for file-based certificate loading.

> ⚠️ **CAUTION**
>
> This field contains sensitive information that usually shouldn’t be added to a configuration directly. For more information, see [Manage Secrets](https://docs.redpanda.com/cloud-data-platform/develop/connect/configuration/secret-management/) before adding it to your configuration.

**Type**: `string`

**Default**: `""`

```yaml
# Examples:
root_cas: |-
  -----BEGIN CERTIFICATE-----
  ...
  -----END CERTIFICATE-----
```

### [](#schema_registry-tls-root_cas_file)`schema_registry.tls.root_cas_file`

Specify the path to a root certificate authority file (optional). This is a file, often with a `.pem` extension, which contains a certificate chain from the parent-trusted root certificate, through possible intermediate signing certificates, to the host certificate. Use either this field for file-based certificate loading or `root_cas` for inline certificate data.

**Type**: `string`

**Default**: `""`

```yaml
# Examples:
root_cas_file: ./root_cas.pem
```

### [](#schema_registry-tls-skip_cert_verify)`schema_registry.tls.skip_cert_verify`

Whether to skip server-side certificate verification. Set to `true` only for testing environments as this reduces security by disabling certificate validation. When using self-signed certificates or in development, this may be necessary, but should never be used in production. Consider using `root_cas` or `root_cas_file` to specify trusted certificates instead of disabling verification entirely.

**Type**: `bool`

**Default**: `false`

### [](#schema_registry-url)`schema_registry.url`

The base URL of the schema registry service.

**Type**: `string`

### [](#seed)`seed`

When set to a specific number, Redpanda Connect attempts to generate consistent responses for requests that use the same prompt, seed, and parameters.

**Type**: `int`

### [](#server_address)`server_address`

The OpenAI API endpoint to which the processor sends requests. Update the default value to use a different OpenAI-compatible service.

**Type**: `string`

**Default**: `https://api.openai.com/v1`
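
A sketch of pointing the processor at a different OpenAI-compatible service. The host, model name, and variable name are hypothetical; substitute the endpoint and model identifiers your service actually exposes.

```yaml
pipeline:
  processors:
    - openai_chat_completion:
        server_address: https://llm.example.com/v1 # hypothetical OpenAI-compatible endpoint
        api_key: "${LLM_API_KEY}" # placeholder variable name (assumption)
        model: example-model # hypothetical model name served by that endpoint
        tools: []
```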

### [](#stop)`stop[]`

Specify up to four stop sequences to use. When the model encounters a stop pattern, it stops generating text and returns the final response.

**Type**: `array`

### [](#system_prompt)`system_prompt`

The system prompt to submit along with the user prompt. This field supports [interpolation functions](https://docs.redpanda.com/cloud-data-platform/develop/connect/configuration/interpolation/#bloblang-queries).

**Type**: `string`

### [](#temperature)`temperature`

Choose a sampling temperature between `0` and `2`:

-   Higher values, such as `0.8`, make the output more random.

-   Lower values, such as `0.2`, make the output more focused and deterministic.


Redpanda recommends adding a value for this field or [`top_p`](#top_p), but not both.

**Type**: `float`
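
For illustration, the sketch below favors focused, repeatable output by setting a low `temperature` and leaving `top_p` unset, in line with the recommendation above. The specific values for `max_tokens` and `seed` are arbitrary examples, not recommended settings.

```yaml
pipeline:
  processors:
    - openai_chat_completion:
        api_key: "${OPENAI_API_KEY}" # placeholder variable name (assumption)
        model: gpt-4o
        temperature: 0.2 # lower value: more focused and deterministic
        max_tokens: 256 # arbitrary cap on generated tokens
        seed: 42 # arbitrary value; same prompt, seed, and parameters aim for consistent output
        # top_p is intentionally omitted because temperature is set.
        tools: []
```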

### [](#tools)`tools[]`

External tools the model can invoke, such as functions, APIs, or web browsing. You can build a series of processors that include definitions of these tools, and the specified model can choose when to invoke them to help answer a prompt. For more information, see [Examples](#Examples).

> 📝 **NOTE**
>
> If you don’t want to use external tools, enter an empty array: `tools: []`.

**Type**: `object`

### [](#tools-description)`tools[].description`

A description of this tool. The LLM uses it to decide whether to invoke the tool.

**Type**: `string`

### [](#tools-name)`tools[].name`

The name of this tool.

**Type**: `string`

### [](#tools-parameters)`tools[].parameters`

The parameters the LLM needs to provide to invoke this tool.

**Type**: `object`

**Default**: `[]`

### [](#tools-parameters-properties)`tools[].parameters.properties`

The properties for the processor’s input data.

**Type**: `object`

### [](#tools-parameters-properties-description)`tools[].parameters.properties.description`

A description of this parameter.

**Type**: `string`

### [](#tools-parameters-properties-enum)`tools[].parameters.properties.enum[]`

Specifies that this parameter is an enum and only these specific values should be used.

**Type**: `array`

**Default**: `[]`

### [](#tools-parameters-properties-type)`tools[].parameters.properties.type`

The type of this parameter.

**Type**: `string`

### [](#tools-parameters-required)`tools[].parameters.required[]`

The required parameters for this pipeline.

**Type**: `array`

**Default**: `[]`

### [](#tools-processors)`tools[].processors[]`

The pipeline to execute when the LLM uses this tool.

**Type**: `processor`
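
The tool fields above might fit together as in the following sketch. The tool name, parameter name, and the stubbed `mapping` pipeline are hypothetical, and the sketch assumes `parameters.properties` is keyed by parameter name. A real tool would typically call an API or query a data store instead of returning a fixed string.

```yaml
pipeline:
  processors:
    - openai_chat_completion:
        api_key: "${OPENAI_API_KEY}" # placeholder variable name (assumption)
        model: gpt-4o
        prompt: "${! this.question }" # hypothetical input field
        tools:
          - name: GetOrderStatus # hypothetical tool name
            description: Look up the shipping status of an order by its ID.
            parameters:
              required: ["order_id"]
              properties:
                order_id: # hypothetical parameter the model must supply
                  type: string
                  description: The ID of the order to look up.
            processors:
              # Stub pipeline: receives the model's arguments as the message
              # and returns a fixed status string for illustration.
              - mapping: |
                  root = "Order %s has shipped".format(this.order_id)
```

The model decides whether to invoke the tool based on its `description`, so a specific description tends to matter more than the tool name.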

### [](#top_p)`top_p`

An alternative to sampling with temperature, called nucleus sampling, where the model considers the results of the tokens with `top_p` probability mass. For example, a `top_p` of `0.1` means only the tokens comprising the top 10% probability mass are sampled.

Redpanda recommends adding a value for this field or `temperature`, but not both.

**Type**: `float`

### [](#user)`user`

A unique identifier that represents the end-user generating the prompt. This value can help OpenAI monitor and detect [platform abuse](https://openai.com/policies/usage-policies/). This field supports [interpolation functions](https://docs.redpanda.com/cloud-data-platform/develop/connect/configuration/interpolation/#bloblang-queries).

**Type**: `string`
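
A brief sketch of setting this field per message through interpolation. It assumes each message carries a `user_id` field, which is a hypothetical name.

```yaml
pipeline:
  processors:
    - openai_chat_completion:
        api_key: "${OPENAI_API_KEY}" # placeholder variable name (assumption)
        model: gpt-4o
        prompt: "${! this.question }" # hypothetical input field
        user: "${! this.user_id }" # hypothetical field identifying the end user
        tools: []
```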
