# regr_avgy

> For the complete documentation index, see [llms.txt](https://docs.redpanda.com/llms.txt). Component-specific: [cloud-data-platform-full.txt](https://docs.redpanda.com/cloud-data-platform-full.txt)

---
title: regr_avgy
latest-operator-version: v26.1.4
latest-console-tag: v3.7.3
latest-connect-version: 4.93.0
latest-redpanda-tag: v26.1.9
docname: sql/sql-functions/aggregate-functions/statistics/regr-avgy
page-component-name: cloud-data-platform
page-version: master
page-component-version: master
page-component-title: Cloud
page-relative-src-path: sql/sql-functions/aggregate-functions/statistics/regr-avgy.adoc
page-edit-url: https://github.com/redpanda-data/cloud-docs/edit/main/modules/reference/pages/sql/sql-functions/aggregate-functions/statistics/regr-avgy.adoc
description: The `regr_avgy()` aggregate function calculates the mean of the dependent variable (y) for non-null pairs of dependent (y) and independent (x) variables
page-topic-type: reference
page-git-created-date: "2026-05-26"
page-git-modified-date: "2026-05-26"
---

<!-- Source: https://docs.redpanda.com/cloud-data-platform/reference/sql/sql-functions/aggregate-functions/statistics/regr-avgy.md -->

The `regr_avgy()` aggregate function calculates the mean of the dependent variable (y) for non-null pairs of dependent (y) and independent (x) variables. This function is used in linear regression analysis to compute the average value of the dependent variable where both variables are not NULL.

## [](#syntax)Syntax

The syntax for this function is:

```sql
REGR_AVGY(y, x)
```

## [](#parameters)Parameters

-   `y`: Variable being predicted.

-   `x`: Variable used for prediction.


## [](#examples)Examples

The following example uses a simplified version of the `film` table from the Pagila database, containing only the `title`, `length` and `rating` columns. The complete schema for the `film` table can be found on the [Pagila](https://www.postgresql.org/ftp/projects/pgFoundry/dbsamples/pagila/pagila/) database website.

```sql
DROP TABLE IF EXISTS film;
CREATE TABLE film (
  title text NOT NULL,
  length int,
  rating int
);
INSERT INTO film(title, length, rating) VALUES
  ('ATTRACTION NEWTON', 83, 5),
  ('CHRISTMAS MOONSHINE', 150, 7),
  ('DANGEROUS UPTOWN', 121, 4),
  ('KILL BROTHERHOOD', 54, 3),
  ('HALLOWEEN NUTS', 47, 5),
  ('HOURS RAGE', 122, 7),
  ('PIANIST OUTFIELD', 136, 7),
  ('PICKUP DRIVING', 77, 3),
  ('INDEPENDENCE HOTEL', 157, 7),
  ('PRIVATE DROP', 106, 4),
  ('SAINTS BRIDE', 125, 3),
  ('FOREVER CANDIDATE', 131, 7),
  ('MILLION ACE', 142, 5),
  ('SLEEPY JAPANESE', 137, 4),
  ('WRATH MILE', 176, 7),
  ('YOUTH KICK', 179, 7),
  ('CLOCKWORK PARADISE', 143, 5);
```

The following query uses the `regr_avgy()` function to calculate the mean of the dependent variable (`rating`) for rows where both `rating` and `length` are not NULL:

```sql
SELECT
    REGR_AVGY(rating, length) AS AverageRating
FROM film;
```

The query returns:

```sql
   averagerating
-------------------
 5.294117647058823
(1 row)
```