Cloud

regr_sxy

The regr_sxy() aggregate function calculates the sum of products of deviations for the dependent variable (y) and the independent variable (x) in a linear regression analysis. This value represents the covariance-like term used to compute the slope of the regression line.

Syntax

The syntax for this function is:

REGR_SXY(y, x)

Parameters

  • y: Variable being predicted.

  • x: Variable used for prediction.

Examples

This example uses a simplified version of the film table from the Pagila database, containing only the title, length and rating columns. The complete schema for the film table can be found on the Pagila database website.

DROP TABLE IF EXISTS film;
CREATE TABLE film (
  title text NOT NULL,
  length int,
  rating int
);
INSERT INTO film(title, length, rating) VALUES
  ('ATTRACTION NEWTON', 83, 5),
  ('CHRISTMAS MOONSHINE', 150, 7),
  ('DANGEROUS UPTOWN', 121, 4),
  ('KILL BROTHERHOOD', 54, 3),
  ('HALLOWEEN NUTS', 47, 5),
  ('HOURS RAGE', 122, 7),
  ('PIANIST OUTFIELD', 136, 7),
  ('PICKUP DRIVING', 77, 3),
  ('INDEPENDENCE HOTEL', 157, 7),
  ('PRIVATE DROP', 106, 4),
  ('SAINTS BRIDE', 125, 3),
  ('FOREVER CANDIDATE', 131, 7),
  ('MILLION ACE', 142, 5),
  ('SLEEPY JAPANESE', 137, 4),
  ('WRATH MILE', 176, 7),
  ('YOUTH KICK', 179, 7),
  ('CLOCKWORK PARADISE', 143, 5);

This query uses the regr_sxy() function to calculate the sum of products of deviations for non-null pair of rating and length:

SELECT
    REGR_SXY(rating, length) AS SumOfSquaresXY
FROM film;

The query returns:

  sumofsquaresxy
-------------------
 612.4705882352937
(1 row)