Overview

Statistical aggregate functions compute summary statistics over a set of rows, such as correlation, covariance, linear regression coefficients, standard deviation, and variance. Redpanda SQL supports the following functions:

Functions	Description
`corr()`	Calculates the Pearson correlation coefficient for a set of number pairs
`covar_pop()`	Calculates the population covariance for a set of number pairs
`covar_samp()`	Calculates the sample covariance for a set of number pairs
`regr_avgx()`	Calculates the average of the independent variable (sum(X)/N)
`regr_avgy()`	Calculates the average of the dependent variable (sum(Y)/N)
`regr_count()`	Calculates the number of input rows in which both expressions are non-null
`regr_intercept()`	Calculates the y-intercept of the univariate linear regression line for a group of data points
`regr_r2()`	Calculates the coefficient of determination (R²) for a linear regression model
`regr_slope()`	Calculates the slope of the least-squares-fit linear equation determined by the (X, Y) pairs
`regr_sxx()`	Calculates sum(X²) - sum(X)²/N (the “sum of squares” of the independent variable)
`regr_sxy()`	Calculates sum(X*Y) - sum(X) * sum(Y)/N (the “sum of products” of the independent and dependent variables)
`regr_syy()`	Calculates sum(Y²) - sum(Y)²/N (the “sum of squares” of the dependent variable)
`stddev()`	Calculates the sample standard deviation of a set of numeric values
`stddev_pop()`	Calculates the population standard deviation of the input values
`stddev_samp()`	Calculates the sample standard deviation of the input values
`variance()`	Calculates the sample variance of a set of numeric values
`var_pop()`	Calculates the population variance of the input values (square of the population standard deviation)
`var_samp()`	Calculates the sample variance of the input values (square of the sample standard deviation)

corr()

Calculates the Pearson correlation coefficient for a set of number pairs.
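
To make the definition concrete, here is a minimal plain-Python sketch of the Pearson formula. It is a reference calculation, not Redpanda SQL's implementation. The helper name `pearson_corr`, the `(y, x)` tuple order, and the choice to skip pairs containing `None` are assumptions made for this illustration.

```python
import math

def pearson_corr(pairs):
    """Pearson correlation of (y, x) pairs (hypothetical helper).

    Assumption: pairs with None on either side are skipped.
    """
    clean = [(y, x) for y, x in pairs if y is not None and x is not None]
    n = len(clean)
    if n == 0:
        return None
    mean_y = sum(y for y, _ in clean) / n
    mean_x = sum(x for _, x in clean) / n
    # Sums of squared deviations and of cross products.
    sxy = sum((x - mean_x) * (y - mean_y) for y, x in clean)
    sxx = sum((x - mean_x) ** 2 for _, x in clean)
    syy = sum((y - mean_y) ** 2 for y, _ in clean)
    if sxx == 0 or syy == 0:
        return None  # undefined when either variable is constant
    return sxy / math.sqrt(sxx * syy)

perfect = pearson_corr([(2, 1), (4, 2), (6, 3)])   # y rises with x
inverse = pearson_corr([(3, 1), (2, 2), (1, 3)])   # y falls as x rises
```

The coefficient always lies between -1 and 1: values near 1 indicate a strong positive linear relationship, values near -1 a strong negative one.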

covar_pop()

Calculates the population covariance for a set of number pairs.
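
As a rough illustration, population covariance averages the product of each pair's deviations from the two means and divides by N. The plain-Python sketch below shows that formula. The helper name `covar_pop_ref`, the `(y, x)` tuple order, and the skipping of pairs containing `None` are assumptions of this example, not a description of Redpanda SQL internals.

```python
def covar_pop_ref(pairs):
    """Population covariance of (y, x) pairs (hypothetical helper).

    Assumption: pairs with None on either side are skipped.
    """
    clean = [(y, x) for y, x in pairs if y is not None and x is not None]
    n = len(clean)
    if n == 0:
        return None
    mean_y = sum(y for y, _ in clean) / n
    mean_x = sum(x for _, x in clean) / n
    # Divide by N because the pairs are treated as the whole population.
    return sum((x - mean_x) * (y - mean_y) for y, x in clean) / n

pop_cov = covar_pop_ref([(2, 1), (4, 2), (6, 3), (8, 4)])
```

For these four pairs the deviation products sum to 10, so the population covariance is 10 / 4 = 2.5.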

covar_samp()

Calculates the sample covariance for a set of number pairs.
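
Sample covariance uses the same sum of products as the population form but divides by N - 1 instead of N. The sketch below computes it in plain Python from the sum(X*Y) - sum(X) * sum(Y)/N expression. The helper name `covar_samp_ref`, the `(y, x)` tuple order, and returning `None` for fewer than two usable pairs are assumptions of this example.

```python
def covar_samp_ref(pairs):
    """Sample covariance of (y, x) pairs (hypothetical helper).

    Assumptions: pairs with None on either side are skipped, and the
    result is None when fewer than two pairs remain.
    """
    clean = [(y, x) for y, x in pairs if y is not None and x is not None]
    n = len(clean)
    if n < 2:
        return None
    sum_x = sum(x for _, x in clean)
    sum_y = sum(y for y, _ in clean)
    sum_xy = sum(x * y for y, x in clean)
    # "Sum of products": sum(X*Y) - sum(X) * sum(Y) / N
    sxy = sum_xy - sum_x * sum_y / n
    # Divide by N - 1 because the pairs are treated as a sample.
    return sxy / (n - 1)

samp_cov = covar_samp_ref([(2, 1), (4, 2), (6, 3), (8, 4)])
```

Here the sum of products is 60 - 10 * 20 / 4 = 10, so the sample covariance is 10 / 3, slightly larger than the population value of 10 / 4 for the same data.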

regr_avgx()

Calculates the average of the independent variable (sum(X)/N).
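
A small plain-Python sketch of sum(X)/N follows. It assumes the SQL-standard regression semantics in which N counts only the pairs where both values are present. That assumption, the helper name `regr_avgx_ref`, and the `(y, x)` tuple order are choices made for this example and are not confirmed details of Redpanda SQL.

```python
def regr_avgx_ref(pairs):
    """Average of x over (y, x) pairs (hypothetical helper).

    Assumption: a pair counts only when both y and x are not None.
    """
    xs = [x for y, x in pairs if y is not None and x is not None]
    return sum(xs) / len(xs) if xs else None

rows = [(10, 1), (None, 2), (30, 3), (40, None)]
avg_x = regr_avgx_ref(rows)  # only x = 1 and x = 3 qualify
```

Under that assumption the x value 2 is ignored because its y partner is missing, so the result is (1 + 3) / 2 = 2.0.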

regr_avgy()

Calculates the average of the dependent variable (sum(Y)/N).
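
One consequence worth illustrating: if N counts only complete pairs, as in the SQL-standard regression aggregates, this average can differ from a plain average of the dependent column. The plain-Python sketch below compares the two. The pair-skipping behaviour, the helper name `regr_avgy_ref`, and the `(y, x)` tuple order are assumptions of this example.

```python
def regr_avgy_ref(pairs):
    """Average of y over (y, x) pairs (hypothetical helper).

    Assumption: a pair counts only when both y and x are not None.
    """
    ys = [y for y, x in pairs if y is not None and x is not None]
    return sum(ys) / len(ys) if ys else None

rows = [(10, 1), (50, None), (30, 3)]
pair_avg_y = regr_avgy_ref(rows)

# Plain average of y alone, ignoring whether x is present.
all_y = [y for y, _ in rows if y is not None]
plain_avg_y = sum(all_y) / len(all_y)
```

With this data the pairwise average is (10 + 30) / 2 = 20.0, while the plain average of all three y values is 30.0, because y = 50 has no x partner.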

regr_count()

Calculates the number of input rows in which both expressions are non-null.
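
The counting rule can be sketched in a few lines of plain Python, with `None` standing in for SQL null. The helper name `regr_count_ref` and the `(y, x)` tuple order are hypothetical and used only for this illustration.

```python
def regr_count_ref(pairs):
    """Count (y, x) pairs where neither value is None (hypothetical helper)."""
    return sum(1 for y, x in pairs if y is not None and x is not None)

rows = [(1, 2), (None, 3), (4, None), (None, None), (5, 6)]
complete_pairs = regr_count_ref(rows)
total_rows = len(rows)
```

Of the five rows, only `(1, 2)` and `(5, 6)` have both values present, so the count is 2.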