Overview

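Statistical aggregate functions compute summary statistics, such as correlation, covariance, linear regression coefficients, standard deviation, and variance, over a set of input rows. Redpanda SQL supports the following functions: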

| Function | Description |
| --- | --- |
| `corr()` | Calculates the Pearson correlation coefficient of a set of number pairs |
| `covar_pop()` | Calculates the population covariance of a set of number pairs |
| `covar_samp()` | Calculates the sample covariance of a set of number pairs |
| `regr_avgx()` | Calculates the average of the independent variable (`sum(X)/N`) |
| `regr_avgy()` | Calculates the average of the dependent variable (`sum(Y)/N`) |
| `regr_count()` | Calculates the number of input rows in which both expressions are non-null |
| `regr_intercept()` | Calculates the y-intercept of the univariate linear regression line for a group of data points |
| `regr_r2()` | Calculates the coefficient of determination (R²) for a linear regression model |
| `regr_slope()` | Calculates the slope of the least-squares-fit linear equation determined by the (X, Y) pairs |
| `regr_sxx()` | Calculates `sum(X^2) - sum(X)^2/N` (“sum of squares” of the independent variable) |
| `regr_sxy()` | Calculates `sum(X*Y) - sum(X) * sum(Y)/N` (“sum of products” of the independent and dependent variables) |
| `regr_syy()` | Calculates `sum(Y^2) - sum(Y)^2/N` (“sum of squares” of the dependent variable) |
| `stddev()` | Calculates the sample standard deviation of a set of numeric values |
| `stddev_pop()` | Calculates the population standard deviation of the input values |
| `stddev_samp()` | Calculates the sample standard deviation of the input values |
| `variance()` | Calculates the sample variance of a set of numeric values |
| `var_pop()` | Calculates the population variance of the input values (square of the population standard deviation) |
| `var_samp()` | Calculates the sample variance of the input values (square of the sample standard deviation) |
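
The formulas in the descriptions above can be checked by hand outside the database. The following Python sketch recomputes several of the aggregates from their textbook definitions for a small made-up sample. It illustrates the arithmetic only and is not Redpanda SQL's implementation. The sample rows and variable names are hypothetical, and `None` stands in for SQL `NULL`.

```python
import math
import statistics

# Hypothetical sample of (y, x) rows; None stands in for SQL NULL.
rows = [(2.0, 1.0), (4.0, 2.0), (5.0, 3.0), (None, 4.0), (9.0, 5.0)]

# Keep only rows where both values are non-null, which is what
# regr_count() counts according to its description.
pairs = [(y, x) for y, x in rows if y is not None and x is not None]
n = len(pairs)                                   # regr_count
ys = [y for y, _ in pairs]
xs = [x for _, x in pairs]

avgx = sum(xs) / n                               # regr_avgx: sum(X)/N
avgy = sum(ys) / n                               # regr_avgy: sum(Y)/N

# Sums of squares and products, written exactly as in the descriptions.
sxx = sum(x * x for x in xs) - sum(xs) ** 2 / n  # regr_sxx
syy = sum(y * y for y in ys) - sum(ys) ** 2 / n  # regr_syy
sxy = sum(x * y for x, y in zip(xs, ys)) - sum(xs) * sum(ys) / n  # regr_sxy

# Least-squares line y = slope * x + intercept, and goodness of fit.
slope = sxy / sxx                                # regr_slope
intercept = avgy - slope * avgx                  # regr_intercept
r2 = sxy ** 2 / (sxx * syy)                      # regr_r2
corr = sxy / math.sqrt(sxx * syy)                # corr (Pearson)

# Covariance: population divides by N, sample divides by N - 1.
covar_pop = sxy / n
covar_samp = sxy / (n - 1)

# Variance and standard deviation of the y values.
var_pop = syy / n
var_samp = syy / (n - 1)
stddev_pop = math.sqrt(var_pop)
stddev_samp = math.sqrt(var_samp)

# Cross-check against Python's standard library.
assert math.isclose(var_pop, statistics.pvariance(ys))
assert math.isclose(var_samp, statistics.variance(ys))

print({"n": n, "slope": slope, "intercept": intercept, "r2": r2, "corr": corr})
```

The population and sample variants differ only in the divisor (N versus N - 1), so they diverge most on small inputs. Under the definitions above, the square of `corr` equals `r2` for a single independent variable.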