percent_rank
percent_rank() window function determines the relative rank of a value in a group of values, based on the ORDER BY expression in the OVER clause. It can be used with all data types supported by Redpanda SQL.
Syntax
The syntax for this function is:
PERCENT_RANK() OVER (
[PARTITION BY partition_expression]
ORDER BY sort_expression
)
The percent_rank() is calculated as:
(r - 1) / (n - 1)
Where r is the rank of the current row and n is the total number of rows in the window or partition.
Rows with equal values for the ranking criteria receive the same relative rank. The output data type for this function is double precision. The output will indicate the rank of values in a table, regardless of the input types.
-
If the optional
PARTITION BYexpression is present, the rankings are reset for each group of rows -
If the
ORDER BYexpression is omitted then all relative ranks are equal to 0
Parameters
-
(): This function does not take any arguments, but the parentheses are required. -
PARTITION BY: Optional. Divides the result set into partitions, each processed independently. If omitted, the entire result set is treated as a single partition. -
ORDER BY: Order of rows in each partition to which the function is applied.
Examples
The following examples use the winsales table that stores details about some sales transactions:
CREATE TABLE winsales(
salesid int,
dateid date,
sellerid int,
buyerid text,
qty int,
qty_shipped int);
INSERT INTO winsales VALUES
(30001, '8/2/2003', 3, 'b', 10, 10),
(10001, '12/24/2003', 1, 'c', 10, 10),
(10005, '12/24/2003', 1, 'a', 30, null),
(40001, '1/9/2004', 4, 'a', 40, null),
(10006, '1/18/2004', 1, 'c', 10, null),
(20001, '2/12/2004', 2, 'b', 20, 20),
(40005, '2/12/2004', 4, 'a', 10, 10),
(20002, '2/16/2004', 2, 'c', 20, 20),
(30003, '4/18/2004', 3, 'b', 15, null),
(30004, '4/18/2004', 3, 'b', 20, null),
(30007, '9/7/2004', 3, 'c', 30, null);
percent_rank() with ORDER BY
This example executes the percent_rank() function with ORDER BY keyword and calculates the descending percent rank of all rows based on the quantity sold:
SELECT salesid, qty,
PERCENT_RANK() OVER (ORDER BY qty DESC) AS p_rnk,
RANK() OVER (ORDER BY qty DESC) AS rnk
FROM winsales
ORDER BY 2,1;
Output that includes the sales ID along with the quantity sold and both percent and regular ranks:
salesid | qty | p_rnk | rnk
---------+-----+-------+-----
10001 | 10 | 0.7 | 8
10006 | 10 | 0.7 | 8
30001 | 10 | 0.7 | 8
40005 | 10 | 0.7 | 8
30003 | 15 | 0.6 | 7
20001 | 20 | 0.3 | 4
20002 | 20 | 0.3 | 4
30004 | 20 | 0.3 | 4
10005 | 30 | 0.1 | 2
30007 | 30 | 0.1 | 2
40001 | 40 | 0 | 1
percent_rank() with ORDER BY and PARTITION BY
This example executes the percent_rank() function with ORDER BY keyword and PARTITION BY clause, partitions the table by seller ID, orders each partition by the quantity, and assigns a percent rank to each row:
SELECT salesid, sellerid, qty,
PERCENT_RANK() OVER (PARTITION BY sellerid ORDER BY qty DESC) AS p_rnk
FROM winsales
ORDER BY 2,3,1;
The query returns:
salesid | sellerid | qty | p_rnk
---------+----------+-----+--------------------
10001 | 1 | 10 | 0.5
10006 | 1 | 10 | 0.5
10005 | 1 | 30 | 0
20001 | 2 | 20 | 0
20002 | 2 | 20 | 0
30001 | 3 | 10 | 1
30003 | 3 | 15 | 0.6666666666666666
30004 | 3 | 20 | 0.3333333333333333
30007 | 3 | 30 | 0
40005 | 4 | 10 | 1
40001 | 4 | 40 | 0
(11 rows)