
percent_rank

The percent_rank() window function returns the relative rank of a value within a group of values, based on the ORDER BY expression in the OVER clause. It can be used with all data types supported by Redpanda SQL.

Syntax

The syntax for this function is:

PERCENT_RANK() OVER (
   [PARTITION BY partition_expression]
   ORDER BY sort_expression
)

The value of percent_rank() is calculated as:

(r - 1) / (n - 1)

Where r is the rank of the current row and n is the total number of rows in the window or partition.
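
To make the formula concrete, the following sketch ranks four values in a small table. The scores table and its column v are hypothetical and exist only for this illustration. With n = 4 rows, the ranks r come out as 1, 2, 2, and 4, so the formula gives 0, 1/3, 1/3, and 1:

```sql
-- Hypothetical table used only to illustrate the formula.
CREATE TABLE scores(v int);
INSERT INTO scores VALUES (10), (20), (20), (40);

-- r is the regular rank; p_rnk should equal (r - 1) / (4 - 1).
SELECT v,
   RANK() OVER (ORDER BY v) AS r,
   PERCENT_RANK() OVER (ORDER BY v) AS p_rnk
FROM scores
ORDER BY 1;
```

The two rows with v = 20 tie at rank 2, which is why the next distinct value jumps to rank 4 and a percent rank of 1.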

Rows with equal values for the ranking criteria receive the same relative rank. The function returns double precision regardless of the input data types, and the result always falls between 0 and 1 inclusive.

  • If the optional PARTITION BY expression is present, the rankings are reset for each group of rows.

  • If the ORDER BY expression is omitted, all relative ranks are equal to 0.
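
The second point follows from the formula: without an ordering, every row is a peer of every other row, so each row has rank 1 and (1 - 1) / (n - 1) is 0. A minimal sketch, assuming the engine accepts an empty OVER clause as the note above implies, and using the winsales table defined in the Examples section:

```sql
-- Assumption: OVER () with no ORDER BY is accepted.
-- Every row is then a peer, so p_rnk should be 0 for all rows.
SELECT salesid, qty,
   PERCENT_RANK() OVER () AS p_rnk
FROM winsales;
```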

Parameters

  • (): This function does not take any arguments, but the parentheses are required.

  • PARTITION BY: Optional. Divides the result set into partitions, each processed independently. If omitted, the entire result set is treated as a single partition.

  • ORDER BY: Specifies the order of rows within each partition. Ranks are assigned according to this order.

Examples

The following examples use the winsales table that stores details about some sales transactions:

CREATE TABLE winsales(
    salesid int,
    dateid date,
    sellerid int,
    buyerid text,
    qty int,
    qty_shipped int);
INSERT INTO winsales VALUES
    (30001, '8/2/2003', 3, 'b', 10, 10),
    (10001, '12/24/2003', 1, 'c', 10, 10),
    (10005, '12/24/2003', 1, 'a', 30, null),
    (40001, '1/9/2004', 4, 'a', 40, null),
    (10006, '1/18/2004', 1, 'c', 10, null),
    (20001, '2/12/2004', 2, 'b', 20, 20),
    (40005, '2/12/2004', 4, 'a', 10, 10),
    (20002, '2/16/2004', 2, 'c', 20, 20),
    (30003, '4/18/2004', 3, 'b', 15, null),
    (30004, '4/18/2004', 3, 'b', 20, null),
    (30007, '9/7/2004', 3, 'c', 30, null);

percent_rank() with ORDER BY

This example uses percent_rank() with an ORDER BY clause to calculate the descending percent rank of all rows based on the quantity sold. The regular rank is included for comparison:

SELECT salesid, qty,
   PERCENT_RANK() OVER (ORDER BY qty DESC) AS p_rnk,
   RANK() OVER (ORDER BY qty DESC) AS rnk
FROM winsales
ORDER BY 2,1;

The output includes the sales ID, the quantity sold, and both the percent rank and the regular rank:

 salesid | qty | p_rnk | rnk
---------+-----+-------+-----
   10001 |  10 |   0.7 |   8
   10006 |  10 |   0.7 |   8
   30001 |  10 |   0.7 |   8
   40005 |  10 |   0.7 |   8
   30003 |  15 |   0.6 |   7
   20001 |  20 |   0.3 |   4
   20002 |  20 |   0.3 |   4
   30004 |  20 |   0.3 |   4
   10005 |  30 |   0.1 |   2
   30007 |  30 |   0.1 |   2
   40001 |  40 |     0 |   1
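
As a cross-check of the formula against this output, the percent rank can be recomputed by hand from RANK() and the row count. This is only a sketch: it assumes COUNT(*) can be used as a window function with an empty OVER clause, and the column name manual_p_rnk is made up for the example.

```sql
-- manual_p_rnk applies (r - 1) / (n - 1) directly; n is 11 for winsales.
-- Both operands are cast so the division is done in double precision.
-- Note: this expression would divide by zero on a single-row result set.
SELECT salesid, qty,
   PERCENT_RANK() OVER (ORDER BY qty DESC) AS p_rnk,
   CAST(RANK() OVER (ORDER BY qty DESC) - 1 AS double precision)
      / CAST(COUNT(*) OVER () - 1 AS double precision) AS manual_p_rnk
FROM winsales
ORDER BY 2,1;
```

For example, the rows with qty 10 have rank 8, so the manual column evaluates (8 - 1) / (11 - 1) = 0.7, in line with the p_rnk value shown above.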

percent_rank() with ORDER BY and PARTITION BY

This example uses percent_rank() with both PARTITION BY and ORDER BY clauses. It partitions the table by seller ID, orders each partition by quantity in descending order, and assigns a percent rank to each row within its partition:

SELECT salesid, sellerid, qty,
   PERCENT_RANK() OVER (PARTITION BY sellerid ORDER BY qty DESC) AS p_rnk
FROM winsales
ORDER BY 2,3,1;

The query returns:

 salesid | sellerid | qty |       p_rnk
---------+----------+-----+--------------------
   10001 |        1 |  10 |                0.5
   10006 |        1 |  10 |                0.5
   10005 |        1 |  30 |                  0
   20001 |        2 |  20 |                  0
   20002 |        2 |  20 |                  0
   30001 |        3 |  10 |                  1
   30003 |        3 |  15 | 0.6666666666666666
   30004 |        3 |  20 | 0.3333333333333333
   30007 |        3 |  30 |                  0
   40005 |        4 |  10 |                  1
   40001 |        4 |  40 |                  0
(11 rows)
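
A common follow-up is to filter on the computed percent rank, for example to keep only the largest sale or sales of each seller. Window functions generally cannot appear in a WHERE clause, so the sketch below wraps the query in a subquery. It assumes subqueries in the FROM clause are available, and the alias ranked is arbitrary:

```sql
-- Keep only rows ranked first within their seller partition.
-- The top-ranked row always has (1 - 1) / (n - 1) = 0, so the
-- comparison with 0 is exact.
SELECT salesid, sellerid, qty, p_rnk
FROM (
   SELECT salesid, sellerid, qty,
      PERCENT_RANK() OVER (PARTITION BY sellerid ORDER BY qty DESC) AS p_rnk
   FROM winsales
) AS ranked
WHERE p_rnk = 0
ORDER BY 2,1;
```

Based on the result table above, this should keep sales 10005, 20001, 20002, 30007, and 40001. Seller 2 contributes two rows because both of its sales tie at the top rank.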