Cloud

regexp_match

The regexp_match() function matches a POSIX regular expression pattern to a string. It returns an array of TEXT[] type with substring(s) of matched groups within the first match.

Syntax

The syntax for regexp_match() function is:

REGEXP_MATCH(source_string, pattern, [flags])

Parameters

  • source_string: String on which to perform the match.

  • pattern: POSIX regular expression pattern to match.

  • flags: Optional. Flags that change the matching behavior of regexp_match().

The flags parameter is an optional string that controls how the function operates. Here is a list of flags that are supported by Redpanda SQL:

  • i: Use this flag for case-insensitive matching.

  • c: Force case-sensitive matching (the default). Including c has the same effect as omitting all flags.

If you specify multiple flags, the last one takes precedence. With ci the match is case-insensitive; with ic it is case-sensitive.

Examples

Basic usage

These examples demonstrate how to find the first occurrence of an email address in the input string:

SELECT REGEXP_MATCH('Contact us at hello@example.com', '[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}');
     regexp_match
----------------------
 {hello@example.com}
(1 row)

Match multiple groups

The regexp_match() function can capture multiple groups within a match, which extracts key parts from a string in a structured way. This example shows how to extract the protocol, domain and path from a given URL:

SELECT REGEXP_MATCH('https://www.example.com/products/item123', '(https?)://([\w.-]+)/(.+)');
               regexp_match
------------------------------------------
 {https,www.example.com,products/item123}
(1 row)

Case-insensitive matching

This example shows how to match a pattern regardless of case-sensitivity:

SELECT REGEXP_MATCH('User.Name@Example.COM', '@([a-z0-9.-]+)$', 'i');
 regexp_match
---------------
 {Example.COM}
(1 row)

Match with patterns stored in a table

This example shows how to take the source string and regex pattern directly from the table. First, create two sample tables:

CREATE TABLE users (
    email TEXT NOT NULL
);

CREATE TABLE patterns (
    id INT,
    regex_pattern TEXT NOT NULL
);

Once that is done, insert values into those tables:

INSERT INTO users (email) VALUES
    ('user@example.com'),
    ('admin@test.org'),
    ('invalid-email@wrong');

INSERT INTO patterns (id, regex_pattern) VALUES
    (0, '^[a-z0-9._%+-]+@[a-z0-9.-]+\.[a-z]{2,}$');

Validate whether user emails in the users table are valid. If the regex doesn’t match, a NULL value is returned.

SELECT users.email,
       patterns.regex_pattern,
       REGEXP_MATCH(users.email, patterns.regex_pattern, 'i') AS is_valid
FROM users
JOIN patterns ON patterns.id = 0;
        email        |              regex_pattern              |   is_valid
---------------------+-----------------------------------------+--------------------
 user@example.com    | ^[a-z0-9._%+-]+@[a-z0-9.-]+\.[a-z]{2,}$ | {user@example.com}
 admin@test.org      | ^[a-z0-9._%+-]+@[a-z0-9.-]+\.[a-z]{2,}$ | {admin@test.org}
 invalid-email@wrong | ^[a-z0-9._%+-]+@[a-z0-9.-]+\.[a-z]{2,}$ |
(3 rows)

Restrictions

  • The function returns NULL if it cannot match the regular expression pattern.