Degraded State Handling
Node degradation is the condition in which a node cannot perform most queries. If Redpanda SQL is misconfigured or encounters a startup issue, it enters a degraded state, in which it returns an error and rejects all requests. This state can be temporary or permanent, and it can affect a single node or the entire cluster. This guide explains when degradation occurs and how it affects the node or cluster.
Cluster state
In Redpanda SQL, most errors that would otherwise crash a server should instead put it into a degraded state. The following terms describe the node or cluster state:
- Liveness: The node serves incoming client connections, for example via psql. It does not have to allow the user to connect to the database. Returning an error on a connection attempt still meets the liveness condition.
- Readiness: The cluster can execute queries. It requires the leader node to be in a proper state. If the leader node is degraded, the cluster is not ready to execute queries.
Exception: An invalid postgresql_port does not put the node into a degraded state. If the port is not set properly, the node does not meet even the liveness condition.
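The difference between liveness and readiness can be sketched as a small probe. This is a minimal illustration, not part of Redpanda SQL: it treats an accepted TCP connection on the PostgreSQL port as an approximation of liveness, and it takes a hypothetical `run_query` callable (for example, a wrapper around your PostgreSQL client that runs a trivial query and raises on failure) as the readiness check. The function names and status strings are assumptions made for this example.

```python
import socket
from typing import Callable


def is_live(host: str, port: int, timeout: float = 2.0) -> bool:
    """Approximate liveness: the node accepts a client connection on its port."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Connection refused, timed out, or host unreachable.
        return False


def is_ready(run_query: Callable[[], object]) -> bool:
    """Readiness: a query succeeds.

    run_query is a hypothetical callable supplied by the caller. It should
    raise an exception if the query fails, e.g. with a degradation error.
    """
    try:
        run_query()
        return True
    except Exception:
        return False


def cluster_status(host: str, port: int, run_query: Callable[[], object]) -> str:
    """Combine both checks into one of three illustrative status strings."""
    if not is_live(host, port):
        return "not live"
    return "ready" if is_ready(run_query) else "live, not ready"
```

A node with a degraded leader would be expected to report "live, not ready" here, because it still accepts connections but answers queries with an error. A node with an invalid postgresql_port would report "not live".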
Degradation state period
The degradation state of a node can be either permanent or temporary.
Permanent degradation
Permanent degradation occurs when a node encounters an error from which it cannot recover. The node enters a degraded state, the server logs the reason for the error, and each query returns that reason. To resolve the issue, the node must be rebooted. Here are a few examples of errors that can put a Redpanda SQL node in a permanently degraded state:
- Invalid configuration file
- Invalid OXLA_HOME layout or version
- An error occurred while reading the database state on the leader node
Effects of degraded state
| Effect | Details |
|---|---|
| Database connection | If the leader is degraded, the user cannot connect to the database, and all connection attempts return a degradation error. |
| Query handling | |
| Degradation types | |
| Query execution | The |