Configure Storage in Kubernetes
Redpanda brokers must store their data on disk (/var/lib/redpanda/data). By default, the Redpanda Helm chart uses the default StorageClass in a Kubernetes cluster to create one PersistentVolumeClaim for each Pod that runs a Redpanda broker. The default StorageClass in your Kubernetes cluster depends on the Kubernetes platform that you are using. You can customize the Helm chart to use the following storage volumes:
- PersistentVolumes
- hostPath volumes
- emptyDir volumes
Prerequisites
If you're configuring Redpanda for production, you must create and mount an XFS file system on any storage volumes that host the data directory of Redpanda (/var/lib/redpanda/data). XFS is a high-performance file system that is required for running Redpanda in production. NFS file systems are not supported.
Review the storage best practices.
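As a quick way to confirm the prerequisite, the following sketch reports which file system backs a given path on a worker node. It assumes GNU coreutils `df` is available; the helper names `fs_type` and `is_xfs` and the mount point in the usage comment are hypothetical.

```shell
# Print the file system type that backs a path.
# Assumption: GNU coreutils df, which supports --output=fstype.
fs_type() {
  df --output=fstype "$1" | tail -n 1
}

# Succeed only when the path is on an XFS file system.
is_xfs() {
  [ "$(fs_type "$1")" = "xfs" ]
}

# Usage on a worker node (the mount point is a hypothetical example):
#   is_xfs /mnt/redpanda && echo "XFS" || echo "not XFS"
```

Run the check against the mount point of the volume that will host /var/lib/redpanda/data, not against the root file system.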
Use PersistentVolumes
A PersistentVolume is storage in the cluster that has been provisioned by an administrator or dynamically provisioned using StorageClasses. For details about PersistentVolumes, see the Kubernetes documentation.
You can configure the Helm chart to use PersistentVolumes with a static provisioner or a dynamic provisioner. Redpanda recommends using a StorageClass with a dynamic provisioner. See the best practices.
- Dynamic provisioners
- Static provisioners
A dynamic provisioner creates a PersistentVolume on demand for each Redpanda broker.
Managed Kubernetes platforms and cloud environments usually provide a dynamic provisioner. If you are running Kubernetes on-premises, make sure that you have a dynamic provisioner for your storage type.
Make sure that you have at least one StorageClass in the cluster:
kubectl get storageclass
Example output
In a Google GKE cluster, this is the result:
NAME                 PROVISIONER            AGE
standard (default)   kubernetes.io/gce-pd   1d
This StorageClass is marked as the default, which means that this class is used to provision a PersistentVolume when the PersistentVolumeClaim doesn’t specify the StorageClass.
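If the table output is awkward to script against, the default class can also be read from the `storageclass.kubernetes.io/is-default-class` annotation. The following is a minimal sketch; the function names are hypothetical, and it assumes `kubectl` is configured for your cluster.

```shell
# List each StorageClass with the value of its default-class annotation,
# one class per line, separated by a tab.
list_storageclass_defaults() {
  kubectl get storageclass \
    -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.metadata.annotations.storageclass\.kubernetes\.io/is-default-class}{"\n"}{end}'
}

# Print only the names of StorageClasses that are marked as default.
default_storageclass() {
  list_storageclass_defaults | awk -F'\t' '$2 == "true" { print $1 }'
}

# Usage:
#   default_storageclass
```

If this prints nothing, the cluster has no default StorageClass, and you need to name one explicitly in the Helm chart.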
Configure the Helm chart with your StorageClass:
To use your Kubernetes cluster's default StorageClass, set storage.persistentVolume.storageClass to an empty string (""):

Using --values, with the following storageclass.yaml file:

storage:
  persistentVolume:
    enabled: true
    size: 20Gi
    storageClass: ""

helm upgrade --install redpanda redpanda/redpanda -n redpanda --create-namespace \
  --values storageclass.yaml --reuse-values

Using --set:

helm upgrade --install redpanda redpanda/redpanda -n redpanda --create-namespace \
  --set storage.persistentVolume.enabled=true \
  --set storage.persistentVolume.size=20Gi \
  --set storage.persistentVolume.storageClass=""

To use a specific StorageClass, set its name in the storage.persistentVolume.storageClass configuration:

Using --values, with the following storageclass.yaml file:

storage:
  persistentVolume:
    enabled: true
    size: 20Gi
    storageClass: "<storage-class>"

helm upgrade --install redpanda redpanda/redpanda -n redpanda --create-namespace \
  --values storageclass.yaml --reuse-values

Using --set:

helm upgrade --install redpanda redpanda/redpanda -n redpanda --create-namespace \
  --set storage.persistentVolume.enabled=true \
  --set storage.persistentVolume.size=20Gi \
  --set storage.persistentVolume.storageClass="<storage-class>"

Note: For default values and documentation for configuration options, see the values.yaml file.
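After the chart is deployed, you may want to confirm that the provisioner created a volume for every claim. The sketch below lists claims whose phase is not Bound; the function name is hypothetical, and it assumes the claims live in the namespace you pass in.

```shell
# Print the names of PersistentVolumeClaims in a namespace that are not Bound.
# Argument: namespace (for example redpanda).
unbound_pvcs() {
  kubectl get persistentvolumeclaim -n "$1" \
    -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.status.phase}{"\n"}{end}' |
    awk -F'\t' '$2 != "Bound" { print $1 }'
}

# Usage:
#   unbound_pvcs redpanda
```

An empty result suggests that every claim found a volume. A claim that stays Pending often points to a missing or misnamed StorageClass.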
When you use a static provisioner, an existing PersistentVolume in the cluster is selected and bound to one PersistentVolumeClaim for each Redpanda broker.
Create one PersistentVolume for each Redpanda broker. Make sure to create PersistentVolumes with a capacity of at least the value of the storage.persistentVolume.size configuration.

Set the storage.persistentVolume.storageClass to a dash ("-") to use a PersistentVolume with a static provisioner:

Using --values, with the following storageclass.yaml file:

storage:
  persistentVolume:
    enabled: true
    storageClass: "-"

helm upgrade --install redpanda redpanda/redpanda -n redpanda --create-namespace \
  --values storageclass.yaml --reuse-values

Using --set:

helm upgrade --install redpanda redpanda/redpanda -n redpanda --create-namespace \
  --set storage.persistentVolume.enabled=true \
  --set storage.persistentVolume.storageClass="-"

Note: For default values and documentation for configuration options, see the values.yaml file.
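To illustrate the "one PersistentVolume for each broker" step, the following sketch prints a manifest per broker that you could review and then apply. Everything specific in it is an assumption for illustration: the `local` volume type, the `redpanda-pv-<n>` names, the `/mnt/redpanda-<n>` paths, and the `<node-name-n>` placeholders. It also assumes that the chart's claims request an empty storageClassName when storageClass is "-", so that they bind to volumes that have no class.

```shell
# Emit one PersistentVolume manifest per broker on standard output.
# Arguments: broker count, capacity (for example 20Gi).
# Assumptions: local volumes, paths, names, and node placeholders are
# illustrative only. Replace them with values from your environment.
generate_pvs() {
  pv_count="$1"
  pv_size="$2"
  pv_index=0
  while [ "$pv_index" -lt "$pv_count" ]; do
    cat <<EOF
---
apiVersion: v1
kind: PersistentVolume
metadata:
  name: redpanda-pv-${pv_index}
spec:
  capacity:
    storage: ${pv_size}
  accessModes:
    - ReadWriteOnce
  persistentVolumeReclaimPolicy: Retain
  storageClassName: ""
  local:
    path: /mnt/redpanda-${pv_index}
  nodeAffinity:
    required:
      nodeSelectorTerms:
        - matchExpressions:
            - key: kubernetes.io/hostname
              operator: In
              values:
                - <node-name-${pv_index}>
EOF
    pv_index=$((pv_index + 1))
  done
}

# Usage: write the manifests to a file, edit the placeholders, then apply.
#   generate_pvs 3 20Gi > redpanda-pvs.yaml
#   kubectl apply -f redpanda-pvs.yaml
```

Keep the capacity at or above storage.persistentVolume.size, otherwise the claims cannot bind to these volumes.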
Resize PersistentVolumes
To give Redpanda brokers more storage, you can expand the size of PersistentVolumes. The way you expand PersistentVolumes depends on the provisioner that you use.
- Dynamic provisioners
- Static provisioners
With a dynamic provisioner, make sure that your StorageClass supports volume expansion. For a list of volume types that support expansion, see the Kubernetes documentation.
Increase the value of the storage.persistentVolume.size configuration:

Using --values, with the following persistentvolume-size.yaml file:

storage:
  persistentVolume:
    enabled: true
    size: <custom-size>Gi

helm upgrade --install redpanda redpanda/redpanda -n redpanda --create-namespace \
  --values persistentvolume-size.yaml --reuse-values

Using --set:

helm upgrade --install redpanda redpanda/redpanda -n redpanda --create-namespace \
  --set storage.persistentVolume.enabled=true \
  --set storage.persistentVolume.size=<custom-size>Gi
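Before increasing the size, it can help to check whether the StorageClass allows expansion at all. A minimal sketch, assuming `kubectl` access to the cluster; the function name is hypothetical, and it reads the `allowVolumeExpansion` field of the StorageClass.

```shell
# Succeed only when the named StorageClass has allowVolumeExpansion set to true.
# Argument: StorageClass name.
supports_expansion() {
  [ "$(kubectl get storageclass "$1" -o jsonpath='{.allowVolumeExpansion}')" = "true" ]
}

# Usage:
#   supports_expansion standard && echo "expansion allowed" || echo "expansion not allowed"
```

When the field is unset, the command prints nothing and the check fails, which matches the Kubernetes behavior of rejecting resize requests for that class.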
With a static provisioner, the instructions for resizing PersistentVolumes vary depending on how your file system is allocated. Follow the recommended process for your system. You do not need to make any configuration changes to the Helm chart.
Delete PersistentVolumeClaims
To prevent accidental loss of data, PersistentVolumeClaims are not deleted when Redpanda brokers are removed from a cluster. It is your responsibility to delete PersistentVolumeClaims when they are no longer needed. Check the reclaim policy of your PersistentVolumes before deleting a PersistentVolumeClaim.
kubectl get persistentvolume -n redpanda
For descriptions of each reclaim policy, see the Kubernetes documentation.
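The reclaim policy check can be narrowed to the relevant columns, and a volume can be switched to Retain before its claim is deleted. The following is a sketch with hypothetical function names; the patch follows the pattern that the Kubernetes documentation describes for changing a reclaim policy.

```shell
# Show each PersistentVolume with its reclaim policy and the claim bound to it.
list_reclaim_policies() {
  kubectl get persistentvolume \
    -o custom-columns='NAME:.metadata.name,RECLAIM:.spec.persistentVolumeReclaimPolicy,CLAIM:.spec.claimRef.name'
}

# Set the reclaim policy of one PersistentVolume to Retain, so that deleting
# its claim does not delete the underlying storage.
# Argument: PersistentVolume name.
retain_volume() {
  kubectl patch persistentvolume "$1" \
    -p '{"spec":{"persistentVolumeReclaimPolicy":"Retain"}}'
}

# Usage:
#   list_reclaim_policies
#   retain_volume <persistent-volume-name>
```

With a Delete policy, removing the claim also removes the volume and its data, so review the policy column first.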
Use hostPath volumes
A hostPath volume mounts a file or directory from the host node's file system into your Pod. For details about hostPath volumes, see the Kubernetes documentation.
To store Redpanda data in hostPath volumes:
- Set the storage.hostPath configuration to the absolute path of a file on the local worker node.
- Set storage.persistentVolume.enabled to false.
- Set statefulset.initContainers.setDataDirOwnership.enabled to true.
Pods that run Redpanda brokers must have read/write access to their data directories. The initContainer is responsible for setting write permissions on the data directories. By default, statefulset.initContainers.setDataDirOwnership is disabled because most storage drivers call SetVolumeOwnership to give Redpanda permissions to the root of the storage mount. However, some storage drivers, such as hostPath, do not call SetVolumeOwnership. In this case, you must enable the initContainer to set the permissions.
To set permissions on the data directories, the initContainer must run as root. However, be aware that an initContainer running as root can introduce the following security risks:
- Privilege escalation: If attackers gain access to the initContainer, they can escalate privileges to gain full control over the system. For example, attackers could use the initContainer to gain unauthorized access to sensitive data, tamper with the system, or start denial-of-service attacks.
- Container breakouts: If the container is misconfigured or the container runtime has a vulnerability, attackers could escape from the initContainer and access the host operating system.
- Image tampering: If attackers gain access to the container image of the initContainer, they could add malicious code or backdoors to it. Image tampering could compromise the security of the entire cluster.
If the Pod is deleted and recreated, it might be scheduled on another worker node and no longer have access to the same hostPath volume data.
Using --values, with the following hostpath.yaml file:

storage:
  hostPath: "<absolute-path>"
  persistentVolume:
    enabled: false
statefulset:
  initContainers:
    setDataDirOwnership:
      enabled: true

helm upgrade --install redpanda redpanda/redpanda -n redpanda --create-namespace \
  --values hostpath.yaml --reuse-values

Using --set:

helm upgrade --install redpanda redpanda/redpanda -n redpanda --create-namespace \
  --set storage.persistentVolume.enabled=false \
  --set storage.hostPath=<absolute-path> \
  --set statefulset.initContainers.setDataDirOwnership.enabled=true

For default values and documentation for configuration options, see the values.yaml file.
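Because storage.hostPath must be an absolute path, a small guard in front of the Helm command can catch a relative path before it reaches the cluster. This is only a sketch: the wrapper name `install_with_hostpath` is hypothetical, and the Helm flags are the same ones used in the --set example.

```shell
# Run the hostPath installation only when the given path is absolute.
# Argument: absolute path on the worker node.
install_with_hostpath() {
  case "$1" in
    /*) ;;
    *)
      echo "storage.hostPath must be an absolute path: $1" >&2
      return 1
      ;;
  esac
  helm upgrade --install redpanda redpanda/redpanda -n redpanda --create-namespace \
    --set storage.persistentVolume.enabled=false \
    --set storage.hostPath="$1" \
    --set statefulset.initContainers.setDataDirOwnership.enabled=true
}

# Usage (the path is a hypothetical example):
#   install_with_hostpath /mnt/redpanda
```

The guard checks only the shape of the path. It does not verify that the path exists on every worker node that might run a broker.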
Use emptyDir volumes
An emptyDir volume is first created when a Pod is assigned to a node, and the volume exists as long as the Pod is running on that node. For details about emptyDir volumes, see the Kubernetes documentation.
To store Redpanda data in emptyDir volumes, set the storage.hostPath configuration to an empty string ("") and set storage.persistentVolume.enabled to false.
When a Pod is removed from a node for any reason, the data in the emptyDir volume is deleted permanently.
Using --values, with the following emptydir.yaml file:

storage:
  hostPath: ""
  persistentVolume:
    enabled: false

helm upgrade --install redpanda redpanda/redpanda -n redpanda --create-namespace \
  --values emptydir.yaml --reuse-values

Using --set:

helm upgrade --install redpanda redpanda/redpanda -n redpanda --create-namespace \
  --set storage.hostPath="" \
  --set storage.persistentVolume.enabled=false

For default values and documentation for configuration options, see the values.yaml file.
Next steps
Enable rack awareness to minimize data loss in the event of a rack failure.