Migrate from Cluster and Console Custom Resources
To ensure compatibility with future versions of Redpanda and to benefit from new features, enhancements, and security updates, you must migrate from the deprecated Cluster and Console custom resources to the Redpanda custom resource. The migration process involves the following steps:
-
Deploy at least version 23.2 of the updated Redpanda Operator in the same Kubernetes cluster as your deprecated Redpanda Operator.
-
Migrate Cluster and Console resources to Redpanda resources.
Before implementing any changes in your production environment, Redpanda Data recommends testing the migration in a non-production environment.
Prerequisites
Before migrating to the Redpanda Operator, you must have the name of your Cluster resource and the namespace in which it’s deployed. If you have multiple clusters, migrate one at a time.
kubectl get cluster -A
Example output:
NAMESPACE   NAME                AGE
redpanda    one-node-external   17m
If you also have a Console resource, you need the name of your Console resource and the namespace in which it’s deployed:
kubectl get console -A
Deploy the updated Redpanda Operator
The first step in the migration process is to deploy the updated Redpanda Operator in the same namespace as an existing Cluster resource.
-
Make sure that you have permission to install custom resource definitions (CRDs):
kubectl auth can-i create CustomResourceDefinition --all-namespaces
You should see yes in the output. You need cluster-level permissions to install the Redpanda Operator CRDs in the next steps.
-
Install the Redpanda Operator custom resource definitions (CRDs):
kubectl kustomize "https://github.com/redpanda-data/redpanda-operator//src/go/k8s/config/crd?ref=v2.4.2" \
  | kubectl apply -f -
-
Install the Redpanda Operator in the same namespace as your Cluster custom resource:
helm repo add redpanda https://charts.redpanda.com
helm upgrade --install redpanda-controller redpanda/operator \
  --namespace <namespace> \
  --set image.tag=v2.4.2 \
  --create-namespace
-
Ensure that the Deployment is successfully rolled out:
kubectl --namespace <namespace> rollout status -w deployment/redpanda-controller-operator
Example output:
deployment "redpanda-controller" successfully rolled out
Prepare existing Kubernetes resources
After you’ve deployed the updated Redpanda Operator, you must stop the deprecated Redpanda Operator from reconciling the deprecated resources and adopt some existing Kubernetes resources that are part of the Redpanda deployment.
-
Stop the deprecated Redpanda Operator from reconciling the Cluster and Console custom resources:
kubectl --namespace <namespace> annotate cluster <cluster-name> redpanda.vectorized.io/managed="false"
kubectl --namespace <namespace> annotate console <console-name> redpanda.vectorized.io/managed="false"
-
Delete your Cluster resource’s existing StatefulSet:
kubectl --namespace <namespace> delete statefulset <cluster-name> --cascade=orphan
-
Update the labels of all Pods that were in the deleted StatefulSet:
-
To get the Pod names:
kubectl get pod -l app.kubernetes.io/instance=<cluster-name> --namespace <namespace>
-
To update the labels, do the following for each Pod:
kubectl --namespace <namespace> label pod <pod-name> app.kubernetes.io/component=redpanda-statefulset --overwrite
-
Adopt your existing Services.
-
Label and annotate the Services:
kubectl --namespace <namespace> annotate service <cluster-name> meta.helm.sh/release-name=<cluster-name> --overwrite
kubectl --namespace <namespace> annotate service <cluster-name> meta.helm.sh/release-namespace=<namespace> --overwrite
kubectl --namespace <namespace> label service <cluster-name> app.kubernetes.io/managed-by=Helm --overwrite
kubectl --namespace <namespace> annotate service <cluster-name>-external meta.helm.sh/release-name=<cluster-name> --overwrite
kubectl --namespace <namespace> annotate service <cluster-name>-external meta.helm.sh/release-namespace=<namespace> --overwrite
kubectl --namespace <namespace> label service <cluster-name>-external app.kubernetes.io/managed-by=Helm --overwrite
-
Update the selectors of the <cluster-name> Service:
kubectl --namespace <namespace> edit service <cluster-name>
Change the selector to:
selector:
  app.kubernetes.io/instance: <cluster-name>
  app.kubernetes.io/name: redpanda
This step prevents Services from being redeployed, which reduces downtime. Because the names of these Services match the names of the Services that the Redpanda Helm chart will try to deploy, these annotations and labels bring the existing Services under the management of Helm so that they do not get deleted and redeployed when you apply the Redpanda resource.
-
Adopt the ServiceAccount:
kubectl --namespace <namespace> annotate serviceaccount <cluster-name> meta.helm.sh/release-name=<cluster-name>
kubectl --namespace <namespace> annotate serviceaccount <cluster-name> meta.helm.sh/release-namespace=<namespace>
kubectl --namespace <namespace> label serviceaccount <cluster-name> app.kubernetes.io/managed-by=Helm --overwrite
-
Delete the PodDisruptionBudget:
kubectl --namespace <namespace> delete PodDisruptionBudget <cluster-name>
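The per-Pod labeling and the Helm adoption steps above repeat the same commands for several objects. A minimal sketch of how they could be scripted follows. The function names relabel_orphaned_pods and adopt_for_helm are hypothetical, and the sketch assumes that kubectl is already configured for the target cluster.

```shell
# Hypothetical helpers; the function names are illustrative only.

# Relabel every Pod left behind by the orphaned StatefulSet.
# Usage: relabel_orphaned_pods <namespace> <cluster-name>
relabel_orphaned_pods() {
  namespace="$1"
  cluster_name="$2"
  # "-o name" prints one "pod/<name>" entry per line.
  kubectl get pod \
    --namespace "$namespace" \
    -l "app.kubernetes.io/instance=$cluster_name" \
    -o name |
  while read -r pod; do
    kubectl --namespace "$namespace" label "$pod" \
      app.kubernetes.io/component=redpanda-statefulset --overwrite
  done
}

# Add the Helm ownership annotations and label to one existing object.
# Usage: adopt_for_helm <namespace> <kind> <object-name> <release-name>
adopt_for_helm() {
  namespace="$1"
  kind="$2"
  name="$3"
  release="$4"
  kubectl --namespace "$namespace" annotate "$kind" "$name" \
    "meta.helm.sh/release-name=$release" \
    "meta.helm.sh/release-namespace=$namespace" --overwrite
  kubectl --namespace "$namespace" label "$kind" "$name" \
    app.kubernetes.io/managed-by=Helm --overwrite
}

# Example calls (replace the placeholders first):
#   relabel_orphaned_pods <namespace> <cluster-name>
#   adopt_for_helm <namespace> service <cluster-name> <cluster-name>
#   adopt_for_helm <namespace> service <cluster-name>-external <cluster-name>
#   adopt_for_helm <namespace> serviceaccount <cluster-name> <cluster-name>
```

The helpers only wrap the commands already shown in this section, so it is worth running them against a non-production cluster first.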
Migrate Cluster and Console resources to Redpanda resources
You can now convert your deprecated Cluster and Console resources to a Redpanda resource.
-
Create a Redpanda resource:
redpanda-cluster.yaml
apiVersion: cluster.redpanda.com/v1alpha1
kind: Redpanda
metadata:
  name: <cluster-name>
  namespace: <namespace>
  annotations:
    cluster.redpanda.com/managed: "true"
spec:
  migration:
    enabled: true
    clusterRef:
      name: <cluster-name>
      namespace: <namespace>
    consoleRef:
      name: <console-name>
      namespace: <namespace>
With this configuration, the updated Redpanda Operator will try to migrate your Cluster and/or Console resources to the new Redpanda resource.
-
The Redpanda Operator does not migrate all configurations. For example, if your cluster had SASL enabled, you must manually add any SASL configuration to the Redpanda resource. For help with configuration, see the Redpanda CRD reference.
-
If the additionalConfiguration section of your Cluster resource includes redpanda.empty_seed_starts_cluster: true, make sure that this configuration is not present in the migrated redpanda.yaml file. The Redpanda Helm chart includes this configuration by default, so if your Redpanda resource also includes it, Redpanda will throw an error due to the duplicated configuration.
-
Make sure that resources.memory.container.min and resources.memory.container.max are both set to at least 2.5Gi. Otherwise, Redpanda will be unable to start.
-
Deploy the Redpanda resource:
kubectl apply -f redpanda-cluster.yaml --namespace <namespace>
The updated Redpanda Operator will delete the Pods sequentially, causing them to be redeployed using Helm and your Redpanda resource.
-
Wait for the Redpanda resource to successfully reach a deployed state:
kubectl get redpanda <cluster-name> --namespace <namespace> --watch
Example output:
NAME       READY   STATUS
redpanda   True    Redpanda reconciliation succeeded
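In a script, watching the output by hand is not practical. The sketch below blocks until the resource is ready or a timeout expires. It assumes that the READY column shown above reflects a status condition named Ready on the Redpanda resource, which this page does not confirm. The function name wait_for_redpanda and the 10m default timeout are hypothetical.

```shell
# Hypothetical helper: block until the Redpanda resource reports Ready.
# Assumes the resource exposes a status condition of type "Ready".
# Usage: wait_for_redpanda <namespace> <cluster-name> [timeout]
wait_for_redpanda() {
  namespace="$1"
  name="$2"
  timeout="${3:-10m}"   # default timeout is an arbitrary choice
  kubectl wait "redpanda/$name" \
    --namespace "$namespace" \
    --for=condition=Ready \
    --timeout="$timeout"
}

# Example call (replace the placeholders first):
#   wait_for_redpanda <namespace> <cluster-name> 15m
```

If the command times out, the troubleshooting steps that follow apply.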
Troubleshooting
While the deployment process can sometimes take a few minutes, a prolonged 'not ready' status may indicate an issue.
HelmRelease is not ready
If you are using the Redpanda Operator, you may see the following message while waiting for a Redpanda custom resource to be deployed:
NAME READY STATUS
redpanda False HelmRepository 'redpanda/redpanda-repository' is not ready
redpanda False HelmRelease 'redpanda/redpanda' is not ready
Follow the steps below to investigate a prolonged 'not ready' status:
-
Check the status of the HelmRelease:
kubectl describe helmrelease <redpanda-resource-name> --namespace <namespace>
-
Review the Redpanda Operator logs:
kubectl logs -l app.kubernetes.io/name=operator -c manager --namespace <namespace>
HelmRelease retries exhausted
The HelmRelease retries exhausted error occurs when the Helm Controller has tried to reconcile the HelmRelease a number of times, but these attempts have failed consistently.
The Helm Controller watches for changes in HelmRelease objects. When changes are detected, it tries to reconcile the state defined in the HelmRelease with the state in the cluster. The process of reconciliation includes installation, upgrade, testing, rollback or uninstallation of Helm releases.
You may see this error due to:
-
Incorrect configuration in the HelmRelease.
-
Issues with the chart, such as a non-existent chart version or the chart repository not being accessible.
-
Missing dependencies or prerequisites required by the chart.
-
Issues with the underlying Kubernetes cluster, such as insufficient resources or connectivity issues.
To debug this error, do the following:
-
Check the status of the HelmRelease:
kubectl describe helmrelease <cluster-name> --namespace <namespace>
-
Review the Redpanda Operator logs:
kubectl logs -l app.kubernetes.io/name=operator -c manager --namespace <namespace>
When you find and fix the error, you must use the Flux CLI, flux, to suspend and resume the reconciliation process:
-
Suspend the HelmRelease:
flux suspend helmrelease <cluster-name> --namespace <namespace>
-
Resume the HelmRelease:
flux resume helmrelease <cluster-name> --namespace <namespace>
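The suspend and resume pair can be wrapped so that the resume only runs when the suspend succeeded. This is a minimal sketch, assuming the flux CLI is installed and can reach the cluster. The function name restart_reconciliation is hypothetical.

```shell
# Hypothetical helper: suspend, then resume, reconciliation of a HelmRelease.
# Usage: restart_reconciliation <namespace> <cluster-name>
restart_reconciliation() {
  namespace="$1"
  release="$2"
  # "&&" stops the sequence if the suspend step fails.
  flux suspend helmrelease "$release" --namespace "$namespace" &&
    flux resume helmrelease "$release" --namespace "$namespace"
}

# Example call (replace the placeholders first):
#   restart_reconciliation <namespace> <cluster-name>
```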
Crash loop backoffs
If a broker crashes after startup, or gets stuck in a crash loop, it could produce progressively more stored state that uses additional disk space and takes more time for each restart to process.
To prevent infinite crash loops, the Redpanda Helm chart sets the crash_loop_limit node property to 5. The crash loop limit is the number of consecutive crashes that can happen within one hour of each other. After Redpanda reaches this limit, it will not start until its internal consecutive crash counter is reset to zero. In Kubernetes, the Pod running Redpanda remains in a CrashLoopBackoff state until its internal consecutive crash counter is reset to zero.
To troubleshoot a crash loop backoff:
-
Check the Redpanda logs from the most recent crashes:
kubectl logs <pod-name> --namespace <namespace>
Kubernetes retains logs only for the current and the previous instance of a container. This limitation makes it difficult to access logs from earlier crashes, which may contain vital clues about the root cause of the issue. Given these log retention limitations, setting up a centralized logging system is crucial. Systems such as Loki or Datadog can capture and store logs from all containers, ensuring you have access to historical data.
-
Resolve the issue that led to the crash loop backoff.
-
Reset the crash counter to zero to allow Redpanda to restart. You can do any of the following to reset the counter:
-
Update the redpanda.yaml configuration file. You can make changes to any of the following sections in the Redpanda Helm chart to trigger an update:
-
config.cluster
-
config.node
-
config.tunable
-
Delete the startup_log file in the broker’s data directory.
kubectl exec <pod-name> --namespace <namespace> -- rm /var/lib/redpanda/data/startup_log
It might be challenging to execute this command within a Pod that is in a CrashLoopBackoff state due to the limited time during which the Pod is available before it restarts. Wrapping the command in a loop might work.
-
Wait one hour since the last crash. The crash counter resets after one hour.
-
To avoid future crash loop backoffs and manage the accumulation of small segments effectively:
-
Monitor the size and number of segments regularly.
-
Optimize your Redpanda configuration for segment management.
-
Consider implementing Tiered Storage to manage data more efficiently.
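The suggestion above to wrap the startup_log deletion in a loop could be sketched as follows. The function name remove_startup_log, the default of 30 attempts, and the 2-second delay are hypothetical choices. The sketch retries the same kubectl exec command until it succeeds during one of the short windows in which the container is running.

```shell
# Hypothetical helper: retry deleting startup_log until the Pod accepts the exec.
# Usage: remove_startup_log <namespace> <pod-name> [attempts] [delay-seconds]
remove_startup_log() {
  namespace="$1"
  pod="$2"
  attempts="${3:-30}"   # arbitrary default
  delay="${4:-2}"       # arbitrary default, in seconds
  i=1
  while [ "$i" -le "$attempts" ]; do
    # The exec fails while the container is down between restarts.
    if kubectl exec "$pod" --namespace "$namespace" -- \
        rm /var/lib/redpanda/data/startup_log; then
      echo "startup_log removed on attempt $i"
      return 0
    fi
    i=$((i + 1))
    sleep "$delay"
  done
  echo "gave up after $attempts attempts" >&2
  return 1
}

# Example call (replace the placeholders first):
#   remove_startup_log <namespace> <pod-name> 60 1
```

A short delay gives a better chance of catching the container while it is up, at the cost of more requests to the Kubernetes API.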
StatefulSet never rolls out
If the StatefulSet Pods remain in a pending state, they are waiting for resources to become available.
To identify the Pods that are pending, use the following command:
kubectl get pod --namespace <namespace>
The response includes a list of Pods in the StatefulSet and their status.
To view logs for a specific Pod, use the following command:
kubectl logs -f <pod-name> --namespace <namespace>
You can use the output to debug your deployment.
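Pods that are still pending often have no container logs yet, so their events are usually the more useful source. The sketch below filters the Pod list to the Pending phase and prints the description of one Pod, which includes its events. The function names list_pending_pods and show_pod_events are hypothetical.

```shell
# Hypothetical helpers for inspecting Pods that never leave Pending.

# List only the Pods whose phase is Pending.
# Usage: list_pending_pods <namespace>
list_pending_pods() {
  kubectl get pod --namespace "$1" --field-selector=status.phase=Pending
}

# Describe one Pod; the Events section at the end of the output
# typically names the scheduling or volume problem.
# Usage: show_pod_events <namespace> <pod-name>
show_pod_events() {
  kubectl describe pod "$2" --namespace "$1"
}

# Example calls (replace the placeholders first):
#   list_pending_pods <namespace>
#   show_pod_events <namespace> <pod-name>
```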
Unable to mount volume
If you see volume mounting errors in the Pod events or in the Redpanda logs, ensure that each of your Pods has a volume available in which to store data.
-
If you’re using StorageClasses with dynamic provisioners (default), ensure they exist:
kubectl get storageclass
-
If you’re using PersistentVolumes, ensure that you have one PersistentVolume available for each Redpanda broker, and that each one has the storage capacity that’s set in storage.persistentVolume.size:
kubectl get persistentvolume --namespace <namespace>
To learn how to configure different storage volumes, see Configure Storage.
Failed to pull image
When deploying the Redpanda Helm chart, you may encounter Docker rate limit issues because the default registry URL is not recognized as a Docker Hub URL. The domain docker.redpanda.com is used for statistical purposes, such as tracking the number of downloads. It mirrors Docker Hub’s content while providing specific analytics for Redpanda.
Failed to pull image "docker.redpanda.com/redpandadata/redpanda:v<version>": rpc error: code = Unknown desc = failed to pull and unpack image "docker.redpanda.com/redpandadata/redpanda:v<version>": failed to copy: httpReadSeeker: failed open: unexpected status code 429 Too Many Requests - Server message: toomanyrequests: You have reached your pull rate limit. You may increase the limit by authenticating and upgrading: https://www.docker.com/increase-rate-limit
To fix this error, do one of the following:
-
Replace the image.repository value in the Helm chart with docker.io/redpandadata/redpanda. Switching to Docker Hub avoids the rate limit issues associated with docker.redpanda.com.
Helm + Operator:
redpanda-cluster.yaml
apiVersion: cluster.redpanda.com/v1alpha1
kind: Redpanda
metadata:
  name: redpanda
spec:
  chartRef: {}
  clusterSpec:
    image:
      repository: docker.io/redpandadata/redpanda
kubectl apply -f redpanda-cluster.yaml --namespace <namespace>
Helm, using --values:
docker-repo.yaml
image:
  repository: docker.io/redpandadata/redpanda
helm upgrade --install redpanda redpanda/redpanda --namespace <namespace> --create-namespace \
  --values docker-repo.yaml --reuse-values
Helm, using --set:
helm upgrade --install redpanda redpanda/redpanda --namespace <namespace> --create-namespace \
  --set image.repository=docker.io/redpandadata/redpanda
-
Authenticate to Docker Hub by logging in with your Docker Hub credentials. The docker.redpanda.com site acts as a reflector for Docker Hub. As a result, when you log in with your Docker Hub credentials, you will bypass the rate limit issues.
Dig not defined
This error means that you are using an unsupported version of Helm:
Error: parse error at (redpanda/templates/statefulset.yaml:203): function "dig" not defined
To fix this error, ensure that you are using at least the minimum required version of Helm, 3.10.0. To check your version:
helm version
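The version check can also be automated. The sketch below compares the installed Helm version with the 3.10.0 minimum stated above. The function names version_at_least and check_helm_version are hypothetical, and the comparison relies on sort -V, which assumes GNU coreutils or a compatible sort.

```shell
# Hypothetical helpers for checking the installed Helm version.

# Succeeds when version $1 is greater than or equal to version $2.
# Relies on "sort -V" (version sort), which assumes a GNU-compatible sort.
version_at_least() {
  lowest="$(printf '%s\n%s\n' "$2" "$1" | sort -V | head -n 1)"
  [ "$lowest" = "$2" ]
}

# Succeeds when the installed Helm meets the 3.10.0 minimum.
check_helm_version() {
  current="$(helm version --short)"   # for example: v3.12.0+gc9f554d
  current="${current#v}"              # drop the leading "v"
  current="${current%%+*}"            # drop the "+<commit>" suffix
  version_at_least "$current" "3.10.0"
}

# Example call:
#   check_helm_version || echo "Helm is older than 3.10.0" >&2
```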
Repository name already exists
If you see this error, remove the redpanda chart repository, then try installing it again.
helm repo remove redpanda
helm repo add redpanda https://charts.redpanda.com
helm repo update
Fatal error during checker "Data directory is writable" execution
This error appears when Redpanda does not have write access to your configured storage volume under storage in the Helm chart.
Error: fatal error during checker "Data directory is writable" execution: open /var/lib/redpanda/data/test_file: permission denied
To fix this error, set statefulset.initContainers.setDataDirOwnership.enabled to true so that the initContainer can set the correct permissions on the data directories.
Cannot patch "redpanda" with kind StatefulSet
This error appears when you run helm upgrade with the --values flag but do not include all your previous overrides.
Error: UPGRADE FAILED: cannot patch "redpanda" with kind StatefulSet: StatefulSet.apps "redpanda" is invalid: spec: Forbidden: updates to statefulset spec for fields other than 'replicas', 'template', 'updateStrategy', 'persistentVolumeClaimRetentionPolicy' and 'minReadySeconds' are forbidden
To fix this error, do one of the following:
-
Include all the value overrides from the previous installation or upgrade using either the --set or the --values flags.
-
Use the --reuse-values flag. Do not use the --reuse-values flag to upgrade from one version of the Helm chart to another. This flag stops Helm from using any new values in the upgraded chart.
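If the previous overrides were not kept in a file, they can be read back from the release. The sketch below exports the user-supplied values with helm get values and passes them, followed by a file of new overrides, to helm upgrade. The function name upgrade_with_previous_values and the temporary-file handling are hypothetical, and the sketch assumes that the release was installed from the redpanda/redpanda chart.

```shell
# Hypothetical helper: upgrade a release while carrying over its previous overrides.
# Usage: upgrade_with_previous_values <namespace> <release-name> <new-values-file>
upgrade_with_previous_values() {
  namespace="$1"
  release="$2"
  extra_values="$3"
  previous="$(mktemp)"
  # Export the user-supplied values of the current release as plain YAML.
  helm get values "$release" --namespace "$namespace" -o yaml > "$previous" || {
    rm -f "$previous"
    return 1
  }
  # Later --values files take precedence over earlier ones.
  helm upgrade "$release" redpanda/redpanda --namespace "$namespace" \
    --values "$previous" --values "$extra_values"
  status=$?
  rm -f "$previous"
  return "$status"
}

# Example call (replace the placeholders first):
#   upgrade_with_previous_values <namespace> redpanda docker-repo.yaml
```

Reviewing the exported values before upgrading is a good idea, because they may contain settings that no longer apply.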
Cannot patch "redpanda-console" with kind Deployment
This error appears if you try to upgrade your deployment and you already have console.enabled set to true.
Error: UPGRADE FAILED: cannot patch "redpanda-console" with kind Deployment: Deployment.apps "redpanda-console" is invalid: spec.selector: Invalid value: v1.LabelSelector{MatchLabels:map[string]string{"app.kubernetes.io/instance":"redpanda", "app.kubernetes.io/name":"console"}, MatchExpressions:[]v1.LabelSelectorRequirement(nil)}: field is immutable
To fix this error, set console.enabled to false so that Helm doesn’t try to deploy Redpanda Console again.
Helm is in a pending-rollback state
An interrupted Helm upgrade process can leave your Helm release in a pending-rollback state. This state prevents further actions like upgrades, rollbacks, or deletions through standard Helm commands. To fix this:
-
Identify the Helm release that’s in a pending-rollback state:
helm list --namespace <namespace> --all
Look for releases with a status of pending-rollback. These are the ones that need intervention.
-
Verify the Secret’s status to avoid affecting the wrong resource:
kubectl --namespace <namespace> get secret --show-labels
Identify the Secret associated with your Helm release by its pending-rollback status in the labels. Ensure you have correctly identified the Secret to avoid unintended consequences. Deleting the wrong Secret could impact other deployments or services.
-
Delete the Secret to clear the pending-rollback state:
kubectl --namespace <namespace> delete secret -l status=pending-rollback
After clearing the pending-rollback state:
-
Retry the upgrade: Restart the upgrade process. You should investigate the initial failure to avoid getting into the pending-rollback state again.
-
Perform a rollback: If you need to roll back to a previous release, use helm rollback <release-name> <revision> to revert to a specific, stable release version.
Resources aren’t being updated
If you are deleting, annotating, or labeling resources and they appear unchanged, the Redpanda Operator may still be managing your Cluster or Console resource.
Make sure the following annotation is set on your Cluster and Console resources:
redpanda.vectorized.io/managed="false"
kubectl describe cluster <cluster-name> --namespace <namespace>
kubectl describe console <console-name> --namespace <namespace>
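Instead of reading through the describe output, the annotation value can be queried directly. This is a minimal sketch. The function name is_unmanaged is hypothetical, and the JSONPath expression escapes the dots in the annotation key so that kubectl treats the key as a single field.

```shell
# Hypothetical helper: succeed only when the managed annotation is "false".
# Usage: is_unmanaged <namespace> <kind> <resource-name>
is_unmanaged() {
  value="$(kubectl get "$2" "$3" --namespace "$1" \
    -o jsonpath='{.metadata.annotations.redpanda\.vectorized\.io/managed}')"
  [ "$value" = "false" ]
}

# Example calls (replace the placeholders first):
#   is_unmanaged <namespace> cluster <cluster-name> || echo "Cluster is still managed"
#   is_unmanaged <namespace> console <console-name> || echo "Console is still managed"
```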
Open an issue
If you cannot solve the issue or you need assistance during the migration process, open a GitHub issue in the Redpanda repository. Before opening a new issue, search the existing issues on GitHub to see if someone has already reported a similar problem or if there are any relevant discussions that can help you.
Rollback to the deprecated Redpanda Operator
If you still have the Cluster resource, you can revert your changes, but there may be downtime depending on how far you have progressed through the migration process.
-
Delete the Redpanda resource:
kubectl delete redpanda <cluster-name> --namespace <namespace>
This step triggers a deletion of all resources created by the HelmRelease.
-
Enable the deprecated Redpanda Operator to manage your Cluster and Console resources:
kubectl --namespace <namespace> annotate cluster <cluster-name> redpanda.vectorized.io/managed="true" --overwrite
kubectl --namespace <namespace> annotate console <console-name> redpanda.vectorized.io/managed="true" --overwrite
The deprecated Redpanda Operator is now managing your resources. Any changes that the Redpanda Operator made to your deployment will be undone and any resources that you deleted will be reapplied.
Next steps
For information about the updated Redpanda Operator and the Redpanda custom resource, see Redpanda in Kubernetes.