Automated Deployment
If you use automation tools like Terraform and Ansible in your environment, you can use them to quickly provision a Redpanda cluster. Terraform can set up the infrastructure and output a properly formatted hosts.ini file, and Ansible can use that hosts.ini file as input to install Redpanda.
If you already have an infrastructure provisioning framework, you can supply your own hosts file (without using Terraform), and you can use Ansible to install Redpanda.
This recommended automated deployment provides a production-usable way to deploy and maintain a cluster. For unique configurations, you can work directly with the Ansible and Terraform modules to integrate them into your environment.
Prerequisites
- Install Terraform following the Terraform documentation.
- Install Ansible following the Ansible documentation. Different operating systems may have specific Ansible dependencies.
- Clone the deployment-automation GitHub repository:
  git clone https://github.com/redpanda-data/deployment-automation.git
- Change into the directory:
  cd deployment-automation
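To confirm that both tools are installed and available on your PATH, you can check their versions:

terraform version
ansible --version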
Use Terraform to set up infrastructure
AWS
The recommended Terraform module for Redpanda deploys virtual machines on AWS EC2. To create an AWS Redpanda cluster, review the default variables and make any edits necessary for your environment.
- In the deployment-automation folder, change into the aws directory:
  cd aws
- Set AWS credentials. Terraform provides multiple ways to set the AWS access key and secret key. See the Terraform documentation.
- Initialize Terraform:
  terraform init
- Create the cluster with terraform apply:
  terraform apply -var='public_key_path=~/.ssh/id_rsa.pub' -var='subnet_id=<subnet-id>' -var='vpc_id=<vpc-id>'
  - Terraform uses public_key_path to configure SSH access to the brokers. If your public key path isn't the default ~/.ssh/id_rsa.pub, then you need to set it.
  - If you don't have a default VPC defined, then you need to set subnet_id and vpc_id.
- Configuration options (for the full list, see the Terraform module):
| Property | Description |
|---|---|
| aws_region | The AWS region to use for deploying the infrastructure. Default: us-west-2 |
| nodes | The number of nodes to base the cluster on. Default: 1 |
| enable_monitoring | Creates a Prometheus/Grafana instance for monitoring the cluster. Default: true |
| instance_type | The instance type on which Redpanda is deployed. Default: i3.2xlarge |
| prometheus_instance_type | The instance type on which Prometheus and Grafana are deployed. Default: c5.2xlarge |
| public_key_path | Path to the public key of the keypair used to access the nodes. Default: ~/.ssh/id_rsa.pub |
| distro | Linux distribution to install. (This affects the distro_ami and distro_ssh_user values.) Default: ubuntu-focal |
| distro_ami | AWS AMI to use for each available distribution. These must be changed according to the chosen AWS region. |
| distro_ssh_user | User used to ssh into the created EC2 instances. |

For acceptable distro values, see the Terraform module.
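For example, to override several of these defaults at apply time (the region, instance type, and node count shown here are illustrative only, not recommendations):

terraform apply -var='public_key_path=~/.ssh/id_rsa.pub' -var='aws_region=us-east-1' -var='instance_type=i3en.xlarge' -var='nodes=3'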
GCP

- In the deployment-automation folder, change into the gcp directory:
  cd gcp
- You need an existing subnet in which to deploy the virtual machines (VMs). The subnet's attached firewall should allow inbound traffic on ports 22, 3000, 8082, 8888, 8889, 9090, 9092, 9644, and 33145. This module adds the rp-node tag to the deployed VMs, which can be used as the target tag for the firewall rule. (A sample firewall rule is shown after these steps.)
- Initialize Terraform:
  terraform init
- Create the cluster:
  terraform apply
  The following example creates a three-broker cluster using the subnet named redpanda-cluster-subnet:
  terraform apply -var nodes=3 -var subnet=redpanda-cluster-subnet -var public_key_path=~/.ssh/id_rsa.pub -var ssh_user=$USER
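If the subnet's firewall doesn't yet allow the ports listed in the first step, a rule like the following sketch can open them for VMs carrying the rp-node tag. The rule name, network, and source range are placeholders to adapt to your environment:

gcloud compute firewall-rules create redpanda-ingress \
  --network=<network-name> \
  --source-ranges=<allowed-cidr> \
  --target-tags=rp-node \
  --allow=tcp:22,tcp:3000,tcp:8082,tcp:8888,tcp:8889,tcp:9090,tcp:9092,tcp:9644,tcp:33145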
Configuration options (for the full list, see the Terraform module):
| Property | Description |
|---|---|
| region | The region to use for deploying the infrastructure. Default: us-west1 |
| zone | The region's zone to deploy the infrastructure. Default: a |
| subnet | The name of an existing subnet to deploy the infrastructure. |
| nodes | The number of nodes to base the cluster on. Keep in mind that one node is used as a monitoring node. Default: 1 |
| disks | The number of local disks to deploy on each machine. Default: 1 |
| image | The OS image running on the VMs. Default: ubuntu-os-cloud/ubuntu-2004-lts |
| machine_type | The machine type. Default: n2-standard-2 |
| public_key_path | Path to the public key of the keypair used to access the nodes. |
| ssh_user | The ssh user. Must match the one in the public ssh key's comments. |
Use Ansible to install Redpanda
- From the deployment-automation folder, set the required Ansible variables:
  export CLOUD_PROVIDER=<aws-or-gcp>
  export ANSIBLE_COLLECTIONS_PATHS=${PWD}/artifacts/collections
  export ANSIBLE_ROLES_PATH=${PWD}/artifacts/roles
  export ANSIBLE_INVENTORY=${PWD}/${CLOUD_PROVIDER}/hosts.ini
- Install the roles required by Ansible:
  ansible-galaxy install -r ansible/requirements.yml
Configure a hosts file
Redpanda Data recommends incorporating variables into your hosts.ini file for every host. Edits made to properties outside of the playbook may be overwritten.
If you used Terraform to deploy the instances, the hosts.ini file is configured automatically in the artifacts directory.
If you didn't use Terraform, then you must manually update the [redpanda] section. When you open the file, you see something like the following:
[redpanda]
ip ansible_user=ssh_user ansible_become=True private_ip=pip id=0
ip ansible_user=ssh_user ansible_become=True private_ip=pip id=1
[monitor]
ip ansible_user=ssh_user ansible_become=True private_ip=pip id=1
Under the [redpanda] section, replace the following:

| Property | Description |
|---|---|
| ip | The public IP address of the machine. |
| ansible_user | The username for Ansible to use to SSH to the machine. |
| private_ip | The private IP address of the machine. This could be the same as the public IP address. |
You can add additional properties to configure features like rack awareness and Tiered Storage.
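For example, rack is a per-broker property (see the Redpanda Ansible Collection values table later on this page); the addresses and rack names below are illustrative:

[redpanda]
203.0.113.10 ansible_user=ubuntu ansible_become=True private_ip=10.0.0.10 rack=rack1 id=0
203.0.113.11 ansible_user=ubuntu ansible_become=True private_ip=10.0.0.11 rack=rack2 id=1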
The [monitor] section is only required if you want the playbook to install and configure a basic Prometheus and Grafana setup for observability. If you have a centralized monitoring setup or if you don't require monitoring, then remove this section.
Run a playbook
Use the Ansible Collection for Redpanda to build a Redpanda cluster. The recommended Redpanda playbook enables TLS encryption and Tiered Storage.
If you prefer, you can download the modules and required roles and create your own playbook. For example, if you want to manage your own data directory, you can toggle that part off, and Redpanda only ensures that the permissions are correct. You can also generate your own security certificates.
To install and start a Redpanda cluster in one command with the Redpanda playbook, run:
ansible-playbook --private-key <your-private-key> -v ansible/playbooks/provision-basic-cluster.yml
Custom configuration
You can specify any available Redpanda configuration value, or set of values, by passing a JSON dictionary as an Ansible extra-var. These values are spliced with the calculated configuration and override only the values that you specify. Values must be unset manually with rpk. There are two sub-dictionaries you can specify: redpanda.cluster and redpanda.node. For more information, see Cluster Configuration Properties and Node Configuration Properties.
export JSONDATA='{"cluster":{"auto_create_topics_enabled":"true"},"node":{"developer_mode":"false"}}'
ansible-playbook ansible/<playbook-name>.yml --private-key artifacts/testkey -e redpanda="${JSONDATA}"
Adding whitespace to the JSON breaks configuration merging.
Use rpk and standard Kafka tools to produce to and consume from the Redpanda cluster.
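For example, a quick smoke test with rpk (the broker address and topic name are placeholders):

rpk topic create test-topic --brokers <broker-ip>:9092
echo 'hello redpanda' | rpk topic produce test-topic --brokers <broker-ip>:9092
rpk topic consume test-topic --brokers <broker-ip>:9092 --num 1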
Configure Prometheus and Grafana
Include a [monitor] section in your hosts file if you want the playbook to install and configure a basic Prometheus and Grafana setup for observability. Redpanda emits Prometheus metrics that can be scraped by a central collector. If you already have a centralized monitoring setup or if you don't require monitoring, then this is unnecessary.
To run the deploy-prometheus-grafana.yml playbook:
ansible-playbook ansible/deploy-prometheus-grafana.yml \
--private-key '<path-to-a-private-key-with-ssh-access-to-the-hosts>'
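After the playbook completes, you can spot-check both services; this assumes the default ports (Prometheus on 9090 and Grafana on 3000, as in the firewall port list above):

curl -s http://<monitor-ip>:9090/-/ready
curl -s http://<monitor-ip>:3000/api/health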
Configure Redpanda Console
To install Redpanda Console, add the redpanda_broker role to a group with install_console: true. The standard playbooks automatically install Redpanda Console on hosts in the [client] group.
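For example, a [client] group entry in the hosts file could look like the following (the IP addresses and user are placeholders, mirroring the [redpanda] entries shown earlier):

[client]
203.0.113.20 ansible_user=ubuntu ansible_become=True private_ip=10.0.0.20 id=0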
Build the cluster with TLS enabled
Configure TLS with externally provided and signed certificates. Then run the provision-tls-cluster playbook, specifying the certificate locations on the new hosts. You can either pass the variables on the command line or edit the playbook file and set them there. Consider whether you want public access to the Kafka API and Admin API endpoints. For example:
ansible-playbook ansible/provision-tls-cluster.yml \
--private-key '<path-to-a-private-key-with-ssh-access-to-the-hosts>' \
--extra-vars create_demo_certs=false \
--extra-vars advertise_public_ips=false \
--extra-vars handle_certs=false \
--extra-vars redpanda_truststore_file='<path-to-ca.crt-file>'
It is important to use a certificate signed by a valid CA for production environments. By default, the playbook generates locally signed certificates, which are not recommended for production use. Provide a valid certificate using these variables:
redpanda_certs_dir: /etc/redpanda/certs
redpanda_csr_file: "{{ redpanda_certs_dir }}/node.csr"
redpanda_key_file: "{{ redpanda_certs_dir }}/node.key"
redpanda_cert_file: "{{ redpanda_certs_dir }}/node.crt"
redpanda_truststore_file: "{{ redpanda_certs_dir }}/truststore.pem"
For testing, you can let the playbook deploy a local CA to generate private keys and signed certificates:
ansible-playbook ansible/provision-tls-cluster.yml \
--private-key '<path-to-a-private-key-with-ssh-access-to-the-hosts>'
Add brokers to an existing cluster
To add brokers to a cluster, add them to the hosts file and run the relevant playbook again. You can add skip_node=true to the existing hosts to avoid rerunning the playbooks on them, as in the example below.
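For example, with two existing brokers and one new broker (illustrative addresses; the new broker takes the next free id):

[redpanda]
203.0.113.10 ansible_user=ubuntu ansible_become=True private_ip=10.0.0.10 skip_node=true id=0
203.0.113.11 ansible_user=ubuntu ansible_become=True private_ip=10.0.0.11 skip_node=true id=1
203.0.113.12 ansible_user=ubuntu ansible_become=True private_ip=10.0.0.12 id=2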
Upgrade a cluster
The playbook is designed to be idempotent, so it should be suitable for running as part of a CI/CD pipeline or through Ansible Tower. The playbook upgrades the packages and then performs a rolling upgrade, where one broker at a time is upgraded and safely restarted. For all upgrade requirements and recommendations, see Upgrade Redpanda. It is important to test that your upgrade path is safe before using it in production.
To upgrade a cluster, run the playbook with a specific target version:
ansible-playbook --private-key ~/.ssh/id_rsa ansible/<playbook-name>.yml -e redpanda_version=22.3.10-1
By default, the playbook selects the latest version of the Redpanda packages, but an upgrade is only performed if the redpanda_install_status variable is set to latest:
ansible-playbook --private-key ~/.ssh/id_rsa ansible/<playbook-name>.yml -e redpanda_install_status=latest
To upgrade clusters with SASL authentication enabled, pass the superuser credentials through redpanda_rpk_opts:
ansible-playbook --private-key ~/.ssh/id_rsa ansible/<playbook-name>.yml -e redpanda_install_status=latest -e redpanda_rpk_opts='--user <superuser> --password <password>'
Similarly, you can put the redpanda_rpk_opts into a YAML file protected with Ansible Vault:
ansible-playbook --private-key ~/.ssh/id_rsa ansible/<playbook-name>.yml --extra-vars=redpanda_install_status=latest --extra-vars @vault-file.yml --ask-vault-pass
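A minimal sketch of such a vault file (the credentials are placeholders; encrypt the file with ansible-vault encrypt vault-file.yml):

redpanda_rpk_opts: "--user <superuser> --password <password>"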
Redpanda Ansible Collection values
You can pass the following variables as -e var=value when running Ansible:
| Property | Default value | Description |
|---|---|---|
| redpanda_organization | redpanda-test | Set this to identify your organization in the asset management system. |
| redpanda_cluster_id | redpanda | This helps identify the cluster. |
| advertise_public_ips | false | Configure Redpanda to advertise the broker's public IPs for client communication instead of private IPs. This enables using the cluster from outside its subnet. Note: This is not recommended for production deployments, because your brokers will be public. |
| grafana_admin_pass | enter_your_secure_password | Grafana admin user's password. |
| ephemeral_disk | false | Enable filesystem check for attached disk. This is useful when using attached disks in instances with ephemeral operating system disks like Azure L Series. This allows a filesystem repair at boot time and ensures that the drive is remounted automatically after a reboot. |
| redpanda_mode | production | Enables hardware optimization. |
| redpanda_admin_api_port | 9644 | |
| redpanda_kafka_port | 9092 | |
| redpanda_rpc_port | 33145 | |
| redpanda_schema_registry_port | 8081 | |
| redpanda_use_staging_repo | false | Enables access to unstable builds. |
| redpanda_version | latest | Version; for example, 22.2.2-1 or 22.3.1~rc1-1. If this value is set, then the package is upgraded if the installed version is lower than the version specified. |
| redpanda_rpk_opts | | Command line options to be passed to instances where rpk is executed; for example, superuser credentials. |
| redpanda_install_status | present | If redpanda_version is updated, set this to latest to trigger an upgrade. |
| redpanda_data_directory | /var/lib/redpanda/data | Path where Redpanda keeps its data. |
| redpanda_key_file | /etc/redpanda/certs/node.key | TLS: Path to private key. |
| redpanda_cert_file | /etc/redpanda/certs/node.crt | TLS: Path to signed certificate. |
| redpanda_truststore_file | /etc/redpanda/certs/truststore.pem | TLS: Path to truststore. |
| tls | false | Set to true to configure Redpanda to use TLS. |
| skip_node | false | Node configuration to prevent the redpanda_broker role being applied to this specific broker. Use carefully when adding new brokers to prevent existing brokers from being reconfigured. |
| restart_node | | Node configuration to prevent Redpanda brokers from being restarted after updating. Use with care: this can leave brokers running mixed versions, because a new version takes effect only after a restart. |
| rack | | Node configuration to enable rack awareness. Rack awareness is enabled cluster-wide if at least one broker has this set. |
| tiered_storage_bucket_name | | Set bucket name to enable Tiered Storage. |
| schema_registry_replication_factor | 1 | The replication factor of Schema Registry's internal storage topic. |
| aws_region | | The region to be used if Tiered Storage is enabled. |
Troubleshooting
On macOS, Python may be unable to fork workers. You may see something like the following:
ok: [34.209.26.177] => {"changed": false, "stat": {"exists": false}}
objc[57889]: +[__NSCFConstantString initialize] may have been in progress in another thread when fork() was called. We cannot safely call it or ignore it in the fork() child process. Crashing instead. Set a breakpoint on objc_initializeAfterForkError to debug.
ERROR! A worker was found in a dead state
Try setting an environment variable to resolve the error:
export OBJC_DISABLE_INITIALIZE_FORK_SAFETY=YES