Debezium series: Deploying Debezium on Kubernetes
1. Overview
Debezium can be easily deployed on the open source container management platform Kubernetes. The deployment leverages the Strimzi project, which aims to simplify the deployment of Apache Kafka on Kubernetes through custom resources.
To test your deployment, you can use minikube, which starts a Kubernetes cluster on your local machine. If you want to fully test the Debezium deployment described in this document on minikube, you need to set up an insecure container image registry on minikube. To do this, you need to start minikube with the --insecure-registry flag:
$ minikube start --insecure-registry "10.0.0.0/24"
10.0.0.1 is the default service cluster IP, so this setting allows images to be pulled from anywhere in the cluster. You also need to enable the minikube registry addon:
minikube addons enable registry
2. Prerequisites
To keep containers separate from other workloads on the cluster, create a dedicated namespace for Debezium. In the rest of this document, the debezium-example namespace will be used:
kubectl create ns debezium-example
Deploying the Strimzi Operator
As mentioned above, for the Debezium deployment we will use Strimzi, which manages the Kafka deployment on Kubernetes.
The easiest way to install Strimzi is through the Operator Lifecycle Manager (OLM). If OLM is not installed on your cluster, you can install it by running:
curl -sL https://github.com/operator-framework/operator-lifecycle-manager/releases/download/v0.20.0/install.sh | bash -s v0.20.0
Now, install the Strimzi operator itself:
kubectl create -f https://operatorhub.io/install/strimzi-kafka-operator.yaml
3. Create Secrets for the database
Later, when deploying the Debezium Kafka connector, we need to provide the username and password that the connector uses to connect to the database. For security reasons, it is best not to put credentials directly in the connector configuration, but to keep them in a separate, secure location. Kubernetes provides Secret objects for this purpose. In addition to creating the Secret object itself, we must also create a role and a role binding so that Kafka Connect can access the credentials.
Let's create the Secret object first:
$ cat << EOF | kubectl create -n debezium-example -f -
apiVersion: v1
kind: Secret
metadata:
  name: debezium-secret
  namespace: debezium-example
type: Opaque
data:
  username: ZGViZXppdW0=
  password: ZGJ6
EOF
The username and password contain Base64-encoded credentials (debezium/dbz) to connect to the MySQL database, which we will deploy later.
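As a quick check, the Base64 values in the Secret can be reproduced from the plain-text credentials. The following is a minimal shell sketch; the variable names are illustrative only, and it assumes a base64 utility is available on your machine.

```shell
# Encode the plain-text credentials the same way the Secret stores them.
# printf '%s' is used instead of echo so that no trailing newline is encoded.
username_b64=$(printf '%s' 'debezium' | base64)
password_b64=$(printf '%s' 'dbz' | base64)

echo "username: ${username_b64}"
echo "password: ${password_b64}"

# Decode again to confirm the round trip.
decoded_username=$(printf '%s' "$username_b64" | base64 --decode)
echo "decoded username: ${decoded_username}"
```

If you choose different credentials, encode them the same way and paste the results into the data section of the Secret.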
Now, we can create a role that references the secret created in the previous step:
$ cat << EOF | kubectl create -n debezium-example -f -
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: connector-configuration-role
  namespace: debezium-example
rules:
- apiGroups: [""]
  resources: ["secrets"]
  resourceNames: ["debezium-secret"]
  verbs: ["get"]
EOF
We must also bind this role to the Kafka Connect cluster service account so that Kafka Connect can access the Secret:
cat << EOF | kubectl create -n debezium-example -f -
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: connector-configuration-role-binding
  namespace: debezium-example
subjects:
- kind: ServiceAccount
  name: debezium-connect-cluster-connect
  namespace: debezium-example
roleRef:
  kind: Role
  name: connector-configuration-role
  apiGroup: rbac.authorization.k8s.io
EOF
Strimzi creates the service account when we deploy Kafka Connect. The name of the service account takes the form $KafkaConnectName-connect. Later, we will create a Kafka Connect cluster called debezium-connect-cluster, so we use debezium-connect-cluster-connect here as the subject name.
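The naming rule can be sketched in shell as follows. The variable names and the check_secret_access helper are hypothetical and exist only for illustration; the helper wraps kubectl auth can-i and is defined but not called here, because it needs a running cluster with the Role and RoleBinding in place.

```shell
# Derive the service account name from the Kafka Connect cluster name.
connect_name="debezium-connect-cluster"
namespace="debezium-example"
service_account="${connect_name}-connect"

# Fully qualified subject name as used by Kubernetes impersonation.
subject="system:serviceaccount:${namespace}:${service_account}"
echo "service account: ${service_account}"
echo "subject:         ${subject}"

# Hypothetical helper: asks the API server whether the service account
# may read the Secret. Call it manually once the cluster is available.
check_secret_access() {
  kubectl auth can-i get secret/debezium-secret -n "$namespace" --as "$subject"
}
```

Running check_secret_access against the cluster should answer yes once the role binding exists.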
4. Deploy Apache Kafka
Next, deploy a (single-node) Kafka cluster:
$ cat << EOF | kubectl create -n debezium-example -f -
apiVersion: kafka.strimzi.io/v1beta2
kind: Kafka
metadata:
  name: debezium-cluster
spec:
  kafka:
    replicas: 1
    listeners:
      - name: plain
        port: 9092
        type: internal
        tls: false
      - name: tls
        port: 9093
        type: internal
        tls: true
        authentication:
          type: tls
      - name: external
        port: 9094
        type: nodeport
        tls: false
    storage:
      type: jbod
      volumes:
      - id: 0
        type: persistent-claim
        size: 100Gi
        deleteClaim: false
    config:
      offsets.topic.replication.factor: 1
      transaction.state.log.replication.factor: 1
      transaction.state.log.min.isr: 1
      default.replication.factor: 1
      min.insync.replicas: 1
  zookeeper:
    replicas: 1
    storage:
      type: persistent-claim
      size: 100Gi
      deleteClaim: false
  entityOperator:
    topicOperator: {}
    userOperator: {}
EOF
Wait for it to be ready:
$ kubectl wait kafka/debezium-cluster --for=condition=Ready --timeout=300s -n debezium-example
5. Deploy the data source
The following uses MySQL as the data source. In addition to running a pod with MySQL, you need a service that points to that pod. Both can be created, for example, as follows:
$ cat << EOF | kubectl create -n debezium-example -f -
apiVersion: v1
kind: Service
metadata:
  name: mysql
spec:
  ports:
  - port: 3306
  selector:
    app: mysql
  clusterIP: None
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: mysql
spec:
  selector:
    matchLabels:
      app: mysql
  strategy:
    type: Recreate
  template:
    metadata:
      labels:
        app: mysql
    spec:
      containers:
      - image: quay.io/debezium/example-mysql:2.3
        name: mysql
        env:
        - name: MYSQL_ROOT_PASSWORD
          value: debezium
        - name: MYSQL_USER
          value: mysqluser
        - name: MYSQL_PASSWORD
          value: mysqlpw
        ports:
        - containerPort: 3306
          name: mysql
EOF
6. Deploy the Debezium connector
To deploy a Debezium connector, you need to deploy a Kafka Connect cluster with the required connector plugins before instantiating the actual connector itself. As a first step, a Kafka Connect container image with plugins must be created. You can skip this step if you already have a container image built and available in the registry. This document uses the MySQL connector as an example.
$ cat << EOF | kubectl create -n debezium-example -f -
apiVersion: kafka.strimzi.io/v1beta2
kind: KafkaConnect
metadata:
  name: debezium-connect-cluster
  annotations:
    strimzi.io/use-connector-resources: "true"
spec:
  version: 3.1.0
  replicas: 1
  bootstrapServers: debezium-cluster-kafka-bootstrap:9092
  config:
    config.providers: secrets
    config.providers.secrets.class: io.strimzi.kafka.KubernetesSecretConfigProvider
    group.id: connect-cluster
    offset.storage.topic: connect-cluster-offsets
    config.storage.topic: connect-cluster-configs
    status.storage.topic: connect-cluster-status
    # -1 means it will use the default replication factor configured in the broker
    config.storage.replication.factor: -1
    offset.storage.replication.factor: -1
    status.storage.replication.factor: -1
  build:
    output:
      type: docker
      image: 10.110.154.103/debezium-connect-mysql:latest
    plugins:
      - name: debezium-mysql-connector
        artifacts:
          - type: tgz
            url: https://repo1.maven.org/maven2/io/debezium/debezium-connector-mysql/{debezium-version}/debezium-connector-mysql-{debezium-version}-plugin.tar.gz
EOF
You must replace the registry IP address 10.110.154.103 with the address of a registry to which you can push images. If you run on minikube with the registry addon enabled, you can push the image to the internal minikube registry. The IP address of that registry can be obtained by running:
kubectl -n kube-system get svc registry -o jsonpath='{.spec.clusterIP}'
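To avoid copying the address by hand, the image reference for the build output can be assembled in shell. This is only a sketch: build_image_ref is a hypothetical helper, and the example call reuses the sample address from this document rather than a value read from a live cluster.

```shell
# Hypothetical helper: builds the image reference used in spec.build.output.image
# from a registry host or IP address passed as the first argument.
build_image_ref() {
  printf '%s/debezium-connect-mysql:latest\n' "$1"
}

# Example with the sample address used in this document.
image_ref=$(build_image_ref "10.110.154.103")
echo "image: ${image_ref}"

# With a running minikube cluster, the address could instead be read directly:
# build_image_ref "$(kubectl -n kube-system get svc registry -o jsonpath='{.spec.clusterIP}')"
```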
For simplicity, we skip checksum verification of downloaded artifacts. If you want to ensure that an artifact was downloaded correctly, specify its checksum via the sha512sum attribute.
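If you do want to set sha512sum, the value can be computed from a downloaded copy of the artifact. The sketch below assumes the GNU sha512sum tool is installed (on macOS, shasum -a 512 is a common substitute); artifact_sha512 is a hypothetical helper, and a temporary file stands in for the real plugin archive.

```shell
# Hypothetical helper: prints only the SHA-512 hash of the given file.
# sha512sum prints "<hash>  <file>", so the first field is the hash.
artifact_sha512() {
  sha512sum "$1" | cut -d ' ' -f 1
}

# Stand-in for the downloaded plugin archive.
sample_file=$(mktemp)
printf 'example artifact contents\n' > "$sample_file"

checksum=$(artifact_sha512 "$sample_file")
echo "sha512sum: ${checksum}"

rm -f "$sample_file"
```

For the real archive, download it first and pass its path to the helper, then copy the printed hash into the sha512sum attribute next to the artifact's url.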
If you already have a suitable container image in a local or remote registry (such as quay.io or DockerHub), you can use this simplified version:
$ cat << EOF | kubectl create -n debezium-example -f -
apiVersion: kafka.strimzi.io/v1beta2
kind: KafkaConnect
metadata:
  name: debezium-connect-cluster
  annotations:
    strimzi.io/use-connector-resources: "true"
spec:
  version: 3.1.0
  image: 10.110.154.103/debezium-connect-mysql:latest
  replicas: 1
  bootstrapServers: debezium-cluster-kafka-bootstrap:9092
  config:
    config.providers: secrets
    config.providers.secrets.class: io.strimzi.kafka.KubernetesSecretConfigProvider
    group.id: connect-cluster
    offset.storage.topic: connect-cluster-offsets
    config.storage.topic: connect-cluster-configs
    status.storage.topic: connect-cluster-status
    # -1 means it will use the default replication factor configured in the broker
    config.storage.replication.factor: -1
    offset.storage.replication.factor: -1
    status.storage.replication.factor: -1
EOF
Also note that we have configured the Strimzi secret provider. It relies on the service account that Strimzi creates for this Kafka Connect cluster (and which we have already bound to the appropriate role) to let Kafka Connect read our Secret object.
7. Create a Debezium connector
To create a Debezium connector, you just need to create a KafkaConnector with the appropriate configuration, in this case MySQL:
$ cat << 'EOF' | kubectl create -n debezium-example -f -
apiVersion: kafka.strimzi.io/v1beta2
kind: KafkaConnector
metadata:
  name: debezium-connector-mysql
  labels:
    strimzi.io/cluster: debezium-connect-cluster
spec:
  class: io.debezium.connector.mysql.MySqlConnector
  tasksMax: 1
  config:
    tasks.max: 1
    database.hostname: mysql
    database.port: 3306
    database.user: ${secrets:debezium-example/debezium-secret:username}
    database.password: ${secrets:debezium-example/debezium-secret:password}
    database.server.id: 184054
    topic.prefix: mysql
    database.include.list: inventory
    schema.history.internal.kafka.bootstrap.servers: debezium-cluster-kafka-bootstrap:9092
    schema.history.internal.kafka.topic: schema-changes.inventory
EOF
Notice that instead of putting a plain-text username and password in the connector configuration, we reference the Secret object we created earlier. The heredoc delimiter is quoted ('EOF') so that the shell passes the ${secrets:...} placeholders through unchanged instead of trying to expand them.
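Because the connector configuration contains ${...} placeholders, it is worth confirming that the shell hands them to kubectl untouched. The sketch below captures a one-line excerpt through a heredoc with a quoted delimiter, which disables shell expansion inside the heredoc body; the variable name is illustrative only.

```shell
# A quoted delimiter ('EOF') tells the shell not to expand anything in the body,
# so the placeholder reaches its consumer exactly as written.
quoted=$(cat << 'EOF'
database.user: ${secrets:debezium-example/debezium-secret:username}
EOF
)

printf '%s\n' "$quoted"
```

With an unquoted delimiter, the shell would attempt parameter expansion on the placeholder before kubectl ever sees it, so the quoted form is the safer choice for manifests like this one.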
8. Verify deployment
To verify that everything is working, you can, for example, start watching the mysql.inventory.customers Kafka topic:
kubectl run -n debezium-example -it --rm \
    --image=quay.io/debezium/tooling:1.2 \
    --restart=Never watcher \
    -- kcat -b debezium-cluster-kafka-bootstrap:9092 \
    -C -o beginning -t mysql.inventory.customers
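The topic name watched here is made up of the connector's topic.prefix, the database name, and the table name, joined by dots. A small shell sketch of that pattern, with illustrative variable names:

```shell
# Values taken from the connector configuration and the example database.
topic_prefix="mysql"      # topic.prefix
database="inventory"      # database.include.list
table="customers"         # table being changed

# Change events for a table are written to <topic.prefix>.<database>.<table>.
topic="${topic_prefix}.${database}.${table}"
echo "topic: ${topic}"
```

Changing the table variable gives the topic to watch for any other table in the inventory database.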
Connect to the MySQL database:
kubectl run -n debezium-example -it --rm \
    --image=mysql:8.0 --restart=Never \
    --env MYSQL_ROOT_PASSWORD=debezium mysqlterm \
    -- mysql -hmysql -P3306 -uroot -pdebezium
Make some changes in the customers table of the inventory database:
mysql> update inventory.customers set first_name="Sally Marie" where id=1001;
You should now be able to observe change events on the Kafka topic:
{
  ...
  "payload": {
    "before": {
      "id": 1001,
      "first_name": "Sally",
      "last_name": "Thomas",
      "email": "[email protected]"
    },
    "after": {
      "id": 1001,
      "first_name": "Sally Marie",
      "last_name": "Thomas",
      "email": "[email protected]"
    },
    "source": {
      "version": "{debezium-version}",
      "connector": "mysql",
      "name": "mysql",
      "ts_ms": 1646300467000,
      "snapshot": "false",
      "db": "inventory",
      "sequence": null,
      "table": "customers",
      "server_id": 223344,
      "gtid": null,
      "file": "mysql-bin.000003",
      "pos": 401,
      "row": 0,
      "thread": null,
      "query": null
    },
    "op": "u",
    "ts_ms": 1646300467746,
    "transaction": null
  }
}
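The op field in the payload tells you what kind of change the event describes; here it is u because the row was updated. The following shell sketch maps the most common codes to readable names. describe_op is a hypothetical helper, and any code it does not list is simply reported as "other".

```shell
# Hypothetical helper: translates a Debezium "op" code into a readable name.
describe_op() {
  case "$1" in
    c) echo "create" ;;
    u) echo "update" ;;
    d) echo "delete" ;;
    r) echo "read (snapshot)" ;;
    *) echo "other" ;;
  esac
}

# The event shown above carries "op": "u".
op_name=$(describe_op "u")
echo "operation: ${op_name}"
```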