Deploy a Kafka Connect cluster based on Confluent Kafka and load the Debezium plugin
1. Download Confluent Kafka
Confluent Kafka can be downloaded from the official Confluent site; the free Community edition is sufficient for this deployment.
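A download-and-extract sketch for the 7.3.3 Community tarball is shown below; the archive URL follows Confluent's packages.confluent.io naming convention and is an assumption to adjust to the version you need:
wget https://packages.confluent.io/archive/7.3/confluent-community-7.3.3.tar.gz
tar -zxvf confluent-community-7.3.3.tar.gz -C /data/src/
# the archive is expected to extract to /data/src/confluent-7.3.3, the path used throughout this article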
2. The configuration file connect-distributed.properties
The core parameters are as follows:
- /data/src/confluent-7.3.3/etc/schema-registry/connect-distributed.properties
bootstrap.servers=realtime-kafka-001:9092,realtime-kafka-003:9092,realtime-kafka-002:9092
group.id=datasight-confluent-test-debezium-cluster-status
key.converter=org.apache.kafka.connect.json.JsonConverter
value.converter=org.apache.kafka.connect.json.JsonConverter
key.converter.schemas.enable=true
value.converter.schemas.enable=true
config.storage.topic=offline_confluent_test_debezium_cluster_connect_configs
offset.storage.topic=offline_confluent_test_debezium_cluster_connect_offsets
status.storage.topic=offline_confluent_test_debezium_cluster_connect_statuses
config.storage.replication.factor=3
offset.storage.replication.factor=3
status.storage.replication.factor=3
offset.storage.partitions=25
status.storage.partitions=5
config.storage.partitions=1
internal.key.converter=org.apache.kafka.connect.json.JsonConverter
internal.value.converter=org.apache.kafka.connect.json.JsonConverter
internal.key.converter.schemas.enable=true
internal.value.converter.schemas.enable=true
#rest.host.name=0.0.0.0
#rest.port=8083
#rest.advertised.host.name=0.0.0.0
#rest.advertised.port=8083
plugin.path=/data/service/debezium/connectors2
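Kafka Connect creates the three internal storage topics automatically on first startup, and they must be compacted topics. If you prefer to create them in advance, a sketch with the kafka-topics tool (assuming the Confluent bin directory is on the PATH and using the partition counts configured above):
kafka-topics --bootstrap-server realtime-kafka-001:9092 --create --topic offline_confluent_test_debezium_cluster_connect_configs --partitions 1 --replication-factor 3 --config cleanup.policy=compact
kafka-topics --bootstrap-server realtime-kafka-001:9092 --create --topic offline_confluent_test_debezium_cluster_connect_offsets --partitions 25 --replication-factor 3 --config cleanup.policy=compact
kafka-topics --bootstrap-server realtime-kafka-001:9092 --create --topic offline_confluent_test_debezium_cluster_connect_statuses --partitions 5 --replication-factor 3 --config cleanup.policy=compact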
3. The startup script connect-distributed
- Script path: /data/src/confluent-7.3.3/bin/connect-distributed
- The content of connect-distributed is shown below and does not need to be modified.
- If you need to export Kafka Connect JMX metrics, set the JMX export port and deploy a JMX exporter (a sketch follows the script below); for detailed deployment steps, refer to the blogger's technical blog on the JMX exporter (listed in the summary section).
if [ $# -lt 1 ];
then
  echo "USAGE: $0 [-daemon] connect-distributed.properties"
  exit 1
fi

base_dir=$(dirname $0)

###
### Classpath additions for Confluent Platform releases (LSB-style layout)
###
# cd -P deals with symlink from /bin to /usr/bin
java_base_dir=$( cd -P "$base_dir/../share/java" && pwd )

# confluent-common: required by kafka-serde-tools
# kafka-serde-tools (e.g. Avro serializer): bundled with confluent-schema-registry package
for library in "confluent-security/connect" "kafka" "confluent-common" "kafka-serde-tools" "monitoring-interceptors"; do
  dir="$java_base_dir/$library"
  if [ -d "$dir" ]; then
    classpath_prefix="$CLASSPATH:"
    if [ "x$CLASSPATH" = "x" ]; then
      classpath_prefix=""
    fi
    CLASSPATH="$classpath_prefix$dir/*"
  fi
done

if [ "x$KAFKA_LOG4J_OPTS" = "x" ]; then
  LOG4J_CONFIG_DIR_NORMAL_INSTALL="/etc/kafka"
  LOG4J_CONFIG_NORMAL_INSTALL="${LOG4J_CONFIG_DIR_NORMAL_INSTALL}/connect-log4j.properties"
  LOG4J_CONFIG_DIR_ZIP_INSTALL="$base_dir/../etc/kafka"
  LOG4J_CONFIG_ZIP_INSTALL="${LOG4J_CONFIG_DIR_ZIP_INSTALL}/connect-log4j.properties"
  if [ -e "$LOG4J_CONFIG_NORMAL_INSTALL" ]; then # Normal install layout
    KAFKA_LOG4J_OPTS="-Dlog4j.configuration=file:${LOG4J_CONFIG_NORMAL_INSTALL} -Dlog4j.config.dir=${LOG4J_CONFIG_DIR_NORMAL_INSTALL}"
  elif [ -e "${LOG4J_CONFIG_ZIP_INSTALL}" ]; then # Simple zip file layout
    KAFKA_LOG4J_OPTS="-Dlog4j.configuration=file:${LOG4J_CONFIG_ZIP_INSTALL} -Dlog4j.config.dir=${LOG4J_CONFIG_DIR_ZIP_INSTALL}"
  else # Fallback to normal default
    KAFKA_LOG4J_OPTS="-Dlog4j.configuration=file:$base_dir/../config/connect-log4j.properties -Dlog4j.config.dir=$base_dir/../config"
  fi
fi
export KAFKA_LOG4J_OPTS

if [ "x$KAFKA_HEAP_OPTS" = "x" ]; then
  export KAFKA_HEAP_OPTS="-Xms256M -Xmx2G"
fi

EXTRA_ARGS=${EXTRA_ARGS-'-name connectDistributed'}

COMMAND=$1
case $COMMAND in
  -daemon)
    EXTRA_ARGS="-daemon "$EXTRA_ARGS
    shift
    ;;
  *)
    ;;
esac

export CLASSPATH
exec $(dirname $0)/kafka-run-class $EXTRA_ARGS org.apache.kafka.connect.cli.ConnectDistributed "$@"
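As noted above, if the worker's JMX metrics need to be exported, the relevant environment variables can be set before starting connect-distributed; kafka-run-class picks them up, so the script itself stays unchanged. A minimal sketch, where port 9581, the exporter jar path, its config file, and port 9582 are placeholders for your own deployment:
export JMX_PORT=9581
# optional: attach the Prometheus JMX exporter as a Java agent (paths and port are hypothetical)
export KAFKA_OPTS="-javaagent:/data/service/jmx_exporter/jmx_prometheus_javaagent.jar=9582:/data/service/jmx_exporter/kafka-connect.yml"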
4. Start the Kafka Connect cluster
The start command looks like this:
/data/src/confluent-7.3.3/bin/connect-distributed /data/src/confluent-7.3.3/etc/schema-registry/connect-distributed.properties
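To run the worker in the background, the -daemon option handled by the script above can be used:
/data/src/confluent-7.3.3/bin/connect-distributed -daemon /data/src/confluent-7.3.3/etc/schema-registry/connect-distributed.properties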
Key log output from a normal Kafka Connect cluster startup looks like this:
[2023-06-21 16:43:01,249] INFO EnrichedConnectorConfig values:
config.action.reload = restart
connector.class = io.debezium.connector.mysql.MySqlConnector
errors.log.enable = false
errors.log.include.messages = false
errors.retry.delay.max.ms = 60000
errors.retry.timeout = 0
errors.tolerance = none
exactly.once.support = requested
header.converter = null
key.converter = null
name = mysql-dw-valuekey-test
offsets.storage.topic = null
predicates = []
tasks.max = 1
topic.creation.default.exclude = []
topic.creation.default.include = [.*]
topic.creation.default.partitions = 12
topic.creation.default.replication.factor = 3
topic.creation.groups = []
transaction.boundary = poll
transaction.boundary.interval.ms = null
transforms = [unwrap, moveFieldsToHeader, moveHeadersToValue, Reroute]
transforms.Reroute.key.enforce.uniqueness = true
transforms.Reroute.key.field.regex = null
transforms.Reroute.key.field.replacement = null
transforms.Reroute.logical.table.cache.size = 16
transforms.Reroute.negate = false
transforms.Reroute.predicate =
transforms.Reroute.topic.regex = debezium-dw-encryption-test.dw.(.*)
transforms.Reroute.topic.replacement = debezium-test-dw-encryption-all3
transforms.Reroute.type = class io.debezium.transforms.ByLogicalTableRouter
transforms.moveFieldsToHeader.fields = [cdc_code, product]
transforms.moveFieldsToHeader.headers = [product_code, productname]
transforms.moveFieldsToHeader.negate = false
transforms.moveFieldsToHeader.operation = copy
transforms.moveFieldsToHeader.predicate =
transforms.moveFieldsToHeader.type = class org.apache.kafka.connect.transforms.HeaderFrom$Value
transforms.moveHeadersToValue.fields = [product_code2, productname2]
transforms.moveHeadersToValue.headers = [product_code, productname]
transforms.moveHeadersToValue.negate = false
transforms.moveHeadersToValue.operation = copy
transforms.moveHeadersToValue.predicate =
transforms.moveHeadersToValue.type = class io.debezium.transforms.HeaderToValue
transforms.unwrap.add.fields = []
transforms.unwrap.add.headers = []
transforms.unwrap.delete.handling.mode = drop
transforms.unwrap.drop.tombstones = true
transforms.unwrap.negate = false
transforms.unwrap.predicate =
transforms.unwrap.route.by.field =
transforms.unwrap.type = class io.debezium.transforms.ExtractNewRecordState
value.converter = null
(org.apache.kafka.connect.runtime.ConnectorConfig$EnrichedConnectorConfig:376)
[2023-06-21 16:43:01,253] INFO [mysql-dw-valuekey-test|task-0] Loading the custom topic naming strategy plugin: io.debezium.schema.DefaultTopicNamingStrategy (io.debezium.config.CommonConnectorConfig:849)
Jun 21, 2023 4:43:01 PM org.glassfish.jersey.internal.Errors logErrors
WARNING: The following warnings have been detected: WARNING: The (sub)resource method listLoggers in org.apache.kafka.connect.runtime.rest.resources.LoggingResource contains empty path annotation.
WARNING: The (sub)resource method listConnectors in org.apache.kafka.connect.runtime.rest.resources.ConnectorsResource contains empty path annotation.
WARNING: The (sub)resource method createConnector in org.apache.kafka.connect.runtime.rest.resources.ConnectorsResource contains empty path annotation.
WARNING: The (sub)resource method listConnectorPlugins in org.apache.kafka.connect.runtime.rest.resources.ConnectorPluginsResource contains empty path annotation.
WARNING: The (sub)resource method serverInfo in org.apache.kafka.connect.runtime.rest.resources.RootResource contains empty path annotation.
[2023-06-21 16:43:01,482] INFO Started o.e.j.s.ServletContextHandler@2b80497f{/,null,AVAILABLE} (org.eclipse.jetty.server.handler.ContextHandler:921)
[2023-06-21 16:43:01,482] INFO REST resources initialized; server is started and ready to handle requests (org.apache.kafka.connect.runtime.rest.RestServer:324)
[2023-06-21 16:43:01,482] INFO Kafka Connect started (org.apache.kafka.connect.runtime.Connect:56)
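When the log shows "Kafka Connect started", the worker's REST interface can be used to confirm that the node is up. A minimal check, assuming the default REST port 8083 on the local host:
curl -s http://localhost:8083/
# returns the worker version and Kafka cluster id, e.g. {"version":"7.3.3-ce","commit":"...","kafka_cluster_id":"..."}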
5. Load the Debezium plugin
- Download the Debezium plugin into the directory configured by plugin.path (/data/service/debezium/connectors2).
- Restart the Kafka Connect cluster so that the plugin is picked up.
After the restart, check whether the Debezium plugin was loaded successfully; the output below shows that it was.
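The plugin list below comes from the worker's connector-plugins REST endpoint; assuming the worker runs locally on the default port 8083, it can be queried like this:
curl -s http://localhost:8083/connector-plugins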
[
  { "class": "io.debezium.connector.mysql.MySqlConnector", "type": "source", "version": "2.2.1.Final" },
  { "class": "org.apache.kafka.connect.mirror.MirrorCheckpointConnector", "type": "source", "version": "7.3.3-ce" },
  { "class": "org.apache.kafka.connect.mirror.MirrorHeartbeatConnector", "type": "source", "version": "7.3.3-ce" },
  { "class": "org.apache.kafka.connect.mirror.MirrorSourceConnector", "type": "source", "version": "7.3.3-ce" }
]
6. Summary and extension
Summary:
- So far, a single-node Kafka Connect cluster has been deployed. If more nodes are needed, start Kafka Connect with the same configuration (in particular the same group.id and internal topics) on additional servers, and the workers will form a multi-node Kafka Connect cluster.
For more information on loading the Debezium plugin with Kafka Connect, refer to the blogger's technical blogs and Debezium column listed below:
- Debezium series: detailed steps to install and deploy Debezium, and manage the Debezium service with systemctl
- Debezium series: a practical walkthrough of Debezium 2.0 and above
- Debezium series: detailed steps for installing and deploying Debezium 2.0 and above
- Debezium series: upgrading Debezium clusters that connect to thousands of MySQL, SQL Server, MongoDB, and PostgreSQL databases from Debezium 1.X to Debezium 2.X
- Debezium series: installing the JMX exporter to monitor Debezium metrics
- Debezium series: detailed steps for deploying the Debezium UI
- Debezium column address
Extension:
- After the Kafka Connect cluster is formed, start multiple connectors to test its stability and reliability (a minimal connector registration sketch follows this list).
- A Kafka Connect cluster UI can additionally be deployed.
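For the stability test mentioned above, connectors are registered through the Connect REST API. A minimal sketch of registering a Debezium MySQL connector is shown below; the connector name, database host, credentials, server id, table list, and schema-history topic are placeholders to replace with real values (property names follow Debezium 2.x):
curl -s -X POST -H "Content-Type: application/json" http://localhost:8083/connectors -d '{
  "name": "mysql-test-connector",
  "config": {
    "connector.class": "io.debezium.connector.mysql.MySqlConnector",
    "tasks.max": "1",
    "database.hostname": "mysql-host",
    "database.port": "3306",
    "database.user": "debezium_user",
    "database.password": "debezium_password",
    "database.server.id": "184054",
    "topic.prefix": "debezium-test",
    "table.include.list": "dw.demo_table",
    "schema.history.internal.kafka.bootstrap.servers": "realtime-kafka-001:9092",
    "schema.history.internal.kafka.topic": "schema-changes.debezium-test"
  }
}'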