Kafka Connect: importing and exporting data in standalone mode

Table of Contents

(1) Configuring the memory allocated to Connect on Linux:

(2) connect-standalone configuration parameters:

A, Broker access address:

B, Offset storage path:

C, Plugin path:

(3) Source configuration parameters:

A, JDBC source configuration parameters:

B, File source configuration parameters:

C, Finding plugins on the official website:

(4) Sink configuration parameters:

A, JDBC sink configuration parameters:

B, File sink configuration parameters:

C, Finding plugins on the official website:

(5) Checking the connector.class:

(6) Checking the JDBC driver and other connection drivers:

(7) Starting Connect in standalone mode:

A, Windows startup:

B, Linux startup:

(8) Dynamically managing Connect:

(9) Testing Connect:


 

(1) Configuring the memory allocated to Connect on Linux:

Edit the bin/connect-standalone.sh file under the Kafka installation path to change the memory allocated to Connect:
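For reference, the heap setting near the top of bin/connect-standalone.sh usually looks like the snippet below; the exact default may differ between Kafka versions, and the -Xms/-Xmx values here are only illustrative. Adjust them to change the memory given to Connect.

if [ "x$KAFKA_HEAP_OPTS" = "x" ]; then
    export KAFKA_HEAP_OPTS="-Xms256M -Xmx2G"
fi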

(2) connect-standalone configuration parameters:

A, Broker access address:

Parameter: bootstrap.servers, the access address of the single Kafka broker.

B, Offset storage path:

Parameter: offset.storage.file.filename, the file path where offsets are stored.

C, Plugin path:

Parameter: plugin.path, the path where connector plugins (the connector.class implementations) are stored.
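Putting these three parameters together, a minimal connect-standalone.properties sketch could look like this; the broker address and paths are placeholders to adapt to your environment:

bootstrap.servers=localhost:9092
offset.storage.file.filename=/tmp/connect.offsets
plugin.path=/usr/local/kafka/plugins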

(3) Source configuration parameters:

A, JDBC source configuration parameters:

Parameter: name, the connector name; it only needs to be unique.

Parameter: connector.class, the connector type; for JDBC it should be io.confluent.connect.jdbc.JdbcSourceConnector.

Parameter: tasks.max, the maximum number of tasks.

Parameter: connection.url, the JDBC connection URL. For example:

jdbc:mysql://localhost:3306/A?user=***&password=***

Parameter: table.whitelist, the tables to be captured; separate multiple tables with commas.

Parameter: mode, the capture mode; there are three options: incrementing, timestamp, and timestamp+incrementing.

Parameter: incrementing.column.name, the auto-incrementing column; not required if no incrementing column is used.

Parameter: timestamp.column.name, the timestamp column; not required if no timestamp column is used.

Parameter: topic.prefix, the prefix for Kafka topic names; the full topic name is prefix + table name.
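A complete JDBC source configuration file combining the parameters above might look like this sketch; the connector name, connection details, table names, column names, and topic prefix are placeholders:

name=mysql-source-connector
connector.class=io.confluent.connect.jdbc.JdbcSourceConnector
tasks.max=1
connection.url=jdbc:mysql://localhost:3306/A?user=***&password=***
table.whitelist=user,order
mode=timestamp+incrementing
incrementing.column.name=id
timestamp.column.name=update_time
topic.prefix=mysql-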

B, File source configuration parameters:

Parameter: name, the connector name; it only needs to be unique.

Parameter: connector.class, the connector type; for files it should be FileStreamSource.

Parameter: tasks.max, the maximum number of tasks.

Parameter: topic, the Kafka topic name.

Parameter: file, the path of the source file.
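For reference, a minimal file source configuration using these parameters could look like the following; the connector name, topic name, and file path are placeholders:

name=local-file-source
connector.class=FileStreamSource
tasks.max=1
topic=connect-test
file=/tmp/source.txt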

C, Finding plugins on the official website:

Search for Confluent, open Confluent Hub under its products, search for the connector plugin you need, and then read the corresponding documentation.

(4) Sink configuration parameters:

A, JDBC sink configuration parameters:

Parameter: name, the connector name; it only needs to be unique.

Parameter: connector.class, the connector type; for JDBC it should be io.confluent.connect.jdbc.JdbcSinkConnector.

Parameter: topics, the Kafka topic name(s).

Parameter: tasks.max, the maximum number of tasks.

Parameter: connection.url, the JDBC connection URL.

Parameter: auto.create, whether to create the table automatically; if the table is auto-created, its name is the topic name.

Parameter: insert.mode, how records are written into the table, update-or-insert versus insert only; the possible values are upsert and insert.

Parameters: pk.mode=record_value and pk.fields; these two are used together to specify the primary key field, so that records exported from Kafka carry the given primary key, which makes incremental data synchronization easier.

Parameter: table.name.format, the name format of the destination table.
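A JDBC sink configuration combining these parameters might look like the following sketch; the connector name, connection URL, topic, key field, and table name are placeholders:

name=mysql-sink-connector
connector.class=io.confluent.connect.jdbc.JdbcSinkConnector
tasks.max=1
topics=mysql-user
connection.url=jdbc:mysql://localhost:3306/B?user=***&password=***
auto.create=true
insert.mode=upsert
pk.mode=record_value
pk.fields=id
table.name.format=user_copy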

B, File sink configuration parameters:

Parameter: name, the connector name; it only needs to be unique.

Parameter: connector.class, the connector type; for files it should be FileStreamSink.

Parameter: tasks.max, the maximum number of tasks.

Parameter: topics, the Kafka topic name.

Parameter: file, the path of the destination file.
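A minimal file sink configuration along these lines could look like this sketch; the connector name, topic, and file path are placeholders:

name=local-file-sink
connector.class=FileStreamSink
tasks.max=1
topics=connect-test
file=/tmp/sink.txt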

C, Finding plugins on the official website:

Search for Confluent, open Confluent Hub under its products, search for the connector plugin you need, and then read the corresponding documentation.

(5) Checking the connector.class:

Check whether the plugin.path configured in the connect-standalone.properties file contains the plugins corresponding to the connector.class values configured for the source and the sink. If not, download the corresponding plugin from the connector's online page, then extract it into the folder specified by plugin.path.

(6) Checking the JDBC driver and other connection drivers:

If the connection uses JDBC or a similar mechanism, check in advance whether the corresponding driver jar exists under libs in the Kafka installation path; if not, manually upload the corresponding driver jar.
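For example, for a MySQL connection the driver jar can be copied into libs roughly as follows; the jar file name and the Kafka installation path are placeholders that depend on your versions:

cp mysql-connector-java-8.0.x.jar /usr/local/kafka/libs/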

(7) Starting Connect in standalone mode:

A, Windows startup:

connect-standalone.bat ../config/connect-standalone.properties ../config/connect-source.properties ../config/connect-sink.properties

Description:

connect-standalone.properties: the configuration file for standalone Connect.

connect-source.properties: the custom data source configuration file; the file name can be changed.

connect-sink.properties: the custom data sink configuration file; the file name can be changed.

B, Linux startup:

./connect-standalone.sh ../config/connect-standalone.properties ../config/connect-source.properties ../config/connect-sink.properties

Description:

connect-standalone.properties: the configuration file for standalone Connect.

connect-source.properties: the custom data source configuration file; the file name can be changed.

connect-sink.properties: the custom data sink configuration file; the file name can be changed.

(8) Dynamically managing Connect:

In standalone mode, Connect can also be managed dynamically through the REST API. However, configuration changes made through the REST API are only kept temporarily: once the Connect process is shut down and restarted, the dynamically managed configuration no longer exists.
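Assuming the default REST port 8083, connectors can be listed and created roughly as in this sketch; the connector name and configuration in the JSON body are only illustrative:

curl http://localhost:8083/connectors

curl -X POST -H "Content-Type: application/json" \
     -d '{"name":"file-source","config":{"connector.class":"FileStreamSource","tasks.max":"1","topic":"connect-test","file":"/tmp/source.txt"}}' \
     http://localhost:8083/connectors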

(9) Testing Connect:

Add new data at the data source; the newly added data can then be seen in the corresponding Kafka topic and at the data sink. This shows that data is synchronized in real time through Kafka Connect.
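For example, with the file source sketched earlier, appending a line to the source file and then reading the topic with the console consumer can confirm the synchronization; the file path and topic name are placeholders:

echo "hello kafka connect" >> /tmp/source.txt
./kafka-console-consumer.sh --bootstrap-server localhost:9092 --topic connect-test --from-beginning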
