First, install the environment
Vertica officially provides a way to consume Kafka; make sure the Vertica and Kafka versions match. Under the hood, Kafka consumption is implemented as a Vertica UDx.
We first need to install the required environment:
/${vertica}/packages/kafka/ddl/install.sql
Check whether the installation succeeded:
/${vertica}/packages/kafka/ddl/isinstalled.sql
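The two scripts above are plain SQL files, so one way to run them is with vsql's -f option. The following is only a minimal sketch: it assumes Vertica is installed under /opt/vertica (override VERTICA_HOME otherwise), kafka_ddl_cmd is a hypothetical helper name, and it only prints the commands instead of executing them.

```shell
# Assumption: Vertica lives under /opt/vertica; set VERTICA_HOME to override.
VERTICA_HOME="${VERTICA_HOME:-/opt/vertica}"

# Print the vsql command that would run one of the Kafka DDL scripts.
# $1 is the script name: install.sql or isinstalled.sql.
kafka_ddl_cmd() {
    printf 'vsql -f %s/packages/kafka/ddl/%s\n' "$VERTICA_HOME" "$1"
}

kafka_ddl_cmd install.sql       # install the Kafka package
kafka_ddl_cmd isinstalled.sql   # check that the installation succeeded
```

Connection options such as the user name and password would still need to be added to the printed vsql commands before running them.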
Second, one-off consumption from Kafka
Refer to the official documentation, Using COPY with Kafka:
COPY schema.target_table
SOURCE KafkaSource(stream='topic1|1|1,topic2|2|2', brokers='host1:9092,host2:9092',
duration=INTERVAL 'timeslice', stop_on_eof=TRUE,
eof_timeout=INTERVAL 'timeslice')
PARSER KafkaJSONParser(flatten_arrays=False, flatten_maps=False)
REJECTED DATA AS TABLE schema.rejection_table TRICKLE;
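Each entry in the stream parameter above has the form topic|partition|offset, and entries are separated by commas. As a rough sketch of how such a value could be assembled for several partitions of one topic, the hypothetical helper build_stream below generates one entry per partition, numbered from 0, all starting at the same offset. The topic name and counts in the example call are illustrative only.

```shell
# Build a KafkaSource stream value "topic|0|offset,topic|1|offset,..."
# $1 = topic name, $2 = number of partitions, $3 = starting offset.
# The body runs in a subshell so its variables do not leak to the caller.
build_stream() (
    topic="$1"; count="$2"; offset="$3"
    stream=""
    i=0
    while [ "$i" -lt "$count" ]; do
        # Prepend a comma only when an earlier entry already exists.
        stream="${stream:+$stream,}${topic}|${i}|${offset}"
        i=$((i + 1))
    done
    printf '%s\n' "$stream"
)

build_stream web_pagelogs 3 0
# prints: web_pagelogs|0|0,web_pagelogs|1|0,web_pagelogs|2|0
```

The printed string can then be pasted into the stream='...' argument of the COPY statement.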
Third, real-time consumption from Kafka
Refer to the official documentation, Using Kafka with Vertica:
- First create a Scheduler
/opt/vertica/packages/kafka/bin/vkconfig scheduler --add --config-schema myScheduler --operator user1
Wrap the Vertica database login information in a shell variable:
kafka_config="--config-schema kafka01 --dbhost 172.17.12.1 --username dbadmin --password pass1"
- Create the scheduler with a script
/opt/vertica/packages/kafka/bin/vkconfig scheduler --add ${kafka_config} --config-schema kafka_config --operator dbadmin
- Create the Kafka cluster information
BROKERS="172.17.12.2:9099,172.17.12.3:9099,172.17.12.4:9099"
/opt/vertica/packages/kafka/bin/vkconfig kafka-cluster --add ${kafka_config} --config-schema kafka_config --cluster KafkaCluster --brokers ${BROKERS}
- Add the topic to read
/opt/vertica/packages/kafka/bin/vkconfig topic --add ${kafka_config} --target public.kafka_tgt --rejection-table public.kafka_rej --cluster KafkaCluster --topic web_pagelogs --number-partitions 1
- Launch the scheduler
/opt/vertica/packages/kafka/bin/vkconfig launch ${kafka_config} --config-schema kafka_config --instance-name webpagelogs
- Delete the scheduler
/opt/vertica/packages/kafka/bin/vkconfig scheduler ${kafka_config} --remove --config-schema kafka_config
- Remove the topic subscription
/opt/vertica/packages/kafka/bin/vkconfig topic ${kafka_config} --remove --target public.kafka_tgt
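The setup steps above (scheduler, cluster, topic, launch) can be collected into one script. The sketch below is only an illustration under several assumptions: the helper names vk and setup_scheduler are hypothetical, the options are copied from the commands in this section and may differ between Vertica versions, and only a single --config-schema value (kafka_config) is passed. It defaults to a dry run that prints the commands rather than executing them.

```shell
VKCONFIG=/opt/vertica/packages/kafka/bin/vkconfig
BROKERS="172.17.12.2:9099,172.17.12.3:9099,172.17.12.4:9099"
DRY_RUN="${DRY_RUN:-1}"   # 1 = only print the commands, 0 = execute them

# Run (or print) one vkconfig subcommand with the shared connection options.
# $1 is the subcommand; the remaining arguments are appended as-is.
vk() (
    subcommand="$1"; shift
    set -- "$VKCONFIG" "$subcommand" \
        --config-schema kafka_config --dbhost 172.17.12.1 \
        --username dbadmin --password pass1 "$@"
    if [ "$DRY_RUN" = "1" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
)

# Scheduler, cluster, topic, launch, in the order used in this section.
# The && chain stops at the first step that fails.
setup_scheduler() {
    vk scheduler --add --operator dbadmin &&
    vk kafka-cluster --add --cluster KafkaCluster --brokers "$BROKERS" &&
    vk topic --add --target public.kafka_tgt \
        --rejection-table public.kafka_rej --cluster KafkaCluster \
        --topic web_pagelogs --number-partitions 1 &&
    vk launch --instance-name webpagelogs
}

setup_scheduler
```

Keeping the connection options inside one function avoids the word-splitting problems that come from expanding an unquoted variable such as ${kafka_config} on every line.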
PS:
After using the latest Vertica to consume Kafka, I found this feature of rather limited value. Multiple topics can only be consumed inside a single scheduler, and every change or addition requires stopping consumption of all topics. I also observed data loss during use.