Hands-on: MySQL binlog synchronization based on Maxwell

Three formats of the MySQL binlog:

  • 1) STATEMENT mode: statement-based replication (SBR). Every SQL statement that modifies data is recorded in the binlog.
    • Advantages: the binlog contains fewer entries, which reduces disk I/O and improves performance.

    • Disadvantages: some statements cause master-slave inconsistency (for example the sleep() function, last_insert_id(), and user-defined functions (UDFs)).

  • 2) ROW mode: row-based replication (RBR). Records the final change made to each row of data.

    • Advantages: avoids the inconsistency problems of STATEMENT mode.

    • Disadvantages: generates a large volume of logs; an ALTER TABLE in particular can make the log size skyrocket.

  • 3) MIXED mode: mixed-based replication (MBR). A combination of the two modes above: ordinary statements are logged in STATEMENT format, and operations that cannot be replicated safely in STATEMENT format fall back to ROW format. MySQL chooses the logging format per executed SQL statement.

  • Note: STATEMENT format records only the SQL text, not the data, so the original row changes cannot be recovered from it; ROW mode is therefore generally recommended.
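In ROW mode, Maxwell turns each row change into a JSON event. A minimal sketch of parsing such an event (the field names follow Maxwell's documented output format; the sample values are made up):

```python
import json

# A hypothetical insert event, shaped like Maxwell's ROW-mode JSON output.
raw = '''{
  "database": "test",
  "table": "myuser",
  "type": "insert",
  "ts": 1614585600,
  "data": {"id": 1, "name": "zhangsan", "age": null}
}'''

event = json.loads(raw)
# The "data" field carries the full row image, which STATEMENT mode cannot provide.
print(event["type"], event["database"], event["table"], event["data"]["id"])
```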

Comparison between the various ways of parsing the binlog:

Enable the MySQL binlog and configure Maxwell:

  • 1) Add an ordinary MySQL user maxwell:
## enter the mysql client, then run the following commands to grant privileges
mysql -uroot  -p  
set global validate_password_policy=LOW;
set global validate_password_length=6;
CREATE USER 'maxwell'@'%' IDENTIFIED BY '123456';
GRANT ALL ON maxwell.* TO 'maxwell'@'%';
GRANT SELECT, REPLICATION CLIENT, REPLICATION SLAVE on *.* to 'maxwell'@'%'; 
flush privileges;
  • 2) Enable the MySQL binlog mechanism:
## shell: on one node, edit MySQL's configuration file my.cnf
sudo vim /etc/my.cnf

## set the binlog format to ROW
log-bin= /var/lib/mysql/mysql-bin
binlog-format=ROW
server_id=1


## shell: restart the mysql service
sudo service mysqld restart


## shell: verify
mysql -uroot -p
mysql> show variables like '%log_bin%';

  • 3) Configure maxwell:
## edit the configuration file
cd xx/maxwell-1.21.1 
cp config.properties.example config.properties
vim config.properties

## set the data sink to kafka
producer=kafka
## adjust to your actual kafka setup
kafka.bootstrap.servers=node01:9092,node02:9092,node03:9092
host=node03.liz.com

port=3306
user=maxwell
password=123456

## this topic must be created manually
kafka_topic=maxwell_kafka
  • 4) Start the service:
    • Start zookeeper and kafka in turn (installation not covered here)
    • Create the topic: kafka-topics.sh --create --topic maxwell_kafka --partitions 3 --replication-factor 2 --zookeeper node01:2181
    • Start maxwell: cd xx/maxwell-1.21.1 && bin/maxwell
  • 5) Insert data and test
    • Create a table in mysql:
      • CREATE DATABASE IF NOT EXISTS `test`  DEFAULT CHARACTER SET utf8;
        USE `test`;
        
        /*Table structure for table `myuser` */
        
        DROP TABLE IF EXISTS `myuser`;
        
        CREATE TABLE `myuser` (
          `id` int(12) NOT NULL,
          `name` varchar(32) DEFAULT NULL,
          `age` varchar(32) DEFAULT NULL,
          PRIMARY KEY (`id`)
        ) ENGINE=InnoDB DEFAULT CHARSET=utf8;
        
        /*Data for the table `myuser` */
        
        insert  into `myuser`(`id`,`name`,`age`) values (1,'zhangsan',NULL),(2,'xxx',NULL),(3,'ggg',NULL),(5,'xxxx',NULL),(8,'skldjlskdf',NULL),(10,'ggggg',NULL),(99,'ttttt',NULL),(114,NULL,NULL),(121,'xxx',NULL);
        
    • Start a kafka console consumer to read the data from kafka ( adjust --bootstrap-server to your own kafka broker addresses ): kafka-console-consumer.sh --topic maxwell_kafka --from-beginning --bootstrap-server node01:9092,node02:9092,node03:9092
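The same consumption can be done programmatically. A sketch using the third-party kafka-python package (assumed installed); the broker addresses and topic follow the configuration above, and the JSON-decoding helper is kept separate so it can be exercised without a running broker:

```python
import json

def decode_event(raw_bytes):
    """Decode one Maxwell message (UTF-8 JSON bytes) into a dict."""
    return json.loads(raw_bytes.decode("utf-8"))

def consume(brokers="node01:9092,node02:9092,node03:9092",
            topic="maxwell_kafka"):
    # Requires the third-party kafka-python package and a reachable cluster.
    from kafka import KafkaConsumer
    consumer = KafkaConsumer(topic,
                             bootstrap_servers=brokers.split(","),
                             auto_offset_reset="earliest")
    for msg in consumer:
        event = decode_event(msg.value)
        print(event["database"], event["table"], event["type"], event["data"])

# decode_event can be checked locally on a hand-made payload:
sample = b'{"database":"test","table":"myuser","type":"insert","data":{"id":1}}'
print(decode_event(sample)["table"])
```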
  • 6) Sample configuration of maxwell-kafka in the project (/xxx/maxwell-1.22.1/my_project.properties):
      ## my_project.properties
      
      log_level=INFO
      producer=kafka
      kafka.bootstrap.servers=node01:9092,node02:9092,node03:9092
      
      ## host of the mysql server (here the machine maxwell runs on)
      host=yourhost.com
      
      user=maxwell
      password=123456
      producer_ack_timeout = 600000
      port=3306
      
      ######### output format stuff ###############
      output_binlog_position=true
      output_server_id=true
      output_thread_id=true
      output_commit_info=true
      output_row_query=true
      output_ddl=false
      output_nulls=true
      output_xoffset=true
      output_schema_id=true
      ######### output format stuff ###############
      
      kafka_topic= yourtopic
      
      ## how the kafka record key is generated; supports array and hash
      kafka_key_format=hash
      kafka.compression.type=snappy
      kafka.retries=5
      kafka.acks=all
      
      ## kafka partitioning strategy: by table primary key
      producer_partition_by=primary_key
      kafka_partition_hash=murmur3
      ############ kafka stuff #############
      
      ############## misc stuff ###########
      ## whether a bootstrap blocks normal binlog parsing; async (asynchronous) does not block
      bootstrapper=async
      ############## misc stuff ##########
      
      ## filter configuration: database.table; the settings below collect only three tables from db1
      ## regex filters are supported: include:db1:/table\\d{1}/
      ############## filter ###############
      filter=exclude:*.*, include: db1.table1,include: db1.table2,include: db1.table3
      ############## filter ###############
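The regex form of the filter can be checked with an ordinary regular expression. A small sketch (the table names are made up, and full-string matching is used here for illustration; Maxwell's own matching semantics may differ, e.g. with search semantics table10 would also match since table\d{1} only requires one digit somewhere):

```python
import re

# The pattern from the example filter include:db1:/table\d{1}/
pattern = re.compile(r"table\d{1}")

tables = ["table1", "table2", "table10", "users", "table_x"]
matched = [t for t in tables if pattern.fullmatch(t)]
print(matched)  # → ['table1', 'table2']
```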
      
      
  • 7) Maxwell-kafka startup script:
    • #!/bin/bash
      case $1 in
      "start"){
        ## redirect stdout and stderr to the log file, then background the process
        nohup /xxx/maxwell-1.22.1/bin/maxwell --daemon --config /xxx/maxwell-1.22.1/my_project.properties >> /xxx/maxwell-1.22.1/maxwell.log 2>&1 &
      };;
      "stop"){
        ## the java process command line contains the Maxwell class name
        ps -ef | grep Maxwell | grep -v grep | awk '{print $2}' | xargs kill
      };;
      esac
      

       

Origin blog.csdn.net/weixin_41346635/article/details/114253906