Chapter 1 Introduction
On this festive occasion, I would first like to wish all readers a happy National Day and Mid-Autumn Festival!
After the previous articles, I believe you already understand how to deploy and use Flink in a production environment. A question naturally follows: does every Flink developer have to deploy and install a cluster just to do development? Of course not; if every developer needed a cluster, the development cost would be far too high. Taking the Flink Table API/SQL as an example, this article uses Kafka as the data source and MySQL as the data sink, and shows how to set up and use a container-based Flink development environment. (Note: the author uses Windows as the example; the steps on macOS are similar.)
Chapter 2 Installing Components
2.1 Install Docker
I won't go into detail here; you can download and install it from the official website ( https://www.docker.com ).
2.2 Configure a registry mirror
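If pulling images from Docker Hub is slow in your region, you can configure a registry mirror. A minimal sketch of Docker's daemon.json (the mirror URL below is an assumption; substitute a mirror that is available to you), which on Windows can be set via Docker Desktop's Settings > Docker Engine:

```json
{
  "registry-mirrors": [
    "https://registry.docker-cn.com"
  ]
}
```

Restart Docker after changing this setting so it takes effect.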
2.3 Query docker version
C:\Users\zihao>docker --version
Docker version 19.03.12, build 48a66213fe
2.4 Install Kafka
Before installing, you can search for Kafka-related images:
docker search kafka
Pull the Kafka image:
C:\Users\zihao>docker pull wurstmeister/kafka:2.12-2.5.0
2.12-2.5.0: Pulling from wurstmeister/kafka
Image docker.io/wurstmeister/kafka:2.12-2.5.0 uses outdated schema1 manifest format. Please upgrade to a schema2 image for better future compatibility. More information at https://docs.docker.com/registry/spec/deprecated-schema-v1/
e7c96db7181b: Retrying in 1 second f910a506b6cb: Retrying in 1 second b6abafe80f63: Downloading 2e9c2caa5758: Waiting 1b29071c565f: Waiting c81626d038e3: Waiting 2.12-2.5.0: Pulling from wurstmeister/kafka
e7c96db7181b: Pull complete f910a506b6cb: Pull complete b6abafe80f63: Pull complete 2e9c2caa5758: Pull complete 1b29071c565f: Pull complete c81626d038e3: Pull complete Digest: sha256:71aa89afe97d3f699752b6d80ddc2024a057ae56407f6ab53a16e9e4bedec04c
Status: Downloaded newer image for wurstmeister/kafka:2.12-2.5.0
docker.io/wurstmeister/kafka:2.12-2.5.0
2.5 Install ZooKeeper
Pull the ZooKeeper image:
C:\Users\zihao>docker pull zookeeper:3.5.7
3.5.7: Pulling from library/zookeeper
afb6ec6fdc1c: Pulling fs layer ee19e84e8bd1: Pulling fs layer 6ac787417531: Pulling fs layer f3f781d4d83e: Waiting 424c9e43d19a: Waiting f0929561e8a7: Waiting f1cf0c087cb3: Waiting 2f47bb4dd07a: Waiting 3.5.7: Pulling from library/zookeeper
afb6ec6fdc1c: Pull complete ee19e84e8bd1: Pull complete 6ac787417531: Pull complete f3f781d4d83e: Pull complete 424c9e43d19a: Pull complete f0929561e8a7: Pull complete f1cf0c087cb3: Pull complete 2f47bb4dd07a: Pull complete Digest: sha256:883b014b6535574503bda8fc6a7430ba009c0273242f86d401095689652e5731
Status: Downloaded newer image for zookeeper:3.5.7
docker.io/library/zookeeper:3.5.7
2.6 Install MySQL
Pull the MySQL image:
C:\Users\zihao>docker pull mysql:5.7
5.7: Pulling from library/mysql
Image docker.io/library/mysql:5.7 uses outdated schema1 manifest format. Please upgrade to a schema2 image for better future compatibility. More information at https://docs.docker.com/registry/spec/deprecated-schema-v1/
d121f8d1c412: Downloading f3cebc0b4691: Downloading 1862755a0b37: Downloading 489b44f3dbb4: Downloading 690874f836db: Downloading baa8be383ffb: Downloading 55356608b4ac: Downloading 277d8f888368: Downloading 21f2da6feb67: Downloading 2c98f818bcb9: Downloading 031b0a770162: Downloading 5.7: Pulling from library/mysql
d121f8d1c412: Pull complete f3cebc0b4691: Pull complete 1862755a0b37: Pull complete 489b44f3dbb4: Pull complete 690874f836db: Pull complete baa8be383ffb: Pull complete 55356608b4ac: Pull complete 277d8f888368: Pull complete 21f2da6feb67: Pull complete 2c98f818bcb9: Pull complete 031b0a770162: Pull complete Digest: sha256:14fd47ec8724954b63d1a236d2299b8da25c9bbb8eacc739bb88038d82da4919
Status: Downloaded newer image for mysql:5.7
docker.io/library/mysql:5.7
2.7 View installed images
C:\Users\zihao>docker images
REPOSITORY TAG IMAGE ID CREATED SIZE
mysql 5.7 ef08065b0a30 3 days ago 448MB
wurstmeister/kafka 2.12-2.5.0 caa449bd6c28 3 weeks ago 431MB
zookeeper 3.5.7 6bd990489b09 4 months ago 245MB
Chapter 3 Starting the Container
3.1 Start ZooKeeper
C:\Users\zihao>docker run -d --name zookeeper -p 2181:2181 -t zookeeper:3.5.7
5a29cfcb77c1bbd33eedf41add46da89b91468d09702acde5caa3086b6cc467e
3.2 Start Kafka
C:\Users\zihao>docker run -d --name kafka --publish 9092:9092 --link zookeeper --env KAFKA_ZOOKEEPER_CONNECT=zookeeper:2181 --env KAFKA_ADVERTISED_HOST_NAME=127.0.0.1 --env KAFKA_ADVERTISED_PORT=9092 wurstmeister/kafka:2.12-2.5.0
e45499f971d1efd59227031f41631d1cf6e9e13b7950872efbf96dcbb5f998fb
3.3 Start MySQL
C:\Users\zihao>docker run -p 3306:3306 --name mysql -v D:/21docker/mysql/conf:/etc/mysql/conf.d -v D:/21docker/mysql/logs:/logs -v D:/21docker/mysql/data:/var/lib/mysql -e MYSQL_ROOT_PASSWORD=123456 -d mysql:5.7
2bea9be2ccdbea6f41c9a4dedc7d3bf8efa5538b39e19dea43eb66f4df100bb7
Note the host directories mounted with the -v parameter; adjust them to paths that exist on your machine.
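As an aside, the three docker run commands above can also be captured declaratively. A hedged docker-compose.yml sketch, equivalent to those commands under the same assumptions (host paths and the root password are the same placeholders as above):

```yaml
# Sketch equivalent to the docker run commands above; adjust host paths as needed.
version: "3"
services:
  zookeeper:
    image: zookeeper:3.5.7
    ports:
      - "2181:2181"
  kafka:
    image: wurstmeister/kafka:2.12-2.5.0
    depends_on:
      - zookeeper
    ports:
      - "9092:9092"
    environment:
      KAFKA_ZOOKEEPER_CONNECT: zookeeper:2181
      KAFKA_ADVERTISED_HOST_NAME: 127.0.0.1
      KAFKA_ADVERTISED_PORT: 9092
  mysql:
    image: mysql:5.7
    ports:
      - "3306:3306"
    environment:
      MYSQL_ROOT_PASSWORD: "123456"
    volumes:
      - D:/21docker/mysql/conf:/etc/mysql/conf.d
      - D:/21docker/mysql/logs:/logs
      - D:/21docker/mysql/data:/var/lib/mysql
```

With this file, `docker-compose up -d` starts all three containers at once.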
3.4 View the started container
C:\Users\zihao>docker ps
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
5f4d03dee765 mysql:5.7 "docker-entrypoint.s…" 4 seconds ago Up 3 seconds 0.0.0.0:3306->3306/tcp, 33060/tcp mysql
e45499f971d1 wurstmeister/kafka:2.12-2.5.0 "start-kafka.sh" 47 minutes ago Up 47 minutes 0.0.0.0:9092->9092/tcp kafka
5a29cfcb77c1 zookeeper:3.5.7 "/docker-entrypoint.…" 52 minutes ago Up 52 minutes 2888/tcp, 3888/tcp, 0.0.0.0:2181->2181/tcp, 8080/tcp zookeeper
3.5 Verify Kafka
Enter the Kafka container:
C:\Users\zihao>docker exec -it kafka /bin/bash
Change to the Kafka installation directory:
bash-4.4# cd opt/kafka
Create a new topic:
bash-4.4# bin/kafka-topics.sh --create --bootstrap-server localhost:9092 --topic test
Created topic test.
Start a console producer:
bash-4.4# bin/kafka-console-producer.sh --topic=test --broker-list localhost:9092
>
Start a console consumer:
bash-4.4# bin/kafka-console-consumer.sh --bootstrap-server localhost:9092 --from-beginning --topic test
Send some data to check that it is consumed normally.
The producer sends data:
bash-4.4# bin/kafka-console-producer.sh --topic=test --broker-list localhost:9092
>hello
>flink
>
The consumer receives the data:
bash-4.4# bin/kafka-console-consumer.sh --bootstrap-server localhost:9092 --from-beginning --topic test
hello
flink
Messages are produced and consumed normally, which confirms that Kafka is installed and working.
3.6 Verify MySQL
C:\Users\zihao>docker exec -it mysql bash
root@5f4d03dee765:/# mysql -h localhost -u root -p
Enter password:
Welcome to the MySQL monitor. Commands end with ; or \g.
Your MySQL connection id is 2
Server version: 5.7.31 MySQL Community Server (GPL)
Copyright (c) 2000, 2020, Oracle and/or its affiliates. All rights reserved.
Oracle is a registered trademark of Oracle Corporation and/or its
affiliates. Other names may be trademarks of their respective
owners.
Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.
mysql> show databases;
+--------------------+
| Database |
+--------------------+
| information_schema |
| mysql |
| performance_schema |
| sys |
+--------------------+
4 rows in set (0.02 sec)
We can see MySQL's databases, which confirms that MySQL is also installed and running normally.
Chapter 4 Running Flink
4.1 Create the Kafka topic
bash-4.4# bin/kafka-topics.sh --create --bootstrap-server localhost:9092 --topic country-log
Created topic country-log.
4.2 Create the MySQL table
Create the database:
mysql> create database flinkdb;
Create the table (note that it must be created inside flinkdb):
mysql> use flinkdb;
mysql> create table country_log(country_id bigint,country_msg varchar(100));
4.3 Start the Kafka producer
bash-4.4# bin/kafka-console-producer.sh --topic=country-log --broker-list localhost:9092
4.4 Write the job in IDEA
package sql;

import org.apache.flink.api.common.restartstrategy.RestartStrategies;
import org.apache.flink.api.common.time.Time;
import org.apache.flink.runtime.state.filesystem.FsStateBackend;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.table.api.EnvironmentSettings;
import org.apache.flink.table.api.Table;
import org.apache.flink.table.api.TableEnvironment;
import org.apache.flink.table.api.bridge.java.StreamTableEnvironment;

import java.util.concurrent.TimeUnit;

/**
 * Data produced to Kafka:
 *   {"country_id":"1","country_msg":"hello flink!"}
 * MySQL DDL:
 *   create table country_log(country_id bigint,country_msg varchar(100));
 */
public class Kafka2Mysql {
    public static void main(String[] args) throws Exception {
        // Kafka source (universal Kafka connector, matching the Kafka 2.5 broker installed above)
        String sourceSQL = "CREATE TABLE demo_source (country_id BIGINT,country_msg STRING)\n" +
                " WITH (\n" +
                "   'connector' = 'kafka',\n" +
                "   'topic' = 'country-log',\n" +
                "   'properties.bootstrap.servers' = 'localhost:9092',\n" +
                "   'format' = 'json',\n" +
                "   'scan.startup.mode' = 'latest-offset'\n" +
                ")";

        // MySQL sink
        String sinkSQL = "CREATE TABLE demo_sink (country_id BIGINT,country_msg STRING)\n" +
                " WITH (\n" +
                "   'connector' = 'jdbc',\n" +
                "   'url' = 'jdbc:mysql://localhost:3306/flinkdb?characterEncoding=utf-8&useSSL=false',\n" +
                "   'table-name' = 'country_log',\n" +
                "   'username' = 'root',\n" +
                "   'password' = '123456',\n" +
                "   'sink.buffer-flush.max-rows' = '1',\n" +
                "   'sink.buffer-flush.interval' = '1s'\n" +
                ")";

        // Create the execution environment
        EnvironmentSettings settings = EnvironmentSettings
                .newInstance()
                .useBlinkPlanner()
                .inStreamingMode()
                .build();
        //TableEnvironment tEnv = TableEnvironment.create(settings);
        StreamExecutionEnvironment sEnv = StreamExecutionEnvironment.getExecutionEnvironment();
        sEnv.setRestartStrategy(RestartStrategies.fixedDelayRestart(3, Time.of(1, TimeUnit.SECONDS)));
        //sEnv.enableCheckpointing(1000);
        //sEnv.setStateBackend(new FsStateBackend("file:///tmp/chkdir", false));
        StreamTableEnvironment tEnv = StreamTableEnvironment.create(sEnv, settings);

        // Register the source
        tEnv.executeSql(sourceSQL);
        // Register the sink
        tEnv.executeSql(sinkSQL);
        // Read from the source table
        Table sourceTable = tEnv.from("demo_source");
        // Write to the sink; executeInsert() submits the job itself,
        // so no extra execute() call is needed (it would fail with an empty topology)
        sourceTable.executeInsert("demo_sink");
    }
}
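To build this job in the IDE you also need the corresponding dependencies, which the article does not list. As a sketch, a Flink 1.11 job like this typically needs roughly the following Maven dependencies (the version numbers and the `_2.11` Scala suffix are assumptions; align them with your Flink installation):

```xml
<!-- Versions are assumptions; align with your Flink and Scala versions -->
<dependency>
    <groupId>org.apache.flink</groupId>
    <artifactId>flink-streaming-java_2.11</artifactId>
    <version>1.11.2</version>
</dependency>
<dependency>
    <groupId>org.apache.flink</groupId>
    <artifactId>flink-clients_2.11</artifactId>
    <version>1.11.2</version>
</dependency>
<dependency>
    <groupId>org.apache.flink</groupId>
    <artifactId>flink-table-api-java-bridge_2.11</artifactId>
    <version>1.11.2</version>
</dependency>
<dependency>
    <groupId>org.apache.flink</groupId>
    <artifactId>flink-table-planner-blink_2.11</artifactId>
    <version>1.11.2</version>
</dependency>
<dependency>
    <groupId>org.apache.flink</groupId>
    <artifactId>flink-connector-kafka_2.11</artifactId>
    <version>1.11.2</version>
</dependency>
<dependency>
    <groupId>org.apache.flink</groupId>
    <artifactId>flink-connector-jdbc_2.11</artifactId>
    <version>1.11.2</version>
</dependency>
<dependency>
    <groupId>org.apache.flink</groupId>
    <artifactId>flink-json</artifactId>
    <version>1.11.2</version>
</dependency>
<dependency>
    <groupId>mysql</groupId>
    <artifactId>mysql-connector-java</artifactId>
    <version>5.1.49</version>
</dependency>
```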
Run the above program directly in IDEA.
At this point Kafka, MySQL, and the Flink job are all running; next we produce some data and check the results.
4.5 Command line production data
Go to the Kafka command line and manually produce messages:
bash-4.4# bin/kafka-console-producer.sh --topic=country-log --broker-list localhost:9092
>{"country_id":"1","country_msg":"hello flink!"}
>{"country_id":"2","country_msg":"hello flink!"}
>
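One detail worth noting: country_id is sent as a JSON string ("1"), while demo_source declares it as BIGINT; Flink's json format handles the conversion. As an illustration only (not part of the pipeline, and in Python rather than the job's Java), a quick sanity check of the payload schema:

```python
import json

# The messages produced on the console above; country_id is a JSON string,
# but the demo_source DDL declares the column as BIGINT.
messages = [
    '{"country_id":"1","country_msg":"hello flink!"}',
    '{"country_id":"2","country_msg":"hello flink!"}',
]

for raw in messages:
    record = json.loads(raw)
    # Sanity-check the fields expected by the demo_source DDL.
    assert set(record) == {"country_id", "country_msg"}
    assert int(record["country_id"]) >= 1
    print(record["country_id"], record["country_msg"])
```

If a message does not parse as JSON matching the table schema, the Flink job will log a deserialization error instead of writing a row.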
4.6 View the data in Mysql
The data flows through Kafka and Flink into MySQL; check the data in MySQL:
mysql> select * from country_log;
+------------+--------------+
| country_id | country_msg |
+------------+--------------+
| 1 | hello flink! |
| 2 | hello flink! |
+------------+--------------+
2 rows in set (0.01 sec)
So far, we have successfully streamed data from Kafka through Flink into downstream MySQL, completing the setup and use of a local Flink development environment. This article only passes data through Flink, though; there is no complete end-to-end business logic. The next article will walk through a complete end-to-end case covering Flink computation. I look forward to your continued attention!