Chapter 1 Introduction
In the previous article, I covered preparing the Flink environment and setting up a simple Flink development environment. This article walks through a complete end-to-end Flink computing case: client => Web API service => Kafka => Flink => MySQL. As before, we use the Flink Table API/SQL and deploy with docker-compose. (Only the key parts of the code are shown in the article; for the complete code, please refer to the author's GitHub, to be uploaded later.)
Chapter 2 docker-compose
2.1 Add docker-compose.yml file
version: '2'
services:
  jobmanager:
    image: zihaodeng/flink:1.11.1
    volumes:
      - D:/21docker/flinkDeploy:/opt/flinkDeploy
    hostname: "jobmanager"
    expose:
      - "6123"
    ports:
      - "4000:4000"
    command: jobmanager
    environment:
      - JOB_MANAGER_RPC_ADDRESS=jobmanager
  taskmanager:
    image: zihaodeng/flink:1.11.1
    volumes:
      - D:/21docker/flinkDeploy:/opt/flinkDeploy
    expose:
      - "6121"
      - "6122"
    depends_on:
      - jobmanager
    command: taskmanager
    links:
      - jobmanager:jobmanager
    environment:
      - JOB_MANAGER_RPC_ADDRESS=jobmanager
  zookeeper:
    container_name: zookeeper
    image: zookeeper:3.6.1
    ports:
      - "2181:2181"
  kafka:
    container_name: kafka
    image: wurstmeister/kafka:2.12-2.5.0
    volumes:
      - D:/21docker/var/run/docker.sock:/var/run/docker.sock
    ports:
      - "9092:9092"
    depends_on:
      - zookeeper
    environment:
      #KAFKA_ADVERTISED_HOST_NAME: kafka
      HOSTNAME_COMMAND: "route -n | awk '/UG[ \t]/{print $$2}'"
      KAFKA_CREATE_TOPICS: "order-log:1:1"
      KAFKA_ZOOKEEPER_CONNECT: zookeeper:2181
      #KAFKA_ADVERTISED_LISTENERS: PLAINTEXT://127.0.0.1:9092
      #KAFKA_LISTENERS: PLAINTEXT://0.0.0.0:9092
  mysql:
    image: mysql:5.7
    container_name: mysql
    volumes:
      - D:/21docker/mysql/data/db:/var/lib/mysql/
      - D:/21docker/mysql/mysql-3346.sock:/var/run/mysql.sock
      - D:/21docker/mysql/data/conf:/etc/mysql/conf.d
    ports:
      - 3306:3306
    command:
      --default-authentication-plugin=mysql_native_password
      --lower_case_table_names=1
    environment:
      MYSQL_ROOT_PASSWORD: 123456
      TZ: Asia/Shanghai
2.2 docker-compose start
$ docker-compose up -d
Check that all containers are running with docker-compose ps.
With docker-compose in place, it will be very convenient to bring up the working environment later. Next, let's prepare the corresponding programs.
Chapter 3 Creating WebApi Project
3.1 Create WebApi (Restful API) interface project
Use Spring Boot to quickly build an API project; the author uses the RESTful API style here. Part of the code is shown below (see the author's GitHub for the complete code).
Create a POST interface for the client to call:
@RestController
@RequestMapping("/order")
public class OrderController {

    @Autowired
    private Sender sender;

    @PostMapping
    public String insertOrder(@RequestBody Order order) {
        sender.producerKafka(order);
        return "{\"code\":0,\"message\":\"insert success\"}";
    }
}
Create the Sender class and send data to Kafka
@Component
public class Sender {

    @Autowired
    private KafkaTemplate<String, String> kafkaTemplate;

    private static Random rand = new Random();

    public void producerKafka(Order order) {
        // Set the event time (payTime) with a small random offset
        order.setPayTime(String.valueOf(new Timestamp(System.currentTimeMillis() + rand.nextInt(100))));
        kafkaTemplate.send("order-log", JSON.toJSONString(order));
    }
}
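The Order payload class is not shown in the article; a minimal sketch consistent with the field names used in the Kafka source DDL (payTime, orderId, goodsId, userId, amount, address — types assumed from that DDL) might look like this:

```java
import java.math.BigDecimal;

// Hypothetical sketch of the Order payload class (public in the real project);
// field names follow the Kafka source DDL.
class Order {
    private String payTime;
    private long orderId;
    private int goodsId;
    private int userId;
    private BigDecimal amount;
    private String address;

    public String getPayTime() { return payTime; }
    public void setPayTime(String payTime) { this.payTime = payTime; }
    public long getOrderId() { return orderId; }
    public void setOrderId(long orderId) { this.orderId = orderId; }
    public int getGoodsId() { return goodsId; }
    public void setGoodsId(int goodsId) { this.goodsId = goodsId; }
    public int getUserId() { return userId; }
    public void setUserId(int userId) { this.userId = userId; }
    public BigDecimal getAmount() { return amount; }
    public void setAmount(BigDecimal amount) { this.amount = amount; }
    public String getAddress() { return address; }
    public void setAddress(String address) { this.address = address; }
}
```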
Chapter 4 Creating Flink Jobs
Here, the Flink Table API/SQL is used to implement a tumbling-window calculation: the data read from Kafka is aggregated per window and written into MySQL.
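Before looking at the Table API code, the tumbling-window idea can be illustrated in plain Java (a simplified, hypothetical simulation with invented sample events — in the real job Flink handles this via TUMBLE and watermarks): each event is assigned to a fixed 5-second window by flooring its timestamp, and amounts are summed per (window, goodsId).

```java
import java.util.Map;
import java.util.TreeMap;

class TumbleWindowSketch {
    // Floor a millisecond timestamp to the start of its tumbling window
    static long windowStart(long tsMillis, long sizeMillis) {
        return tsMillis - (tsMillis % sizeMillis);
    }

    public static void main(String[] args) {
        long size = 5_000L; // 5-second tumbling windows, as in the SQL below
        // (timestampMillis, goodsId, amount) sample events — hypothetical values
        long[][] events = {
            {1_000, 1, 10}, {3_500, 1, 5}, {4_999, 2, 7}, {6_200, 1, 8}
        };
        // Key "windowStart|goodsId" -> summed amount
        Map<String, Long> agg = new TreeMap<>();
        for (long[] e : events) {
            String key = windowStart(e[0], size) + "|" + e[1];
            agg.merge(key, e[2], Long::sum);
        }
        // Window [0,5000): goodsId 1 -> 15, goodsId 2 -> 7; window [5000,10000): goodsId 1 -> 8
        System.out.println(agg); // {0|1=15, 0|2=7, 5000|1=8}
    }
}
```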
public class Kafka2MysqlByEnd2End {
    public static void main(String[] args) throws Exception {
        // Kafka source
        String sourceSQL = "CREATE TABLE order_source (\n" +
                " payTime VARCHAR,\n" +
                " rt as TO_TIMESTAMP(payTime),\n" +
                " orderId BIGINT,\n" +
                " goodsId INT,\n" +
                " userId INT,\n" +
                " amount DECIMAL(23,10),\n" +
                " address VARCHAR,\n" +
                " WATERMARK FOR rt as rt - INTERVAL '2' SECOND\n" +
                " ) WITH (\n" +
                " 'connector' = 'kafka-0.11',\n" +
                " 'topic'='order-log',\n" +
                " 'properties.bootstrap.servers'='kafka:9092',\n" +
                " 'format' = 'json',\n" +
                " 'scan.startup.mode' = 'latest-offset'\n" +
                ")";

        // MySQL sink
        String sinkSQL = "CREATE TABLE order_sink (\n" +
                " goodsId BIGINT,\n" +
                " goodsName VARCHAR,\n" +
                " amount DECIMAL(23,10),\n" +
                " rowtime TIMESTAMP(3),\n" +
                " PRIMARY KEY (goodsId) NOT ENFORCED\n" +
                " ) WITH (\n" +
                " 'connector' = 'jdbc',\n" +
                " 'url' = 'jdbc:mysql://mysql:3306/flinkdb?characterEncoding=utf-8&useSSL=false',\n" +
                " 'table-name' = 'good_sale',\n" +
                " 'username' = 'root',\n" +
                " 'password' = '123456',\n" +
                " 'sink.buffer-flush.max-rows' = '1',\n" +
                " 'sink.buffer-flush.interval' = '1s'\n" +
                ")";

        // Create the execution environment
        EnvironmentSettings settings = EnvironmentSettings
                .newInstance()
                .useBlinkPlanner()
                .inStreamingMode()
                .build();
        //TableEnvironment tEnv = TableEnvironment.create(settings);
        StreamExecutionEnvironment sEnv = StreamExecutionEnvironment.getExecutionEnvironment();
        //sEnv.setRestartStrategy(RestartStrategies.fixedDelayRestart(3, Time.of(1, TimeUnit.SECONDS)));
        //sEnv.enableCheckpointing(1000);
        //sEnv.setStateBackend(new FsStateBackend("file:///tmp/chkdir", false));
        StreamTableEnvironment tEnv = StreamTableEnvironment.create(sEnv, settings);

        Configuration configuration = tEnv.getConfig().getConfiguration();
        // Set the parallelism to 1
        configuration.set(CoreOptions.DEFAULT_PARALLELISM, 1);

        // Register the source
        tEnv.executeSql(sourceSQL);
        // Register the sink
        tEnv.executeSql(sinkSQL);

        // Register the UDF used in the job
        tEnv.createFunction("exchangeGoods", ExchangeGoodsName.class);

        // Tumbling-window aggregation: turnover per goodsId per 5-second window
        String strSQL = " SELECT " +
                " goodsId," +
                " exchangeGoods(goodsId) as goodsName, " +
                " sum(amount) as amount, " +
                " TUMBLE_START(rt, INTERVAL '5' SECOND) as rowtime " +
                " FROM order_source " +
                " GROUP BY TUMBLE(rt, INTERVAL '5' SECOND), goodsId";

        // Query and insert into the sink; in Flink 1.11 executeInsert() already
        // submits the job, so no additional tEnv.execute() call is needed
        tEnv.sqlQuery(strSQL).executeInsert("order_sink");
    }
}
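The ExchangeGoodsName UDF registered above is not shown in the article; conceptually it maps a goodsId to a display name. A hedged sketch of the lookup logic is below — in the real job this would live in the eval() method of a class extending org.apache.flink.table.functions.ScalarFunction, and the mappings here are invented (only "cake" appears later in the article's results):

```java
import java.util.HashMap;
import java.util.Map;

// Sketch of the id -> name lookup assumed to sit inside ExchangeGoodsName.
// In the actual job this logic belongs in eval(Integer goodsId) of a
// ScalarFunction subclass; the id/name pairs below are hypothetical.
class GoodsNameLookup {
    private static final Map<Integer, String> NAMES = new HashMap<>();
    static {
        NAMES.put(1, "cake");   // "cake" appears in the article's result table
        NAMES.put(2, "bread");  // hypothetical
    }

    static String nameOf(int goodsId) {
        return NAMES.getOrDefault(goodsId, "unknown-" + goodsId);
    }
}
```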
Chapter 5 Creating MySQL Database Tables
5.1 Build a library
mysql> create database flinkdb;
5.2 Create a table
mysql> create table good_sale(
    goodsId bigint primary key,
    goodsName varchar(100) CHARACTER SET utf8 COLLATE utf8_general_ci,
    amount decimal(23,10),
    rowtime timestamp
);
Note: the primary key here matches the primary key defined in the Flink job's DDL. When a primary key is defined in the Flink DDL, the connector runs in upsert mode (based on the primary key, Flink decides whether to insert a new row or update an existing one, ensuring idempotence). Without a primary key, the connector runs in append mode (plain INSERTs; an insert fails if the primary key conflicts).
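The upsert behavior can be illustrated with a small sketch: with the table modeled as a map keyed by goodsId, a second write for the same key replaces the previous row instead of conflicting. This is a simplified model of the semantics, not the JDBC connector's actual code:

```java
import java.util.HashMap;
import java.util.Map;

class UpsertSketch {
    // Simplified model of the sink table, keyed by the primary key goodsId
    static final Map<Integer, String> table = new HashMap<>();

    // Upsert: insert a new row, or replace the existing row for the same key
    static void upsert(int goodsId, String row) {
        table.put(goodsId, row);
    }

    public static void main(String[] args) {
        upsert(1, "amount=10"); // first window result for goodsId 1
        upsert(1, "amount=15"); // same key again: row is replaced, not duplicated
        System.out.println(table);
        // In append mode there is no key to merge on: a second INSERT with the
        // same MySQL primary key would simply fail.
    }
}
```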
At this point, the environment and code have been prepared, and then start to run verification.
Chapter 6 Running Jobs
6.1 Local debugging verification
6.1.1 Start the cluster
Using the method of Chapter 2, start our prepared cluster environment:
$ docker-compose up -d
6.1.2 Start job
Start the WebApi project and the Flink job project in the local IDEA.
6.1.3 Initiate a request
Here an HTTP tool is used to simulate the client, initiating a request that sends data in JSON format:
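For reference, a request body consistent with the Kafka source DDL fields might look like the one built below (the values are invented for illustration, and payTime is filled in server-side by Sender); any HTTP tool can POST it to the /order endpoint with Content-Type: application/json:

```java
// Hypothetical request body for the /order endpoint; field names follow the
// Kafka source DDL (orderId, goodsId, userId, amount, address).
class OrderPayloadDemo {
    static String buildOrderJson() {
        return "{\"orderId\":1001,\"goodsId\":1,\"userId\":1,"
                + "\"amount\":9.90,\"address\":\"Shanghai\"}";
    }

    public static void main(String[] args) {
        // POST this body to the WebApi, e.g. http://localhost:8080/order (port assumed)
        System.out.println(buildOrderJson());
    }
}
```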
6.1.4 View the running status of flink jobs
View the log directly in the IDEA console
6.1.5 View results
View the results in MySQL
At this point you can see that the job debugged in the local IDE has successfully pulled data from the source and, after Flink's computation, written the results to MySQL. Next, we submit the job to the cluster.
6.2 Cluster operation verification
6.2.1 Packaging
Packaging the WebAPI project jar package
Go to the lotemall-webapi directory (the author's WebApi project) and run the packaging command
mvn clean package -DskipTests
Prepare the Dockerfile, place it in the same directory as the jar package, and run the following command to build the image
$ docker build -t lotemall-webapi .
Package the Flink job jar package
Go to the flink-kafka2mysql (author's Flink job project) directory to execute the packaging command
Note: because the job will run inside containers, the source and sink connection addresses must be changed from localhost to the container names (kafka, mysql).
mvn clean package -DskipTests
Put the packaged jar into docker's mounted directory (flinkDeploy)
6.2.2 Start the cluster
Using the method of Chapter 2, start our prepared cluster environment:
$ docker-compose up -d
6.2.3 Start WebAPI project
Run the docker run command to start the WebAPI project
$ docker run --link kafka:kafka --net flink-online_default -e TZ=Asia/Shanghai -d -p 8090:8080 lotemall-webapi
6.2.4 Run Flink job
Enter the flink jobmanager container and run the job
$ docker exec -it flink-online_jobmanager_1 /bin/bash
$ bin/flink run -c sql.Kafka2MysqlByEnd2End /opt/flinkDeploy/flink-kafka2mysql-0.0.1.jar -d
Or submit the job jar package through the web interface
After the job is submitted, open the Flink web UI; you can see that the submitted job has started running.
6.2.5 Initiate a request
Again, use the HTTP tool to simulate the client, initiating a request that sends data in JSON format:
6.2.6 View the running status of Flink jobs
When the job runs on the cluster, you can view its running status directly on the Flink web UI
View the generated Watermark
6.2.7 View results
View the results in MySQL
The record in the first row, whose goodsName is "cake", is the result computed by Flink on the cluster.
In summary, this article introduced convenient cluster deployment with docker-compose and completed a full Flink stream-computing case.