Summary
This article introduces how Haicheng Bangda, a supply chain logistics service provider, used Paimon to build a streaming data warehouse as part of its digital transformation. It also provides an easy-to-follow production operation guide for k8s environments, designed to help readers quickly get started with Paimon.
Company business introduction
Pain points and selection of big data technologies
Production Practice
Troubleshooting Analysis
Future plans
01
Company business introduction
Haicheng Bangda Group has been focusing on the field of supply chain logistics, providing customers with end-to-end one-stop intelligent supply chain logistics services by creating an excellent international logistics platform. The group currently has more than 2,000 employees, an annual turnover of more than 12 billion yuan, a network covering more than 200 ports around the world, and more than 80 branches and subsidiaries at home and abroad, helping Chinese companies connect with the world.
Business background:
As the company's scale grows and its business becomes more complex, the operations and process management department needs to monitor business operations in real time to ensure the stability and efficiency of business processes, enabling better resource optimization and process improvement.
The company's operations and process management department supervises the execution of the company's business processes, including order volumes by region and business unit for sea, air, and rail transport; order volumes of major customers; airline booking volumes; entrusted volumes at customs, warehousing, and land transportation operation sites; and the actual daily income and expenditure of each region and business unit. By monitoring and analyzing these processes, the company can identify potential problems and bottlenecks and propose improvements to optimize operational efficiency.
Data warehouse batch processing architecture:
Real-time data warehouse architecture:
The current system needs to collect real-time data directly from the production systems, but multiple data sources must be joined in queries, and FineReport is not well suited to handling multiple data sources: it cannot re-aggregate across them. Querying the production systems on a schedule also puts pressure on the production databases and affects their stable operation. We therefore need a data warehouse that supports streaming via Flink CDC to solve the real-time processing problem. This warehouse must collect real-time data from multiple sources, support complex join queries, machine learning, and other workloads on top of that data, and avoid ad-hoc queries against the production systems, thereby reducing pressure on them and ensuring their reliable, stable operation.
02
Pain points and selection of big data technologies
Since its establishment, Haicheng Bangda's big data team has favored efficient O&M tools and platforms in order to allocate personnel efficiently and reduce repetitive manual work.
With offline batch processing already supporting the group's basic cockpit and management reports, the operations and management department requested real-time statistics on order and operation volumes, and the finance department requested real-time display of cash flow. Against this background, a stream-batch unified big data solution became imperative.
Although the big data department has used Apache Doris to implement a lakehouse with integrated storage and compute, and previously published an article on lakehouse construction in the Doris community, some problems remain: streaming data storage cannot be reused, intermediate-layer data cannot be queried, and real-time aggregation cannot be performed.
Sorted by architecture evolution time, the common architecture solutions in recent years are as follows:
hadoop architecture:
This marks the dividing point between traditional data warehouses and Internet data warehouses. In the early days of the Internet, requirements for data analysis were low: reports with low timeliness were produced mainly to support decision-making, and offline data analysis solutions emerged accordingly.
Advantages: rich data type support; supports massive computation; low machine configuration requirements; good fault tolerance; well suited to low-timeliness workloads
Disadvantages: no real-time support; complex operation and maintenance; the query optimizer is weaker than MPP engines, so responses are slow
Selection basis: no real-time support; complex O&M, which conflicts with the principle of lean staffing; poor performance
lambda architecture:
The Lambda architecture is a real-time big data processing framework proposed by Nathan Marz, the author of Storm, which he developed during his time at Twitter; it distills his years of experience with distributed big data systems.
Data flow processing is divided into three layers: the Batch layer, the Speed layer, and the Serving layer:
The Batch layer mainly processes offline data, and finally provides view services to the business;
In the Speed layer, it mainly processes real-time incremental data, and finally provides view services to the business;
In the Serving layer, it mainly responds to user requests, realizes the aggregation calculation of offline and incremental data, and finally provides services;
The advantages: offline and real-time computation are separated into two frameworks, and the structure is stable
The disadvantages: it is difficult to keep offline and real-time data consistent; operators must maintain two frameworks across a three-tier architecture; developers must write three sets of code
Selection basis: data consistency is uncontrollable; O&M and development workloads are heavy, which conflicts with the principle of lean staffing
kappa architecture:
The kappa architecture handles both offline and real-time data with a single stream processing architecture, solving all problems with real-time streams and aiming to provide fast, reliable query results. With a single technology stack, it suits a variety of data processing workloads, including continuous data pipelines, real-time data processing, machine learning models and real-time analytics, IoT systems, and many other use cases.
It is usually implemented with a stream processing engine such as Apache Flink, Apache Storm, Amazon Kinesis, or Apache Kafka, designed to process large data streams and provide fast, reliable query access.
The advantage is: single data stream processing framework
The disadvantages: although its architecture is simpler than the Lambda architecture's, setting up and maintaining the streaming framework is relatively complicated, it lacks true offline data processing capability, and storing big data in the streaming platform is costly
Selection basis: Offline data processing capabilities need to be retained to control costs
Iceberg
For this reason, we also investigated Iceberg. Its snapshot feature achieves stream-batch unification to a certain extent, but in a Kafka-based real-time architecture its intermediate tables cannot be queried and existing tables cannot be reused, and it depends heavily on Kafka: intermediate results must be written to Iceberg tables through Kafka, which increases the system's complexity and maintenance burden.
Selection basis: no Kafka-based real-time architecture has been implemented, and intermediate data cannot be queried or reused
Streaming data warehouse (a continuation of the kappa architecture)
The Haicheng Bangda big data team has participated in building streaming data warehouses since FTS (Flink Table Store) version 0.3.0, aiming to further reduce the complexity of the data processing framework and streamline staffing. In the early stage, the goal was simply to follow the trend, keep learning, and move toward cutting-edge technology; the team agreed to step on whatever pitfalls appeared and cross the river by feeling for the stones. Fortunately, after several version iterations and with the community's efficient cooperation, the initial problems have gradually been resolved.
The streaming data warehouse architecture is as follows:
Continuing the characteristics of the kappa architecture, it retains the advantages of a single stream processing architecture. With Paimon as the underlying technology, data can be queried along the entire pipeline and the layered data warehouse architecture can be reused, while both offline and real-time processing capabilities are supported, reducing wasted storage and computation.
03
Production Practice
This solution runs Flink in Application mode on a Kubernetes cluster. Flink CDC ingests relational database data from the business systems in real time; Flink + Paimon streaming data warehouse tasks are submitted through the StreamPark task platform; and the Trino engine serves FineReport and developer queries. Paimon's underlying storage supports the S3 protocol; because the company's big data services run on Alibaba Cloud, object storage OSS is used as the data file system.
This forms a full-link, real-time, queryable, layered, and reusable pipeline.
Architecture diagram:
The main component versions are as follows:
flink-1.16.0-scala-2.12
paimon-flink-1.16-0.4-20230424.001927-40.jar
apache-streampark_2.12-2.0.0
kubernetes v1.18.3
Environment construction
Download flink-1.16.0-scala-2.12.tar.gz: you can download the corresponding version of the installation package from the Flink official website to the StreamPark server.
# extract the archive
tar zxvf flink-1.16.0-scala-2.12.tar.gz
# edit the flink-conf configuration file and start the cluster
vim flink-1.16.0-scala-2.12/conf/flink-conf.yaml   # modify as follows
jobmanager.rpc.address: localhost
jobmanager.rpc.port: 6123
jobmanager.bind-host: localhost
jobmanager.memory.process.size: 4096m
taskmanager.bind-host: localhost
taskmanager.host: localhost
taskmanager.memory.process.size: 4096m
taskmanager.numberOfTaskSlots: 4
parallelism.default: 4
akka.ask.timeout: 100s
web.timeout: 1000000
#checkpoints&&savepoints
state.checkpoints.dir: file:///opt/flink/checkpoints
state.savepoints.dir: file:///opt/flink/savepoints
execution.checkpointing.interval: 2min
# when a job is manually canceled/suspended, retain its checkpoint state
execution.checkpointing.externalized-checkpoint-retention: RETAIN_ON_CANCELLATION
state.backend: rocksdb
# number of completed checkpoints to retain
state.checkpoints.num-retained: 2000
state.backend.incremental: true
execution.checkpointing.checkpoints-after-tasks-finish.enabled: true
#OSS
fs.oss.endpoint: oss-cn-zhangjiakou-internal.aliyuncs.com
fs.oss.accessKeyId: xxxxxxxxxxxxxxxxxxxxxxx
fs.oss.accessKeySecret: xxxxxxxxxxxxxxxxxxxxxxx
fs.oss.impl: org.apache.hadoop.fs.aliyun.oss.AliyunOSSFileSystem
jobmanager.execution.failover-strategy: region
rest.port: 8081
rest.address: localhost
It is recommended to also configure FLINK_HOME locally, to facilitate local troubleshooting before moving to k8s:
vim /etc/profile
#FLINK
export FLINK_HOME=/data/src/flink-1.16.0-scala-2.12
export PATH=$PATH:$FLINK_HOME/bin
source /etc/profile
Add flink conf to streampark
Build the flink 1.16.0 base image, pulling the corresponding version of the image from Docker Hub:
# pull the image
docker pull flink:1.16.0-scala_2.12-java8
# tag the image
docker tag flink:1.16.0-scala_2.12-java8 registry-vpc.cn-zhangjiakou.aliyuncs.com/xxxxx/flink:1.16.0-scala_2.12-java8
# push to the company registry
docker push registry-vpc.cn-zhangjiakou.aliyuncs.com/xxxxx/flink:1.16.0-scala_2.12-java8
Create a Dockerfile and a target directory, and place the flink-oss-fs-hadoop JAR in the target directory. The shaded Hadoop OSS file system JAR can be downloaded from: https://repository.apache.org/snapshots/org/apache/paimon/paimon-oss/
.
├── Dockerfile
└── target
└── flink-oss-fs-hadoop-1.16.0.jar
touch Dockerfile
mkdir target
#vim Dockerfile
FROM registry-vpc.cn-zhangjiakou.aliyuncs.com/xxxxx/flink:1.16.0-scala_2.12-java8
RUN mkdir /opt/flink/plugins/oss-fs-hadoop
COPY target/flink-oss-fs-hadoop-1.16.0.jar /opt/flink/plugins/oss-fs-hadoop
#build base image
docker build -t flink-table-store:v1.16.0 .
docker tag flink-table-store:v1.16.0 registry-vpc.cn-zhangjiakou.aliyuncs.com/xxxxx/flink-table-store:v1.16.0
docker push registry-vpc.cn-zhangjiakou.aliyuncs.com/xxxxx/flink-table-store:v1.16.0
Prepare the paimon jar package:
You can download the corresponding version from the Apache Repository. Note that it must match the Flink major version.
Use the streampark platform to submit paimon tasks
Prerequisites:
Kubernetes client connection configuration
Kubernetes RBAC configuration
Container image registry configuration (this case uses the free personal edition of Alibaba Cloud Container Registry)
Create a pvc resource to mount checkpoint/savepoint
Kubernetes client connection configuration:
Copy the ~/.kube/config from the k8s master node directly to the corresponding directory on the StreamPark server, then run the following command there; if the k8s cluster information is displayed, authorization and network connectivity are verified.
kubectl cluster-info
Kubernetes RBAC configuration
Create streamx namespace
kubectl create ns streamx
Create a clusterrolebinding resource using the default account
kubectl create clusterrolebinding flink-role-binding-default --clusterrole=edit --serviceaccount=streamx:default
Container image registry configuration
In this case, Alibaba Cloud Container Registry (ACR) is used; a self-built registry such as Harbor can be used instead.
Create a namespace streampark (the security setting must be set to private)
Configure the image registry in StreamPark; images built for tasks will be pushed to this registry
Create a k8s secret for pulling images from ACR; streamparksecret is the custom secret name
kubectl create secret docker-registry streamparksecret --docker-server=registry-vpc.cn-zhangjiakou.aliyuncs.com --docker-username=xxxxxx --docker-password=xxxxxx -n streamx
Create a pvc resource to mount checkpoint/savepoint
K8s persistence is based on Alibaba Cloud's object storage OSS.
OSS CSI plugin:
An OSS CSI plugin can be used to simplify storage management. PVs can be created through CSI configuration, while PVCs and pods are defined as usual. Reference yaml files: https://bondextest.oss-cn-zhangjiakou.aliyuncs.com/ossyaml.zip
Configuration requirements:
- Create a service account with the required RBAC permissions
Reference: https://github.com/kubernetes-sigs/alibaba-cloud-csi-driver/blob/master/docs/oss.md
kubectl apply -f rbac.yaml
- Deploy the OSS CSI plugin
kubectl apply -f oss-plugin.yaml
- Create the checkpoint & savepoint PVs
kubectl apply -f checkpoints_pv.yaml
kubectl apply -f savepoints_pv.yaml
- Create the checkpoint & savepoint PVCs
kubectl apply -f checkpoints_pvc.yaml
kubectl apply -f savepoints_pvc.yaml
After configuring the dependent environment, we can start using Paimon to develop the layered streaming data warehouse.
Case:
Real-time order statistics for sea and air freight
Task submission:
Initialize paimon catalog configuration
SET 'execution.runtime-mode' = 'streaming';
set 'table.exec.sink.upsert-materialize' = 'none';
SET 'sql-client.execution.result-mode' = 'tableau';
-- create and use the FTS catalog; the underlying storage uses Alibaba Cloud OSS
CREATE CATALOG `table_store` WITH (
'type' = 'paimon',
'warehouse' = 'oss://xxxxx/xxxxx' -- custom OSS storage path
);
USE CATALOG `table_store`;
A single task extracts table data from three databases (PostgreSQL, MySQL, and SQL Server) and writes it to Paimon.
Development Mode:Flink SQL
Execution Mode :kubernetes application
Flink Version :flink-1.16.0-scala-2.12
Kubernetes Namespace :streamx
Kubernetes ClusterId: (you can customize the task name)
Flink Base Docker Image: registry-vpc.cn-zhangjiakou.aliyuncs.com/xxxxx/flink-table-store:v1.16.0 # the base image pushed to the Alibaba Cloud registry
Rest-Service Exposed Type:NodePort
paimon basic dependency package:
paimon-flink-1.16-0.4-20230424.001927-40.jar
flink-shaded-hadoop-2-uber-2.8.3-10.0.jar
Flink CDC dependency package download address:
https://github.com/ververica/flink-cdc-connectors/releases/tag/release-2.2.0
pod template
apiVersion: v1
kind: Pod
metadata:
name: pod-template
spec:
containers:
- name: flink-main-container
volumeMounts:
- name: flink-checkpoints-csi-pvc
mountPath: /opt/flink/checkpoints
- name: flink-savepoints-csi-pvc
mountPath: /opt/flink/savepoints
volumes:
- name: flink-checkpoints-csi-pvc
persistentVolumeClaim:
claimName: flink-checkpoints-csi-pvc
- name: flink-savepoints-csi-pvc
persistentVolumeClaim:
claimName: flink-savepoints-csi-pvc
imagePullSecrets:
- name: streamparksecret
Flink SQL:
1. Build the mapping between the source tables and the Paimon ods tables; here the source and target tables map one-to-one.
-- PostgreSQL example
CREATE TEMPORARY TABLE `shy_doc_hdworkdochd` (
`doccode` varchar(50) not null COMMENT 'primary key',
`businessmodel` varchar(450) COMMENT 'business model',
`businesstype` varchar(450) COMMENT 'business nature',
`transporttype` varchar(50) COMMENT 'transport type',
......
`bookingguid` varchar(50) COMMENT 'operation number',
PRIMARY KEY (`doccode`) NOT ENFORCED
) WITH (
'connector' = 'postgres-cdc',
'hostname' = '<database server IP>',
'port' = '<port>',
'username' = '<username>',
'password' = '<password>',
'database-name' = '<database name>',
'schema-name' = 'dev',
'decoding.plugin.name' = 'wal2json',
'table-name' = 'doc_hdworkdochd',
'debezium.slot.name' = 'hdworkdochdslotname03'
);
CREATE TEMPORARY TABLE `shy_base_enterprise` (
`entguid` varchar(50) not null COMMENT 'primary key',
`entorgcode` varchar(450) COMMENT 'customer number',
`entnature` varchar(450) COMMENT 'customer type',
`entfullname` varchar(50) COMMENT 'customer name',
PRIMARY KEY (`entguid`,`entorgcode`) NOT ENFORCED
) WITH (
'connector' = 'postgres-cdc',
'hostname' = '<database server IP>',
'port' = '<port>',
'username' = '<username>',
'password' = '<password>',
'database-name' = '<database name>',
'schema-name' = 'dev',
'decoding.plugin.name' = 'wal2json',
'table-name' = 'base_enterprise',
'debezium.snapshot.mode'='never', -- incremental sync only (omit this property for full + incremental)
'debezium.slot.name' = 'base_enterprise_slotname03'
);
-- create corresponding target tables in the Paimon ods layer based on the source table structure
CREATE TABLE IF NOT EXISTS ods.`ods_shy_jh_doc_hdworkdochd` (
`o_year` BIGINT NOT NULL COMMENT 'partition field',
`create_date` timestamp NOT NULL COMMENT 'creation time',
PRIMARY KEY (`o_year`, `doccode`) NOT ENFORCED
) PARTITIONED BY (`o_year`)
WITH (
'changelog-producer.compaction-interval' = '2m'
) LIKE `shy_doc_hdworkdochd` (EXCLUDING CONSTRAINTS EXCLUDING OPTIONS);
CREATE TABLE IF NOT EXISTS ods.`ods_shy_base_enterprise` (
`create_date` timestamp NOT NULL COMMENT 'creation time',
PRIMARY KEY (`entguid`,`entorgcode`) NOT ENFORCED
)
WITH (
'changelog-producer.compaction-interval' = '2m'
) LIKE `shy_base_enterprise` (EXCLUDING CONSTRAINTS EXCLUDING OPTIONS);
-- set the job name and run the job to write source table data into the corresponding Paimon tables in real time
SET 'pipeline.name' = 'ods_doc_hdworkdochd';
INSERT INTO
ods.`ods_shy_jh_doc_hdworkdochd`
SELECT
*
,YEAR(`docdate`) AS `o_year`
,TO_TIMESTAMP(CONVERT_TZ(cast(CURRENT_TIMESTAMP as varchar), 'UTC', 'Asia/Shanghai')) AS `create_date`
FROM
`shy_doc_hdworkdochd` where `docdate` is not null and `docdate` > '2023-01-01';
SET 'pipeline.name' = 'ods_shy_base_enterprise';
INSERT INTO
ods.`ods_shy_base_enterprise`
SELECT
*
,TO_TIMESTAMP(CONVERT_TZ(cast(CURRENT_TIMESTAMP as varchar), 'UTC', 'Asia/Shanghai')) AS `create_date`
FROM
`shy_base_enterprise` where entorgcode is not null and entorgcode <> '';
-- MySQL example
CREATE TEMPORARY TABLE `doc_order` (
`id` BIGINT NOT NULL COMMENT 'primary key',
`order_no` varchar(50) NOT NULL COMMENT 'order number',
`business_no` varchar(50) COMMENT 'OMS service number',
......
`is_deleted` int COMMENT 'whether voided',
PRIMARY KEY (`id`) NOT ENFORCED
) WITH (
'connector' = 'mysql-cdc',
'hostname' = '<database server address>',
'port' = '<port>',
'username' = '<username>',
'password' = '<password>',
'database-name' = '<database name>',
'table-name' = 'doc_order'
);
-- create the corresponding target table in the Paimon ods layer based on the source table structure
CREATE TABLE IF NOT EXISTS ods.`ods_bondexsea_doc_order` (
`o_year` BIGINT NOT NULL COMMENT 'partition field',
`create_date` timestamp NOT NULL COMMENT 'creation time',
PRIMARY KEY (`o_year`, `id`) NOT ENFORCED
) PARTITIONED BY (`o_year`)
WITH (
'changelog-producer.compaction-interval' = '2m'
) LIKE `doc_order` (EXCLUDING CONSTRAINTS EXCLUDING OPTIONS);
-- set the job name and run the job to write source table data into the corresponding Paimon table in real time
SET 'pipeline.name' = 'ods_bondexsea_doc_order';
INSERT INTO
ods.`ods_bondexsea_doc_order`
SELECT
*
,YEAR(`gmt_create`) AS `o_year`
,TO_TIMESTAMP(CONVERT_TZ(cast(CURRENT_TIMESTAMP as varchar), 'UTC', 'Asia/Shanghai')) AS `create_date`
FROM `doc_order` where gmt_create > '2023-01-01';
-- SQL Server example
CREATE TEMPORARY TABLE `OrderHAWB` (
`HBLIndex` varchar(50) NOT NULL COMMENT 'primary key',
`CustomerNo` varchar(50) COMMENT 'customer number',
......
`CreateOPDate` timestamp COMMENT 'document creation date',
PRIMARY KEY (`HBLIndex`) NOT ENFORCED
) WITH (
'connector' = 'sqlserver-cdc',
'hostname' = '<database server address>',
'port' = '<port>',
'username' = '<username>',
'password' = '<password>',
'database-name' = '<database name>',
'schema-name' = 'dbo',
-- 'debezium.snapshot.mode' = 'initial' -- extract both full and incremental data
'scan.startup.mode' = 'latest-offset', -- extract incremental data only
'table-name' = 'OrderHAWB'
);
-- create the corresponding target table in the Paimon ods layer based on the source table structure
CREATE TABLE IF NOT EXISTS ods.`ods_airsea_airfreight_orderhawb` (
`o_year` BIGINT NOT NULL COMMENT 'partition field',
`create_date` timestamp NOT NULL COMMENT 'creation time',
PRIMARY KEY (`o_year`, `HBLIndex`) NOT ENFORCED
) PARTITIONED BY (`o_year`)
WITH (
'changelog-producer.compaction-interval' = '2m'
) LIKE `OrderHAWB` (EXCLUDING CONSTRAINTS EXCLUDING OPTIONS);
-- set the job name and run the job to write source table data into the corresponding Paimon table in real time
SET 'pipeline.name' = 'ods_airsea_airfreight_orderhawb';
INSERT INTO
ods.`ods_airsea_airfreight_orderhawb`
SELECT
RTRIM(`HBLIndex`) as `HBLIndex`
......
,`CreateOPDate`
,YEAR(`CreateOPDate`) AS `o_year`
,TO_TIMESTAMP(CONVERT_TZ(cast(CURRENT_TIMESTAMP as varchar), 'UTC', 'Asia/Shanghai')) AS `create_date`
FROM `OrderHAWB` where CreateOPDate > '2023-01-01';
The effect of writing business table data into the Paimon ods tables in real time is as follows:
2. Write the ods layer tables into the dwd layer; in effect this merges the related ods business tables into dwd. The main processing concerns the count_order field: the source tables contain both logical and physical deletions, which would make the count function produce wrong results, so we compute order counts with sum aggregation instead. Each reference_no carries a count_order of 1; if the record is logically voided, SQL sets it to 0, and Paimon handles physical deletions automatically.
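As a sketch of this idea, using the ods_bondexsea_doc_order table from above (whose is_deleted flag marks logical voiding; the real job selects the full column list):

```sql
-- derive count_order so that sum(count_order) yields the live order count:
-- a logically voided order contributes 0, a live order contributes 1;
-- physically deleted rows are retracted by Paimon's sum aggregation automatically
SELECT
order_no AS reference_no,
CASE WHEN is_deleted = 1 THEN 0 ELSE 1 END AS count_order
FROM ods.`ods_bondexsea_doc_order`;
```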
We reuse the dimension tables already processed in Doris as the dim layer. The dimension tables are updated infrequently, so no secondary development was done for them in Paimon.
-- create the wide table in the Paimon dwd layer
CREATE TABLE IF NOT EXISTS dwd.`dwd_business_order` (
`reference_no` varchar(50) NOT NULL COMMENT 'entrustment order number, primary key',
`bondex_shy_flag` varchar(8) NOT NULL COMMENT 'distinguishing flag',
`is_server_item` int NOT NULL COMMENT 'whether an order has been linked',
`order_type_name` varchar(50) NOT NULL COMMENT 'business category',
`consignor_date` DATE COMMENT 'statistics date',
`consignor_code` varchar(50) COMMENT 'customer number',
`consignor_name` varchar(160) COMMENT 'customer name',
`sales_code` varchar(32) NOT NULL COMMENT 'sales number',
`sales_name` varchar(200) NOT NULL COMMENT 'sales name',
`delivery_center_op_id` varchar(32) NOT NULL COMMENT 'delivery number',
`delivery_center_op_name` varchar(200) NOT NULL COMMENT 'delivery name',
`pol_code` varchar(100) NOT NULL COMMENT 'port of loading code',
`pot_code` varchar(100) NOT NULL COMMENT 'transshipment port code',
`port_of_dest_code` varchar(100) NOT NULL COMMENT 'destination port code',
`is_delete` int not NULL COMMENT 'whether voided',
`order_status` varchar(8) NOT NULL COMMENT 'order status',
`count_order` BIGINT not NULL COMMENT 'order count',
`o_year` BIGINT NOT NULL COMMENT 'partition field',
`create_date` timestamp NOT NULL COMMENT 'creation time',
PRIMARY KEY (`o_year`,`reference_no`,`bondex_shy_flag`) NOT ENFORCED
) PARTITIONED BY (`o_year`)
WITH (
-- 2 buckets per partition
'bucket' = '2',
'changelog-producer' = 'full-compaction',
'snapshot.time-retained' = '2h',
'changelog-producer.compaction-interval' = '2m'
);
-- set the job name and merge the relevant ods business tables into the dwd layer
SET 'pipeline.name' = 'dwd_business_order';
INSERT INTO
dwd.`dwd_business_order`
SELECT
o.doccode,
......,
YEAR (o.docdate) AS o_year
,TO_TIMESTAMP(CONVERT_TZ(cast(CURRENT_TIMESTAMP as varchar), 'UTC', 'Asia/Shanghai')) AS `create_date`
FROM
ods.ods_shy_jh_doc_hdworkdochd o
INNER JOIN ods.ods_shy_base_enterprise en ON o.businessguid = en.entguid
LEFT JOIN dim.dim_hhl_user_code sales ON o.salesguid = sales.USER_GUID
LEFT JOIN dim.dim_hhl_user_code op ON o.bookingguid = op.USER_GUID
UNION ALL
SELECT
business_no,
......,
YEAR ( gmt_create ) AS o_year
,TO_TIMESTAMP(CONVERT_TZ(cast(CURRENT_TIMESTAMP as varchar), 'UTC', 'Asia/Shanghai')) AS `create_date`
FROM
ods.ods_bondexsea_doc_order
UNION ALL
SELECT
HBLIndex,
......,
YEAR ( CreateOPDate ) AS o_year
,TO_TIMESTAMP(CONVERT_TZ(cast(CURRENT_TIMESTAMP as varchar), 'UTC', 'Asia/Shanghai')) AS `create_date`
FROM
ods.`ods_airsea_airfreight_orderhawb`
;
In the Flink UI you can see the ods data being cleaned into the dwd_business_order table through Paimon's real-time join:
3. Lightly aggregate the dwd layer data into the dwm layer.
The dwm.`dwm_business_order_count` table sums the aggregated fields by primary key; the sum_orderCount field holds the aggregated result. Paimon automatically subtracts physically deleted data from the sum.
-- create the dwm light-aggregation table: order counts summarized by date, sales, operator, business category, customer, port of loading, and destination port
CREATE TABLE IF NOT EXISTS dwm.`dwm_business_order_count` (
`l_year` BIGINT NOT NULL COMMENT 'statistics year',
`l_month` BIGINT NOT NULL COMMENT 'statistics month',
`l_date` DATE NOT NULL COMMENT 'statistics date',
`bondex_shy_flag` varchar(8) NOT NULL COMMENT 'distinguishing flag',
`order_type_name` varchar(50) NOT NULL COMMENT 'business category',
`is_server_item` int NOT NULL COMMENT 'whether an order has been linked',
`customer_code` varchar(50) NOT NULL COMMENT 'customer number',
`sales_code` varchar(50) NOT NULL COMMENT 'sales number',
`delivery_center_op_id` varchar(50) NOT NULL COMMENT 'delivery number',
`pol_code` varchar(100) NOT NULL COMMENT 'port of loading code',
`pot_code` varchar(100) NOT NULL COMMENT 'transshipment port code',
`port_of_dest_code` varchar(100) NOT NULL COMMENT 'destination port code',
`customer_name` varchar(200) NOT NULL COMMENT 'customer name',
`sales_name` varchar(200) NOT NULL COMMENT 'sales name',
`delivery_center_op_name` varchar(200) NOT NULL COMMENT 'delivery name',
`sum_orderCount` BIGINT NOT NULL COMMENT 'order count',
`create_date` timestamp NOT NULL COMMENT 'creation time',
PRIMARY KEY (`l_year`, `l_month`,`l_date`,`order_type_name`,`bondex_shy_flag`,`is_server_item`,`customer_code`,`sales_code`,`delivery_center_op_id`,`pol_code`,`pot_code`,`port_of_dest_code`) NOT ENFORCED
) WITH (
'changelog-producer' = 'full-compaction',
'changelog-producer.compaction-interval' = '2m',
'merge-engine' = 'aggregation', -- use the aggregation merge engine to compute sum
'fields.sum_orderCount.aggregate-function' = 'sum',
'fields.create_date.ignore-retract'='true',
'fields.sales_name.ignore-retract'='true',
'fields.customer_name.ignore-retract'='true',
'snapshot.time-retained' = '2h',
'fields.delivery_center_op_name.ignore-retract'='true'
);
-- set the job name
SET 'pipeline.name' = 'dwm_business_order_count';
INSERT INTO
dwm.`dwm_business_order_count`
SELECT
YEAR(o.`consignor_date`) AS `l_year`
,MONTH(o.`consignor_date`) AS `l_month`
......,
,TO_TIMESTAMP(CONVERT_TZ(cast(CURRENT_TIMESTAMP as varchar), 'UTC', 'Asia/Shanghai')) AS create_date
FROM
dwd.`dwd_business_order` o
;
The Flink UI shows dwd_business_order data being aggregated and written into dwm_business_order_count:
4. Aggregate the dwm layer data into the dws layer; the dws layer summarizes at smaller dimensions.
-- aggregate the day's order counts by operator and business type
CREATE TABLE IF NOT EXISTS dws.`dws_business_order_count_op` (
`l_year` BIGINT NOT NULL COMMENT 'statistics year',
`l_month` BIGINT NOT NULL COMMENT 'statistics month',
`l_date` DATE NOT NULL COMMENT 'statistics date',
`order_type_name` varchar(50) NOT NULL COMMENT 'business category',
`delivery_center_op_id` varchar(50) NOT NULL COMMENT 'delivery number',
`delivery_center_op_name` varchar(200) NOT NULL COMMENT 'delivery name',
`sum_orderCount` BIGINT NOT NULL COMMENT 'order count',
`create_date` timestamp NOT NULL COMMENT 'creation time',
PRIMARY KEY (`l_year`, `l_month`,`l_date`,`order_type_name`,`delivery_center_op_id`) NOT ENFORCED
) WITH (
'merge-engine' = 'aggregation', -- use the aggregation merge engine to compute sum
'fields.sum_orderCount.aggregate-function' = 'sum',
'fields.create_date.ignore-retract'='true',
'snapshot.time-retained' = '2h',
'fields.delivery_center_op_name.ignore-retract'='true'
);
-- set the job name
SET 'pipeline.name' = 'dws_business_order_count_op';
INSERT INTO
dws.`dws_business_order_count_op`
SELECT
o.`l_year`
,o.`l_month`
,o.`l_date`
,o.`order_type_name`
,o.`delivery_center_op_id`
,o.`delivery_center_op_name`
,o.`sum_orderCount`
,TO_TIMESTAMP(CONVERT_TZ(cast(CURRENT_TIMESTAMP as varchar), 'UTC', 'Asia/Shanghai')) AS create_date
FROM
dwm.`dwm_business_order_count` o
;
The Flink UI shows dwm_business_order_count data being aggregated and written into dws_business_order_count_op:
Overall Data Flow Example
Source table:
paimon-ods:
paimon-dwd:
paimon-dwm:
paimon-dws:
A special reminder: when extracting from a SQL Server database, if the source table is too large, bulk extraction will lock the table. We recommend incremental extraction where the business permits. For full extraction from SQL Server, you can use a transfer approach: import the full data from SQL Server into MySQL, sync it from MySQL into paimon-ods, and then use SQL Server only for incremental extraction.
04
Troubleshooting Analysis
1. Inaccurate calculation of aggregated data
SQL Server CDC collects data into a Paimon table.
Description:
dwd table:
'changelog-producer' = 'input'
ads table:
'merge-engine' = 'aggregation', -- use aggregation to calculate sum
'fields.sum_amount.aggregate-function' = 'sum'
If the ADS layer aggregation table uses agg sum, the dwd data stream does not generate UPDATE_BEFORE messages, producing an erroneous UPDATE_AFTER-only stream. For example, suppose the upstream source updates a value from 10 to 30. The dwd layer value becomes 30, and the ads aggregation should also become 30; but because the change arrives as an append, the result becomes 10 + 30 = 40, which is wrong.
Solution:
By specifying 'changelog-producer' = 'full-compaction', Table Store will compare the results between full compactions and produce the differences as changelog. The latency of changelog is affected by the frequency of full compactions.
By specifying changelog-producer.compaction-interval table property (default value 30min), users can define the maximum interval between two full compactions to ensure latency. This table property does not affect normal compactions and they may still be performed once in a while by writers to reduce reader costs.
This solves the above problem, but a new one arises: the default changelog-producer.compaction-interval is 30 minutes, which means an upstream change takes 30 minutes to become visible in ads queries. In production we found that after changing the compaction interval to 1 or 2 minutes, the ADS layer aggregation anomaly described above no longer appears.
'changelog-producer.compaction-interval' = '2m'
It is also necessary to configure table.exec.sink.upsert-materialize=none when writing to the Flink Table Store, to avoid upsert materialization in the sink; this ensures the complete changelog is saved in the Flink Table Store, in preparation for subsequent streaming reads.
set 'table.exec.sink.upsert-materialize' = 'none'
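Putting the fix together, a sketch of the corrected setup might look like this. The table name and columns are illustrative, not from the original job:

```sql
-- Disable the upsert materialization operator so the complete changelog
-- reaches the Paimon table.
SET 'table.exec.sink.upsert-materialize' = 'none';

-- dwd table sketch: produce changelog via full compaction, with a short
-- compaction interval so downstream latency stays low (names illustrative).
CREATE TABLE dwd_orders (
    order_id   BIGINT,
    amount     DECIMAL(18, 2),
    PRIMARY KEY (order_id) NOT ENFORCED
) WITH (
    'changelog-producer'                     = 'full-compaction',
    'changelog-producer.compaction-interval' = '2m'
);
```

With UPDATE_BEFORE/UPDATE_AFTER pairs now present in the changelog, the downstream sum aggregation computes 30 instead of 40 in the example above.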
2. Identical sequence.field values prevent the dwd detail wide table from receiving updates
Scenario: MySQL CDC collects data into a Paimon table.
Description:
Execute update on the MySQL source
After the data modification is successful, the dwd_orders table data can be synchronized successfully
However, the data in the dwd_enriched_orders table is not synchronized; starting the job in streaming mode to inspect it shows no data flowing through.
Solution:
Investigation found the cause was the parameter 'sequence.field' = 'o_orderdate' (o_orderdate is used to generate the sequence id, and when records share a primary key, the record with the larger sequence id wins the merge). Because the o_orderdate field does not change when the price is modified, the sequence values are identical and the merge order becomes nondeterministic: ROW1 and ROW2 have the same o_orderdate, so one of them is picked at random on update. The parameter can simply be removed; after removal, Paimon automatically generates sequence numbers following the input order, which does not affect the synchronization result.
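A minimal sketch of the fix, with a hypothetical table definition (only the table options matter here):

```sql
-- Illustrative DDL for the dwd wide table. The problematic option was:
--   'sequence.field' = 'o_orderdate'
-- Since o_orderdate is unchanged by price updates, sequence numbers tie and
-- the merge result is nondeterministic. The fix is to omit sequence.field,
-- letting Paimon assign sequence numbers by input order.
CREATE TABLE dwd_enriched_orders (
    o_orderkey   BIGINT,
    o_orderdate  DATE,
    o_totalprice DECIMAL(18, 2),
    PRIMARY KEY (o_orderkey) NOT ENFORCED
) WITH (
    'changelog-producer' = 'input'
    -- no 'sequence.field' here
);
```

Keep sequence.field only when the chosen column is guaranteed to change monotonically with every update (e.g. an update timestamp maintained by the source).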
3. Aggregate function 'last_non_null_value' does not support retraction
Error: Caused by: java.lang.UnsupportedOperationException: Aggregate function 'last_non_null_value' does not support retraction, If you allow this function to ignore retraction messages, you can configure 'fields.${field_name}.ignore-retract'='true'.
An explanation can be found in the official documentation:
Only sum supports retraction (UPDATE_BEFORE and DELETE), others aggregate functions do not support retraction.
In other words, aggregate functions other than SUM do not support retraction. To avoid failures on DELETE and UPDATE_BEFORE messages, configure 'fields.${field_name}.ignore-retract'='true' on the affected field so those messages are ignored, which resolves the error:
WITH (
    'merge-engine' = 'aggregation', -- use aggregation to calculate sum
    'fields.sum_orderCount.aggregate-function' = 'sum',
    'fields.create_date.ignore-retract' = 'true' -- the create_date field
);
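For context, a complete hedged sketch of such a table (column and table names are illustrative):

```sql
-- Illustrative ADS aggregation table. Fields without an explicit aggregate
-- function default to last_non_null_value, which cannot handle retraction,
-- so create_date gets ignore-retract while sum_orderCount uses sum.
CREATE TABLE ads_order_count (
    region         STRING,
    sum_orderCount BIGINT,
    create_date    DATE,
    PRIMARY KEY (region) NOT ENFORCED
) WITH (
    'merge-engine' = 'aggregation',
    'fields.sum_orderCount.aggregate-function' = 'sum',
    'fields.create_date.ignore-retract' = 'true'
);
```

Note that ignoring retraction is safe for a field like create_date that is not expected to change; for fields that do change on update, only sum can propagate the correction.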
4. Paimon task abnormally interrupted
The task was interrupted abnormally and the pod went down.
Viewing the loki log shows akka.pattern.AskTimeoutException: Ask timed out on
java.util.concurrent.TimeoutException: Invocation of [RemoteRpcInvocation(JobMasterGateway.updateTaskExecutionState(TaskExecutionState))] at recipient [akka.tcp://[email protected]:6123/user/rpc/jobmanager_2] timed out. This is usually caused by: 1) Akka failed sending the message silently, due to problems like oversized payload or serialization failures. In that case, you should find detailed error information in the logs. 2) The recipient needs more time for responding, due to problems like slow machines or network jitters. In that case, you can try to increase akka.ask.timeout.\n"
The preliminary judgment is that the Akka timeout mechanism was triggered for one of the two reasons above. The remedy is to adjust the cluster's Akka timeout configuration and either split the single large task or increase its resource allocation.
Let's first look at how to modify the parameters:
| key | default | description |
| --- | --- | --- |
| akka.ask.timeout | 10s | Timeout used for all futures and blocking Akka calls. If Flink fails due to timeouts then you should try to increase this value. Timeouts can be caused by slow machines or a congested network. The timeout value requires a time-unit specifier (ms/s/min/h/d). |
| web.timeout | 600000 | Timeout for asynchronous operations by the web monitor in milliseconds. |
Add the following parameters at the end of conf/flink-conf.yaml
akka.ask.timeout: 100s
web.timeout: 1000000
Then manually refresh flink-conf.yaml in StreamPark and verify that the parameters were synchronized successfully.
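The append-and-verify step can be sketched in shell. The relative config path is an assumption; adjust it to your actual Flink conf directory:

```shell
# Append the Akka/web timeout overrides to flink-conf.yaml and verify them.
# The path conf/flink-conf.yaml is a placeholder for your Flink conf dir.
FLINK_CONF=conf/flink-conf.yaml
mkdir -p "$(dirname "$FLINK_CONF")"

cat >> "$FLINK_CONF" <<'EOF'
akka.ask.timeout: 100s
web.timeout: 1000000
EOF

# Confirm the new keys are present before restarting the cluster.
grep -E 'akka.ask.timeout|web.timeout' "$FLINK_CONF"
```

After the restart, the longer timeouts should prevent the AskTimeoutException under transient load or network jitter, though splitting oversized tasks remains the more fundamental fix.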
5. Snapshot "no such file or directory"
It was found that checkpoints (cp) were failing.
The log at the corresponding point in time shows that a snapshot is missing; the task still shows as RUNNING, but source-table MySQL data can no longer be written into the paimon ods table.
The checkpoint failures were traced to heavy, CPU-intensive computation: the TaskManager thread spends all its time in processElement and never gets a chance to perform the checkpoint.
The unreadable snapshot is caused by insufficient Flink cluster resources: the Writer and the Committer compete, and an expired, incomplete snapshot is read during Full-Compaction. This has since been fixed upstream:
https://github.com/apache/incubator-paimon/pull/1308
The solution to the checkpoint failures is to increase parallelism: raise the deployment's TaskManager slots and JobManager CPU:
-D kubernetes.jobmanager.cpu=0.8
-D kubernetes.jobmanager.cpu.limit-factor=1
-D taskmanager.numberOfTaskSlots=8
-D jobmanager.adaptive-batch-scheduler.default-source-parallelism=2
For complex real-time tasks, resources can be increased by adjusting these dynamic parameters.
05
Future plan
The self-built data platform BonData is integrating Paimon's metadata, data index system, data lineage, one-click pipelines, and other capabilities to form Haicheng Bangda's data assets, and will carry out one-stop data governance on this basis.
Later, Doris will be connected via the Trino catalog to achieve unified serving of offline and real-time data.
The Doris + Paimon architecture will continue to drive the construction of the group's internal unified stream-batch data warehouse.