Apache Doris detailed tutorial (1)

1. Introduction to Doris

1.1. Overview of doris

Apache Doris was developed by Baidu's big data department. It was originally called Baidu Palo and was renamed Doris after being contributed to the Apache community in 2018. Inside Baidu it is used by more than 200 product lines and deployed on more than 1,000 machines, with a single business reaching hundreds of TB of data.

Apache Doris is a modern MPP (Massively Parallel Processing) analytical (OLAP) database. It returns query results with sub-second response times, which makes it an effective platform for real-time data analysis.

The distributed architecture of Apache Doris is very simple, easy to operate and maintain, and can support very large data sets of more than 10PB.

Apache Doris can meet a variety of data analysis needs, such as fixed historical reports, real-time data analysis, interactive data analysis, and exploratory data analysis.

1.2. The difference between OLAP and OLTP

Online transaction processing, OLTP (On-Line Transaction Processing)

  • This is the scenario of a company's business systems: the business database handles a large volume of random inserts, deletes, updates and point queries.
  • High concurrency
  • Fast response
  • Transaction support

Online analytical processing, OLAP (On-Line Analytical Processing)
This is the scenario of a company's data analysis: statistical analysis is performed on the data generated by the business systems.

  • A single operation usually touches the entire data set.
  • Read-only queries; no inserts, updates or deletes
  • Relatively slow query response times are acceptable
  • Concurrency requirements are not high

Common OLAP engines


1.3. Usage scenarios

1. Report analysis

  • Real-time dashboards (Dashboards): send Doris a SQL query and it returns the desired result in sub-second time.
  • Reports for internal analysts and managers
  • Highly concurrent report analysis for users or customers (Customer Facing Analytics). For example, site analytics for website owners and advertising reports for advertisers usually require thousands of QPS of concurrency and millisecond-level query latency. The e-commerce company JD.com uses Apache Doris for advertising reports, writing 10 billion rows of data per day, serving tens of thousands of concurrent query QPS, with a 99th-percentile query latency of 150 ms.

2. Ad-hoc Query
Self-service analysis for analysts; the query patterns are not fixed and high throughput is required. Xiaomi has built a growth analytics platform (Growing Analytics, GA) on Doris that uses user behavior data for business growth analysis. The average query latency is 10 s, the 95th-percentile latency is within 30 s, and tens of thousands of SQL queries are run per day.

3. Unified data warehouse construction
One platform meets the needs of unified data warehouse construction and simplifies the cumbersome big data software stack. Haidilao built a unified data warehouse on Doris that replaced an old architecture made up of Spark, Hive, HBase and Phoenix, greatly simplifying the architecture.

4. Data Lake Federated Query
Federated analysis of data in Hive, Hudi and other lake formats through external tables. Query performance is greatly improved without copying data.

1.4. Advantages of doris


1.5. Architecture

The architecture of Doris is very simple. There are only two roles and two kinds of background processes: FE (Frontend) and BE (Backend). Doris does not rely on external components, which makes deployment and operations easy, and both FE and BE can be scaled out online.

1. FE (Frontend): stores and maintains cluster metadata; is responsible for receiving and parsing query requests, planning query plans, scheduling query execution, and returning query results.

There are three main roles:

  • Leader and Follower: Mainly used to achieve high availability of metadata, ensuring that in the event of a single node failure, metadata can be restored online in real time without affecting the entire service.
  • Observer: used to expand query nodes and also function as metadata backup. If you find that the cluster pressure is very high and you need to expand the entire query capability, you can add observer nodes. The observer does not participate in any writing, only reading.

2. BE (Backend): responsible for the storage and computation of physical data; executes queries in a distributed manner based on the physical plan generated by the FE. Data reliability is guaranteed by the BE, which stores the whole data set in multiple replicas (3 by default); the number of replicas can be adjusted dynamically as needed.

3. MySQL Client: Doris implements the MySQL protocol, so users can connect to Doris directly with any MySQL ODBC/JDBC driver or the MySQL command-line client.

4. Broker: an independent stateless process. It encapsulates the file system interface and provides Doris with the ability to read files in remote storage systems, including HDFS, S3, BOS, etc.


1.6. Default port

The default ports used in this tutorial: FE http_port 8030, FE rpc_port 9020, FE query_port 9030 (MySQL protocol), FE edit_log_port 9010, BE heartbeat_service_port 9050, Broker broker_ipc_port 8000.

2. Doris installation

2.1. Preparation

Operating system environment requirements
Set the maximum number of open file descriptors (handles) in the system. When a program starts, the number of files it can open is limited by this handle count.

1. Open the file
vi /etc/security/limits.conf 
2. Append the following lines at the end of the file (note: the leading * must be included)

* soft nofile 65535 
* hard nofile 65535 
* soft nproc 65535 
* hard nproc 65535

ulimit -n 65535    (takes effect only for the current session)

After editing the file, reboot the virtual machine; the change then takes effect permanently.
For a temporary change you can also just run ulimit -n 65535.


If this handle limit is not raised to 60000 or more, the Doris BE node will report the following errors when it starts:
File descriptor number is less than 60000. Please use (ulimit -n) to set a value equal or greater than 60000
W1120 18:14:20.934705  3437 storage_engine.cpp:188] check fd number failed, error: Internal error: file descriptors limit is too small
W1120 18:14:20.934713  3437 storage_engine.cpp:102] open engine failed, error: Internal error: file descriptors limit is too small
F1120 18:14:20.935087  3437 doris_main.cpp:404] fail to open StorageEngine, res=file descriptors limit is too small

Clock synchronization
Doris's metadata requires the clock deviation between nodes to be less than 5000 ms, so the clocks of all machines in the cluster must be synchronized to avoid service anomalies caused by metadata inconsistencies due to clock drift.

How do we synchronize the time?
First install ntpdate:
# ntpdate synchronizes the local clock against time servers on the Internet
[root@zuomm01 doris]# yum install ntpdate -y

Then synchronize the time on all three machines:

[root@node01 ~]# ntpdate ntp.sjtu.edu.cn

NIST (US) time server: time.nist.gov (192.43.244.18)
Shanghai Jiao Tong University NTP server: ntp.sjtu.edu.cn (202.120.2.101)
National Time Service Center (China): cn.pool.ntp.org (210.72.145.44)

# Write the current time to the BIOS so it persists; otherwise the old time comes back after a reboot
clock -w 

Turn off the swap partition (swap)

The swap partition is a disk partition used by Linux as virtual memory;

Linux can use a disk partition as memory (virtual memory, swap partition);

The use of swap partitions in Linux will cause serious performance problems for Doris. It is recommended to disable the swap partition before installation;

1. Check the current Linux swap partition
free -m
2. Disable the swap partition
swapoff -a

[root@zuomm01 app]# free -m
              total        used        free      shared  buff/cache   available
Mem:           5840         997        4176           9         666        4604
Swap:          6015           0        6015
[root@zuomm01 app]# swapoff -a

3. Verify that swap has been disabled
[root@zuomm01 app]# free -m   
              total        used        free      shared  buff/cache   available
Mem:           5840         933        4235           9         671        4667
Swap:             0           0           0

Precautions:

1. The disk space of FE is mainly used to store metadata, including logs and images. Typically ranges from a few hundred MB to several GB.

2. The disk space of BE is mainly used to store user data. The total disk space is calculated based on the total user data volume * 3 (3 copies), and then an additional 40% space is reserved for background compaction and the storage of some intermediate data.

3. Multiple BE instances can be deployed on a machine, but only one FE can be deployed. If 3 copies of data are required, at least 3 machines need to be deployed with one BE instance each (instead of 1 machine with 3 BE instances deployed). The clocks of the servers where multiple FEs are located must be consistent (clock deviations of up to 5 seconds are allowed)

4. The test environment can also only use one BE for testing. In actual production environments, the number of BE instances directly determines the overall query latency.

5. Close Swap on all deployment nodes.

6. At least 1 FE node (1 Follower) is required. With 1 Follower and 1 Observer deployed, high read availability can be achieved; with 3 Followers deployed, high availability (HA) for both reads and writes can be achieved.

7. The number of Followers must be an odd number, and the number of Observers is arbitrary.

8. Based on past experience, when cluster availability requirements are high (such as providing online services), 3 Followers and 1-3 Observers can be deployed. If it is an offline business, it is recommended to deploy 1 Follower and 1-3 Observers.

9. Broker is a process used to access external data sources (such as HDFS). Typically, deploying one broker instance on each machine is sufficient.

2.2. Install FE

1. Go to the official website to download the source code package, the official website address is: https://doris.apache.org
2. Choose and download the package that matches your configuration
3. Upload the package to Linux, decompress it, and modify the configuration file

-- find the fe.conf file under your own installation path
vi /opt/app/doris/fe/conf/fe.conf 
# specify the metadata path in the configuration file (this directory must be created manually)
meta_dir = /opt/data/dorisdata/doris-meta
# bind the IP (set it to each machine's own IP/subnet)
priority_networks = 192.168.17.0/24 

4. Distribution cluster

[root@zuomm01 app]# for i in 2 3 
> do
> scp /etc/profile linux0$i:/etc/profile
> scp -r /opt/app/doris/ linux0$i:/opt/app/
> done

5. Start

Go into FE's bin directory and run:
[root@zuomm01 bin]# ./start_fe.sh --daemon

In production it is strongly recommended to put the metadata directory in a separate location rather than inside the Doris installation directory, preferably on its own disk (an SSD is best). If the machine has multiple IPs, for example internal and external networks or virtual machine / docker NICs, priority_networks must be set so the correct IP is recognized. JAVA_OPTS sets the Java maximum heap to 4 GB by default; in production it is recommended to raise it to 8 GB or more.

2.3. Install BE

1. Enter the conf directory of be and modify the configuration file.

vi be.conf  

# specify the data storage path in the configuration file
storage_root_path = /opt/data/dorisdata/bedata

# bind the IP (set it to each machine's own IP/subnet)
priority_networks = 192.168.17.0/24 

Precautions:

storage_root_path defaults to be/storage and the directory must be created manually. Separate multiple paths with ASCII semicolons (;), and do not add one after the last path.

The storage medium of each directory, HDD or SSD, can be indicated through the path suffix, and a capacity limit can be appended after a comma, for example:

storage_root_path=/home/disk1/doris.HDD,50;/home/disk2/doris.SSD,10;/home/disk2/doris

Explanation:

/home/disk1/doris.HDD,50, means the storage limit is 50GB, HDD;

/home/disk2/doris.SSD,10, storage limit is 10GB, SSD;

/home/disk2/doris, the storage limit is the maximum disk capacity, the default is HDD

Is that enough? Not yet.

Because FE and BE are independent processes that do not yet know about each other, we need to register the BE nodes with the FE through a MySQL client.

2. Install mysql

-- check yum
0 yum list  
-- install wget
1 yum -y install wget
-- use wget to download the MySQL repository package directly; this resolves some dependency problems
2 wget -i -c http://dev.mysql.com/get/mysql57-community-release-el7-10.noarch.rpm
-- install the repository package just downloaded
3 yum -y install mysql57-community-release-el7-10.noarch.rpm 
-- install the MySQL server
4 yum -y install mysql-community-server
This step sometimes fails with:
Failing package is: mysql-community-server-5.7.37-1.el7.x86_64
GPG Keys are configured as: file:///etc/pki/rpm-gpg/RPM-GPG-KEY-mysql
Workaround: append --nogpgcheck to the yum install command to skip GPG verification,
for example: yum install mysql-community-server --nogpgcheck
(tested, it installs successfully)
-- start MySQL
5 service mysqld start
-- check port 3306 to confirm the MySQL service is running
6 netstat -nltp | grep  3306
-- look up the initial password in the MySQL log
7 grep "password" /var/log/mysqld.log
2020-06-24T07:21:25.731630Z 1 [Note] A temporary password is generated for root@localhost: Apd>;WYEc2Ir
-- note: the password does not start with a space
-- if you cannot log in, try mysql -u root -p, press Enter, then paste the password at the prompt
2020-06-24T07:21:48.097350Z 2 [Note] Access denied for user 'root'@'localhost' (using password: NO)
-- log in to MySQL
8 log in:  mysql -uroot -pWYEc2Ir
-- change the MySQL password; first set these two parameters so a simple password like 123456 is accepted
9 change the password
    mysql> set global validate_password_policy=0;
    mysql> set global validate_password_length=1;   -- after these two settings a simple password will not be rejected
-- change the MySQL password
alter user user() identified by "XXXXXX";    -- XXXXXX is your new password

3. Use MySQL Client to connect to FE

mysql -h zuomm01 -P 9030 -uroot

This only uses the MySQL client to connect to Doris's FE; it does not start MySQL itself! Also, the first time you connect, no password is required.
Explanation:
-h  host to connect to
-P  port
-u  user
-p  password

-- Setting a password is optional. In production a relatively complex password is normally set; for learning it does not matter.
-- If you want to set one, the following command works:
SET PASSWORD FOR 'root' = PASSWORD('123456');

4. After FE has started, you can check its running status:

SHOW PROC '/frontends'\G;

5. Add BE node

ALTER SYSTEM ADD BACKEND "zuomm01:9050"; 
ALTER SYSTEM ADD BACKEND "zuomm02:9050"; 
ALTER SYSTEM ADD BACKEND "zuomm03:9050";

6. Check BE status

SHOW PROC '/backends';

If Alive is false, the BE node is not alive yet.

7. Add environment variables

#doris_fe
export DORIS_FE_HOME=/opt/app/doris1.1.4/fe
export PATH=$PATH:$DORIS_FE_HOME/bin

#doris_be
export DORIS_BE_HOME=/opt/app/doris1.1.4/be
export PATH=$PATH:$DORIS_BE_HOME/bin

8. Start BE

Start BE (on every node):
/opt/app/doris/be/bin/start_be.sh --daemon 

After starting, check the BE nodes again:
mysql -h zuomm01 -P 9030 -uroot -p123456
SHOW PROC '/backends'; 
If Alive is true, the BE node is alive.

2.4. Deploy FS_Broker (optional)

Broker is deployed independently of Doris in the form of a plug-in. If you need to import data from a third-party storage system, you need to deploy the corresponding Broker. By default, fs_broker for reading HDFS, Baidu Cloud BOS, and Amazon S3 is provided. fs_broker is stateless, and it is recommended that each FE and BE node deploy a Broker.

Just start it!

1. Start Broker

/opt/app/doris/fe/apache_hdfs_broker/bin/start_broker.sh --daemon 

2. Use mysql-client to connect to the started FE and execute the following command:

mysql -h zuomm01 -P 9030 -uroot -p123456
ALTER SYSTEM ADD BROKER broker_name "zuomm01:8000","zuomm02:8000","zuomm03:8000";

Of course, you can also add them one at a time. broker_name is just a name; you can choose it yourself.

3. Check Broker status

Use mysql-client to connect to any running FE and execute the following command to check the Broker status:
SHOW PROC "/brokers";  

2.5. Expansion and reduction

1. FE expansion and contraction

High availability of FE can be achieved by scaling FE out to 3 or more nodes.

After logging in with the MySQL client, you can check the FE status with a SQL command. Currently there is only one FE:

mysql -h zuomm01 -P 9030 -uroot -p 
mysql> SHOW PROC '/frontends'\G;


*************************** 1. row ***************************
             Name: 192.168.17.3_9010_1661510658077
               IP: 192.168.17.3
         HostName: zuomm01
      EditLogPort: 9010
         HttpPort: 8030
        QueryPort: 9030
          RpcPort: 9020
             Role: FOLLOWER
         IsMaster: true
        ClusterId: 1133836578
             Join: true
            Alive: true
ReplayedJournalId: 2472
    LastHeartbeat: 2022-08-26 13:07:47
         IsHelper: true
           ErrMsg: 
          Version: 1.1.1-rc03-2dbd70bf9
 CurrentConnected: Yes
1 row in set (0.03 sec)

Add a new node of FE:

FE is divided into three roles: Leader, Follower and Observer. By default, a cluster can only have one Leader and multiple Followers and Observers. The Leader and Follower form a Paxos selection group. If the Leader goes down, the remaining Followers will automatically select a new Leader to ensure high write availability. Observer synchronizes Leader's data but does not participate in the election.

1. If only one FE is deployed, the FE will be the Leader by default. On this basis, several Followers and Observers can be added.

ALTER SYSTEM ADD FOLLOWER "zuomm02:9010"; 
ALTER SYSTEM ADD OBSERVER "zuomm03:9010";

2. Start the FE nodes on zuomm02 and zuomm03 respectively.

/opt/app/doris/fe/bin/start_fe.sh --helper  zuomm01:9010 --daemon

Remember: when an FE node is added for the first time, you must pass the two arguments --helper zuomm01:9010.
mysql> SHOW PROC '/frontends'\G;
*************************** 1. row ***************************
             Name: 192.168.17.4_9010_1661490723344
               IP: 192.168.17.4
         HostName: zuomm02
      EditLogPort: 9010
         HttpPort: 8030
        QueryPort: 0
          RpcPort: 0
             Role: FOLLOWER
         IsMaster: false
        ClusterId: 1133836578
             Join: false
            Alive: false
ReplayedJournalId: 0
    LastHeartbeat: NULL
         IsHelper: true
           ErrMsg: java.net.ConnectException: Connection refused (Connection refused)
          Version: NULL
 CurrentConnected: No
*************************** 2. row ***************************
             Name: 192.168.17.5_9010_1661490727316
               IP: 192.168.17.5
         HostName: zuomm03
      EditLogPort: 9010
         HttpPort: 8030
        QueryPort: 0
          RpcPort: 0
             Role: OBSERVER
         IsMaster: false
        ClusterId: 1133836578
             Join: false
            Alive: false
ReplayedJournalId: 0
    LastHeartbeat: NULL
         IsHelper: false
           ErrMsg: java.net.ConnectException: Connection refused (Connection refused)
          Version: NULL
 CurrentConnected: No
*************************** 3. row ***************************
             Name: 192.168.17.3_9010_1661510658077
               IP: 192.168.17.3
         HostName: zuomm01
      EditLogPort: 9010
         HttpPort: 8030
        QueryPort: 9030
          RpcPort: 9020
             Role: FOLLOWER
         IsMaster: true
        ClusterId: 1133836578
             Join: true
            Alive: true
ReplayedJournalId: 2577
    LastHeartbeat: 2022-08-26 13:13:33
         IsHelper: true
           ErrMsg: 
          Version: 1.1.1-rc03-2dbd70bf9
 CurrentConnected: Yes
3 rows in set (0.04 sec)

3. Delete FE node

ALTER SYSTEM DROP FOLLOWER[OBSERVER] "fe_host:edit_log_port"; 

ALTER SYSTEM DROP FOLLOWER "zuomm01:9010"; 
2. BE expansion and reduction

1. Add BE node

In the MySQL client, add a BE node with the
ALTER SYSTEM ADD BACKEND command:
ALTER SYSTEM ADD BACKEND "zuomm01:9050"; 

2. Delete BE nodes using DROP (not recommended)

ALTER SYSTEM DROP BACKEND "be_host:be_heartbeat_service_port";
ALTER SYSTEM DROP BACKEND "zuomm01:9050"; 

Note: DROP BACKEND deletes the BE directly, and the data on it can no longer be recovered!!! Therefore we strongly recommend against deleting BE nodes with DROP BACKEND. When you run this statement, a confirmation prompt is shown to prevent accidental use.

3. DECOMMISSION method to delete BE nodes (recommended)

ALTER SYSTEM DECOMMISSION BACKEND  "be_host:be_heartbeat_service_port"; 
ALTER SYSTEM DECOMMISSION BACKEND "zuomm01:9050"; 

1. This command is used to safely delete BE nodes. After the command is issued, Doris will try to migrate the data on the BE to other BE nodes. When all data migration is completed, Doris will automatically delete the node.

2. This command is an asynchronous operation. After it is executed, you can use SHOW PROC '/backends'; to see that the isDecommission status of the BE node is true, which indicates that the node is being taken offline.

3. This command may not be executed successfully. For example, when the remaining BE storage space is not enough to accommodate the data on the offline BE, or the number of remaining machines does not meet the minimum number of copies, the command cannot be completed, and BE will always be in a state where isDecommission is true.

4. The progress of DECOMMISSION can be tracked through TabletNum in SHOW PROC '/backends';. While it is in progress, TabletNum keeps decreasing.

5. The operation can be canceled with the command CANCEL DECOMMISSION BACKEND "be_host:be_heartbeat_service_port";. After canceling, the data on the BE keeps its current remaining volume, and Doris re-balances the load afterwards.

3. Broker expansion and contraction

There is no hard requirement for the number of Broker instances. Usually one can be deployed per physical machine. Adding and deleting Brokers can be done with the following commands:

ALTER SYSTEM ADD BROKER broker_name "broker_host:broker_ipc_port";  
ALTER SYSTEM DROP BROKER broker_name "broker_host:broker_ipc_port";  
ALTER SYSTEM DROP ALL BROKER broker_name; 

Broker is a stateless process and can be started and stopped at will. Of course, jobs running on it when it is stopped will fail; just retry them.

3. Data table design

3.1. Field type


3.2. Basic concepts of tables

1、Row & Column

A table includes rows and columns;

Row is a row of user data. Column is used to describe different fields in a row of data.

Columns in Doris fall into two categories: Key columns and Value columns. Key columns serve two purposes:

In the Aggregate table model, the Key columns are the basis for aggregation and sorting.

In the other table models, the Key columns are the basis for sorting.
2. Partitioning and bucketing

Partition: logically divides a table into slices by rows (horizontally).

Tablet (also called a bucket): physically divides a partition into slices by rows (horizontally).


3、Partition

1. The Partition column can specify one or more columns. In the aggregation model, the partition column must be a KEY column.

2. No matter what type the partition column is, you need to add double quotes when writing the partition value.

3. Theoretically there is no upper limit on the number of partitions.

4. When creating a table without using Partition, the system will automatically generate a Partition with the same name as the table name and a full value range. This Partition is invisible to users and cannot be deleted.

5. When creating a partition, you cannot add partitions with overlapping ranges.

range partition

range partition creation syntax

-- Range Partition
drop table if exists test.expamle_range_tbl;
CREATE TABLE IF NOT EXISTS test.expamle_range_tbl
(
    `user_id` LARGEINT NOT NULL COMMENT "用户id",
    `date` DATE NOT NULL COMMENT "数据灌入日期时间",
    `timestamp` DATETIME NOT NULL COMMENT "数据灌入的时间戳",
    `city` VARCHAR(20) COMMENT "用户所在城市",
    `age` SMALLINT COMMENT "用户年龄",
    `sex` TINYINT COMMENT "用户性别"
)
ENGINE=OLAP
DUPLICATE KEY(`user_id`, `date`) -- table model
-- partition syntax
PARTITION BY RANGE(`date`) -- specify the partition type and partition column
(
    -- specify the partition name and its upper bound; ranges are left-closed, right-open
    PARTITION `p201701` VALUES LESS THAN ("2017-02-01"), 
    PARTITION `p201702` VALUES LESS THAN ("2017-03-01"),
    PARTITION `p201703` VALUES LESS THAN ("2017-04-01")
)
DISTRIBUTED BY HASH(`user_id`) BUCKETS 1;

1. The partition column is usually a time column to facilitate the management of old and new data.

2. Partition supports specifying only the upper bound through VALUES LESS THAN (...). The system will use the upper bound of the previous partition as the lower bound of the partition to generate an interval that is closed on the left and open on the right. At the same time, it also supports specifying upper and lower bounds through VALUES […) to generate an interval that is closed on the left and open on the right.

3. It is easier to understand to specify upper and lower bounds simultaneously through VALUES […). Here is an example to illustrate how the partition range changes when using the VALUES LESS THAN (…) statement to add or delete partitions:

As can be seen in the table creation statement of expamle_range_tbl above, when the table creation is completed, the following three partitions will be automatically generated:

-- check the partitions of the table
SHOW PARTITIONS FROM test.expamle_range_tbl \G;

mysql> SHOW PARTITIONS FROM test.expamle_range_tbl \G;
*************************** 1. row ***************************
             PartitionId: 12020
           PartitionName: p201701
          VisibleVersion: 1
      VisibleVersionTime: 2022-08-30 21:57:36
                   State: NORMAL
            PartitionKey: date
                   Range: [types: [DATE]; keys: [0000-01-01]; ..types: [DATE]; keys: [2017-02-01]; )
         DistributionKey: user_id
                 Buckets: 1
          ReplicationNum: 3
           StorageMedium: HDD
            CooldownTime: 9999-12-31 23:59:59
LastConsistencyCheckTime: NULL
                DataSize: 0.000 
              IsInMemory: false
       ReplicaAllocation: tag.location.default: 3
*************************** 2. row ***************************
             PartitionId: 12021
           PartitionName: p201702
          VisibleVersion: 1
      VisibleVersionTime: 2022-08-30 21:57:36
                   State: NORMAL
            PartitionKey: date
                   Range: [types: [DATE]; keys: [2017-02-01]; ..types: [DATE]; keys: [2017-03-01]; )
         DistributionKey: user_id
                 Buckets: 1
          ReplicationNum: 3
           StorageMedium: HDD
            CooldownTime: 9999-12-31 23:59:59
LastConsistencyCheckTime: NULL
                DataSize: 0.000 
              IsInMemory: false
       ReplicaAllocation: tag.location.default: 3
*************************** 3. row ***************************
             PartitionId: 12022
           PartitionName: p201703
          VisibleVersion: 1
      VisibleVersionTime: 2022-08-30 21:57:35
                   State: NORMAL
            PartitionKey: date
                   Range: [types: [DATE]; keys: [2017-03-01]; ..types: [DATE]; keys: [2017-04-01]; )
         DistributionKey: user_id
                 Buckets: 1
          ReplicationNum: 3
           StorageMedium: HDD
            CooldownTime: 9999-12-31 23:59:59
LastConsistencyCheckTime: NULL
                DataSize: 0.000 
              IsInMemory: false
       ReplicaAllocation: tag.location.default: 3
3 rows in set (0.00 sec)

The following three partitions are created:

p201701: [MIN_VALUE,  2017-02-01)
p201702: [2017-02-01, 2017-03-01)
p201703: [2017-03-01, 2017-04-01)

When we add a partition p201705 VALUES LESS THAN ("2017-06-01"), the partition results are as follows:

ALTER TABLE test.expamle_range_tbl ADD PARTITION p201705 VALUES LESS THAN ("2017-06-01");
p201701: [MIN_VALUE,  2017-02-01)
p201702: [2017-02-01, 2017-03-01)
p201703: [2017-03-01, 2017-04-01)
p201705: [2017-04-01, 2017-06-01)

At this time, we delete partition p201703, and the partition results are as follows:

ALTER TABLE test.expamle_range_tbl DROP PARTITION p201703;
p201701: [MIN_VALUE,  2017-02-01)
p201702: [2017-02-01, 2017-03-01)
p201705: [2017-04-01, 2017-06-01)

Notice that the partition range of p201702 and p201705 has not changed, and a hole appears between these two partitions: [2017-03-01, 2017-04-01). That is, if the imported data range is within this hole range, it cannot be imported.

Deleting a partition will not change the scope of existing partitions. Deleting a partition may create holes. When adding a partition through the VALUES LESS THAN statement, the lower bound of the partition immediately follows the upper bound of the previous partition.
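If such a hole needs to be filled, a partition with an explicit range can be added with the VALUES [...) syntax mentioned above. A minimal sketch against the example table (the partition name p201703_new is chosen here for illustration):

-- fill the hole [2017-03-01, 2017-04-01) with a fixed-range partition (left-closed, right-open)
ALTER TABLE test.expamle_range_tbl
ADD PARTITION p201703_new VALUES [("2017-03-01"), ("2017-04-01"));

-- verify the partitions again
SHOW PARTITIONS FROM test.expamle_range_tbl;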

In addition to the single-column partitioning we saw above, Range partitioning also supports multi-column partitioning. Examples are as follows:

PARTITION BY RANGE(`date`, `id`)     -- left-closed, right-open
(
    PARTITION `p201701_1000` VALUES LESS THAN ("2017-02-01", "1000"),
    PARTITION `p201702_2000` VALUES LESS THAN ("2017-03-01", "2000"),
    PARTITION `p201703_all`  VALUES LESS THAN ("2017-04-01") -- the id column defaults to the minimum value of its type
)

In the above example, we specified date (DATE type) and id (INT type) as partitioning columns. The final partition obtained by the above example is as follows:

* p201701_1000:    [(MIN_VALUE,  MIN_VALUE), ("2017-02-01", "1000")   )
* p201702_2000:    [("2017-02-01", "1000"),  ("2017-03-01", "2000")   )
* p201703_all:     [("2017-03-01", "2000"),  ("2017-04-01", MIN_VALUE)) 

Note that in the last partition the user only specified the partition value of the date column, so the partition value of the id column is filled in with MIN_VALUE by default. When data is inserted, the partition column values are compared in order to determine which partition a row belongs to. Examples are as follows:
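For instance, assuming rows with the following (date, id) values are inserted (the rows are illustrative, not taken from the table above), they would be routed as follows:

-- data (date, id)     --> partition
-- 2017-01-01, 200     --> p201701_1000
-- 2017-01-01, 2000    --> p201701_1000
-- 2017-02-01, 100     --> p201701_1000
-- 2017-02-01, 2000    --> p201702_2000
-- 2017-02-15, 5000    --> p201702_2000
-- 2017-03-01, 2000    --> p201703_all
-- 2017-03-10, 1       --> p201703_all
-- 2017-04-01, 1000    --> cannot be imported (no matching partition)
-- 2017-05-01, 1000    --> cannot be imported (no matching partition)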

List partition

1. The partition column supports BOOLEAN, TINYINT, SMALLINT, INT, BIGINT, LARGEINT, DATE, DATETIME, CHAR, VARCHAR data types, and the partition value is an enumeration value. A partition can be hit only if the data is one of the target partition enumeration values.

2. Partition supports specifying the enumeration value contained in each partition through VALUES IN (…).

The following uses an example to illustrate the changes in partitions when adding or deleting partitions.

List partition creation syntax

-- List Partition

CREATE TABLE IF NOT EXISTS test.expamle_list_tbl
(
    `user_id` LARGEINT NOT NULL COMMENT "用户id",
    `date` DATE NOT NULL COMMENT "数据灌入日期时间",
    `timestamp` DATETIME NOT NULL COMMENT "数据灌入的时间戳",
    `city` VARCHAR(20) NOT NULL COMMENT "用户所在城市",
    `age` SMALLINT NOT NULL COMMENT "用户年龄",
    `sex` TINYINT NOT NULL COMMENT "用户性别",
    `last_visit_date` DATETIME REPLACE DEFAULT "1970-01-01 00:00:00" COMMENT "用户最后一次访问时间",
    `cost` BIGINT SUM DEFAULT "0" COMMENT "用户总消费",
    `max_dwell_time` INT MAX DEFAULT "0" COMMENT "用户最大停留时间",
    `min_dwell_time` INT MIN DEFAULT "99999" COMMENT "用户最小停留时间"
)
ENGINE=olap
AGGREGATE KEY(`user_id`, `date`, `timestamp`, `city`, `age`, `sex`)
PARTITION BY LIST(`city`)
(
    PARTITION `p_cn` VALUES IN ("Beijing", "Shanghai", "Hong Kong"),
    PARTITION `p_usa` VALUES IN ("New York", "San Francisco"),
    PARTITION `p_jp` VALUES IN ("Tokyo")
)
-- bucketing syntax
DISTRIBUTED BY HASH(`user_id`) BUCKETS 1
PROPERTIES
(
    "replication_num" = "3"
);

As shown in the expamle_list_tbl example above, when the table creation is completed, the following three partitions will be automatically generated:

p_cn: ("Beijing", "Shanghai", "Hong Kong")
p_usa: ("New York", "San Francisco")
p_jp: ("Tokyo")

List partitioning also supports multi-column partitioning, the example is as follows

PARTITION BY LIST(`id`, `city`)
(
    PARTITION `p1_city` VALUES IN (("1", "Beijing"), ("1", "Shanghai")),
    PARTITION `p2_city` VALUES IN (("2", "Beijing"), ("2", "Shanghai")),
    PARTITION `p3_city` VALUES IN (("3", "Beijing"), ("3", "Shanghai"))
)

In the above example, we specified id (INT type) and city (VARCHAR type) as partitioning columns. The final partition obtained by the above example is as follows:


* p1_city: [("1", "Beijing"), ("1", "Shanghai")]
* p2_city: [("2", "Beijing"), ("2", "Shanghai")]
* p3_city: [("3", "Beijing"), ("3", "Shanghai")]
4、Bucket

1. If Partition is used, the DISTRIBUTED... statement describes the rules for dividing data in each partition. If Partition is not used, it describes the partitioning rules for the data of the entire table.

2. The bucketing column can be multiple columns, but it must be a Key column. The bucketing column can be the same as or different from the Partition column.

3. The selection of bucketing columns is a trade-off between query throughput and query concurrency:

  • If multiple bucketing columns are selected, the data is distributed more evenly. If a query condition does not contain equality conditions on all bucketing columns, the query scans all buckets simultaneously, which increases scan throughput and reduces the latency of a single query. This method is suitable for high-throughput, low-concurrency query scenarios.
  • If only one or a few bucketing columns are selected, a point query can trigger a scan of just one bucket. When many point queries run concurrently, they are likely to hit different buckets, so the IO impact between queries is small (especially when different buckets sit on different disks). This method is therefore suitable for high-concurrency point query scenarios.

4. Theoretically there is no upper limit on the number of buckets

Recommendations on the number of Partitions and Buckets and the amount of data:

1. The total number of Tablets in a table is equal to (Partition num * Bucket num).

2. The number of tablets in a table, without considering expansion, is recommended to be slightly more than the number of disks in the entire cluster.

3. There is no theoretical upper or lower bound on the data volume of a single tablet, but it is recommended to be in the range of 1G - 10G. If the data volume of a single tablet is too small, the aggregation effect of the data will be poor and the metadata management pressure will be high. If the amount of data is too large, it will not be conducive to copy migration and completion, and will increase the cost of retrying failed Schema Change or Rollup operations (the granularity of failed retries for these operations is Tablet). Bucketing should control the amount of data in the bucket and prevent it from being too large or too small.

4. When the data volume principle and quantity principle of Tablet conflict, it is recommended to give priority to the data volume principle.

5. When creating a table, the number of Buckets for each partition is specified uniformly. However, when adding partitions dynamically (ADD PARTITION), the number of Buckets for the new partition can be specified individually (see the sketch after this list). You can use this feature to conveniently handle data shrinkage or growth.

6. Once the number of Buckets for a Partition is specified, it cannot be changed. Therefore, when determining the number of Buckets, cluster expansion needs to be considered in advance. For example, there are currently only 3 hosts, and each host has 1 disk. If the number of Buckets is only set to 3 or less, even if you add more machines later, the concurrency cannot be improved.
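As a sketch of point 5 above (the partition name, date range and bucket count are illustrative), a new partition can be added with its own bucket count on the range-partitioned table from earlier, while the existing partitions keep their original bucket count:

ALTER TABLE test.expamle_range_tbl
ADD PARTITION p201706 VALUES LESS THAN ("2017-07-01")
DISTRIBUTED BY HASH(`user_id`) BUCKETS 8;

Note that only the number of buckets can differ per partition; the bucketing columns themselves must stay the same as in the table definition.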

Example

Assume there are 10 BEs, each with one disk.

If a table's total size is 500 MB, consider 4-8 tablets.

5 GB: 8-16 tablets.

50 GB: 32 tablets.

500 GB: partition the table, with each partition around 50 GB and 16-32 tablets per partition.

5 TB: partition the table, with each partition around 500 GB and 16-32 tablets per partition.

Note: the data volume of a table can be checked with the SHOW DATA command; divide the result by the number of replicas to get the table's data volume.
5. Choice of composite partition and single partition

Composite partition

  • The first level is called Partition. Users can specify a certain dimension column as a partition column (currently only integer and time type columns are supported), and specify the value range of each partition.

  • The second level is called Distribution, which means bucketing. Users can specify one or more dimension columns and the number of buckets to perform HASH distribution on the data.

It is recommended to use composite partitions in the following scenarios:

  • If there is a time dimension or a similar dimension with ordered values (for example region or time), such dimension columns can be used as partition columns. Partition granularity can be chosen based on import frequency, the data volume per partition, and so on.

  • Historical data deletion requirements: If there is a need to delete historical data (for example, only the last N days of data will be retained). Using composite partitions, you can achieve your goal by deleting historical partitions. Data can also be deleted by sending a DELETE statement within a specified partition.

  • Solve the problem of data skew: Each partition can independently specify the number of buckets. For example, if partitioning is by day, when the amount of data varies greatly from day to day, you can reasonably divide the data in different partitions by specifying the number of buckets for the partition. It is recommended to select a column with a high degree of differentiation for the bucket column.

Users can also use single partitions instead of composite partitions. Then the data is only HASH distributed.
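For comparison, a minimal sketch of a single-partition table (no PARTITION BY clause, so the system creates one hidden full-range partition with the same name as the table, and the data is only hash-distributed into buckets); the table name and columns here are illustrative:

CREATE TABLE IF NOT EXISTS test.site_visit_single
(
    `siteid` INT,
    `username` VARCHAR(32),
    `pv` BIGINT SUM DEFAULT "0"
)
ENGINE=olap
AGGREGATE KEY(`siteid`, `username`)
DISTRIBUTED BY HASH(`siteid`) BUCKETS 10
PROPERTIES("replication_num" = "1");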

Practice

6. Create a table and specify the number of partitions and buckets
Requirement: the following data needs to be inserted into a table. Create a table that is partitioned by month and uses 2 buckets.
(Which column(s) should be used as the bucketing column?)
-- data
uid      name    age     gender      province       birthday
  1       zss    18       male       jiangsu        2022-11-01
  2       lss    18       male       zhejiang       2022-11-10
  3       ww     18       male       jiangsu        2022-12-01
  4       zll    18       female     zhejiang       2022-09-11
  5       tqq    18       female     jiangsu        2022-09-02
  6       aa     18       female     jiangsu        2022-10-11
  7       bb     18       male       zhejiang       2022-11-08
  
  
  
CREATE TABLE IF NOT EXISTS test.user_info
(
uid  int,
name varchar(50),
age  int,
gender  varchar(20),
province  varchar(100),
birthday date
)
ENGINE=olap
duplicate KEY(uid,name)
PARTITION BY range(`birthday`)
(
partition `p202209`  values less than ('2022-10-01'),
partition `p202210`  values less than ('2022-11-01'),
partition `p202211`  values less than ('2022-12-01'),
partition `p202212`  values less than ('2023-01-01')
)
-- bucketing syntax (the requirement asks for 2 buckets)
DISTRIBUTED BY HASH(uid) BUCKETS 2
PROPERTIES(
  "replication_num" = "2"
);
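The sample data above can then be loaded with a plain INSERT (assuming the table was created as shown):

insert into test.user_info values
(1,'zss',18,'male','jiangsu','2022-11-01'),
(2,'lss',18,'male','zhejiang','2022-11-10'),
(3,'ww',18,'male','jiangsu','2022-12-01'),
(4,'zll',18,'female','zhejiang','2022-09-11'),
(5,'tqq',18,'female','jiangsu','2022-09-02'),
(6,'aa',18,'female','jiangsu','2022-10-11'),
(7,'bb',18,'male','zhejiang','2022-11-08');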
7、PROPERTIES

At the end of the table creation statement, you can use the PROPERTIES keyword to set some table property parameters (there are many parameters)

Number of tablet replicas

replication_num

The number of copies of each tablet. The default is 3, it is recommended to keep the default. In the table creation statement, the number of Tablet copies in all Partitions is specified uniformly. When adding a new partition, you can individually specify the number of copies of the Tablet in the new partition.

The number of replicas can be modified at runtime. It is highly recommended to keep an odd number.

The maximum number of replicas depends on the number of independent IPs in the cluster (note not the number of BEs). The principle of replica distribution in Doris is that replicas of the same Tablet are not allowed to be distributed on the same physical machine, and the physical machine is identified through IP. Therefore, even if 3 or more BE instances are deployed on the same physical machine, if the IPs of these BEs are the same, the number of replicas can still only be set to 1. For some dimension tables that are small and updated infrequently, you can consider setting more replicas. In this way, during Join query, there is a greater probability of performing local data Join.
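As a hedged sketch of changing the replica count at runtime (partition names and values here are illustrative; the cluster must have enough BEs with distinct IPs to hold the requested count):

-- change the replica count of an existing partition
ALTER TABLE test.expamle_range_tbl MODIFY PARTITION p201701 SET ("replication_num" = "1");

-- specify a different replica count when adding a new partition
ALTER TABLE test.expamle_range_tbl ADD PARTITION p201708 VALUES LESS THAN ("2017-09-01") ("replication_num" = "1");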

Storage media and hot data cooling time

storage_medium

storage_cooldown_time datetime

When creating a table, you can uniformly specify the initial storage media of all Partitions and the cooling time of hot data, such as:

"storage_medium" = "SSD"
"storage_cooldown_time" = "2022-11-30 00:00:00"

The default initial storage medium can be specified by default_storage_medium=xxx in fe's configuration file fe.conf. If not specified, the default is HDD. If SSD is specified, the data is initially stored on the SSD. If storage_cooldown_time is not set, data will be automatically migrated from SSD to HDD after 30 days by default. If storage_cooldown_time is specified, data will not be migrated until storage_cooldown_time is reached.

Note that when storage_medium is specified, this parameter is only a "best effort" setting if the FE parameter enable_strict_storage_medium_check is False. Even if there is no SSD storage medium set up in the cluster, no error will be reported, but it will be automatically stored in the available data directory. Likewise, if the SSD media is inaccessible or lacks space, data may be initially stored directly on other available media. When the data expires and is migrated to HDD, if the HDD media is inaccessible and there is insufficient space, the migration may fail (but it will continue to try). If the FE parameter enable_strict_storage_medium_check is True, when no SSD storage medium is set up in the cluster, the error Failed to find enough host in all backends with storage medium is SSD will be reported.

3.3. Data table model

Doris' data models are mainly divided into three categories:

  • Aggregate aggregation model
  • Unique unique model
  • Duplicate detailed model
1. Aggregate model

The Aggregate model is a table model that automatically aggregates rows with the same Key. Columns are divided into Key (dimension) columns and Value (metric) columns depending on whether an AggregationType is set: a column without an AggregationType is a Key column, and a column with one is a Value column. When data is imported, rows with the same Key columns are aggregated into a single row, and each Value column is aggregated according to its AggregationType. The following aggregation methods are currently available:

  • SUM: Sum, the Values ​​of multiple rows are accumulated.
  • REPLACE: Replacement, the Value in the next batch of data will replace the Value in the previously imported row.
  • REPLACE_IF_NOT_NULL: Do not update when a null value is encountered.
  • MAX: Keep the maximum value.
  • MIN: Keep the minimum value.

There is the following scenario: a table needs to be created to record the consumption behavior information of each user of the company, with the following fields
Moreover, the company particularly cares about the following aggregated report on this data.

Every time you want to see this report, you have to run an aggregation SQL on the detail table:

select
    user_id, date, city, age, gender,
    max(visit_date) as last_visit_date,
    sum(cost) as cost,
    max(dwell_time) as max_dwell_time,
    min(dwell_time) as min_dwell_time
from t
group by user_id, date, city, age, gender  -- these are the Key columns of the Aggregate model

Aggregation model
SQL example

-- a table recording users' consumption and behavior
CREATE TABLE IF NOT EXISTS test.ex_user
(
 `user_id` LARGEINT NOT NULL COMMENT "用户 id",
 `date` DATE NOT NULL COMMENT "数据灌入日期时间",
 `city` VARCHAR(20) COMMENT "用户所在城市",
 `age` SMALLINT COMMENT "用户年龄",
 `sex` TINYINT COMMENT "用户性别",
 
 `last_visit_date` DATETIME REPLACE  DEFAULT "1970-01-01 00:00:00" COMMENT "用户最后一次访问时间",
 `cost` BIGINT SUM DEFAULT "0" COMMENT "用户总消费",
 `max_dwell_time` INT MAX DEFAULT "0" COMMENT "用户最大停留时间",
 `min_dwell_time` INT MIN DEFAULT "99999" COMMENT "用户最小停留时间" 
 )
ENGINE=olap
AGGREGATE KEY(`user_id`, `date`, `city`, `age`, `sex`)
-- partitioning (omitted here)
-- bucketing
DISTRIBUTED BY HASH(`user_id`) BUCKETS 1;

Insert data

insert into test.ex_user values\
(10000,'2017-10-01','北京',20,0,'2017-10-01 06:00:00',20,10,10),\
(10000,'2017-10-01','北京',20,0,'2017-10-01 07:00:00',15,2,2),\
(10001,'2017-10-01','北京',30,1,'2017-10-01 17:05:45',2,22,22),\
(10002,'2017-10-02','上海',20,1,'2017-10-02 12:59:12',200,5,5),\
(10003,'2017-10-02','广州',32,0,'2017-10-02 11:20:00',30,11,11),\
(10004,'2017-10-01','深圳',35,0,'2017-10-01 10:00:15',100,3,3),\
(10004,'2017-10-03','深圳',35,0,'2017-10-03 10:20:22',11,6,6);

When querying the data, we find that only 6 rows are left. This is because rows with the same key have been aggregated.
In Doris, data aggregation happens in the following three stages:

  • 1. The ETL stage of each batch of data import. This stage performs aggregation within each batch of imported data.
  • 2. The stage when the underlying BE performs data compaction. BE will further aggregate the imported data from different batches.
  • 3. Data query stage. During data query, corresponding aggregation will be performed on the data involved in the query.
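For example, querying the test.ex_user table created above immediately returns the aggregated view of the data, regardless of whether the underlying batches have already been compacted:

-- rows with the same key are also aggregated at query time
select user_id, sum(cost) as total_cost, max(max_dwell_time) as max_dwell
from test.ex_user
group by user_id;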


2. UNIQUE model

The Unique model is a table model that automatically deduplicates rows with the same Key. In some multidimensional analysis scenarios, users care more about guaranteeing the uniqueness of the Key, i.e. obtaining a primary-key uniqueness constraint. Therefore the Uniq data model was introduced. This model is essentially a special case of the Aggregate model and a simplified way of writing the table structure.

drop table if exists test.user;
CREATE TABLE IF NOT EXISTS test.user
(
 `user_id` LARGEINT NOT NULL COMMENT "用户 id",
 `username` VARCHAR(50) NOT NULL COMMENT "用户昵称",
 `city` VARCHAR(20) COMMENT "用户所在城市",
 `age` SMALLINT COMMENT "用户年龄",
 `sex` TINYINT COMMENT "用户性别",
 `phone` LARGEINT COMMENT "用户电话",
 `address` VARCHAR(500) COMMENT "用户地址",
 `register_time` DATETIME COMMENT "用户注册时间" )
UNIQUE KEY(`user_id`, `username`)
DISTRIBUTED BY HASH(`user_id`) BUCKETS 1;

Insert data

insert into test.user values\
(10000,'zss','北京',18,0,12345678910,'北京朝阳区 ','2017-10-01 07:00:00'),\
(10000,'zss','北京',19,0,12345678910,'北京顺义区 ','2018-10-01 07:00:00'),\
(10000,'lss','北京',20,0,12345678910,'北京海淀区','2017-11-15 06:10:20');


Querying the results shows that rows with the same key have been replaced; only the latest row is kept.

The Uniq model can be completely replaced by the REPLACE aggregation of the Aggregate model; its internal implementation and data storage are exactly the same.
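As a sketch of that equivalence, the test.user table above could also be written as an Aggregate-model table in which every non-key column uses the REPLACE aggregation (the table name user_agg is chosen here for illustration):

CREATE TABLE IF NOT EXISTS test.user_agg
(
 `user_id` LARGEINT NOT NULL COMMENT "用户 id",
 `username` VARCHAR(50) NOT NULL COMMENT "用户昵称",
 `city` VARCHAR(20) REPLACE COMMENT "用户所在城市",
 `age` SMALLINT REPLACE COMMENT "用户年龄",
 `sex` TINYINT REPLACE COMMENT "用户性别",
 `phone` LARGEINT REPLACE COMMENT "用户电话",
 `address` VARCHAR(500) REPLACE COMMENT "用户地址",
 `register_time` DATETIME REPLACE COMMENT "用户注册时间" )
AGGREGATE KEY(`user_id`, `username`)
DISTRIBUTED BY HASH(`user_id`) BUCKETS 1;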

3. Duplicate model

The Duplicate model is a table model that stores detail data as-is, performing neither aggregation nor deduplication. In some multidimensional analysis scenarios the data has neither a primary key nor aggregation requirements, and the Duplicate data model meets these needs. The data is stored exactly as it appears in the imported file, without any aggregation; even two identical rows are both retained. The DUPLICATE KEY specified in the table creation statement is only used to indicate which columns the underlying data is sorted by.

Table creation statement:

CREATE TABLE IF NOT EXISTS test.log_detail
(
 `timestamp` DATETIME NOT NULL COMMENT "日志时间",
 `type` INT NOT NULL COMMENT "日志类型",
 `error_code` INT COMMENT "错误码",
 `error_msg` VARCHAR(1024) COMMENT "错误详细信息",
 `op_id` BIGINT COMMENT "负责人 id",
 `op_time` DATETIME COMMENT "处理时间" )
DUPLICATE KEY(`timestamp`, `type`)
DISTRIBUTED BY HASH(`timestamp`) BUCKETS 1;

Insert some data

insert into test.log_detail values\
('2017-10-01 08:00:05',1,404,'not found page', 101, '2017-10-01 08:00:05'),\
('2017-10-01 08:00:05',1,404,'not found page', 101, '2017-10-01 08:00:05'),\
('2017-10-01 08:00:05',2,404,'not found page', 101, '2017-10-01 08:00:06'),\
('2017-10-01 08:00:06',2,404,'not found page', 101, '2017-10-01 08:00:07');

Querying the results shows that all inserted rows are retained, even the two rows that are exactly identical. This works well for data such as user behavior logs.


4. Selection of data model

The data model is determined when the table is created and cannot be modified; therefore, it is very important to choose an appropriate data model.

  • The Aggregate model can greatly reduce the amount of data scanned and the amount of calculation required for aggregation queries through pre-aggregation, making it very suitable for report query scenarios with fixed patterns.

  • The Uniq model can guarantee primary key uniqueness constraints for scenarios that require unique primary key constraints. However, the query advantages brought by pre-aggregation such as ROLLUP cannot be used (because the essence is REPLACE and there is no aggregation method such as SUM).

  • Duplicate is suitable for queries of any dimension. Although it is also impossible to take advantage of the pre-aggregation feature, it is not constrained by the aggregation model and can take advantage of the column storage model (only read relevant columns, without reading all Key columns)

3.4. Index

Indexes are used to help filter or find data quickly.

Currently, Doris mainly supports two types of indexes:

  • Built-in smart index: including prefix index and ZoneMap index.

  • User-created secondary indexes: including Bloom Filter index and Bitmap inverted index.

The ZoneMap index is an index information automatically maintained for each column in column storage format, including Min/Max, the number of Null values, etc. This indexing is transparent to the user.

1. Prefix index

In Doris, prefix indexes have the following constraints:

1. The maximum length of the index key is 36 bytes.

2. When a VARCHAR column is encountered, the prefix index stops at that column (the VARCHAR value itself may be truncated), even if 36 bytes have not been used.

Example 1: In the following table we define: user_id, age, message as the key of the table;

Then, when Doris creates a prefix index for this table, the generated index key is:

user_id(8 Bytes) + age(4 Bytes) + message(prefix 24 Bytes)

Example 2: In the following table we define: age, user_name, message as the key of the table

age(4 Bytes) + user_name(20 Bytes)   -- with the key specified in this order

Why is this result?

Although 36 bytes have not been used up, a VARCHAR field has already been encountered, so the index is truncated there and no further columns are included.

When our query condition is the prefix of the prefix index, the query speed can be greatly accelerated. For example, in the first example, we execute the following query:

SELECT * FROM table WHERE user_id=1829239 and age=20

The efficiency of this query will be much higher than the following query:

SELECT * FROM table WHERE age=20

Therefore, when creating a table, correctly selecting the column order can greatly improve query efficiency.

2. Bloom Filter Index

Summary:

1. Bloom Filter is essentially a bitmap structure used to determine whether a value exists.

2. There will be a small probability of misjudgment due to the inherent collision of the hash algorithm.

3. In Doris, Bloom Filter indexes are created at tablet granularity: each tablet maintains its own Bloom Filter index.

3. How to create a BloomFilter index?

1. Specify when creating the table

PROPERTIES (
"bloom_filter_columns"="name,age"
)

CREATE TABLE IF NOT EXISTS sale_detail_bloom  (
    sale_date date NOT NULL COMMENT "销售时间",
    customer_id int NOT NULL COMMENT "客户编号",
    saler_id int NOT NULL COMMENT "销售员",
    sku_id int NOT NULL COMMENT "商品编号",
    category_id int NOT NULL COMMENT "商品分类",
    sale_count int NOT NULL COMMENT "销售数量",
    sale_price DECIMAL(12,2) NOT NULL COMMENT "单价",
    sale_amt DECIMAL(20,2)  COMMENT "销售总金额"
)
Duplicate  KEY(sale_date, customer_id,saler_id,sku_id,category_id)
PARTITION BY RANGE(sale_date)
(
PARTITION P_202111 VALUES [('2021-11-01'), ('2021-12-01'))
)
DISTRIBUTED BY HASH(saler_id) BUCKETS 1
PROPERTIES (
"replication_num" = "1",
"bloom_filter_columns"="saler_id,category_id",
"storage_medium" = "SSD"
);

2. Specify when alter modifies the table

ALTER TABLE sale_detail_bloom SET ("bloom_filter_columns" = "saler_id,category_id");

ALTER TABLE sale_detail_bloom SET ("bloom_filter_columns" = "saler_id,category_id,customer_id");

3. View the BloomFilter index

mysql> SHOW CREATE TABLE sale_detail_bloom \G;
*************************** 1. row ***************************
       Table: sale_detail_bloom
Create Table: CREATE TABLE `sale_detail_bloom` (
  `sale_date` date NOT NULL COMMENT "销售时间",
  `customer_id` int(11) NOT NULL COMMENT "客户编号",
  `saler_id` int(11) NOT NULL COMMENT "销售员",
  `sku_id` int(11) NOT NULL COMMENT "商品编号",
  `category_id` int(11) NOT NULL COMMENT "商品分类",
  `sale_count` int(11) NOT NULL COMMENT "销售数量",
  `sale_price` decimal(12, 2) NOT NULL COMMENT "单价",
  `sale_amt` decimal(20, 2) NULL COMMENT "销售总金额"
) ENGINE=OLAP
DUPLICATE KEY(`sale_date`, `customer_id`, `saler_id`, `sku_id`, `category_id`)
COMMENT "OLAP"
PARTITION BY RANGE(`sale_date`)
(PARTITION P_202111 VALUES [('2021-11-01'), ('2021-12-01')),
PARTITION P_202208 VALUES [('2022-08-01'), ('2022-09-01')),
PARTITION P_202209 VALUES [('2022-09-01'), ('2022-10-01')),
PARTITION P_202210 VALUES [('2022-10-01'), ('2022-11-01')))
DISTRIBUTED BY HASH(`saler_id`) BUCKETS 1
PROPERTIES (
"replication_allocation" = "tag.location.default: 3",
"bloom_filter_columns" = "category_id, saler_id"
)
1 row in set (0.00 sec)
4. Modify/delete BloomFilter index
ALTER TABLE sale_detail_bloom SET ("bloom_filter_columns" = "");
5. Doris BloomFilter applicable scenarios

You can consider establishing a Bloom Filter index on a column when the following conditions are met:

1. BloomFilter is used to speed up queries in query scenarios where prefix indexes cannot be used.

2. The column is frequently used in filters, and most of the filter conditions are in and = predicates.

3. Unlike the Bitmap index, BloomFilter is suitable for high-cardinality columns such as UserID. If it is created on a low-cardinality column, such as a "gender" column, almost every block contains all possible values and the BloomFilter index loses its meaning.

Precautions for using Doris BloomFilter

1. Building Bloom Filter indexes on columns of Tinyint, Float, and Double types is not supported.

2. Bloom Filter index only accelerates in and = filter queries.

3. There is no way (for example via EXPLAIN) to check whether a BloomFilter index was hit.

6. Bitmap index

1. Create an index

CREATE INDEX [IF NOT EXISTS] index_name ON table1 (siteid) USING BITMAP COMMENT 'balabala';
create index user_id_bitmap on sale_detail_bloom(sku_id) USING BITMAP COMMENT '使用user_id创建的bitmap索引';

2. View the index

SHOW INDEX FROM example_db.table_name;

3. Delete index

DROP INDEX [IF EXISTS] index_name ON [db_name.]table_name;

Precautions

1. Bitmap indexes are only created on a single column.

2. Bitmap index can be applied to all columns of Duplicate and Uniq data models and key columns of Aggregate model.

3. The data types supported by bitmap index are as follows: (old version only supports bitmap type)
TINYINT, SMALLINT, INT, BIGINT, CHAR, VARCHAR, DATE, DATETIME, LARGEINT, DECIMAL, BOOL

4. The bitmap index only takes effect under Segment V2 (Segment V2 is an upgraded version of the file format). When creating an index, the storage format of the table will be converted to V2 format by default

Practice

-- data
uid      name    age     gender      province         term
  1       zss    18       male       jiangsu            1
  2       lss    16       male       zhejiang           2
  3       ww     19       male       jiangsu            1
  4       zll    18       female     zhejiang           3
  5       tqq    17       female     jiangsu            2
  6       aa     18       female     jiangsu            2
  7       bb     17       male       zhejiang           3

Requirements:
This table will frequently be queried with the following conditions:
-- prefix index: make term and province the key columns
where term = ??
where term = ??  and  province = ??
-- Bloom Filter index
where name = ?? 
-- bitmap index
where uid = ??    

SET GLOBAL enable_profile=true;

-- the main question is how to build the indexes
create table stu(
term int,
province varchar(100),
uid int,
name varchar(30),
age int,
gender varchar(30)
)
engine = olap 
duplicate key(term ,province)
distributed by hash(uid) buckets 2
properties(
"bloom_filter_columns"="name"
);
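The prefix index comes from the duplicate key (term, province), and the Bloom Filter index on name comes from the table PROPERTIES; the bitmap index on uid is created separately once the table exists (the index name uid_bitmap is chosen here for illustration):

-- bitmap index for the "where uid = ??" point queries
create index uid_bitmap on stu(uid) using bitmap comment 'bitmap index on uid';

-- check the indexes of the table
show index from stu;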

3.5、Rollup

ROLLUP means "rolling up" in multidimensional analysis, which means further aggregating data to a specified granularity.

Previous aggregation model:

1. Find the total daily sales of each user in each city
select 
user_id, city, date, sum(sum_cost) as sum_cost
from t
group by user_id, city, date
-- user_id      date             city      sum_cost
   10000        2017/10/2        北京        195
   10000        2017/10/1        上海        100
   10000        2017/10/2        上海        30 
   10000        2017/10/3        上海        55 
   10000        2017/10/4        上海        65 
   10001        2017/10/1        上海        30
   10001        2017/10/2        上海        10        
   10001        2017/10/2        天津        18         
   10001        2017/10/1        天津        46
   10002        2017/10/1        天津        55
   10002        2017/10/3        北京        55 
   10002        2017/10/2        天津        20        
   10002        2017/10/2        北京        35        
2. Find the total consumption of each user and each city
select 
user_id,city,
sum(sum_cost) as sum_cost
from t
group by user_id,city
user_id      city       sum_cost
10000        北京        195
10000        上海        100
10001        上海        40
10001        天津        64
10002        天津        75
10002        北京        90

3. Find the total consumption of each user
select 
user_id,
sum(sum_cost) as sum_cost
from t
group by user_id
user_id        sum_cost
10000            295
10001            104
10002            165


4. Basic concepts

The table created by the CREATE TABLE statement is called the Base table.

On top of the Base table, we can create as many ROLLUP tables as we want. The ROLLUP data is generated from the Base table and is stored physically and independently.

Benefits of Rollup tables:

1. A rollup shares the table name with the base table; Doris selects the appropriate data source (base table or rollup) for each query according to its logic.

2. When data in the base table is added, deleted or modified, the rollup table is automatically updated and kept in sync.
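
Rollups themselves are created with an ALTER TABLE statement; a minimal sketch against the aggregate table t from the example above (table and column names are assumed from that example), together with how to check the build status:

-- roll data up to the (user_id, city) granularity
ALTER TABLE t ADD ROLLUP rollup_user_city(user_id, city, sum_cost);

-- rollup creation is asynchronous; check its progress
SHOW ALTER TABLE ROLLUP;

-- view the base table and all of its rollups
DESC t ALL;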

3.6. Materialized view

A materialized view is a special table that stores precomputed query results. Materialized views exist so that users can both analyze the original detail data along any dimension and quickly run analytic queries on fixed dimensions.

1. Benefits of materialized views

1. Precomputed results can be reused to improve query efficiency.

2. The result data in the materialized view is maintained automatically and in real time, with no extra manual effort (automatic maintenance does consume computing resources).

3. When querying, the optimal materialized view is selected automatically.

2. Materialized View VS Rollup

1. The difference between rollup and materialized view under the detail (duplicate) model:

  • Materialized view: can implement pre-aggregation, and also adds a new set of prefix indexes.
  • Rollup: for the detail model, only adds a new set of prefix indexes.

2. Under the aggregate model, the two are functionally equivalent.

3. Create materialized views
CREATE MATERIALIZED VIEW [MV name] as 
[query]  -- SQL logic

--[MV name]: the name of the materialized view
--[query]: the query that defines the materialized view over the base table

After the materialized view is created successfully, user queries do not need to change at all; they still query the base table. Doris automatically selects an optimal materialized view based on the current query statement, reads data from that materialized view, and computes on it.

Users can use the EXPLAIN command to check whether the current query uses materialized views.

4. Example

Create a Base table:

The user has a detailed sales record table, which stores the transaction ID, salesperson, sales store, sales time, and amount of each transaction.

create table sales_records(
record_id int, 
seller_id int, 
store_id int, 
sale_date date, 
sale_amt bigint) 
duplicate key (record_id,seller_id,store_id,sale_date)
distributed by hash(record_id) buckets 2
properties("replication_num" = "1");

-- insert data
insert into sales_records values
(1,1,1,'2022-02-02',100),
(2,2,1,'2022-02-02',200),
(3,3,2,'2022-02-02',300),
(4,3,2,'2022-02-02',200),
(5,2,1,'2022-02-02',100),
(6,4,2,'2022-02-02',200),
(7,7,3,'2022-02-02',300),
(8,2,1,'2022-02-02',400),
(9,9,4,'2022-02-02',100);

Suppose the user frequently needs to count the total sales amount of each store.

1. Create a materialized view

-- scenario: total sales amount per store
select store_id, sum(sale_amt)  
from sales_records  
group by store_id; 

-- create a materialized view for this scenario
create materialized view store_amt as  
select store_id, sum(sale_amt) as sum_amount 
from sales_records  
group by store_id; 

2. Check whether the materialized view is constructed (the creation of the materialized view is an asynchronous process)

show alter table materialized view from db_name order by CreateTime desc limit 1;

show alter table materialized view from test order by CreateTime desc limit 1;


+-------+---------------+---------------------+---------------------+---------------+-----------------+----------+---------------+----------+------+----------+---------+
| JobId | TableName     | CreateTime          | FinishTime          | BaseIndexName | RollupIndexName | RollupId | TransactionId | State    | Msg  | Progress | Timeout |
+-------+---------------+---------------------+---------------------+---------------+-----------------+----------+---------------+----------+------+----------+---------+
| 15093 | sales_records | 2022-11-25 10:32:33 | 2022-11-25 10:32:59 | sales_records | store_amt       | 15094    | 3008          | FINISHED |      | NULL     | 86400   |
+-------+---------------+---------------------+---------------------+---------------+-----------------+----------+---------------+----------+------+----------+---------+


View all materialized views of the Base table:
desc sales_records all;
+---------------+---------------+-----------+--------+------+-------+---------+-------+---------+
| IndexName     | IndexKeysType | Field     | Type   | Null | Key   | Default | Extra | Visible |
+---------------+---------------+-----------+--------+------+-------+---------+-------+---------+
| sales_records | DUP_KEYS      | record_id | INT    | Yes  | true  | NULL    |       | true    |
|               |               | seller_id | INT    | Yes  | true  | NULL    |       | true    |
|               |               | store_id  | INT    | Yes  | true  | NULL    |       | true    |
|               |               | sale_date | DATE   | Yes  | true  | NULL    |       | true    |
|               |               | sale_amt  | BIGINT | Yes  | false | NULL    | NONE  | true    |
|               |               |           |        |      |       |         |       |         |
| store_amt     | AGG_KEYS      | store_id  | INT    | Yes  | true  | NULL    |       | true    |
|               |               | sale_amt  | BIGINT | Yes  | false | NULL    | SUM   | true    |
+---------------+---------------+-----------+--------+------+-------+---------+-------+---------+

3. Query

See if it hits the materialized view we just created

EXPLAIN SELECT store_id, sum(sale_amt) FROM sales_records GROUP BY store_id;

+------------------------------------------------------------------------------------+
| Explain String                                                                     |
+------------------------------------------------------------------------------------+
| PLAN FRAGMENT 0                                                                    |
|   OUTPUT EXPRS:<slot 2> `store_id` | <slot 3> sum(`sale_amt`)                      |
|   PARTITION: UNPARTITIONED                                                         |
|                                                                                    |
|   VRESULT SINK                                                                     |
|                                                                                    |
|   4:VEXCHANGE                                                                      |
|                                                                                    |
| PLAN FRAGMENT 1                                                                    |
|                                                                                    |
|   PARTITION: HASH_PARTITIONED: <slot 2> `store_id`                                 |
|                                                                                    |
|   STREAM DATA SINK                                                                 |
|     EXCHANGE ID: 04                                                                |
|     UNPARTITIONED                                                                  |
|                                                                                    |
|   3:VAGGREGATE (merge finalize)                                                    |
|   |  output: sum(<slot 3> sum(`sale_amt`))                                         |
|   |  group by: <slot 2> `store_id`                                                 |
|   |  cardinality=-1                                                                |
|   |                                                                                |
|   2:VEXCHANGE                                                                      |
|                                                                                    |
| PLAN FRAGMENT 2                                                                    |
|                                                                                    |
|   PARTITION: HASH_PARTITIONED: `default_cluster:study`.`sales_records`.`record_id` |
|                                                                                    |
|   STREAM DATA SINK                                                                 |
|     EXCHANGE ID: 02                                                                |
|     HASH_PARTITIONED: <slot 2> `store_id`                                          |
|                                                                                    |
|   1:VAGGREGATE (update serialize)                                                  |
|   |  STREAMING                                                                     |
|   |  output: sum(`sale_amt`)                                                       |
|   |  group by: `store_id`                                                          |
|   |  cardinality=-1                                                                |
|   |                                                                                |
|   0:VOlapScanNode                                                                  |
|      TABLE: sales_records(store_amt), PREAGGREGATION: ON                           |
|      partitions=1/1, tablets=10/10, tabletList=15095,15097,15099 ...               |
|      cardinality=7, avgRowSize=1560.0, numNodes=3                                  |
+------------------------------------------------------------------------------------+
5. Drop a materialized view
-- syntax:
DROP MATERIALIZED VIEW mv_name on base_table_name; 

-- example:
drop materialized view store_amt on sales_records;
6. Calculate the pv and uv of advertisements

Users have a detail table of ad clicks.

Requirement: based on the ad-click detail table, compute pv and uv for each day, each ad page and each channel.

pv: page views, i.e. the number of page views/clicks

uv: unique visitors, i.e. the number of distinct users who viewed the page

drop table if exists ad_view_record;
create table ad_view_record( 
dt date,  
ad_page varchar(10),  
channel varchar(10), 
refer_page varchar(10), 
user_id int 
)  
distributed by hash(dt)  
properties("replication_num" = "1");



select 
dt,ad_page,channel,
count(refer_page) as pv,  
count(distinct user_id ) as uv
from ad_view_record
group by dt,ad_page,channel

Insert data

insert into ad_view_record values
('2020-02-02','a','app','/home',1),
('2020-02-02','a','web','/home',1),
('2020-02-02','a','app','/addbag',2),
('2020-02-02','b','app','/home',1),
('2020-02-02','b','web','/home',1),
('2020-02-02','b','app','/addbag',2),
('2020-02-02','b','app','/home',3),
('2020-02-02','b','web','/home',3),
('2020-02-02','c','app','/order',1),
('2020-02-02','c','app','/home',1),
('2020-02-03','c','web','/home',1),
('2020-02-03','c','app','/order',4),
('2020-02-03','c','app','/home',5),
('2020-02-03','c','web','/home',6),
('2020-02-03','d','app','/addbag',2),
('2020-02-03','d','app','/home',2),
('2020-02-03','d','web','/home',3),
('2020-02-03','d','app','/addbag',4),
('2020-02-03','d','app','/home',5),
('2020-02-03','d','web','/addbag',6),
('2020-02-03','d','app','/home',5),
('2020-02-03','d','web','/home',4);

Create a materialized view

-- how to compute pv and uv
select
dt,ad_page,channel,
count(ad_page) as pv,
count(distinct user_id) as uv
from ad_view_record 
group by dt,ad_page,channel;

-- 1. In a materialized view, the same column cannot be used twice
-- 2. Incremental aggregation cannot use count(distinct); use bitmap_union instead
-- 3. count(column)

create materialized view dpc_pv_uv as 
select
dt,ad_page,channel,
-- refer_page has no NULL values (count(column) skips NULLs)
count(refer_page) as pv,
-- alternatives: count(user_id) as pv; count(1) is not allowed in a materialized view
-- Doris materialized views do not support count(distinct); use bitmap_union instead
-- count(distinct user_id) as uv
bitmap_union(to_bitmap(user_id)) as uv_bitmap
from ad_view_record 
group by dt,ad_page,channel;
-- conclusion: in a Doris materialized view, a column cannot be used twice,
-- and an aggregate function must be followed by a column name

In Doris, the result of count(distinct) aggregation is exactly the same as the result of bitmap_union_count aggregation (bitmap_union_count is the count of the bitmap produced by bitmap_union). So if a query involves count(distinct), it can be accelerated by creating a materialized view with bitmap_union aggregation. Because user_id is an INT, it must first be converted to the bitmap type with the function to_bitmap before bitmap_union aggregation can be performed.
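
As a quick sanity check of this equivalence, the two queries below against the ad_view_record table above should return identical uv values; this is just a verification sketch:

-- uv via count(distinct)
select dt, count(distinct user_id) as uv from ad_view_record group by dt;

-- uv via bitmap_union_count, the form the materialized view can serve
select dt, bitmap_union_count(to_bitmap(user_id)) as uv from ad_view_record group by dt;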

Query automatic matching

explain 
select
dt,ad_page,channel,
count(refer_page) as pv,
count(distinct user_id) as uv
from ad_view_record
group by dt,ad_page,channel;

will be automatically rewritten by Doris to:

explain 
select
dt,ad_page,channel,
count(1) as pv,
bitmap_union_count(to_bitmap(user_id)) as uv
from ad_view_record
group by dt,ad_page,channel;

Summary:

1. In a Doris materialized view, the same column cannot be used twice, and an aggregate function must be followed by a column name (aggregations such as count(1) cannot be used).

2. When choosing which materialized view to use, Doris follows the dimension roll-up principle and selects the materialized view that is closest to the query dimensions and whose metrics can be reused.

3. One base table can have multiple materialized views (at the cost of more computing resources).

7. Adjust the prefix index

Scenario: The user's original table has three columns (k1, k2, k3), of which k1 and k2 are the prefix index columns. If the query condition is where k1=1 and k2=2, the query can be accelerated by the prefix index.

However, in some cases the filter condition cannot match the prefix index, for example where k3=3; such a query cannot be accelerated by the index.

Solution:

Creating a materialized view with k3 as the first column solves this problem.
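
A minimal sketch of that idea, assuming a duplicate-model table tbl with columns k1, k2, k3 (names taken from the scenario above):

-- materialized view whose prefix index starts with k3
create materialized view mv_k3_first as
select k3, k1, k2
from tbl;

-- queries such as "where k3 = 3" can then be served by mv_k3_first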

View the current table structure

desc sales_records all;
+---------------+---------------+-----------+--------+------+-------+---------+-------+---------+
| IndexName     | IndexKeysType | Field     | Type   | Null | Key   | Default | Extra | Visible |
+---------------+---------------+-----------+--------+------+-------+---------+-------+---------+
| sales_records | DUP_KEYS      | record_id | INT    | Yes  | true  | NULL    |       | true    |
|               |               | seller_id | INT    | Yes  | true  | NULL    |       | true    |
|               |               | store_id  | INT    | Yes  | true  | NULL    |       | true    |
|               |               | sale_date | DATE   | Yes  | true  | NULL    |       | true    |
|               |               | sale_amt  | BIGINT | Yes  | false | NULL    | NONE  | true    |
+---------------+---------------+-----------+--------+------+-------+---------+-------+---------+
5 rows in set (0.00 sec)


-- with the prefix index above, the following SQL cannot make use of it
explain 
select record_id,seller_id,store_id from sales_records  
where store_id=3;

Create a materialized view

create materialized view sto_rec_sell as  
select  
 store_id, 
 record_id, 
 seller_id, 
 sale_date, 
 sale_amt 
from sales_records;

After it is created with the syntax above, the materialized view retains the complete detail data, and its prefix index starts with the store_id column.

View table structure

desc sales_records all; 
+---------------+---------------+-----------+--------+------+-------+---------+-------+---------+
| IndexName     | IndexKeysType | Field     | Type   | Null | Key   | Default | Extra | Visible |
+---------------+---------------+-----------+--------+------+-------+---------+-------+---------+
| sales_records | DUP_KEYS      | record_id | INT    | Yes  | true  | NULL    |       | true    |
|               |               | seller_id | INT    | Yes  | true  | NULL    |       | true    |
|               |               | store_id  | INT    | Yes  | true  | NULL    |       | true    |
|               |               | sale_date | DATE   | Yes  | true  | NULL    |       | true    |
|               |               | sale_amt  | BIGINT | Yes  | false | NULL    | NONE  | true    |
|               |               |           |        |      |       |         |       |         |
| sto_rec_sell  | DUP_KEYS      | store_id  | INT    | Yes  | true  | NULL    |       | true    |
|               |               | record_id | INT    | Yes  | true  | NULL    |       | true    |
|               |               | seller_id | INT    | Yes  | true  | NULL    |       | true    |
|               |               | sale_date | DATE   | Yes  | false | NULL    | NONE  | true    |
|               |               | sale_amt  | BIGINT | Yes  | false | NULL    | NONE  | true    |
+---------------+---------------+-----------+--------+------+-------+---------+-------+---------+

Query matching

explain select record_id,seller_id,store_id from sales_records where store_id=3; 
+------------------------------------------------------------------------------------+
| Explain String                                                                     |
+------------------------------------------------------------------------------------+
| PLAN FRAGMENT 0                                                                    |
|   OUTPUT EXPRS:`record_id` | `seller_id` | `store_id`                              |
|   PARTITION: UNPARTITIONED                                                         |
|                                                                                    |
|   VRESULT SINK                                                                     |
|                                                                                    |
|   1:VEXCHANGE                                                                      |
|                                                                                    |
| PLAN FRAGMENT 1                                                                    |
|                                                                                    |
|   PARTITION: HASH_PARTITIONED: `default_cluster:study`.`sales_records`.`record_id` |
|                                                                                    |
|   STREAM DATA SINK                                                                 |
|     EXCHANGE ID: 01                                                                |
|     UNPARTITIONED                                                                  |
|                                                                                    |
|   0:VOlapScanNode                                                                  |
|      TABLE: sales_records(sto_rec_sell), PREAGGREGATION: ON                        |
|      PREDICATES: `store_id` = 3                                                    |
|      partitions=1/1, tablets=10/10, tabletList=15300,15302,15304 ...               |
|      cardinality=0, avgRowSize=12.0, numNodes=1                                    |
+------------------------------------------------------------------------------------+

At this point, the query reads data directly from the sto_rec_sell materialized view just created. The materialized view has a prefix index on store_id, so query efficiency is improved.

4. Data import and export

Divided according to usage scenarios

Insert image description here

4.1. Use Insert method to synchronize data

Users can import data using the INSERT statement over the MySQL protocol.

The usage of the INSERT statement is similar to that in databases such as MySQL. It supports the following two syntaxes:

* INSERT INTO table SELECT ...
* INSERT INTO table VALUES(...)

For Doris, an INSERT command is a complete import transaction.

Therefore, whether you are importing one row or many rows, we do not recommend this method for importing data in a production environment: high-frequency INSERT operations produce a large number of small files in the storage layer, which seriously affects system performance.

This method is only used for simple offline testing or low-frequency and small-scale operations.

Alternatively, multiple rows can be inserted in a single statement:

INSERT INTO example_tbl VALUES
(1000, "baidu1", 3.25),
(2000, "baidu2", 4.25),
(3000, "baidu3", 5.25);
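
The INSERT INTO ... SELECT form is typically used to move data between Doris tables (or from external tables, see section 4.4); a sketch assuming a target table example_tbl_copy with the same schema as example_tbl:

-- copy one table into another in a single import transaction
INSERT INTO example_tbl_copy
SELECT * FROM example_tbl;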

4.2. Import local data

Stream Load is used to import local files into doris. Stream Load connects and interacts with Doris through the HTTP protocol.

HOST:PORT involved in this method are all corresponding HTTP protocol ports.

  • BE's HTTP protocol port, the default is 8040.
  • The HTTP protocol port of FE, the default is 8030.

However, it must be ensured that the network of the machine where the client is located can connect to the machines where FE and BE are located.

Basic principle

                         |      |
                         |      | 1. User submits load to FE
                         |      |
                         |   +--v-----------+
                         |   | FE           |  generate the load plan
4. Return result to user |   +--+-----------+
                         |      |
                         |      | 2. Redirect to BE (dispatch to the BE nodes)
                         |      |
                         |   +--v-----------+
                         +---+Coordinator BE| 1B. User submits load to BE
                             +-+-----+----+-+
                               |     |    |
                         +-----+     |    +-----+
                         |           |          | 3. Distribute data and import
                         |           |          |
                       +-v-+       +-v-+      +-v-+
                       |BE |       |BE |      |BE |
                       +---+       +---+      +---+

1. Create a table

drop table if exists load_local_file_test;
CREATE TABLE IF NOT EXISTS load_local_file_test
(
    id INT,
    name VARCHAR(50),
    age TINYINT
)
unique key(id)
DISTRIBUTED BY HASH(id) BUCKETS 3;


Sample data (e.g. saved locally as /root/data/loadfile.txt):

1,zss,28
2,lss,28
3,ww,88

2. Import data
Execute the curl command to import a local file (this command is run in a shell, not in the MySQL client):

# syntax example
 curl \
 -u user:passwd \  # username:password
 -H "label:load_local_file_test" \  # unique label for this load job
 -T <local file path> \
 http://<host>:<http_port>/api/<db_name>/<table_name>/_stream_load
 
 
curl \
 -u root:123456 \
 -H "label:load_local_file" \
 -H "column_separator:," \
 -T /root/data/loadfile.txt \
 http://zuomm01:8040/api/test/load_local_file_test/_stream_load
  • user:passwd is the user created in Doris. The initial user is admin/root, and the password is initially empty.

  • host:port is the HTTP protocol port of BE, the default is 8040, which can be viewed on the Doris cluster WEB UI page.

  • label: You can specify a Label in the Header to uniquely identify this import task.

3. Wait for the import result

-- a failed attempt (no column separator specified, so the default \t does not match the comma-separated data)
[root@zuomm01 data]# curl \
>  -u root:123456 \
>  -H "label:load_local_file" \
>  -T /root/data/loadfile.txt \
>  http://zuomm01:8040/api/test/load_local_file_test/_stream_load
{
    "TxnId": 1004,
    "Label": "load_local_file",
    "TwoPhaseCommit": "false",
    "Status": "Fail",
    "Message": "too many filtered rows",
    "NumberTotalRows": 4,
    "NumberLoadedRows": 0,
    "NumberFilteredRows": 4,
    "NumberUnselectedRows": 0,
    "LoadBytes": 36,
    "LoadTimeMs": 82,
    "BeginTxnTimeMs": 13,
    "StreamLoadPutTimeMs": 56,
    "ReadDataTimeMs": 0,
    "WriteDataTimeMs": 9,
    "CommitAndPublishTimeMs": 0,
    "ErrorURL": "http://192.168.17.3:8040/api/_load_error_log?file=__shard_0/error_log_insert_stmt_cf4aa4d10e8d5fc5-458f16b70f0f2e87_cf4aa4d10e8d5fc5_458f16b70f0f2e87"
}


-- a successful attempt (column separator specified)
[root@zuomm01 data]# curl \
>  -u root:123456 \
>  -H "label:load_local_file" \
>  -H "column_separator:," \
>  -T /root/data/loadfile.txt \
>  http://zuomm01:8040/api/test/load_local_file_test/_stream_load
{
    "TxnId": 1005,
    "Label": "load_local_file",
    "TwoPhaseCommit": "false",
    "Status": "Success",
    "Message": "OK",
    "NumberTotalRows": 4,
    "NumberLoadedRows": 4,
    "NumberFilteredRows": 0,
    "NumberUnselectedRows": 0,
    "LoadBytes": 36,
    "LoadTimeMs": 54,
    "BeginTxnTimeMs": 0,
    "StreamLoadPutTimeMs": 2,
    "ReadDataTimeMs": 0,
    "WriteDataTimeMs": 14,
    "CommitAndPublishTimeMs": 36
}

4. Some configurable parameters of curl:

1. label: the label of the import task. Data with the same label cannot be imported repeatedly (a label is retained for 30 minutes by default).

2. column_separator: used to specify the column separator in the imported file, the default is \t.

3. line_delimiter: used to specify the newline character in the imported file, the default is \n.

4. Columns: used to specify the correspondence between the columns in the file and the columns in the table. The default is one-to-one correspondence.

  • Example 1: The table has three columns "c1, c2, c3", and the three columns in the source file correspond to "c3, c2, c1"; then specify -H "columns: c3, c2, c1".

  • Example 2: The table has three columns "c1, c2, c3", and the first three columns of the source file correspond in order, but the file has one extra column; then specify -H "columns: c1, c2, c3, xxx", where the last column can be given any placeholder name.

  • Example 3: The table has three columns "year, month, day", and the source file has only one time column in the format "2018-06-01 01:02:03"; then specify
    -H "columns: col, year=year(col), month=month(col), day=day(col)" to complete the import.

5. where: used to filter the data in the imported file

  • Example 1: To import only the rows where column k1 equals 20180601, specify -H "where: k1 = 20180601" when importing.

6. max_filter_ratio: the maximum tolerated ratio of data that may be filtered out (because of irregular data, etc.). The default is zero tolerance. Rows filtered out by the where condition do not count toward this ratio.

7. partitions: used to specify the partitions designed for this import. If the user can determine the partition corresponding to the data, it is recommended to specify this item. Data that does not satisfy these partitions will be filtered out.
For example, specify import to p1, p2 partition,

-H "partitions: p1, p2"

8. timeout: Specify the timeout period for import. Unit seconds. The default is 600 seconds. The settable range is 1 second ~ 259200 seconds.

9. timezone: specify the time zone used for this import. The default is GMT+8 (Asia/Shanghai). This parameter affects the results of all time-zone-related functions involved in the import.

10. exec_mem_limit: Import memory limit. Default is 2GB. The unit is bytes.

11. format: Specify the import data format, the default is csv, and json format is supported.

12. read_json_by_line: Boolean type, true means supporting reading one json object per line, the default value is false.

13. merge_type: the merge type of the data. Three types are supported: APPEND, DELETE and MERGE. APPEND is the default and means this batch of data is appended to the existing data. DELETE means that all rows whose keys match this batch of data are deleted. MERGE must be used together with a delete condition: rows that satisfy the delete condition are handled with DELETE semantics and the rest with APPEND semantics. Example: -H "merge_type: MERGE" -H "delete: flag=1"

14. delete: only meaningful under MERGE, indicating the deletion condition of data

15. function_column.sequence_col: only applies to the UNIQUE KEY model. Under the same key, it ensures that value columns are REPLACEd according to the source_sequence column. The source_sequence can be a column in the data source or a column in the table schema.

-- prepare data (one JSON object per line)
{"id":1,"name":"liuyan","age":18}
{"id":2,"name":"tangyan","age":18}
{"id":3,"name":"jinlian","age":18}
{"id":4,"name":"dalang","age":18}
{"id":5,"name":"qingqing","age":18}

curl \
 -u root: \
 -H "label:load_local_file_json_20221126" \
 -H "columns:id,name,age" \
 -H "max_filter_ratio:0.1" \
 -H "timeout:1000" \
 -H "exec_mem_limit:1G" \
 -H "where:id>1" \
 -H "format:json" \
 -H "read_json_by_line:true" \
 -H "merge_type:delete" \
 -T /root/data/json.txt \
 http://zuomm01:8040/api/test/load_local_file_test/_stream_load
 
 
  -H "merge_type:append" \
  
  # this deletes the row with id = 3
  -H "merge_type:MERGE" \
  -H "delete:id=3"

Import suggestions

1. Stream Load can only import local files.
2. It is recommended that the data volume of an import request be controlled within 1 - 2 GB. If you have a large number of local files, you can submit them concurrently in batches.

4.3. Importing external storage data (hdfs)

1. Applicable scenarios

1. The source data is in a storage system that the Broker can access, such as HDFS.

2. The amount of data ranges from tens to hundreds of GB.

2. Basic principles

1. Create a task to submit for import

2. FE generates an execution plan and distributes the execution plan to multiple BE nodes (each BE node imports a part of the data)

3. BE starts execution after receiving the execution plan and pulls data from the broker to its own node.

4. After all BEs finish, FE determines whether the import succeeded and returns the result to the client.

                 +
                 | 1. user create broker load
                 v
            +----+----+
            |         |
            |   FE    |   generate the load plan
            |         |
            +----+----+
                 |
                 | 2. BE etl and load the data
    +--------------------------+
    |            |             |
+---v---+     +--v----+    +---v---+
|       |     |       |    |       |
|  BE   |     |  BE   |    |   BE  |
|       |     |       |    |       |
+---+-^-+     +---+-^-+    +--+-^--+
    | |           | |         | |
    | |           | |         | | 3. pull data from broker
+---v-+-+     +---v-+-+    +--v-+--+
|       |     |       |    |       |
|Broker |     |Broker |    |Broker |
|       |     |       |    |       |
+---+-^-+     +---+-^-+    +---+-^-+
    | |           | |          | |
+---v-+-----------v-+----------v-+-+
|       HDFS/BOS/AFS cluster       |
+----------------------------------+

1. Create a new table

drop table if exists load_hdfs_file_test;
CREATE TABLE IF NOT EXISTS load_hdfs_file_test
(
    id INT,
    name VARCHAR(50),
    age TINYINT
)
unique key(id)
DISTRIBUTED BY HASH(id) BUCKETS 3;

2. Import local data into hdfs

hadoop fs -put ./loadfile.txt  hdfs://zuomm01:8020/
hadoop fs -ls  hdfs://zuomm01:8020/    

3. Import format

Syntax example:
LOAD LABEL test.label_202204(
[MERGE|APPEND|DELETE]  -- defaults to APPEND if omitted
DATA INFILE
(
"file_path1"[, file_path2, ...]  -- path(s) of the data; multiple paths may be listed, separated by commas
)
[NEGATIVE]               -- "negative" load, offsets previously loaded data
INTO TABLE `table_name`  -- target table name
[PARTITION (p1, p2, ...)] -- partitions to load into; rows outside these partitions are filtered out
[COLUMNS TERMINATED BY "column_separator"]  -- column separator
[FORMAT AS "file_type"] -- file format of the source files
[(column_list)] -- which columns to load

[COLUMNS FROM PATH AS (c1, c2, ...)]  -- columns extracted from the file path
[SET (column_mapping)] -- column mappings / transform functions
-- the following clauses must come after SET
[PRECEDING FILTER predicate]  -- pre-filter
[WHERE predicate]  -- post-filter, e.g. id > 10

[DELETE ON expr] -- delete condition based on a column; must be used with MERGE
[ORDER BY source_sequence] -- guarantees data order during the load
[PROPERTIES ("key1"="value1", ...)]  -- configuration parameters

4. Load the data on HDFS into the table

LOAD LABEL test.label_20221125
(
DATA INFILE("hdfs://zuomm01:8020/test.txt")
INTO TABLE `load_hdfs_file_test`
COLUMNS TERMINATED BY ","            
(id,name,age)
)
with HDFS (
"fs.defaultFS"="hdfs://zuomm01:8020",
"hadoop.username"="root"
)
PROPERTIES
(
"timeout"="1200",
"max_filter_ratio"="0.1"
);

This is an asynchronous operation, so you need to check the execution status

show load order by createtime desc limit 1\G;

*************************** 1. row ***************************
         JobId: 12143
         Label: label_20220402
         State: FINISHED
      Progress: ETL:100%; LOAD:100%
          Type: BROKER
       EtlInfo: unselected.rows=0; dpp.abnorm.ALL=0; dpp.norm.ALL=4
      TaskInfo: cluster:N/A; timeout(s):1200; max_filter_ratio:0.1
      ErrorMsg: NULL
    CreateTime: 2022-08-31 01:36:01
  EtlStartTime: 2022-08-31 01:36:03
 EtlFinishTime: 2022-08-31 01:36:03
 LoadStartTime: 2022-08-31 01:36:03
LoadFinishTime: 2022-08-31 01:36:03
           URL: NULL
    JobDetails: {"Unfinished backends":{"702bc3732d804f60-aa4593551c6e577a":[]},"ScannedRows":4,"TaskNumber":1,"LoadBytes":139,"All backends":{"702bc3732d804f60-aa4593551c6e577a":[10004]},"FileNumber":1,"FileSize":36}
 TransactionId: 1007
  ErrorTablets: {}
1 row in set (0.00 sec)

-- a failed example: the output contains detailed error information for reference

mysql> show load order by createtime desc limit 1\G;
*************************** 1. row ***************************
         JobId: 12139
         Label: label_20220402
         State: CANCELLED
      Progress: ETL:N/A; LOAD:N/A
          Type: BROKER
       EtlInfo: NULL
      TaskInfo: cluster:N/A; timeout(s):1200; max_filter_ratio:0.1
      ErrorMsg: type:LOAD_RUN_FAIL; msg:errCode = 2, detailMessage = connect failed. hdfs://zuomm01
    CreateTime: 2022-08-31 01:32:16
  EtlStartTime: 2022-08-31 01:32:19
 EtlFinishTime: 2022-08-31 01:32:19
 LoadStartTime: 2022-08-31 01:32:19
LoadFinishTime: 2022-08-31 01:32:19
           URL: NULL
    JobDetails: {"Unfinished backends":{"4bd307c0bd564c45-b7df986d26569ffa":[]},"ScannedRows":0,"TaskNumber":1,"LoadBytes":0,"All backends":{"4bd307c0bd564c45-b7df986d26569ffa":[10004]},"FileNumber":1,"FileSize":36}
 TransactionId: 1006
  ErrorTablets: {}
1 row in set (0.01 sec)
3. Load parameter description

1. load_label: the unique Label of the imported task

2. [MERGE|APPEND|DELETE]: Data merging type. The default is APPEND, which means this import is a normal append write operation. The MERGE and DELETE types are only available for Unique Key model tables. The MERGE type needs to be used with the [DELETE ON] statement to mark the Delete Flag column. The DELETE type indicates that all the data imported this time is deleted data.

3. DATA INFILE: The path of the imported file can be multiple.

4. NEGATIVE: this keyword indicates that this load is a batch of "negative" imports. It is only available for aggregate tables whose SUM aggregate columns are of integer type. It negates the integer values of the SUM aggregate columns in the loaded data, and is mainly used to offset previously imported incorrect data.

5. PARTITION(p1, p2, …): You can specify to import only certain partitions of the table. Data that is no longer within the partition range will be ignored.

6. COLUMNS TERMINATED BY: Specify column separator

7. FORMAT AS: Specify the type of file to be imported, supporting CSV, PARQUET and ORC formats. Default is CSV.

8. Column list: used to specify the column order in the original file.

9. COLUMNS FROM PATH AS: Specify the columns extracted from the import file path.

10. PRECEDING FILTER: pre-filter condition. The data is first assembled into raw data rows in order according to the column list and COLUMNS FROM PATH AS, and is then filtered according to the pre-filter condition.

11. SET (column_mapping): Specify the conversion function of the column.

12. WHERE predicate: Filter imported data based on conditions.

13. DELETE ON expr: must be used together with the MERGE import mode, and only for Unique Key model tables. It specifies the column in the imported data that represents the Delete Flag and the expression that computes it.

14. load_properties: Specify the relevant parameters of the import. Currently the following parameters are supported:

  • timeout: import timeout. Default is 4 hours. Unit seconds.
  • max_filter_ratio: The maximum tolerated data ratio that can be filtered (due to irregular data, etc.). Default is zero tolerance. The value range is 0 to 1.
  • exec_mem_limit: Import memory limit. Default is 2GB. The unit is bytes.
  • strict_mode: Whether to strictly limit data. Default is false.
  • timezone: Specify the time zone of some functions affected by time zone, such as strftime/alignment_timestamp/from_unixtime, etc. Please refer to the time zone documentation for details. If not specified, the "Asia/Shanghai" time zone is used
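
As a sketch of how these properties combine in practice, here is the earlier HDFS load again with strict_mode and timezone added (the label is hypothetical; any unused label works):

LOAD LABEL test.label_20221126_strict
(
DATA INFILE("hdfs://zuomm01:8020/test.txt")
INTO TABLE `load_hdfs_file_test`
COLUMNS TERMINATED BY ","
(id,name,age)
)
with HDFS (
"fs.defaultFS"="hdfs://zuomm01:8020",
"hadoop.username"="root"
)
PROPERTIES
(
"timeout"="1200",
"max_filter_ratio"="0.1",
"strict_mode"="true",
"timezone"="Asia/Shanghai"
);
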
4. Advanced examples

Import data from HDFS, using wildcards to match two batches of files. Import into two tables respectively

LOAD LABEL example_db.label2
(
    DATA INFILE("hdfs://hdfs_host:hdfs_port/input/file-10*")
    INTO TABLE `my_table1`
    PARTITION (p1)
    COLUMNS TERMINATED BY ","
    FORMAT AS "parquet"  
    (id, tmp_salary, tmp_score) 
    SET (
        salary= tmp_salary + 1000,
        score = tmp_score + 10
    ),
    DATA INFILE("hdfs://hdfs_host:hdfs_port/input/file-20*")
    INTO TABLE `my_table2`
    COLUMNS TERMINATED BY ","
    (k1, k2, k3)
)
with HDFS (
"fs.defaultFS"="hdfs://zuomm01:8020",
"hadoop.username"="root"
)

Import the data and extract the partition fields in the file path

LOAD LABEL example_db.label10
(
    DATA INFILE("hdfs://hdfs_host:hdfs_port/user/hive/warehouse/table_name/dt=20221125/*")
    INTO TABLE `my_table`
    FORMAT AS "csv"
    (k1, k2, k3)
    COLUMNS FROM PATH AS (dt)
)
WITH BROKER hdfs
(
    "username"="hdfs_user",
    "password"="hdfs_password"
);

Filter the data to be imported.

LOAD LABEL example_db.label6
(
    DATA INFILE("hdfs://host:port/input/file")
    INTO TABLE `my_table`
    (k1, k2, k3)
    SET (
        k2 = k2 + 1
    )
        PRECEDING FILTER k1 = 1  -- pre-filter
    WHERE k1 > k2                -- post-filter
)
WITH BROKER hdfs
(
    "username"="user",
    "password"="pass"
);

Only rows whose raw data has k1 = 1 and whose transformed data satisfies k1 > k2 will be imported.
5. Cancel import

When the Broker load job status is not CANCELLED or FINISHED, it can be manually canceled by the user.

When canceling, you need to specify the Label of the import task to be canceled. The cancel import command syntax can be viewed by executing HELP CANCEL LOAD.

CANCEL LOAD [FROM db_name] WHERE LABEL="load_label"; 
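
For example, to cancel the broker load job submitted earlier (label taken from that example):

CANCEL LOAD FROM test WHERE LABEL="label_20221125";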

4.4. Synchronize data through external tables

Doris can create external tables. After creation, you can directly query the data of the external table through the SELECT statement, or import the data from the external table through INSERT INTO SELECT.

The data sources currently supported by Doris external tables include: MySQL, Oracle, Hive, PostgreSQL, SQLServer, Iceberg, ElasticSearch

CREATE [EXTERNAL] TABLE table_name ( 
 col_name col_type [NULL | NOT NULL] [COMMENT "comment"] 
) ENGINE=HIVE
[COMMENT "comment"] 
PROPERTIES ( 
-- which Hive database the mapped table is in
-- which Hive table is mapped
-- the Hive metastore service address
 'property_name'='property_value', 
 ... 
); 

Parameter Description:

1. External table columns

  • Column names must correspond to the Hive table one-to-one
  • The order of columns needs to be consistent with the Hive table
  • Must contain all columns in the Hive table
  • Hive table partition columns do not need to be specified and can be defined just like ordinary columns.

2. ENGINE needs to be specified as HIVE

3. PROPERTIES properties:

  • hive.metastore.uris: Hive Metastore service address
  • database: The database name corresponding to mounting Hive
  • table: The table name corresponding to mounting Hive
1. Usage examples

Create a test table in Hive:

CREATE TABLE `user_info` ( 
 `id` int, 
 `name` string, 
 `age` int
) stored as orc;

insert into user_info values (1,'zss',18);
insert into user_info values (2,'lss',20);
insert into user_info values (3,'ww',25);

Create external table in Doris

CREATE EXTERNAL TABLE `hive_user_info` ( 
 `id` int, 
 `name` varchar(10), 
 `age` int 
) ENGINE=HIVE 
PROPERTIES ( 
'hive.metastore.uris' = 'thrift://linux01:9083', 
'database' = 'db1', 
'table' = 'user_info' 
);

Query external table

select * from hive_user_info;

Import data from external table to internal table

-- just query the external table with SQL and insert the selected data into the internal table
insert into doris_user_info
select
 *
from hive_user_info;
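
The internal table doris_user_info is assumed to already exist; a minimal sketch of what it could look like, mirroring the columns of hive_user_info:

CREATE TABLE doris_user_info ( 
 id int, 
 name varchar(10), 
 age int 
)
unique key(id)
DISTRIBUTED BY HASH(id) BUCKETS 3
PROPERTIES("replication_num" = "1");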

Schema changes of the Hive table will not be synchronized automatically; the external table must be recreated in Doris.

Currently only Hive's Text, Parquet and ORC storage formats are supported.

4.5、Binlog Load

Binlog Load provides a CDC (Change Data Capture) function that enables Doris to incrementally synchronize users' data update operations in the Mysql database.

Insert image description here

1. Applicable scenarios

1. Supports synchronizing INSERT/UPDATE/DELETE events

2. Query events are filtered out

3. DDL statements are not yet supported

2. Basic principles

In the current version's design, Binlog Load relies on Canal as an intermediary: Canal pretends to be a MySQL slave node to obtain and parse the binlog from the MySQL master, and Doris then fetches the parsed data from Canal. The process mainly involves the MySQL side, the Canal side and the Doris side; the overall data flow is as follows:

+---------------------------------------------+
|                    Mysql                    |
+----------------------+----------------------+
                       | Binlog
+----------------------v----------------------+
|            Canal Server (parses the binlog) |
+-------------------+-----^-------------------+
               Get  |     |  Ack
+-------------------|-----|-------------------+
| FE                |     |                   |
| +-----------------|-----|----------------+  |
| | Sync Job        |     |                |  |
| |    +------------v-----+-----------+    |  |
| |    | Canal Client                 |    |  |
| |    |   +-----------------------+  |    |  |
| |    |   |       Receiver        |  |    |  |
| |    |   +-----------------------+  |    |  |
| |    |   +-----------------------+  |    |  |
| |    |   |       Consumer        |  |    |  |
| |    |   +-----------------------+  |    |  |
| |    +------------------------------+    |  |
| +----+---------------+--------------+----+  |
|      |               |              |       |
| +----v-----+   +-----v----+   +-----v----+  |
| | Channel1 |   | Channel2 |   | Channel3 |  |
| | [Table1] |   | [Table2] |   | [Table3] |  |
| +----+-----+   +-----+----+   +-----+----+  |
|      |               |              |       |
|   +--|-------+   +---|------+   +---|------+|
|  +---v------+|  +----v-----+|  +----v-----+||
| +----------+|+ +----------+|+ +----------+|+|
| |   Task   |+  |   Task   |+  |   Task   |+ |
| +----------+   +----------+   +----------+  |
+----------------------+----------------------+
     |                 |                  |
+----v-----------------v------------------v---+
|                 Coordinator                 |
|                     BE                      |
+----+-----------------+------------------+---+
     |                 |                  |
+----v---+         +---v----+        +----v---+
|   BE   |         |   BE   |        |   BE   |
+--------+         +--------+        +--------+

As shown in the figure above, the user submits a data synchronization job to FE.

1. FE will start a canal client for each data synchronization job to subscribe to and obtain data from the canal server.

2. The receiver in the client will be responsible for receiving data through the Get command. Every time a data batch is obtained, it will be distributed to different channels by the consumer according to the corresponding table. Each channel will generate a subtask for sending data for this data batch.

3. On FE, a Task is a subtask of the channel sending data to BE, which contains the data of the same batch distributed to the current channel.

4. The channel controls the start, submission, and termination of a single table transaction. Within a transaction cycle, multiple batches of data will generally be obtained from the consumer, so multiple sub-tasks will be generated to send data to BE. These Tasks will not actually take effect until the transaction is successfully submitted.

5. When certain conditions are met (such as exceeding a certain time and reaching the maximum submitted data size), the consumer will block and notify each channel to submit transactions.

6. If and only if all channels are submitted successfully, canal will be notified through the Ack command and continue to obtain and consume data.

7. If any channel fails to submit, the data will be retrieved from the location where the last consumption was successful and submitted again (the successfully submitted channel will not be submitted again to ensure idempotence).

8. During the entire data synchronization operation, FE continuously obtains data from canal through the above process and submits it to BE to complete data synchronization.

3. Configure Mysql side

In the master-slave synchronization of Mysql Cluster mode, the binary log file (Binlog) records all data changes on the master node. Data synchronization and backup among multiple nodes of the Cluster must be performed through the Binlog log, thereby improving the availability of the cluster. The architecture usually consists of a master node (responsible for writing) and one or more slave nodes (responsible for reading). All data changes that occur on the master node will be copied to the slave nodes.

Note: Currently, Mysql 5.7 and above must be used to support the Binlog Load function.

1. To turn on the binary binlog function of mysql, you need to edit the my.cnf configuration file and set it up.

[root@zuomm01 sbin]# find   /  -name   my.cnf
/etc/my.cnf
# modify the [mysqld] section of the config file
[mysqld] 
server_id = 1
log-bin = mysql-bin
binlog-format = ROW

# the three binlog-format modes:
# ROW        records the data of every changed row
# Statement  records the SQL statements
# Mixed      a mix of the two above

2. Restart MySQL to make the configuration take effect.

systemctl restart mysqld 
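
To confirm that the binlog is actually enabled after the restart, the standard MySQL variables can be checked from the MySQL client (a quick verification sketch):

SHOW VARIABLES LIKE 'log_bin';        -- should be ON
SHOW VARIABLES LIKE 'binlog_format';  -- should be ROW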

3. Create a user and grant privileges

-- relax MySQL's password policy so a simple password can be used
set global validate_password_length=4; 
set global validate_password_policy=0; 
-- create a canal user that can read the binlog of all tables in all databases, with password 'canal'
GRANT SELECT, REPLICATION SLAVE, REPLICATION CLIENT ON *.* TO 'canal'@'%' IDENTIFIED BY 'canal' ;
-- flush privileges
FLUSH PRIVILEGES;

4. Prepare a test table

CREATE TABLE `user_doris` (
  `id` int(11) NOT NULL AUTO_INCREMENT,
  `name` varchar(255) DEFAULT NULL,
  `age` int(11) DEFAULT NULL,
  `gender` varchar(255) DEFAULT NULL,
  PRIMARY KEY (`id`)
) ENGINE=InnoDB AUTO_INCREMENT=4 DEFAULT CHARSET=utf8;
5. Configure the Canal side

Canal is a sub-project of Alibaba's otter project. It provides incremental data subscription and consumption based on incremental (binlog) log parsing of MySQL databases, and was originally built for cross-datacenter synchronization scenarios. Canal 1.1.5 or later is recommended.

Download address: https://github.com/alibaba/canal/releases

Upload and decompress the canal deployer compressed package

mkdir /opt/app/canal
tar -zxvf canal.deployer-1.1.5.tar.gz -C /opt/app/canal

Create a new instance directory under the conf folder

There can be multiple instances in a canal service. Each directory under conf/ is an instance. Each instance has an independent configuration file.

mkdir /opt/app/canal/conf/doris
# copy the instance configuration file template
cp /opt/app/canal/conf/example/instance.properties /opt/app/canal/conf/doris/

Modify the conf/canal.properties configuration

vi canal.properties 

# find canal.destinations = example
# and change it to the instance directory created above:
canal.destinations = doris

Start Canal

sh bin/startup.sh
6. Configure target table

Doris creates a target table corresponding to Mysql

CREATE TABLE `binlog_mysql` ( 
 `id` int(11) NOT NULL COMMENT "", 
 `name` VARCHAR(50) NOT NULL COMMENT "", 
 `age` int(11) NOT NULL COMMENT "" ,
 `gender` VARCHAR(50) NOT NULL COMMENT ""
) ENGINE=OLAP 
UNIQUE KEY(`id`) 
DISTRIBUTED BY HASH(`id`) BUCKETS 1; 

Basic syntax

CREATE SYNC [db.]job_name 
( 
 channel_desc,  
 channel_desc 
 ... 
) 
binlog_desc 

Parameter Description:

1. job_name: is the unique identifier of the data synchronization job in the current database

2. channel_desc: defines a data channel of the job, i.e. the mapping between a MySQL source table and a Doris target table. When there are multiple mappings, each MySQL source table must correspond to exactly one Doris target table; any other mapping (such as one-to-many) is considered invalid during syntax checking.

3. column_mapping: mainly refers to the mapping relationship between the columns of the MySQL source table and the doris target table. If not specified, FE will default to a one-to-one correspondence between the columns of the source table and the target table in order. However, we still recommend explicitly specifying the column mapping relationship, so that when the structure of the target table changes (such as adding a nullable column), the data synchronization job can still proceed. Otherwise, when the above changes occur, the import will report an error because the column mapping relationship no longer corresponds one to one.

4. binlog_desc: Defines some necessary information for connecting to the remote Binlog address. Currently, the only supported connection type is canal mode, and all configuration items need to be prefixed with canal.

  • canal.server.ip: address of canal server
  • canal.server.port: port of canal server
  • canal.destination: the string identifier of the instance mentioned above
  • canal.batchSize: The maximum batch size obtained from the canal server in each batch, default 8192
  • canal.username: username of instance
  • canal.password: password of instance
  • canal.debug: When set to true, detailed information about the batch and each row of data will be printed, which will affect performance.
Example:

CREATE SYNC test.job20221228
( 
 FROM test.user_doris INTO binlog_mysql
) 
FROM BINLOG  
( 
 "type" = "canal", 
 "canal.server.ip" = "zuomm01", 
 "canal.server.port" = "11111", 
 "canal.destination" = "doris", 
 "canal.username" = "canal", 
 "canal.password" = "canal" 
);
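
If an explicit column mapping is preferred (as recommended in item 3 above), the channel can list the target columns; a sketch with the column names assumed from the tables created earlier:

CREATE SYNC test.job20221228_cols
( 
 FROM test.user_doris INTO binlog_mysql (id, name, age, gender)
) 
FROM BINLOG  
( 
 "type" = "canal", 
 "canal.server.ip" = "zuomm01", 
 "canal.server.port" = "11111", 
 "canal.destination" = "doris", 
 "canal.username" = "canal", 
 "canal.password" = "canal" 
);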

View job status

-- show the status of all data synchronization jobs in the current database
SHOW SYNC JOB; 
-- show the status of all data synchronization jobs in database `test`
SHOW SYNC JOB FROM `test`; 

Control jobs: Users can control the stop, pause and resume of jobs through the three commands STOP/PAUSE/RESUME

-- stop the data synchronization job named `job_name`
STOP SYNC JOB [db.]job_name 

-- pause the data synchronization job named `job_name`
PAUSE SYNC JOB [db.]job_name 

-- resume the data synchronization job named `job_name`
RESUME SYNC JOB `job_name` 
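
For the job created above, the concrete commands would look like this (job name from the earlier example):

PAUSE SYNC JOB test.job20221228;
RESUME SYNC JOB test.job20221228;
STOP SYNC JOB test.job20221228;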

4.6. Export data

Data export (Export) is a function provided by Doris for exporting data. It can export the data of user-specified tables or partitions, in text format, through the Broker process to remote storage such as HDFS or object storage (S3 protocol).

EXPORT TABLE test.event_info_log1 -- db_name.table_name
to "hdfs://linux01:8020/event_info_log1"  -- export destination path
PROPERTIES
(
    "label" = "event_info_log1",
    "column_separator"=",",
    "exec_mem_limit"="2147483648",
    "timeout" = "3600"
)
WITH BROKER "broker_name"
(
    "username" = "root",
    "password" = ""
);

View export status

mysql> show EXPORT \G;
*************************** 1. row ***************************
     JobId: 14008
     State: FINISHED
  Progress: 100%
  TaskInfo: {"partitions":["*"],"exec mem limit":2147483648,"column separator":",","line delimiter":"\n","tablet num":1,"broker":"hdfs","coord num":1,"db":"default_cluster:db1","tbl":"tbl3"}
      Path: bos://bj-test-cmy/export/
CreateTime: 2019-06-25 17:08:24
 StartTime: 2019-06-25 17:08:28
FinishTime: 2019-06-25 17:08:34
   Timeout: 3600
  ErrorMsg: NULL
1 row in set (0.01 sec)

Parameter Description

  • JobId: unique ID of the job
  • State: Job status:
    • PENDING: Job to be scheduled
    • EXPORTING: Data is being exported
    • FINISHED: The job was successful
    • CANCELLED: Job failed
  • Progress: job progress. The progress is in query plans. Assume there are 10 query plans in total and 3 have been completed so far, so the progress is 30%.
  • TaskInfo: Job information displayed in Json format:
    • db: database name
    • tbl: table name
    • partitions: Specify the exported partitions. * indicates all partitions.
    • exec mem limit: Query plan memory usage limit. Unit byte.
    • column separator: column separator of the export file.
    • line delimiter: The line delimiter of the exported file.
    • tablet num: The total number of tablets involved.
    • broker: The name of the broker used.
    • coord num: The number of query plans.
  • Path: The export path on the remote storage.
  • CreateTime/StartTime/FinishTime: The creation time, start scheduling time and end time of the job.
  • Timeout: job timeout. The unit is seconds. This time is calculated from CreateTime.
  • ErrorMsg: If an error occurs in the job, the reason for the error will be displayed here.

Precautions

1. It is not recommended to export a large amount of data at one time. The maximum amount of exported data recommended for an Export job is tens of GB. Excessively large exports result in more junk files and higher retry costs.

2. If the amount of table data is too large, it is recommended to export according to partitions.

3. During the execution of the Export job, if FE restarts or switches master, the Export job will fail and the user will need to resubmit.

4. If the Export job fails, the __doris_export_tmp_xxx temporary directory generated in the remote storage and the generated files will not be deleted and need to be deleted manually by the user.

5. If the Export job runs successfully, the __doris_export_tmp_xxx directories generated in the remote storage may be retained or cleared based on the file system semantics of the remote storage. For example, in Baidu Object Storage (BOS), after the last file in a directory is removed through the rename operation, the directory will also be deleted. If the directory has not been cleared, the user can clear it manually

6. When the Export operation is completed (success or failure), FE restarts or switches masters, and some of the job information displayed by SHOW EXPORT will be lost and cannot be viewed.

7. The Export job will only export the data of the Base table and not the data of the Rollup Index.

8. The Export job will scan data and occupy IO resources, which may affect the query delay of the system.

Original text: https://mp.weixin.qq.com/s/kjCiPfNDT27KJq4iEN0LYw
