[Aliyun] Real-Time Data Warehouse Hologres, Demo 01: Writing Data from Realtime Compute to Hologres in Real Time

Part of the content below is taken directly from the official Alibaba Cloud documentation as background and overview:

Background

In real-time data scenarios, the most common pipeline is to perform initial cleaning on data collected in real time using real-time computation, write the cleaned data to a database in real time, and then connect BI tools for visual analysis. The data processing flow is shown in the figure below.

Across this pipeline, the database is expected to provide high-performance computing, store massive amounts of data, and serve multiple BI analysis tools at the same time. A single database can rarely do all of this, so other databases have to be brought in to complete the business process.
Importing and exporting data between them produces redundant copies and wastes storage resources. Multiple systems also have to be maintained, which makes development and O&M harder.
Overview
To address these pain points in real-time scenarios, Hologres provides a real-time data API. Business data and log data can be written in real time by calling this API directly, and Hologres then provides high-performance computing and massive data storage. The data processing flow is shown in the figure below.

Across this pipeline, you do not need to import or export data. All written data is stored in Hologres, with no redundant copies, which saves computing and storage resources. A single system meets multiple needs, which reduces development and O&M costs.
Notes on writing data in real time with Blink:

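Make sure that Realtime Compute and Hologres are activated in the same region to avoid connection failures. (The service does not work across regions.)
Blink versions earlier than 3.6 do not have a built-in Hologres connector (mind the version). To write data to Hologres in real time you need to reference a JAR file, which you can obtain by submitting a ticket or through the Hologres DingTalk group (group number: 32314975).
If the table that receives data in Hologres has a primary key, data written in real time is updated by primary key by default. (Written according to the update time of the PK.)
If you import data in batch mode, you need to set BatchSize and use the HoloHub endpoint.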

This demo uses Blink 3.4.4 (the default version).

Steps

This demo shows how to generate random data in Realtime Compute and query it in real time in Interactive Analytics (Hologres). (For the full content of this demo, refer to the document "real-time data real-time writing and query".)
Platform: Realtime Compute platform, HoloStudio
Note: This demo focuses on the operation steps. The data is randomly generated; for real workloads, process the data according to your actual scenario.

1. Create a table in Interactive Analytics

Log in to HoloStudio and create an internal table to receive data. The example table creation statement is as follows:

create table test(a int, b text, c text, d float8, e int8);
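If you want Hologres to apply the update-by-primary-key behavior mentioned in the notes above, the receiving table needs a primary key. The following is a minimal sketch and not part of the original demo: the table name test_pk is hypothetical, and choosing column a as the key is an assumption based on the PRIMARY KEY (a) clause in the Blink sink table defined in step 2.

```sql
-- Hypothetical variant of the demo table with a primary key on column a.
-- Assumption: a is a suitable unique key for your data.
create table test_pk (
    a int,
    b text,
    c text,
    d float8,
    e int8,
    primary key (a)
);
```

If values of a repeat in the incoming data, a table like this would be expected to hold fewer rows than were written, because later rows replace earlier rows with the same key.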

2. Prepare the real-time job in Realtime Compute

Log in to the Realtime Compute platform, create a new job under job development, and fill in the job. The job connects Realtime Compute to Interactive Analytics and writes data into it. The sample SQL is as follows:

-- Simulate a message queue by generating random data

create table randomSource (a int, b VARCHAR , c VARCHAR , d DOUBLE, e BIGINT) with (type = 'random');

-- Define the sink table that connects Realtime Compute to Interactive Analytics; fill in the connection information below

create table blink_test_demo (
a INT ,
b VARCHAR ,
c VARCHAR ,
d DOUBLE,
e BIGINT,
PRIMARY KEY (a)
) with (
type = 'custom',
tableFactoryClass = 'com.alibaba.blink.connectors.hologres.HologresTableFactory',
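`endpoint` = 'VPC network address of the Interactive Analytics instance:VPC network port of the instance',
`userName` = 'Access ID of the current account',
`password` = 'Access Key of the current account',
`dbName` = 'name of the Interactive Analytics database to connect to',
`tableName` = 'table in Interactive Analytics that receives the data'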
);

-- Write the data into the sink table

insert
into blink_test_demo
select
a,b,c,d,e
from
randomSource;
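The background section describes an initial cleaning step before data is written. The demo writes every generated row unchanged, but a filter could be added to the same statement. The sketch below is illustrative only: the conditions are assumptions chosen for the example, not rules from the original demo, and it would be used in place of the INSERT statement above rather than alongside it.

```sql
-- Illustrative cleaning step (assumed conditions, adjust to your data):
-- drop rows with a missing key and rows with a negative d value.
insert
into blink_test_demo
select
a,b,c,d,e
from
randomSource
where
a is not null
and d >= 0;
```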


The syntax check reports an error at this point because the Hologres connector JAR is not yet referenced. The next step imports the JAR package (blink-connector-hologres-07-demo.jar).


After writing the job, select Resource Reference in the left menu bar of Realtime Compute and reference the resource package (Resource package: http://docs-aliyun.cn-hangzhou.oss.aliyun-inc.com/assets/attach/170591/cn_en/1591698479126/blink-connector-hologres-07-demo%281%29.jar?spm=a2c4g.11186623.2.23.ff644333ZjOaJX&file=blink-connector-hologres-07-demo%281%29.jar)

Then save and perform another syntax check:


The next step is to bring the real-time job online. Choose the option to go online:

Start the job: going online submits the job to the production environment. Click O&M in the upper-left corner and start the job manually. (The job takes 1-2 minutes to start and output information, so please be patient.)

3. Query the data in real time in Interactive Analytics

Once Realtime Compute has started producing output, you can query the data in real time in HoloStudio.
You can see that the TPS is 2 Blocks/s. You can also adjust the BatchSize.

Run two test queries:

SELECT * FROM test;

SELECT COUNT(*) FROM test;
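Because the job keeps writing, the table grows while you query it. The following queries are a small sketch of how one might check the incoming data without scanning everything back to the client. They use only the test table and its columns from step 1; the aliases total_rows and distinct_a are names chosen for this example.

```sql
-- Look at a small sample instead of returning the whole table.
SELECT a, b, c, d, e FROM test LIMIT 10;

-- Compare the total row count with the number of distinct values of a.
SELECT COUNT(*) AS total_rows, COUNT(DISTINCT a) AS distinct_a FROM test;
```

Running the second query twice, a few seconds apart, should show total_rows increasing while the job is running.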

Done!

Origin blog.csdn.net/u010478127/article/details/108971090