In earlier articles we covered how to use data distribution to synchronize data to ElasticSearch, and how ElasticSearch's query performance makes it easy to serve queries over tens of millions of rows. We also mentioned two further points without going into detail: aggregating data in one place with joined fields stored redundantly, and storing the order line table as a nested document. This article explains how to do the redundancy and the nesting.
Wide table code handling
Concept introduction
Wide table: fields from multiple dimensions are stored in one table. Data redundancy is increased deliberately to reduce joins and simplify queries, so a query against a single table returns fields from several dimensions
Narrow table: the same as an ordinary MySQL table in third normal form. Fields of the same dimension form one table, and tables are joined to reach data from other dimensions
Dimension table: Contains dimension codes and multiple attributes under this dimension
Fact table: contains the relevant attributes of a business event
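To make the concepts concrete, here is a minimal sketch of turning a fact row plus a dimension row into one wide row. It is plain Java and does not use the CloudCanal API. The class name, the method name, and the column names (customer_id, name, region) are hypothetical and only serve the illustration.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Illustrative only: the class, method, and column names are hypothetical.
final class WideRowSketch {

    /**
     * Copies selected attributes of a dimension row (customer) onto a fact
     * row (order), so the result can be queried without a join.
     * The input row is not modified.
     */
    static Map<String, Object> widen(Map<String, Object> orderRow,
                                     Map<Object, Map<String, Object>> customersById) {
        Map<String, Object> wide = new LinkedHashMap<>(orderRow);
        Map<String, Object> customer = customersById.get(orderRow.get("customer_id"));
        if (customer != null) {
            // Redundant copies of dimension attributes on the wide row.
            wide.put("customer_name", customer.get("name"));
            wide.put("customer_region", customer.get("region"));
        }
        return wide;
    }
}
```

The price of the redundancy is that a change to the dimension row (for example a renamed customer) has to be propagated to every wide row that carries a copy of it.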
Handling wide tables with code
CloudCanal provides an entry point for custom code. On top of it we can write our own processing logic, such as joining other tables or looking up the multiple rows of a sub-table, and then send the result to ElasticSearch.
Official reference example code: https://gitee.com/cloughence/cloudcanal-data-process
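The sub-table lookup described above can be sketched as follows. This is not the CloudCanal API and not the code from the official example. It is plain Java that shows only the shaping step: child rows are grouped by their foreign key and attached to the parent as a list, which is the shape a nested document needs. The names NestedLinesSketch, attachLines, order_id, and order_lines are assumptions made for the illustration, and the sketch assumes every line row has a non-null order_id.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

// Illustrative only: the class, method, and column names are hypothetical.
final class NestedLinesSketch {

    /**
     * Attaches each order's line rows under the key "order_lines".
     * Assumes every line row carries a non-null "order_id".
     */
    static List<Map<String, Object>> attachLines(List<Map<String, Object>> orders,
                                                 List<Map<String, Object>> lines) {
        // Group the sub-table rows by the parent key they point to.
        Map<Object, List<Map<String, Object>>> linesByOrder = lines.stream()
                .collect(Collectors.groupingBy(line -> line.get("order_id")));

        List<Map<String, Object>> docs = new ArrayList<>();
        for (Map<String, Object> order : orders) {
            Map<String, Object> doc = new LinkedHashMap<>(order);
            // Orders without lines get an empty list rather than null.
            doc.put("order_lines",
                    linesByOrder.getOrDefault(order.get("id"), Collections.emptyList()));
            docs.add(doc);
        }
        return docs;
    }
}
```

In a real task the line rows would come from a query against the source database. Fetching them once per batch and grouping in memory, as here, avoids one query per order.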
ES index primary key processing
When synchronizing MySQL to ElasticSearch, _id cannot be customized to use different fields within one task; only Id can be used. The _id value configured in the interface sometimes takes effect and sometimes does not, which is odd
Solution:
1. We raised the issue with CloudCanal, and the official reply was: "_id is processed automatically and custom code cannot interfere with it. You need to work on the columns that make up _id, and the custom code can handle AfterPkColumn and BeforePkColumn itself"
2. The configuration in the interface is unreliable, and we were worried about pitfalls later on, so we handle it at the code level:
- Change the Id field of the index from the Long type to the Keyword type so that it can store strings (the planned format is Id$table name)
- Before an insert, update, or delete operation, first replace the value of Id
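A minimal sketch of the Id replacement, assuming the Id$table name format mentioned above. It is plain Java over a map and deliberately does not call the CloudCanal API; DocIdSketch, docId, and withDocId are hypothetical names, and the key "id" stands in for whatever the primary key column is called.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Illustrative only: the class and method names are hypothetical.
final class DocIdSketch {

    /** Builds the string id in the assumed format "<id>$<tableName>". */
    static String docId(Object id, String tableName) {
        if (id == null || tableName == null || tableName.isEmpty()) {
            throw new IllegalArgumentException("id and tableName are required");
        }
        return id + "$" + tableName;
    }

    /** Returns a copy of the row whose "id" value is replaced by the string id. */
    static Map<String, Object> withDocId(Map<String, Object> row, String tableName) {
        Map<String, Object> copy = new LinkedHashMap<>(row);
        copy.put("id", docId(row.get("id"), tableName));
        return copy;
    }
}
```

The official reply quoted above names both AfterPkColumn and BeforePkColumn, which suggests the replacement has to be applied to the before and after images alike. Otherwise an update or delete would presumably target a different document than the one the insert created.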
Releasing code to CloudCanal
Custom code packaging
- In the sub-project, edit src/main/resources/META-INF/cloudcanal/plugin.properties so that it points to the class you want to use;
- Run mvn -Dtest -DfailIfNoTests=false -Dmaven.javadoc.skip=true -Dmaven.compile.fork=true clean package in the sub-project to build the package;
- Copy the JAR from the target directory and rename it as needed.
Upload custom code
While creating a task, choose to upload a custom code package, then select the packaged JAR and upload it
Custom Code Deployment Updates
You can upload the JAR when you configure a task for the first time. If the code changes later, update the JAR on the task management interface. The task restarts after the update, and subsequent incremental data is processed by the new JAR.
Debug custom code
If you run into problems with CloudCanal custom code, check the errors on the task log interface to locate the cause. You can also turn on CloudCanal's debug mode, in which the task automatically waits for a debugger to connect when it starts.
Open the parameter settings
Task Details -> Parameter Settings
Configure the DebugMode parameter
Find the DebugMode parameter and set it to true. Apply the configuration and restart the task. After starting, the task pauses and waits for a remote debug client to connect.
Start remote Debug
For remote debugging, taking IDEA as an example, set the address where the task runs (the sidecar container or the IP of the node it runs on) and the port, which defaults to 8787
Custom code is useful for more than the wide-table processing in this article. Many other operations can be built on top of data distribution, which greatly extends what the data distribution component can do. The demo repository linked above contains many custom code examples. Try them yourself and use code to process the distributed data to suit your own needs.