Optimizing join computation across mixed heterogeneous data sources

In reporting projects, report data often comes from a variety of heterogeneous sources: relational databases (oracle, db2, mysql), NoSQL databases (mongodb), HTTP data sources, hadoop (hive, hdfs), or even excel and text files. The usual practice is to use an ETL tool to synchronize these sources into a data warehouse and compute there. This approach has several problems:

1. Configuration is complicated and difficult;

2. Cost is high;

3. Data cannot be accessed in real time, and the delay is long;

4. Building and managing a data warehouse is complex;

5. With large data volumes efficiency is poor, and ETL must keep synchronizing data from the various application systems;

6. The data warehouse is built on traditional database technology, so scaling it when load grows is expensive.

Compared with this traditional practice, building a mixed-data-source report with Runqian Report (润乾报表) is straightforward. Its built-in computing engine (集算器, esProc) reads directly from the various data sources, so each kind of data can stay stored in whatever way suits it best, and a real-time report over mixed sources is delivered at lower cost. The figure below compares the architecture of the ETL approach with that of the Runqian Report approach:

The design of a "sales by salesperson for a given state" report is used below to walk through the implementation steps. The report is shown in the following figure:

The sales data in the report comes from the mongodb database of the sales system, and the salesperson information comes from the db2 database of the HR system. With Runqian Report's mixed-data-source approach, the report's data sources do not need periodic synchronization, so there is no time delay.

First, write the esProc script in the esProc designer and save it as statesales.dfx. The script reads as follows:

Code description:

A1: connects to the pre-configured db2 data source.

A2: executes SQL to fetch data from the employee table, with the parameter state = "California".

A3, A4: read the orders collection from the mongodb database of the sales system.

A5: uses esProc's switch function to replace the SellerID field in A4 with the matching record from A2, with the join condition SellerID = eid. The @i option means that a row is deleted if no matching record is found.

A6: generates a new table sequence containing the required fields.

A7, A8: close the database connections.

A9: returns the result to the report.
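The cell-by-cell description above can be illustrated outside esProc. The following Python sketch is not the statesales.dfx script. It only mimics the same flow under stated assumptions: an in-memory SQLite database stands in for the db2 HR system, a JSON string stands in for the mongodb orders collection, and the names `name`, `OrderID` and `Amount` are hypothetical. Only `SellerID`, `eid`, `state` and the employee and orders names come from the description.

```python
import json
import sqlite3

# Stand-in for the mongodb "orders" collection (sample data is made up).
ORDERS_JSON = """[
  {"OrderID": 101, "SellerID": 1, "Amount": 250.0},
  {"OrderID": 102, "SellerID": 2, "Amount": 120.0},
  {"OrderID": 103, "SellerID": 3, "Amount": 300.0},
  {"OrderID": 104, "SellerID": 1, "Amount": 80.0}
]"""


def open_hr_db():
    """A1: open the HR database (in-memory SQLite as a stand-in for db2)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE employee (eid INTEGER, name TEXT, state TEXT)")
    conn.executemany(
        "INSERT INTO employee VALUES (?, ?, ?)",
        [(1, "Alice", "California"), (2, "Bob", "Texas"), (3, "Carol", "California")],
    )
    return conn


def state_sales(state):
    hr = open_hr_db()
    try:
        # A2: fetch the employees of the requested state with a parameterized query.
        rows = hr.execute(
            "SELECT eid, name, state FROM employee WHERE state = ?", (state,)
        ).fetchall()
    finally:
        # A7, A8: release the connection once the data has been read.
        hr.close()
    employees = {eid: {"eid": eid, "name": name, "state": st} for eid, name, st in rows}

    # A3, A4: read the orders collection.
    orders = json.loads(ORDERS_JSON)

    # A5: switch-style join. The SellerID key is replaced by the employee record,
    # and orders without a matching employee are dropped (the @i behavior).
    joined = [
        {**order, "SellerID": employees[order["SellerID"]]}
        for order in orders
        if order["SellerID"] in employees
    ]

    # A6, A9: keep only the fields the report needs and return them.
    return [
        {
            "OrderID": order["OrderID"],
            "Name": order["SellerID"]["name"],
            "State": order["SellerID"]["state"],
            "Amount": order["Amount"],
        }
        for order in joined
    ]


result = state_sales("California")
```

Because the join runs when the report is requested, each call reads the current contents of both sources, with no intermediate synchronized copy.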

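Second, define the parameter state in the report designer and configure the esProc dataset:

Third, design the report as follows:

Run the report and enter the parameter. Once the computation finishes, the report shown earlier is produced. The query panel at the top of the report is the "parameter template" feature that Runqian Report provides automatically. For details on parameter templates and on configuring the db2 and mongodb data sources, see the tutorials and other documentation; they are not repeated here.

Note that if the type of a data source changes, only a small modification is needed for the report to keep working. For example, if the newly launched sales system uses an oracle database, just change A1 of statesales.dfx to: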

hrdb=connect("ora")

Then copy the oracle JDBC driver into place and configure the connection parameters for the oracle database.
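The idea that only the data source name changes can be sketched in Python as follows. This is an illustration, not how esProc resolves data sources: the `DATA_SOURCES` registry, the `connect` helper and the sample rows are all hypothetical, and in-memory SQLite databases stand in for the db2 and oracle servers.

```python
import sqlite3


def _make_source(rows):
    """Build a factory that opens an in-memory database preloaded with rows."""
    def factory():
        conn = sqlite3.connect(":memory:")
        conn.execute("CREATE TABLE employee (eid INTEGER, name TEXT, state TEXT)")
        conn.executemany("INSERT INTO employee VALUES (?, ?, ?)", rows)
        return conn
    return factory


# Hypothetical registry: data source name -> connection factory.
# In a real deployment this would hold driver and connection settings.
DATA_SOURCES = {
    "db2": _make_source([(1, "Alice", "California")]),
    "ora": _make_source([(1, "Alice", "California"), (4, "Dan", "California")]),
}


def connect(name):
    """Open the data source registered under the given name."""
    return DATA_SOURCES[name]()


def employees_in(state, source_name):
    # The query logic is identical for every source; only the name differs.
    conn = connect(source_name)
    try:
        return conn.execute(
            "SELECT eid, name FROM employee WHERE state = ? ORDER BY eid",
            (state,),
        ).fetchall()
    finally:
        conn.close()


before = employees_in("California", "db2")
after = employees_in("California", "ora")
```

Keeping the connection details behind a name means the rest of the script does not need to change when the backing database does.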

Origin www.cnblogs.com/shiGuangShiYi/p/12110828.html