Implementing Pre-Calculation of Report Data

In reporting applications, when the data volume is large or the calculation is complex, preparing the report data source is often too slow, which hurts report performance. In such cases the data a report needs can be calculated in advance and referenced directly at render time, so that users get a quick response when they open the report.

I. Current approaches and their drawbacks

Reports usually take parameters when accessed, so it is obviously impossible to prepare the data source for every combination of parameters in advance. Pre-calculation therefore usually produces only intermediate data, and some simpler calculations (e.g., filtering, grouping and aggregation, sorting) still have to run at render time. Even so, the reporting tool is unlikely to be able to perform all of these follow-up operations on the intermediate data itself, because reporting tools can generally handle only operations on small data volumes. In other words, the storage that holds the intermediate data must itself be able to compute. This is why intermediate data is generally stored as intermediate tables in a database, so that the database's computing power can be borrowed at render time.
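The intermediate-table approach described above can be sketched roughly as follows. This is only an illustration in Python with the standard sqlite3 module, not the setup of any particular reporting tool; the table names (order_detail, order_summary), the columns, and the sample rows are all hypothetical.

```python
import sqlite3


def build_intermediate_table(conn):
    """Pre-calculation step: aggregate the detail rows into an intermediate table."""
    conn.execute("DROP TABLE IF EXISTS order_summary")
    conn.execute(
        """
        CREATE TABLE order_summary AS
        SELECT year, region, SUM(amount) AS total
        FROM order_detail
        GROUP BY year, region
        """
    )


def query_for_report(conn, region, limit):
    """Render-time step: a light filter and sort, still executed by the database."""
    cur = conn.execute(
        "SELECT year, region, total FROM order_summary "
        "WHERE region = ? ORDER BY year LIMIT ?",
        (region, limit),
    )
    return cur.fetchall()


# Hypothetical sample data, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE order_detail (year INTEGER, region TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO order_detail VALUES (?, ?, ?)",
    [(2004, "East", 10.0), (2004, "East", 5.0), (2005, "East", 7.0), (2005, "West", 3.0)],
)
build_intermediate_table(conn)
rows = query_for_report(conn, "East", 10)
```

Note that both steps run inside the database, which is where the extra load discussed in this section comes from.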

Storing pre-calculated data in intermediate tables has the following drawbacks. First, pushing too much computation onto the database inevitably increases its load and can even make performance worse rather than better. Second, too many intermediate tables easily lead to management chaos: a database has a flat structure (unlike the tree structure of a file system), so a large number of intermediate tables makes the database harder to manage. Third, reading a large intermediate table out of the database runs into an I/O bottleneck, which again results in poor report performance.

II. The Runqian Report solution

Runqian Report's pre-calculation scheme (implemented together with esProc) does not use intermediate tables in the database and so avoids the drawbacks above. Runqian Report has a complete built-in calculation engine: intermediate data can be stored in files, and the files can then be calculated on again to produce the report data source, which shortens calculation time and improves report performance.

This approach looks similar to using intermediate tables in a database, since the data still has to be calculated in advance, but there are major differences. First, it does not take up expensive database space or add to the database's load. Second, the intermediate data is organized in the file system, so it is clear and easy to manage. Third, the database I/O bottleneck does not occur when data volumes are large.

Runqian Report can do all this thanks to its built-in esProc calculation engine, which is designed specifically for data computing. esProc interacts seamlessly with the file system (both reading and writing) and can read several file formats, including common text files, Excel files, and a more efficient binary format. This gives files the ability to be calculated on again, which makes report pre-calculation easy.

The steps of pre-calculation with Runqian Report are described below with an example:

1. Save the intermediate results to a file

Runqian Report supports the common text file format; for example, the order detail data can be grouped and aggregated and then written directly to a text file (orderDetail.txt). If better performance is needed, Runqian Report also supports a more efficient binary file format, which can be 2-5 times faster than text. Running the following code in esProc converts the text file into the binary format.

file("E:/Order Details.b").export@b(file("E:/Order Details.txt").cursor())

Of course, the intermediate data itself can also be generated with esProc, but that is not the focus of this article and is not described in detail here.
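For readers who want to try the idea of converting a text file into a binary one outside esProc, here is a minimal Python sketch. It is an analogue only: it uses Python's pickle format rather than esProc's binary format, assumes a tab-delimited text file with a header row, and makes no claim about the resulting speed-up. The function names and the batch_size parameter are hypothetical.

```python
import csv
import pickle
from itertools import islice


def text_to_binary(text_path, binary_path, batch_size=10000):
    """Stream a tab-delimited text file (with header) into a file of pickled row batches."""
    with open(text_path, newline="", encoding="utf-8") as src, \
            open(binary_path, "wb") as dst:
        reader = csv.reader(src, delimiter="\t")
        header = next(reader, None)
        if header is None:
            return  # empty input: leave an empty binary file
        pickle.dump(header, dst)
        while True:
            # Read at most batch_size rows so the whole file is never held in memory.
            batch = list(islice(reader, batch_size))
            if not batch:
                break
            pickle.dump(batch, dst)


def read_binary(binary_path):
    """Lazily yield rows (as dicts keyed by the header) from the binary file."""
    with open(binary_path, "rb") as src:
        try:
            header = pickle.load(src)
        except EOFError:
            return  # empty binary file
        while True:
            try:
                batch = pickle.load(src)
            except EOFError:
                break
            for row in batch:
                yield dict(zip(header, row))
```

Both functions work batch by batch, so the conversion and the later read stay streaming, which is what matters for large intermediate files. Values are kept as strings, as they were in the text file.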

2. Generate the report data source from the intermediate file

Runqian Report can calculate directly on the file again to obtain the report data source, as in the following filtering script.

The script takes the following parameters: cols, the names of the columns to select; where, the filter condition (assembled into an expression string before being passed in); and num, the number of records to fetch.

Script Content:

  A
1 =file("E:/preliminary summary of orders for the last five years.txt").cursor@t(${cols})
2 =A1.select(${where})
3 =A2.fetch(num)
4 return A3

The script above performs the filtering on the grouped and summarized file, where:

A1: reads the large text file through a cursor (streaming). Column selection is supported here, so users can control which columns are read through the parameter.

A2: filters by the condition passed in as a parameter; the result is still a cursor.

A3: fetches records from the cursor, up to the number given by the parameter.

A4: returns the result set to the report.
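The cursor, filter, and fetch steps A1 to A4 can be mirrored in Python with generators. The sketch below is an analogue for illustration, not esProc code. It assumes a tab-delimited file with a header row, and it takes the filter as a Python callable instead of an expression string spliced into the script. The names open_cursor and report_data are hypothetical.

```python
import csv
from itertools import islice


def open_cursor(path, cols):
    """Like A1: lazily yield rows from the file, keeping only the selected columns."""
    with open(path, newline="", encoding="utf-8") as f:
        for row in csv.DictReader(f, delimiter="\t"):
            yield {c: row[c] for c in cols}


def report_data(path, cols, where, num):
    """Return at most num rows that satisfy the where predicate."""
    cursor = open_cursor(path, cols)             # A1: streaming read
    filtered = (r for r in cursor if where(r))   # A2: still lazy, nothing read yet
    return list(islice(filtered, num))           # A3/A4: fetch num rows and return them
```

A call might look like `report_data(path, ["id", "amount"], lambda r: int(r["amount"]) > 10, 100)`. Because every step is lazy until the final fetch, reading stops as soon as num matching rows have been found. As in the script above, the filter sees only the selected columns.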

The script above processes only one intermediate file. If data has to be queried from several files at once, the script can be written as follows (with two files as an example):

  A
1 =file("E:/1996-1999 preliminary summary of orders.txt").cursor@t(${cols})
2 =file("E:/2000-2005 preliminary summary of orders.txt").cursor@t(${cols})
3 =[A1,A2].conj@x()
4 =A3.select(${where})
5 =A4.fetch(5000)
6 return A5

Here the intermediate data is stored in files by year, roughly one file for every five years. To query data for 1996-2005, both files must be read. A3 concatenates the two file cursors vertically into a single cursor, which is then processed exactly as in the first script. As the query range grows and more files are needed, the cursors of multiple files can be merged by concatenating them in a loop.
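Vertical concatenation of several file cursors has a close Python analogue in itertools.chain. The sketch below is illustrative only and is not esProc code. It assumes tab-delimited files with identical header rows, and the function names are hypothetical.

```python
import csv
from itertools import chain, islice


def open_cursor(path):
    """Lazily yield rows (as dicts) from one tab-delimited file with a header row."""
    with open(path, newline="", encoding="utf-8") as f:
        yield from csv.DictReader(f, delimiter="\t")


def report_data_multi(paths, where, num):
    """Concatenate the cursors of all files, then filter and fetch as for one file."""
    # Each file is opened only when the previous one has been exhausted.
    merged = chain.from_iterable(open_cursor(p) for p in paths)
    filtered = (r for r in merged if where(r))
    return list(islice(filtered, num))
```

Because paths is an ordinary list, extending the query range only means passing more file names; the filtering and fetching logic does not change.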

3. Design the report

This step includes calling the esProc script from Runqian Report, editing the report expressions to complete the report, and so on. These are routine report-design tasks and are not repeated here.

As with optimization in general, using pre-calculation to improve report performance requires careful consideration of the usage scenario. It is especially suitable when the calculation can easily be split into stages. For example, when a large table has to be summarized and then joined with other tables, the large table can be summarized in advance and stored as a file, and the join with the other tables is then computed on that file. Data freshness requirements must also be considered: historical-query reports, for instance, are well suited to pre-calculation. Runqian Report also provides other means of meeting real-time data requirements.
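The "summarize the large table first, then join" scenario can be sketched in Python as follows. This is a simplified analogue under stated assumptions, not the Runqian Report implementation: the field names (customer_id, amount, name), the tab-delimited summary file, and both function names are hypothetical, and the small table is assumed to fit in memory as a dict.

```python
import csv
from collections import defaultdict


def summarize_to_file(detail_rows, summary_path):
    """Pre-calculation: total the amounts per customer and store the result as a file."""
    totals = defaultdict(float)
    for row in detail_rows:
        totals[row["customer_id"]] += float(row["amount"])
    with open(summary_path, "w", newline="", encoding="utf-8") as f:
        writer = csv.writer(f, delimiter="\t")
        writer.writerow(["customer_id", "total"])
        for customer_id in sorted(totals):
            writer.writerow([customer_id, totals[customer_id]])


def join_with_customers(summary_path, customers):
    """Render time: join the small pre-summarized file with the customer lookup."""
    with open(summary_path, newline="", encoding="utf-8") as f:
        return [
            {
                "customer_id": row["customer_id"],
                "name": customers.get(row["customer_id"]),  # None if unknown
                "total": float(row["total"]),
            }
            for row in csv.DictReader(f, delimiter="\t")
        ]
```

The expensive pass over the detail rows happens once, ahead of time; at render time only the much smaller summary file is read and joined.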

In summary, general reporting tools cannot compute on files, so pre-calculation usually has to go through the database. Runqian Report has complete file computing capability and thus avoids the drawbacks that intermediate tables in a database bring, which is of great practical value to users.

Origin www.cnblogs.com/xiaohuihui-11/p/12026880.html