Building an enterprise BI platform? Master these data preparation features and get twice the result with half the effort!

In enterprise informatization and BI platform construction, data preparation has always been a critical step, because it is the foundation of data analysis. Data preparation ensures that data is accurate, authentic, and complete and improves how well it can be presented, which in turn makes data analysis both high-quality and efficient.

Data preparation mainly includes data processing, data extraction, and data mart creation. Data processing refers to the extraction (extract), transformation (transform), and loading (load) of data before analysis, abbreviated as ETL. Source data may come from different business systems, in different formats, and often contains redundant information. ETL extracts data from dispersed, heterogeneous sources (relational databases, flat data files, and so on) into a temporary intermediate layer, where it is cleaned, converted, and integrated, and finally loads it into a data warehouse or data mart, where it supplies decision-support data for online analytical processing (OLAP) and data mining.
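As a rough illustration of the extract, transform, and load flow described above, here is a minimal Python sketch using only the standard library. It is not how Smartbi implements ETL; the sample data, the table names (`crm_customers`, `mart_orders`), and the function names are all hypothetical. It pulls rows from a flat file and a relational table, cleans and integrates them in memory (standing in for the intermediate layer), and loads the result into a small data mart table.

```python
import csv
import io
import sqlite3

# Hypothetical flat-file source: an export from a sales system.
SALES_CSV = """order_id,customer,amount
1,Alice,120.50
2,Bob,
2,Bob,
3,Carol,80
"""


def extract_flat_file(text):
    """Extract: read rows from a CSV flat file into dictionaries."""
    return list(csv.DictReader(io.StringIO(text)))


def extract_relational(conn):
    """Extract: read a customer-to-region lookup from a relational source."""
    rows = conn.execute("SELECT name, region FROM crm_customers").fetchall()
    return {name: region for name, region in rows}


def transform(orders, regions):
    """Transform: remove duplicates, fill missing amounts, unify types, integrate."""
    seen = set()
    cleaned = []
    for row in orders:
        order_id = int(row["order_id"])
        if order_id in seen:  # redundant row, skip it
            continue
        seen.add(order_id)
        amount = float(row["amount"]) if row["amount"] else 0.0  # null handling
        region = regions.get(row["customer"], "UNKNOWN")  # integrate both sources
        cleaned.append((order_id, row["customer"], region, amount))
    return cleaned


def load(conn, rows):
    """Load: write the cleaned rows into a data mart table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS mart_orders "
        "(order_id INTEGER PRIMARY KEY, customer TEXT, region TEXT, amount REAL)"
    )
    conn.executemany("INSERT INTO mart_orders VALUES (?, ?, ?, ?)", rows)
    conn.commit()


def run_etl():
    """Run the whole pipeline against in-memory databases and return the mart rows."""
    source = sqlite3.connect(":memory:")
    source.execute("CREATE TABLE crm_customers (name TEXT, region TEXT)")
    source.executemany(
        "INSERT INTO crm_customers VALUES (?, ?)",
        [("Alice", "North"), ("Bob", "South")],
    )
    mart = sqlite3.connect(":memory:")
    load(mart, transform(extract_flat_file(SALES_CSV), extract_relational(source)))
    return mart.execute("SELECT * FROM mart_orders ORDER BY order_id").fetchall()
```

Calling `run_etl()` returns one row per distinct order, with the missing amount replaced by 0.0 and customers absent from the CRM table marked `UNKNOWN`.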

Smartbi includes a family of professional data preparation modules. Self-service ETL, business themes, self-service data sets, perspective analysis, and other members of this family improve data quality and performance and shorten the time spent on data preparation.

Let's introduce them one by one:

Self-service ETL

Date of Birth: 2019

Target users: technical personnel and business personnel with ETL data processing needs

Creation background: Some customers have data processing needs and require suitable tools. A customer's source data may come from different business systems, in different formats, and often contains redundant information, so data usually has to be extracted from dispersed, heterogeneous sources and then cleaned, transformed, and integrated through ETL. General-purpose ETL tools, however, are highly technical and require specialist staff to operate.

Function introduction:

Smartbi encapsulates the ETL algorithms so that the technical details are hidden, enabling business personnel to perform ETL operations themselves. Self-service ETL is implemented as a workflow that derives a semantic data model from database tables. Data is preprocessed through simple drag-and-drop operations; the supported operations include filtering and mapping, null value handling, JOIN, duplicate removal, column classification, derived columns, and more. These preprocessing methods address fragmented, messy, and inconsistently standardized enterprise data and turn it into a data model that is semantically consistent and complete. Self-service ETL also strengthens the ability of self-service data sets to build data models.
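To make the idea of a workflow of preprocessing nodes concrete, the following Python sketch models a few of the operations named above (filtering, null value handling, duplicate removal, derived columns, JOIN) as small composable functions. This is only an illustration of the technique, assuming rows are plain dictionaries; the node names and sample data are hypothetical and do not reflect Smartbi's internal API.

```python
from functools import reduce


def filter_rows(predicate):
    """Node: keep only rows for which predicate(row) is true."""
    return lambda rows: [r for r in rows if predicate(r)]


def fill_nulls(column, default):
    """Node: replace None in the given column with a default value."""
    return lambda rows: [
        {**r, column: default if r.get(column) is None else r[column]} for r in rows
    ]


def drop_duplicates(key):
    """Node: keep the first row seen for each value of the key column."""
    def node(rows):
        seen, out = set(), []
        for r in rows:
            if r[key] not in seen:
                seen.add(r[key])
                out.append(r)
        return out
    return node


def derive_column(name, fn):
    """Node: add a new column computed from each row."""
    return lambda rows: [{**r, name: fn(r)} for r in rows]


def join(right_rows, on):
    """Node: inner join with another row set on a shared column."""
    lookup = {r[on]: r for r in right_rows}
    return lambda rows: [{**r, **lookup[r[on]]} for r in rows if r[on] in lookup]


def run_workflow(rows, nodes):
    """Feed the rows through each node in order, like a drag-and-drop flow."""
    return reduce(lambda data, node: node(data), nodes, rows)


# Hypothetical sample data.
orders = [
    {"id": 1, "product": "A", "qty": 2, "price": 10.0},
    {"id": 1, "product": "A", "qty": 2, "price": 10.0},   # duplicate
    {"id": 2, "product": "B", "qty": None, "price": 5.0},  # missing quantity
    {"id": 3, "product": "C", "qty": 1, "price": -1.0},   # invalid price
]
products = [
    {"product": "A", "category": "Tools"},
    {"product": "B", "category": "Toys"},
]

workflow = [
    drop_duplicates("id"),
    filter_rows(lambda r: r["price"] > 0),
    fill_nulls("qty", 0),
    derive_column("total", lambda r: r["qty"] * r["price"]),
    join(products, "product"),
]
result = run_workflow(orders, workflow)
```

Each node takes rows in and returns rows out, so nodes can be reordered or swapped freely, which is the property a visual workflow editor relies on.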

Combat frequency: ☆☆☆☆

Skill tricks:

Self-service ETL does not need to be deployed separately and connects seamlessly with Smartbi, so the results of data preparation are provided to BI directly as data tables. It adopts a distributed computing architecture and supports multi-threading within a single node, which greatly improves data processing performance: data at the 100-million-record scale can be processed in minutes. It has four main characteristics:

● Integration. It is built into Smartbi and can be used without independent deployment.

● Visualization. All operations are performed directly in the interface as a data-processing flow with a rich set of processing nodes, so business personnel can take part.

● High performance. Distributed computing delivers powerful performance on an advanced industry architecture. It can handle massive data volumes, up to the PB level, and its data processing performance is 10 times that of traditional tools of the same type.

● Rich functionality. A large number of components cover both general and advanced data processing.

Business theme

Date of Birth: 2011

Target users: generally created by technical personnel through drag and drop, and used by business personnel

Creation background:

The semantic virtual layer (Semantic Virtual Tier) is a concept clearly defined by Gartner. The semantic model is a similar concept: a logical view that covers the entire warehouse or the scope of analysis.

If business users want to assemble wide tables themselves, a semantic model is the prerequisite for building dynamic large tables. Only once the whole warehouse has been modeled can business personnel build their own analytical models.

In a real project, for example, the basic tables in a relational data source may belong to different areas of business logic once the data source is connected. Suppose the data source contains 20 basic tables, 10 related to human resources data and the other 10 related to product sales. According to actual business needs, these original basic tables can be redefined according to business logic and assembled into business objects (logical tables) for the relevant personnel to use.

In this situation, we set up the appropriate table relationships and use business themes to combine table fields freely and arrange them in layers: the 10 human resources tables are repackaged into a "human resources" theme, and the 10 product sales tables are repackaged into a "product sales" theme. The two themes are then provided to different business personnel, which makes querying much easier.

Function introduction:

Business themes are the most commonly used type of data resource. Technicians construct a semantic virtual layer by dragging and dropping the original tables, transforming complex data relationships into a logical model that business analysts can recognize and use. Business modeling is usually organized by analysis scenario. Users define the relationships between fields and tables in the database according to business logic, forming themes that business personnel can understand and on which permissions can be controlled.
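One way to picture a semantic layer of this kind is as a mapping from business-friendly names to physical tables and columns, plus the table relationships needed to join them. The sketch below is a simplified illustration in Python; the theme structure, the physical names (`t_emp`, `t_dept`, and their columns), and the `build_query` helper are all hypothetical and are not Smartbi's actual data model.

```python
# Hypothetical theme: business objects and attributes mapped to physical names.
HUMAN_RESOURCES_THEME = {
    "name": "Human Resources",
    "objects": {
        "Employee": {
            "table": "t_emp",
            "attributes": {
                "Employee Name": "emp_nm",
                "Hire Date": "hire_dt",
            },
        },
        "Department": {
            "table": "t_dept",
            "attributes": {"Department Name": "dept_nm"},
        },
    },
    # Table relationships defined once by the modeler.
    "relationships": [("t_emp.dept_id", "t_dept.dept_id")],
}


def build_query(theme, selections):
    """Translate (business object, business attribute) pairs into a SQL string."""
    columns, tables = [], []
    for object_name, attribute_name in selections:
        obj = theme["objects"][object_name]
        physical = obj["attributes"][attribute_name]
        columns.append(f'{obj["table"]}.{physical} AS "{attribute_name}"')
        if obj["table"] not in tables:
            tables.append(obj["table"])
    # Apply only the relationships whose two tables are both in the query.
    joins = [
        f"{left} = {right}"
        for left, right in theme["relationships"]
        if left.split(".")[0] in tables and right.split(".")[0] in tables
    ]
    sql = f"SELECT {', '.join(columns)} FROM {', '.join(tables)}"
    if joins:
        sql += " WHERE " + " AND ".join(joins)
    return sql


query = build_query(
    HUMAN_RESOURCES_THEME,
    [("Employee", "Employee Name"), ("Department", "Department Name")],
)
```

The business user only picks "Employee Name" and "Department Name"; the physical column names and the join condition come from the theme definition.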

Combat frequency: ☆☆☆☆☆

Skill tricks:

● Whole warehouse modeling. The entire warehouse can be modeled. Table information can be configured, including business-friendly translations of table and field names, field data types, and display formats. Table fields can be combined freely and arranged in layers, which makes them easier to query and drag. Related tables can be given different table relationships in different themes, so relationships can be set flexibly.

● Permission control. Permissions can be set at the theme, data row, and column levels, providing professional-grade security control.

● Data integration. Powerful data integration capabilities support calculated fields, dimension levels, geographic dimensions, and time dimensions.

Image display:

Business objects are the basic elements that make up a business theme. Business objects can nest business objects, and the nested business objects are collectively referred to as "business sub-objects". Business objects can be dragged in from the left side of the table, or they can be created.

Business attributes are the most basic elements that make up a business object and are equivalent to fields in a table.

Self-service data set

Date of Birth: 2018

Target users: business personnel

Creation background: Traditionally, data sets have to be built by technical personnel. In practice, the data resources available to business personnel often cannot meet their analysis needs: for example, they may need to run correlation analysis across several public data sets, or upload a local Excel file and combine it with public data sets. What they need is a data set that business people can easily create and use themselves.

Function introduction:

A self-service data set is a type of data set built around individual needs and flexible query capabilities. Working visually and according to business requirements, business users can use cross-database queries, multi-table association, data conversion, complex logical calculations, ETL data extraction, and other functions to extract data into a self-service data set. For example, business personnel who need a large-screen display can use a self-service data set provided by Smartbi to build self-service dashboards.

Combat frequency: ☆☆☆☆☆

Skill tricks:

● Visual operation. The interface is fully visual and requires no code.

● High performance guarantee. When the data volume is large, extraction rules can be defined to load data into the cache and speed up subsequent analysis.

● Support for data row permission control. Data row permissions can be configured through the visual interface.

● Support for cross-database queries. When the data a user needs is spread over more than one database, a single query can span multiple databases.

● Support for dimension level definition. A self-service data set supports defining time dimension levels on date fields and geographic dimension levels on area fields. Dimension levels enable drilling in self-service dashboards.
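As an illustration of how a time dimension level defined on a date field enables drilling, the following Python sketch aggregates a measure at the year, quarter, or month level and narrows the result to one parent member when drilling down. It is a sketch under simple assumptions (rows are `(date, amount)` pairs, and the level names are hypothetical), not a description of Smartbi's implementation.

```python
from collections import defaultdict
from datetime import date


def level_key(d, level):
    """Map a date to its hierarchical member key at the given level."""
    quarter = (d.month - 1) // 3 + 1
    if level == "year":
        return (d.year,)
    if level == "quarter":
        return (d.year, quarter)
    if level == "month":
        return (d.year, quarter, d.month)
    raise ValueError(f"unknown level: {level}")


def aggregate(rows, level, parent=None):
    """Sum amounts per member of a level, optionally within one parent member."""
    totals = defaultdict(float)
    for d, amount in rows:
        key = level_key(d, level)
        # When drilling down, keep only members under the selected parent.
        if parent is not None and key[: len(parent)] != parent:
            continue
        totals[key] += amount
    return dict(sorted(totals.items()))


# Hypothetical sample data.
sales = [
    (date(2023, 1, 15), 100.0),
    (date(2023, 2, 3), 50.0),
    (date(2023, 5, 20), 70.0),
    (date(2024, 1, 9), 30.0),
]

by_year = aggregate(sales, "year")
quarters_of_2023 = aggregate(sales, "quarter", parent=(2023,))
months_of_2023_q1 = aggregate(sales, "month", parent=(2023, 1))
```

Because each member key starts with its parent's key, drilling from year to quarter to month is just a prefix match on the key.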

Perspective analysis

Date of Birth: 2014

Target users: data analysts, front-line business personnel, and others

Creation background: Everyone is a data analyst. Today, large amounts of data are generated all the time, and the competitive landscape and the variables of the market change rapidly. Data analysis now happens daily or even hourly, and the angle of analysis must be adjustable at any moment. Professional data analysis, however, involves a complex data processing workflow: OLAP multidimensional analysis requires building a Cube with dimension tables, fact tables, fixed dimension levels, aggregate indicators, and so on, and data queries require advanced SQL statements.

Every employee of an enterprise needs to become a data analyst and use self-service BI tools to maximize what they can achieve on their own. Perspective analysis was created to meet this need.

Function introduction:

With its "Excel-like PivotTable" design, multi-dimensional analysis no longer requires building a model: users can combine dimensions, compute summaries, slice, and drill to gain insight into the data. In addition, any field can be used directly as an output field or a filter condition, which makes data query and exploration easy.
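The pivot-table style of analysis described here can be sketched in a few lines of Python: choose any field for rows, any field for columns, a measure to summarize, and an optional filter that acts as a slice. This is only a minimal illustration with hypothetical field names and data; it makes no claim about how Smartbi computes its results.

```python
from collections import defaultdict


def pivot(rows, row_field, col_field, value_field, where=None):
    """Cross-tabulate rows: sum value_field for each (row_field, col_field) pair."""
    table = defaultdict(lambda: defaultdict(float))
    for r in rows:
        if where is not None and not where(r):  # slice: skip filtered-out rows
            continue
        table[r[row_field]][r[col_field]] += r[value_field]
    return {row_key: dict(cols) for row_key, cols in table.items()}


# Hypothetical sample data.
sales = [
    {"region": "North", "product": "A", "year": 2023, "amount": 10.0},
    {"region": "North", "product": "B", "year": 2023, "amount": 5.0},
    {"region": "South", "product": "A", "year": 2023, "amount": 7.0},
    {"region": "North", "product": "A", "year": 2024, "amount": 3.0},
]

# Region by product, all years.
region_by_product = pivot(sales, "region", "product", "amount")

# Change the angle: product by year, sliced to the North region only.
north_product_by_year = pivot(
    sales, "product", "year", "amount", where=lambda r: r["region"] == "North"
)
```

Nothing is modeled in advance: the same flat rows answer both questions, and any field can play the role of row, column, or filter.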

Combat frequency: ☆☆☆☆☆

Figure: Operation example similar to Excel pivot table

Skill tricks:

● Simple and easy to use. It works like an Excel pivot table. Cross-combining data, multi-dimensional drilling, and flexible queries can all be done without SQL or modeling.

● High performance. Data can be extracted to a high-speed cache database, so queries over large data volumes respond in seconds.

● Multi-dimensional. It supports analysis scenarios with a very large number of dimensions, or even with dimensions that are not fixed.

● Various time calculations. Time calculations and secondary calculations can be set up according to business attributes, for example to quickly analyze year, month, or day growth rates. A variety of application scenarios are supported, such as customizing the start day of the week or comparing against the same time span in the previous period (period-over-period comparison).
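The growth-rate calculations mentioned in the last bullet come down to comparing a period with an earlier one. The Python sketch below computes a month's growth against the previous month (period-over-period) and against the same month of the previous year (year-over-year). The function names and the `(year, month)` keyed totals are hypothetical, chosen only for illustration.

```python
def growth_rate(current, previous):
    """Return (current - previous) / previous, or None if there is no valid base."""
    if not previous:  # missing or zero base period
        return None
    return (current - previous) / previous


def monthly_growth(totals, year, month):
    """Compare one month with the previous month and with the same month last year."""
    current = totals[(year, month)]
    previous_period = (year, month - 1) if month > 1 else (year - 1, 12)
    return {
        "period_over_period": growth_rate(current, totals.get(previous_period)),
        "year_over_year": growth_rate(current, totals.get((year - 1, month))),
    }


# Hypothetical monthly totals keyed by (year, month).
totals = {(2023, 1): 100.0, (2023, 12): 200.0, (2024, 1): 250.0}

january_2024 = monthly_growth(totals, 2024, 1)
december_2023 = monthly_growth(totals, 2023, 12)
```

For January 2024 the previous period is December 2023, so the sketch handles the year boundary explicitly; when a base period is missing, it returns `None` rather than guessing.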

Origin blog.51cto.com/15047075/2608883