Common Operations in Big-Data Multidimensional Analysis, Illustrated: OLAP Operations

8cb874eaacfe6e0d8f7cd3eb1bcbca88.png

c2e0d2aca71876b9faa0f85f750165a5.png

d828bc61ca519b064aa68a62c4c94a9a.png

58e68dc9ba476b3adcb51525fac64501.png

OLAP Operations in the Multidimensional Data Model

In the multidimensional model, records are organized into various dimensions, and each dimension includes multiple levels of abstraction described by concept hierarchies. This organization gives users the flexibility to view data from different perspectives. A number of OLAP data cube operations exist to materialize these different views, allowing interactive querying and analysis of the data at hand. Hence, OLAP provides a user-friendly environment for interactive data analysis.

Consider the OLAP operations that can be performed on multidimensional data. The figure shows a data cube for the sales of a shop. The cube contains the dimensions location, time, and item, where location is aggregated with respect to city values, time is aggregated with respect to quarters, and item is aggregated with respect to item types.

Roll-Up

The roll-up operation (also known as drill-up or aggregation) performs aggregation on a data cube, either by climbing up a concept hierarchy for a dimension or by dimension reduction. Roll-up is like zooming out on the data cube. The figure shows the result of a roll-up operation performed on the dimension location. The hierarchy for location is defined as the total order street < city < province or state < country. The roll-up operation aggregates the data by ascending the location hierarchy from the level of city to the level of country.

When roll-up is performed by dimension reduction, one or more dimensions are removed from the cube. For example, consider a sales data cube with two dimensions, location and time. Roll-up may be performed by removing the time dimension, resulting in an aggregation of total sales by location rather than by location and by time.

The following diagram illustrates how roll-up works.

f59c4c28f40f85d9ba23879768ec2eae.png
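Both forms of roll-up can be sketched in a few lines of plain Python. This is only a minimal illustration, not how an OLAP engine implements aggregation: the fact rows, the city-to-country mapping, and the function names are all hypothetical.

```python
from collections import defaultdict

# Hypothetical fact rows: (city, quarter, item, units_sold)
SALES = [
    ("Toronto", "Q1", "Mobile", 605),
    ("Vancouver", "Q1", "Mobile", 825),
    ("Chicago", "Q1", "Mobile", 440),
    ("New York", "Q1", "Modem", 1560),
    ("Toronto", "Q2", "Modem", 680),
    ("Chicago", "Q2", "Mobile", 512),
]

# Hypothetical concept hierarchy for location: city -> country
CITY_TO_COUNTRY = {
    "Toronto": "Canada",
    "Vancouver": "Canada",
    "Chicago": "USA",
    "New York": "USA",
}


def roll_up_location(rows):
    """Climb the location hierarchy from city to country and sum the measure."""
    totals = defaultdict(int)
    for city, quarter, item, units in rows:
        totals[(CITY_TO_COUNTRY[city], quarter, item)] += units
    return dict(totals)


def roll_up_drop_time(rows):
    """Dimension reduction: remove the time dimension, keep location and item."""
    totals = defaultdict(int)
    for city, _quarter, item, units in rows:
        totals[(city, item)] += units
    return dict(totals)


by_country = roll_up_location(SALES)
by_city_item = roll_up_drop_time(SALES)
```

In both cases the grand total of the measure is unchanged; only the granularity at which it is reported becomes coarser.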

Drill-Down

The drill-down operation (also called roll-down) is the reverse of roll-up. Drill-down is like zooming in on the data cube. It navigates from less detailed data to more detailed data. Drill-down can be performed either by stepping down a concept hierarchy for a dimension or by adding additional dimensions.

The figure shows a drill-down operation performed on the dimension time by stepping down a concept hierarchy defined as day < month < quarter < year. Drill-down occurs by descending the time hierarchy from the level of quarter to the more detailed level of month.

Because a drill-down adds more detail to the given data, it can also be performed by adding a new dimension to a cube. For example, a drill-down on the central cube of the figure can occur by introducing an additional dimension, such as customer group.

The following diagram illustrates how drill-down works.

e7321eceb51c4b4dcb2e48b2e1e97b6a.png
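Drill-down along the time hierarchy can be sketched as re-aggregating the base data at a finer level. The example below is a rough illustration under assumed data: the monthly facts, the month-to-quarter mapping, and the helper names are hypothetical.

```python
from collections import defaultdict

# Hypothetical base facts at month granularity: (city, month, units_sold)
MONTHLY_SALES = [
    ("Toronto", "Jan", 200), ("Toronto", "Feb", 180), ("Toronto", "Mar", 225),
    ("Toronto", "Apr", 210), ("Toronto", "May", 240), ("Toronto", "Jun", 230),
]

# Hypothetical concept hierarchy for time: month -> quarter
MONTH_TO_QUARTER = {
    "Jan": "Q1", "Feb": "Q1", "Mar": "Q1",
    "Apr": "Q2", "May": "Q2", "Jun": "Q2",
}


def aggregate(rows, level):
    """Sum units by (city, time member) at the requested time level."""
    totals = defaultdict(int)
    for city, month, units in rows:
        member = month if level == "month" else MONTH_TO_QUARTER[month]
        totals[(city, member)] += units
    return dict(totals)


def drill_down(rows, city, quarter):
    """Return the month-level cells that make up one quarter-level cell."""
    monthly = aggregate(rows, "month")
    return {
        key: units
        for key, units in monthly.items()
        if key[0] == city and MONTH_TO_QUARTER[key[1]] == quarter
    }


quarterly = aggregate(MONTHLY_SALES, "quarter")
q1_detail = drill_down(MONTHLY_SALES, "Toronto", "Q1")
```

The month-level cells returned for a quarter add up to that quarter's cell, which is what makes drill-down the reverse of roll-up.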

Slice

A slice is a subset of the cube corresponding to a single value for one or more members of a dimension. For example, a slice operation is executed when the user wants a selection on one dimension of a three-dimensional cube, resulting in a two-dimensional slice. So the slice operation performs a selection on one dimension of the given cube, resulting in a sub-cube.

The following diagram illustrates how slice works.

bfed017ffcd686938fa224105f8129cd.png

Here, the slice is performed on the dimension "time" using the criterion time = "Q1".

It forms a new sub-cube by fixing the selected dimension to a single value.
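The time = "Q1" slice can be sketched as a filter over cube cells that also drops the fixed dimension from the result. The cube contents and the slice_cube helper are hypothetical and only meant to show the idea.

```python
# Hypothetical cube cells keyed by (location, time, item) -> units sold
CUBE = {
    ("Toronto", "Q1", "Mobile"): 605,
    ("Toronto", "Q2", "Mobile"): 680,
    ("Vancouver", "Q1", "Modem"): 825,
    ("Vancouver", "Q2", "Modem"): 952,
}
DIMENSIONS = ("location", "time", "item")


def slice_cube(cube, dimension, value):
    """Fix one dimension to a single value and drop it from the cell keys."""
    axis = DIMENSIONS.index(dimension)
    return {
        key[:axis] + key[axis + 1:]: measure
        for key, measure in cube.items()
        if key[axis] == value
    }


q1_slice = slice_cube(CUBE, "time", "Q1")
```

The three-dimensional cube becomes a two-dimensional sub-cube keyed by (location, item), matching the description above.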

Dice

The dice operation defines a sub-cube by performing a selection on two or more dimensions.

The following diagram shows the dice operation.

48000947ad30d9f27a3067edda111e4c.png

The dice operation on the cube, based on the following selection criteria, involves three dimensions:

  • (location = "Toronto" or "Vancouver")

  • (time = "Q1" or "Q2")

  • (item = "Mobile" or "Modem")
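The three criteria above can be applied together as one filter. The sketch below assumes a cube stored as a dictionary of cells; the cell values and the dice helper are hypothetical.

```python
# Hypothetical cube cells keyed by (location, time, item) -> units sold
CUBE = {
    ("Toronto", "Q1", "Mobile"): 605,
    ("Toronto", "Q3", "Mobile"): 700,
    ("Vancouver", "Q2", "Modem"): 952,
    ("Chicago", "Q1", "Mobile"): 440,
    ("Vancouver", "Q1", "Phone"): 14,
}
DIMENSIONS = ("location", "time", "item")


def dice(cube, criteria):
    """Keep the cells whose member on every constrained dimension is allowed."""
    axes = {
        DIMENSIONS.index(dimension): set(allowed)
        for dimension, allowed in criteria.items()
    }
    return {
        key: measure
        for key, measure in cube.items()
        if all(key[axis] in allowed for axis, allowed in axes.items())
    }


sub_cube = dice(CUBE, {
    "location": ["Toronto", "Vancouver"],
    "time": ["Q1", "Q2"],
    "item": ["Mobile", "Modem"],
})
```

Unlike a slice, the result keeps all three dimensions; each one is simply restricted to the listed members.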

Pivot

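The pivot operation is also called rotation. Pivot is a visualization operation that rotates the data axes in view to provide an alternative presentation of the data. It may involve swapping the rows and columns or moving one of the row dimensions into the column dimensions.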

a3572cbd6820c3242b626db2acea68f9.png

The following diagram shows the pivot operation.

8a69523c6f51c52124d19726978b86b1.png
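The simplest pivot, swapping rows and columns of a two-dimensional view, amounts to a transpose. A minimal sketch with a hypothetical view and a hypothetical pivot helper:

```python
# Hypothetical 2-D view: rows are locations, columns are items
VIEW = {
    "Toronto": {"Mobile": 605, "Modem": 825},
    "Vancouver": {"Mobile": 14, "Modem": 400},
}


def pivot(view):
    """Swap the row and column axes of a 2-D view (a transpose)."""
    rotated = {}
    for row_label, columns in view.items():
        for column_label, measure in columns.items():
            rotated.setdefault(column_label, {})[row_label] = measure
    return rotated


by_item = pivot(VIEW)
```

No measure changes; pivoting twice returns the original view, which is why pivot is described as a presentation operation rather than an aggregation.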

Other OLAP Operations

Drill-across executes queries involving more than one fact table. The drill-through operation uses relational SQL facilities to drill through the bottom level of a data cube down to its back-end relational tables.

Other OLAP operations may include ranking the top-N or bottom-N elements in lists, as well as computing moving averages, growth rates, interest, internal rates of return, depreciation, currency conversions, and statistical functions.

OLAP offers analytical modeling capabilities, including a calculation engine for deriving ratios, variance, and so on, and for computing measures across multiple dimensions. It can generate summarizations, aggregations, and hierarchies at each granularity level and at every dimension intersection. OLAP also provides functional models for forecasting, trend analysis, and statistical analysis. In this context, the OLAP engine is a powerful data analysis tool.
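A few of the calculations mentioned above (top-N ranking, moving averages, growth rates) are easy to sketch over already-aggregated values. The numbers and function names below are hypothetical, and a real OLAP engine would offer these as built-in functions rather than user code.

```python
def top_n(totals, n):
    """Return the n (member, value) pairs with the largest values."""
    return sorted(totals.items(), key=lambda pair: pair[1], reverse=True)[:n]


def moving_average(values, window):
    """Simple moving average over a fixed-size trailing window."""
    return [
        sum(values[i - window + 1:i + 1]) / window
        for i in range(window - 1, len(values))
    ]


def growth_rates(values):
    """Period-over-period growth as a fraction of the previous value."""
    return [
        (current - previous) / previous
        for previous, current in zip(values, values[1:])
    ]


# Hypothetical aggregated measures
ITEM_TOTALS = {"Mobile": 1200, "Modem": 800, "Phone": 300, "Security": 950}
QUARTERLY_SALES = [100, 120, 150, 180]

best_two = top_n(ITEM_TOTALS, 2)
smoothed = moving_average(QUARTERLY_SALES, 2)
growth = growth_rates(QUARTERLY_SALES)
```

A bottom-N ranking is the same sort with reverse=False.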

14 basic OLAP operations

  • Drill-up

  • Drill-down

  • Slice

  • Dice

  • Pivot

  • Scoping

  • Screening

  • Drill across

  • Drill through

  • Sort

  • Add measure

  • Drop measure

  • Union

  • Difference

OLAP Operations in Data Mining

OLAP is a widely used Business Intelligence technology developed to organize and analyze vast amounts of data. OLAP databases are stored in the form of multidimensional cubes, where each cube comprises the data deemed relevant by a cube administrator. Through certain OLAP operations, a user can obtain a specific view of the cube and extract the required information from it. This makes it possible to produce the Pivot Table and Pivot Chart reports that are needed.

General OLAP operations include Drill-up, Drill-down, Pivot, and Slice-and-Dice. Here we’d like to expand the list and look through a broader set of OLAP operations, with examples, including slicing and dicing.

But before defining what an OLAP operation is, let’s figure out which languages are used in this process.

OLAP language

OLAP operations can be expressed in two languages: SQL and MDX.

SQL, or Structured Query Language, is a language for managing relational databases and manipulating their data; it works on two-dimensional tables.

MDX, or Multidimensional Expressions, is a language for expressing analytical queries. Its principal difference from SQL is that MDX can reference multiple dimensions. Microsoft originally introduced MDX, with a syntax modeled on SQL.

These two languages are different and each has its own peculiarities. However, OLAP operations expressed in SQL and in MDX are quite similar.

Our product Ranet OLAP uses the MDX query language, which is why the examples here focus on MDX OLAP operations.

OLAP operations:

So let’s outline the typical OLAP operations now.

Drill Up

You will often meet this operation as half of the drill-up and drill-down pair. Drill-up gathers data from the cube either by ascending a concept hierarchy for a dimension or by dimension reduction, in order to obtain measures at a less detailed granularity. To see a broader perspective in line with the concept hierarchy, the user groups columns and combines their values. Because the result carries fewer specifics, one or more dimensions may be removed from the data cube when this operation is run. Some sources treat drill-up and roll-up as synonyms.

Here’s a typical example of a drill-up (roll-up) operation:

324a4ae9312cdefd2894426497a3eb0d.png

Drill down

OLAP Drill-down is the opposite of Drill-up. It is carried out either by descending a concept hierarchy for a dimension or by adding a new dimension. It lets a user reach highly detailed data from a less detailed cube. Consequently, when the operation is run, one or more dimensions may be added to the data cube to provide more detail.

Have a look at an OLAP Drill-down example in use:

b6963893404b4e2bb64cac69b9961f95.png

Slice

The next pair we are going to discuss is the slice and dice operations. The Slice operation selects a single value on one dimension of a given cube and produces a new sub-cube, which presents the information from another point of view. The use of Slice implies a specified granularity level for that dimension.

An OLAP Slice example looks like this:

837320092e293cb0f9ea7f6f7e478d06.png

Dice

OLAP Dice selects on two or more dimensions of a given cube and produces a new sub-cube, just as the Slice operation does for one dimension. To locate a single value in the cube, a value must be specified for each dimension.

The diagram below shows how Dice operation works:

50f91dfb18dedbff4f6bf2763bd20668.png

Pivot

This OLAP operation rotates the axes of a cube to provide an alternative view of the data. Pivot regroups the data along other dimensions, which helps in analyzing the performance of a company or enterprise.

Here’s an example of Pivot in operation:

bdc791117d872a8deb341b04a5828a8e.png


Scoping

The Scoping operation restricts the view of database objects to a specified subset. It lets users retrieve and update only the data values they need. Scoping is most useful when there is a huge amount of data and access needs to be constrained to a specified subset.

Screening

Screening is conducted to limit the set of data extracted.

Drill across

Drill across and Drill through are another pair of related operations. Drill across combines cells from several data cubes that share the same schema.
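One way to picture Drill across is as a join of two cubes on their shared dimensions. The sketch below is an assumption about how that could look in plain Python: the two cubes, their measures, and the drill_across helper are hypothetical, and keeping only cells present in both cubes is a design choice of this example.

```python
# Two hypothetical cubes with the same dimensions (location, time)
# but different measures: units sold and units returned
SALES_CUBE = {("Toronto", "Q1"): 605, ("Vancouver", "Q1"): 825}
RETURNS_CUBE = {
    ("Toronto", "Q1"): 12,
    ("Vancouver", "Q1"): 30,
    ("Chicago", "Q1"): 5,
}


def drill_across(left, right):
    """Pair up the measures of cells that exist in both cubes."""
    return {key: (left[key], right[key]) for key in left.keys() & right.keys()}


combined = drill_across(SALES_CUBE, RETURNS_CUBE)
```

Each resulting cell carries both measures side by side, so they can be compared at the same coordinates.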

Drill through

OLAP Drill through lets a user navigate from data at the lowest level of a cube to the data in the operational systems from which the cube was derived. The operation is typically used to identify the cause of outlier values in a data cube.

Sort

Sort returns a cube in which the members of a dimension are sorted.

Add Measure

Thanks to this OLAP operation one is able to add new measures to a cube.

Drop Measure

In contrast to Add Measure, it’s also possible to remove a measure from a data cube if it is not needed.

Union

With Union, you can combine a number of cubes that have the same schema but separate instances.

Difference

Difference removes the cells of a cube that also belong to another cube. The two cubes must have the same schema.
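Union and Difference behave much like set operations on cell coordinates. A minimal sketch, assuming cubes stored as dictionaries with the same key layout; the cubes and helper names are hypothetical, and letting the first cube win when a cell exists in both is an assumption of this example, not something the text specifies.

```python
# Two hypothetical cubes with the same schema: (location, time) -> units sold
CUBE_A = {("Toronto", "Q1"): 605, ("Vancouver", "Q1"): 825}
CUBE_B = {("Vancouver", "Q1"): 825, ("Chicago", "Q1"): 440}


def union(first, second):
    """Combine the cells of two cubes; on overlap the first cube's value wins."""
    return {**second, **first}


def difference(first, second):
    """Keep cells of the first cube whose coordinates are absent from the second."""
    return {key: measure for key, measure in first.items() if key not in second}


all_cells = union(CUBE_A, CUBE_B)
only_in_a = difference(CUBE_A, CUBE_B)
```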

Questions

To sum up, let’s go through the most frequently asked questions about OLAP operations.

How to define the concept of OLAP and the operations it supports?

Online Analytical Processing is a technology that helps perform multidimensional analysis of business data and carry out complex calculations and data modeling. OLAP databases are stored in the form of multidimensional cubes, where each cube comprises the data deemed relevant by a cube administrator. OLAP operations help a user obtain a specific view of the cube and extract the required information from it.

What are the different types of OLAP operations?

We can distinguish 14 basic OLAP operations:

  • Drill-up

  • Drill-down

  • Slice

  • Dice

  • Pivot

  • Scoping

  • Screening

  • Drill across

  • Drill through

  • Sort

  • Add measure

  • Drop measure

  • Union

  • Difference

You can find more information earlier in the article, where we discussed the typical OLAP operations with examples.

What is the difference between slice and dice in OLAP?

The Slice operation selects a single value on one dimension of a given cube and produces a new sub-cube, which presents the information from another point of view. The Dice operation, by contrast, selects on two or more dimensions of a cube.

In conclusion, it is worth pointing out that an OLAP system holds historical data, which you can view in a summarized, multidimensional form using the operations described above. Through them, the data becomes flexible and user-friendly to analyze.

Bonus:

022726831781a5133bea49368313ac64.png


Reposted from blog.csdn.net/universsky2015/article/details/125213956