Section 2: Website Clickstream Project (Part 2): 1. Basic Concepts of Data Warehouse Modeling

Data warehouse design overview:
1. Dimensional modeling: a modeling method commonly used for data warehouses, whose purpose is data analysis.
It divides tables into two kinds:
Dimension tables: describe the same event from different angles, so that analysis along each angle gives a different result.
Fact tables: record events that have actually happened; each row is a complete description of one event.
The three dimensional modeling schemas:
Star schema: a single fact table at the core, surrounded by several dimension tables, each linked to the fact table through a primary key/foreign key relationship.
Snowflake schema: a single fact table at the core, surrounded by several dimension tables, some of which may in turn be linked to further dimension tables.
Constellation schema: several fact tables at the core, which may share the same dimension tables. Most models in practice are constellation schemas built around multiple fact tables.

============================================================

II. Module development: data warehouse design

1. Basic concepts of dimensional modeling

Dimensional modeling is a modeling method dedicated to analytical databases, data warehouses, and data marts. A data mart can be thought of as a "small data warehouse."

Dimension table (dimension)

A dimension is a quantity you use when analyzing data. For example, to analyze product sales you can analyze by category or by region; each such "by ..." analysis constitutes a dimension. Another example: "Yesterday afternoon I spent 200 yuan on a cup of cappuccino at Starbucks." To analyze the subject of consumption, three dimensions can be extracted from this sentence: a time dimension (yesterday afternoon), a location dimension (Starbucks), and a product dimension (cappuccino). Dimension tables generally hold relatively fixed information and a small amount of data.

Fact table

A fact table holds the measures for the subject being analyzed. It contains foreign keys that reference each dimension table, and it is combined with the dimension tables through JOINs. The measures in a fact table are usually numeric, and the number of records keeps increasing, so the table grows quickly. For the consumption example above, the fact table could be structured as follows:

Consumption fact table: Prod_id (references the product dimension table), TimeKey (references the time dimension table), Place_id (references the location dimension table), Unit (the sales measure).

Overall, data warehouse design does not need to follow normalization principles strictly, because a data warehouse is used mainly for query-based analysis and generally does not involve data updates. The guideline for fact table design is that historical information can be recorded correctly; the guideline for dimension table design is that the subject can be aggregated from suitable angles.
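To make this structure concrete, here is a minimal sketch using Python's built-in sqlite3 module. The column names Prod_id, TimeKey, Place_id and Unit come from the structure above; the table names (dim_product, dim_time, dim_place, fact_consumption), the remaining column names, and the choice to store the 200 yuan in Unit are assumptions made only for illustration.

```python
import sqlite3

# In-memory database, so nothing is written to disk.
conn = sqlite3.connect(":memory:")

# Table names and the name/label columns are hypothetical.
# Note: SQLite only enforces REFERENCES when PRAGMA foreign_keys=ON;
# here they simply document which dimension each key points to.
conn.executescript("""
CREATE TABLE dim_product (Prod_id  INTEGER PRIMARY KEY, name  TEXT);
CREATE TABLE dim_time    (TimeKey  INTEGER PRIMARY KEY, label TEXT);
CREATE TABLE dim_place   (Place_id INTEGER PRIMARY KEY, name  TEXT);
CREATE TABLE fact_consumption (
    Prod_id  INTEGER REFERENCES dim_product(Prod_id),
    TimeKey  INTEGER REFERENCES dim_time(TimeKey),
    Place_id INTEGER REFERENCES dim_place(Place_id),
    Unit     REAL    -- the measure; assumed here to hold the amount spent
);
""")

# "Yesterday afternoon I spent 200 yuan on a cappuccino at Starbucks."
conn.execute("INSERT INTO dim_product VALUES (1, 'cappuccino')")
conn.execute("INSERT INTO dim_time    VALUES (1, 'yesterday afternoon')")
conn.execute("INSERT INTO dim_place   VALUES (1, 'Starbucks')")
conn.execute("INSERT INTO fact_consumption VALUES (1, 1, 1, 200.0)")

# Rebuild the full description of the event by joining the fact row
# to each of its dimension tables.
row = conn.execute("""
    SELECT t.label, pl.name, pr.name, f.Unit
    FROM fact_consumption f
    JOIN dim_time    t  ON f.TimeKey  = t.TimeKey
    JOIN dim_place   pl ON f.Place_id = pl.Place_id
    JOIN dim_product pr ON f.Prod_id  = pr.Prod_id
""").fetchone()
print(row)
```

The fact row itself stores only keys and a number; everything descriptive lives in the dimension tables and is recovered by the JOINs.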

 

2. The three dimensional modeling schemas

2.1. Star Schema

The star schema is the most common form of dimensional modeling. It has a fact table at the center, with every dimension table connected directly to the fact table, so the layout resembles a star.

A star schema consists of one fact table and a set of dimension tables, and has the following characteristics:

 a. Dimension tables are associated only with the fact table, never with each other;

 b. Each dimension table has a single-column primary key, which is placed in the fact table as the foreign key that connects the two;

 c. The fact table is the core, and the dimension tables are arranged around it in a star shape.
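The characteristics listed above can be sketched with Python's built-in sqlite3 module. This reuses the "sales by category or by region" idea from earlier in the text; every table name, column name, and data value below is hypothetical, and the total_by helper is an illustrative function, not part of any library.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# One fact table; each dimension table has a single-column primary key
# that appears in the fact table as a foreign key. The dimension tables
# do not reference each other.
conn.executescript("""
CREATE TABLE dim_product (product_id INTEGER PRIMARY KEY, name TEXT, category TEXT);
CREATE TABLE dim_region  (region_id  INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE fact_sales (
    product_id INTEGER REFERENCES dim_product(product_id),
    region_id  INTEGER REFERENCES dim_region(region_id),
    amount     REAL
);
INSERT INTO dim_product VALUES (1, 'cappuccino', 'coffee'),
                               (2, 'latte',      'coffee'),
                               (3, 'green tea',  'tea');
INSERT INTO dim_region  VALUES (1, 'north'), (2, 'south');
INSERT INTO fact_sales  VALUES (1, 1, 200.0), (2, 1, 30.0),
                               (3, 2, 20.0),  (1, 2, 50.0);
""")

def total_by(dim_table, key, column):
    """Sum the sales measure grouped by one column of one dimension table.

    Identifiers are interpolated into the SQL text, so this helper must
    only be called with trusted, hard-coded names.
    """
    sql = f"""
        SELECT d.{column}, SUM(f.amount)
        FROM fact_sales f
        JOIN {dim_table} d ON f.{key} = d.{key}
        GROUP BY d.{column}
        ORDER BY d.{column}
    """
    return conn.execute(sql).fetchall()

by_category = total_by("dim_product", "product_id", "category")
by_region = total_by("dim_region", "region_id", "name")
print(by_category)  # [('coffee', 280.0), ('tea', 20.0)]
print(by_region)    # [('north', 230.0), ('south', 70.0)]
```

Each analysis needs exactly one JOIN from the fact table to the chosen dimension table, which is what keeps star schema queries simple.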

  

 

2.2. Snowflake Schema

The snowflake schema is an extension of the star schema. In a snowflake schema a dimension table may itself have further dimension tables. Although this model is more normalized than the star schema, it is harder to understand and more costly to maintain, and because queries have to join several layers of dimension tables, it also performs worse than the star schema. It is therefore not commonly used.
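The extra layer of joins can be seen in a small sketch, again with Python's built-in sqlite3 module. Here the product dimension is normalized so that its category lives in a separate dimension table; all table names, column names, and values are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# The product dimension no longer stores the category text directly;
# it points to a second-level dimension table instead.
conn.executescript("""
CREATE TABLE dim_category (category_id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE dim_product (
    product_id  INTEGER PRIMARY KEY,
    name        TEXT,
    category_id INTEGER REFERENCES dim_category(category_id)
);
CREATE TABLE fact_sales (
    product_id INTEGER REFERENCES dim_product(product_id),
    amount     REAL
);
INSERT INTO dim_category VALUES (1, 'coffee'), (2, 'tea');
INSERT INTO dim_product  VALUES (1, 'cappuccino', 1),
                                (2, 'latte',      1),
                                (3, 'green tea',  2);
INSERT INTO fact_sales   VALUES (1, 200.0), (2, 30.0), (3, 20.0);
""")

# Sales by category now needs two joins:
# fact table -> product dimension -> category dimension.
snowflake_totals = conn.execute("""
    SELECT c.name, SUM(f.amount)
    FROM fact_sales f
    JOIN dim_product  p ON f.product_id  = p.product_id
    JOIN dim_category c ON p.category_id = c.category_id
    GROUP BY c.name
    ORDER BY c.name
""").fetchall()
print(snowflake_totals)  # [('coffee', 230.0), ('tea', 20.0)]
```

In a star schema the category would be a plain column of the product dimension and the same question would need only one join.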

 

2.3. Constellation Schema

The constellation schema also extends the star schema: a star schema is built on a single fact table, whereas a constellation schema is built on multiple fact tables that share dimension information.

The two modeling methods described above both have multiple dimension tables corresponding to a single fact table, but in many cases a dimension space contains more than one fact table, and a single dimension table may be used by several fact tables. In the later stages of business development, most dimensional models use the constellation schema.
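As a rough sketch of a shared dimension, the example below uses Python's built-in sqlite3 module with two fact tables that both reference one date dimension. The tables (dim_date, fact_page_views, fact_orders), their columns, and the sample values are all assumptions chosen to fit a clickstream setting; they do not come from the original project.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Two fact tables share the same date dimension.
conn.executescript("""
CREATE TABLE dim_date (date_id INTEGER PRIMARY KEY, day TEXT);
CREATE TABLE fact_page_views (
    date_id INTEGER REFERENCES dim_date(date_id),
    views   INTEGER
);
CREATE TABLE fact_orders (
    date_id INTEGER REFERENCES dim_date(date_id),
    amount  REAL
);
INSERT INTO dim_date        VALUES (1, '2019-06-01'), (2, '2019-06-02');
INSERT INTO fact_page_views VALUES (1, 100), (1, 50), (2, 80);
INSERT INTO fact_orders     VALUES (1, 200.0), (2, 30.0), (2, 20.0);
""")

# Aggregate each fact table separately, then line the results up on the
# shared dimension. Joining the two fact tables directly would multiply
# rows and inflate the sums.
daily = conn.execute("""
    SELECT d.day, v.total_views, o.total_amount
    FROM dim_date d
    JOIN (SELECT date_id, SUM(views) AS total_views
          FROM fact_page_views GROUP BY date_id) v ON v.date_id = d.date_id
    JOIN (SELECT date_id, SUM(amount) AS total_amount
          FROM fact_orders GROUP BY date_id) o ON o.date_id = d.date_id
    ORDER BY d.day
""").fetchall()
print(daily)  # [('2019-06-01', 150, 200.0), ('2019-06-02', 80, 50.0)]
```

Because both fact tables use the same date_id keys, measures from different business processes can be compared along the same dimension.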

 

 

 

 


Origin www.cnblogs.com/mediocreWorld/p/11105427.html