What Is a Data Warehouse?

Excerpted from: Data Mining - Concepts and Techniques

According to William H. Inmon, a leading architect in the construction of data warehouse systems, “A data warehouse is a subject-oriented, integrated, time-variant, and nonvolatile collection of data in support of management’s decision making process” [Inm96]. This short but comprehensive definition presents the major features of a data warehouse. The four keywords, subject-oriented, integrated, time-variant, and nonvolatile, distinguish data warehouses from other data repository systems, such as relational database systems, transaction processing systems, and file systems. Let’s take a closer look at each of these key features.

Subject-oriented: A data warehouse is organized around major subjects, such as customer, supplier, product, and sales. Rather than concentrating on the day-to-day operations and transaction processing of an organization, a data warehouse focuses on the modeling and analysis of data for decision makers. Hence, data warehouses typically provide a simple and concise view around particular subject issues by excluding data that are not useful in the decision support process.

Integrated: A data warehouse is usually constructed by integrating multiple heterogeneous sources, such as relational databases, flat files, and on-line transaction records. Data cleaning and data integration techniques are applied to ensure consistency in naming conventions, encoding structures, attribute measures, and so on.
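
As a rough illustration of the integration step, the sketch below maps records from two sources onto one schema, reconciling a naming convention, an encoding structure, and an attribute measure. Both sources, every field name (`cust_id`, `CustomerID`, `sex`, `balance_cents`, and so on), and the target schema are hypothetical and invented for this example; they do not come from the book.

```python
# Hypothetical source A: rows exported from a relational table.
source_a = [
    {"cust_id": 1, "gender": "M", "balance_usd": 120.0},
    {"cust_id": 2, "gender": "F", "balance_usd": 75.5},
]

# Hypothetical source B: a flat file with different names, codes, and units.
source_b = [
    {"CustomerID": "3", "sex": "male", "balance_cents": "9900"},
    {"CustomerID": "4", "sex": "female", "balance_cents": "1250"},
]

# One agreed encoding for gender, whatever the source used.
GENDER_CODES = {"m": "M", "male": "M", "f": "F", "female": "F"}


def from_source_a(row):
    """Map a source A row onto the (assumed) warehouse schema."""
    return {
        "customer_id": int(row["cust_id"]),
        "gender": GENDER_CODES[row["gender"].lower()],
        "balance_usd": float(row["balance_usd"]),
    }


def from_source_b(row):
    """Map a source B row: rename fields, recode gender, convert cents to dollars."""
    return {
        "customer_id": int(row["CustomerID"]),
        "gender": GENDER_CODES[row["sex"].lower()],
        "balance_usd": int(row["balance_cents"]) / 100,
    }


def integrate(rows_a, rows_b):
    """Return all rows from both sources in the unified schema."""
    return [from_source_a(r) for r in rows_a] + [from_source_b(r) for r in rows_b]


warehouse_rows = integrate(source_a, source_b)
```

After this step every row shares the same field names, the same gender codes, and the same currency unit, which is the kind of consistency the paragraph above describes.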

Time-variant: Data are stored to provide information from a historical perspective (e.g., the past 5–10 years). Every key structure in the data warehouse contains, either implicitly or explicitly, an element of time.

Nonvolatile: A data warehouse is always a physically separate store of data transformed from the application data found in the operational environment. Due to this separation, a data warehouse does not require transaction processing, recovery, and concurrency control mechanisms. It usually requires only two operations in data accessing: initial loading of data and access of data.
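
The two access operations can be sketched as a toy in-memory store that only supports loading and reading. This is a minimal sketch under stated assumptions, not how a real warehouse is built: the class `WarehouseStore`, its methods, and the fields `product_id`, `snapshot_date`, and `units_sold` are all hypothetical. The key also carries an explicit time element, in the spirit of the time-variant property.

```python
from datetime import date


class WarehouseStore:
    """Toy store exposing only two operations: load and access."""

    def __init__(self):
        # Keyed by (product_id, snapshot_date): the key contains a time element.
        self._rows = {}

    def load(self, rows):
        """Append rows; an already loaded key is rejected, never overwritten."""
        for row in rows:
            key = (row["product_id"], row["snapshot_date"])
            if key in self._rows:
                raise ValueError(f"row {key} already loaded; updates are not supported")
            self._rows[key] = dict(row)

    def history(self, product_id):
        """Return copies of all snapshots for one product, oldest first."""
        matches = [r for (pid, _), r in self._rows.items() if pid == product_id]
        return [dict(r) for r in sorted(matches, key=lambda r: r["snapshot_date"])]


store = WarehouseStore()
store.load([
    {"product_id": 7, "snapshot_date": date(2023, 1, 31), "units_sold": 40},
    {"product_id": 7, "snapshot_date": date(2022, 12, 31), "units_sold": 35},
])
product_history = store.history(7)
```

There is deliberately no update or delete method. Readers get copies of the stored rows, so nothing they do can change the loaded history.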

Reposted from goaheadtw.iteye.com/blog/1733903