Data redundancy

1 What is data redundancy

Data redundancy is the presence of duplicate data in a data set.

For example, in database design, if a field that belongs to one table also appears in one or more other tables, with exactly the same meaning as in its original table, then that field is a redundant field.

1. In a relational database, data redundancy mainly refers to the same information being stored more than once.

2. Data redundancy wastes valuable resources and should be minimized. However, a relational database needs some redundancy in order to provide certain features. Necessary redundancy mainly serves the following purposes:

① Establishing links between data, such as relating two tables through a common attribute

② Data recovery, such as keeping a backup file so that the working file can be restored if it is destroyed

③ Data verification, such as adding a check bit to detect whether data was changed during storage or transmission

④ Convenient use of data, such as presenting data in an easy-to-read form so that it can be used simply and efficiently

⑤ Reduced data communication overhead, such as replicating data at the different sites of a distributed database
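
As an illustration of purpose ③, the sketch below stores one extra parity bit alongside a payload and uses it to detect a change. It is a minimal example, not something taken from a particular database: the function names `parity_bit` and `is_intact` are hypothetical, and a single parity bit only detects an odd number of flipped bits.

```python
def parity_bit(data: bytes) -> int:
    """Return 1 if the number of set bits in data is odd, else 0."""
    ones = sum(bin(byte).count("1") for byte in data)
    return ones % 2


def is_intact(data: bytes, check: int) -> bool:
    """Recompute the parity and compare it with the stored check bit."""
    return parity_bit(data) == check


# The check bit is redundant information stored next to the payload.
payload = b"redundancy"
check = parity_bit(payload)

# Simulate a single bit flipped during storage or transmission.
corrupted = bytes([payload[0] ^ 0b00000001]) + payload[1:]

print(is_intact(payload, check))    # unchanged data passes the check
print(is_intact(corrupted, check))  # the flipped bit is detected
```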

2 Causes of data redundancy

A relational database is made up of tables (files). The structure of a table is defined by its attributes, and the table holds tuples (records) whose attributes take values from ranges of many types. Data redundancy in a relational database therefore arises in four forms: repeated tables, repeated attributes, repeated tuples, and repeated attribute values.

2.1 Repeated tables

For data security, a backup table is needed so that data can be recovered when the master table is damaged. A distributed database often stores copies of a table at several sites to reduce data communication overhead. This redundancy is necessary, and the redundant data here must not be deleted. Duplicate tables created unnecessarily for other reasons should be deleted.

2.2 Repeated attributes

Attribute repetition has two cases: the same attribute appearing in different tables, and repeated attributes within a single table.

(1) Attributes repeated across different tables are used to establish links between the tables. Only one common attribute is needed for this; that redundancy is necessary and must not be deleted. Any further attributes shared between the tables should be deleted. Consider the following three tables:

T1(A, B, C); T2(A, B, D); T3(A, C, D, E).

Here attribute A is common to all three tables, attribute B is shared by T1 and T2, attribute C is shared by T1 and T3, and attribute D is shared by T2 and T3. If A is taken as the common attribute, then B needs to be kept in only one of T1 and T2, C in only one of T1 and T3, and D in only one of T2 and T3.
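
The analysis of T1, T2 and T3 can be reproduced mechanically. The following sketch, with hypothetical helper names `link_attributes` and `extra_shared`, finds the attribute common to every table and then lists what each pair of tables shares beyond it. It only reports candidates; deciding which table keeps each attribute is still a design decision.

```python
from itertools import combinations

# Table schemas from the example above.
tables = {
    "T1": ["A", "B", "C"],
    "T2": ["A", "B", "D"],
    "T3": ["A", "C", "D", "E"],
}


def link_attributes(tables):
    """Attributes present in every table (candidates for the common link)."""
    return set.intersection(*(set(cols) for cols in tables.values()))


def extra_shared(tables, link):
    """Map each pair of tables to the attributes they share besides the link."""
    result = {}
    for (name1, cols1), (name2, cols2) in combinations(tables.items(), 2):
        shared = (set(cols1) & set(cols2)) - link
        if shared:
            result[(name1, name2)] = sorted(shared)
    return result


link = link_attributes(tables)
duplicates = extra_shared(tables, link)
print(link)        # {'A'}
print(duplicates)  # {('T1', 'T2'): ['B'], ('T1', 'T3'): ['C'], ('T2', 'T3'): ['D']}
```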

(2) If several attributes within the same table hold the same content, the duplicates should be deleted unless they are needed for data security checks.

2.3 Repeated tuples

Different records in a table sometimes have identical contents. If this is not necessary, the duplicates should be deleted.
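
One way to remove identical records is sketched below using Python's built-in `sqlite3` module. The `staff` table and its columns are made up for the example, and the approach relies on SQLite's implicit `rowid` column, so other database systems would need a different way to tell the copies apart.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE staff (name TEXT, dept TEXT)")
conn.executemany(
    "INSERT INTO staff VALUES (?, ?)",
    [("Ann", "Sales"), ("Bob", "IT"), ("Ann", "Sales")],
)

# Keep the row with the smallest rowid in each group of identical tuples
# and delete every other copy.
conn.execute(
    """
    DELETE FROM staff
    WHERE rowid NOT IN (
        SELECT MIN(rowid) FROM staff GROUP BY name, dept
    )
    """
)

rows = conn.execute("SELECT name, dept FROM staff ORDER BY name").fetchall()
print(rows)  # [('Ann', 'Sales'), ('Bob', 'IT')]
```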

2.4 Repeated attribute values

Based on the characteristics of their value ranges, attributes can be divided into a limited class and an unlimited class.

(1) Repeated values of unlimited-class attributes. An unlimited-class attribute is one whose value range contains a number of values that is infinite or of the same order of magnitude as the number of records in the database, such as real numbers, integers, dates, and the various kinds of ID numbers.

Values of unlimited-class attributes may occasionally repeat, but this is only coincidence, not data redundancy.

(2) Repeated values of limited-class attributes. A limited-class attribute is one whose value range contains at least an order of magnitude fewer values than the number of records in the database, such as product name, department name, title name, or program name.

Repeated values of limited-class attributes are actually caused by a one-to-many relationship. They are sometimes treated as necessary redundancy and left unprocessed, which makes the data easier to view and the programs more efficient. When the amount of repetition is large, however, the redundancy can be compressed, usually by creating a new table and adapting the corresponding programs.
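
The "create a new table" step can be sketched as follows with Python's built-in `sqlite3` module. All table and column names (`employee`, `department`, `employee_n`, `dept_id`) are assumptions for illustration. The repeated department names are moved into a lookup table, and each record keeps only a small key that refers to it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Original table: the department name is repeated in every record.
conn.execute("CREATE TABLE employee (name TEXT, dept_name TEXT)")
conn.executemany(
    "INSERT INTO employee VALUES (?, ?)",
    [("Ann", "Sales"), ("Bob", "Sales"), ("Cy", "Engineering"), ("Di", "Sales")],
)

# New table: each distinct department name is stored exactly once.
conn.execute(
    "CREATE TABLE department (dept_id INTEGER PRIMARY KEY, dept_name TEXT UNIQUE)"
)
conn.execute(
    "INSERT INTO department (dept_name) SELECT DISTINCT dept_name FROM employee"
)

# Compressed table: records refer to the department by its key.
conn.execute(
    "CREATE TABLE employee_n (name TEXT, "
    "dept_id INTEGER REFERENCES department(dept_id))"
)
conn.execute(
    """
    INSERT INTO employee_n (name, dept_id)
    SELECT e.name, d.dept_id
    FROM employee e JOIN department d ON d.dept_name = e.dept_name
    """
)

dept_count = conn.execute("SELECT COUNT(*) FROM department").fetchone()[0]

# A join restores the original view, so no information is lost.
restored = conn.execute(
    """
    SELECT n.name, d.dept_name
    FROM employee_n n JOIN department d USING (dept_id)
    ORDER BY n.name
    """
).fetchall()
print(dept_count)
print(restored)
```

The trade-off matches the text: the data takes less space, but programs that read it now need a join, which is the "corresponding program" change.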

2.5 Redundancy with different causes is eliminated by operations at different levels.

① Redundancy caused by repeated tables is eliminated by file-level operations on disk

② Redundancy caused by repeated attributes is eliminated by changing the database structure

③ Redundancy caused by repeated tuples is eliminated by record-level operations

3 Disadvantages of data redundancy

① Storage space is wasted

② The efficiency of data exchange and database access can be reduced

③ Tables have to be updated in sync (fatal when a field is referenced in more than one place)
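
The synchronization problem in ③ can be shown with a small sketch. The data below is invented for the example: a customer's city is stored both in a `customers` mapping and in every order record, and updating only one copy leaves the others stale.

```python
# The city is stored redundantly: once per customer and again on every order.
customers = {"Ann": {"city": "Oslo"}}
orders = [
    {"order_id": 1, "customer": "Ann", "city": "Oslo"},
    {"order_id": 2, "customer": "Ann", "city": "Oslo"},
]

# Only one copy is updated; the copies on the orders are forgotten.
customers["Ann"]["city"] = "Bergen"

# Find order records whose redundant copy no longer matches the customer record.
stale = [
    order["order_id"]
    for order in orders
    if order["city"] != customers[order["customer"]]["city"]
]
print(stale)  # [1, 2]
```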

4 Advantages of data redundancy

① The efficiency of data exchange and database access can be improved

② Data can be recovered after data loss

In the end, whether data redundancy is good or bad depends on the situation, and the choice should be made rationally based on your own project.

Origin www.cnblogs.com/twelvezuo/p/11671201.html