Database - relational data theory

Relational data theory

  This article reviews the basics of relational database theory.

  References:

  https://blog.csdn.net/prdslf001001/article/details/80336835

  https://www.bilibili.com/video/av73467859/

  https://www.bilibili.com/video/BV1eE411a79r/

First, the problems caused by data redundancy

  1) Redundant storage: the same information is stored repeatedly, which wastes storage space.

  2) Update anomalies: when one copy of duplicated information is modified, every other copy must be modified in the same way. On each update the system must therefore either pay a high price to keep the database consistent or accept the risk of inconsistent data.

  3) Insertion anomalies: some information can be stored only if certain other information has already been stored in the database.

  4) Deletion anomalies: deleting some information may cause other information to be lost along with it.

 

Second, the definition of functional dependency

  1, functional dependency

  In a relation R, if any two tuples that agree on the attribute (or attribute set) A also agree on the attribute (or attribute set) B, we write A -> B and say that A functionally determines B, or that B is functionally dependent on A.
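  The definition can be checked mechanically on a concrete relation instance. The following is a minimal sketch; the function name fd_holds, the column names and the sample rows are made up for illustration and are not part of the original article.

```python
def fd_holds(rows, lhs, rhs):
    """Return True if lhs -> rhs holds in rows (a list of dicts).

    Two tuples that agree on lhs must also agree on rhs.
    """
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in lhs)
        value = tuple(row[a] for a in rhs)
        # setdefault stores the first rhs value seen for this lhs value
        if seen.setdefault(key, value) != value:
            return False
    return True


# Hypothetical sample data: student number, course number, name, grade
rows = [
    {"sno": 1, "cno": "db", "name": "Ann", "grade": 90},
    {"sno": 1, "cno": "os", "name": "Ann", "grade": 75},
    {"sno": 2, "cno": "db", "name": "Bob", "grade": 90},
]

print(fd_holds(rows, ["sno"], ["name"]))          # True
print(fd_holds(rows, ["sno"], ["grade"]))         # False
print(fd_holds(rows, ["sno", "cno"], ["grade"]))  # True
```

  Note that a single instance can only refute a dependency; whether a dependency really holds is a statement about the schema, not about one set of rows.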

 

 

   2, trivial and non-trivial functional dependencies

  A functional dependency X -> Y is trivial when Y is a subset of X. A trivial functional dependency holds in every relational schema, so it carries no new semantics. Unless otherwise stated, we always discuss non-trivial functional dependencies.

 

 

   3, full and partial functional dependencies

  Full functional dependency: (student number, course number) -> grade. The student number alone cannot determine the grade, and neither can the course number alone; only the two together determine it.

  Partial functional dependency: (student number, course number) -> name. The pair determines the name, but the student number alone already determines it.
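  The difference between the two cases can be sketched in code. The sketch below assumes the dependencies are given as (left side, right side) pairs of attribute names; the names closure, is_full_dependency, sno, cno and the dependency list itself are illustrative assumptions.

```python
def closure(attrs, fds):
    """All attributes determined by attrs under fds (a list of (lhs, rhs) pairs)."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def is_full_dependency(lhs, rhs, fds):
    """True if lhs -> rhs holds and no proper subset of lhs determines rhs."""
    x, y = set(lhs), set(rhs)
    if not y <= closure(x, fds):
        return False  # lhs does not determine rhs at all
    # If any proper subset determined rhs, so would some subset that
    # drops exactly one attribute, so checking those is enough.
    return all(not y <= closure(x - {a}, fds) for a in x)


# Assumed dependencies for the student example
fds = [(("sno",), ("name",)), (("sno", "cno"), ("grade",))]

print(is_full_dependency(("sno", "cno"), ("grade",), fds))  # True
print(is_full_dependency(("sno", "cno"), ("name",), fds))   # False
```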

 

 

   4, transitive functional dependency

  Student number -> department number, department number -> department head; the department head is transitively dependent on the student number.

 

 

 Third, functional dependency theory

  1, keys, superkeys, candidate keys and the primary key

  A key (often rendered literally as "code") is a set of one or more attributes.

  A superkey is a set of one or more attributes that, taken together, uniquely identify an entity within an entity set.

  A candidate key is a minimal superkey: it is itself a superkey, and no proper subset of it is a superkey.

  The primary key is the candidate key chosen to distinguish the tuples of a relation.

  

  Determining the candidate keys:

  Let the relation schema R have the attribute set U = {A, B, C, ..., N}. With respect to the functional dependencies (FDs), the attributes of U fall into four classes:

  (1) attributes that appear on both the left and the right side;
  (2) attributes that appear only on the left side;
  (3) attributes that appear only on the right side;
  (4) attributes that appear on neither side.

  Algorithm: find the candidate keys with the following steps:
  1. An attribute that appears only on the right side of the FDs belongs to no candidate key;
  2. An attribute that appears only on the left side of the FDs belongs to every candidate key;
  3. An attribute that appears on neither side belongs to every candidate key;

  4. Let X be the attributes from steps 2 and 3. If the closure of X equals U, X is the only candidate key. Otherwise add the remaining attributes (class 1) to X one combination at a time, smallest first, and compute the closure; every minimal combination whose closure equals U is a candidate key.

  Example 1: R <U, F>, U = (A, B, C, D, E, G), F = {AB -> C, CD -> E, E -> A, A -> G}; find the candidate keys and the prime attributes.

  G appears only on the right side, so no candidate key contains G. B and D appear only on the left side, so every candidate key contains BD. The closure of BD is just BD, so BD has to be combined with other attributes; apart from G, these are A, C and E.


    First ABD:
    ABD contains itself, and AB -> C, CD -> E, A -> G, so the closure of ABD is ABCDEG = U.
    Then BCD:
    BCD contains itself, and CD -> E, E -> A, A -> G, so the closure of BCD is ABCDEG = U.
    Finally BDE:
    BDE contains itself, and E -> A, A -> G, AB -> C, so the closure of BDE is ABCDEG = U.

    Since the closures of ABD, BCD and BDE all equal ABCDEG, there are three candidate keys: ABD, BCD and BDE.

   Candidate keys: ABD, BCD, BDE;

  Prime attributes (attributes that appear in at least one candidate key): A, B, C, D, E;

  Non-prime attributes: G;
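  The search described above can be sketched as follows. This is only an illustration under the assumption that every attribute is a single letter and every dependency is a (left side, right side) pair of strings; the function names are not from the original article.

```python
from itertools import combinations


def closure(attrs, fds):
    """All attributes determined by attrs under fds (a list of (lhs, rhs) pairs)."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def candidate_keys(attrs, fds):
    attrs = set(attrs)
    left = set().union(*(set(lhs) for lhs, _ in fds))
    right = set().union(*(set(rhs) for _, rhs in fds))
    core = attrs - right            # left-only or in no FD: in every key
    optional = sorted(left & right)  # attributes that appear on both sides
    keys = []
    # Try the smallest combinations first so that only minimal sets are kept.
    for size in range(len(optional) + 1):
        for extra in combinations(optional, size):
            x = core | set(extra)
            if any(k <= x for k in keys):
                continue  # x contains a smaller key, so it is not minimal
            if closure(x, fds) == attrs:
                keys.append(x)
    return keys


attrs = "ABCDEG"
fds = [("AB", "C"), ("CD", "E"), ("E", "A"), ("A", "G")]
keys = candidate_keys(attrs, fds)
prime = set().union(*keys)

print(sorted("".join(sorted(k)) for k in keys))  # ['ABD', 'BCD', 'BDE']
print(sorted(prime))                             # ['A', 'B', 'C', 'D', 'E']
print(sorted(set(attrs) - prime))                # ['G']
```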
  

  2, Armstrong's axioms

  Given a relational schema R <U, F>, where U is the set of attributes and F is a set of functional dependencies on U, the following inference rules hold:

  ① A1 reflexivity: if Y⊆X⊆U, then X → Y is implied by F; for example ABC → AB and AB → A (trivial functional dependencies);
  ② A2 augmentation: if X → Y is implied by F and Z⊆U, then XZ → YZ is implied by F;
  ③ A3 transitivity: if X → Y and Y → Z are implied by F, then X → Z is implied by F.
  From these three rules, the following three rules can be derived:
  ④ union rule: if X → Y and X → Z, then X → YZ is implied by F;
  ⑤ pseudo-transitivity rule: if X → Y and WY → Z, then XW → Z is implied by F; for example, A → B gives AC → BC by augmentation, and together with BC → D this yields AC → D;
  ⑥ decomposition rule: if X → Y and Z⊆Y, then X → Z is implied by F; for example, from A → BC we can derive A → B and A → C.
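  To see how rules ④ to ⑥ follow from A1 to A3, the three axioms can be written as small functions and chained. This is a sketch for illustration only; the helper names (fd, reflexivity, augmentation, transitivity, show) are assumptions, and a dependency is represented as a pair of frozensets.

```python
def fd(lhs, rhs):
    """Build a dependency lhs -> rhs from two strings of single-letter attributes."""
    return (frozenset(lhs), frozenset(rhs))


def reflexivity(x, y):
    """A1: if Y is a subset of X, then X -> Y."""
    x, y = frozenset(x), frozenset(y)
    if not y <= x:
        raise ValueError("reflexivity needs Y to be a subset of X")
    return (x, y)


def augmentation(dep, z):
    """A2: from X -> Y derive XZ -> YZ."""
    x, y = dep
    z = frozenset(z)
    return (x | z, y | z)


def transitivity(dep1, dep2):
    """A3: from X -> Y and Y -> Z derive X -> Z."""
    (x, y), (y2, z) = dep1, dep2
    if y != y2:
        raise ValueError("right side of the first FD must equal left side of the second")
    return (x, z)


def show(dep):
    return "".join(sorted(dep[0])) + " -> " + "".join(sorted(dep[1]))


# Pseudo-transitivity: A -> B and BC -> D give AC -> D
step = augmentation(fd("A", "B"), "C")           # AC -> BC
pseudo = transitivity(step, fd("BC", "D"))
print(show(pseudo))                              # AC -> D

# Union: A -> B and A -> C give A -> BC
first = augmentation(fd("A", "B"), "A")          # A -> AB
second = augmentation(fd("A", "C"), "B")         # AB -> BC
union = transitivity(first, second)
print(show(union))                               # A -> BC

# Decomposition: A -> BC gives A -> B
decomp = transitivity(fd("A", "BC"), reflexivity("BC", "B"))
print(show(decomp))                              # A -> B
```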

  3, attribute closure

  The closure of an attribute set is the set of all attributes that it determines, directly or indirectly.

  For example: F = {a -> b, b -> c, a -> d, e -> f}; a gives b and d directly and c indirectly, so the closure of a is {a, b, c, d}.

  Given the relation R (A1, A2, A3, A4, A5, A6) and the functional dependency set F = {(A2, A3) -> A4, A3 -> A6, (A2, A5) -> A1}, the closure of (A2, A3) with respect to F is {A2, A3, A4, A6}: A2 and A3 give A4, and A3 gives A6.

  Given the relation R (A, B, C, D, E, F, G) and the functional dependency set F = {A -> B, B -> D, AD -> EF, AG -> C}, the closure of A with respect to F is {A, B, D, E, F}: A gives B, B gives D, and AD gives EF.
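  The closure can be computed by repeatedly applying every dependency whose left side is already covered. The sketch below reproduces the three examples; the function name closure and the representation (a string for single-letter attributes, a tuple for longer names) are assumptions made for this illustration.

```python
def closure(attrs, fds):
    """All attributes determined by attrs under fds (a list of (lhs, rhs) pairs)."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            # Apply lhs -> rhs once its whole left side has been reached
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


f1 = [("a", "b"), ("b", "c"), ("a", "d"), ("e", "f")]
print(sorted(closure("a", f1)))            # ['a', 'b', 'c', 'd']

# Multi-character attribute names are passed as tuples, not strings
f2 = [(("A2", "A3"), ("A4",)), (("A3",), ("A6",)), (("A2", "A5"), ("A1",))]
print(sorted(closure(("A2", "A3"), f2)))   # ['A2', 'A3', 'A4', 'A6']

f3 = [("A", "B"), ("B", "D"), ("AD", "EF"), ("AG", "C")]
print(sorted(closure("A", f3)))            # ['A', 'B', 'D', 'E', 'F']
```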

  4, minimal set of functional dependencies (canonical cover)

  1, definition:
  A set of functional dependencies F is called a minimal set of functional dependencies (also a minimal cover or minimal dependency set) if it satisfies the following conditions:

  (1) The right side of every functional dependency in F contains exactly one attribute.

  (2) F contains no functional dependency X → A such that F and F - {X → A} are equivalent.

  (3) F contains no functional dependency X → A such that X has a proper subset Z for which (F - {X → A}) ∪ {Z → A} and F are equivalent.

  2, a common algorithm for computing a minimal dependency set:
  ① Use the decomposition rule so that the right side of every functional dependency in F has exactly one attribute;

  ② Remove redundant functional dependencies: starting from the first one, take a dependency X → Y out of F and compute the closure X+ under the remaining dependencies. If X+ contains Y, then X → Y is redundant and is removed; otherwise it is put back. Repeat in order until no redundant functional dependency is left;

  ③ Remove redundant attributes from each left side. Check every functional dependency whose left side has more than one attribute. For example, for XY → A, to decide whether Y is redundant, ask whether X → A can replace XY → A: if A belongs to X+, then Y is redundant and can be removed. (The closure is computed over the dependency set F obtained in the previous step.)

  3, worked example of a minimal dependency set:
  Example 1: in the relational schema R (U, F), U = ABCDEG, F = {B-> D, DG-> C, BD-> E, AG-> B, ADG-> BC}; find a minimal set of functional dependencies for F.

  Steps:

  (1) Use the decomposition rule so that the right side of every functional dependency in F has exactly one attribute. This gives: F = {B-> D, DG-> C, BD-> E, AG-> B, ADG-> B, ADG-> C};

  (2) Remove redundant functional dependencies: starting from the first one, take a dependency X → Y out of F and compute the closure X+ under the remaining dependencies, continuing in order until no redundant functional dependency is left;

    ① Remove B-> D, so F = {DG-> C, BD-> E, AG-> B, ADG-> B, ADG-> C}. Under these dependencies B+ = B, which does not contain D, so B-> D is kept.

    ② Remove DG-> C, so F = {B-> D, BD-> E, AG-> B, ADG-> B, ADG-> C}. Now DG+ = DG, which does not contain C, so DG-> C is kept.

    ③ Remove BD-> E, so F = {B-> D, DG-> C, AG-> B, ADG-> B, ADG-> C}. Now BD+ = BD, which does not contain E, so BD-> E is kept.

    ④ Remove AG-> B, so F = {B-> D, DG-> C, BD-> E, ADG-> B, ADG-> C}. Now AG+ = AG, which does not contain B, so AG-> B is kept.

    ⑤ Remove ADG-> B, so F = {B-> D, DG-> C, BD-> E, AG-> B, ADG-> C}. Now ADG+ = ADGCBE, which contains B, so ADG-> B is redundant and is deleted.

    ⑥ Remove ADG-> C, so F = {B-> D, DG-> C, BD-> E, AG-> B}. Now ADG+ = ADGCBE, which contains C, so ADG-> C is redundant and is deleted.

    After this step, F = {B-> D, DG-> C, BD-> E, AG-> B};

  (3) Remove redundant attributes from each left side. Check every functional dependency whose left side has more than one attribute.

  The functional dependencies with a composite left side are DG-> C, BD-> E and AG-> B, so we proceed as follows:

    ① First DG-> C. Is D redundant? Compute the closure of DG - D = G: G+ = G, which does not contain C, so D is kept. Is G redundant? Compute the closure of DG - G = D: D+ = D, which does not contain C, so G cannot be removed either;

    ② Next BD-> E. Is B redundant? Compute the closure of BD - B = D: D+ = D, which does not contain E, so B is kept. Is D redundant? Compute the closure of BD - D = B: B+ = BDE, which contains E, so D is removed and the dependency becomes B-> E.

    ③ Finally AG-> B. Is A redundant? Compute the closure of AG - A = G: G+ = G, which does not contain B, so A cannot be removed. Is G redundant? Compute the closure of AG - G = A: A+ = A, which does not contain B, so G cannot be removed and AG-> B stays as it is;

  Conclusion: a minimal set of functional dependencies for F is F = {B-> D, DG-> C, B-> E, AG-> B};
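  The three steps can be sketched in code as follows. The sketch follows the order used in this article (redundant dependencies first, then redundant left-side attributes) and, as a precaution that is not part of the article, repeats the redundancy check at the end, because shortening a left side can make another dependency redundant. Function names and the string representation of single-letter attributes are assumptions.

```python
def closure(attrs, fds):
    """All attributes determined by attrs under fds (a list of (lhs, rhs) pairs)."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def drop_redundant(fds):
    """Step 2: remove every dependency that the remaining ones already imply."""
    i = 0
    while i < len(fds):
        lhs, attr = fds[i]
        rest = fds[:i] + fds[i + 1:]
        if attr in closure(lhs, rest):
            fds = rest   # redundant: leave it out
        else:
            i += 1       # needed: keep it and move on
    return fds


def minimal_cover(fds):
    # Step 1: one attribute on every right side
    cover = [(frozenset(lhs), attr) for lhs, rhs in fds for attr in rhs]
    # Step 2: remove redundant dependencies
    cover = drop_redundant(cover)
    # Step 3: remove redundant attributes from composite left sides
    for i, (lhs, attr) in enumerate(cover):
        for b in sorted(lhs):
            reduced = cover[i][0] - {b}
            if reduced and attr in closure(reduced, cover):
                cover[i] = (reduced, attr)
    # Precaution (not in the article): re-check redundancy after step 3
    cover = drop_redundant(cover)
    return [("".join(sorted(lhs)), attr) for lhs, attr in cover]


fds = [("B", "D"), ("DG", "C"), ("BD", "E"), ("AG", "B"), ("ADG", "BC")]
print(minimal_cover(fds))
# [('B', 'D'), ('DG', 'C'), ('B', 'E'), ('AG', 'B')]
```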

 

  5, lossless join decomposition  

  1) The table test

  [Figure missing in the source: worked illustration of the table test.]


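   2) Lossless join theorem

  A decomposition of R into two schemas R1 and R2 is a lossless join decomposition if and only if F implies (R1∩R2) → (R1-R2) or (R1∩R2) → (R2-R1).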

  Case (1): relational schema R (SAIP), F = {S-> A, SI-> P}; ρ = {R1 (SA), R2 (SIP)}. Is this a lossless join decomposition?

  R1∩R2 = S; R1-R2 = A; R2-R1 = IP; so we need S -> A or S -> IP. S -> A is in F = {S-> A, SI-> P}, so the decomposition is lossless.

  Example (2): given R <U, F>, U = {A, B, C}, F = {A → B}, consider the following two decompositions:
  ① ρ1 = {AB, BC};

  ② ρ2 = {AB, AC};

  For ρ1: AB∩BC = B; AB-BC = A; BC-AB = C; so we need B → A or B → C. Neither is implied by F = {A → B}, so the decomposition ρ1 is lossy.

  For ρ2: AB∩AC = A; AB-AC = B; AC-AB = C; so we need A → B or A → C. A → B is in F = {A → B}, so the decomposition ρ2 is lossless.
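  For a decomposition into exactly two schemas, the check used in these examples reduces to one closure computation: the common attributes must determine everything in R1-R2 or everything in R2-R1. A minimal sketch, with assumed function names and single-letter attributes written as strings:

```python
def closure(attrs, fds):
    """All attributes determined by attrs under fds (a list of (lhs, rhs) pairs)."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def lossless_binary(r1, r2, fds):
    """Lossless-join test for a decomposition into two schemas r1 and r2."""
    r1, r2 = set(r1), set(r2)
    common = closure(r1 & r2, fds)
    # (R1 ∩ R2) -> (R1 - R2)  or  (R1 ∩ R2) -> (R2 - R1)
    return (r1 - r2) <= common or (r2 - r1) <= common


print(lossless_binary("SA", "SIP", [("S", "A"), ("SI", "P")]))  # True
print(lossless_binary("AB", "BC", [("A", "B")]))                # False
print(lossless_binary("AB", "AC", [("A", "B")]))                # True
```

  This shortcut only covers two schemas; a decomposition into three or more schemas needs the table test.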

  6, dependency-preserving decomposition

  Case (1): given the relational schema R <U, F>, U = {A, B, C, D, E}, F = {B → A, D → A, A → E, AC → B}, does the decomposition ρ = {R1 (ABCE), R2 (CD)} preserve the functional dependencies?

  B → A, A → E and AC → B all hold on R1, because all of their attributes are in R1, so they are preserved. D → A holds on neither R1 nor R2 alone, so we have to check further whether D → A is preserved.

  Start with result = D and, for each Ri in turn, compute t = (result∩Ri)+ ∩ Ri and add t to result.

  ① R1: result = D; result∩R1 = ф (empty set); so t = ф and result = D;

  ② R2: result = D; result∩R2 = D; D+ = ADE; D+∩R2 = D; so t = D and result = D;

  After one full pass result has not changed, so the final result is D. It does not contain A, so D → A is not preserved and the decomposition is not dependency-preserving.

 

  Case (2): given the relation R <U, F>, U = {A, B, C, D, E}, F = {A → C, B → C, C → D, DE → C, CE → A}, R is decomposed into R1 (AD), R2 (AB), R3 (BE), R4 (CDE), R5 (AE). Determine whether this decomposition preserves the functional dependencies.

  C → D and DE → C hold on R4 (CDE). A → C, B → C and CE → A do not hold on any single one of R1 ... R5, so further checking is needed.

  (1) A→C;

  ① R1: result = A; result∩R1 = A; A+ = ACD; A+∩R1 = AD; so t = AD and result = AD. result has changed; move on to R2;

  ② R2: result = AD; result∩R2 = A; A+ = ACD; A+∩R2 = A; result stays AD;

  ③ R3: result = AD; result∩R3 = ф; result stays AD;

  ④ R4: result = AD; result∩R4 = D; D+ = D; D+∩R4 = D; result stays AD;

  ⑤ R5: result = AD; result∩R5 = A; A+ = ACD; A+∩R5 = A; result stays AD;

  A second pass over R1 ... R5 leaves result = AD unchanged. It does not contain C, so A → C is not preserved, and the decomposition is not dependency-preserving.
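  The result/t procedure used in both cases can be sketched as a loop that keeps passing over the schemas until result stops growing. Function names are assumptions; attributes are single letters written as strings.

```python
def closure(attrs, fds):
    """All attributes determined by attrs under fds (a list of (lhs, rhs) pairs)."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def is_preserved(dep, fds, schemas):
    """Check whether dep = (lhs, rhs) is preserved by the decomposition."""
    lhs, rhs = dep
    result = set(lhs)
    changed = True
    while changed:  # repeat full passes until nothing is added
        changed = False
        for schema in schemas:
            schema = set(schema)
            t = closure(result & schema, fds) & schema
            if not t <= result:
                result |= t
                changed = True
    return set(rhs) <= result


def dependency_preserving(fds, schemas):
    return all(is_preserved(dep, fds, schemas) for dep in fds)


# Case (1)
f1 = [("B", "A"), ("D", "A"), ("A", "E"), ("AC", "B")]
rho1 = ["ABCE", "CD"]
print(is_preserved(("D", "A"), f1, rho1))   # False
print(dependency_preserving(f1, rho1))      # False

# Case (2)
f2 = [("A", "C"), ("B", "C"), ("C", "D"), ("DE", "C"), ("CE", "A")]
rho2 = ["AD", "AB", "BE", "CDE", "AE"]
print(is_preserved(("A", "C"), f2, rho2))   # False
print(is_preserved(("C", "D"), f2, rho2))   # True
```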

 

  7, normal forms

  (1): 1NF: every component is an indivisible data item (an atomic value). That is, there are no composite or multi-valued attributes.

  (2): 2NF: every non-prime attribute is fully functionally dependent on each candidate key. Note that the dependency is on the key as a whole (not on individual prime attributes); that is, no non-prime attribute is partially dependent on a key.

  (3): 3NF: no non-prime attribute is transitively dependent on a key.

  (4): BCNF: in addition, no prime attribute is partially or transitively dependent on a key. Test: the left side of every non-trivial functional dependency must contain a candidate key (a single prime attribute or a part of a key is not enough).
  How to determine the normal form:

  

 

   Example 1: R(A,B,C), F={A->B, B->A, A->C}

    L: none, R: C, LR: A, B (L = attributes only on the left side, R = only on the right side, LR = on both sides)

    A+ = ABC, so A is a candidate key

    B+ = ABC, so B is a candidate key

    Prime attributes: A, B; non-prime attributes: C

    1) Check whether a non-prime attribute is partially dependent on a candidate key. Both keys consist of a single attribute, so there is no partial dependency.

    2) Check whether a non-prime attribute is transitively dependent on a candidate key. We have B -> A -> C, so C looks transitively dependent on B. However, a transitive dependency requires that A -> B does not hold, and here it does, so this is not a transitive dependency.

    3) Check whether the left side of every dependency is a candidate key. The left sides are A, B and A, all of which are candidate keys, so R is in BCNF.

  Example 2: R(A,B,C,D), F={B->D, D->B, AB->C}

    L: A, R: C, LR: B, D

    The L attribute must be in every key; combining L with the LR attributes gives AB and AD

    Prime attributes: A, B, D; non-prime attributes: C

    AB+ = ABCD; AD+ = ABCD; therefore AB and AD are the candidate keys.

    1) Check partial dependencies. C is fully dependent on AB, so there is no partial dependency.

    2) Check transitive dependencies. C depends directly on the whole candidate key AB, so there is no transitive dependency.

    3) Check whether the left side of every dependency is a candidate key. The left sides are B, D and AB; B and D are not candidate keys, so R is in 3NF but not in BCNF.
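  The checks in the two examples can be sketched as one function that reports the highest normal form a schema satisfies (assuming 1NF as the baseline). The 3NF test here uses the equivalent textbook formulation: for every non-trivial dependency, either the left side is a superkey or the right side consists of prime attributes. All function names are assumptions; attributes are single letters.

```python
from itertools import combinations


def closure(attrs, fds):
    """All attributes determined by attrs under fds (a list of (lhs, rhs) pairs)."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def candidate_keys(attrs, fds):
    attrs = set(attrs)
    keys = []
    for size in range(1, len(attrs) + 1):
        for combo in combinations(sorted(attrs), size):
            x = set(combo)
            if any(k <= x for k in keys):
                continue  # contains a smaller key, so not minimal
            if closure(x, fds) >= attrs:
                keys.append(x)
    return keys


def normal_form(attrs, fds):
    attrs = set(attrs)
    keys = candidate_keys(attrs, fds)
    prime = set().union(*keys)
    nonprime = attrs - prime
    # Non-trivial part of every dependency: (left side, right side minus left side)
    deps = [(set(l), set(r) - set(l)) for l, r in fds if set(r) - set(l)]

    def is_superkey(x):
        return closure(x, fds) >= attrs

    if all(is_superkey(l) for l, _ in deps):
        return "BCNF"
    if all(is_superkey(l) or r <= prime for l, r in deps):
        return "3NF"
    # 2NF fails if a proper subset of some key determines a non-prime attribute
    for key in keys:
        for size in range(1, len(key)):
            for part in combinations(sorted(key), size):
                if closure(part, fds) & nonprime:
                    return "1NF"
    return "2NF"


print(normal_form("ABC", [("A", "B"), ("B", "A"), ("A", "C")]))      # BCNF
print(normal_form("ABCD", [("B", "D"), ("D", "B"), ("AB", "C")]))    # 3NF
```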

 

  8, schema decomposition

  3NF decomposition:

  Compute a minimal set of functional dependencies.

  Turn each functional dependency of the minimal set into its own schema; this gives a dependency-preserving 3NF decomposition.

  Add a candidate key to that dependency-preserving decomposition; this gives a 3NF decomposition that is also lossless.

  

  BCNF decomposition:

  R (A, B, C, D), F = {A-> B, C-> D}: look for a functional dependency whose left side is not a candidate key, here A-> B, and split the schema into two parts:

  1) AB

  2) ACD (B is dropped because it is determined by A)

  Continue by decomposing ACD in the same way.

  

  Example: R(A,B,C,D,E,F), F={AE->F, A->B, BC->D, CD->A, CE->F}

  L: C, E

  R: F

  LR: A, B, D

  The L attributes must be in every key; combining L with the LR attributes gives ACE, BCE, CDE.

  Prime attributes: A, B, C, D, E; non-prime attributes: F

  ACE+ = ABCDEF; BCE+ = ABCDEF; CDE+ = ABCDEF; so ACE, BCE and CDE are the candidate keys.

 

 

  Turn each functional dependency above into a schema in turn: AEF, AB, BCD, CDA, CEF.

  This gives the dependency-preserving 3NF decomposition: AEF, AB, BCD, CDA, CEF

  Add any one candidate key (here ACE is chosen).

  This gives the lossless 3NF decomposition: AEF, AB, BCD, CDA, CEF, ACE
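  The synthesis steps can be sketched as follows, assuming the dependency set passed in is already minimal (as it is in this example). The helper find_key picks one candidate key by shrinking the full attribute set; any candidate key would do, and this sketch happens to pick CDE where the text above chose ACE. Function names are assumptions.

```python
def closure(attrs, fds):
    """All attributes determined by attrs under fds (a list of (lhs, rhs) pairs)."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def find_key(attrs, fds):
    """Shrink the full attribute set to one candidate key."""
    key = set(attrs)
    for a in sorted(attrs):
        if closure(key - {a}, fds) >= set(attrs):
            key.discard(a)  # a is not needed to reach every attribute
    return key


def synthesize_3nf(attrs, minimal_fds):
    # One schema per dependency of the minimal set
    schemas = []
    for lhs, rhs in minimal_fds:
        schema = set(lhs) | set(rhs)
        if schema not in schemas:
            schemas.append(schema)
    # Drop schemas that are strictly contained in another schema
    schemas = [s for s in schemas if not any(s < t for t in schemas)]
    # Add a candidate key unless some schema already contains one
    if not any(closure(s, minimal_fds) >= set(attrs) for s in schemas):
        schemas.append(find_key(attrs, minimal_fds))
    return ["".join(sorted(s)) for s in schemas]


fds = [("AE", "F"), ("A", "B"), ("BC", "D"), ("CD", "A"), ("CE", "F")]
print(synthesize_3nf("ABCDEF", fds))
# ['AEF', 'AB', 'BCD', 'ACD', 'CEF', 'CDE']
```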

 

  For BCNF, the left sides of AE->F, A->B, BC->D, CD->A, CE->F are not all candidate keys, so R is decomposed step by step.

  First decomposition, on AE-> F:

  AEF; the remainder is R = (ABCDE), F = {A-> B, BC-> D, CD-> A} (F is dropped because it can be derived)

  Second decomposition, on A-> B:

  AB; the remainder is R = (ACDE), F = {CD-> A} (B is dropped because it can be derived; since B is no longer in R, BC-> D is dropped as well)

  Third decomposition, on CD-> A:

  CDA; the remainder is R = (CDE), F = {} (A is dropped because it can be derived)

  CDE is a candidate key, so the decomposition stops.

  Therefore the BCNF decomposition is AEF, AB, CDA, CDE
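  A BCNF decomposition is not unique; it depends on which violating dependency is split first. The sketch below uses the general textbook procedure: inside each schema it searches all attribute subsets for a violation, using closures so that dependencies implied on the sub-schema are found as well. It is an illustration with assumed function names, its search is exponential in the number of attributes, and because it picks violations in a different order it yields ACE where the hand calculation above yields CDE. Both results are valid BCNF decompositions.

```python
from itertools import combinations


def closure(attrs, fds):
    """All attributes determined by attrs under fds (a list of (lhs, rhs) pairs)."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def find_violation(schema, fds):
    """Return (X, Y) where X -> Y violates BCNF inside schema, or None."""
    for size in range(1, len(schema)):
        for combo in combinations(sorted(schema), size):
            x = set(combo)
            y = (closure(x, fds) & schema) - x
            # X determines something in the schema but is not a superkey of it
            if y and x | y != schema:
                return x, y
    return None


def bcnf_decompose(attrs, fds):
    pending, done = [set(attrs)], []
    while pending:
        schema = pending.pop()
        violation = find_violation(schema, fds)
        if violation is None:
            done.append(schema)
        else:
            x, y = violation
            pending.append(x | y)        # X together with what it determines
            pending.append(schema - y)   # the rest, which still contains X
    return done


fds = [("AE", "F"), ("A", "B"), ("BC", "D"), ("CD", "A"), ("CE", "F")]
result = bcnf_decompose("ABCDEF", fds)
print(sorted("".join(sorted(s)) for s in result))
# ['AB', 'ACD', 'ACE', 'AEF']
```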

  

  

Origin www.cnblogs.com/hoo334/p/12595336.html