Database design: denormalization (anti-normal-form) redundancy

What

Database courses at universities teach schema design according to the three normal forms. However, under the high concurrency and large data volumes typical of Internet services, denormalized (anti-normal-form) design is often required.

Why

The benefit of denormalization is that redundant storage reduces the computation needed at query time.

How

Consider an example: an order business.

  1. Order business
    Order(oid, info_detail) // order master table
    T(buyer_id, seller_id, oid) // buyer-seller relationship table
  2. What if the amount of data is large?
    Horizontal split
  3. How to split horizontally? How to satisfy the query?
    Order -> oid
    T -> buyer_id or seller_id?
    The order master table is easy to split (take oid modulo the number of databases), but what about the relationship table T? If it is split by buyer_id, buyer queries are solved (only one database needs to be read), but seller queries are not (multiple databases need to be read).
  4. Solution:
    What problem does the redundant table solve?
    a. The amount of data is large
    b. The data needs to be split horizontally
    c. How to satisfy queries on multiple fields of one schema?
    • Database level:
      a. Order(oid, info_detail)
      b. T1(buyer_id, seller_id, oid) // split by buyer_id; buyer queries read this table
      c. T2(seller_id, buyer_id, oid) // split by seller_id; seller queries read this table
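The split described above can be sketched in a few lines. This is a minimal illustration, not the original author's implementation: the shard count `NUM_SHARDS`, the helper `shard_of`, and the in-memory lists standing in for databases are all assumptions made for the example. It shows that once T1 is split by buyer_id and T2 by seller_id, each kind of query touches exactly one shard.

```python
NUM_SHARDS = 4  # assumed shard count, chosen only for illustration


def shard_of(key: int) -> int:
    """Pick a shard by taking the sharding key modulo the shard count."""
    return key % NUM_SHARDS


# In-memory stand-ins for the sharded tables (hypothetical, not a real database).
order_shards = [{} for _ in range(NUM_SHARDS)]  # Order, split by oid
t1_shards = [[] for _ in range(NUM_SHARDS)]     # T1, split by buyer_id
t2_shards = [[] for _ in range(NUM_SHARDS)]     # T2, split by seller_id


def insert_order(oid, buyer_id, seller_id, info_detail):
    """Store the order once and the relationship twice (the redundancy)."""
    order_shards[shard_of(oid)][oid] = info_detail
    t1_shards[shard_of(buyer_id)].append((buyer_id, seller_id, oid))
    t2_shards[shard_of(seller_id)].append((seller_id, buyer_id, oid))


def orders_of_buyer(buyer_id):
    """Buyer query: reads only the single T1 shard chosen by buyer_id."""
    rows = t1_shards[shard_of(buyer_id)]
    return [oid for b, _s, oid in rows if b == buyer_id]


def orders_of_seller(seller_id):
    """Seller query: reads only the single T2 shard chosen by seller_id."""
    rows = t2_shards[shard_of(seller_id)]
    return [oid for s, _b, oid in rows if s == seller_id]


insert_order(1, buyer_id=10, seller_id=21, info_detail="a")
insert_order(2, buyer_id=11, seller_id=21, info_detail="b")
insert_order(3, buyer_id=10, seller_id=22, info_detail="c")
```

The cost of this design is that every relationship row is stored twice, which is the redundancy the rest of the section has to keep consistent.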

    • Service level:
      Solution 1: Service synchronous redundancy
      Solution 2: Service asynchronous redundancy
      Solution 3: Offline asynchronous redundancy: an offline tool monitors the database binlog (for example, Alibaba's canal)
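The difference between Solutions 1 and 2 can be sketched as follows. The original diagrams are missing, so this is only one reading of the solution names: the in-memory lists `t1` and `t2`, the standard-library `queue.Queue` standing in for a message bus, and all function names are assumptions made for illustration. Solution 3 (binlog monitoring) is not shown.

```python
import queue

t1 = []             # stand-in for table T1 (buyer_id, seller_id, oid)
t2 = []             # stand-in for table T2 (seller_id, buyer_id, oid)
mq = queue.Queue()  # stand-in for a message bus (assumption)


def create_order_sync(buyer_id, seller_id, oid):
    """Synchronous redundancy: the service writes both tables before returning."""
    t1.append((buyer_id, seller_id, oid))
    t2.append((seller_id, buyer_id, oid))


def create_order_async(buyer_id, seller_id, oid):
    """Asynchronous redundancy: write T1, then hand the T2 write to a queue."""
    t1.append((buyer_id, seller_id, oid))
    mq.put((buyer_id, seller_id, oid))


def drain_redundancy_queue():
    """Worker side: apply every queued message to T2; returns how many were applied."""
    applied = 0
    while True:
        try:
            buyer_id, seller_id, oid = mq.get_nowait()
        except queue.Empty:
            return applied
        t2.append((seller_id, buyer_id, oid))
        applied += 1
```

In the synchronous version the caller waits for both writes; in the asynchronous version the caller returns after the first write, and T2 lags behind until the worker catches up.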

    • How to decide the order in which the forward table and the reverse (redundant) table are written? For example, should a new order be inserted into T1 first or into T2 first?
      This question generalizes to: for an operation that cannot be made transactional, which task should be done first and which later?
      Methodology: if atomicity is broken and an inconsistency occurs, do first whichever task leaves the smaller business impact when the other one fails.

      For example, with T1 and T2: if T1 is written first, the buyer can see the order immediately; the risk is that the T2 write fails, so the seller cannot see the order, and the impact is small. If T2 is written first, the seller can see the order immediately; the risk is that the T1 write fails, so the buyer cannot see the order, and the impact is greater (for a to-C application, a bad experience on the consumer side is hard to tolerate). So the final solution is to write T1 first, then T2.
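The write order argued for above can be illustrated with a simulated failure. This is a sketch under stated assumptions: the `t2_fails` flag, the `WriteFailed` exception, and the list-backed tables are hypothetical devices for the example, not part of any real system.

```python
class WriteFailed(Exception):
    """Raised by the simulated T2 write (hypothetical, for illustration)."""


def create_order(t1, t2, buyer_id, seller_id, oid, t2_fails=False):
    # Step 1: write the buyer-facing table first, so the buyer always sees the order.
    t1.append((buyer_id, seller_id, oid))
    # Step 2: write the seller-facing table; this write may fail independently.
    if t2_fails:
        raise WriteFailed(f"T2 write failed for oid={oid}")
    t2.append((seller_id, buyer_id, oid))


t1, t2 = [], []
try:
    create_order(t1, t2, buyer_id=1, seller_id=2, oid=100, t2_fails=True)
except WriteFailed:
    pass  # the missing T2 row is left for a later consistency check to repair
```

After the failed call, T1 holds the order and T2 does not: the buyer sees the order and the seller temporarily does not, which is the lower-impact inconsistency.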

    • How to ensure consistency?
      Methodology: Eventual consistency
      Option 1: Full data scan: stable, and usable as a fallback solution. The disadvantage is that it is slow and cannot be real-time; it is generally run once a day.
      Option 2: Incremental log scan: fast, and suitable for most scenarios; it can be run once an hour.
      Option 3: Real-time message detection: the best real-time performance, detecting inconsistencies within seconds. Consider it only for scenarios, such as payments, that demand high real-time performance.
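A full scan (Option 1) can be sketched as a comparison of the two tables followed by a repair. This is a minimal sketch, assuming both tables fit in memory as lists of tuples and that T1 is written first, so missing rows can only be on the T2 side; the function names are hypothetical.

```python
def find_missing_in_t2(t1_rows, t2_rows):
    """Return the T1 rows (buyer_id, seller_id, oid) that have no counterpart in T2."""
    # Normalize T2 rows (seller_id, buyer_id, oid) to T1's column order for comparison.
    t2_keys = {(buyer_id, seller_id, oid) for seller_id, buyer_id, oid in t2_rows}
    return [row for row in t1_rows if row not in t2_keys]


def repair(t1_rows, t2_rows):
    """Append every missing row to T2; returns the number of rows repaired."""
    missing = find_missing_in_t2(t1_rows, t2_rows)
    for buyer_id, seller_id, oid in missing:
        t2_rows.append((seller_id, buyer_id, oid))
    return len(missing)


t1_rows = [(1, 2, 100), (1, 3, 101)]
t2_rows = [(2, 1, 100)]          # the row for oid 101 was never written
repaired = repair(t1_rows, t2_rows)
```

An incremental scan (Option 2) would apply the same comparison only to rows changed since the previous run instead of to the whole table.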

Origin blog.csdn.net/hudmhacker/article/details/108573169