Supporting the Industrial Internet of Things: Building the DWB Fact Layer of an Industrial Big Data Warehouse [17]

Building the DWB fact layer of the data warehouse

01: Project review

  1. What subject areas are in the project?

    • Service Domain: Ticket Topic, Installation Topic
    • Client Domain: Client Matter Topic
    • Warehousing Domain: Material Topic
    • Operational Domain: Hours Topic
    • Market Domain: Ticket Topic
  2. What are the core dimensions in the project?

    • Time dimension
    • Administrative region dimension
    • Gas station dimension
    • Service outlet dimension
    • Organization dimension
    • Logistics dimension
    • Warehouse dimension
    • ……
  3. What are the core fields in the administrative region dimension?

    province id    province name    city id    city name    county id    county name    town id    town name
    
    • Fully synchronized to the data warehouse, either periodically or whenever the data changes
  4. What are the core fields in the time dimension?

    year    quarter    month    week    day    day of year    day of week    workday    holiday    Sunday
    
    • Each year, the time dimension rows for the following year are generated in advance and loaded into the data warehouse incrementally
  5. What are the core fields in the service outlet dimension?

    service outlet id    code    name    province    city    county    organization id    organization name
    
  6. What are the core fields in the gas station dimension?

    gas station id    gas station code    gas station name    province    city    county    town    customer id    customer name    company id    company name
    
  7. What are the core fields in the organization dimension?

    engineer id    engineer name    position id    position name    department id    department name
    
  8. Common problems

    • DG cannot connect: a faulty YARN process prevents ThriftServer from running. Processes to check:
      • Hadoop: NameNode, DataNode, ResourceManager, NodeManager
      • Hive: Metastore, HiveServer2
      • Spark: ThriftServer
    • Exception: ProtocolBuffer mismatch: dim_date
      • Cause: the data file does not match the table definition
      • step1: Check the table creation syntax
      • step2: Check the data file: the wrong file may have been uploaded
    • Syntax + Function + Data Relationship
      • Syntax + Functions: Calculations
      • Data Relationships: Logic
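
As a rough illustration of item 4 above (generating the next year's time dimension in advance), the sketch below builds one row per calendar day. The function name `build_date_dim` and the English field names are assumptions made for this example, not the project's actual table definition. The holiday flag is left out because it needs an external holiday calendar, and the workday flag here only excludes weekends.

```python
from datetime import date, timedelta


def build_date_dim(year):
    """Return one dict per calendar day of `year` (hypothetical field names)."""
    rows = []
    day = date(year, 1, 1)
    while day.year == year:
        iso_weekday = day.isoweekday()  # 1 = Monday ... 7 = Sunday
        rows.append({
            "date": day.isoformat(),
            "year": day.year,
            "quarter": (day.month - 1) // 3 + 1,
            "month": day.month,
            "week_of_year": day.isocalendar()[1],  # ISO week number
            "day_of_month": day.day,
            "day_of_year": day.timetuple().tm_yday,
            "day_of_week": iso_weekday,
            # Assumption: weekdays count as workdays; public holidays are ignored.
            "is_workday": iso_weekday <= 5,
            "is_sunday": iso_weekday == 7,
        })
        day += timedelta(days=1)
    return rows


# Generate the rows for one year; in the project these would be appended
# to the date dimension table as that year's increment.
rows_2022 = build_date_dim(2022)
```

Generating a full year at a time keeps each load a pure append, which matches the incremental loading described above.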

02: Project goals

  • Overall goal: build the DWB layer of the data warehouse: topic-based transaction fact tables

    • Build the facts for each core topic: SQL implementation + topic indicators

      • Original transaction fact data [DWD]: order data

        o001		userid1		2021-01-01	200.00
        
      • Topic transaction fact data [DWB]: order topic

        o001		userid1		2021-01-01	total order amount: 200		total order count: 1
        
      • Topic periodic snapshot fact table: data application layer [ST: dimensions [DWS] + fact indicators [DWB]]

        2021-01-01	total order amount: xxxx		total order count: xxxx
        
  • Key content: SQL and data relationships
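
The order example above can be sketched end to end. The block below is only an illustration under stated assumptions: it uses Python's built-in `sqlite3` as a stand-in for the project's Hive/Spark SQL environment, the table and column names (`dwd_order`, `dwb_order`, `st_order_day`, and so on) are hypothetical, and the second order row is invented so that the daily aggregate is not trivial.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- DWD: original transaction facts, one row per order
CREATE TABLE dwd_order (order_id TEXT, user_id TEXT, order_date TEXT, amount REAL);
INSERT INTO dwd_order VALUES
  ('o001', 'userid1', '2021-01-01', 200.00),
  ('o002', 'userid2', '2021-01-01', 50.00);  -- hypothetical extra row

-- DWB: same grain, with the basic indicators attached to each order
CREATE TABLE dwb_order AS
SELECT order_id, user_id, order_date,
       amount AS total_amount,
       1      AS order_count
FROM dwd_order;

-- ST: periodic snapshot, aggregated per day from the DWB indicators
CREATE TABLE st_order_day AS
SELECT order_date,
       SUM(total_amount) AS total_amount,
       SUM(order_count)  AS order_count
FROM dwb_order
GROUP BY order_date;
""")

dwb_rows = conn.execute("SELECT * FROM dwb_order ORDER BY order_id").fetchall()
st_rows = conn.execute("SELECT * FROM st_order_day").fetchall()
```

Keeping `order_count` as a per-row indicator in the DWB table means the ST layer only has to sum columns, whatever dimension it groups by.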

03: Layered review

  • Goal: review the layered design of the One-Stop Manufacturing project

  • Implementation

    • ODS layer: raw data layer: 101 tables: AVRO
    • DWD layer: detailed data layer: 101 tables: ORC
    • DWS layer: dimension data layer: dimension table
    • DWB layer: light aggregation layer: joins + basic indicators
  • summary

    • Reviewed the layered design of the One-Stop Manufacturing project

04: Design of DWB layer

  • Goal: Master the design of DWB layer

  • path

    • step1: function
    • step2: source
    • step3: Requirements
  • Implementation

    • Function: store the transaction fact data that each fact topic needs, together with lightly aggregated results, so that the ST layer can combine them with the DWS dimensions and aggregate them into each topic's final indicators
      • Association: join the fields a fact topic needs into a single fact table, building topic-based facts
      • Aggregation: lightly aggregate the commonly used basic indicators from the fine-grained detail data
    • Source: joins or light aggregations of the data in the DWD layer
    • Requirements: build the DWB-layer data for each topic according to how the One-Stop Manufacturing business is divided into topics
  • summary

    • Master the design of DWB layer
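
The two DWB operations described above, association and light aggregation, can be sketched together. Everything named in this block is hypothetical: the DWD tables `dwd_work_order` and `dwd_install_detail`, their columns, and the sample rows are invented for illustration, and `sqlite3` again stands in for the project's actual SQL engine.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Hypothetical DWD detail tables; names and columns are illustrative only.
CREATE TABLE dwd_work_order (wo_id TEXT, station_id TEXT, create_date TEXT);
CREATE TABLE dwd_install_detail (wo_id TEXT, device_id TEXT);
INSERT INTO dwd_work_order VALUES
  ('w1', 's1', '2021-01-01'),
  ('w2', 's2', '2021-01-01');
INSERT INTO dwd_install_detail VALUES
  ('w1', 'd1'), ('w1', 'd2'), ('w2', 'd3');

-- Association: join the fields the topic needs into one fact table.
-- Light aggregation: count installed devices per work order.
CREATE TABLE dwb_install AS
SELECT w.wo_id, w.station_id, w.create_date,
       COUNT(d.device_id) AS install_device_num
FROM dwd_work_order AS w
LEFT JOIN dwd_install_detail AS d ON d.wo_id = w.wo_id
GROUP BY w.wo_id, w.station_id, w.create_date;
""")

dwb_install = conn.execute("SELECT * FROM dwb_install ORDER BY wo_id").fetchall()
```

The `LEFT JOIN` keeps work orders that have no detail rows, with a count of zero, so the fact table does not silently lose orders.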

Origin blog.csdn.net/xianyu120/article/details/131939122