An example of a big data platform development specification

1. Preface

This document covers, among other things, the specifications for handling internal and external platform requirements, code development, testing, and the launch process.

It is written from the perspective of the big data platform architecture team and gives example conventions for working with data warehouse and data analysis colleagues. It is for reference only.

2. Environment information

First, document the deployment method, hosts, and version of each component, so that other platform teams and data warehouse or data analysis colleagues can understand how the platform's foundation is deployed.
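One possible way to keep this information is a small machine-readable inventory. The sketch below is illustrative only: the field names, hosts, and deployment methods are hypothetical placeholders, and only the Doris version string is taken from the upgrade notice later in this document.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ComponentInfo:
    """One row of the platform inventory (field names are illustrative)."""
    name: str
    deployment: str   # how the component is deployed
    hosts: tuple      # hosts the component runs on
    version: str


# Hypothetical entries: hosts and deployment methods are placeholders,
# and "x.y.z" stands for a version that this document does not state.
INVENTORY = [
    ComponentInfo("doris-be", "bare metal", ("host-01", "host-02", "host-03"), "1.2.6-release"),
    ComponentInfo("dolphinscheduler", "bare metal", ("host-04",), "x.y.z"),
]


def describe(inventory):
    """Return one human-readable line per component."""
    return [
        f"{c.name} {c.version} ({c.deployment}) on {', '.join(c.hosts)}"
        for c in inventory
    ]


for line in describe(INVENTORY):
    print(line)
```

Keeping the inventory as data rather than free text makes it easy to publish the same information to a wiki page or to check it in a script.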

3. Requirement process

3.1 Main process

The overall requirement process is shown in the figure below.

[Figure: overall requirement process]

3.2 Requirement Initiation

3.2.1 Requirement Initiators

Requirements may come from, among others, the following roles.

  • Data warehouse development
  • Data analysis
  • Data product
  • The platform team itself (proactively identified needs)

3.2.2 Requirement Type

  • Business requirements: business needs that must be supported by a component, such as developing application scripts
  • Component feature requirements: functional extensions to a component, such as adding a connector or building a new tool
  • Component bug requirements: bugs caused by component design defects or version defects, which need to be resolved with the help of the community and the source code

3.2.3 Requirement Channels

  • Tapd/DevOps/Zen Tao (collaboration management platforms)
    • Mainly for component bug requirements. Example submission location: Data Platform_Online BUG Summary
  • Feishu/DingTalk/Enterprise WeChat (enterprise office platforms)
    • Mainly for urgent and important questions or needs. Feedback can be given through the DA (data demand entry) group, the question group, or private chat. Private chat is allowed, but the request must then be shared internally to avoid information gaps. Related issues are currently recorded in one shared place: new cluster issue record
  • Offline communication
    • Complex requirements can be discussed in person; they need to be broken down and confirmed. Example: data warehouse requirements records

3.2.4 Requirement contact

If the team is still at an early stage, a [ single unified contact ] is not recommended. Use a [ one primary, multiple secondaries ] strategy instead: besides the team lead, any other member can act as the requirement contact. Whoever the contact is, the requirement must be recorded in the [ requirement pool ] to avoid information gaps.

  • Team lead
  • Component owner
    • For the management of each component (technical support/Q&A), designate a primary owner and a secondary owner
  • Requirement pool
    • Recording requirements prevents them from being lost and makes them traceable. The requirement pool is recorded in: Big Data Platform Requirements Pool
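To make the rule concrete, a requirement pool entry can be modelled as a small record. This is a minimal sketch, not the format of any real tool: the class names, field names, and the in-memory list are all assumptions made for illustration, and the three requirement types mirror section 3.2.2.

```python
from dataclasses import dataclass, field
from datetime import date
from enum import Enum


class RequirementType(Enum):
    BUSINESS = "business requirement"
    COMPONENT_FEATURE = "component feature requirement"
    COMPONENT_BUG = "component bug requirement"


@dataclass
class Requirement:
    title: str
    req_type: RequirementType
    initiator: str   # e.g. data warehouse development, data analysis
    channel: str     # e.g. collaboration platform, office chat, in person
    contact: str     # whoever received the requirement
    created: date = field(default_factory=date.today)


class RequirementPool:
    """In-memory stand-in for the shared requirement pool."""

    def __init__(self):
        self._items = []

    def record(self, requirement):
        """Store the requirement and return a simple sequential ID."""
        self._items.append(requirement)
        return len(self._items)

    def by_contact(self, contact):
        """List every requirement received by the given contact."""
        return [r for r in self._items if r.contact == contact]


pool = RequirementPool()
req_id = pool.record(Requirement(
    title="Doris read/write failures caused by large SQL OOM",
    req_type=RequirementType.COMPONENT_BUG,
    initiator="data warehouse development",
    channel="office chat",
    contact="component owner",
))
print(req_id, len(pool.by_contact("component owner")))
```

The point of the sketch is that the contact is just a field on the record: any team member can receive a request, as long as it ends up in the same pool.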

3.3 Requirement processing

3.3.1 Internal Evaluation

  • Small requirements that can be answered and handled quickly can skip this step and the rest of the process. Example: Doris read/write failures caused by a large SQL query running out of memory (OOM)
  • For requirements that cannot be handled quickly, the contact organizes an ad hoc meeting for internal evaluation. Example: a Doris upgrade
  • Items to confirm during internal evaluation
    • Whether the requirement is reasonable and whether it needs follow-up
    • The priority of the requirement
    • Who will handle the requirement
3.3.2 Planning and scheduling

  • The requirement handler schedules the work according to its priority and reports the schedule to the relevant internal and external people
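Scheduling by priority can be sketched with a priority queue. This is an illustration under assumptions: the numeric priority scale (lower means more urgent) and the example titles are hypothetical, and a real team would normally do this in its collaboration platform.

```python
import heapq
from itertools import count


class Schedule:
    """Orders requirements by priority; ties keep arrival order."""

    def __init__(self):
        self._heap = []
        self._arrival = count()

    def add(self, title, priority):
        # Lower number = more urgent (an assumed convention).
        heapq.heappush(self._heap, (priority, next(self._arrival), title))

    def next_up(self):
        """Pop the most urgent requirement, or None if nothing is queued."""
        if not self._heap:
            return None
        _, _, title = heapq.heappop(self._heap)
        return title


schedule = Schedule()
schedule.add("Doris upgrade", priority=2)
schedule.add("Fix OOM on large SQL", priority=1)
schedule.add("New connector", priority=2)
order = [schedule.next_up() for _ in range(3)]
print(order)
```

The arrival counter is there so that two requirements with the same priority are handled in the order they were raised.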

3.3.3 Design review

  • The requirement handler produces a design, which can take the form of a list, numbered points, a flow chart, or a document
  • For complex requirements or projects, such as self-developed middleware or an application system foundation, produce an outline design first and a detailed design afterwards
  • The requirement handler organizes the review, which can be held at someone's desk or as a meeting
  • At least two people must take part; whether to invite the initiator depends on the requirement

3.3.4 Development delivery

  • Develop, test, and deliver according to the schedule. If other urgent work has to take priority, the delivery can be postponed by agreement
  • For the detailed process, see the [ development process ] below

4. Development process

This covers the development, testing, and launch process.

[Figure: development, testing, and launch process]

4.1 Development

  • If an outline or detailed design (flow chart) exists, it must be followed strictly
  • Develop according to the schedule and report progress weekly or every other day to avoid information gaps
  • Shared tools and APIs can be managed centrally in a private Maven repository
  • If the program fails testing, rework it immediately and then restart the subsequent steps

4.2 Code Review

  • Over time, the amount of code for component modifications, scripts, and APIs will grow, so code review (CR) deserves attention. It helps to:
    • Foster a technical culture within the team
    • Improve code quality and unify conventions
    • Avoid "authority obsession" and reinventing the wheel
  • CR takes time, so it must be included in the schedule

4.3 Testing

  • Test cases must be produced, including but not limited to self-tests, joint debugging, scenario verification, and read/write stress tests
  • The form of the test cases is not restricted: lists, numbered points, flow charts, or mind maps are all acceptable

4.4 Launch

  • The launch process must follow the [ Operation and Maintenance Specifications ]
  • Changes to components (bug fixes, parameter adjustments, component upgrades, restarts, etc.) must be announced in advance

4.4.1 Example of an upgrade notice

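[Loudspeaker][Loudspeaker] [Doris BE 2.0 Upgrade Notice]
@Recipients
Change window: 2023-08-08 12:12 to 2023-08-08 13:13
Change type: BE rolling upgrade
Version change: 1.2.6-release to 2.0-roc3
Change content: BE upgrade only
Reason for change:
1. Introduce new 2.0 features such as workgroup and inverted index
2. Use the new optimizer to improve overall query efficiency
3. ......
Test results: pre-upgrade test report
Rollback strategy: none
Expected impact: upstream Doris tasks may experience brief interruptions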

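4.4.2 Example of a tuning notice

[Loudspeaker][Loudspeaker] [Dolphinscheduler Tuning Notice]
@Recipients
Change window: 2023-08-08 13:13 to 2023-08-08 14:14
Change type: tuning and restart
Change content:
1. datasource changed to druid
2. Default connection size changed from 50 to 100
3. Added the ldap module; DS login now uses the SSO account and password
Expected impact: all DS scheduled tasks (automatically retried after the restart)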

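4.4.3 Example of a restart notice

[Loudspeaker][Loudspeaker] [Doris BE Emergency Restart Notice]
@Recipients
Change window: 2023-08-08 00:00 to 2023-08-08 01:01
Change type: restart
Change content: none
Reason for change: process hung
Expected impact: upstream Doris tasks may experience brief interruptions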

4.4.4 Example of a completion notice

[XXX Completion Notice]
Completion time: 2023-08-08 08:08
Result: upgrade completed / tuning completed / restart completed
Notes: smooth sailing all the way
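Because the notices above share the same fields, they can be generated from a template so that nothing is forgotten. The following is a minimal sketch: the class, its field names, and the English labels are illustrative choices, not part of any existing tool.

```python
from dataclasses import dataclass, field


@dataclass
class ChangeNotice:
    title: str
    start: str
    end: str
    change_type: str
    impact: str
    content: list = field(default_factory=list)
    reason: str = ""

    def render(self):
        """Build the notice text, one field per line."""
        lines = [
            f"[{self.title}]",
            f"Change window: {self.start} to {self.end}",
            f"Change type: {self.change_type}",
        ]
        if self.content:
            lines.append("Change content:")
            lines += [f"{i}. {item}" for i, item in enumerate(self.content, 1)]
        else:
            lines.append("Change content: none")
        if self.reason:
            lines.append(f"Reason for change: {self.reason}")
        lines.append(f"Expected impact: {self.impact}")
        return "\n".join(lines)


notice = ChangeNotice(
    title="Doris BE Emergency Restart Notice",
    start="2023-08-08 00:00",
    end="2023-08-08 01:01",
    change_type="restart",
    reason="process hung",
    impact="upstream Doris tasks may experience brief interruptions",
)
print(notice.render())
```

Making the impact a required field means a notice cannot be built without stating who is affected, which is the part readers care about most.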

5. Development Specifications

5.1 Database

  • Do not use select *
  • Every table must have a comment
  • Temporary database and table names must be prefixed with tmp_ followed by the employee ID
  • Choose a normalized or denormalized design according to the requirements
  • Reserved words must not be used as database, table, or column names
  • Database, table, and column names must be lowercase, with words separated by underscores
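Several of these rules can be checked automatically. The sketch below is a simplified illustration with stated assumptions: the reserved-word set is a tiny sample rather than the list of any particular database, the temporary prefix is read as tmp_ followed directly by the employee ID, and the select * check only catches the plain form.

```python
import re

# Deliberately tiny sample; a real check would use the full reserved-word
# list of the target database.
RESERVED_WORDS = {"select", "table", "order", "group", "index"}

# Lowercase words (letters and digits) separated by single underscores.
NAME_PATTERN = re.compile(r"^[a-z][a-z0-9]*(_[a-z0-9]+)*$")
SELECT_STAR = re.compile(r"\bselect\s+\*", re.IGNORECASE)


def check_name(name, temporary=False, employee_id=""):
    """Return a list of rule violations for a database/table/column name."""
    problems = []
    if not NAME_PATTERN.match(name):
        problems.append("must be lowercase words separated by underscores")
    if name.lower() in RESERVED_WORDS:
        problems.append("must not be a reserved word")
    if temporary and not name.startswith(f"tmp_{employee_id}"):
        problems.append("temporary objects must start with tmp_<employee ID>")
    return problems


def uses_select_star(sql):
    """True if the statement contains a plain SELECT * (count(*) is not flagged)."""
    return bool(SELECT_STAR.search(sql))


print(check_name("OrderDetail"))
print(check_name("tmp_10086_orders", temporary=True, employee_id="10086"))
print(uses_select_star("SELECT * FROM dwd_orders"))
```

A check like this fits naturally into the merge request pipeline, so naming problems surface before code review rather than during it.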

5.2 Java

For details, refer to the Alibaba Java Coding Guidelines.

5.2.1 Naming

  • Class names use UpperCamelCase, with the following exceptions: DO / BO / DTO / VO / AO / PO, etc.
  • Constant names are all uppercase, with words separated by underscores. Aim for complete and clear meaning, and do not worry about the name being long
  • Method names, parameter names, member variables, and local variables all use lowerCamelCase
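These naming rules can be expressed as patterns. The sketch below uses Python regular expressions purely as an illustration; the function names are hypothetical, and the patterns are simplified assumptions that are stricter than the guideline in places (for example, they reject single-letter words inside a name).

```python
import re

# Suffixes that may stay fully uppercase at the end of a class name.
ALLOWED_SUFFIXES = ("DO", "BO", "DTO", "VO", "AO", "PO")

UPPER_CAMEL = re.compile(r"^(?:[A-Z][a-z0-9]+)+$")
LOWER_CAMEL = re.compile(r"^[a-z][a-z0-9]*(?:[A-Z][a-z0-9]+)*$")
CONSTANT = re.compile(r"^[A-Z][A-Z0-9]*(?:_[A-Z0-9]+)*$")


def is_class_name(name):
    """UpperCamelCase, optionally ending in one of the allowed suffixes."""
    for suffix in ALLOWED_SUFFIXES:
        if name.endswith(suffix) and len(name) > len(suffix):
            if UPPER_CAMEL.match(name[: -len(suffix)]):
                return True
    return bool(UPPER_CAMEL.match(name))


def is_member_name(name):
    """lowerCamelCase for methods, parameters, fields, and local variables."""
    return bool(LOWER_CAMEL.match(name))


def is_constant_name(name):
    """All uppercase, words separated by underscores."""
    return bool(CONSTANT.match(name))


print(is_class_name("UserDTO"), is_class_name("XMLService"))
print(is_member_name("getHttpMessage"), is_constant_name("MAX_STOCK_COUNT"))
```

In practice a Java team would more likely enforce this with an existing static analysis tool; the patterns are only meant to show what each rule accepts and rejects.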

5.2.2 Constants

  • Magic values (that is, literals not defined as named constants) must not appear directly in the code
  • When assigning an initial value to a long or Long, use an uppercase L suffix, not a lowercase l. The lowercase letter is easily mistaken for the digit 1

5.2.3 Comments

  • Comment out code sparingly: explain the reason in a comment above it rather than simply commenting it out, and delete it if it is no longer needed
  • Comments on classes, class attributes, and class methods must follow the Javadoc convention and use the /** content */ format; the // xxx style is not allowed

5.3 Git strategy

  • GitLab address: https://git.xxx.com/
  • GitLab account/password: doris/doris

[Figure: Git strategy]

5.4 Code Review

  • CR must be done before testing, or in parallel with it
  • Perform a CR for every merge request (every MR gets a CR) so that MRs do not pile up
  • Producing a flow chart of the main code path before the CR makes the review far more efficient
  • Informal review for small requirements
    • Two or three people review together at a computer
    • Explain the requirement first, then review the code
  • Meeting review for core processes
    • First, review the main flow so that the participants understand the process
    • Then review the core code and the related configuration

That concludes this example of a big data platform development specification. If you run into any problems while reading it, please leave a comment.


Origin blog.csdn.net/ith321/article/details/132125871