If you were an architect starting from zero, how would you design the architecture of a back-end project?

Foreword

In the reader community (50+ groups) of Nien, a 40-year-old architect, many friends want to earn a high salary, so they need to pass architect interviews, complete the upgrade to an architecture role, and get onto the architect track.

During architect interviews, the following questions often come up:

If you are given a project that requires you to build from 0 to 1, what aspects do you need to start with?

How would you design the project's architecture?

Nien, a 40-year-old architect, has produced many architecture solutions. He has been coaching developers through the transition to architect for two years now, and has guided more developers through that transition than he can count.

Now, here is a more comprehensive reference answer, one that lets everyone fully show off their "technical muscles" and leave supervisors and colleagues thoroughly impressed.

This question and its reference answer are also included in the V74 edition of our "Nien Java Interview Collection", for later readers' reference and to raise everyone's level in "3-high" (high-concurrency, high-performance, high-availability) architecture, design, and development.

For the PDF files of the latest "Nien Architecture Notes", "Nien High Concurrency Trilogy", and "Nien Java Interview Collection", see the official account [Technical Freedom Circle] at the end of the article.

Article directory

Overview: back-end technology stack structure

If you are the technical leader of a team, how do you build the company's back-end technology stack from zero?

As a 40-year-old architect, if I were to build the company's back-end technology stack from scratch, I would follow the steps below:

Architecture 1: Select and adopt a basic toolchain for team collaboration

Architecture 2: Building Microservice Development Infrastructure

Architecture 3: Choosing the right RPC framework

Architecture 4: Select and build a highly available registration center

Architecture 5: Select and build a highly available configuration center

Architecture 6: Select and build high-performance caching middleware

Architecture 7: Select and build high-performance message middleware

Architecture 8: Select and build a high-performance relational database

Architecture 9: CI/CD release/deployment system architecture

Architecture 10: 360-degree monitoring and maintenance architecture

Architecture 11: High-concurrency and high-throughput load balancing deployment architecture for production environments

The entire back-end technology architecture mainly includes four levels:

  • Language : which development languages are used, such as C++/Java/Go/PHP/Python/Ruby, etc.;
  • Components : which components are used, such as MQ components, database components, etc.;
  • Process : what processes and specifications are followed, such as the development process, project process, release process, monitoring and alerting process, code conventions, etc.;
  • System : systematic construction; the processes above need to be backed by systems, such as a release system that enforces the release process, a code management system, etc.;

Combining the above four levels of content,

The structure of the entire back-end technology stack is shown in Figure 1:

Figure 1 Back-end technology stack structure

We select systems and components one by one, and finally form our background technology stack.

Select and adopt a basic toolchain for team collaboration

The team collaboration toolchain mainly covers three areas of management:

  • Project management
  • Task management
  • Issue management

Project management software is a centralized place for the needs, problems, processes, etc. of the entire business. Most of our cross-departmental communication and collaboration rely on project management tools.

There are SaaS project management services available, but in many cases they do not meet the needs. In that case, we can choose from open-source projects, which offer a degree of customization and a wealth of plug-ins; they can generally satisfy the needs of a typical start-up. Commonly used options are as follows:

  • Jira : developed in Java; offers user stories, task splitting, burndown charts, etc. It can be used for project management and also for cross-departmental communication scenarios; it is quite powerful;
  • Redmine : developed in Ruby; has many plug-ins available and customizable fields, and integrates project management, bug tracking, wiki, and other functions, but many of its plug-ins have not been updated in years;
  • Phabricator : developed in PHP; it was originally an internal tool at Facebook. After its developer left, he founded a company around the software. It integrates code hosting, code review, task management, document management, issue tracking, and other functions. Strongly recommended for agile teams;

Advice from 40-year-old architect Nien

Nien has used these, and the current recommendation is Jira.

Build microservice development infrastructure

Building a microservice development infrastructure requires consideration of many aspects, including but not limited to the following:

  1. Choose the right microservice framework and technology stack : currently popular microservice frameworks include Spring Cloud, Go-Micro, gRPC, etc. It is very important to choose a framework that suits your team's technology stack.
  2. Choose the right RPC framework
  3. Building infrastructure : including but not limited to service registration and discovery, load balancing, API gateway, distributed configuration center, distributed locks, message queues, etc.
  4. Security : including but not limited to encryption of communication between services, access control, identity authentication, etc.

Before building a microservice development infrastructure, you need to analyze and plan your own business scenarios, determine what infrastructure and technology stacks are needed, and then implement them step by step. At the same time, it is necessary to pay attention to scalability and maintainability, so as to be able to quickly adapt to changes in the process of business development.

Choose the right microservice framework and technology stack

Choosing the right microservice framework and technology stack requires consideration of several factors, including the following:

  1. Business requirements : Different business requirements require different technology stacks and frameworks to support. For example, if you need high concurrency and high availability, you can choose to use technologies such as Go language and Kubernetes to build microservices.
  2. Development team skills : The technology stack and framework chosen should match the skill level of the development team so that developers can get started quickly and develop efficiently.
  3. Community support : Choose popular technology stacks and frameworks to get better community support, faster problem solving and updated features.
  4. Performance and stability : The selected technology stack and framework should have good performance and stability so that it can support high load and long-running.

Common microservice frameworks and technology stacks include:

  1. Spring Cloud : For Java development teams, with rich features and community support.
  2. Go Micro : Suitable for Go development teams, featuring high performance and ease of use.
  3. Node.js + Express : Suitable for JavaScript development teams, featuring lightweight and rapid development.
  4. Kubernetes : Suitable for microservice architectures that require high availability and elasticity, and can support multiple programming languages and frameworks.
  5. Istio : It is suitable for microservice architectures that require service grid functions, and can provide functions such as traffic management, security, and observability.

When choosing, you need to choose the appropriate microservice framework and technology stack based on specific business needs and development team skills.

Advice from 40-year-old architect Nien

The suggestion of Nien, the 40-year-old architect, is to choose Spring Cloud Alibaba + Dubbo RPC + Dubbo-Go, for three reasons:

(1) High performance : in Nien's performance tests, Dubbo's performance is roughly 10 times that of Feign. For details, please refer to Nien's blog.

(2) Fits the team's technology stack : it supports a multi-language microservice architecture spanning Go and Java. Java-stack developers can build business microservices in Java, focusing on business development, while Go-stack developers can build high-performance technical microservices in Go, focusing on technology and performance optimization.

(3) Both function and performance : Java focuses on rapid delivery of features, while Go focuses on rapid performance gains.

Based on the above architecture, Nien mentored a developer with 6 years of experience who landed an offer with an annual salary of 600,000 RMB.

Choose the right RPC framework

Wikipedia's definition of RPC: Remote Procedure Call (RPC) is a computer communication protocol. The protocol allows a program running on one computer to call a subroutine on another computer, without the programmer having to write extra code for this remote interaction.

Generally speaking, a complete RPC call process is a process in which the server implements a function, and the client uses the interface provided by the RPC framework to call the implementation of this function and obtain the return value.
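To make this concrete, here is a minimal sketch (not the API of any particular framework) of the stub pattern behind an RPC call, using a JDK dynamic proxy. The interface, class, and method names are illustrative only, and the network round-trip is replaced by a direct in-process dispatch for brevity:

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Proxy;

// Minimal sketch of the RPC idea: the client programs against an ordinary
// interface, and a generated proxy (the "stub") intercepts each call and
// forwards it to the server-side implementation.
public class RpcSketch {

    // Service contract shared by client and server.
    public interface HelloService {
        String hello(String name);
    }

    // Server-side implementation of the contract.
    public static class HelloServiceImpl implements HelloService {
        public String hello(String name) { return "Hello, " + name; }
    }

    // Client-side: obtain a proxy that looks like a local object but would,
    // in a real framework, serialize the method name and arguments and send
    // them over the wire (e.g. TCP or HTTP/2).
    @SuppressWarnings("unchecked")
    public static <T> T refer(Class<T> iface, T remoteImpl) {
        InvocationHandler handler = (proxy, method, args) ->
                method.invoke(remoteImpl, args);   // stand-in for the network hop
        return (T) Proxy.newProxyInstance(
                iface.getClassLoader(), new Class<?>[]{iface}, handler);
    }

    public static String demo() {
        HelloService client = refer(HelloService.class, new HelloServiceImpl());
        return client.hello("RPC");   // reads like a local call
    }
}
```

The key point the sketch shows is that the caller never sees the transport: it invokes an ordinary interface, and everything after that is the framework's job.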

RPC frameworks in the industry are roughly divided into two schools, one focuses on cross-language calls, and the other focuses on service governance.

Cross-language call type RPC :

Cross-language call-type RPC frameworks include Thrift, gRPC, Hessian, Hprose, etc. This type of RPC framework focuses on cross-language invocation of services, and can support language-independent invocation in most languages, making it very suitable for multilingual invocation scenarios. However, this type of framework does not have a mechanism for service discovery. In actual use, the proxy layer is required to perform request forwarding and load balancing policy control.

Among them, gRPC is a high-performance, general-purpose open-source RPC framework developed by Google, designed primarily for mobile application development and based on the HTTP/2 protocol standard. It uses the ProtoBuf (Protocol Buffers) serialization protocol and supports many development languages. It does not provide distributed service governance by itself, so further development is required to build out those framework capabilities.

Hprose (High Performance Remote Object Service Engine) is a lightweight, cross-language, cross-platform, object-oriented, high-performance remote dynamic communication middleware, open-sourced under the MIT license.

Service governance RPC :

The service governance style of RPC framework is characterized by rich functionality: it provides high-performance remote calls, service discovery, and service governance capabilities, is suitable for service decoupling and governance in large-scale services, and offers transparent access for projects in its particular language (Java). The disadvantage is high language coupling, which makes cross-language support difficult.

Common domestic RPC frameworks are as follows:

  • Dubbo : Dubbo is an excellent high-performance Java service framework open-sourced by Alibaba. It enables applications to expose and consume services through high-performance RPC, and integrates seamlessly with the Spring framework. Inside Taobao, Dubbo competed with HSF, Taobao's similar in-house framework, which contributed to the Dubbo team being disbanded for a time.
  • DubboX : DubboX is an RPC framework extended by Dangdang on top of Dubbo. It supports REST-style remote calls and Kryo/FST serialization, and adds some new features.
  • Motan : Motan is a Java framework open-sourced by Sina Weibo. It was born relatively late: started in 2013 and open-sourced in May 2016. Motan is widely used within the Weibo platform, serving nearly 100 billion calls per day across hundreds of services.
  • rpcx : rpcx is a distributed RPC service framework similar to Alibaba Dubbo and Weibo Motan, implemented based on Golang net/rpc.

But rpcx is basically maintained by only one person, and there is no complete community, so be careful before using it.

Advice from 40-year-old architect Nien

The suggestion of Nien, the 40-year-old architect, is to choose Dubbo, for two reasons:

(1) High performance : in Nien's performance tests, Dubbo's performance is roughly 10 times that of Feign. For details, please refer to Nien's blog.

(2) Cross-language : RPC calls can be made across Go and Java, enabling a multi-language microservice architecture.

Select and build a highly available registration center

Service discovery comes in two modes: the client-side discovery mode and the server-side discovery mode. The service discovery commonly built into frameworks follows the client-side discovery pattern.

In the server-side discovery mode, the client sends a request to a service through a load balancer; the load balancer queries the service registry and routes the request to an available service instance. Common standalone load balancers work this way, and the pattern is often used in microservices.

Both naming and service discovery depend on a highly available service registry. Three service registries are commonly used in the industry:

  • etcd : a highly available, distributed, consistent key-value store, used for shared configuration and service discovery. Two notable projects that use it are Kubernetes and Cloud Foundry.
  • Consul : a tool for discovering and configuring services. It provides an API for client registration and service discovery, and can determine service availability by performing health checks.
  • Apache ZooKeeper : a widely used, high-performance coordination service for distributed applications. It was originally a sub-project of Hadoop, but is now a top-level project.

In addition, there are Eureka, Nacos, etc. You can choose the component that suits you according to each component's characteristics.
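All of these registries implement the same core contract: services register themselves, and clients discover live instances. As a rough illustration (names are mine; real registries add persistence, heartbeats/health checks, and change notifications), here is a minimal in-memory registry with round-robin client-side load balancing:

```java
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.atomic.AtomicInteger;

// Minimal sketch of a service registry with client-side discovery:
// instances register under a service name, and clients pick one
// instance per call via round-robin.
public class RegistrySketch {
    private final Map<String, List<String>> services = new ConcurrentHashMap<>();
    private final Map<String, AtomicInteger> counters = new ConcurrentHashMap<>();

    // Called by a service instance on startup. A real registry would also
    // track heartbeats and evict instances that stop responding.
    public void register(String service, String address) {
        services.computeIfAbsent(service, k -> new CopyOnWriteArrayList<>()).add(address);
    }

    // Called by a client: look up all known instances and round-robin.
    public String discover(String service) {
        List<String> instances = services.get(service);
        if (instances == null || instances.isEmpty()) {
            throw new IllegalStateException("no instance for " + service);
        }
        int i = counters.computeIfAbsent(service, k -> new AtomicInteger()).getAndIncrement();
        return instances.get(Math.floorMod(i, instances.size()));
    }
}
```

In the server-side discovery mode, the same lookup simply moves from the client library into the load balancer.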

To select and build a highly available registration center, the following aspects need to be considered:

  1. Functional requirements : When choosing a registration center, you need to choose according to your own business needs, such as service discovery, load balancing, configuration management, etc.
  2. Performance requirements : The registration center needs to have high performance and be able to support high-concurrency and high-throughput requests.
  3. Availability requirements : The registration center needs to have high availability and be able to guarantee 24-hour uninterrupted operation to avoid the unavailability of the entire system due to a single point of failure.
  4. Security requirements : The registration center needs to have a certain degree of security, which can ensure the confidentiality and integrity of data, and avoid data leakage and tampering.

Common registries include ZooKeeper, etcd, and Consul, all of which offer high availability and security, and all of which support service discovery, configuration management, and similar functions. Among them, ZooKeeper is the earliest distributed coordination service, with a mature ecosystem and a wide range of application scenarios; etcd is an open-source distributed key-value store launched by CoreOS, with high availability and consistency guarantees; Consul is a service discovery and configuration management tool launched by HashiCorp, known for ease of use and scalability.

When building a highly available registration center, it is necessary to adopt a cluster deployment method to avoid single point of failure. At the same time, in order to ensure data security, SSL/TLS encryption can be enabled, and access control mechanisms can be used to limit access rights.

Advice from 40-year-old architect Nien

Nien has used these, and the current recommendation is highly available Nacos, i.e., a Nacos + MySQL deployment.

For details, please refer to Nien's architecture notes.

Select and build a unified configuration center

As programs grow more complex, their configuration keeps multiplying: feature switches, degradation switches, gray-release switches, parameter settings, server addresses, database configuration, and so on. Meanwhile, the expectations placed on back-end configuration keep rising: changes should take effect in real time, support gray release, be manageable per environment, per user, and per cluster, and come with proper permission and audit mechanisms. In such an environment, traditional configuration files, databases, and similar approaches increasingly fail to meet developers' configuration-management needs, and a unified, foundational configuration system is required.

The unified configuration system refers to the centralized management of all configuration information in a large system, so as to facilitate the management and maintenance of the system. A common unified configuration system architecture includes the following components:

  1. Configuration center : used to store and manage all configuration information, and provide functions such as configuration query, modification, and deletion.
  2. Configuration client : used to obtain configuration information from the configuration center and apply it to the system.
  3. Configuration publishing tool : used to publish configuration information to the configuration center, so that the configuration client can obtain it.
  4. Configuration management tools : used to manage and maintain configuration information, including adding, modifying, and deleting configurations.
  5. Configuration monitoring tool : used to monitor changes in configuration information, discover and deal with abnormalities in configuration information in a timely manner.
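The interaction between the configuration center, client, and publishing tool can be sketched as a tiny in-memory store with change notification. This is an illustration of the pattern only (real systems such as Nacos or Apollo do this over the network, with persistence, permissions, and namespaces):

```java
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.function.BiConsumer;

// Minimal sketch of a unified configuration center: centralized key-value
// storage plus change listeners, so a modified configuration takes effect
// in clients without a redeploy.
public class ConfigCenterSketch {
    private final Map<String, String> store = new ConcurrentHashMap<>();
    private final Map<String, List<BiConsumer<String, String>>> listeners = new ConcurrentHashMap<>();

    // Configuration client: read a value with a fallback default.
    public String get(String key, String defaultValue) {
        return store.getOrDefault(key, defaultValue);
    }

    // Configuration client: subscribe to changes of one key.
    public void addListener(String key, BiConsumer<String, String> onChange) {
        listeners.computeIfAbsent(key, k -> new CopyOnWriteArrayList<>()).add(onChange);
    }

    // Publishing tool: store the new value and notify every subscriber.
    public void publish(String key, String value) {
        store.put(key, value);
        for (BiConsumer<String, String> l : listeners.getOrDefault(key, Collections.emptyList())) {
            l.accept(key, value);
        }
    }
}
```

The listener callback is what makes "modify a switch, take effect in real time" possible without restarting services.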

In practical applications, you can choose to use open source configuration center tools, such as ZooKeeper, Etcd, Consul, Nacos, Apollo, etc., or you can develop a configuration center system yourself.

At the same time, it is also necessary to select the appropriate configuration client and configuration release tool according to the actual situation. In terms of configuration management and monitoring, you can use some open source tools or develop a system yourself. In short, the architecture of the unified configuration system needs to be designed and selected according to actual needs.

Advice from 40-year-old architect Nien

Nien has used these, and the current recommendation is highly available Nacos, i.e., a Nacos + MySQL deployment.

For details, please refer to Nien's architecture notes.

Select and build high-performance caching middleware

Selecting and building high-performance caching middleware requires consideration of multiple factors, including performance, reliability, scalability, and ease of use. The following are some common high-performance caching middleware:

  1. Redis : Redis is an open source high-performance cache and key-value storage system that supports a variety of data structures, including strings, hashes, lists, sets, and ordered sets. Redis improves performance by storing data in memory, while supporting data persistence and cluster mode.
  2. Memcached : Memcached is an open source high-performance distributed memory object caching system that can cache any serializable data, such as database query results, API responses, etc. Memcached can improve scalability and reliability by clustering multiple nodes.
  3. Hazelcast : Hazelcast is an open source distributed memory data grid system that supports functions such as caching, distributed data structures, and distributed computing. Hazelcast can be clustered with multiple nodes to improve scalability and reliability.
  4. Couchbase : Couchbase is an open source distributed NoSQL database and caching system that can cache any type of data, including JSON documents, key-value pairs, and binary data. Couchbase supports functions such as clusters composed of multiple nodes and data persistence.

When building high-performance caching middleware, the following aspects need to be considered:

  1. Hardware configuration : The cache middleware requires a large amount of memory, so sufficient memory and processor resources need to be configured.
  2. Deployment architecture : It is necessary to consider the deployment architecture of the cache middleware, such as single node, master-slave replication, cluster, etc.
  3. Data persistence : It is necessary to consider the way of data persistence, such as memory snapshot, AOF log, RDB file, etc.
  4. Security : The security of cache middleware needs to be considered, such as access control, data encryption, etc.
  5. Monitoring and management : It is necessary to consider the monitoring and management of cache middleware, such as performance monitoring and fault diagnosis.

In short, the selection and construction of high-performance caching middleware requires comprehensive consideration of multiple factors, and selection and configuration according to specific needs and scenarios.
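To illustrate the two core behaviors such middleware provides, capacity-bounded eviction and time-based expiry, here is a minimal in-process sketch of an LRU cache with TTL (Redis and friends implement these at much larger scale, over the network, with persistence and clustering):

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Minimal sketch of a cache with LRU eviction and per-entry TTL.
public class LruTtlCache<K, V> {
    private static class Entry<V> {
        final V value;
        final long expireAt;
        Entry(V value, long expireAt) { this.value = value; this.expireAt = expireAt; }
    }

    private final long ttlMillis;
    private final LinkedHashMap<K, Entry<V>> map;

    public LruTtlCache(int capacity, long ttlMillis) {
        this.ttlMillis = ttlMillis;
        // accessOrder=true turns LinkedHashMap into an LRU list.
        this.map = new LinkedHashMap<K, Entry<V>>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<K, Entry<V>> eldest) {
                return size() > capacity;   // evict the least-recently-used entry
            }
        };
    }

    public synchronized void put(K key, V value) {
        map.put(key, new Entry<>(value, System.currentTimeMillis() + ttlMillis));
    }

    public synchronized V get(K key) {
        Entry<V> e = map.get(key);
        if (e == null) return null;
        if (System.currentTimeMillis() > e.expireAt) {   // lazy expiry on read
            map.remove(key);
            return null;
        }
        return e.value;
    }
}
```

Eviction policy and TTL are exactly the knobs (maxmemory policy, EXPIRE) you tune when operating Redis or Memcached.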

Advice from 40-year-old architect Nien

Nien has used these; the current recommendation is a highly available Redis cluster. For details, please refer to Nien's architecture notes.

Pay special attention: Redis is tied to the availability of the whole system and is a frequent source of production accidents.

If Redis contains a bigkey, the system can easily be brought down under high concurrency, seriously affecting availability. Just yesterday, a reader came asking for help: a Redis bigkey problem took their system down for an hour, with direct economic losses in the millions and indirect losses in the tens of millions (RMB).

After I talked about this problem in the community, two friends said that they also encountered it, and it was all production accidents.

For how to detect and resolve big keys, please refer to Nien's architecture notes.
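One common mitigation for big keys is to split one huge hash or set into N smaller keys so no single value grows unbounded. A minimal sketch of the usual bucket-routing helper (the key names and the bucket count in the test are illustrative, not any Redis client's API):

```java
import java.nio.charset.StandardCharsets;
import java.util.zip.CRC32;

// Sketch of big-key sharding: instead of one huge Redis hash "orders",
// spread members across N smaller keys "orders:0" .. "orders:{N-1}".
// Each member always maps to the same bucket, so reads and writes for a
// given member go to the same (small) key.
public class BigKeySharding {
    public static String shardKey(String baseKey, String member, int shards) {
        CRC32 crc = new CRC32();
        crc.update(member.getBytes(StandardCharsets.UTF_8));
        long bucket = crc.getValue() % shards;   // stable bucket per member
        return baseKey + ":" + bucket;
    }
}
```

The same idea limits the blast radius of any one key: commands touch a small value, and expiry/eviction of one bucket no longer blocks the event loop the way a multi-hundred-megabyte key would.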

Select and build high-performance messaging middleware

Message middleware is an essential component of a back-end system. Generally, we use message middleware in the following scenarios:

  • Asynchronous processing :
    Asynchronous processing is one of the main reasons for using message middleware. The most common asynchronous scenarios at work include sending a registration-success email after sign-up, returning stale data when the cache expires and then refreshing the cache asynchronously, writing logs asynchronously, and so on. Asynchronous processing reduces the response latency of the main flow; non-critical or non-main-flow work can be funneled through the message middleware and processed asynchronously.
  • System decoupling :
    For example, in an e-commerce system, when the user completes payment and the order is finalized, the payment result must be propagated to the ERP system, invoice system, WMS, recommendation system, search system, risk control system, etc. This processing does not need to be real-time or strongly consistent; eventual consistency is enough, so the systems can be decoupled through message middleware. Such decoupling also helps cope with future, not-yet-known system requirements.
  • Peak shaving and valley filling :
    When the system faces heavy traffic, you see spiky traffic graphs on the monitoring dashboard. By pushing high-volume requests into a queue and letting consumer programs digest the queued requests at their own pace, the peaks are flattened. The most typical scenario is a seckill (flash-sale) system: in an e-commerce seckill, the order service is often the bottleneck, because placing an order requires database operations on inventory and must guarantee strong consistency. Here, message middleware is used for order queuing and flow control, letting the order service work through the queue gradually, protecting the order service and achieving the peak-shaving effect.
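The peak-shaving scenario can be sketched with a bounded in-process queue standing in for real message middleware: the burst is buffered up front, and the consumer drains it at its own steady pace instead of absorbing the spike directly:

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

// Minimal sketch of peak shaving: a burst of requests is buffered in a
// bounded queue, and the downstream consumer (the "order service") then
// works through the backlog one request at a time.
public class PeakShavingSketch {
    public static int process(int burstSize) {
        BlockingQueue<Integer> queue = new ArrayBlockingQueue<>(burstSize);

        // The burst arrives (e.g. a seckill): every request is enqueued.
        for (int i = 0; i < burstSize; i++) {
            queue.offer(i);   // offer() fails fast once the queue is full
        }

        // The consumer drains at its own pace; here we just count requests.
        int processed = 0;
        while (queue.poll() != null) {
            processed++;      // in reality: deduct inventory, create the order...
        }
        return processed;
    }
}
```

The bounded capacity is the flow-control knob: once the queue is full, excess requests can be rejected quickly rather than being allowed to overload the order service.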

Message middleware is commonplace in the industry. When selecting one, some teams use open-source options, some build their own, and some even use MySQL or Redis directly as a queue; the key is whether it meets your needs.

Choosing the right messaging middleware requires consideration of multiple factors, including but not limited to:

  • The number and frequency of messages that need to be processed
  • message size and format
  • Availability and Fault Tolerance Requirements
  • Data Security and Encryption Requirements
  • Scalability and flexibility requirements
  • Compatibility with the development language and technology stack

Common message middleware includes RocketMQ, Kafka, RabbitMQ, ActiveMQ, Redis, NATS, etc. Each middleware has its own characteristics and applicable scenarios.

If you need to process a large number of messages and need high throughput and low latency, you can consider using Kafka. If you need to process messages in real time and need high availability and fault tolerance, you can consider using RabbitMQ. If you need to process lightweight messages, and require high performance and low latency, you can consider using Redis.

When choosing message middleware, it is necessary to comprehensively consider according to specific business requirements and technology stacks, and choose the most suitable middleware.

Advice from 40-year-old architect Nien

Nien has used them, and the current recommendation is Kafka + RocketMQ.

For details, please refer to Nien's architecture notes and Nien's four-part series dissecting RocketMQ source code and architecture.

Select and build a high-performance relational database

There are two types of relational databases. One is the traditional relational database, such as Oracle, MySQL, MariaDB, DB2, and PostgreSQL; the other is NewSQL, a new breed of relational database that must at least meet the following five criteria:

  • Completely supports SQL, and supports complex SQL queries such as JOIN / GROUP BY / subqueries.
  • Supports ACID transactions that are standard for traditional data, and supports strong isolation levels.
  • With the ability of elastic scaling, capacity expansion and contraction are completely transparent to the business layer.
  • Real high availability and cross-region multi-active deployment: failure recovery requires no human intervention, and the system can fail over automatically with strongly consistent data recovery.
  • Possess certain big data analysis ability.

The most commonly used traditional relational database is MySQL: mature, stable, and sufficient for basic needs. Up to a certain data volume, a single-instance traditional database can basically cope, and many open-source systems today are built on MySQL and usable out of the box; with master-slave replication plus front-end caching, applications with millions of PVs can be handled.

However, CentOS 7 dropped MySQL in favor of MariaDB. MariaDB is a fork of MySQL, maintained mainly by the open-source community and licensed under the GPL. One reason for the fork: after Oracle acquired MySQL, there was a perceived risk that MySQL could be closed off, so the community forked it to avoid that risk.

After Google published "F1: A Distributed SQL Database That Scales" and "Spanner: Google's Globally-Distributed Database", NewSQL became popular in the industry. Hence CockroachDB, and hence PingCAP's TiDB.

Many companies in China have adopted TiDB. The startup I worked at was already using TiDB for big data analysis; the main motivation at the time was that MySQL would have required sharding (splitting databases and tables), which made the logic complex to develop and did not scale well enough.

Select and build high-performance NoSQL

As the name suggests, NoSQL is Not-Only SQL (some say No-SQL; personally, I prefer Not-Only SQL). It is not meant to replace relational databases, but to complement them.

There are 4 types of common NoSQL:

  • Key-value stores : suitable for content caching and mixed workloads with high concurrency, high scalability requirements, and large data sets. Advantages: simplicity and fast queries; disadvantage: lack of structure in the data. Common examples include Redis, Memcached, BerkeleyDB, and Voldemort;
  • Column-family stores : store data by column family, keeping the same column's data together; commonly built on distributed file systems, represented by HBase and Cassandra. Cassandra is mostly used in write-heavy, read-light scenarios; in China, Qihoo 360 reportedly runs a Cassandra cluster of about 1,500 machines, and abroad it is used by large companies such as eBay, Instagram, Apple, and Walmart;
  • Document stores : their storage model is well suited to large volumes of loosely related, complex information with widely varying structure. Performance sits between key-value stores and relational databases. Inspired by Lotus Notes; common examples are MongoDB and CouchDB;
  • Graph databases : excel at anything involving relationships, such as social networks and recommender systems. They focus on building relationship graphs and often need whole-graph computation to produce results, which makes distributed clustering difficult. Common examples include Neo4j and InfoGrid.

In addition to the above four types, there are also some special databases, such as object databases and XML databases, which are optimized for certain storage types.

In actual application scenarios, when to use a relational database, when to use NoSQL, and which type of database to use, this is a very important consideration when we choose an architecture, and it will even affect the entire architecture.

CI/CD release/deployment system architecture

From the perspective of software production, the typical flow from code to final service is shown in Figure 2:

Figure 2 Flowchart

As can be seen from the above figure, it is a long process from developers writing code to serving end users, which can be divided into three stages as a whole:

  • From code (Code) to artifact (Artifact) : this stage continuously builds the developers' code and centrally manages the resulting build artifacts; it prepares the input for the deployment system.
  • From artifact to runnable service : this stage deploys the artifact to a specified environment; it is the most basic job of the deployment system.
  • From the development environment to the final production environment : this stage migrates a change across environments; it is the deployment system's core capability for bringing the final service online.
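The three stages boil down to an ordered pipeline in which each stage runs only if the previous one succeeded, which is exactly how Jenkins/GitLab CI pipelines behave. A minimal sketch (stage names are illustrative):

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Supplier;

// Minimal sketch of a CI/CD pipeline: named stages run in registration
// order, and execution stops at the first stage that fails.
public class PipelineSketch {
    private final Map<String, Supplier<Boolean>> stages = new LinkedHashMap<>();

    // Register a stage; the Supplier returns true on success.
    public PipelineSketch stage(String name, Supplier<Boolean> step) {
        stages.put(name, step);
        return this;   // allow fluent chaining
    }

    // Run stages in order, stopping at the first failure; returns the
    // names of stages that completed successfully.
    public List<String> run() {
        List<String> done = new ArrayList<>();
        for (Map.Entry<String, Supplier<Boolean>> e : stages.entrySet()) {
            if (!e.getValue().get()) break;   // fail fast: later stages are skipped
            done.add(e.getKey());
        }
        return done;
    }
}
```

In a real release system, each Supplier would shell out to a build, an image push to the artifact repository, or a deployment, and the "stop on failure" rule is what keeps a broken build from ever reaching production.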

The release system integrates product management, release process, authority control, online environment version change, grayscale release, online service rollback, etc. It is an important channel for the final presentation of the developer's work.

The architecture of a CI/CD release system/deployment system typically includes the following components:

  • Source code management system : such as Git, SVN, etc., used to manage the code base.
  • Continuous integration tools : such as Jenkins, GitLab CI, Travis CI, etc., to automate building, testing, and packaging applications.
  • Artifact repository : such as Docker Hub, Harbor, Aliyun Container Registry, etc., used to store application images.
  • Deployment tools : such as Kubernetes, Docker Swarm, Mesos, etc., are used to automate the deployment of applications.

These components can be selected and combined according to actual needs to form a complete CI/CD release system/deployment system.
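As a minimal illustration, the components above can be wired together with GitLab CI. The sketch below is a hypothetical .gitlab-ci.yml; the stage layout follows the three stages described earlier, but the image tags, registry address, and deployment command are assumptions, not part of the original text:

```yaml
# Illustrative .gitlab-ci.yml: build, test, package an image, deploy.
stages:
  - build
  - test
  - package
  - deploy

build:
  stage: build
  image: maven:3.9-eclipse-temurin-17
  script:
    - mvn -B package -DskipTests

test:
  stage: test
  image: maven:3.9-eclipse-temurin-17
  script:
    - mvn -B test

package:
  stage: package
  image: docker:24
  script:
    # registry.example.com is a placeholder for your Harbor / registry address
    - docker build -t registry.example.com/app:$CI_COMMIT_SHORT_SHA .
    - docker push registry.example.com/app:$CI_COMMIT_SHORT_SHA

deploy:
  stage: deploy
  script:
    - kubectl set image deployment/app app=registry.example.com/app:$CI_COMMIT_SHORT_SHA
  environment: production
  when: manual   # promotion to production stays a human decision
```

Each stage maps onto one of the three phases above: build/test prepare the artifact, package pushes it to the artifact repository, and deploy promotes it across environments.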

Among them, continuous integration tools and deployment tools are the core components, which are responsible for automating building, testing, packaging and deploying applications, so as to achieve a fast, reliable and repeatable software release process.

In the initial stage of a project, Jenkins + GitLab + Harbor can be combined. This stack basically covers artifact management, release workflow, permission control, online version changes, gray release (which needs to be implemented yourself), online service rollback, and other functions.

Code management tool selection

Code is the lifeblood of a project, and code management is very important. There are two common considerations:

  • Security and permissions : keep the code on the intranet, and for core code that is the company's lifeblood, enforce strict access control and physical machine isolation;
  • Tooling : Git is the obvious choice for code management.

GitLab is by far the most popular open-source Git hosting server today. Although there is an enterprise edition, the community edition can meet most needs; combined with Gerrit for code review, it is close to perfect.

GitLab also offers code comparison, but it is not as intuitive as Gerrit's.

Gerrit provides a better code review interface and mainline management experience than GitLab, and is better suited to teams with high code-quality standards.

Continuous integration tool selection

Continuous integration (CI) is a software development practice in which team members integrate their work frequently, with integration possibly happening multiple times a day.

Each integration is verified through automated builds (including compilation, release, and automated testing) to detect integration errors as early as possible.

Continuous integration underpins the R&D process with basic tasks such as branch management and comparison, compilation, inspection, and build output, and provides unified support for compiling and generating test-coverage builds.

Among the industry's free continuous integration tools, we have the following options:

  • Jenkins : written in Java, with a powerful plug-in mechanism; open source under the MIT license (free, highly customizable, and able to distribute builds and load tests across multiple machines). Jenkins can be regarded as almost omnipotent; there is basically nothing it cannot do, and it scales from small teams to large ones. However, using it at scale still requires manpower to learn and maintain it.
  • TeamCity : friendlier to use than Jenkins and also a highly customizable platform, but once more people use it, TeamCity charges.
  • Strider : Strider is an open source continuous integration and deployment platform, implemented using Node.js, stored using MongoDB, BSD license, similar in concept to Travis and Jenkins.
  • GitLab CI : integrated into GitLab since version 8.0. You only need to add a .gitlab-ci.yml file to the project and register a Runner to get continuous integration. GitLab and Docker also cooperate very well.
    The difference between the free version and the paid version can be found here: https://about.gitlab.com/products/feature-comparison/.
  • Travis : strongly tied to GitHub; as a SaaS, security must be considered for closed-source code; it cannot be customized; free for open-source projects, paid for everything else.
  • Go : Go (now GoCD) is the latest incarnation of ThoughtWorks' CruiseControl. Apart from commercial support sold by ThoughtWorks, Go is free, and it runs on Windows, Mac, and various Linux distributions.

Architecture of automated testing platform

The next step is to build an automated testing platform.

To build an automated testing platform, the following aspects need to be considered:

  1. Choose the right testing framework and tools : You can choose some popular testing frameworks and tools, such as Selenium, Appium, JMeter, etc., and choose the tools that suit you according to your needs.
  2. Build a test environment : You need to build a test environment, including test servers, test databases, test data, etc. You can use virtual machines or containers to build a test environment for testing.
  3. Writing test cases : Test cases need to be written, and the test cases should cover each function point of the system in order to discover potential problems.
  4. Integrate test tools and test cases : Integrate test tools and test cases into the automated test platform for automated testing.
  5. Run test cases : After writing test cases, you need to run test cases, collect test results, and generate test reports.
  6. Regular maintenance and update : The automated test platform needs regular maintenance and update to ensure the stability of the test environment and the validity of the test cases.

The above are the general steps to build an automated test platform, and the specific implementation method needs to be adjusted according to the actual situation.

You can combine Spring Boot with the TestNG testing framework to build your own automated testing platform.

TestNG is an open source automated testing framework;

TestNG is inspired by JUnit and NUnit, but introduces some new features to make it more powerful and easier to use.

The "NG" in TestNG stands for Next Generation.

TestNG is similar to JUnit (especially JUnit 4), but it is not an extension of the JUnit framework. It is designed to surpass JUnit, especially for integration testing across multiple classes.

TestNG was created by Cédric Beust.

TestNG removes most of the limitations of older frameworks, enabling developers to write more flexible and powerful tests. Because it relies heavily on Java annotations (introduced in JDK 5.0) to define tests, it also demonstrates how to use this language feature in a real production environment.
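To illustrate the annotation-driven mechanism that TestNG builds on, here is a minimal, self-contained sketch of a toy runner: methods marked with a @Test annotation are discovered via reflection and executed. This is not TestNG itself (which you would pull in as a dependency), just a demonstration of the underlying idea; all class and method names are illustrative:

```java
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.reflect.Method;

public class MiniRunner {

    // Toy annotation, analogous in spirit to TestNG's @Test.
    @Retention(RetentionPolicy.RUNTIME)
    public @interface Test {}

    // A sample test class: only annotated methods are treated as tests.
    public static class SampleTests {
        @Test public void addition() { if (1 + 1 != 2) throw new AssertionError("math broke"); }
        @Test public void strings()  { if (!"ab".equals("a" + "b")) throw new AssertionError(); }
        public void helper() {}  // not annotated, never run
    }

    // Discover and run all @Test methods via reflection; return how many passed.
    public static int runTests(Class<?> testClass) throws Exception {
        Object instance = testClass.getDeclaredConstructor().newInstance();
        int passed = 0;
        for (Method m : testClass.getDeclaredMethods()) {
            if (m.isAnnotationPresent(Test.class)) {
                m.invoke(instance);  // throws if the test fails
                passed++;
            }
        }
        return passed;
    }

    public static void main(String[] args) throws Exception {
        System.out.println(runTests(SampleTests.class) + " tests passed");
    }
}
```

TestNG adds groups, dependencies, data providers, and parallel execution on top of exactly this discovery-and-invoke pattern.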

Nien, the 40-year-old architect, reminds you that instead of building an automated testing platform from 0 to 1, you can adapt one based on an open-source automated testing platform.

The following two test platforms are very good transformation projects:

360-degree all-round monitoring and operations architecture

The 360-degree all-round monitoring and operations architecture includes:

  • the logging system
  • the monitoring system

Logging system

A logging system generally covers log generation, collection, transport, aggregation, storage, analysis, presentation, search, and distribution.

Some special capabilities, such as traffic coloring (request tagging) and full-chain tracing or monitoring, may need to be built on top of the logging system.

Building a logging system is not only about tools but also about specifications and components. It is best to emit some baseline logs at the framework and component level, such as full-link tracing information.

For general logging needs, the ELK stack can meet most requirements. ELK includes the following components:

ElasticSearch is an open-source distributed search engine. Its features include: distributed operation, zero configuration, automatic discovery, automatic index sharding, an index replica mechanism, a RESTful interface, multiple data sources, automatic search load balancing, and more.

Logstash is a completely open source tool that collects, analyzes, and stores your logs for later use.
Kibana is an open source and free tool that provides Logstash and ElasticSearch with a log analysis friendly web interface that helps summarize, analyze and search important data logs.

Filebeat has largely replaced Logstash-Forwarder as the new generation of log collector; thanks to its light weight and security, more and more people are adopting it.

Because the free version of ELK has no security mechanism, Nginx is used here as a reverse proxy to prevent users from accessing the Kibana server directly.

In addition, configuring Nginx to implement simple user authentication improves security to a certain extent.

In addition, Nginx itself has the function of load balancing, which can improve system access performance.
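A minimal sketch of that setup follows; the hostnames, IPs, and file paths are illustrative assumptions:

```nginx
# Illustrative: basic-auth reverse proxy in front of Kibana.
upstream kibana {
    server 10.0.0.11:5601;
    server 10.0.0.12:5601;   # optional second node, Nginx load-balances
}

server {
    listen 80;
    server_name kibana.example.com;

    auth_basic           "Kibana";
    auth_basic_user_file /etc/nginx/htpasswd;   # created with `htpasswd -c`

    location / {
        proxy_pass http://kibana;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
    }
}
```

Users then reach Kibana only through Nginx, which enforces the password prompt and spreads requests across Kibana nodes.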

The ELK architecture is shown in Figure 3:

Figure 3 ELK flow chart


For real-time computing needs, you can use the Flume + Kafka + Storm + MySQL solution. The general architecture is shown in Figure 4:

Figure 4 Real-time analysis system architecture diagram


Among them:

Flume is a distributed, reliable, highly available log collection system for massive log collection, aggregation, and transmission. It supports customizing various data senders for gathering data, provides simple data processing, and can write to various (customizable) data receivers.
Kafka is an open-source stream processing platform developed by the Apache Software Foundation, written in Scala and Java. It is essentially a "massive-scale publish/subscribe message queue built on a distributed commit-log architecture", widely used for its horizontal scalability and high throughput.

Kafka pursues high throughput under high load, while Flume pursues diversity of data sources; the combination of the two works very well.

Monitoring system

The monitoring discussed here covers only the back end. It has two main parts. One is operating-system-level monitoring: machine load, I/O, network traffic, CPU, memory, and other OS metrics.

The other is monitoring of service and business quality: service availability, success rate, failure rate, capacity, QPS, and so on.

Common monitoring systems start with operating-system-level monitoring (the most mature part) and then expand to other kinds; examples include Zabbix, Xiaomi's Open-Falcon, and Prometheus, which supports both levels.

If the requirements for business monitoring are relatively high, startups are advised to give priority to Prometheus.

There is an interesting distribution here, as shown in Figure 5.

Figure 5 Monitoring system distribution


Zabbix is used more in Asia, while Prometheus is mostly used in the Americas, Europe, and Australia. In other words, Prometheus is more popular in English-speaking (developed?) countries.

Prometheus is an open-source monitoring and alerting system with a built-in time-series database (TSDB), originally developed at SoundCloud.

Prometheus is written in Go and is an open-source counterpart of Google's BorgMon monitoring system.

In contrast to the push model used by many other monitoring systems, Prometheus uses a pull model. Its architecture is shown in Figure 6:

Figure 6 Prometheus architecture diagram


As shown in the figure above, the main components of Prometheus are as follows:

  • Prometheus Server : mainly responsible for metric collection and storage, with support for the PromQL query language.
    Scrape targets can be specified via configuration files, text files, ZooKeeper, Consul, DNS SRV lookup, and so on. The server pulls metrics from these targets on a schedule, and each target must expose an HTTP endpoint for it to scrape.
  • Client SDKs : the official client libraries cover Go, Java, Scala, Python, and Ruby, and many third-party libraries support Node.js, PHP, Erlang, and others.
  • Push Gateway : an intermediate gateway that allows short-lived jobs to push their metrics.
  • Exporters : the general term for Prometheus data-collection components. An exporter collects data from a target and converts it into a format Prometheus supports.
    Unlike traditional collection agents, it does not send data to a central server; it waits for the central server to come and scrape it. Exporters currently exist for databases, hardware, message middleware, storage systems, HTTP servers, JMX, and more.
  • Alertmanager : a separate service that works with Prometheus alerting rules and provides very flexible notification methods.
  • HTTP API : Prometheus exposes an HTTP query API for customizing the required output.
  • Grafana : an open-source analysis and monitoring platform supporting Graphite, InfluxDB, OpenTSDB, Prometheus, Elasticsearch, CloudWatch, and other data sources, with a beautiful and highly customizable UI.
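Since every scrape target must expose an HTTP metrics endpoint, here is a minimal Java sketch using only the JDK's built-in HTTP server. The metric name and port are illustrative assumptions; in a real service you would normally use the official Prometheus Java client library instead:

```java
import com.sun.net.httpserver.HttpServer;
import java.io.IOException;
import java.io.OutputStream;
import java.net.InetSocketAddress;
import java.nio.charset.StandardCharsets;

public class MetricsEndpoint {

    // Render one gauge sample in the Prometheus text exposition format.
    public static String render(String name, String help, double value) {
        return "# HELP " + name + " " + help + "\n"
             + "# TYPE " + name + " gauge\n"
             + name + " " + value + "\n";
    }

    // Start an HTTP server whose /metrics path serves the sample above.
    public static HttpServer startServer(int port) throws IOException {
        HttpServer server = HttpServer.create(new InetSocketAddress(port), 0);
        server.createContext("/metrics", exchange -> {
            byte[] body = render("app_inflight_requests", "In-flight requests.", 7.0)
                    .getBytes(StandardCharsets.UTF_8);
            exchange.getResponseHeaders().add("Content-Type", "text/plain; version=0.0.4");
            exchange.sendResponseHeaders(200, body.length);
            try (OutputStream os = exchange.getResponseBody()) { os.write(body); }
        });
        server.start();
        return server;
    }
}
```

Calling MetricsEndpoint.startServer(9100) exposes http://host:9100/metrics, which the Prometheus server can then pull on its scrape interval.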

A startup can choose the Prometheus + Grafana solution, coupled with a unified service framework (such as gRPC), which can meet the monitoring needs of most small and medium-sized teams.
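A minimal, hypothetical prometheus.yml for such a setup might look like this; the targets, ports, and job names are assumptions for illustration:

```yaml
# Illustrative prometheus.yml: scrape two kinds of targets, route alerts.
global:
  scrape_interval: 15s

scrape_configs:
  - job_name: "node"               # machine-level metrics via node_exporter
    static_configs:
      - targets: ["10.0.0.11:9100", "10.0.0.12:9100"]
  - job_name: "app"                # metrics exposed by the service itself
    metrics_path: /metrics
    static_configs:
      - targets: ["10.0.0.21:8080"]

alerting:
  alertmanagers:
    - static_configs:
        - targets: ["10.0.0.30:9093"]
```

Grafana then points at the Prometheus server as a data source and builds dashboards over the same metrics.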

Production load-balancing architecture for high concurrency and high throughput

The high-concurrency, high-throughput load-balancing chain includes:

  • Selection and use design of DNS
  • Selection and usage design of LB (load balancing)
  • Selection and use design of CDN

Selection and use design of DNS

DNS is a very generic service; a startup basically just needs to choose a suitable cloud provider. In China there are mainly two:

Alibaba Wanwang: Alibaba acquired Wanwang in 2014 and integrated its domain-name services, eventually forming today's Alibaba Wanwang, which includes DNS services;

Tencent DNSPod: Tencent acquired 100% of DNSPod for 40 million yuan in 2012, mainly providing domain-name resolution and some protection features;

If your business is in China, these two are the main options; just pick one. Even companies like Toutiao use DNSPod. Self-building is only necessary for special reasons, for example if you are a CDN vendor or have special regional restrictions.

If you want to economize, Alibaba's cheapest basic edition will do; if you want a higher resolution success rate, use DNSPod's more expensive edition.

Outside China, choose Amazon. Alibaba's DNS service only has nodes in Japan and the United States and has only recently begun deploying in Southeast Asia; DNSPod likewise covers only the United States and Japan. Overseas companies mostly choose Amazon's cloud services.

For an online product, the paid DNS tier is strongly recommended. Alibaba's paid edition, at a few tens of yuan, can basically meet the need; if you need province- or region-level resolution logic, it costs extra, but only a few hundred yuan a year, saving both money and effort.

If you operate abroad, prefer Amazon. If you need interoperability between China and overseas and have your own app, it is recommended to implement some disaster-recovery or intelligent-scheduling logic yourself, because no off-the-shelf DNS service serves domestic and overseas scenarios well at the same time; alternatively, use multiple domain names, each resolved by a different DNS provider.

Selection and usage design of LB (load balancing)

LB (Load Balancing) is a general service, and the LB services of general cloud vendors basically have the following functions:

  • support for layer-4 requests (TCP and UDP protocols);
  • support for layer-7 requests (HTTP and HTTPS protocols);
  • centralized certificate management to support HTTPS;
  • health checks;

If all your online service machines use cloud services from the same provider, you can directly use that provider's LB service, such as Alibaba Cloud's SLB, Tencent Cloud's CLB, or Amazon's ELB. If you run a self-built data center, the standard combination is LVS + Nginx.
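For the self-built case, the Nginx (layer-7) half of the LVS + Nginx pairing might look like the following sketch; the IPs and tuning values are illustrative assumptions:

```nginx
# Illustrative Nginx L7 config behind LVS: a health-checked upstream pool.
upstream app_servers {
    least_conn;                                        # pick the least-busy backend
    server 10.0.1.11:8080 max_fails=3 fail_timeout=10s;
    server 10.0.1.12:8080 max_fails=3 fail_timeout=10s;
}

server {
    listen 80;
    location / {
        proxy_pass http://app_servers;
        proxy_next_upstream error timeout http_502;    # retry on a failed backend
        proxy_set_header Host $host;
    }
}
```

LVS sits in front of multiple such Nginx instances at layer 4, while Nginx handles HTTP routing, retries, and passive health checks against the application servers.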

Selection and use design of CDN

CDN is now a heavily commoditized market with thin margins, essentially sold at cost. In China, Wangsu is the leader with more than 40% of the domestic market share, followed by Tencent and Alibaba. A large part of Wangsu's rise came with the rise of live streaming.

Abroad, Amazon and Akamai together account for about 50%. Akamai, the former international market leader, once held more than half of the global share; after Amazon's CDN entered the market, Akamai's share dropped by nearly 20% as many small and medium-sized enterprises switched, and there was little Akamai could do about it.

Domestic CDN vendors going overseas mostly serve Chinese companies expanding abroad. Among the three larger providers, Wangsu has somewhat more overseas nodes, while Alibaba and Tencent are still in the early stage, with nodes in only a few countries.

For a startup, Tencent Cloud or Alibaba Cloud CDN is a reasonable choice: their supporting systems are relatively complete and easy to integrate, whereas Wangsu's tooling is weaker and its service more expensive. Moreover, once traffic grows you cannot rely on a single CDN; different CDNs have different node coverage across the country, and some vendors serve particular customer clusters rather than the full node set (even though some claim whole-network coverage). Besides coverage, using multiple CDNs also provides a degree of disaster recovery.

A final word: if you run into problems, you can turn to the old architect Nien

The road to architecture is full of ups and downs

Architecture is different from advanced development: architecture questions are open-ended, and there is no standard answer.

During architecture work or career transformation, what should you do if you encounter a complex scenario, really don't know how to produce an architecture solution, and really can't find a solid approach?

You can ask the 40-year-old architect Nien for help.

Yesterday, a reader who was building the golden-link architecture of an e-commerce website could not find an approach at first, but after ten minutes of Nien's voice guidance, everything suddenly became clear.

The realization path of technical freedom PDF:

Realize your architectural freedom:

" Have a thorough understanding of the 8-figure-1 template, everyone can do the architecture "

" 10Wqps review platform, how to structure it? This is what station B does! ! ! "

" Alibaba Two Sides: How to optimize the performance of tens of millions and billions of data?" Textbook-level answers are coming "

" Peak 21WQps, 100 million DAU, how is the small game "Sheep a Sheep" structured? "

" How to Scheduling 10 Billion-Level Orders, Come to a Big Factory's Superb Solution "

" Two Big Factory 10 Billion-Level Red Envelope Architecture Scheme "

… more architecture articles, being added

Realize your responsive freedom:

" Responsive Bible: 10W Words, Realize Spring Responsive Programming Freedom "

This is the old version of " Flux, Mono, Reactor Combat (the most complete in history) "

Realize your spring cloud freedom:

" Spring cloud Alibaba Study Bible "

" Sharding-JDBC underlying principle and core practice (the most complete in history) "

" Get it done in one article: the chaotic relationship between SpringBoot, SLF4j, Log4j, Logback, and Netty (the most complete in history) "

Realize your linux freedom:

" Linux Commands Encyclopedia: 2W More Words, One Time to Realize Linux Freedom "

Realize your online freedom:

" Detailed explanation of TCP protocol (the most complete in history) "

" Three Network Tables: ARP Table, MAC Table, Routing Table, Realize Your Network Freedom!" ! "

Realize your distributed lock freedom:

" Redis Distributed Lock (Illustration - Second Understanding - The Most Complete in History) "

" Zookeeper Distributed Lock - Diagram - Second Understanding "

Realize your king component freedom:

" King of the Queue: Disruptor Principles, Architecture, and Source Code Penetration "

" The King of Cache: Caffeine Source Code, Architecture, and Principles (the most complete in history, 10W super long text) "

" The King of Cache: The Use of Caffeine (The Most Complete in History) "

" Java Agent probe, bytecode enhanced ByteBuddy (the most complete in history) "

Realize your interview questions freely:

4000 pages of "Nin's Java Interview Collection" 40 topics

The PDF file update of the above Nien architecture notes and interview questions, ▼Please go to the following [Technical Freedom Circle] official account to get it▼


Origin blog.csdn.net/crazymakercircle/article/details/131045050