Why does Netflix have no operations jobs?

Netflix is one of the industry's leading practitioners of microservices architecture. Its work on public-cloud-based microservices, continuous delivery, monitoring, and stability assurance offers a large body of principles and practices for the rest of the industry to follow.

In the operations field as well, Netflix remains a model of best practice. So how does one of the world's top Internet companies define operations, and how does it carry out operations work?

The current state of operations at Netflix

Netflix has no operations positions. The closest corresponding role is the SRE (Site Reliability Engineer). But SRE ≠ operations: the core idea of SRE is to redesign and redefine operations work using the methods of software engineering.

The old approach of relying on people to do operations is replaced by one that relies on systems and tools, teamwork, organizational mechanisms, and culture. Operations, which used to sit at the very end of the R&D process, is brought back to the same starting line as development.

Why does Netflix take this to such an extreme?

We can look at this from three angles: Netflix's technical architecture, its organizational structure, and its corporate culture.

1. Technical architecture and business challenges at massive scale

Introducing a flexible microservices architecture improves developer productivity, but it also greatly increases architectural complexity, to the point where manual control is no longer possible. It makes subsequent delivery and online operations much harder, so a more effective, unified set of technical solutions is needed on top of the architecture to tackle this complexity. On top of that unified set of solutions, development and operations arrive at a new division of responsibilities and a new way of collaborating, namely SRE. Operations under a microservices architecture must therefore apply software engineering thinking and build a supporting system of tools. A microservice, in turn, must not only support business functions; it must also expose the basic maintenance capabilities that the later delivery and online operations phases need.

A few simple examples: taking a service offline, adjusting routing policies, dynamically adjusting concurrency limits, feature switches, ACL-based access control, circuit breaking and bypassing on errors, and logging of call relationships and service quality. Operations tools and service platforms are built on top of these capabilities.
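To make a few of these capabilities concrete, here is a minimal Python sketch of a service wrapper whose operational controls can be changed at runtime. It is an illustration only, not Netflix's implementation: the class `OperableService`, its attributes, and the handler `double_positive` are hypothetical names, and the circuit breaker is simplified to a consecutive-failure counter with a manual reset (no half-open state).

```python
import threading


class CircuitOpenError(RuntimeError):
    """Raised when the breaker is open and calls are being rejected."""


class OperableService:
    """Hypothetical wrapper: a business function plus runtime controls."""

    def __init__(self, handler, max_concurrency=2, failure_threshold=3):
        self._handler = handler
        self._lock = threading.Lock()
        self.enabled = True              # feature switch / take service offline
        self.allowed_callers = None      # ACL; None means "allow everyone"
        self.max_concurrency = max_concurrency      # adjustable at runtime
        self.failure_threshold = failure_threshold  # failures before tripping
        self._in_flight = 0
        self._consecutive_failures = 0

    @property
    def circuit_open(self):
        return self._consecutive_failures >= self.failure_threshold

    def reset_circuit(self):
        """Manual reset, standing in for an operator action."""
        with self._lock:
            self._consecutive_failures = 0

    def call(self, caller, *args):
        # Check every operational control before running business logic.
        with self._lock:
            if not self.enabled:
                raise RuntimeError("service is switched off")
            if self.allowed_callers is not None and caller not in self.allowed_callers:
                raise PermissionError(f"caller {caller!r} is not allowed")
            if self.circuit_open:
                raise CircuitOpenError("too many consecutive failures")
            if self._in_flight >= self.max_concurrency:
                raise RuntimeError("concurrency limit reached")
            self._in_flight += 1
        try:
            result = self._handler(*args)
        except Exception:
            with self._lock:
                self._consecutive_failures += 1
            raise
        else:
            with self._lock:
                self._consecutive_failures = 0
            return result
        finally:
            with self._lock:
                self._in_flight -= 1


def double_positive(x):
    """Example business function that fails on negative input."""
    if x < 0:
        raise ValueError("negative input")
    return x * 2
```

An operations tool could then flip `enabled`, replace `allowed_callers`, or change `max_concurrency` on a running instance without a redeploy. The point of the sketch is that these hooks live inside the service itself, which is why operations and architecture cannot be designed separately.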

As this shows, under the microservices model operations has become an essential part of the overall technical architecture, closely tied to the microservices system and inseparable from it.

If you think carefully about the DevOps philosophy and the topics derived from it, they follow the same logic. DevOps sets out to resolve the growing tension between development and operations, and the root cause of that tension is the ever-increasing technical complexity behind microservices architecture.

  Lesson 1 from Netflix: under the microservices model, we have to change our way of thinking and redefine operations. Operations must be closely integrated with the microservices architecture itself.

2. A more rational organizational structure and advanced tools and ideas

As mentioned above, under the microservices model operations has become an integral part of the overall technical architecture. If the two fall out of step, a series of serious problems follows.

As early as 2012, or even earlier, Netflix was aware of this problem. In its organizational structure, the middleware, SRE, DBA, delivery and automation tooling, infrastructure, and other teams all sit within one large team, Cloud and Platform Engineering, where they are planned and built in a unified way at the product level. This maximizes organizational capability and avoids a disconnect between development and operations.

  Lesson 2 from Netflix: a rational organizational structure is a necessary condition for putting the technical architecture into practice, and using technical means to solve the efficiency and stability problems encountered in operations is the fundamental solution.

3. A corporate culture in which freedom and responsibility coexist

Netflix's corporate culture is Freedom & Responsibility: freedom and responsibility coexist. Employees get a high degree of freedom, but they are also expected to have a stronger sense of responsibility and ownership.

In the technical teams this shows up as You Build It, You Run It. Engineers can push code to production or launch new services at any time, but as the owner you are also responsible for the stable operation of the services and code you put online.

Driven by this culture, development teams naturally think about an integrated end-to-end solution from the design phase onward, covering delivery and online operations, rather than just building features on demand and leaving post-delivery maintenance to a role called operations. The culture dictates that Netflix absolutely does not allow that situation to exist: you are the developer, you are the owner, and you are responsible end to end.

  Lesson 3 from Netflix: a sense of ownership is very important, and the right way of working needs to be guided. That is the gap between good and exceptional.

The problem:

Today, many companies adopt a microservices architecture without fully considering the operations issues that come with it. Their operations teams still sit outside the overall technical organization, let alone being integrated with the middleware and architecture design teams, and let alone being planned and built rationally at the product level.

The result is inefficient operations that rely entirely on manual work and are prone to online failures. Both development and operations end up in a painful state, and operations team members run into obstacles to their transition and growth.

To sum up:

1. Under the microservices model, we have to change our way of thinking and redefine operations. Operations must be closely integrated with the microservices architecture itself.

2. A rational organizational structure is a necessary condition for putting the technical architecture into practice, and using technical means to solve the efficiency and stability problems encountered in operations is the fundamental solution.

3. A sense of ownership is very important, and the right way of working needs to be guided. That is the gap between good and exceptional.

4. Shift from relying on human labor to relying on systems and tools, teamwork, organizational mechanisms, and culture.

 

Origin www.cnblogs.com/xiaobao2/p/11224898.html