Building DevSecOps: The Ops Phase Must Be Strengthened

  • What: Combining Security and Reliability

  • Why: Mistaking "DevSec" for "DevSecOps"

  • How: Securing Your Ops

    • Security Chaos Engineering

    • Resilience

    • No Touch

  • When: Perceptions of Security Need to Change Now

  • References

What: Combining Security and Reliability

  The industry has published many articles on DevSecOps practice, and most of them introduce static code analysis, passive scanners, security frameworks and SDKs, and security checkpoints when a domain name goes live. The first misunderstanding in practicing DevSecOps is mistaking "DevSec" for "DevSecOps": the continuously running operations and delivery phase is left out. Only shifting security left onto developers is choosing to do the easy thing, not the right thing.

  Developers are usually not security engineers, and security teams usually have no developers. Neither team has the resources or skills to meet the needs of the other. Security is not the responsibility of the security and R&D teams alone; it requires the participation of every engineer. The author believes that security work should not be approached only from the perspective of preventing incidents, but must also be considered from the perspective of bringing positive value to business development. DevSecOps starts with a focus on code, but the cycle continues through deployment and operations, so security in the Deploy and Operate phases is just as important.

  It is time to talk about integrating security and reliability. The two are inseparable: without security, reliability is out of the question; without reliability, security is incomplete and fragile. Security operations in the operations and delivery phases will be the main battlefield of the DevSecOps industry over the next 5 to 10 years, and in China, Alibaba and Tencent have already committed people and built up technology in this area.

Why: Mistaking "DevSec" for "DevSecOps"

  When practicing DevSecOps, many people go wrong at the first step: they mistake "DevSec" for "DevSecOps" and ignore security in the Ops stage. Continuous security has to be defined with two teams in mind: one is the development and engineering team, and the other is the operations/release team. Making demands on developers is a long-standing habit of security teams. A common prejudice among security people is that SREs only provision machines and apply patches; in fact, the SRE role is completely different today. Nobody wants to burden SREs, but we cannot compromise here. How can the work be done well without being strict, or by keeping everyone comfortable? Change comes at a price.

Figure: a company introduced a so-called DevSecOps toolchain, and it stops at the go-live stage

Paying attention only to the Sec of Dev, and not the Sec of the Ops stage, shows up in four main ways:

  Thinking is limited by past experience. For cloud-native, microservice, big data, and SaaS projects, it is not enough to pass the security team's web and mobile application security review and declare "the security review has passed, so it can go live." When people say application security, they usually mean R&D security in the narrow sense, as if host security, deployment security, operations security, and infrastructure security had nothing to do with them and security should not reach that far. But the trend, now and in the future, is "infrastructure as a service" (IaaS) and "configuration as a service". We should not focus only on the code dimension, but on the entire life cycle of the product. Supply chain attacks draw their power from more than code. Survey data shows that developers spend 39 percent of their time managing DevOps infrastructure.

  Take supply chain attacks as an example. There are 8 attack links, as shown below, but if you focus only on R&D security you can cover just the 4 at the code level, ignoring scenarios A, B, G, and H in the figure below.

Figure: revisiting the broader landscape of supply chain attacks

  Choosing the present over the long term. The current problem is that security is considered only in the requirements, design, development, and implementation stages. From a business perspective, enterprise security is itself an ongoing project, and attention must be paid to how security capabilities are delivered end to end. Hardware and software development is never finished; there is always something we can do to reduce the risk and impact of future vulnerabilities.

  Security engineering requirements are set too low. We always assume that a company's security risks are introduced by code written by R&D staff. In fact, in a full-chain attack, exploiting a code vulnerability is only the point of entry, not the whole attack. Most current capabilities focus on intrusion prevention, and many companies cannot stop an insider from "deleting the database and running away". A great deal of energy goes into scanning, fixing vulnerabilities, and alerting for long-tail work, and the return on that security investment is already diminishing. Why not pay real attention to the gaps in what has been built? Of course, some black-box and white-box security operations capabilities "look" relatively mature and unchallenging, but in essence that is because we have not set higher requirements. We should encourage not only DevSec going from 10 to 100, but also SecOps going from 0 to 1 and from 1 to 10.

  Ignoring the human factor. Humans make mistakes, and we can only reduce the impact of problems, never fully eliminate them. Beyond technology and process, how people operate is the biggest variable affecting the outcome of DevOps.

  ESG's survey of IT and cybersecurity professionals in the private and public sectors in North America shows that 48% of organizations deliberately push vulnerable code to production, indicating that even when tools and processes can detect the problem, people actively ignore it.

  Another recent example: can we really solve the problem of "deleting the database and running away"? ByteDance's recent response, in the incident reported as "a ByteDance intern deleted a database, and the company classified it as a major incident", gives the answer: no.

How: Securing Your Ops

  DataOps, MLOps, and AIOps can all be grouped under the Operations (Ops) phase. SREs help teams find the balance between releasing new features and ensuring reliability for users. SREs (or DevOps engineers) focus on stability and security, and the stability and security issues they care about are consistent in solution strategy, technical implementation, and demands on people. However, communication and collaboration between security and SRE is far from sufficient today. In the figure below, it is obvious that everyone does more on the left side, while security proficiency with the technology stack on the right is low.

Figure: the DevOps toolchain. Do you pay more attention to the left side than to the right?

  We will never fully understand our own systems, but security has not kept up with this principle. We always like to standardize and unify, yet the distributed systems we build are so complex that no one can explain their entire operation. System insecurity is the norm, and gaps in the emergency response process are normal. The idea of preventing every erroneous event through strict guarding no longer fits where the industry is going. Facing new challenges, how do we keep doing things well? We need to keep broadening our thinking, look forward and backward at the same time, and pay attention to Ops. The industry already has some practices that work across security and reliability:

Security Chaos Engineering

  Red team versus blue team exercises are a traditional approach, and their shortcomings are obvious: 1. Attackers can always find defects and the blue team can always receive alerts, so it is hard to tell how much work remains to be done; 2. Although a small number of excellent teams can run weekly or monthly drills, the frequency is still far too low; 3. The experiments are opaque and non-repeatable, the process is uncontrolled, and continuous feedback is missing. If red-blue exercises are positioned to "discover which problems exist in what has been built", security chaos engineering focuses on "which of those problems take priority".

  When we face a business system of complex scale, security failures are almost inevitable, and no security issue is accidental. No matter how much effort goes into fixing vulnerabilities, 0-days are almost inevitable; no matter how much security training you run, employees clicking on phishing links is almost inevitable; no matter how much risk awareness you build, intrusions happening faster than you can block them is inevitable.

  According to the IBM Security "Cost of a Data Breach Report 2020", 52% of data breaches are caused by malicious attacks, 23% by human error, and 25% by system failures. Security chaos engineering addresses the latter two categories, which together account for 48%, through technical means such as observing emergency response, verifying security controls, baseline monitoring, and risk discovery.

  Treating security incidents as failures requires operations and security to improve together. Security chaos engineering makes an enterprise security architecture more practical by quickly checking whether services are robust, secure, resilient, and able to tolerate unexpected security events.
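
  As a rough illustration of the idea, the sketch below runs one security chaos experiment against a toy environment: it injects a known misconfiguration, checks whether the security control notices it, and always restores the steady state. Everything here is hypothetical and chosen for illustration only (the `Firewall` class, `detect_exposed_ports`, `run_experiment`, and the list of sensitive ports); a real experiment would target an actual cloud API and a real detection pipeline.

```python
from dataclasses import dataclass, field


@dataclass
class Firewall:
    """Toy stand-in for a cloud security group: port -> allowed source CIDR."""
    rules: dict = field(
        default_factory=lambda: {443: "0.0.0.0/0", 22: "10.0.0.0/8"}
    )


def detect_exposed_ports(firewall, sensitive_ports=(22, 3306)):
    """Control under test: flag sensitive ports open to the whole Internet."""
    return sorted(
        port
        for port, cidr in firewall.rules.items()
        if port in sensitive_ports and cidr == "0.0.0.0/0"
    )


def run_experiment(firewall, port=22):
    """Inject one misconfiguration, observe the control, then roll back."""
    original = firewall.rules.get(port)
    firewall.rules[port] = "0.0.0.0/0"  # inject the fault
    try:
        detected = port in detect_exposed_ports(firewall)
    finally:
        # Restore the steady state even if detection raised an error.
        if original is None:
            del firewall.rules[port]
        else:
            firewall.rules[port] = original
    return {"fault": f"port {port} opened to 0.0.0.0/0", "detected": detected}


if __name__ == "__main__":
    fw = Firewall()
    print(run_experiment(fw, port=22))    # the control covers this port
    print(run_experiment(fw, port=8080))  # the control does not watch this port
```

  An experiment that comes back with `detected` set to `False` is the useful result: it is a repeatable, low-cost way to show which gap deserves priority, which is the contrast with red-blue drills drawn above.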

Resilience

  What should be the first step in a security incident? Restore normal business operation and stop the loss. Organizations need to build the capability to keep systems resilient and recoverable in the face of performance and security threats. The resilience discussed at this year's RSAC conference actually goes beyond the scope of security and is down-to-earth software engineering. Resilience requires security work to care not only about results but also about the maturity of each process stage, and it introduces the concepts of controlled degradation, redundancy, elastic design, isolation, and automatic mitigation and recovery.
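
  Several of these concepts (controlled degradation, isolation, automatic recovery) are commonly combined in a circuit breaker. The following is a minimal sketch of that pattern, not a production implementation: the `CircuitBreaker` class, its `max_failures` and `reset_after` parameters, and the injectable `clock` are all assumptions made for illustration.

```python
import time


class CircuitBreaker:
    """Isolate a failing dependency and serve a degraded result meanwhile."""

    def __init__(self, max_failures=3, reset_after=30.0, clock=time.monotonic):
        self.max_failures = max_failures  # consecutive failures before opening
        self.reset_after = reset_after    # seconds to wait before probing again
        self.clock = clock                # injectable so the breaker is testable
        self.failures = 0
        self.opened_at = None             # None means the circuit is closed

    def call(self, primary, fallback):
        # While open and still cooling down, skip the primary entirely.
        if (
            self.opened_at is not None
            and self.clock() - self.opened_at < self.reset_after
        ):
            return fallback()
        try:
            result = primary()
        except Exception:
            self.failures += 1
            if self.failures >= self.max_failures:
                self.opened_at = self.clock()  # open, or re-open after a probe
            return fallback()
        # Success: recover automatically to the normal, closed state.
        self.failures = 0
        self.opened_at = None
        return result
```

  A caller would wrap a risky dependency, for example `breaker.call(query_risk_engine, use_cached_verdict)`, where both function names are hypothetical. The point is that the business keeps running in a degraded mode and recovers without a human stepping in.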

No Touch

  Malicious users (or attackers) try to damage or disrupt the system to affect service availability, so we must put the "human factor" in a "cage". Imagine a developer pushes a wrong configuration through configuration management and changes how the software behaves. Has the code review and code scanning process been bypassed? Is this something application security should care about? SDL or DevSecOps is responsible for product security or application security as a whole, and should not stop at R&D security.

  For example, how do you prevent highly privileged operations staff from exposing a public IP to the Internet without authorization? Under No Touch, every change is audited and must go through automation, software-verified risk checks, and auditable fallback measures, with the goal of "improving production security and avoiding outages". Security check policies, security agents, and auditing focus on human-to-machine and machine-to-machine interactions.

  See AWS's "Humans and Data Don't Mix". Many toolchains still need to be built in this area, and the potential payoff is large.
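
  To make the No Touch idea more concrete, here is a small sketch of a change gate: every change request is evaluated by software, manual changes are rejected, high-risk actions need a second person, and each decision is written to an audit log whether or not it was allowed. The names (`ChangeRequest`, `HIGH_RISK_ACTIONS`, `evaluate`, `apply_change`) and the specific rules are illustrative assumptions, not a description of any particular vendor's mechanism.

```python
import json
import time
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class ChangeRequest:
    author: str
    resource: str
    action: str            # e.g. "bind_public_ip"
    via_automation: bool   # True if submitted by a pipeline, not a human shell
    approved_by: tuple = ()


# Hypothetical list of actions that require independent approval.
HIGH_RISK_ACTIONS = {"bind_public_ip", "open_port_to_internet", "drop_database"}


def evaluate(change):
    """Return (allowed, reason) for a proposed production change."""
    if not change.via_automation:
        return False, "manual changes are not allowed; submit through automation"
    if change.action in HIGH_RISK_ACTIONS:
        reviewers = set(change.approved_by) - {change.author}
        if not reviewers:
            return False, "high-risk action needs approval from another person"
    return True, "ok"


def apply_change(change, audit_log, clock=time.time):
    """Record every decision in the audit log, allowed or not."""
    allowed, reason = evaluate(change)
    record = {"ts": clock(), "allowed": allowed, "reason": reason}
    record.update(asdict(change))
    audit_log.append(json.dumps(record, sort_keys=True))
    return allowed


if __name__ == "__main__":
    log = []
    request = ChangeRequest("alice", "vm-1", "bind_public_ip", True, ("alice",))
    print(apply_change(request, log))  # self-approval does not count
    print(log[-1])
```

  In this sketch the privileged operator from the example above can still propose binding a public IP, but only through the pipeline and only with a reviewer other than themselves, and the attempt is recorded either way.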

When: Perceptions of Security Need to Change Now

  The author has never believed in DevSecOps as such. Only two ideas guide what I build in my own field: mandatory infrastructure security and extreme automation. The term DevSecOps serves only to persuade partners to cooperate :) It takes patience, wisdom, and persuasion.

  All in all, pay attention to the organization's complex IT and R&D processes, balance the choice of security architecture against how reasonable that architecture is, and keep security technology ahead so that it is not obsolete in three to five years. Plan for and respond to persistent security threats as a matter of routine, rather than aiming for 100% security.

  Rethink DevSecOps as continuous security, starting today, and act now!

References

https://www.researchgate.net/publication/335922038_Security_Chaos_Engineering_for_Cloud_Services

https://www.freebuf.com/articles/security-management/275605.html

https://security.tencent.com/index.php/blog/msg/150

https://devops.com/survey-shows-mounting-devops-frustration-and-costs/

https://security.googleblog.com/2021/06/introducing-slsa-end-to-end-framework.html

Origin blog.csdn.net/weixin_47208161/article/details/119338162