How Google responds to emergencies

  • Introduction

  • How Google helps protect customer data

  • Emergency Response

  • Team structure

  • Emergency Response Process

    • Identification

    • Coordination

    • Resolution

    • Closure

    • Continuous improvement

  • Summary

  • References

Introduction

Maintaining a secure environment for customer data is a top priority at Google Cloud. Google protects customer data through an industry-leading information security operation that combines rigorous processes, a world-class team, and a multi-layered information security and privacy infrastructure. This article focuses on Google's principled approach to managing and responding to data incidents.

Incident response is an important aspect of Google's overall security and privacy program. Google has a rigorous process in place for managing data incidents. This process specifies the actions, escalation, mitigation, resolution, and notification of any potential incidents affecting the confidentiality, integrity, or availability of customer data.

At Google, a data incident means a breach of Google's security that results in the accidental or unlawful destruction, loss, alteration, or unauthorized disclosure of, or access to, customer data on systems managed or controlled by Google. Google takes steps to address foreseeable threats to data and systems, but data incidents do not include unsuccessful attempts or activities that do not compromise the security of customer data, such as failed login attempts, pings, port scans, denial-of-service attacks, and other network attacks.
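The distinction drawn in this definition can be sketched in a few lines of Python. The `SecurityEvent` fields and the `is_data_incident` helper are hypothetical names chosen for illustration and are not part of any Google tooling; the only rule encoded is the one stated above: an event counts as a data incident only if it actually results in a harmful outcome for customer data on a Google-managed system.

```python
from dataclasses import dataclass
from typing import Optional

# Outcomes named in the definition above.
HARMFUL_OUTCOMES = {"destruction", "loss", "alteration", "disclosure", "access"}


@dataclass(frozen=True)
class SecurityEvent:
    """Hypothetical record of a security event (illustrative fields only)."""
    kind: str                       # e.g. "port_scan", "failed_login", "breach"
    outcome: Optional[str]          # one of HARMFUL_OUTCOMES, or None if nothing happened
    touches_customer_data: bool     # did the outcome involve customer data?
    on_google_managed_system: bool  # did it occur on a system Google manages or controls?


def is_data_incident(event: SecurityEvent) -> bool:
    """Return True only when every condition of the definition holds."""
    return (
        event.outcome in HARMFUL_OUTCOMES
        and event.touches_customer_data
        and event.on_google_managed_system
    )


port_scan = SecurityEvent("port_scan", None, False, True)
breach = SecurityEvent("breach", "disclosure", True, True)
```

Under these assumptions, `is_data_incident(port_scan)` is false because the scan produced no harmful outcome, while `is_data_incident(breach)` is true.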

How Google helps protect customer data

The security of customer data is paramount, and data security relies on Google's collaboration with customers. Google is responsible for securing the underlying cloud infrastructure and services, and customers are responsible for securing their applications, devices, and systems when building on top of Google Cloud Infrastructure. Google provides customers with guidance and a variety of security features to achieve Google-caliber security practices:

  • Identity and Access Management

  • Data-at-rest and data-in-transit are encrypted by default without requiring any additional action by the customer

  • Multi-factor authentication, including phishing-resistant hardware security keys

  • Extensive network security options, including virtual private cloud (VPC) and shared VPC, built-in DDoS protection for software-as-a-service (SaaS) and platform-as-a-service (PaaS) solutions, and options to apply the same mechanisms to infrastructure-as-a-service (IaaS) solutions

  • Detailed audit logs

First, Google's infrastructure, together with its technology, personnel, and process safeguards, supports vulnerability management at scale while trying to avoid additional workload for customers. Google describes five principles behind its security design as follows:

Security domain and concept
  1. Infrastructure security

    Infrastructure security is Google's core competency. Security should not be an afterthought or an occasional measure, but an integral part of everyday work.

  2. Data life cycle

    Google focuses not only on protecting stored data but on the complete data life cycle: collection, discovery, consent, access, protection, deletion, and export. This is also what drives Google to promote HTTPS and secure DNS globally.

  3. Authentication, authorization, and access control

    Define identity policies that govern who may access and authorize what, identify legitimate access, and detect and respond to access that falls outside those policies.

  4. Application security

    The security of Google's own platforms (GCP, Gmail, Borg) and of its products (Android, Chrome).

  5. Operations management and audit

    The "human factor": security and risk management built around Google's entire operation, including how administrative back ends are operated and how infrastructure is run.

For more information on how Google secures Google Cloud, see the article "Overview of Google Infrastructure Security Design" and the associated NEXT '18 Security Presentation, or visit the Google Cloud Security website.

Google provides customers with visibility into the various services they use on Google Cloud; customers can use the Google Workspace Security Center to prevent, detect and resolve issues with Gmail, Drive, devices, OAuth and user accounts. Likewise, for GCP, customers can use Cloud Security Command Center to gain visibility into data about resources, vulnerabilities, risks, and policies in their organization.

Customers, for their part, must properly configure security features to meet their own needs, install software updates, set up network security zones and firewalls, and ensure that end users secure their account credentials and do not expose sensitive data to unauthorized parties.

The diagram below shows Google's shared responsibility model for cloud security, illustrating how the respective responsibilities of the customer and Google vary with the extent to which the customer uses managed services. As customers move from on-premises solutions to IaaS, PaaS, and SaaS cloud offerings, Google manages more of the overall cloud service and the customer's security responsibilities decrease accordingly. Technically, the security defenses are divided into 16 layers under one unifying concept: defense in depth, applied at scale by default.

Google Security Capability Model

Emergency Response

Google's incident response program is managed by expert incident responders across many specialized functions to ensure that each response is well suited to the challenges each incident presents. Depending on the specific nature of the incident, the specialized response team may include:

  • Cloud Incident Management

  • Product Engineering

  • Site Reliability Engineering

  • Cloud Security and Privacy

  • Digital Forensics

  • Global Investigations

  • Signals Detection

  • Security, Privacy, and Product Counsel

  • Trust and Safety

  • Counter-Abuse Technology

  • Customer Support

Subject matter experts from these teams are involved in a variety of ways. For example, the Incident Commander coordinates incident response and, if needed, the digital forensics team detects ongoing attacks and conducts forensic investigations. Product engineers work to limit the impact to customers and provide solutions to fix affected products. The legal team works with appropriate security and privacy team members, implements Google's evidence-gathering strategy, works with law enforcement and government regulators, and advises on legal issues and requirements. Customer Support responds to inquiries and requests from customers to provide them with additional information and assistance. Communication is always an important part of emergency response.

In 2020, Google began trying to recruit internal engineers with an interest in security to take part in the incident response process. Through training, on-call shifts, and information sharing, incident response capability is opened up beyond the dedicated teams. With on-call rotations spanning time zones, some of these part-time volunteer engineers may eventually become fully qualified for the Incident Commander role.

Team structure

As with the SRE process, when Google declares an incident, it designates an Incident Commander to coordinate the incident response and resolution. The Incident Commander selects specialists from different teams to form a response team, delegates to them the responsibility for managing different aspects of the incident, and manages the entire process from declaration to closure. The diagram below depicts a typical response organization: the various roles, their relationships, and their respective responsibilities during incident response.

Emergency Response Team Structure
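The delegation model described above can be sketched as a small data structure. This is a minimal illustration, not Google's tooling: the `Incident` class, its fields, the role names, and the people are all hypothetical. It captures only two ideas from the text: one Incident Commander owns the incident from declaration to closure, and individual aspects of the response are delegated to named leads.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class Incident:
    """Hypothetical incident record owned by a single Incident Commander."""
    title: str
    commander: str
    leads: Dict[str, str] = field(default_factory=dict)  # role -> person
    closed: bool = False

    def delegate(self, role: str, person: str) -> None:
        """Hand one aspect of the response to a lead."""
        if self.closed:
            raise RuntimeError("cannot delegate on a closed incident")
        self.leads[role] = person

    def close(self) -> None:
        """Mark the incident as closed; no further delegation is allowed."""
        self.closed = True


# Illustrative roles and names only.
incident = Incident(title="example incident", commander="alice")
incident.delegate("operations/technical lead", "bob")
incident.delegate("communications lead", "carol")
```

Keeping the commander as a single field and the leads as a role-to-person mapping mirrors the single point of coordination described above while letting the set of roles vary per incident.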

Emergency Response Process

Every data incident is unique, and the goals of the incident response process are to protect customer data, restore normal service as quickly as possible, and meet regulatory and contractual compliance requirements. The main steps of Google's incident response program are outlined below:

Incident response workflow
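The five phases covered in the rest of this section (identification, coordination, resolution, closure, and continuous improvement) can be modeled as an ordered enumeration. The sketch below assumes a strictly linear progression, which is a simplification made for illustration; the `Phase` enum and `next_phase` helper are hypothetical names.

```python
from enum import Enum
from typing import Optional


class Phase(Enum):
    """Incident response phases, in the order this article presents them."""
    IDENTIFICATION = 1
    COORDINATION = 2
    RESOLUTION = 3
    CLOSURE = 4
    CONTINUOUS_IMPROVEMENT = 5


def next_phase(current: Phase) -> Optional[Phase]:
    """Return the phase that follows `current`, or None after the last one."""
    members = list(Phase)  # Enum members iterate in definition order
    index = members.index(current)
    return members[index + 1] if index + 1 < len(members) else None
```

For example, `next_phase(Phase.IDENTIFICATION)` returns `Phase.COORDINATION`, and `next_phase(Phase.CONTINUOUS_IMPROVEMENT)` returns `None`.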

Identification

Early and accurate identification of incidents is key to strong incident management. This phase focuses on monitoring security events to detect and report potential data incidents.

Google's Incident Detection team uses advanced detection tools, signals, and alerts to spot potential incidents early.

Google's sources of incident detection include:

  • Automated network and system log analysis: Automated analysis of network traffic and system access can help identify suspicious, abusive, or unauthorized activity and escalate it to Google's security staff

  • Testing: Google's security team proactively scans for security threats using penetration testing, quality assurance (QA) measures, intrusion detection, and software security audits

  • Internal code audits: Source code audits uncover hidden vulnerabilities and design flaws, and verify that key security controls are implemented

  • Product-specific tools and processes: Automated tooling specific to each team's function is used wherever possible to enhance Google's ability to detect incidents at the product level

  • Usage anomaly detection: Google employs a multi-layered machine learning system to identify anomalies in user activity across browsers, devices, app logins, and other usage events

  • Data center and/or workplace services security alerts: Security alerts in data centers scan for incidents that could affect the company's infrastructure

  • Google employees: A Google employee detects an anomaly and reports it

  • Google's bug bounty program: External security researchers sometimes report technical vulnerabilities in Google-owned browser extensions, mobile applications, and web applications that may affect the confidentiality or integrity of user data
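To make the first detection source more concrete, here is a deliberately simple sketch of automated log analysis. It is not a description of Google's detection tooling: the log record layout, the `flag_suspicious_sources` helper, and the threshold are all assumptions made for illustration. The idea shown is only that access logs are scanned automatically and that unusual sources are escalated to a person for review.

```python
from collections import Counter
from typing import Iterable, List, Tuple

# Hypothetical log record: (source address, action, whether access was allowed)
LogRecord = Tuple[str, str, bool]


def flag_suspicious_sources(records: Iterable[LogRecord], threshold: int = 3) -> List[str]:
    """Return sources with at least `threshold` denied actions, sorted by name.

    Flagged sources are candidates for escalation to security staff,
    not confirmed incidents.
    """
    denied = Counter(source for source, _action, allowed in records if not allowed)
    return sorted(source for source, count in denied.items() if count >= threshold)


sample = [
    ("10.0.0.5", "read", True),
    ("10.0.0.9", "read", False),
    ("10.0.0.9", "write", False),
    ("10.0.0.9", "delete", False),
    ("10.0.0.7", "read", False),
]
print(flag_suspicious_sources(sample))  # ['10.0.0.9']
```

A fixed threshold is the crudest possible rule; the usage anomaly detection described in the list above relies on machine learning rather than a static count.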

Coordination

When an incident report is received, the on-call responder reviews and evaluates the nature of the report to determine whether it represents a potential data incident and, if so, initiates Google's incident response process.

Once an incident is confirmed, it is handed over to an Incident Commander, who assesses the nature of the incident and implements a coordinated response. At this stage, the response includes completing a triage assessment of the incident, adjusting its severity if needed, and activating the required incident response team, with the appropriate operations/technical leads reviewing the facts and identifying key areas that require investigation. Google assigns a product lead and a legal lead to make key decisions about how to respond. The Incident Commander assigns appropriate personnel to investigate and gather facts.

Many aspects of Google's response depend on a severity assessment based on key facts gathered and analyzed by the incident response team. These facts may include:

  • Potential for harm to customers, third parties, and Google

  • The nature of the incident (such as whether it will result in data being corrupted, accessed, or made unavailable)

  • Types of data that may be affected

  • The impact of the incident on customers' ability to use the service

  • The status of the incident (such as whether the incident has been isolated, is ongoing, or contained)

The Incident Commander and other leads regularly reassess these factors throughout the response as new information becomes available, to ensure that Google's response is given the appropriate resources and urgency. Incidents with the most severe impact are assigned the highest severity. A communications lead is also designated, who works with the other leads to develop a communications plan.
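The severity assessment and its periodic reassessment can be illustrated with a toy scoring rule. The article does not describe how Google actually computes severity, so everything here is an assumption: the 0 to 3 scales, the `IncidentFacts` fields, the labels, and the rule that an uncontained incident is bumped up one level.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class IncidentFacts:
    """Key facts from the list above; all scales are illustrative assumptions."""
    potential_harm: int    # 0 (none) .. 3 (severe)
    data_sensitivity: int  # 0 (public) .. 3 (highly sensitive)
    service_impact: int    # 0 (none) .. 3 (service unusable)
    contained: bool        # True once the incident is isolated or contained


def assess_severity(facts: IncidentFacts) -> str:
    """Map facts to a severity label using a made-up scoring rule."""
    score = max(facts.potential_harm, facts.data_sensitivity, facts.service_impact)
    if not facts.contained:
        score += 1  # an ongoing incident is treated as more urgent
    if score >= 4:
        return "critical"
    if score == 3:
        return "high"
    if score == 2:
        return "medium"
    return "low"


initial = IncidentFacts(potential_harm=3, data_sensitivity=2, service_impact=1, contained=False)
updated = replace(initial, contained=True)  # new information: incident is now contained
```

Reassessment is just a second call with updated facts: under this rule `assess_severity(initial)` yields "critical", and `assess_severity(updated)` drops to "high" once the incident is contained.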

Resolution

During this phase, the focus is on investigating root causes, limiting the impact of the incident, addressing immediate security risks (if any), implementing necessary fixes during remediation, and restoring affected systems, data, and services.

Affected data is restored to its original state wherever possible. Depending on the circumstances of a particular incident, Google may take a number of different steps to resolve it. For example, a technical or forensic investigation may be required to reconstruct the root cause of an issue or to identify the impact on customer data. If data is accidentally changed or destroyed, Google may attempt to restore the data from its backup copies.

A key aspect of remediation is notifying customers when incidents affect their data. Throughout the incident, key facts will be evaluated to determine whether the incident affected customer data. If it is necessary to notify the customer, the Incident Commander will initiate the notification process. The communications lead will develop a communications plan with input from the product and legal leads, notify affected customers, and respond to customer requests with the help of Google's support team after notification.

Google is committed to providing timely, clear, and accurate notifications that include the known details of the incident, the steps Google has taken to mitigate potential risks, and the actions Google recommends customers take to address the incident. Google makes its best effort to provide enough detail about the incident that customers can assess and meet their own notification obligations.
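The three elements of a customer notification named above (known details, mitigation steps taken, and recommended customer actions) can be captured in a small structure. This is a hypothetical sketch for illustration only; the `CustomerNotification` class and its plain-text layout are assumptions and do not reflect the format of real Google notifications.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class CustomerNotification:
    """Hypothetical notification holding the three elements named above."""
    known_details: str
    mitigation_steps: List[str]
    recommended_actions: List[str]

    def render(self) -> str:
        """Format the notification as plain text, one item per line."""
        lines = ["What we know: " + self.known_details, "Steps taken:"]
        lines += ["  - " + step for step in self.mitigation_steps]
        lines.append("Recommended actions:")
        lines += ["  - " + action for action in self.recommended_actions]
        return "\n".join(lines)


note = CustomerNotification(
    known_details="placeholder description of the incident",
    mitigation_steps=["placeholder mitigation step"],
    recommended_actions=["placeholder action one", "placeholder action two"],
)
```

Making all three fields required means a notification cannot be constructed without stating what is known, what was done, and what the customer should do next.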

Closure

Following successful remediation and resolution of an incident, the incident response team evaluates the lessons learned from it. If the incident raises critical issues, the Incident Commander may initiate a post-mortem analysis. During this process, the incident response team reviews the cause of the incident and Google's response, and identifies key areas for improvement. In some cases, this requires discussions with different product, engineering, and operations teams, as well as product improvement work. If follow-up work is required, the incident response team develops an action plan to complete it and assigns a project manager to lead the long-term effort. The incident is closed once the remediation work is complete.

Continuous improvement

Google strives to learn from each incident and to implement preventive measures that help avoid future incidents.

Actionable insights from incident analysis help Google improve its own tools, training, and processes, as well as Google's overall security and privacy data protection programs, security policies, and/or response efforts. Key lessons learned also help in prioritizing related efforts and building better products.

Google's security and privacy professionals review and continually improve the company's security programs for all networks, systems, and services, and provide project-specific consulting services to product and engineering teams. They deploy machine learning, data analytics, and other new technologies to monitor for suspicious activity on Google's networks, address information security threats, perform routine security evaluations and audits, and engage outside experts to conduct regular security assessments. Additionally, Google has a full-time team, known as Project Zero, that aims to guard against targeted attacks by reporting bugs to software vendors and filing them in an external database.

Google conducts regular training and awareness campaigns to drive innovation in security and data privacy. Dedicated incident response staff are trained in forensics and evidence handling, including the use of third-party and proprietary tools. Google also tests its incident response processes in key areas, such as systems that store sensitive customer information. These tests consider a variety of scenarios, including insider threats and software vulnerabilities, and help Google better prepare for security and privacy incidents.

As part of the ISO-27017, ISO-27018, ISO-27001, PCI-DSS, SOC 2, and FedRAMP programs, Google regularly tests its processes to provide Google customers and regulators with independent verification of compliance controls.

Summary

As noted above, Google has a world-class incident response program that provides the following key functions:

  • Processes built on industry-leading technology, purpose-built to resolve incidents and optimized to operate efficiently at Google's scale

  • Pioneering monitoring systems, data analytics, and machine learning services to proactively detect and contain incidents

  • Numerous dedicated subject matter experts who can be called on to respond to incidents of any type or size

  • Mature process for promptly notifying affected customers, consistent with Google's commitments in the Terms of Service and Customer Agreement

Keeping data safe is at the heart of Google's business. Google will continue to invest in its overall security program, resources, and expertise so that customers can rely on Google to respond effectively when incidents occur, protect their data, and maintain the high reliability they expect from Google services.

References

https://www.youtube.com/watch?v=tz5ggxqEOos&ab_channel=GoogleCloudTech 

https://www.youtube.com/watch?v=NhyRtgDpiC0

https://www.infoq.cn/article/w1ldnnzvglcaohkxcfpp

https://www.sdnlab.com/22984.html 

https://www.secrss.com/articles/5328

https://services.google.com/fh/files/misc/data_incident_response_2018.pdf?hl=zh-cn

Origin blog.csdn.net/weixin_47208161/article/details/115713794