The 10 most common critical security issues for large model applications defined by OWASP

Register in HUAWEI CLOUD before July 15th, and you can participate in the Check lucky draw. The lucky draw is at the end of this article

1. "OWASP TOP10 LLMs Project" Project Introduction

*OWASP Top 10 for Large Language Model Applications

The OWASP Top 10 Large Model Applications project aims to educate developers, designers, architects, managers, and organizations about the potential security risks when deploying and managing large language models (LLMs). This project provides a list of the top 10 most critical vulnerabilities commonly found in LLM applications, highlighting their potential impact, ease of exploitation, and prevalence in real-world applications. Examples of vulnerabilities include hint injection, data leakage, insufficient sandboxing, and unauthorized code execution, among others. The goal is to increase awareness of these vulnerabilities, recommend remediation strategies, and ultimately improve the security posture of LLM applications.

This article mainly refers to the translation of the latest version: "OWASP Top 10 for Large Language Models (0.5)".

  • OWASP LLMs TOP10 Diagram

2. LLM01: Prompt Injections

Hint injection vulnerabilities in LLMs involve dodgy inputs leading to undetected actions. Repercussions range from data exposure to unauthorized operations and serving attackers' goals.

A hint injection vulnerability occurs when an attacker manipulates a trusted large language model (LLM) with constructed input hints through multiple channels. Such manipulations often go undetected due to the inherent trust in the output of LLMs.

There are two types: direct (the attacker affects the input to the LLM) and indirect (the "poisoned" data source affects the LLM).

Results can range from exposing sensitive information to influencing decision-making. In complex situations, LLMs can be tricked into unauthorized operations or impersonations, effectively serving the attacker's goals without alerting the user or triggering protective measures.

2.1. Precautions

  • Privilege Control
    Restrict the permissions of the LLM to the minimum required for its function. Prevents LLM from changing a user's status without explicit approval.

  • Enhanced Input Validation
    Implement robust input validation and sanitization methods to filter out potentially malicious prompt input from untrusted sources.

  • Isolation and Control of External Content Interactions
    Isolate untrusted content from user prompts and control interactions with external content, especially with plugins that may cause irreversible actions or expose personally identifiable information (PII).

  • Trust Management
    Establish trust boundaries between LLMs, external sources, and extensible functions (e.g., plugins or downstream functions). Treat LLM as an untrusted user and keep end users in control of the decision-making process.

3. LLM02: Insecure Output Handling

When a plugin or application accepts LLM output, the lack of security controls can lead to XSS, CSRF, SSRF, privilege escalation, remote code execution, and potentially enable proxy hijacking attacks.

An insecure output handling vulnerability is a hint injection vulnerability that occurs when a plugin or application blindly accepts Large Language Model (LLM) output without proper scrutiny and passes it directly to a backend, privileged, or client function. This bug occurs. Since the content generated by the LLM can be controlled by prompting for input, this behavior is similar to providing the user with indirect access to additional functionality.

Successful exploitation of an insecure output handling vulnerability could lead to XSS and CSRF in web browsers, as well as SSRF, privilege escalation, or remote code execution on backend systems. The impact of this vulnerability is increased when an application allows LLM content to perform actions outside the intended user permissions. Additionally, this can be used in conjunction with proxy hijacking attacks to allow the attacker privileged access to the target user's environment.

3.1. Precautions

  • Treat the model like any other user and apply appropriate input validation to responses from the model to backend functions;
  • Encode the output from the model to reduce unnecessary interpretation of JavaScript or Markdown code.

4. LLM03: Training Data Poisoning

LLM learns from diverse texts, but has the risk of poisoning the training data, leading to user misinformation.

Large language models (LLM) use different raw texts to learn and generate output. Poisoning training data by an attacker introducing a vulnerability can corrupt the model, exposing users to incorrect information. The OWASP list of LLMs highlights the risks of over-reliance on AI content. Key data sources include Common Crawl, WebText, and OpenWebText for models such as T5 and GPT-3, including public news and Wikipedia and books, accounting for 3% of GPT-16 training data.

4.1. Precautions

  • Validate the supply chain for training data (if external) and maintain proof, similar to the SBOM (Software Bill of Materials) approach;
    • verify the legality of the data source and the data contained therein;
    • By crafting different models on separate training data for different use cases to create finer and more accurate generated AI output;
    • Ensuring that sufficient sandboxes exist to prevent models from grabbing unexpected data sources;
    • Use strict censorship or input filters for specific training data or categories of data sources to control the amount of fake data;
    • Implement a dedicated LLM to measure adverse outcomes and train other LLMs using reinforcement learning techniques;
    • Perform LLM-based red team exercises or LLM vulnerability scans during the testing phase of the LLM lifecycle.

5. LLM04: Denial of Service

The attacker interacts with the LLM in a particularly resource-intensive manner, causing degradation of service quality for them and other users, or incurring high resource costs.

5.1. Precautions

  • limit resource usage per request;
  • limit the resource usage of each step to speed up the execution of requests involving complex parts;
  • Limits the number of queued and total operations in the system that react to LLM responses.

6. LLM05: Supply Chain Security (Supply Chain)

Integrity risks exist in the LLM supply chain due to vulnerabilities leading to bias, security breaches or system failures. Questions come from pretrained models, crowdsourced data, and plugin extensions.

The supply chain in LLM can be vulnerable to attacks that affect the integrity of training data, ML models, deployment platforms, and lead to biased results, security breaches, or complete system failures. Vulnerabilities have traditionally focused on software components, but have been expanded in AI due to the prevalence of transfer learning, reusing pre-trained models, and crowdsourcing data. Among public LLMs, LLMs such as OpenGPT extensions are also areas vulnerable to this vulnerability.

6.1. Precautions

  • Carefully vet sources and suppliers;
  • Vulnerability scanning of components, not only when deployed to production, but also before being used for development and testing;
  • The model development environment uses its own curated package repository for vulnerability checking;
  • code signing;
  • Conduct robustness tests on the entire link providing services to prevent tampering and poisoning of models and data, as well as the entire MLOps pipeline;
  • Implement adversarial robustness training to help detect extraction queries;
  • Review and monitor vendor security and access;
  • audit.

7. LLM06: Permission Issues

The lack of authorization tracking between plugins could enable indirect hint injection or usage by malicious plugins, leading to privilege escalation, loss of confidentiality, and potential remote code execution.

Authorization is not tracked between plugins, allowing malicious actors to take action in the context of LLM users through indirect hint injection, using malicious plugins, or other methods. Depending on the plugins available, this could lead to privilege escalation, loss of confidentiality, and even remote code execution.

7.1. Precautions

  • Any action performed by sensitive plugins that requires manual authorization;
  • call no more than one plugin per user input, resetting any plugin-provided data between calls;
  • Prevent sensitive plugins from being called after any other plugins;
  • Perform taint tracking on all plug-in content, ensuring that the authorization level of the calling plug-in corresponds to the minimum authorization of any plug-in providing input to the LLM prompt.

8. LLM07: Data Leakage

Data leaks in LLM can expose sensitive information or proprietary details, leading to privacy and security breaches. Proper data cleansing and clear terms of use are critical for prevention.

A data breach occurs when an LLM accidentally leaks sensitive information, proprietary algorithms, or other confidential details through its responses. This can lead to unauthorized access to sensitive data or intellectual property, privacy violations, and other security breaches. It is important to note that users of LLM applications should understand how to interact with LLM and determine how they are at risk of inadvertently entering sensitive data.
Vice versa, LLM applications should perform sufficient data cleansing and sanitization validation to help prevent user data from entering training model data. Additionally, the company hosting the LLM should provide an appropriate terms-of-user policy to make users aware of how data is handled.

8.1. Precautions

  • Integrate appropriate data cleansing and sanitization techniques to prevent user data from entering training model data;
  • Implement strong input validation and sanitization methods to identify and filter out potentially malicious input;
  • Maintain continuous supply chain risk mitigation through techniques such as SAST (Static Application Security Testing) and SBOM (Software Bill of Materials) attestation to identify and remediate vulnerabilities in third-party software or package dependencies;
  • Implement a specialized LLM to benchmark against adverse outcomes and train other LLMs using reinforcement learning techniques;
  • Perform LLM-based red team exercises or LLM vulnerability scans during the testing phase of the LLM lifecycle.

9. LLM08: Excessive Agency

Unrestricted agency can lead to undesirable operations and operations when the LLM interfaces with other systems. Like web applications, LLMs should not be self-policing, controls must be embedded in the API.

LLMs can be granted a level of agency - the ability to interface with other systems to take action. Any unrestricted bad manipulation of the LLM (regardless of the root cause, e.g., hallucinations, direct/indirect cue injection, or just poorly designed benign cue, etc.) can lead to undesired actions being taken. Just like we never trust client-side validation in web applications, LLM should not be trusted to self-police or self-limit, controls should be embedded in the API of the connected system.

9.1. Precautions

  • Reduce the permissions granted to LMM to the minimum necessary to limit the scope of undesirable operations;
  • Implement rate limiting to reduce the number of bad operations;
  • With human-computer interaction controls, a human is required to approve all actions before they can be performed.

10. LLM09: Overreliance on LLM-generated content (Overreliance)

Over-reliance on LLM can lead to misinformation or inappropriate content due to "illusions". Without proper oversight, this can lead to legal issues and reputational damage.

Over-reliance on LLM is a security hole that occurs when a system relies too much on LLM for decision-making or content generation without adequate oversight, validation mechanisms, or risk communication. LLMs, while capable of producing creative and informative content, are also susceptible to “illusions” that produce content that is factually incorrect, absurd, or inappropriate. If left unchecked, these illusions can lead to misinformation, miscommunication, potential legal issues, and repercussions on a company's reputation.

10.1. Precautions

  • Ongoing monitoring
    LLM outputs are regularly monitored and reviewed to ensure they are factual, coherent and appropriate. Use manual review or automated tools for larger scale applications;
  • Fact checking
    to verify the accuracy of information provided by the LLM before it is used for decision-making, information dissemination or other critical functions;
  • Model Adjustment
    Adjust your LLM to reduce the possibility of hallucinations. Techniques include rapid engineering, parameter efficient tuning (PET) and full model tuning;
  • Set up validation mechanisms
    Implement automated validation mechanisms to check generated output against known facts or data;
  • Improve risk communication
    Follow risk communication literature and best practices from other sectors to facilitate dialogue with users, create actionable risk communication, and continually measure risk communication effectiveness.

11. LLM10: Insecure Plugins

Plugins that connect LLM to external resources could be exploited if they accept free-form text input, enabling malicious requests that could lead to unwanted behavior or remote code execution.

The plugin that connects LLM to some external resource accepts free-form text as input instead of parameterized and type-checked input. This allows potential attackers a lot of freedom to craft malicious requests to plugins, which can lead to all kinds of bad behavior, even remote code execution.

11.1. Precautions

 

Guess you like

Origin blog.csdn.net/hwxiaozhi/article/details/131641685