Ranger learning (1)

Preface

Recently I started learning how to use the Ranger framework. Until now I only knew that Ranger is a security component for big data clusters. The next step is to work through a machine translation of the official documentation, together with a few articles, to learn how to use it.

1. Use Ranger to provide authorization in Hadoop

Ranger manages access control through a user interface that ensures consistent policy administration across Hadoop data access components. Security administrators can define security policies at the database, table, column, and file level, and can manage permissions for specific LDAP-based groups or individual users. Rules based on dynamic conditions such as time or geographic location can also be added to an existing policy. The Ranger authorization model is pluggable and can be extended to any data source using a service-based definition.

Once a user has been authenticated, their access rights must be determined. Authorization defines a user's access rights to resources. For example, a user may be allowed to create policies and view reports, but not allowed to edit users and groups. You can use Ranger to set up and manage access to Hadoop services.

Ranger lets you create services for specific Hadoop resources (HDFS, HBase, Hive, etc.) and add access policies to those services. You can also create tag-based services and add access policies to them. Tag-based policies let you control access to resources across multiple Hadoop components without creating separate services and policies in each component. You can also use Ranger TagSync to synchronize the Ranger tag store with an external metadata service such as Apache Atlas.

2. Overview of Ranger Policy

Ranger has two types of policies: resource-based and tag-based.

Resource-based policies
Ranger lets you configure resource-based services (HDFS, HBase, Hive, etc.) and add access policies to those services.
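
As a rough illustration, a resource-based policy for a Hive service can be written as a JSON-like structure. The sketch below builds one as a plain Python dictionary. The field names (service, resources, policyItems, accesses) follow the policy JSON of the Ranger public REST API as I understand it; the service name cl1_hive, the policy name, and the analysts group are made up for the example, so check the structure against your Ranger version before relying on it.

```python
import json


def build_hive_policy(service, name, database, groups, access_types):
    """Build a resource-based policy covering every table and column of one Hive database."""
    return {
        "service": service,          # name of the Ranger service (assumed to exist already)
        "name": name,
        "isEnabled": True,
        "resources": {
            "database": {"values": [database]},
            "table": {"values": ["*"]},
            "column": {"values": ["*"]},
        },
        "policyItems": [
            {
                "groups": list(groups),
                "users": [],
                "accesses": [{"type": t, "isAllowed": True} for t in access_types],
            }
        ],
    }


# Hypothetical names, for illustration only.
policy = build_hive_policy("cl1_hive", "finance-read", "finance", ["analysts"], ["select"])
print(json.dumps(policy, indent=2))
```

The dictionary is only constructed and printed here; sending it to Ranger Admin is left out on purpose.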

Tag-based policies
Ranger allows you to create tag-based services and add access policies to these services.

Overview of tag-based policies

• An important feature of Ranger tag-based authorization is the separation of resource classification from access authorization. For example, resources (HDFS files/directories, Hive databases/tables/columns, etc.) that contain sensitive data such as social security numbers, credit card numbers, or sensitive healthcare data can be tagged as PII/PCI/PHI, either when the resource enters the Hadoop ecosystem or at a later time. Once a resource is tagged, authorization for the tag is enforced automatically, which eliminates the need to create or update policies for that resource.
• Using tag-based policies also allows you to control access to resources across multiple Hadoop components without having to create separate services and policies in each component.
Details of the tags associated with resources are stored in a tag store. Ranger TagSync can be used to synchronize the tag store with an external metadata service such as Apache Atlas.
The Apache Ranger plugins retrieve tag details from the tag store for use during policy evaluation. To minimize the performance impact of looking up tags for resources during policy evaluation, the plugins cache the tags and periodically poll the tag store for changes. When a change is detected, the plugin updates its cache. In addition, the plugins store the tag details in a local cache file, just as policies are stored in a local cache file. If the tag store is unreachable when the component restarts, the plugin uses the tag data from the local cache file.
The plugins download tag details from the store managed by Ranger Admin. Ranger Admin persists the tag details in its policy store and provides a REST interface through which the plugins download them.
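
The caching behaviour described above can be sketched roughly as follows. This is not the plugin's real code (the plugins are written in Java); the class TagCache, the fetch_tags callable, and the cache file layout are all hypothetical and only mirror the idea of "poll, update on change, fall back to the local file when the store is unreachable".

```python
import json
import os
import tempfile


class TagCache:
    """In-memory tag cache backed by a local cache file (all names hypothetical)."""

    def __init__(self, fetch_tags, cache_path):
        self.fetch_tags = fetch_tags  # callable returning (version, {resource: [tag names]})
        self.cache_path = cache_path
        self.version = -1             # -1 means nothing has been loaded yet
        self.tags = {}

    def refresh(self):
        """Poll the tag store once. Return True if the in-memory cache was updated."""
        try:
            version, tags = self.fetch_tags()
        except OSError:
            # Store unreachable: after a restart, load the last tags saved on disk.
            if self.version < 0 and os.path.exists(self.cache_path):
                with open(self.cache_path) as f:
                    saved = json.load(f)
                self.version, self.tags = saved["version"], saved["tags"]
            return False
        if version == self.version:
            return False              # no change detected
        self.version, self.tags = version, tags
        with open(self.cache_path, "w") as f:
            json.dump({"version": version, "tags": tags}, f)
        return True

    def tags_for(self, resource):
        return self.tags.get(resource, [])


def reachable_store():
    return 3, {"hive:finance.salaries": ["PII"]}


def unreachable_store():
    raise ConnectionError("tag store is down")  # ConnectionError is a subclass of OSError


path = os.path.join(tempfile.mkdtemp(), "tags.json")
TagCache(reachable_store, path).refresh()        # first run: downloads tags, writes the file
restarted = TagCache(unreachable_store, path)    # simulated restart with the store down
restarted.refresh()
print(restarted.tags_for("hive:finance.salaries"))  # ['PII']
```

A real plugin would call refresh on a timer; the single calls here just stand in for one polling cycle each.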

Tags

Ranger tags can have attributes. You can use tag attribute values in Ranger tag-based policies to influence authorization decisions. For example, to deny access to a resource after a certain date:

  1. Add the EXPIRES_ON tag to the resource.
  2. Add the expiry_date tag attribute and set its value to the expiration date.
  3. Create a Ranger policy for the EXPIRES_ON tag.
  4. Add a condition to this policy that denies access when the current date is later than the date specified in the expiry_date tag attribute.

Note that the EXPIRES_ON tag policy is created as a default policy in tag service instances.
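
The date check behind the EXPIRES_ON example can be sketched as below. This only illustrates the comparison and is not Ranger's actual condition evaluator; the function name and the ISO date format used for expiry_date are assumptions of this sketch.

```python
from datetime import date


def expires_on_denies(tag_attributes, request_date):
    """Return True when access should be denied because the tag has expired."""
    # Assumption: expiry_date is stored as an ISO date string (YYYY-MM-DD).
    expiry = date.fromisoformat(tag_attributes["expiry_date"])
    return request_date > expiry


attrs = {"expiry_date": "2021-03-31"}
print(expires_on_denies(attrs, date(2021, 3, 11)))  # False: not expired yet
print(expires_on_denies(attrs, date(2021, 4, 1)))   # True: expired, access is denied
```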

TagSync

Ranger TagSync is used to synchronize the tag store with an external metadata service such as Apache Atlas. TagSync is a daemon process, similar to the Ranger UserSync process.
Ranger TagSync receives tag details from Apache Atlas through change notifications. When tags are added to, updated on, or deleted from resources in Apache Atlas, Ranger TagSync receives a notification and updates the tag store.
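
To make the add/update/delete flow concrete, here is a small sketch that applies change notifications to an in-memory tag store. The event shape (op, resource, tag, attributes) is invented for this example and is not the real Atlas notification format.

```python
def apply_notification(tag_store, event):
    """Apply one change notification to a dict mapping resource -> {tag name: attributes}."""
    op = event["op"]
    if op not in ("add", "update", "delete"):
        raise ValueError(f"unknown operation: {op}")
    resource, tag = event["resource"], event["tag"]
    if op == "delete":
        resource_tags = tag_store.get(resource, {})
        resource_tags.pop(tag, None)
        if not resource_tags:
            tag_store.pop(resource, None)  # drop resources that no longer carry any tag
    else:
        # "add" and "update" both store the latest attributes for the tag.
        tag_store.setdefault(resource, {})[tag] = dict(event.get("attributes", {}))
    return tag_store


store = {}
events = [
    {"op": "add", "resource": "hive:finance.salaries", "tag": "EXPIRES_ON",
     "attributes": {"expiry_date": "2021-03-31"}},
    {"op": "update", "resource": "hive:finance.salaries", "tag": "EXPIRES_ON",
     "attributes": {"expiry_date": "2021-06-30"}},
    {"op": "add", "resource": "hive:finance.salaries", "tag": "PII"},
    {"op": "delete", "resource": "hive:finance.salaries", "tag": "EXPIRES_ON"},
]
for event in events:
    apply_notification(store, event)
print(store)  # {'hive:finance.salaries': {'PII': {}}}
```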


3. Tags and policy evaluation

When authorizing an access request, an Apache Ranger plugin evaluates the Ranger policies that apply to the resource being accessed. The following figure shows the policy evaluation flow. The steps in this workflow are described in more detail in the sections below.
[Figure: policy evaluation flow]

Find tags

Apache Ranger lets a service register context enrichers, which add context data to access requests.
The Ranger tag service, which is part of the tag-based policy feature, adds a context enricher called RangerTagEnricher. This context enricher is responsible for finding the tags of the requested resource and adding the tag details to the request context. It maintains a cache of the available tags; when processing an access request, it finds the tags that apply to the requested resource and adds them to the request context. The context enricher periodically polls Ranger Admin for changes and keeps its cache up to date.
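
A stripped-down picture of what such an enricher does is shown below. The TagEnricher class here is a hypothetical Python stand-in, not the actual RangerTagEnricher, and the request is modelled as a plain dictionary for the sake of the example.

```python
class TagEnricher:
    """Hypothetical stand-in for a context enricher that attaches tags to a request."""

    def __init__(self, tag_cache):
        self.tag_cache = tag_cache  # dict: resource name -> list of tag dicts

    def enrich(self, request):
        """Look up the tags of the requested resource and add them to the request context."""
        tags = self.tag_cache.get(request["resource"], [])
        request.setdefault("context", {})["tags"] = list(tags)
        return request


cache = {
    "hive:finance.salaries": [
        {"name": "EXPIRES_ON", "attributes": {"expiry_date": "2021-03-31"}},
    ],
}
request = {"user": "scott", "resource": "hive:finance.salaries", "access": "select"}
enriched = TagEnricher(cache).enrich(request)
print([tag["name"] for tag in enriched["context"]["tags"]])  # ['EXPIRES_ON']
```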

Evaluate tag-based policies

Once the list of tags for the requested resource has been found, the Apache Ranger policy engine evaluates the tag-based policies that apply to those tags. If the policy for any one of these tags results in a deny, access is denied. If no tag results in a deny and the policy for at least one tag allows access, access is allowed. If the tags yield no result, or the resource has no tags, the policy engine evaluates the resource-based policies to make the authorization decision.
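
The decision order in this paragraph can be written down as a few lines of code. This is a simplified model, not the policy engine itself: each tag policy is reduced to a fixed "deny" or "allow" outcome, and the resource-based evaluation is passed in as a callable. All names are illustrative.

```python
def evaluate(tags, tag_policies, resource_decision):
    """Decide access from tag-based policies first, then fall back to resource-based ones.

    tag_policies maps a tag name to "deny" or "allow"; tags without an entry yield no result.
    resource_decision is a callable returning the resource-based outcome.
    """
    results = [tag_policies[tag] for tag in tags if tag in tag_policies]
    if "deny" in results:
        return "deny"           # any deny wins
    if "allow" in results:
        return "allow"          # no deny, and at least one tag allows
    return resource_decision()  # no tags or no tag result: use resource-based policies


tag_policies = {"PII": "deny", "PUBLIC": "allow"}
print(evaluate(["PII", "PUBLIC"], tag_policies, lambda: "allow"))  # deny
print(evaluate(["PUBLIC"], tag_policies, lambda: "deny"))          # allow
print(evaluate([], tag_policies, lambda: "allow"))                 # allow (resource-based)
```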

Use tags in conditions

Apache Ranger allows custom conditions to be used when evaluating authorization policies. The Apache Ranger policy engine makes various request details (such as user, groups, resource, and context) available to the conditions. The tags that the enricher adds to the request context are also available to conditions and can be used to influence the authorization decision.

The default policy for the EXPIRES_ON tag in tag service instances uses such a condition to check whether the request date is later than the value specified in the expiry_date tag attribute. This default policy has no effect unless the EXPIRES_ON tag has been created in Atlas.

4. Apache Ranger access conditions

The Apache Ranger access policy model consists of two main parts: a specification of the resources the policy applies to, such as HDFS files and directories, Hive databases, tables, and columns, or HBase tables, column families, and columns; and a specification of the access conditions for specific users and groups.

Allow, Deny, and Exclude conditions
Apache Ranger supports the following access conditions:
• Allow
• Exclude from Allow
• Deny
• Exclude from Deny

These access conditions enable you to set fine-grained access control policies.
For example, you can allow all users in the "finance" group to access the "finance" database, but deny access to all users in the "interns" group. Suppose a member of the "interns" group, "scott", needs to complete a task that requires access to the "finance" database. In this case, you can add an Exclude from Deny condition so that the user "scott" can access the "finance" database. The following figure shows how to set up this policy in Apache Ranger:
[Figure: policy for the "finance" database with Allow, Deny, and Exclude from Deny conditions]

Enable Deny conditions for policies

By default, Deny conditions in policies are turned off and must be enabled before they can be used.

  1. In Ambari > Ranger > Configs > Advanced > Custom ranger-admin-site, add ranger.servicedef.enableDenyAndExceptionsInPolicies=true.
  2. Restart Ranger.

Policy evaluation of access conditions

Apache Ranger policies are evaluated in a specific order to ensure predictable results (if no policy allows the access, the authorization request is usually denied). The following figure shows the policy evaluation workflow:
[Figure: policy evaluation workflow for access conditions]
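
As a minimal sketch of how the four access conditions interact, the code below checks Deny first (unless an Exclude from Deny applies), then Allow (unless an Exclude from Allow applies), and denies otherwise. The class and function names are hypothetical, and the order is my reading of the workflow rather than the engine's actual implementation. It also assumes that "scott" belongs to both the "finance" and "interns" groups, since an Exclude from Deny only lifts the deny and an Allow condition still has to match.

```python
from dataclasses import dataclass, field


@dataclass
class Condition:
    """A set of users and groups that one access condition applies to."""
    users: set = field(default_factory=set)
    groups: set = field(default_factory=set)

    def matches(self, user, user_groups):
        return user in self.users or bool(self.groups & set(user_groups))


@dataclass
class Policy:
    allow: Condition = field(default_factory=Condition)
    allow_exclude: Condition = field(default_factory=Condition)
    deny: Condition = field(default_factory=Condition)
    deny_exclude: Condition = field(default_factory=Condition)


def is_allowed(policy, user, user_groups):
    """Evaluate Deny (minus its exclusions) first, then Allow (minus its exclusions)."""
    if policy.deny.matches(user, user_groups) and not policy.deny_exclude.matches(user, user_groups):
        return False
    if policy.allow.matches(user, user_groups) and not policy.allow_exclude.matches(user, user_groups):
        return True
    return False  # no condition allowed the access


finance_db = Policy(
    allow=Condition(groups={"finance"}),
    deny=Condition(groups={"interns"}),
    deny_exclude=Condition(users={"scott"}),
)
print(is_allowed(finance_db, "alice", ["finance"]))             # True
print(is_allowed(finance_db, "carol", ["finance", "interns"]))  # False: denied as an intern
print(is_allowed(finance_db, "scott", ["finance", "interns"]))  # True: excluded from the deny
```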

Source

The above content is from the official Cloudera website.

Origin blog.csdn.net/m0_48187193/article/details/114666163