AK management basics: how to manage the full life cycle of access keys

1. Introduction

In a typical cloud migration, the cloud account administrator creates a user account for each employee, application, or system service. Each account can have its own identity authentication key, commonly known as an AK (AccessKey), which is used to authenticate calls to Alibaba Cloud service APIs. Because the AK is the credential that proves you are the legitimate owner of a cloud account, the consequences of a leak are serious. We often hear of AKs being maliciously obtained by external attackers, or accidentally leaked by employees on GitHub, eventually leading to security or production incidents. AKs are used in a very wide range of scenarios, so managing and governing them well is particularly important. This article analyzes a typical case of unsafe AK use and then describes how to manage the access key life cycle.

2. An access key was deleted by mistake and the user service was interrupted

Case review

In 2020, a customer suddenly found that the user-facing app of some of its projects failed to upload data. The upload function relied on a storage service from the cloud vendor, so the customer opened a ticket reporting that the storage service was faulty. Investigation showed that other tenants in the same region were operating normally and no obvious anomaly had occurred, so a network problem was suspected and the customer was advised to check its network connection. The customer then submitted the app-side error log, which showed that the access key could not be found. With guidance from cloud customer service, the customer confirmed that no key with that ID existed, and the operation audit records showed that the access key had been deleted by the customer itself.

Emergency response

  1. The cloud vendor recommended that the customer immediately replace the access key in use. The customer replied that app releases are hard to control; in particular, an iOS release has to pass review, which takes too long;
  2. The customer issued an urgent announcement informing its users that the function was temporarily unavailable and would be restored after an upgrade.

Impact

The impact is obvious. For many start-ups, such a failure ranges from a degraded user experience at best to the unavailability of key functions at worst, and it affects customer retention or revenue to varying degrees.

Analysis and conclusion

  1. The failure was mainly caused by an employee deleting the AK by mistake. Some readers may ask whether there is something like a recycle bin from which a key can be recovered. In fact, cloud vendors generally provide a similar function, disable/activate. Follow the rule "disable first, then delete" to keep the business running;
  2. The deletion of the AK also broke the server side, which deserves attention and self-examination. Is there a strict distinction between keys that people use for management and keys that servers use? Is server-side use separated from administrative use? Are the keys of employees separated from those of online systems?
  3. Hard-coding the access key in the app makes replacement expensive: when something goes wrong with the key, it cannot be rotated immediately to stop the loss. In fact, app-type clients are not suited to accessing OpenAPI with a permanent AK;
  4. In addition, application decompilation and cracking are already common. Storing a permanent key in the code carries a huge risk of leakage.

3. Standardize access key life cycle management to ensure safe production

The real case above is a serious warning. In which stages of the access key life cycle should operations be standardized, and which management controls should be adopted?

1 Create: access key

  • To stress it once more: using the access key of the master account is not recommended. The reason is obvious: the master account owns too many resources and permissions, and the damage after a leak is unimaginable;
  • On the cloud vendor's access control pages, check whether sub-users have been created under the tenant and whether it is the access keys of these sub-users that are actually in use.

2 Configuration: appropriate permissions

  • Give each application the access key of a different sub-user, so that resources and permissions are isolated at the application level;
  • Check that the permissions of each sub-user follow the principle of least privilege, and do not grant permissions that are not needed. You can try removing a permission in the test environment and see whether the tests still pass; if they fail, that permission most likely cannot be removed;
  • In the RAM console you can view the permission policies attached to a user and the specific permissions described in each policy.
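
As a rough illustration of the least-privilege check above, the following Python sketch flags Allow statements in a policy document that use wildcard actions or resources. The policy shown is a hypothetical upload-only example; the bucket name `example-app-uploads` and the helper `broad_statements` are assumptions made for illustration and do not come from the article.

```python
# Hypothetical policy for an upload-only application. The bucket name
# "example-app-uploads" is a placeholder, not taken from the article.
UPLOAD_ONLY_POLICY = {
    "Version": "1",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": ["oss:PutObject"],
            "Resource": ["acs:oss:*:*:example-app-uploads/*"],
        }
    ],
}


def broad_statements(policy):
    """Return the Allow statements that use a wildcard action or resource."""
    flagged = []
    for statement in policy.get("Statement", []):
        if statement.get("Effect") != "Allow":
            continue
        actions = statement.get("Action", [])
        resources = statement.get("Resource", [])
        # A policy may give a single string instead of a list.
        if isinstance(actions, str):
            actions = [actions]
        if isinstance(resources, str):
            resources = [resources]
        wildcard_action = any(a == "*" or a.endswith(":*") for a in actions)
        wildcard_resource = "*" in resources
        if wildcard_action or wildcard_resource:
            flagged.append(statement)
    return flagged
```

A check like this is only a first filter: a statement without wildcards can still grant more than the application needs, so the test-environment experiment described above remains the real verification.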

3 Delete: access key

Deleting an access key cannot be undone, so deletion carries some risk. Delete a key only after confirming that it is no longer in use. The standard process is as follows:

  1. First replace the original access key with the new access key everywhere it is used, and then monitor the last-used time of the access key to be deleted;
  2. Decide on a safe period for the old access key according to your own business. For example, if you choose 7 days, the old key becomes a candidate for deletion once it has not been used for 7 days;
  3. To get the effect of deletion during the safe period while still being able to recover the key in an emergency, use the disable/activate operations that cloud vendors provide. Disable the key instead of deleting it: disabling has the same effect as deleting, but the disabled key can be restored through the activate operation, much like a recycle bin;
  4. After the access key is disabled, keep watching the business for anomalies until a final safe period, such as 7 days, has passed. If there is no record of the old access key being used, it can be deleted.
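
The process above can be sketched as a small decision function. This is a minimal sketch under stated assumptions: the status labels `"Active"` and `"Inactive"`, the function name `next_action`, and the `business_healthy` flag are hypothetical, and the 7-day period is simply the article's example value.

```python
from datetime import datetime, timedelta

SAFE_PERIOD = timedelta(days=7)  # the article's example value; tune per business


def next_action(status, last_used, disabled_at, now, business_healthy=True):
    """Suggest the next step for an old access key.

    status           -- "Active" or "Inactive" (hypothetical labels)
    last_used        -- datetime of the last recorded use, or None if never used
    disabled_at      -- datetime the key was disabled, or None if still active
    business_healthy -- False if an anomaly appeared after the key was disabled
    """
    if status == "Active":
        idle = last_used is None or now - last_used >= SAFE_PERIOD
        # Disable (never delete directly) once the key has been idle long enough.
        return "disable" if idle else "wait"
    if not business_healthy:
        # Something still depends on the key: restore it first, investigate later.
        return "reactivate"
    if disabled_at is not None and now - disabled_at >= SAFE_PERIOD:
        return "delete"
    return "observe"
```

For example, an active key last used two weeks ago yields `"disable"`, and a key that has been disabled for a full safe period with no business anomaly yields `"delete"`.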

4 Leak prevention: key rotation

Each RAM user can create up to two access keys. If your access key has been used for more than 3 months, it is recommended that you rotate the access key in time to reduce the risk of the access key being leaked.

  1. When you need to rotate, create a second access key.
  2. In all applications or systems that use the access key, update the access key in use to the newly created second access key.
    Note: In the console, the user AccessKey list on the user details page shows the last-used time of each access key. Use it as a first check of whether the second access key is in use and whether the original access key is still being used.

  3. Disable the original access key.
  4. Verify that all applications or systems that use the access key are functioning properly.
     • If they run normally, the access key has been updated successfully, and you can safely delete the original access key.
     • If anything is abnormal, temporarily activate the original access key, and then repeat steps 2 to 4 until the update is successful.
  5. Delete the original access key.
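
The rotation steps can be simulated in a few lines of Python. The sketch below works on an in-memory dictionary rather than a real cloud API; the function `rotate` and the callbacks `switch_apps` and `apps_healthy` are hypothetical names introduced only for illustration.

```python
def rotate(keys, old_id, new_id, switch_apps, apps_healthy):
    """Rotate from old_id to new_id in `keys`, a dict of key id -> status.

    switch_apps(new_id) -- hypothetical callback that updates every application
    apps_healthy()      -- hypothetical callback that returns True when all
                           applications work with the current configuration
    Returns True if the old key was deleted, False if it was re-activated.
    """
    if old_id not in keys:
        raise KeyError(old_id)
    if len(keys) >= 2 and new_id not in keys:
        raise ValueError("a RAM user can hold at most two access keys")
    keys[new_id] = "Active"      # step 1: create the second key
    switch_apps(new_id)          # step 2: point applications at it
    keys[old_id] = "Inactive"    # step 3: disable the original key
    if apps_healthy():           # step 4: verify
        del keys[old_id]         # step 5: delete the original key
        return True
    keys[old_id] = "Active"      # roll back so that the business keeps running
    return False
```

When `rotate` returns False, both keys remain active, which matches the instruction to re-activate the original key and repeat steps 2 to 4. In practice the verification in step 4 would span an observation period rather than a single call.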

5 Development: avoid hard-coding the key in the code

System properties

The program looks for credentials in the system properties. If the alibabacloud.accessKeyId and alibabacloud.accessKeyIdSecret system properties are defined and not empty, the program uses them to create the default credentials.

Environment credentials

The program looks for credentials in the environment variables. If the ALIBABA_CLOUD_ACCESS_KEY_ID and ALIBABA_CLOUD_ACCESS_KEY_SECRET environment variables are defined and not empty, the program uses them to create the default credentials.

Configuration file

If the default file ~/.alibabacloud/credentials exists in the user's home directory (C:\Users\USER_NAME\.alibabacloud\credentials on Windows), the program automatically creates credentials of the specified type and name. The default file does not have to exist, but a parsing error throws an exception. Configuration names are lowercase. This file can be shared between projects and tools, and because it lives outside the project it cannot be accidentally committed to version control. The path of the default file can be changed by defining the ALIBABA_CLOUD_CREDENTIALS_FILE environment variable. If no configuration is selected, the one named default is used; set the ALIBABA_CLOUD_PROFILE environment variable to select another configuration.

[default]                          # default configuration
enable = true                      # enabled; without this option it is disabled by default
type = access_key                  # authentication type: access_key
access_key_id = foo                # Key
access_key_secret = bar            # Secret

[client1]                          # configuration named `client1`
type = ecs_ram_role                # authentication type: ecs_ram_role
role_name = EcsRamRoleTest         # Role Name

[client2]                          # configuration named `client2`
enable = false                     # disabled
type = ram_role_arn                # authentication type: ram_role_arn
region_id = cn-test                # region used to obtain the session
policy = test                      # optional: restricts the permissions
access_key_id = foo
access_key_secret = bar
role_arn = role_arn
role_session_name = session_name   # optional

[client3]                          # configuration named `client3`
type = rsa_key_pair                # authentication type: rsa_key_pair
public_key_id = publicKeyId        # Public Key ID
private_key_file = /your/pk.pem    # Private Key file
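
To make the lookup concrete, here is a minimal Python sketch that reads an AccessKey from the environment variables first and then from the credentials file, using only the standard library. It imitates the lookup described above and is not the SDK's own implementation; the function name `load_access_key` and its parameters are hypothetical, and only the `access_key` type is handled.

```python
import configparser
import os
from pathlib import Path


def load_access_key(environ=None, credentials_file=None):
    """Return (access_key_id, access_key_secret), or None if nothing is configured."""
    environ = os.environ if environ is None else environ

    # 1. Environment variables named in the article.
    key_id = environ.get("ALIBABA_CLOUD_ACCESS_KEY_ID")
    secret = environ.get("ALIBABA_CLOUD_ACCESS_KEY_SECRET")
    if key_id and secret:
        return key_id, secret

    # 2. Shared credentials file; path and configuration name can be overridden.
    path = (
        credentials_file
        or environ.get("ALIBABA_CLOUD_CREDENTIALS_FILE")
        or Path.home() / ".alibabacloud" / "credentials"
    )
    profile = environ.get("ALIBABA_CLOUD_PROFILE", "default")
    parser = configparser.ConfigParser(
        inline_comment_prefixes=("#",),  # the sample file uses trailing comments
        interpolation=None,              # secrets may contain "%" characters
    )
    if not parser.read(path):            # a missing file is not an error
        return None
    if not parser.has_section(profile):
        return None
    section = parser[profile]
    # Only enabled access_key configurations are handled in this sketch.
    if section.get("type") != "access_key":
        return None
    if not section.getboolean("enable", fallback=False):
        return None
    return section.get("access_key_id"), section.get("access_key_secret")
```

With the sample file above and no environment variables set, `load_access_key()` would return the `foo`/`bar` pair from the `default` configuration. A malformed file makes `configparser` raise an exception, in line with the parsing behavior described above.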

6 Audit: Regularly analyze the use of access keys

Standardizing the life cycle operations prevents most of the security failures caused by improper operation, but many security problems can be discovered only by analyzing how access keys are actually used.

  1. Access key storage leak detection: is the key hard-coded in the code? Code hosting platforms provide services that can detect this, for example GitHub token scanning;

Cloud vendors also have similar solutions to help customers with detection, such as AK leak detection in the Alibaba Cloud Cloud Security Center.
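
For a quick local check before code is pushed, a simple pattern scan can catch obvious cases. The sketch below is a naive illustration and not how the services above work. It assumes that AccessKey IDs start with `LTAI` followed by letters and digits; that prefix and the length range are assumptions, and the helper name `find_hardcoded_keys` is hypothetical.

```python
import re

# Assumption: AccessKey IDs look like "LTAI" followed by 12 to 26 letters or digits.
AK_ID_PATTERN = re.compile(r"\bLTAI[0-9A-Za-z]{12,26}\b")


def find_hardcoded_keys(text):
    """Return (line_number, masked_match) pairs for suspected AccessKey IDs."""
    hits = []
    for number, line in enumerate(text.splitlines(), start=1):
        for match in AK_ID_PATTERN.finditer(line):
            value = match.group()
            # Mask most of the value so that the report does not leak it again.
            hits.append((number, value[:6] + "*" * (len(value) - 6)))
    return hits
```

A scan like this only finds key IDs that match the assumed pattern; it does not find secrets, and it is no substitute for the platform-side detection mentioned above.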

  2. Abnormal access key usage detection

This analysis examines the data and logs of the key's actual use to see whether anything abnormal has occurred.

Vendor solution: operation audit

Enable operation log auditing and deliver the logs to OSS and SLS for long-term storage and audit. Logs stored in OSS serve as evidence when something abnormal happens; logs delivered to SLS can be searched efficiently even when the volume is large.

Vendor solution: access log audit

In addition to the operation logs of cloud products, there is also a large volume of access logs generated by using them. These often cover the bulk of data access, such as writing, reading, modifying, and deleting data in an OSS bucket. These logs can be collected, stored, aggregated, and analyzed directly through the log service provided by Alibaba Cloud. After you enable the log function in the console of each cloud product, you can work with them in the log service.

Local solution: self-built analysis engine

Product access logs that are not covered by the operation audit can still be recorded and downloaded through the log storage function that the cloud products provide. Abnormal access records can then be found through offline computation and regular comparison.

Statistical Analysis

The dimensions below can be monitored and analyzed. Watch each of them daily for unexpected access. If unexpected access occurs, the access key may have been leaked and needs closer attention:

    • Whether the IP address using the access key belongs to your own machines;
    • Whether the product accessed with the access key is one you have purchased;
    • Whether the region in which the access key is used is one you expect;
    • Whether the times at which the access key is used match the pattern of your own business.
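
The four dimensions above can be checked against a baseline with a short script. This is a minimal sketch: the record fields (`ip`, `product`, `region`, `time`), the baseline values, and the function name `unexpected_dimensions` are hypothetical and would have to be mapped to the fields of your actual logs.

```python
from datetime import datetime

# Hypothetical baseline describing how a key is expected to be used.
BASELINE = {
    "ips": {"203.0.113.10"},
    "products": {"oss"},
    "regions": {"cn-hangzhou"},
    "hours": range(8, 20),  # expected business hours, 08:00 to 19:59
}


def unexpected_dimensions(record, baseline=BASELINE):
    """Return the names of the dimensions in which a log record deviates."""
    found = []
    if record["ip"] not in baseline["ips"]:
        found.append("ip")
    if record["product"] not in baseline["products"]:
        found.append("product")
    if record["region"] not in baseline["regions"]:
        found.append("region")
    if record["time"].hour not in baseline["hours"]:
        found.append("time")
    return found
```

A record that deviates in several dimensions at once, for example an unknown IP address in an unexpected region at night, is a stronger signal of a leaked key than a single deviation.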

4. Summary

This article has analyzed the life cycle management of access keys. I hope it is helpful for your key management on the cloud. Finally, here are some AK usage tips:

Do not use the main account, and isolate sub-accounts well;
remember the password once, and keep the AK secret;
don't panic when a key leaks: disable it, then delete it;
allocating two AKs is essential, and regular audits are very important;
for ultimate security, use no key at all.

This article is the original content of Alibaba Cloud and may not be reproduced without permission.

Origin blog.csdn.net/weixin_43970890/article/details/114820312