Nacos configuration security best practices


Preface



As an important part of software development, configuration management connects code with its environment and cleanly separates the concerns of developers and operators.

The configuration management function of Nacos meets the needs of cloud-native applications: it separates configuration from code and allows configuration to be modified dynamically.

In January, a security vulnerability was found in Nacos that allowed external users to impersonate Nacos-server and obtain or modify configuration (https://github.com/alibaba/nacos/issues/4593). Once the problem was confirmed, Nacos quickly fixed the vulnerability, and Alibaba Cloud's Microservice Engine (MSE) backported the fix to the Nacos instances on MSE at the end of January.

In this article, we take a global view of how to keep Nacos configuration secure, that is, how to ensure that configuration information is not obtained or leaked by malicious users.

Nacos configuration architecture


The overall structure of the Nacos configuration section is as follows:

For each link in the above figure, you need to consider whether two basic security actions are in place: authentication (identifying who the caller is) and authorization (checking what that caller is allowed to do).

As you can see from the figure above, the possible ways to leak configuration information are:

  • Obtain the configuration through Nacos-client.

  • Obtain the configuration through the console.

  • Obtain the configuration through the communication protocol between the servers.

  • Access the persistence layer (such as the DB) directly to obtain the configuration.

The possible leak points are as follows:

| Scenario | Authentication | Authorization |
| --- | --- | --- |
| Nacos client | Users who are not logged in obtain/modify configuration through the client | Users obtain/modify configuration they are not authorized for through the client |
| Configuration console | Users who are not logged in obtain/modify configuration through the console | Users obtain/modify configuration they are not authorized for through the console |
| Within the Nacos cluster | A user pretends to be a member of the Nacos cluster to obtain/modify configuration | Not needed |
| Persistence layer | Users query the DB directly to obtain/modify configuration | Not needed |

Authentication and authorization in the Nacos client scenario


When the Nacos client tries to obtain the configuration from the server, the server needs to confirm the identity of the client and confirm that the identity has the authority to obtain the configuration.

Open source version of Nacos

In the default Nacos server configuration, clients are not authenticated, so anyone who can reach the Nacos server can directly obtain the configuration stored in Nacos. For example, a hacker who breaks into the company's intranet can obtain all business configuration, which is a clear security risk.

Therefore, authentication must first be enabled on the Nacos server. Set nacos.core.auth.enabled to true in application.properties on the Nacos server:

nacos.core.auth.enabled=true


With this setting in place, the Nacos client must supply a user name and password to obtain the configuration:

String serverAddr = "{serverAddr}";
Properties properties = new Properties();
properties.put("serverAddr", serverAddr);
properties.put("username", "nacos-readonly");
properties.put("password", "nacos");
ConfigService configService = NacosFactory.createConfigService(properties);
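The snippet above hard-codes the password, which is convenient for a demo but leaks together with the source code. A minimal sketch of one alternative, assuming the credentials are supplied through environment variables, is shown below. The variable names NACOS_SERVER_ADDR, NACOS_USERNAME and NACOS_PASSWORD and the class name are hypothetical; only the property keys serverAddr, username and password come from the example above.

```java
import java.util.Map;
import java.util.Properties;

// Sketch: build the Nacos client properties from the environment instead of hard-coding them.
final class NacosClientProps {

    // The environment variable names used here are hypothetical.
    static Properties fromEnv(Map<String, String> env) {
        Properties properties = new Properties();
        properties.put("serverAddr", require(env, "NACOS_SERVER_ADDR"));
        properties.put("username", require(env, "NACOS_USERNAME"));
        properties.put("password", require(env, "NACOS_PASSWORD"));
        return properties;
    }

    // Fails fast when a required variable is missing or empty.
    private static String require(Map<String, String> env, String name) {
        String value = env.get(name);
        if (value == null || value.isEmpty()) {
            throw new IllegalStateException("Missing environment variable: " + name);
        }
        return value;
    }

    public static void main(String[] args) {
        Properties properties = fromEnv(System.getenv());
        // The result would then be passed to NacosFactory.createConfigService(properties).
        System.out.println("Configured server: " + properties.getProperty("serverAddr"));
    }
}
```

The method takes the environment as a Map so that it can be exercised without touching the real process environment.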

The above covers authentication, that is, determining which user is making the request. The server also needs to check the user's permissions (authorization). When the user does not have permission to obtain a configuration, for example when the inventory service tries to obtain the configuration of the payment service, the request fails.

We can create users and set permissions on the open source Nacos console. Proceed as follows:

First, go to localhost:8848/nacos and log in. On the Access Control -> User List page, add a user:

In Access Control -> Role Management, bind users and roles:


Add permissions to the corresponding roles. On the Access Control -> Permission Management page, add permissions:

After the above configuration, readonly-user can only access the configuration under the public namespace.
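The user, role and permission bindings made in the console can be pictured as a small role-based access check. The sketch below is a simplified model for illustration only and is not the Nacos implementation; the role name readonly-role, the namespace name payment and the class itself are hypothetical.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Simplified model of user -> role -> permission; not the Nacos implementation.
final class RbacSketch {
    enum Action { READ, WRITE }

    private final Map<String, Set<String>> rolesByUser = new HashMap<>();
    // Key: role + "|" + namespace; value: actions the role may perform in that namespace.
    private final Map<String, Set<Action>> actionsByRoleAndNamespace = new HashMap<>();

    void bindRole(String user, String role) {
        rolesByUser.computeIfAbsent(user, k -> new HashSet<>()).add(role);
    }

    void grant(String role, String namespace, Action action) {
        actionsByRoleAndNamespace
                .computeIfAbsent(role + "|" + namespace, k -> new HashSet<>())
                .add(action);
    }

    // A request is allowed only if one of the user's roles holds the action on the namespace.
    boolean isAllowed(String user, String namespace, Action action) {
        for (String role : rolesByUser.getOrDefault(user, Set.of())) {
            Set<Action> actions =
                    actionsByRoleAndNamespace.getOrDefault(role + "|" + namespace, Set.of());
            if (actions.contains(action)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        RbacSketch rbac = new RbacSketch();
        rbac.bindRole("readonly-user", "readonly-role");
        rbac.grant("readonly-role", "public", Action.READ);
        System.out.println(rbac.isAllowed("readonly-user", "public", Action.READ));
        System.out.println(rbac.isAllowed("readonly-user", "public", Action.WRITE));
    }
}
```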


Alibaba Cloud MSE: AK/SK

For small teams, user name and password authentication is sufficient. For medium and large teams, however, regular password changes and staff turnover mean that user names and passwords change frequently.

In that case, user name and password authentication forces applications to be modified and republished frequently. To solve this problem, Nacos also provides an AK/SK-based authentication scheme and a scheme that associates a RAM role with ECS, both of which avoid the frequent releases caused by user name and password changes.

Take Alibaba Cloud MSE as an example. Alibaba Cloud users generally already use Alibaba Cloud Resource Access Management (RAM) as their permission system. If MSE used user names and passwords for authentication and authorization as the open source version does, users would have to configure permissions in two places, RAM and MSE Nacos. This would make unified management and auditing of user permissions inconvenient and give users an inconsistent experience.

Therefore, MSE (Micro Service Engine) provides an authentication method based on AK/SK. The operation example is as follows:

First, apply for a Nacos instance on MSE (and note the instance ID). Then, on the instance's Details -> Preferences page, set the ConfigAuthEnabled (configuration authentication) parameter to true so that anonymous users cannot get the configuration:

Then you can configure related permissions on the Alibaba Cloud RAM system. The permission system of RAM sub-accounts can be simply expressed as follows:

  • Step 1: Create RAM permission policy as follows:

In the figure, mse:Get*, mse:List*, mse:Query* indicate that the configuration can be read, and mse:* indicates all permissions, including modification permissions.

acs:mse:*:*:instance/${instanceId} means authorization to the instance level, acs:mse:*:*:instance/${instanceId}/${namespaceId} means authorization to the namespace level.

  • Step 2: Create users and grant permissions:

Fill in the user name:

Then get the user's AK/SK:

Give this user the corresponding permissions:

  • Finally, just add AK/SK to the code:

String serverAddr = "{serverAddr}";
Properties properties = new Properties();
properties.put("serverAddr", serverAddr);
properties.put(PropertyKeyConst.ACCESS_KEY, "${accessKey}");
properties.put(PropertyKeyConst.SECRET_KEY, "${secret}");
ConfigService configService = NacosFactory.createConfigService(properties);


After the above configuration, when the client accesses the Nacos instance purchased on MSE, MSE verifies the AK and the signature to confirm that the user is legitimate and then checks the user's permissions; if either check fails, it refuses to provide service.
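The idea behind verifying "the AK and signature" is that the SK never travels over the network: the client signs each request with the SK, and the server recomputes the signature from its own copy. The sketch below illustrates that general idea with HMAC-SHA256 from the JDK. It is an assumption-laden illustration, not the signing scheme that the Nacos client or MSE actually uses; the payload format and class name are hypothetical.

```java
import javax.crypto.Mac;
import javax.crypto.spec.SecretKeySpec;
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.MessageDigest;
import java.util.Base64;

// Generic sign-and-verify sketch; the real MSE/Nacos signature format is not shown here.
final class AkSkSignatureSketch {

    // Client side: derive a signature from the secret key and the request payload.
    static String sign(String secretKey, String payload) {
        try {
            Mac mac = Mac.getInstance("HmacSHA256");
            mac.init(new SecretKeySpec(secretKey.getBytes(StandardCharsets.UTF_8), "HmacSHA256"));
            byte[] digest = mac.doFinal(payload.getBytes(StandardCharsets.UTF_8));
            return Base64.getEncoder().encodeToString(digest);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException("HMAC-SHA256 is unavailable or the key is invalid", e);
        }
    }

    // Server side: recompute the signature and compare without an early exit on mismatch.
    static boolean verify(String secretKey, String payload, String signature) {
        byte[] expected = sign(secretKey, payload).getBytes(StandardCharsets.UTF_8);
        return MessageDigest.isEqual(expected, signature.getBytes(StandardCharsets.UTF_8));
    }

    public static void main(String[] args) {
        String payload = "namespace=public&timestamp=1612137600000"; // hypothetical format
        String signature = sign("demo-secret", payload);
        System.out.println(verify("demo-secret", payload, signature));
    }
}
```

Because only the signature is sent, an eavesdropper who captures one request still does not learn the SK.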

Alibaba Cloud MSE: RAM role authentication based on ECS

Of course, with the above approach you still need to put the AK/SK in the initial configuration (such as the bootstrap.yml file used by spring-cloud-alibaba-nacos-config). If hackers break into the intranet or the source code is leaked, the AK/SK may leak as well, which again puts configuration information at risk.

In this case, it is recommended to use the RAM role associated with ECS for authentication.

The authorization model corresponding to the ECS-associated RAM role is as follows:

The key step in this model is assuming the role. Only a cloud server associated with the RAM role can assume the role and obtain permission to operate the MSE Nacos instance.

If a hacker obtains only the code, he cannot assume the RAM role and therefore cannot operate the MSE Nacos instance. If the machine is compromised, the role associated with the cloud server can be revoked on the Alibaba Cloud console to stop the loss in time.

The specific steps are as follows:

  • The first step is to create an instance of MSE Nacos and create a corresponding permission policy (explained above, so I won’t repeat it here).

  • The second step is to create a RAM role and authorize it.

Create RAM role:

After creating a role, add the corresponding permission policy for the role:


  • The third step is to associate the role with ECS:

On the details page of the corresponding ECS instance, click Grant/Revoke RAM Role:

Select the corresponding role and grant it:

  • The final step is to specify the RAM role in the code:

String serverAddr = "{serverAddr}";
Properties properties = new Properties();
properties.put("serverAddr", serverAddr);
properties.put(PropertyKeyConst.RAM_ROLE_NAME, "StoreServiceRole");
ConfigService configService = NacosFactory.createConfigService(properties);

After the above configuration, when the Nacos client obtains the configuration, the cloud server assumes the designated RAM role and accesses the MSE Nacos instance with a temporary security token issued by Alibaba Cloud Security Token Service (STS).

If an attacker obtains the code, it cannot be run on other machines, because the attacker's machine is not allowed to assume the RAM role.

If an attacker obtains the temporary credentials issued after the role is assumed, they expire quickly because STS tokens are short-lived (1 hour by default), which effectively reduces the attack surface.

If you need to revoke the authorization, you only need to do it on the Alibaba Cloud console without republishing the application.

Compared with AK/SK, authentication and authorization through a RAM role associated with ECS is more controllable and more secure, so this method is recommended.
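Why a short-lived token limits the damage can be shown with a few lines of Java. The sketch below models a temporary credential with an expiry time; the class and its fields are hypothetical and do not reflect the structure of a real STS token, only the one-hour default mentioned above.

```java
import java.time.Duration;
import java.time.Instant;

// Models a temporary credential that stops working once its lifetime has elapsed.
final class TemporaryCredential {
    private final String token;
    private final Instant expiresAt;

    TemporaryCredential(String token, Instant issuedAt, Duration lifetime) {
        this.token = token;
        this.expiresAt = issuedAt.plus(lifetime);
    }

    String token() {
        return token;
    }

    // A stolen token is accepted only while 'now' is strictly before the expiry time.
    boolean isValidAt(Instant now) {
        return now.isBefore(expiresAt);
    }

    public static void main(String[] args) {
        Instant issuedAt = Instant.now();
        TemporaryCredential credential =
                new TemporaryCredential("demo-token", issuedAt, Duration.ofHours(1));
        System.out.println(credential.isValidAt(issuedAt.plus(Duration.ofMinutes(30))));
        System.out.println(credential.isValidAt(issuedAt.plus(Duration.ofMinutes(61))));
    }
}
```

A leaked long-term AK/SK stays usable until someone notices and revokes it, whereas a credential like this one becomes useless on its own after the lifetime passes.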

Authentication and authorization in the console scenario



Open source version of Nacos

In the open source version of the Nacos console, logging in obtains a temporary accessToken through the console's login interface, and subsequent operations use that accessToken for authentication.

For example, the readonly-user user mentioned above, after logging in, can only see the configuration information under the public namespace, and cannot modify or view the configuration information under other namespaces.

In addition, to create or delete a namespace, you must log in as an administrator.

For the authentication and authorization of the open source version of Nacos, please refer to this document: https://nacos.io/zh-cn/docs/auth.html.


Alibaba Cloud MSE

Since Alibaba Cloud MSE serves enterprises, its permissions are more fine-grained.

Resources are divided into instance level (acs:mse:*:*:instance/${instanceId}) and namespace level (acs:mse:*:*:instance/${instanceId}/${namespaceId}).

Operations on resources are also more fine-grained, for example:

| Action | Description |
| --- | --- |
| CreateEngineNamespace | Create a namespace |
| DeleteEngineNamespace | Delete a namespace |
| mse:Get*,mse:List*,mse:Query* | Read configuration (Nacos client and console) |
| mse:* | All permissions, including modification and deletion of configuration |
| mse:QueryNacosConfig | Client reads configuration |
| mse:UpdateNacosConfig | Client modifies configuration |

For example, to allow only reading the configuration under one namespace, with no modification allowed, the permission policy can be written as:

{
  "Action": [
    "mse:Get*",
    "mse:List*",
    "mse:Query*"
  ],
  "Resource": [
    "acs:mse:*:*:instance/${instanceId}/${namespaceId}"
  ],
  "Effect": "Allow"
}
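To see why such a policy permits reads but not writes, it helps to walk through how the wildcard patterns match concrete actions and resources. The sketch below is a simplified illustration of "*" matching under an Allow statement; it is not the RAM policy engine, which has more rules (explicit Deny, for example). The instance ID, namespace ID, region and account values in the usage are hypothetical.

```java
import java.util.List;
import java.util.regex.Pattern;

// Simplified wildcard matching for one Allow statement; not the real RAM evaluation logic.
final class PolicyMatchSketch {

    // Treats '*' as "any run of characters" and every other character literally.
    static boolean matches(String pattern, String value) {
        String regex = Pattern.quote(pattern).replace("*", "\\E.*\\Q");
        return value.matches(regex);
    }

    // Allowed only if both the action and the resource match at least one listed pattern.
    static boolean isAllowed(List<String> actionPatterns, List<String> resourcePatterns,
                             String action, String resource) {
        boolean actionOk = actionPatterns.stream().anyMatch(p -> matches(p, action));
        boolean resourceOk = resourcePatterns.stream().anyMatch(p -> matches(p, resource));
        return actionOk && resourceOk;
    }

    public static void main(String[] args) {
        List<String> actions = List.of("mse:Get*", "mse:List*", "mse:Query*");
        List<String> resources = List.of("acs:mse:*:*:instance/mse-demo/ns-a");
        String resource = "acs:mse:cn-hangzhou:1234:instance/mse-demo/ns-a";
        System.out.println(isAllowed(actions, resources, "mse:QueryNacosConfig", resource));
        System.out.println(isAllowed(actions, resources, "mse:UpdateNacosConfig", resource));
    }
}
```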




Authentication between servers


Nacos servers need to synchronize some information with each other. When they do, each server must authenticate its peer to confirm that it really is a Nacos-server rather than an impostor.

Before 1.4.1, this authentication was done through the User-Agent header. This primitive method is easy to forge, and as mentioned at the beginning of this article, it is the cause of the vulnerability disclosed in January.

Therefore, in version 1.4.1 and later, you can configure the authentication header and its value yourself. In application.properties, modify the following values:

# Do not use User-Agent for authentication
nacos.core.auth.enable.userAgentAuthWhite=false
# Key of the authentication header
nacos.core.auth.server.identity.key=Authorization
# Value of the authentication header
nacos.core.auth.server.identity.value=secret


In this way, only a request that carries the header Authorization: secret is recognized as coming from another server and allowed to synchronize cluster information; any other request is rejected.

Since Nacos-server needs all permissions to synchronize configuration data, no authorization check is needed between Nacos-servers.

In this way, the communication between the servers can also be safe and reliable.

Alibaba Cloud MSE has also backported this fix to the version 1.2 Nacos instances it hosts, so they are not affected by this issue.
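The check on the receiving side amounts to comparing one configured header against a shared secret. The sketch below shows that idea in isolation; it is not the Nacos server code, the class name is hypothetical, and header names are compared case-sensitively here for brevity even though HTTP header names are case-insensitive.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.util.Map;

// Illustrative peer check based on a configured header key and value; not Nacos source code.
final class ServerIdentityCheck {
    private final String headerKey;
    private final byte[] expectedValue;

    ServerIdentityCheck(String headerKey, String headerValue) {
        this.headerKey = headerKey;
        this.expectedValue = headerValue.getBytes(StandardCharsets.UTF_8);
    }

    // Accepts the request only if the identity header is present and carries the shared secret.
    boolean isFromTrustedServer(Map<String, String> headers) {
        String actual = headers.get(headerKey);
        if (actual == null) {
            return false;
        }
        // MessageDigest.isEqual avoids an early exit on the first differing byte.
        return MessageDigest.isEqual(expectedValue, actual.getBytes(StandardCharsets.UTF_8));
    }

    public static void main(String[] args) {
        ServerIdentityCheck check = new ServerIdentityCheck("Authorization", "secret");
        System.out.println(check.isFromTrustedServer(Map.of("Authorization", "secret")));
        System.out.println(check.isFromTrustedServer(Map.of("User-Agent", "Nacos-Server")));
    }
}
```

Unlike a User-Agent string, the header value is a secret known only to the cluster members, so it cannot simply be guessed from public documentation.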

Security of the persistence layer



Nacos configuration information is stored in the persistence layer. For example, Nacos is commonly deployed with MySQL as its persistence layer.

To limit the damage if the MySQL user name and password leak through git or other channels, we need to change them regularly.

The usual practice is to use two database users, such as UserA and UserB. To update the passwords, proceed as follows:

  • Switch the database user that the Nacos server uses from UserA to UserB.

  • Update UserA's password.

  • Switch the database user that the Nacos server uses from UserB back to UserA.

  • Update UserB's password.
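The point of the four steps is that a password is only ever changed for the user that Nacos is not currently using, so the server never loses its database connection. The sketch below simulates that ordering with in-memory state. It is an illustration only: a real rotation would run the password change against the database and update the Nacos datasource settings, and the class and method names are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

// In-memory simulation of the two-user rotation; no database is contacted.
final class DbCredentialRotation {
    private final Map<String, String> passwords = new HashMap<>();
    private String activeUser;

    DbCredentialRotation(String userA, String passwordA, String userB, String passwordB) {
        passwords.put(userA, passwordA);
        passwords.put(userB, passwordB);
        activeUser = userA;
    }

    String activeUser() {
        return activeUser;
    }

    String passwordOf(String user) {
        return passwords.get(user);
    }

    // Points the (simulated) Nacos server at another database user.
    void switchTo(String user) {
        if (!passwords.containsKey(user)) {
            throw new IllegalArgumentException("Unknown user: " + user);
        }
        activeUser = user;
    }

    // Refuses to touch the user currently in use, which is the invariant of the procedure.
    void updatePassword(String user, String newPassword) {
        if (user.equals(activeUser)) {
            throw new IllegalStateException("User is in use: " + user);
        }
        passwords.put(user, newPassword);
    }

    // The four steps from the list above, in order.
    void rotate(String userA, String newPasswordA, String userB, String newPasswordB) {
        switchTo(userB);
        updatePassword(userA, newPasswordA);
        switchTo(userA);
        updatePassword(userB, newPasswordB);
    }

    public static void main(String[] args) {
        DbCredentialRotation rotation = new DbCredentialRotation("UserA", "a1", "UserB", "b1");
        rotation.rotate("UserA", "a2", "UserB", "b2");
        System.out.println("Active user after rotation: " + rotation.activeUser());
    }
}
```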

As an Alibaba Cloud product, MSE regularly changes the database user name and password as a matter of policy, so if you have purchased an MSE instance, you do not need to worry about this issue.

Configuration security best practices



Having gone through the key points of Nacos configuration security, how do we actually keep configuration secure? Follow these best practices:

1. Regularly change passwords and AK/SK

When Nacos user name and password (or AK/SK) authentication is used, as with the open source Nacos authentication method, a malicious user who obtains the user name and password (or AK/SK) may be able to read the application's configuration. If the password or AK/SK is changed regularly, however, the window during which configuration can leak is limited and the attack surface is reduced.

2. Use ECS roles (recommended)

Of course, with the above approach the Nacos user name and password or AK/SK still sit in the application's configuration. They may leak, and changing them after a leak requires republishing the application. It is therefore recommended to use Alibaba Cloud's ECS roles, so that all permission management is done on the Alibaba Cloud console.

3. Rotate the key of Nacos internal authentication

As mentioned earlier, authentication between Nacos servers relies on the nacos.core.auth.server.identity settings. If a malicious user breaks in, these values can leak too, which in turn leads to configuration leakage.

Therefore, for self-built Nacos, nacos.core.auth.server.identity.value needs to be replaced regularly to ensure that malicious users cannot pretend to be Nacos Server to obtain and modify the configuration.

Of course, if you are using a Nacos instance hosted by MSE, MSE rotates it automatically, so you don't need to worry about this.

4. Rotate the user name and password of the persistence layer

In order to prevent the configuration from leaking out of the persistence layer, it is necessary to periodically modify the authentication information of the persistence layer. Usually the persistence layer of Nacos is DB, so the user name and password of the database need to be modified regularly.

For MSE users, there is no need to do anything. MSE will periodically modify the user name and password of the database.

5. Design a security plan and execute it regularly

With the layers of protection above, configuration is safe in theory. Because human operations always involve mistakes, however, it is still necessary to draw up a security plan:

  • Regularly check the list of configuration listeners to confirm that it contains no unauthorized machines.

  • Decide how to update the AK/SK and how to revoke the leaked AK/SK when an AK/SK leaks.

  • For self-built Nacos, decide how to change nacos.core.auth.server.identity.value after a server is compromised.
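The first item in the plan, auditing the listener list, is easy to automate once the listener addresses have been exported. The sketch below compares a list of listener IPs against an allowlist. How the listener list is obtained is left out, and the class name and IP addresses are hypothetical.

```java
import java.util.Collection;
import java.util.List;
import java.util.Set;
import java.util.stream.Collectors;

// Flags listener addresses that are not on the allowlist of known application machines.
final class ListenerAudit {

    // Returns the unauthorized addresses, de-duplicated and sorted for stable reporting.
    static List<String> findUnauthorized(Collection<String> listenerIps, Set<String> allowedIps) {
        return listenerIps.stream()
                .filter(ip -> !allowedIps.contains(ip))
                .distinct()
                .sorted()
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> listeners = List.of("10.0.0.1", "10.0.0.9", "10.0.0.9");
        Set<String> allowed = Set.of("10.0.0.1", "10.0.0.2");
        System.out.println("Unauthorized listeners: " + findUnauthorized(listeners, allowed));
    }
}
```

Running a check like this on a schedule turns the manual review into an alert that fires only when something unexpected shows up.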

Summary


The open source Nacos can basically meet the needs of small and medium-sized enterprises in terms of configuration management and permission management.

For medium and large enterprises, the Alibaba Cloud product MSE supports more fine-grained and flexible permission configuration and security management, and can also be used with other Alibaba Cloud products to achieve more secure configuration capabilities.

Of course, whether you build Nacos yourself or use Alibaba Cloud MSE, you need to pay attention to the security points above to prevent configuration leaks and the business losses they cause. The configuration security best practices at the end also ensure that, if configuration does leak, you are able to remediate it promptly.



Origin blog.csdn.net/weixin_39860915/article/details/114697118