Original author: Artem Konev - Senior Technical Writer
Original link: Using NGINX Unit to implement application isolation
Reprint source: NGINX Chinese official website
NGINX's only official Chinese community, all at nginx.org.cn
One of the latest developments in the NGINX Unit feature set is support for application isolation, introduced in version 1.11.0 and implemented via Linux namespaces.
Let's briefly review Linux namespaces. A namespace is essentially a kernel mechanism that lets a group of processes share a set of system resources separately from the resources shared by other process groups. The kernel ensures that processes in a namespace only access resources assigned to that namespace. Although processes in two different namespaces can share some resources, other resources are "invisible" to processes in the other namespace. The types of resources that can be isolated in a namespace vary by operating system and include process and user IDs, inter-process communication entities, mount points in the file system, network objects, and so on.
Sounds a bit boring? Maybe so, especially if operating system internals are unfamiliar territory. But namespaces are one of the key factors behind the containerization revolution: isolating application processes within a single operating system instance enables the critical security and scaling mechanisms needed to run applications in containers.
Idea
Now that we've established that namespaces can be a good thing, what does NGINX Unit do with them? Before going into detail, here is some background on how the feature came about, in Tiago's own words:
"I'm researching better ways to effectively monitor and intercept traffic from my application. In my spare time, I've been studying the internals of NGINX Unit and thought process isolation might be a good fit. However, I haven't yet determined whether this is the best solution. Previously, I considered eBPF and looked at how it redirects packets at the kernel level, but then I had a different idea. Since NGINX Unit runs and manages applications in a similar way to a container runtime, what if we added application isolation support to NGINX Unit and used it instead of the runtime? This idea turned out to be closely related to the NGINX Unit team's own future plans and designs. A perfect coincidence.
In a cluster, the container runtime starts and stops the application, so we know everything running in the cluster. The NGINX Unit architecture not only does the same thing, but also implements traffic monitoring and interception by default: the only way to reach the application is through NGINX Unit's shared memory model. Remarkably, we are even able to isolate the network, similar to skipping the interface setup within the container, but the application can still communicate with the outside world through memory shared with NGINX Unit, without resorting to any costly network workarounds."
Configuration
From a configuration perspective, everything relies on the new isolation object, which defines namespace-related settings in the application object.
The namespace options in the isolation object are system-dependent because the types of resources that can be isolated into a namespace vary by operating system. Here's a basic example of creating a separate user ID and mount point namespace for your app:
{
  "applications": {
    "isolation_app": {
      "type": "external",
      "executable": "/tmp/go-app",
      "isolation": {
        "namespaces": {
          "credential": true,
          "mount": true
        }
      }
    }
  }
}
Currently, NGINX Unit supports configuring 6 of the 7 namespace isolation types that the Linux kernel supports. The corresponding configuration options are cgroup, credential, pid, mount, network, and uname. The last type, ipc, is not supported yet.
By default, all isolation types are disabled (option set to false), which means the app resides in NGINX Unit's namespaces. When you enable a specific isolation type for an app by setting its option to true, NGINX Unit creates a separate namespace of that type for the app. For example, an app can share most namespaces with NGINX Unit while having a separate mount or credential namespace of its own.
For more details on the options in the isolation object, see the NGINX Unit documentation.
Note: At the time of writing, all applications need to use the same ipc namespace as NGINX Unit; the shared memory mechanism requires this. You can add the ipc option to the configuration, but its setting has no effect. This requirement may change in future releases.
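The rules above (six supported options, Boolean values, ipc accepted but ignored) can be checked before a configuration is uploaded. The following is only a sketch: check_namespaces is a hypothetical helper written for this article, not part of NGINX Unit, and it does not reproduce NGINX Unit's own validation.

```python
# Hypothetical pre-flight check for the "namespaces" object.
# The option names come from the text above; the helper itself is
# an illustration and is NOT part of NGINX Unit.
SUPPORTED = {"cgroup", "credential", "pid", "mount", "network", "uname"}


def check_namespaces(namespaces):
    """Return a list of human-readable problems found in the options."""
    problems = []
    for name, value in namespaces.items():
        if name == "ipc":
            # Per the note above, the option is accepted but has no effect.
            problems.append("ipc: accepted but has no effect")
        elif name not in SUPPORTED:
            problems.append(f"{name}: unknown namespace option")
        if not isinstance(value, bool):
            # The string "true" is not the same as the JSON Boolean true.
            problems.append(f"{name}: expected true/false, got {value!r}")
    return problems


print(check_namespaces({"credential": True, "mount": "true"}))
print(check_namespaces({"credential": True, "mount": True}))
```

A check like this catches a quoted "true" early, which is easy to miss when editing JSON by hand.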
User and Group ID Mapping
App isolation in NGINX Unit includes support for UID and GID mapping, which can be configured if credential isolation is enabled (which means your app runs in a separate credential namespace). You can map a range of IDs in the app's namespace (which we call the "container namespace") to a range of the same length in the credential namespace of the app's parent process (which we call the "host namespace").
For example, suppose you have an application running with unprivileged user credentials and enable credential isolation to create a container namespace. NGINX Unit enables you to map the UID of an unprivileged user in the host namespace to UID 0 (root) in the container namespace. By design, UID 0 in a namespace has full permissions in that namespace, while the permissions of the UID it maps to in the host namespace remain restricted. Therefore, the app appears to have root capabilities, but only for resources within its namespace. The same goes for GID mapping.
Here, we map a range of 10 UIDs starting at UID 500 in the host namespace to the range starting at UID 0 in the container namespace (host: 500-509, container: 0-9). Likewise, we map a range of 20 GIDs starting at GID 1000 in the host namespace to the range starting at GID 0 in the container namespace (host: 1000-1019, container: 0-19):
{
  "applications": {
    "isolation_app": {
      "type": "external",
      "executable": "/bin/app",
      "isolation": {
        "namespaces": {
          "credential": true
        },
        "uidmap": [
          {
            "container": 0,
            "host": 500,
            "size": 10
          }
        ],
        "gidmap": [
          {
            "container": 0,
            "host": 1000,
            "size": 20
          }
        ]
      }
    }
  }
}
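The arithmetic behind such a mapping can be sketched in a few lines. This is an illustration only: container_to_host is a hypothetical helper that mirrors the container/host/size fields of the configuration above, and it says nothing about how NGINX Unit or the kernel implement the lookup.

```python
# Hypothetical helper illustrating how a range mapping translates IDs.
# The field names (container, host, size) follow the config above.
def container_to_host(container_id, id_map):
    """Translate an ID in the container namespace to the host namespace.

    Returns None when the ID falls outside every mapped range.
    """
    for entry in id_map:
        start = entry["container"]
        if start <= container_id < start + entry["size"]:
            # Same offset within the range on both sides of the mapping.
            return entry["host"] + (container_id - start)
    return None


uidmap = [{"container": 0, "host": 500, "size": 10}]
gidmap = [{"container": 0, "host": 1000, "size": 20}]

print(container_to_host(0, uidmap))   # container root -> host UID 500
print(container_to_host(9, uidmap))   # last mapped UID -> host UID 509
print(container_to_host(10, uidmap))  # outside the range -> None
print(container_to_host(19, gidmap))  # last mapped GID -> host GID 1019
```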
If you do not create an explicit UID and GID mapping, by default the currently effective UID (EUID) of the unprivileged NGINX Unit process in the host namespace is mapped to the root UID in the container namespace. Also note that UID/GID mapping is only available if the host operating system supports user namespaces. With that said, let’s move on to understand the impact of application isolation on applications running in NGINX Unit.
Getting Started: Basic Application Isolation
Let's start from the basics and understand the runtime behavior of this feature. To do this, we will use a Go app from our official repository (we run it when testing new versions).
The app responds to each request with a JSON-formatted list of the application's process IDs and namespace IDs, which it obtains by enumerating the contents of the /proc/self/ns/ directory. Next, we configure the application in NGINX Unit, ignoring the isolation object for now:
{
  "listeners": {
    "*:8080": {
      "pass": "applications/go-app"
    }
  },
  "applications": {
    "go-app": {
      "type": "external",
      "executable": "/tmp/go-app"
    }
  }
}
HTTP response from running application instance:
$ curl -X GET http://localhost:8080
{
  "PID": 5778,
  "UID": 65534,
  "GID": 65534,
  "NS": {
    "USER": 4026531837,
    "PID": 4026531836,
    "IPC": 4026531839,
    "CGROUP": 4026531835,
    "UTS": 4026531838,
    "MNT": 4026531840,
    "NET": 4026531992
  }
}
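The Go source of the test app is not reproduced here. As a rough idea of how such a response can be assembled, the following Python sketch reads the symbolic links under /proc/self/ns, whose targets look like user:[4026531837]. It is an approximation written for this article, not the app's actual code; the function names and the upper-cased keys are assumptions.

```python
import json
import os
import re

# Link targets under /proc/self/ns have the form "<type>:[<inode>]".
NS_LINK = re.compile(r"^[a-z_]+:\[(\d+)\]$")


def read_namespace_ids(ns_dir="/proc/self/ns"):
    """Map each entry name in ns_dir (upper-cased) to its namespace ID."""
    ids = {}
    for name in sorted(os.listdir(ns_dir)):
        try:
            target = os.readlink(os.path.join(ns_dir, name))
        except OSError:
            continue  # not a symbolic link, skip it
        match = NS_LINK.match(target)
        if match:
            ids[name.upper()] = int(match.group(1))
    return ids


def describe_process(ns_dir="/proc/self/ns"):
    """Collect the process, user, group, and namespace IDs in one dict."""
    return {
        "PID": os.getpid(),
        "UID": os.getuid(),
        "GID": os.getgid(),
        "NS": read_namespace_ids(ns_dir),
    }


# Only meaningful on Linux, where /proc/self/ns exists.
if __name__ == "__main__" and os.path.isdir("/proc/self/ns"):
    print(json.dumps(describe_process(), indent=2))
```

On recent kernels the directory also contains entries such as pid_for_children, so the output of this sketch may list more keys than the response shown above.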
Now we add the isolation object to enable application isolation. The isolation mechanism needs to restart the application to take effect. NGINX Unit will perform this task behind the scenes, so updates are very transparent from the end user's perspective.
{
  "listeners": {
    "*:8080": {
      "pass": "applications/go-app"
    }
  },
  "applications": {
    "go-app": {
      "type": "external",
      "user": "root",
      "executable": "/tmp/go-app",
      "isolation": {
        "namespaces": {
          "cgroup": true,
          "credential": true,
          "mount": true,
          "network": true,
          "pid": true,
          "uname": true
        },
        "uidmap": [
          {
            "host": 1000,
            "container": 0,
            "size": 1000
          }
        ],
        "gidmap": [
          {
            "host": 1000,
            "container": 0,
            "size": 1000
          }
        ]
      }
    }
  }
}
Note that the user option is set to root. This setup is required to enable mapping to UID/GID 0 in the container namespace.
We issue the command again:
$ curl -X GET http://localhost:8080
{
  "PID": 1,
  "UID": 0,
  "GID": 0,
  "NS": {
    "USER": 4026532180,
    "PID": 4026532184,
    "IPC": 4026531839,
    "CGROUP": 4026532185,
    "UTS": 4026532183,
    "MNT": 4026532181,
    "NET": 4026532187
  }
}
We now have application isolation enabled, and the namespace IDs have changed: they are now IDs of the container namespaces, not the host namespaces. The only one that remains unchanged is IPC, for the reasons stated above.
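Comparing the two responses by eye is error-prone, so here is a small sketch that does it programmatically. The namespace IDs are copied from the two responses in this section; unchanged_namespaces is a hypothetical helper name.

```python
# Namespace IDs copied from the two responses in this section.
before = {
    "USER": 4026531837, "PID": 4026531836, "IPC": 4026531839,
    "CGROUP": 4026531835, "UTS": 4026531838, "MNT": 4026531840,
    "NET": 4026531992,
}
after = {
    "USER": 4026532180, "PID": 4026532184, "IPC": 4026531839,
    "CGROUP": 4026532185, "UTS": 4026532183, "MNT": 4026532181,
    "NET": 4026532187,
}


def unchanged_namespaces(before, after):
    """Return the namespace types whose ID is the same in both responses."""
    return sorted(name for name in before if before[name] == after.get(name))


print(unchanged_namespaces(before, after))  # ['IPC']
```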
Dive Deeper: Web Application Isolation
To gain a deeper understanding, let's explore the practical impact of application isolation on networking, which is very important for web applications. Our tool of choice for this is nsenter, which is available on many operating system distributions supported by NGINX Unit. This utility lets us run arbitrary commands within a process's namespaces, and we will use it to demonstrate the changes caused by different settings in the isolation object of the same Go application configured earlier. First, we find out the host PID of the application process.
Once the PID is determined, we can enter the container namespace and inspect its network interfaces.
Note that only the loopback interface is available; however, the application is fully capable of handling external HTTP requests through NGINX Unit. Next, we remove the network option from the namespaces list in the configuration to see the network interfaces of the app with network isolation disabled.
We then repeat the same steps as above.
Now the application process inherits NGINX Unit's interface (eth0) at startup.
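The exact commands used in these steps are not reproduced here. As a hedged sketch, an nsenter invocation for a given host PID might be assembled as follows. The mapping from NGINX Unit option names to nsenter flags is an assumption made for illustration, the PID 5778 is only a placeholder taken from the first response, and the sketch builds the command line without running it (running nsenter typically requires root privileges).

```python
import shlex

# Assumed mapping from NGINX Unit isolation options to nsenter flags.
NAMESPACE_FLAGS = {
    "cgroup": "--cgroup",
    "credential": "--user",
    "mount": "--mount",
    "network": "--net",
    "pid": "--pid",
    "uname": "--uts",
}


def build_nsenter_command(host_pid, isolation_options, command):
    """Build an nsenter argument list that enters the given namespaces
    of the process with host_pid and runs command inside them."""
    args = ["nsenter", "--target", str(host_pid)]
    for option in isolation_options:
        args.append(NAMESPACE_FLAGS[option])
    return args + list(command)


# Placeholder PID; substitute the host PID of the application process.
cmd = build_nsenter_command(5778, ["network"], ["ip", "addr"])
print(shlex.join(cmd))  # nsenter --target 5778 --net ip addr
```

With network isolation enabled, a command like this would be expected to list only the loopback interface; with the network option removed, it would show the interfaces inherited from NGINX Unit.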