Linux: the systemd-logind process uses 100% CPU

When I started working remotely, I received a text message warning that the CPU usage of the system was too high. I immediately logged in to the system to check. The login process was abnormally slow, but I finally logged in.

ABRT reports a problem

ABRT is an automatic bug reporting tool. Its main purpose is to give users concise but comprehensive error information.

It scans the system log for known failure strings such as oops, machine-check, and Xorg failures. Besides matching against the system log, it also checks for and extracts failure records such as kdump files. It provides the abrt-cli command to report and view these system errors.

I ran the command suggested by the login notice to view the report.

The report pointed to systemd-logind, so I cross-checked with top.

systemd-logind was using 100% of the CPU, which drove the system load up.

What is systemd-logind?

systemd-logind is a system service for managing user logins. Its responsibilities are as follows:

  • Continuously track users' sessions, processes, and idle state. Everything is placed under user.slice: each user is assigned a slice unit, and each of the user's sessions is assigned a scope unit. In addition, a dedicated service manager (an instance of the user@.service template) is started for each logged-in user.
  • Generate and manage session IDs. If auditing is enabled and an audit session ID has already been set for a session, that ID is reused as the session ID. Otherwise an independent session counter is used (that is, a session ID is generated independently).
  • Provide polkit-based authentication and authorization for users' privileged operations (such as shutting down or hibernating the system)
  • Implement the logic that lets applications inhibit shutting down or sleeping the system
  • Handle the actions of the hardware shutdown/sleep button
  • Multi-seat management
  • Session switching management
  • Manage user access to the device
  • Automatically start the text login program (agetty) when starting the virtual terminal, and manage the user's runtime directory

So why was the login slow, and why was systemd-logind killed after I logged in? Searching the system messages log turned up the relevant entries.

The log makes it clear that the system had run out of buffer space. systemd-logind could not read /run/systemd/users/0, so it could not create a session for the user logging in. After 3 minutes without a response it was detected and killed by the watchdog, then restarted to try again.

The log also shows that the root user was still assigned a session in some cases. I think this is because resources happened to be released at that moment, so the request to create the session could be satisfied.

To confirm this, I used strace to look at the system calls systemd-logind makes when a user logs in.

First check the pid of systemd-logind

Then attach to it with strace -p to trace its system calls
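The two steps above can be sketched in Python. This is only an illustration, not the commands used in the original screenshots: find_pids and strace_command are hypothetical helper names, and the sketch assumes a Linux /proc filesystem where /proc/<pid>/comm holds the process name.

```python
import os


def find_pids(name, proc_root="/proc"):
    """Return the PIDs whose /proc/<pid>/comm matches `name`.

    Note: the kernel truncates comm to 15 characters; "systemd-logind"
    is 14 characters long, so it fits.
    """
    pids = []
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # skip non-process entries such as "self" or "meminfo"
        try:
            with open(os.path.join(proc_root, entry, "comm")) as f:
                comm = f.read().strip()
        except OSError:
            continue  # the process exited or is not readable
        if comm == name:
            pids.append(int(entry))
    return sorted(pids)


def strace_command(pid):
    """Build a strace command line that follows file-related system calls."""
    return f"strace -f -e trace=file -p {pid}"
```

Calling find_pids("systemd-logind") and passing the result to strace_command gives a command to run as root while logging in from a second terminal. The -e trace=file filter is a choice made for this sketch so that the paths being opened stand out.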

The trace shows that during login systemd-logind accesses the files under /run/systemd/users

Let's look at the contents stored in the /run/systemd/users/0 file

User ID 0 is root, so this file holds the root user's session information. It lists many active sessions, even though I had logged in only once.
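Counting the sessions in that file by eye is error-prone, so here is a minimal sketch that does it. It assumes the file consists of KEY=VALUE lines with a space-separated SESSIONS entry. The file is private state of systemd-logind, not a stable interface, so the format may differ between versions. SAMPLE is made-up illustrative data, not output from the affected machine, and both function names are hypothetical.

```python
def parse_user_state(text):
    """Parse KEY=VALUE lines into a dict, skipping blanks and # comments."""
    state = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition("=")
        if sep:  # ignore lines without an "=" sign
            state[key] = value
    return state


def session_ids(state, key="SESSIONS"):
    """Return the space-separated session IDs stored under `key`."""
    return state.get(key, "").split()


# Illustrative sample only (assumed format, invented values).
SAMPLE = """\
# illustrative sample, not real output
NAME=root
STATE=active
SLICE=user-0.slice
SESSIONS=1201 1187 1175 42
ACTIVE_SESSIONS=1201 1187 1175 42
"""

state = parse_user_state(SAMPLE)
print(state["NAME"], len(session_ids(state)))
```

On a real system the text would come from reading /run/systemd/users/0 as root. A session count far above the number of interactive logins is the symptom described here.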

Because every user gets a slice and every session gets a scope inside it, the sessions can be inspected through the slices. First check the system's slices

Then check user-0.slice

Besides the user login, there are a large number of sessions created by crond scheduled jobs. The detailed commands or scripts are visible under each session.
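The same information can be read from the cgroup filesystem. The sketch below lists the session scopes under the root user's slice (user-0.slice in systemd's naming) together with their processes. The function names are hypothetical, and the slice directory is an assumption that depends on the cgroup layout: typically /sys/fs/cgroup/systemd/user.slice/user-0.slice with cgroup v1, or /sys/fs/cgroup/user.slice/user-0.slice with the unified hierarchy.

```python
import os


def session_scopes(slice_dir):
    """Map each session-*.scope directory under slice_dir to its PIDs."""
    scopes = {}
    for entry in sorted(os.listdir(slice_dir)):
        if not (entry.startswith("session-") and entry.endswith(".scope")):
            continue  # skip other units and cgroup control files
        procs_file = os.path.join(slice_dir, entry, "cgroup.procs")
        try:
            with open(procs_file) as f:
                scopes[entry] = [int(p) for p in f.read().split()]
        except OSError:
            scopes[entry] = []  # scope vanished or is not readable
    return scopes


def command_line(pid, proc_root="/proc"):
    """Return the command line of `pid`, or "" if it cannot be read."""
    try:
        with open(os.path.join(proc_root, str(pid), "cmdline"), "rb") as f:
            raw = f.read()
    except OSError:
        return ""
    # Arguments in /proc/<pid>/cmdline are separated by NUL bytes.
    return raw.replace(b"\0", b" ").decode(errors="replace").strip()
```

Passing the slice directory to session_scopes and each PID to command_line shows which scopes belong to cron jobs and which to interactive logins.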

systemd-logind has a bug here: sessions created for crond jobs are often not cleaned up in time. The resources they keep holding are the reason a new session cannot be opened.

The slice view above also shows that user-0.slice is managed through cgroups. Resource limits for a user's processes can be configured per slice or per session in the /run/systemd/system/ directory

For example, the configuration under user-0.slice.d

At this point the problem is broadly understood. systemd-logind requests resources when a user logs in. With system resources exhausted, it cannot create a session, so the login fails or stalls.

Some suggestions on the Internet say to turn off systemd-logind. I personally advise against that, because it serves an important purpose: it makes it easier for the system to manage user resources through cgroups.

A better approach is to release resources regularly, and to spread scheduled tasks across different users instead of putting them all under root.

Administrators usually put scheduled tasks in root's crontab. As a result, any temporary files or log files those jobs create are owned by root. For example, if a job runs a web application's command as root, the files it produces are owned by root, and the web user can then no longer access them and reports errors. So reduce the number of scheduled tasks under root by moving them to the appropriate user, for example with crontab -e -u web for a user named web.

 

 

Origin blog.csdn.net/whatday/article/details/115334635