Catching hidden processes and processes that hide their CPU utilization

I have previously introduced many tricks for hiding a process, and for each trick I gave targeted countermeasures; you can read my articles from 2020/03 to 2020/08. There are too many to enumerate one by one.

Now I want to introduce a super simple method, an essential item in the craftsman's toolbox.

Whether you hide the process itself or hide its CPU utilization, as long as it runs on the CPU, every hiding trick is futile in the face of the following script:

#!/usr/local/bin/stap

global tbase
global tdelta

probe scheduler.cpu_on
{
	# The task has just been switched onto the CPU: record a timestamp.
	a = gettimeofday_us()
	tbase[pid(), execname()] = a
}

probe scheduler.cpu_off
{
	# The task is being switched off the CPU: the time elapsed since
	# cpu_on is how long it ran this slice; accumulate it into tdelta.
	t = tbase[pid(), execname()]
	a = gettimeofday_us()
	if (t != 0) {
		delete tbase[pid(), execname()]
		d = a - t
		b = tdelta[pid(), execname()]
		tdelta[pid(), execname()] = b + d
	}
}

probe timer.ms($1)
{
	# $1 is the sampling window in milliseconds, taken from the command line.
	exit()
}

# On exit, print the accumulated CPU time of every process that ran
# during the window, in descending order.
probe end
{
	foreach ([pid, name] in tdelta-) {
		printf("%s[%d] = %d\n", name, pid, tdelta[pid, name])
	}
}
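To run it, make the script executable and invoke it with the sampling window in milliseconds as the sole argument; that argument is what $1 refers to inside the script. Keep in mind that stap translates the script into a kernel module on the fly, so it must run as root and typically needs the systemtap packages plus the debuginfo matching the running kernel installed:

[root@localhost test]# chmod +x times.stp
[root@localhost test]# ./times.stp 5000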

Yes, as long as your process runs, it cannot escape the kernel's scheduler. Whenever a process is put on the CPU we take a timestamp, and when it is switched off we take another; the difference between the two is how long the process ran this time, and summing these deltas gives the CPU time consumed by any process.

Unless, of course, your process never runs on the CPU at all; but then what use is a process that never runs...
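By the way, the loop program in the demo below is just a CPU burner; its source is not shown here, and any busy loop will do as a stand-in, e.g. (a hypothetical equivalent, not the author's actual binary):

[root@localhost test]# sh -c 'while :; do :; done' &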

Let's see the effect:

[root@localhost test]# /root/loop &
[1] 5814
[root@localhost test]# /root/loop &
[2] 5815
[root@localhost test]#
[root@localhost test]# ./times.stp 5000  # sample for 5 seconds
loop[5814] = 2492109
loop[5815] = 2490044
top[5919] = 1417
kworker/0:1[31879] = 1218
stapio[7125] = 1191
xfsaild/dm-0[397] = 1028
tuned[1003] = 744
systemd-udevd[7126] = 397
sshd[1384] = 174
systemd-udevd[496] = 157
rcuos/0[11] = 105
systemd[1] = 105
kworker/0:2[6831] = 82
systemd-logind[645] = 62
rcu_sched[10] = 43
kworker/u2:2[285] = 7
watchdog/0[12] = 7
ksoftirqd/0[3] = 3
[root@localhost test]#

It nails them every single time. Notice how the two loop processes split the single CPU almost exactly in half, about 2.49 seconds each out of the 5-second window.

Now let's put the same principle to some real work.

This time we are not writing a script to catch someone; instead, we pretend we are optimizing the scheduler algorithm.

We want to measure every process's waiting time from entering the run queue to actually getting the CPU, to check whether any process is starving.

#!/usr/local/bin/stap

global tbase
global tdelta

probe kernel.function("activate_task")
{
	# activate_task() enqueues a task on a run queue: record the moment
	# the task ($p) became runnable.
	a = gettimeofday_us()
	tbase[task_pid($p), task_execname($p)] = a
}

probe scheduler.cpu_on
{
	# The task actually gets the CPU: the time elapsed since it was
	# activated is its wait in the run queue; accumulate it into tdelta.
	t = tbase[pid(), execname()]
	a = gettimeofday_us()
	if (t != 0) {
		delete tbase[pid(), execname()]
		d = a - t
		b = tdelta[pid(), execname()]
		tdelta[pid(), execname()] = b + d
	}
}

probe timer.ms($1)
{
	# $1 is the sampling window in milliseconds.
	exit()
}

# On exit, print each process's accumulated run-queue wait time,
# in descending order.
probe end
{
	foreach ([pid, name] in tdelta-) {
		printf("%s[%d] = %d\n", name, pid, tdelta[pid, name])
	}
}

Let's see the effect:

[root@localhost test]# ./wtime.stp 5000
stapio[7727] = 1034
rcuos/0[11] = 747
systemd-udevd[7728] = 244
kworker/0:1[31879] = 236
tuned[1003] = 159
khungtaskd[24] = 80
rcu_sched[10] = 64
systemd-udevd[496] = 58
khugepaged[27] = 20
kworker/u2:2[285] = 20
watchdog/0[12] = 18
auditd[609] = 18
kworker/0:0[7139] = 12
[root@localhost test]#
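One caveat (my own note, not part of the original script): a task that entered the run queue but never got the CPU during the window never makes it into tdelta at all, and that is exactly the starvation victim we are hunting for. SystemTap runs every probe end block in a script, so a sketch of a remedy is to add a second one that dumps whatever is still sitting in tbase at exit:

probe end
{
	now = gettimeofday_us()
	# Entries still in tbase were activated but never scheduled within
	# the window: the strongest starvation suspects.
	foreach ([pid, name] in tbase) {
		printf("still waiting: %s[%d] = %d\n", name, pid, now - tbase[pid, name])
	}
}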

Finally, it is worth noting that the runtime overhead of these stap scripts is significant: the internal implementation of associative arrays with compound keys is complicated, and cpu_on/cpu_off lie on the hottest of hot paths in the system! The methods above are for diagnosing system anomalies, not for routine use, and should not be left running for long in a production environment.

Of course, the manager is an exception.
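If you really must keep such a probe running for longer, SystemTap's statistical aggregates (<<<) are the cheaper tool: they feed per-CPU buffers instead of doing a locked read-modify-write on a shared array in the hot path. Below is a minimal sketch of the first script rewritten that way; it is my own variant, not the original, and it assumes a SystemTap new enough to sort arrays by @sum (it also keys the timestamp by tid() so that threads of one process on different CPUs don't clobber each other):

#!/usr/local/bin/stap

global tbase
global cputime

probe scheduler.cpu_on
{
	tbase[tid()] = gettimeofday_us()
}

probe scheduler.cpu_off
{
	t = tbase[tid()]
	if (t != 0) {
		# <<< appends to a per-CPU statistic; no shared accumulation
		# array is locked on this hot path.
		cputime[pid(), execname()] <<< gettimeofday_us() - t
		delete tbase[tid()]
	}
}

probe timer.ms($1)
{
	exit()
}

probe end
{
	foreach ([pid, name] in cputime @sum-) {
		printf("%s[%d] = %d\n", name, pid, @sum(cputime[pid, name]))
	}
}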


Leather shoes from Wenzhou, Zhejiang get wet; when it rains and water gets in, they won't get fat.

Origin: blog.csdn.net/dog250/article/details/108249157