Background
When running the top or ps command, we find that some process names in the COMMAND column are enclosed in [], for example:
PID PPID USER STAT VSZ %VSZ %CPU COMMAND
1542 928 root R 1064 2% 5% top
1 0 root S 1348 2% 0% /sbin/procd
928 1 root S 1060 2% 0% /bin/ash --login
115 2 root SW 0 0% 0% [kworker/u4:2]
6 2 root SW 0 0% 0% [kworker/u4:0]
4 2 root SW 0 0% 0% [kworker/0:0]
697 2 root SW 0 0% 0% [kworker/1:3]
703 2 root SW 0 0% 0% [kworker/0:3]
15 2 root SW 0 0% 0% [kworker/1:0]
27 2 root SW 0 0% 0% [kworker/1:1]
Besides exploring what [] means in top, this article is more about how to track down the answer starting from nothing but the symptom.
We go from the application code down to the kernel code. Teaching someone to fish is better than handing them a fish, don't you think?
Readers who are not interested in the analysis can jump straight to the conclusion.
Application code logic analysis
Keywords: COMMAND
After getting the busybox source code, first try a brute-force search for the keyword:
[GMPY@12:22 busybox-1.27.2]$grep "COMMAND" -rnw *
There are a great many matches:
applets/usage_pod.c:79: printf("=head1 COMMAND DESCRIPTIONS\n\n");
archival/cpio.c:100: --rsh-command=COMMAND Use remote COMMAND instead of rsh
docs/BusyBox.html:1655:<p>which [COMMAND]...</p>
docs/BusyBox.html:1657:<p>Locate a COMMAND</p>
docs/BusyBox.txt:93:COMMAND DESCRIPTIONS
docs/BusyBox.txt:112: brctl COMMAND [BRIDGE [INTERFACE]]
docs/BusyBox.txt:612: ip ip [OPTIONS] address|route|link|neigh|rule [COMMAND]
docs/BusyBox.txt:614: OPTIONS := -f[amily] inet|inet6|link | -o[neline] COMMAND := ip addr
docs/BusyBox.txt:1354: which [COMMAND]...
docs/BusyBox.txt:1356: Locate a COMMAND
......
There are so many matches because a lot of them are in non-source files. So how about searching only the C files?
[GMPY@12:25 busybox-1.27.2]$find -name "*.c" -exec grep -Hn --color=auto "COMMAND" {} \;
This time there are only 71 lines of results. Skimming the matching files turns up something interesting:
......
./shell/ash.c:9707: if (cmdentry.u.cmd == COMMANDCMD) {
./editors/vi.c:1109: // get the COMMAND into cmd[]
./procps/lsof.c:31: * COMMAND PID USER FD TYPE DEVICE SIZE NODE NAME
./procps/top.c:626: " COMMAND");
./procps/top.c:701: /* PID PPID USER STAT VSZ %VSZ [%CPU] COMMAND */
./procps/top.c:841: strcpy(line_buf, HDR_STR " COMMAND");
./procps/top.c:854: /* PID VSZ VSZRW RSS (SHR) DIRTY (SHR) COMMAND */
./procps/ps.c:441: { 16 , "comm" ,"COMMAND",func_comm ,PSSCAN_COMM },
......
In busybox, each command lives in its own source file, which gives the code a clean structure. We go straight to line 626 of procps/top.c.
Function: display_process_list
Line 626 of procps/top.c belongs to the function display_process_list. Here is a quick look at its logic:
static NOINLINE void display_process_list(int lines_rem, int scr_width)
{
......
/* Print the table header */
printf(OPT_BATCH_MODE ? "%.*s" : "\033[7m%.*s\033[0m", scr_width,
" PID PPID USER STAT VSZ %VSZ"
IF_FEATURE_TOP_SMP_PROCESS(" CPU")
IF_FEATURE_TOP_CPU_USAGE_PERCENTAGE(" %CPU")
" COMMAND");
......
/* Iterate over the descriptor of each process */
while (--lines_rem >= 0) {
if (s->vsz >= 100000)
sprintf(vsz_str_buf, "%6ldm", s->vsz/1024);
else
sprintf(vsz_str_buf, "%7lu", s->vsz);
/* Print everything in the row except COMMAND, e.g. PID, USER, STAT */
col = snprintf(line_buf, scr_width,
"\n" "%5u%6u %-8.8s %s%s" FMT
IF_FEATURE_TOP_SMP_PROCESS(" %3d")
IF_FEATURE_TOP_CPU_USAGE_PERCENTAGE(FMT)
" ",
s->pid, s->ppid, get_cached_username(s->uid),
s->state, vsz_str_buf,
SHOW_STAT(pmem)
IF_FEATURE_TOP_SMP_PROCESS(, s->last_seen_on_cpu)
IF_FEATURE_TOP_CPU_USAGE_PERCENTAGE(, SHOW_STAT(pcpu))
);
/* The key part: read the cmdline */
if ((int)(col + 1) < scr_width)
read_cmdline(line_buf + col, scr_width - col, s->pid, s->comm);
......
}
}
Leaving out the unrelated code, the logic of the function is clear:
- Before this function runs, the code has already walked all processes and built a descriptor structure for each one
- display_process_list traverses those structures and prints the information in a fixed order
- read_cmdline obtains and prints the process name
Let's step into the function read_cmdline.
Function: read_cmdline
void FAST_FUNC read_cmdline(char *buf, int col, unsigned pid, const char *comm)
{
......
sprintf(filename, "/proc/%u/cmdline", pid);
sz = open_read_close(filename, buf, col - 1);
if (sz > 0) {
......
while (sz >= 0) {
if ((unsigned char)(buf[sz]) < ' ')
buf[sz] = ' ';
sz--;
}
......
if (strncmp(base, comm, comm_len) != 0) {
......
snprintf(buf, col, "{%s}", comm);
......
} else {
snprintf(buf, col, "[%s]", comm ? comm : "?");
}
}
Leaving out the unrelated code, I found:
- The process name is obtained from /proc/<PID>/cmdline
- If /proc/<PID>/cmdline is empty, comm is used instead, and it is enclosed in []
- If the basename of cmdline does not match comm, comm is shown enclosed in {}
To keep this readable, we will not analyze cmdline and comm any further here.
We focus on one question: under what circumstances is /proc/<PID>/cmdline empty?
Kernel code logic analysis
Keywords: cmdline
/proc is a mount of proc, a special kind of file system, and cmdline is certainly one of its own features.
Suppose we are complete newcomers to the kernel: all we can do is search for the keyword cmdline in the proc part of the kernel source.
[GMPY@09:54 proc]$cd fs/proc && grep "cmdline" -rnw *
Two of the matching files stand out: base.c and cmdline.c
array.c:11: * Pauline Middelink : Made cmdline,envline only break at '\0's, to
base.c:224: /* Check if process spawned far enough to have cmdline. */
base.c:708: * May current process learn task's sched/cmdline info (for hide_pid_min=1)
base.c:2902: REG("cmdline", S_IRUGO, proc_pid_cmdline_ops),
base.c:3294: REG("cmdline", S_IRUGO, proc_pid_cmdline_ops),
cmdline.c:26: proc_create("cmdline", 0, NULL, &cmdline_proc_fops);
Makefile:16:proc-y += cmdline.o
vmcore.c:1158: * If elfcorehdr= has been passed in cmdline or created in 2nd kernel,
The logic in cmdline.c is very simple, and it is easy to see that it implements /proc/cmdline, which is not what we need.
So we turn our attention to base.c, where the relevant code is:
REG("cmdline", S_IRUGO, proc_pid_cmdline_ops),
Experience and intuition tell me that:
- cmdline: the file name
- S_IRUGO: the file permissions
- proc_pid_cmdline_ops: the file operations structure for this file
Sure enough, looking up proc_pid_cmdline_ops, we find it is defined as:
static const struct file_operations proc_pid_cmdline_ops = {
.read = proc_pid_cmdline_read,
.llseek = generic_file_llseek,
};
Function: proc_pid_cmdline_read
static ssize_t proc_pid_cmdline_read(struct file *file, char __user *buf,
size_t _count, loff_t *pos)
{
......
/* Get the descriptor of the process's virtual address space */
mm = get_task_mm(tsk);
......
/* Get the addresses of argv and env */
arg_start = mm->arg_start;
arg_end = mm->arg_end;
env_start = mm->env_start;
env_end = mm->env_end;
......
while (count > 0 && len > 0) {
......
/* Compute the address offset */
p = arg_start + *pos;
while (count > 0 && len > 0) {
......
/* Read data from the process's address space */
nr_read = access_remote_vm(mm, p, page, _count, FOLL_ANON);
......
}
}
}
A newcomer is probably wondering at this point: how do you know what access_remote_vm does?
It is simple: jump to the access_remote_vm function and you will see that it is documented with a comment:
/**
* access_remote_vm - access another process' address space
* @mm: the mm_struct of the target address space
* @addr: start address to access
* @buf: source or destination buffer
* @len: number of bytes to transfer
* @gup_flags: flags modifying lookup behaviour
*
* The caller must hold a reference on @mm.
*/
int access_remote_vm(struct mm_struct *mm, unsigned long addr,
void *buf, int len, unsigned int gup_flags)
{
return __access_remote_vm(NULL, mm, addr, buf, len, gup_flags);
}
In the Linux kernel source, many functions carry well-structured comments describing their purpose, parameters, caveats and so on. We should make full use of these resources when studying the code.
That was a digression; let's get back on topic.
From proc_pid_cmdline_read we can see that reading /proc/<PID>/cmdline actually reads the data in the process address space starting at arg_start. So when that region of the address space is empty, there is of course nothing to read. The question then becomes: when is the address-space region identified by arg_start empty?
Keywords: arg_start
Address spaces are not just a proc matter, so we search the whole kernel source for the keyword:
[GMPY@09:55 proc]$find -name "*.c" -exec grep --color=auto -Hnw "arg_start" {} \;
There are many matches. I do not want to read them one by one, and the matched code itself gives no clear direction:
./mm/util.c:635: unsigned long arg_start, arg_end, env_start, env_end;
......
./kernel/sys.c:1747: offsetof(struct prctl_mm_map, arg_start),
......
./fs/exec.c:709: mm->arg_start = bprm->p - stack_shift;
./fs/exec.c:722: mm->arg_start = bprm->p;
......
./fs/binfmt_elf.c:301: p = current->mm->arg_end = current->mm->arg_start;
./fs/binfmt_elf.c:1495: len = mm->arg_end - mm->arg_start;
./fs/binfmt_elf.c:1499: (const char __user *)mm->arg_start, len))
......
./fs/proc/base.c:246: len1 = arg_end - arg_start;
......
But the names of the matching files give me an idea:
/proc/<PID>/cmdline is a per-process attribute, and everything from task_struct to mm_struct describes a process and its resources. So when would arg_start in mm_struct be modified? When the process is initialized!
By further association, creating a user-space process takes no more than two steps:
- fork
- exec
fork only creates a new task_struct, and the child's mm_struct starts out as a copy of the parent's, arg_start included. Only at exec does the process get an mm_struct set up for the new program, so arg_start must be modified during exec! And the files matching arg_start happen to include exec.c.
After reading setup_arg_pages, the function in fs/exec.c that contains the keyword, I did not find the key code, so I went back to the list of matching file names, which led to a further association:
exec runs a new program, and loading a new program really means loading a binary file, and the keyword matches happen to include binfmt_elf.c!
Tracking down a problem takes more than being able to read code; association is sometimes very effective.
Function: create_elf_tables
In binfmt_elf.c, the keyword arg_start appears in the function create_elf_tables. The function is very long, so here is a trimmed-down view:
static int
create_elf_tables(struct linux_binprm *bprm, struct elfhdr *exec,
unsigned long load_addr, unsigned long interp_load_addr)
{
......
/* Populate argv and envp */
p = current->mm->arg_end = current->mm->arg_start;
while (argc-- > 0) {
......
if (__put_user((elf_addr_t)p, argv++))
return -EFAULT;
......
}
......
current->mm->arg_end = current->mm->env_start = p;
while (envc-- > 0) {
......
if (__put_user((elf_addr_t)p, envp++))
return -EFAULT;
......
}
......
}
This function fills in argv and envp by walking the strings that start at arg_start and env_start respectively, and records where the argument area ends.
Next, we trace upward through the callers of create_elf_tables.
First of all, create_elf_tables is declared static, which means it is only visible within its own file. Searching within the file, we find the function one level up:
static int load_elf_binary(struct linux_binprm *bprm)
It is static as well, so we keep searching this file for load_elf_binary and find the following code:
static struct linux_binfmt elf_format = {
.module = THIS_MODULE,
.load_binary = load_elf_binary,
.load_shlib = load_elf_library,
.core_dump = elf_core_dump,
.min_coredump = ELF_EXEC_PAGESIZE,
};
static int __init init_elf_binfmt(void)
{
register_binfmt(&elf_format);
return 0;
}
core_initcall(init_elf_binfmt);
At this point the code structure is very clear: the function load_elf_binary is assigned to a struct linux_binfmt, which is registered with the upper layer through register_binfmt and so provides a callback for the upper layer.
Keywords: load_binary
Why settle on the keyword load_binary? Given .load_binary = load_elf_binary, the upper layer should invoke it as XXX->load_binary(...), so searching for load_binary locates where this callback is called.
[GMPY@09:55 proc]$ grep "\->load_binary" -rn *
Luckily, this callback is only called in fs/exec.c:
fs/exec.c:78: if (WARN_ON(!fmt->load_binary))
fs/exec.c:1621: retval = fmt->load_binary(bprm);
Line 1621 of fs/exec.c belongs to the function search_binary_handler. Unfortunately, EXPORT_SYMBOL(search_binary_handler); is present, which means the function may be called from many places, so continuing the forward analysis is clearly difficult. Why not work backwards instead?
When one road is a dead end, look at it from another angle and the answer may be close at hand.
Since continuing from search_binary_handler is not easy, let's check whether the execve system call leads step by step to search_binary_handler.
Keywords: exec
In Linux-4.9, system calls are generally defined as SYSCALL_DEFINE<number of arguments>(<function name>..., so we search globally for the keyword to find out where the system call is defined:
[GMPY@09:55 proc]$ grep "SYSCALL_DEFINE.*exec" -rn *
This locates the file fs/exec.c:
fs/exec.c:1905:SYSCALL_DEFINE3(execve,
fs/exec.c:1913:SYSCALL_DEFINE5(execveat,
fs/exec.c:1927:COMPAT_SYSCALL_DEFINE3(execve, const char __user *, filename,
fs/exec.c:1934:COMPAT_SYSCALL_DEFINE5(execveat, int, fd,
kernel/kexec.c:187:SYSCALL_DEFINE4(kexec_load, unsigned long, entry, unsigned long, nr_segments,
kernel/kexec.c:233:COMPAT_SYSCALL_DEFINE4(kexec_load, compat_ulong_t, entry,
kernel/kexec_file.c:256:SYSCALL_DEFINE5(kexec_file_load, int, kernel_fd, int, initrd_fd,
I will not go through the subsequent calls one by one; the call chain is summarized below:
execve -> do_execve -> do_execveat_common -> exec_binprm -> search_binary_handler
In the end we arrive back at search_binary_handler.
With this analysis, we have established the assignment logic:
- execve initializes the mm_struct when it runs a new program
- The argv and envp passed to execve are saved at the addresses given by arg_start and env_start
- cat /proc/<PID>/cmdline then reads the data from the virtual address arg_start
Therefore, any user-space process created through the execve system call will have content in /proc/<PID>/cmdline, but this still does not explain when cmdline is empty.
We know that in Linux, processes can be divided into user-space processes and kernel-space processes. Since the cmdline of a user-space process is not empty, let's look at kernel processes.
Function: kthread_run
Kernel drivers often create kernel processes through kthread_run. We use this function as the entry point and analyze whether cmdline is assigned when a kernel process is created.
Starting directly from kthread_run and following the call chain, we find that the real work is done by the function __kthread_create_on_node:
kthread_run -> kthread_create -> kthread_create_on_node -> __kthread_create_on_node
With the redundant code removed, let's focus on what the function does:
static struct task_struct *__kthread_create_on_node(int (*threadfn)(void *data),
void *data, int node, const char namefmt[], va_list args)
{
/* Store the attributes of the new process in a kthread_create_info structure */
struct kthread_create_info *create = kmalloc(sizeof(*create), GFP_KERNEL);
create->threadfn = threadfn;
create->data = data;
create->node = node;
create->done = &done;
/* Add the initialized create to the list and wake up kthreadd_task to do the actual creation */
list_add_tail(&create->list, &kthread_create_list);
wake_up_process(kthreadd_task);
/* Wait for the creation to complete */
wait_for_completion_killable(&done);
......
task = create->result;
if (!IS_ERR(task)) {
......
/* After creation, set the process name; the name attribute here is comm, which is different from cmdline */
vsnprintf(name, sizeof(name), namefmt, args);
set_task_comm(task, name);
......
}
}
The analysis is the same as before, so I will not repeat it. In summary, the function does two things:
- Wakes up the kthreadd_task process to create the new process
- Sets the process attributes, which include comm but not cmdline
Recall from the application code analysis: if /proc/<PID>/cmdline is empty, comm is used and enclosed in [].
Therefore, for a kernel process created by kthread_run / kthread_create, the content of /proc/<PID>/cmdline is empty.
Summary
Using the question of whether the process names shown by top and ps are enclosed in [] as the entry point, this article has analyzed the underlying mechanism in depth, from the user program down to the kernel code.
The analysis mainly relied on the following methods:
- Keyword search: from COMMAND in the top program to arg_start, load_binary and exec in the kernel source
- Function comments: the description above the function access_remote_vm
- Association: from the attributes of a process, to how a user-space process is created, to the code that handles the keyword arg_start
- Reverse thinking: since tracing the callers upward from search_binary_handler is difficult, check instead whether the execve system call leads step by step to search_binary_handler
From this analysis, we conclude that:
1. Processes created in user space are shown without [] in top/ps
2. Processes created in kernel space are shown with [] in top/ps
The actual output of ps is consistent with this analysis.
My knowledge is limited; if anything in the analysis above is imprecise, I welcome discussion so that we can learn together.