Linux-simple shell

1. Shell related knowledge

1.1 External commands and built-in commands

I think of the shell as having two parts: a front end that processes strings and a back end that calls other files. The front end parses the string entered by the user and passes the result to the back end, which then runs other files. These files are the "external commands". They are called external precisely because their functionality is not implemented inside the shell. Because the shell does not need to implement external commands itself, this course project is not very difficult. Internal (built-in) commands, by contrast, are implemented inside the shell, that is, in the shell's own code. Fortunately, the project does not require implementing internal commands.

To check whether a command is internal or external, use the type command. For example, entering

type cd

prints

cd is a shell builtin

which means cd is an internal command, while entering

type less

prints

less is /usr/bin/less

which means less is an external command.
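
For an external command, the back end ultimately has to locate an executable file. The following is a minimal sketch of that lookup, assuming the usual convention of searching the directories listed in the PATH environment variable. The helper name find_in_path is hypothetical and is not part of ThyShell, which leaves the search to execvp.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical helper: returns 1 and writes the full path into out when
 * name is found as an executable entry in some $PATH directory, else 0.
 * When 0 is returned, the contents of out are unspecified. */
int find_in_path(const char *name, char *out, size_t outsz)
{
    assert(name != NULL && out != NULL && outsz > 0);
    const char *path = getenv("PATH");
    if (path == NULL || *name == '\0')
        return 0;

    while (*path != '\0')
    {
        size_t dirlen = strcspn(path, ":"); /* length of this PATH entry */
        if (dirlen > 0)
        {
            int n = snprintf(out, outsz, "%.*s/%s", (int)dirlen, path, name);
            if (n > 0 && (size_t)n < outsz && access(out, X_OK) == 0)
                return 1;
        }
        path += dirlen;
        if (*path == ':')
            path++; /* skip the separator */
    }
    return 0;
}
```

A built-in such as cd has no such file, which is why the shell must handle it in its own code.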

1.2 File descriptors

Under the "everything is a file" design philosophy, what we call "redirection" is the process of replacing the standard input and output files with files of our own choice. This involves some operating system knowledge, as follows:

[Figure: the relationship between processes and files]
The figure above is a complete schematic of the relationship between processes and files. On the far left is the file descriptor table, of which each process has its own. It is essentially a lookup table: we use a file descriptor (fd) to look up the corresponding entry.

First, some properties of fds:

  • Each process has its own space of fd numbers, allocated in increasing order. Numbers freed by closing an fd may be reused. The number of fds a single process can have open at the same time is limited by system settings.
  • By convention, when the shell starts a new application, it always opens three descriptors, 0, 1, and 2, as stdin, stdout, and stderr. In C they are named by the macros STDIN_FILENO, STDOUT_FILENO, and STDERR_FILENO.
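
The reuse of fd numbers can be observed directly. The sketch below relies on the POSIX rule that open returns the lowest-numbered descriptor not currently in use, and assumes a single-threaded program. The function name fd_number_is_reused is made up for this illustration.

```c
#include <assert.h>
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical demo: returns 1 when the descriptor number freed by close()
 * is handed out again by the next open(), 0 if not, -1 if open() fails. */
int fd_number_is_reused(const char *path)
{
    assert(path != NULL);

    int first = open(path, O_RDONLY);
    if (first < 0)
        return -1;
    close(first); /* the number held by first is now free */

    int second = open(path, O_RDONLY);
    if (second < 0)
        return -1;
    close(second);

    return first == second;
}
```

Calling fd_number_is_reused("/dev/null") in a program that opens nothing else in between is expected to return 1.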

The entry we look up points to something called a file table entry, which is still not the file itself. It can be regarded as the state of an open file: it records information such as our access mode for the file and the current read/write offset. It follows that two different file table entries can refer to the same file while differing in state such as access mode and offset. The file table is shared by all processes.

The file table entry contains a pointer to a v-node, the control block that records the static information of the file. Files and v-nodes are in one-to-one correspondence.

In short, when programming in C we manipulate files either through an fd (the most fundamental way) or through a FILE* (a wrapper provided by the C library). Here we use fds, because redirection and pipes are implemented with them and the relevant system calls take file descriptors as parameters.

1.3 Opening and closing of files

A user process opens and closes files through system calls. To open a file, we have

int open(const char *filename, int flags, mode_t mode);

This function opens the file named filename with the access mode described by flags. flags is built from a set of macros, which can be combined with bitwise OR:

macro     meaning
O_RDONLY  read only
O_WRONLY  write only
O_RDWR    read and write
O_CREAT   if the file does not exist, create an empty file
O_TRUNC   if the file already exists, truncate it
O_APPEND  before each write, set the file position to the end of the file

mode specifies the access permission bits of a newly created file. There are macros for these as well, but I will not go into them here; 0666 is generally fine.

The function returns the file descriptor fd of the opened file, or -1 on failure.

When we need to close a file, we call

int close(int fd);
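
As an illustration of how the flags combine, here is a small sketch that writes a file with either O_TRUNC or O_APPEND and reads it back. The helper names write_file and read_file are hypothetical and only serve this example.

```c
#include <assert.h>
#include <fcntl.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical helper: writes text to path. how is O_TRUNC to overwrite
 * or O_APPEND to append. Returns 0 on success, -1 on failure. */
int write_file(const char *path, const char *text, int how)
{
    assert(path != NULL && text != NULL);
    int fd = open(path, O_WRONLY | O_CREAT | how, 0666);
    if (fd < 0)
        return -1;

    size_t len = strlen(text);
    ssize_t written = write(fd, text, len);
    int closed = close(fd);
    return (written == (ssize_t)len && closed == 0) ? 0 : -1;
}

/* Hypothetical helper: reads up to size bytes from the start of path.
 * Returns the number of bytes read, or -1 on failure. */
ssize_t read_file(const char *path, char *buf, size_t size)
{
    assert(path != NULL && buf != NULL);
    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;

    ssize_t n = read(fd, buf, size);
    close(fd);
    return n;
}
```

These two modes are what a shell needs for > (O_TRUNC) and >> (O_APPEND) respectively.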

1.4 Reading and writing files

This actually has nothing to do with implementing the shell, but it was the first time I understood it, so I am recording it here. If every read or write of a file made a system call, the cost would undoubtedly be high, because each call switches between user mode and kernel mode.

Functions we commonly use, such as getc, are called buffered read/write functions. After the file is opened, the first read fills an entire buffer with one system call; subsequent calls to getc hand out bytes from that buffer one at a time, and only when the buffer is empty is another system call made to refill it.
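
The buffering idea can be sketched in a few lines. This is not how the C library implements getc; it is a simplified illustration with a hypothetical BufReader type that counts how many read system calls it makes.

```c
#include <assert.h>
#include <unistd.h>

#define BUF_SIZE 4096

/* Hypothetical buffered reader, for illustration only */
typedef struct
{
    int fd;             /* descriptor to read from */
    char buf[BUF_SIZE]; /* bytes fetched by the last read() */
    ssize_t len;        /* number of valid bytes in buf */
    ssize_t pos;        /* index of the next byte to hand out */
    long refills;       /* how many read() system calls were made */
} BufReader;

void buf_init(BufReader *r, int fd)
{
    assert(r != NULL);
    r->fd = fd;
    r->len = 0;
    r->pos = 0;
    r->refills = 0;
}

/* Returns the next byte (0..255), or -1 on end of file or error.
 * A system call happens only when the buffer has been used up. */
int buf_getc(BufReader *r)
{
    if (r->pos >= r->len)
    {
        r->len = read(r->fd, r->buf, BUF_SIZE);
        r->refills++;
        r->pos = 0;
        if (r->len <= 0)
        {
            r->len = 0;
            return -1;
        }
    }
    return (unsigned char)r->buf[r->pos++];
}
```

Reading a 5-byte input one byte at a time through buf_getc costs two read calls (one that fetches the data and one that sees end of file) instead of one per byte.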

1.5 Redirection

With the above knowledge, we can introduce redirection. The function we use is

int dup2(int oldfd, int newfd);

This function copies the descriptor table entry of oldfd over the entry of newfd. If newfd did not refer to an open file, then from now on newfd refers to the same file as oldfd. If newfd already referred to an open file, dup2 closes newfd before copying oldfd. A negative return value means failure.

For example, suppose the state before the call is

[Figure: descriptor table before the dup2 call]

After executing the statement

dup2(4, 1);

it becomes

[Figure: descriptor table after the dup2 call]

This is exactly a redirection of stdout: all future operations on stdout go to the file that fd 4 refers to.
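
A minimal sketch of this kind of redirection follows. The helpers redirect_stdout_to and restore_stdout are hypothetical names; saving the old stdout with dup is an addition for the sake of the example, since a shell normally redirects inside the child process and never needs to undo it.

```c
#include <assert.h>
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper: makes fd 1 refer to path (created or truncated).
 * Returns a saved copy of the old stdout, or -1 on failure. */
int redirect_stdout_to(const char *path)
{
    assert(path != NULL);
    int saved = dup(STDOUT_FILENO); /* keep the old stdout reachable */
    if (saved < 0)
        return -1;

    int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0666);
    if (fd < 0)
    {
        close(saved);
        return -1;
    }
    if (dup2(fd, STDOUT_FILENO) < 0) /* fd 1 now refers to the file */
    {
        close(fd);
        close(saved);
        return -1;
    }
    close(fd); /* fd 1 keeps the file open; the extra descriptor can go */
    return saved;
}

/* Hypothetical helper: points fd 1 back at the saved descriptor. */
int restore_stdout(int saved)
{
    int rc = dup2(saved, STDOUT_FILENO);
    close(saved);
    return rc < 0 ? -1 : 0;
}
```

When mixing this with printf, stdout should be flushed with fflush before each switch, because stdio keeps its own buffer on top of the descriptor.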

1.6 Pipeline

Pipes build on the same understanding:

int pipe(int fd[2]);

It returns 0 on success and -1 on failure.

On success it fills in the fd array: fd[0] is the read end and fd[1] is the write end. Reading from and writing to the pipe is really reading and writing a kernel buffer. The pipe does not need to be opened with open, but both ends must be closed manually.

A schematic of how it works:

[Figure: schematic of a pipe]
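
To make the read end and write end concrete, here is a small sketch in which a child process writes a message into a pipe and the parent reads it back. The function name pipe_roundtrip is hypothetical.

```c
#include <assert.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical demo: the child writes msg into a pipe, the parent reads it
 * into out. Returns the number of bytes read, or -1 on failure. */
ssize_t pipe_roundtrip(const char *msg, char *out, size_t outsz)
{
    assert(msg != NULL && out != NULL);
    int fd[2];
    if (pipe(fd) < 0)
        return -1;

    pid_t pid = fork();
    if (pid < 0)
    {
        close(fd[0]);
        close(fd[1]);
        return -1;
    }
    if (pid == 0)
    {
        /* child: only writes, so close the read end */
        close(fd[0]);
        size_t len = strlen(msg);
        ssize_t n = write(fd[1], msg, len);
        close(fd[1]);
        _exit(n == (ssize_t)len ? 0 : 1);
    }

    /* parent: close the write end, otherwise read() never sees end of file */
    close(fd[1]);
    size_t total = 0;
    ssize_t n;
    while (total < outsz && (n = read(fd[0], out + total, outsz - total)) > 0)
        total += (size_t)n;
    close(fd[0]);
    waitpid(pid, NULL, 0);
    return (ssize_t)total;
}
```

Forgetting to close an unused write end is the classic pipe bug: the reader then blocks forever waiting for data that will never come.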

1.7 Calling external commands

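We can use the following function

int execvp(const char *command, char *const argv[]);

Note that this function does not return on success, so the statements after it are not executed. If they are executed, the call has failed and an error should be reported. Also, the last item of argv must be NULL.

For example, suppose we want to run the following command

ls -a -l ~

Then the corresponding parameters should be

const char *command = "ls";
char *argv[] = {"ls", "-a", "-l", "~", NULL};

Note that execvp passes "~" to ls literally. A full shell such as bash expands ~ to the home directory before calling execvp, so a shell that wants the same behavior has to perform that expansion itself.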

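Because execvp replaces the calling process, a shell normally calls it in a child created by fork and waits for the child in the parent. The following is a minimal sketch of that pattern; run_command is a hypothetical name, and the exit code 127 for a command that cannot be executed follows the convention of common shells rather than anything stated above.

```c
#include <assert.h>
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: runs argv[0] with arguments argv in a child process.
 * Returns the child's exit status, or -1 if it could not be obtained. */
int run_command(char *const argv[])
{
    assert(argv != NULL && argv[0] != NULL);

    pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0)
    {
        execvp(argv[0], argv);
        /* reached only when execvp failed */
        perror(argv[0]);
        _exit(127);
    }

    int status;
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

For example, passing the argv {"sh", "-c", "exit 3", NULL} makes run_command return 3.
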
2. Basic functions

2.1 Demand Analysis

At first I was quite intimidated: the shell is tied to the operating system, so I worried that I would not be able to write one without enough operating system knowledge. After looking into it more deeply, I found that writing a shell is not that hard. A simple shell without redirection, pipes, built-in commands, or background jobs can be written in a little over 100 lines. In essence, it amounts to implementing the system function.

For redirection, it is enough to recognize the redirection operators <, >, and >> individually and record the file each one names. Then, before calling the external command, perform the redirection on those files and make the call.
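
The recognition step can be sketched as follows, assuming the command line has already been split into whitespace-separated tokens. The ParsedCommand type and the parse_redirections function are hypothetical and are not taken from ThyShell.c.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_ARGS 64

/* Hypothetical result type, not the author's Command struct */
typedef struct
{
    char *argv[MAX_ARGS + 1]; /* NULL-terminated, ready for execvp */
    const char *infile;       /* file named after <, or NULL */
    const char *outfile;      /* file named after > or >>, or NULL */
    int append;               /* 1 if the output operator was >> */
} ParsedCommand;

/* Separates redirection operators from ordinary arguments.
 * Returns 0 on success, -1 on a syntax error. */
int parse_redirections(char **tokens, int ntokens, ParsedCommand *cmd)
{
    assert(tokens != NULL && cmd != NULL);
    int argc = 0;
    cmd->infile = NULL;
    cmd->outfile = NULL;
    cmd->append = 0;

    for (int i = 0; i < ntokens; i++)
    {
        int is_in = strcmp(tokens[i], "<") == 0;
        int is_out = strcmp(tokens[i], ">") == 0;
        int is_app = strcmp(tokens[i], ">>") == 0;

        if (is_in || is_out || is_app)
        {
            if (i + 1 >= ntokens)
                return -1; /* operator without a file name */
            if (is_in)
                cmd->infile = tokens[++i];
            else
            {
                cmd->outfile = tokens[++i];
                cmd->append = is_app;
            }
        }
        else
        {
            if (argc >= MAX_ARGS)
                return -1; /* too many arguments */
            cmd->argv[argc++] = tokens[i];
        }
    }
    cmd->argv[argc] = NULL;
    return argc > 0 ? 0 : -1; /* a command name is required */
}
```

After parsing, the recorded files would be opened and installed on fd 0 or fd 1 with dup2 in the child process just before execvp.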

For pipes, the assignment only requires connecting two commands, so one could keep two command variables, find the position of |, cut the line into two commands there, and redirect each one accordingly. But "two commands" is not general, so I extended it to pipelines of any number of commands. The result is demonstrated later, and the implementation idea is described in Section 3.5.
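
Cutting the line at each | can be sketched like this, again assuming an already tokenized command line. The function split_pipeline is a hypothetical illustration: it overwrites each | token with NULL so that every segment becomes a NULL-terminated argv.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: splits tokens at every "|" in place.
 * tokens must have room for ntokens + 1 entries, because the slot at
 * index ntokens is set to NULL to terminate the last command.
 * On success, cmds[i] points at the argv of the i-th command and the
 * number of commands is returned. Returns -1 on an empty command
 * (for example "ls |") or when there are more than max_cmds commands. */
int split_pipeline(char **tokens, int ntokens, char **cmds[], int max_cmds)
{
    assert(tokens != NULL && cmds != NULL);
    int count = 0;
    int start = 0;

    for (int i = 0; i <= ntokens; i++)
    {
        int at_end = (i == ntokens);
        if (at_end || strcmp(tokens[i], "|") == 0)
        {
            if (i == start || count >= max_cmds)
                return -1;
            tokens[i] = NULL; /* terminate this command's argv */
            cmds[count++] = &tokens[start];
            start = i + 1;
        }
    }
    return count;
}
```

Because the split works for any number of | tokens, the two-command case is simply the instance where the function returns 2.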

2.2 shell program flow

[Figure: overall flow chart of the shell]

The parsing process is relatively complicated and would be too cumbersome to show in the overall flow chart, so it is described in a separate sub-flow chart.

[Figure: sub-flow chart of the parsing process]

2.3 Function display

2.3.1 Command prompt with identity feature

[Figure: screenshot of the command prompt]

When my shell starts, it first prints the shell name, Thysrael Shell, and then the leftmost part of the command prompt on every line reads ThyShell. These strings carry the author's identity.

2.3.2 Running an external command without parameters

[Figure: screenshot of running ls]

We chose ls as the test case and found that it runs correctly.

2.3.3 Support I/O redirection

Redirection of standard output: both > and >> work correctly.

[Figure: screenshot of output redirection]

Input redirection also works correctly.

[Figure: screenshot of input redirection]

2.3.4 Pipeline commands

Two commands can be piped together.

[Figure: screenshot of a two-command pipeline]

2.3.5 Combination of pipes and redirection

Input or output redirection can be combined with a pipeline without any problem.

[Figure: screenshot of a pipeline combined with redirection]

2.3.6 Code size

All of ThyShell's code is in ThyShell.c, 322 lines in total, which meets the requirements of the assignment.

2.4 Implementation and system calls

3. Advanced features

3.1 Print and compress the path

In the command prompt I print the current path with path compression: when the path begins with the home directory, /home/user_name is compressed to ~.

[Figure: screenshot of the prompt with a compressed path]

The implementation calls getcwd to get the current path and getenv to get the home directory path, then compares the two and compresses. The code is as follows

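#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

void print_prompt()
{
    /* With a NULL buffer, getcwd allocates the result (glibc extension) */
    char *path = getcwd(NULL, 0);
    const char *home = getenv("HOME");
    size_t len_home = home ? strlen(home) : 0;

    /* Compress only when path starts with home at a directory boundary */
    if (path && len_home > 0 && strncmp(path, home, len_home) == 0 &&
        (path[len_home] == '/' || path[len_home] == '\0'))
    {
        size_t len_path = strlen(path);
        path[0] = '~';
        /* Shift the rest of the path left, including the trailing '\0' */
        memmove(path + 1, path + len_home, len_path - len_home + 1);
    }
    printf("ThyShell \033[0;32m%s\033[0m $ ", path ? path : "?");
    fflush(stdout); /* the prompt has no newline, so flush it explicitly */
    free(path);
}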

3.2 quit built-in command

The effect is as follows

[Figure: screenshot of the quit command]

The approach is to treat quit as a command and check for it before calling an external command. If the command matches, the shell exits directly. The implementation is as follows

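/* Returns 1 if command was a built-in and has been handled, 0 otherwise */
int builtin_command(Command command)
{
    if (!strcmp(command.argv[0], "quit"))
    {
        quit(); /* defined elsewhere in ThyShell.c; assumed to exit the shell */
    }
    else if (!strcmp(command.argv[0], "cd"))
    {
        /* With no argument, fall back to $HOME as common shells do */
        const char *dir = command.argv[1] ? command.argv[1] : getenv("HOME");
        if (dir == NULL || chdir(dir) != 0)
        {
            fprintf(stderr, "Error: cannot cd: %s\n", dir ? dir : "(HOME not set)");
        }
        return 1;
    }

    return 0;
}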

3.3 The cd built-in command

The effect is demonstrated below

[Figure: screenshot of the cd command]

You can switch directories freely within the user's permissions. It is implemented with the chdir function; the code is in Section 3.2.

3.4 Error detection

Besides its normal functions, ThyShell also has error detection: it can detect fork errors, waitpid errors, and syntax errors. This is implemented by wrapping the system calls, which both keeps the behavior correct and keeps the code concise. The implementation is as follows

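#include <errno.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Print msg together with the reason taken from errno, then terminate */
void unix_error(const char *msg)
{
    fprintf(stderr, "%s: %s\n", msg, strerror(errno));
    exit(1); /* a non-zero status signals failure */
}

/* fork wrapper that terminates the shell on failure */
pid_t Fork()
{
    pid_t pid;
    if ((pid = fork()) < 0)
    {
        unix_error("Fork error");
    }
    return pid;
}

/* waitpid wrapper that reports failures and abnormal termination */
void Wait(pid_t pid)
{
    int status;
    if (waitpid(pid, &status, 0) < 0)
    {
        unix_error("Waitpid error");
    }
    if (!WIFEXITED(status))
    {
        fprintf(stderr, "child %d terminated abnormally\n", (int)pid);
    }
}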

3.5 Multi-pipeline commands

ThyShell supports pipelines of more than two commands. A demonstration:

[Figure: screenshot of a multi-command pipeline]

Entering cat filename | wc -l | less produces the following

[Figure: screenshot of the output]

which shows that the feature works.

For the implementation, refer to the flow chart. The idea is to treat the command line as a level of its own that can contain one or more commands. When a pipe appears, the pipe is created first and the commands are then redirected to it.
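
One common way to realize this idea is sketched below. It is not the code from ThyShell.c: the function run_pipeline and its interface are hypothetical, and it assumes the command line has already been split into one argv per command. Each command except the last gets a new pipe as its stdout, and each command except the first gets the previous pipe's read end as its stdin.

```c
#include <assert.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: runs cmds[0] | cmds[1] | ... | cmds[n-1].
 * Returns the exit status of the last command, or -1 on failure.
 * Assumes the caller has no other child processes, because wait()
 * reaps any child. */
int run_pipeline(char **cmds[], int n)
{
    assert(cmds != NULL);
    int prev_read = -1; /* read end of the pipe feeding the next command */
    pid_t last_pid = -1;
    int started = 0, failed = 0;

    for (int i = 0; i < n && !failed; i++)
    {
        int fd[2] = {-1, -1};
        if (i < n - 1 && pipe(fd) < 0)
            failed = 1;

        pid_t pid = failed ? -1 : fork();
        if (pid == 0)
        {
            if (prev_read != -1)
            {
                dup2(prev_read, STDIN_FILENO); /* stdin from previous pipe */
                close(prev_read);
            }
            if (fd[1] != -1)
            {
                dup2(fd[1], STDOUT_FILENO); /* stdout into the new pipe */
                close(fd[0]);
                close(fd[1]);
            }
            execvp(cmds[i][0], cmds[i]);
            _exit(127); /* reached only when execvp failed */
        }
        if (pid < 0)
            failed = 1;
        else
        {
            last_pid = pid;
            started++;
        }

        /* the parent keeps only the read end needed by the next command */
        if (prev_read != -1)
            close(prev_read);
        if (fd[1] != -1)
            close(fd[1]);
        prev_read = fd[0];
    }
    if (prev_read != -1)
        close(prev_read);

    int last_status = -1;
    for (int i = 0; i < started; i++)
    {
        int status;
        pid_t pid = wait(&status);
        if (pid == last_pid && WIFEXITED(status))
            last_status = WEXITSTATUS(status);
    }
    return failed ? -1 : last_status;
}
```

Starting all children before waiting for any of them matters: the commands in a pipeline run concurrently, and waiting for the first one before starting the second could block forever once the pipe buffer fills up.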

3.6 Commands with parameters

This is handled when parsing the command line: the parameters are passed on as the argv argument of execvp. The implementation is as follows

execvp(command.argv[0], command.argv);


Origin blog.csdn.net/living_frontier/article/details/129965789