Create one producer process and N consumer processes (N > 1).
Use a file as the shared buffer.
The producer process writes the integers 0, 1, 2, ..., M to the shared buffer in order (M >= 500).
Each consumer process reads numbers from the shared buffer one at a time, deletes each number it has read from the buffer, and then prints "its process ID + the deleted number" to standard output.
The buffer can hold at most 10 numbers at any time.

[Example] A possible output

The order of the process IDs may vary widely, but the numbers after the colon must start at 0 and increase by 1.

In addition, pc.c uses semaphore-related system calls such as sem_open(), sem_close(), sem_wait(), and sem_post(), which we have to implement in Linux 0.11 ourselves.

(2) Implement semaphores and test them with the producer-consumer program

Linux 0.11 does not implement semaphores; Linus left this challenging job to you. Implementing a clone that fully complies with the POSIX (Portable Operating System Interface of UNIX) specification would certainly give a sense of accomplishment, but time does not allow for that here, so we first implement a reduced set of POSIX-like semaphores. Their function prototypes are not exactly the same as the standard ones, and only the following four system calls are included:

sem_t *sem_open(const char *name, unsigned int value);
int sem_wait(sem_t *sem);
int sem_post(sem_t *sem);
int sem_unlink(const char *name);

The specific functions and related parameters of the above four functions are explained as follows:

sem_open()

Function	Creates a new semaphore or opens an existing one.
parameter	`sem_t`: The semaphore type, defined as the implementation requires.
	`name`: The name of the semaphore. If no semaphore named `name` exists, a new one with that name is created; if it exists, the existing semaphore named `name` is opened. Different processes can share the same semaphore by using the same `name`.
	`value`: The initial value of the semaphore. This parameter is used only when a new semaphore is created; otherwise `value` is ignored.
return value	On success, sem_open() returns a unique identifier for the semaphore (for example, its address in the kernel or an ID), which the other two system calls use; on failure, it returns NULL.

sem_wait()

Function	The atomic P operation on a semaphore: it subtracts 1 from the semaphore's value. If the condition for continuing to run is not met, the calling process waits on the semaphore sem.
parameter	`sem` : Pointer to the semaphore.
return value	Returns 0 for success and -1 for failure.

sem_post()

Function	The atomic V operation on a semaphore: it adds 1 to the semaphore's value. If any processes are waiting on sem, it wakes one of them up.
parameter	`sem` : Pointer to the semaphore.
return value	Returns 0 for success and -1 for failure.

sem_unlink()

Function	Deletes the semaphore named `name`.
parameter	`name`: The name of the semaphore.
return value	Returns 0 for success and -1 for failure.

【Experiment Tips】

We can create a new file sem.c in the kernel directory to implement the four system calls above. Then port pc.c from Ubuntu to Linux 0.11 and run it there to test the semaphore implementation.

3. Experiment preparation

1. Semaphores

The semaphore was first designed by E. W. Dijkstra, the Dutch computer scientist and Turing Award winner. The "process synchronization" part of any operating system textbook describes it in detail. Semaphores keep the cooperation of multiple processes reasonable and orderly.

Linux semaphores follow the POSIX specification; run man sem_overview to view the related information.

The semaphore-related system calls involved in this experiment are sem_open(), sem_wait(), sem_post(), and sem_unlink().

producer-consumer problem

The solution to the producer-consumer problem is found in almost all operating system textbooks, and its basic structure is:

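Producer()
{
    // produce an item;

    /* free buffer slots */
    P(Empty);

    /* mutual-exclusion semaphore */
    P(Mutex);

    // put the item into a free buffer slot;

    V(Mutex);

    /* items available */
    V(Full);
}

Consumer()
{
    P(Full);
    P(Mutex);

    // take one item out of the buffer and assign it to item;

    V(Mutex);

    // consume the item;
    V(Empty);
}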

Obviously, demonstrating this process requires two kinds of processes: one kind executes the function Producer(), and the other executes the function Consumer().
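
The bookkeeping in this structure can be traced without any real processes. The following is a minimal single-process sketch, not part of the original experiment: it models Empty and Full as plain counters and the buffer as a 10-slot ring. The names `ring_t`, `ring_init`, `ring_put`, and `ring_get` are hypothetical, and where a real P operation would block, the sketch simply returns 0.

```c
#include <assert.h>
#include <stddef.h>

#define SLOTS 10              /* buffer capacity required by the experiment */

/* Hypothetical model: counters stand in for the Empty and Full semaphores. */
typedef struct {
    unsigned int slot[SLOTS];
    size_t head;              /* next slot to read   */
    size_t tail;              /* next slot to write  */
    int empty;                /* free slots (Empty)  */
    int full;                 /* used slots (Full)   */
} ring_t;

void ring_init(ring_t *r)
{
    r->head = r->tail = 0;
    r->empty = SLOTS;
    r->full = 0;
}

/* Producer step: returns 1 on success, 0 where P(Empty) would block. */
int ring_put(ring_t *r, unsigned int item)
{
    if (r->empty == 0)
        return 0;
    r->empty--;                        /* P(Empty) */
    r->slot[r->tail] = item;
    r->tail = (r->tail + 1) % SLOTS;
    r->full++;                         /* V(Full) */
    assert(r->empty + r->full == SLOTS);
    return 1;
}

/* Consumer step: returns 1 on success, 0 where P(Full) would block. */
int ring_get(ring_t *r, unsigned int *item)
{
    if (r->full == 0)
        return 0;
    r->full--;                         /* P(Full) */
    *item = r->slot[r->head];
    r->head = (r->head + 1) % SLOTS;
    r->empty++;                        /* V(Empty) */
    assert(r->empty + r->full == SLOTS);
    return 1;
}
```

The invariant checked by the assertions, empty + full == SLOTS, is what keeps the producer from overwriting unread items and the consumers from reading slots that were never written. The Mutex semaphore has no counterpart here because a single process cannot race with itself.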

2. Multi-process shared files

In C under Linux, files can be read and written in the following three ways (only the first two are available on Linux 0.11):

(1) Use the standard C functions fopen(), fread(), fwrite(), fseek(), fclose(), and so on.

(2) Use the system calls open(), read(), write(), lseek(), close(), and so on.

(3) Use a memory-mapped file through the mmap() system call.

After a successful call to fork(), the child process inherits most of the resources owned by the parent process, including the files the parent has opened. The child can therefore use the file pointer/descriptor/handle created by the parent directly and access the same file as the parent.

When using the standard C file functions, note that they use a file buffer in the process's own address space, and that buffer is not shared between parent and child. Therefore, after any process completes a write, it must call fflush() to force the data out of that buffer so that other processes can read it.

In summary, it is recommended to use system calls directly for file operations.
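
As a rough illustration of the system-call approach, the sketch below uses one descriptor to write integers into a file and a second descriptor to read them back, rewinding both offsets every 10 items so the file never grows beyond 10 slots. It runs in a single process, so it shows only the file handling, not the synchronization. The function name `buffer_file_roundtrip` and the path passed to it are hypothetical.

```c
#include <assert.h>
#include <fcntl.h>
#include <unistd.h>

#define SLOTS 10    /* capacity of the buffer file, in integers */

/*
 * Writes the integers 0..count-1 to the file at `path` through one
 * descriptor and reads each back through a second descriptor, moving both
 * offsets back to the start after every SLOTS items. Returns the number of
 * items that round-tripped correctly, or -1 if the file cannot be opened.
 * The function name and `path` are hypothetical.
 */
int buffer_file_roundtrip(const char *path, unsigned int count)
{
    unsigned int i, in, out;
    int ok = 0;
    int wfd, rfd;

    assert(path != NULL);
    wfd = open(path, O_CREAT | O_TRUNC | O_WRONLY, 0644);
    rfd = open(path, O_RDONLY);
    if (wfd < 0 || rfd < 0) {
        if (wfd >= 0) close(wfd);
        if (rfd >= 0) close(rfd);
        return -1;
    }
    for (i = 0; i < count; i++) {
        if (i % SLOTS == 0) {           /* wrap both offsets to slot 0 */
            lseek(wfd, 0, SEEK_SET);
            lseek(rfd, 0, SEEK_SET);
        }
        in = i;
        if (write(wfd, &in, sizeof(in)) != (ssize_t)sizeof(in))
            break;
        if (read(rfd, &out, sizeof(out)) != (ssize_t)sizeof(out))
            break;
        if (out == in)
            ok++;
    }
    close(wfd);
    close(rfd);
    unlink(path);                       /* remove the demo file */
    return ok;
}
```

Because write() and read() go through the kernel rather than a per-process stdio buffer, data written through one descriptor is visible to a read through the other without any flush call.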

3. The terminal is also a critical resource

It is natural to use printf() to output information to the terminal, but when several processes output at the same time, the terminal also becomes a critical resource and needs mutual-exclusion protection; otherwise the output may be interleaved.

In addition, after printf() returns, the text has only been stored in an output buffer and has not yet been written to standard output (usually the terminal console), which can also make the output appear in the wrong order. So after each printf(), call fflush(stdout) from stdio.h to make sure the data is sent to the terminal.
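
One way to follow this advice is to wrap the two calls in a small helper. The sketch below is only an illustration; the name `report_item` and the "pid: item" format are assumptions based on the output requirement described earlier.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Prints one "pid: item" line and flushes it immediately.
 * Returns 0 on success and -1 if printing or flushing failed.
 * The helper name is hypothetical; in the experiment, the call would sit
 * inside the region protected by the mutual-exclusion semaphore.
 */
int report_item(int pid, unsigned int item)
{
    assert(pid >= 0);
    if (printf("%d: %u\n", pid, item) < 0)
        return -1;
    if (fflush(stdout) == EOF)
        return -1;
    return 0;
}
```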

4. Atomic operations, sleep and wakeup

Linux 0.11 is a modern operating system that supports concurrency. Although it does not provide any locks or semaphores to applications, it must use a locking mechanism internally: when multiple processes access shared kernel data, mutual exclusion and synchronization have to be implemented with locks.

Acquiring a lock must be an atomic operation (an operation that the scheduler cannot interrupt: once it starts, it runs to completion). Semaphores can be implemented by imitating the locks of Linux 0.11.

For example, concurrent disk access by multiple processes is one place where locks are needed. Linux 0.11 handles disk access by setting aside a region of memory as a disk cache to speed up access. A disk request from a process is first looked up in the disk cache; if the data is found, it is returned directly. If it is not found, the process obtains a free cache block and issues a disk read or write request with that block as a parameter. After the request is issued, the process has to sleep and wait, because disk I/O is very slow and the CPU should be given to other processes in the meantime. This is the general approach taken by many operating systems (including modern Linux and UNIX). It means that multiple processes operate on the disk cache together, and a process may be scheduled out and lose the CPU in the middle of an operation. Mutual exclusion therefore has to be considered when operating on the disk cache, so locks are required, together with process sleep and wake-up.

[Example] The following are two functions taken from the file: kernel/blk_drv/ll_rw_blk.c

static inline void lock_buffer(struct buffer_head * bh)
{
    // disable interrupts
    cli();

    // put the current process to sleep on bh->b_wait while the buffer is locked
    while (bh->b_lock)
        sleep_on(&bh->b_wait);
    bh->b_lock = 1;

    // enable interrupts
    sti();
}

static inline void unlock_buffer(struct buffer_head * bh)
{
    if (!bh->b_lock)
        printk("ll_rw_block.c: buffer not locked\n\r");
    bh->b_lock = 0;

    // wake up the processes sleeping on bh->b_wait
    wake_up(&bh->b_wait);
}

An analysis of lock_buffer() shows that access to the lock variable b_lock is made atomic by disabling and re-enabling interrupts, which prevents a process switch from happening in between. This method has drawbacks and is not suitable for a multiprocessor environment, but for Linux 0.11 it is a simple, direct, and effective mechanism, because the Linux 0.11 that bochs simulates in our experiment is a single-CPU system.

The functions above also show that Linux 0.11 provides this interface: sleep_on() puts a process to sleep, and wake_up() wakes processes up. Each takes the address of a struct task_struct * pointer (struct task_struct is the PCB of a process, defined in sched.h), as in the call sleep_on(&bh->b_wait); that is, the process sleeps on, or is woken from, the linked list of process PCBs that this parameter refers to.

Therefore, in this experiment we can also implement atomic operations by disabling and enabling interrupts, and implement process sleep and wake-up by calling the sleep_on() and wake_up() functions that come with Linux 0.11.

【Note】

sleep_on() puts the current process to sleep on the linked list specified by its parameter (note that this linked list is a well-hidden one; see "Notes" for details).
wake_up() wakes up all processes sleeping on the linked list. These processes will be scheduled to run, so once woken they must check again whether they can continue. See the while loop in lock_buffer().
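
To see why the re-check matters, the following user-space model (a sketch under simplifying assumptions, not kernel code) represents sleeping processes as flags. One V operation wakes every sleeper, as the note above describes, and each woken waiter then re-evaluates the `value <= 0` condition. The names `model_sem_t` and `model_post` are hypothetical.

```c
#include <assert.h>

#define MAX_WAITERS 8

/* Hypothetical single-CPU model of a semaphore and its sleepers. */
typedef struct {
    int value;
    int sleeping[MAX_WAITERS];  /* 1 if waiter i is asleep on the semaphore */
} model_sem_t;

/*
 * Models one V operation followed by each woken waiter running in turn.
 * Every sleeper is woken; each then repeats the `while (value <= 0)` test.
 * Returns how many waiters got past the test.
 */
int model_post(model_sem_t *s)
{
    int i, proceeded = 0;
    int woken[MAX_WAITERS];

    s->value++;
    for (i = 0; i < MAX_WAITERS; i++) {  /* wake-up: all sleepers become runnable */
        woken[i] = s->sleeping[i];
        s->sleeping[i] = 0;
    }
    for (i = 0; i < MAX_WAITERS; i++) {
        if (!woken[i])
            continue;
        if (s->value <= 0) {
            s->sleeping[i] = 1;          /* condition still true: sleep again */
        } else {
            s->value--;                  /* condition false: take the resource */
            proceeded++;
        }
    }
    assert(s->value >= 0);
    return proceeded;
}
```

With three sleepers and a value of 0, one V lets exactly one waiter through and sends the other two back to sleep. If the waiters used `if` instead of `while`, all three would decrement the value after a single V and it would drop below zero.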

4. Experimental process

In short, this experiment consists of implementing semaphores in the Linux 0.11 kernel and exposing an interface for them to user programs, which then use that interface to solve a real process synchronization problem.

(1) Writing a producer-consumer test program

1. Write pc.c

Create a new file pc.c under the oslab/exp_06 directory.

【pc.c】

#define __LIBRARY__
#include <unistd.h>
#include <linux/sem.h>
#include <stdio.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <linux/sched.h>

/* System call stubs for the new semaphore calls */
_syscall2(sem_t *,sem_open,const char *,name,unsigned int,value)
_syscall1(int,sem_wait,sem_t *,sem)
_syscall1(int,sem_post,sem_t *,sem)
_syscall1(int,sem_unlink,const char *,name)

const char *FILENAME = "/usr/root/buffer_file";  /* path of the buffer file that holds the items */
const int NR_CONSUMERS = 5;                      /* number of consumers */
const int NR_ITEMS = 520;                        /* largest item number */
const int BUFFER_SIZE = 10;                      /* buffer size: how many items can exist at once */
sem_t *mutex, *full, *empty;                     /* the three semaphores */
unsigned int item_pro, item_used;                /* item just produced, item just consumed */
int fi, fo;                                      /* buffer file descriptors: fi for the producer to write, fo for the consumers to read */

int main(int argc, char *argv[])
{
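    const char *filename;
    int pid;
    int i;

    filename = argc > 1 ? argv[1] : FILENAME;

    /*
     * O_TRUNC: if the file already exists, truncate it to length 0 (i.e. empty it).
     * 0644: the owner may read and write, others may only read (the leading 0 marks octal).
     */

    /* Open the file write-only so the producer can write item numbers */
    fi = open(filename, O_CREAT| O_TRUNC| O_WRONLY, 0644);
    /* Open the same file read-only so the consumers can read item numbers */
    fo = open(filename, O_RDONLY);
    if (fi < 0 || fo < 0)
    {
        printf("cannot open %s\n", filename);
        return 1;
    }

    mutex = sem_open("MUTEX", 1);    /* mutual exclusion: keeps production and consumption from running at the same time */
    full = sem_open("FULL", 0);      /* items available: consumers may proceed when it is greater than 0 */
    empty = sem_open("EMPTY", BUFFER_SIZE);    /* free slots: falls as FULL rises; the producer may proceed when it is greater than 0 */
    if (!mutex || !full || !empty)
    {
        printf("cannot open semaphores\n");
        return 1;
    }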

    item_pro = 0;

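    if ( (pid = fork()) )    /* the parent process acts as the producer */
    {
        pid = getpid();      /* fork() returned the child's pid; report our own */
        printf("pid %d:\tproducer created....\n", pid);

        /*
         * printf() does not write to standard output (usually the terminal console)
         * right away; it first stores the text in an output buffer.
         * To keep the output in order, call fflush(stdout) from stdio.h
         * after every printf() so that the text is written out immediately.
         */

        fflush(stdout);

        while (item_pro <= NR_ITEMS)    /* produce items 0..NR_ITEMS */
        {
            sem_wait(empty);  /* P(empty) */
            sem_wait(mutex);  /* P(mutex) */

            /*
             * After each round of BUFFER_SIZE items (all the buffer file can hold),
             * move the file offset back to the start of the file.
             */
            if( !(item_pro % BUFFER_SIZE) )  /* item_pro is a multiple of BUFFER_SIZE */
                lseek(fi, 0, 0);

            write(fi, (char *) &item_pro, sizeof(item_pro));  /* write the item number */
            printf("pid %d:\tproduces item %u\n", pid, item_pro);
            fflush(stdout);
            item_pro++;

            sem_post(mutex);       /* V(mutex) */
            sem_post(full);        /* V(full): wake up a consumer */
        }
    }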
    else    /* the child process creates the consumers */
    {
        i = NR_CONSUMERS;
        while(i--)
        {
            if( !(pid=fork()) )    /* create one consumer process per iteration */
            {
                pid = getpid();
                printf("pid %d:\tconsumer %d created....\n", pid, NR_CONSUMERS-i);
                fflush(stdout);

                while(1)
                {
                    sem_wait(full);     /* P(full) */
                    sem_wait(mutex);    /* P(mutex) */

                    /* read() returns 0 at end of file; move the offset back to the start and read again */
                    if(!read(fo, (char *)&item_used, sizeof(item_used)))
                    {
                        lseek(fo, 0, 0);
                        read(fo, (char *)&item_used, sizeof(item_used));
                    }

                    printf("pid %d:\tconsumer %d consumes item %u\n", pid, NR_CONSUMERS-i, item_used);
                    fflush(stdout);

                    sem_post(mutex);    /* V(mutex) */
                    sem_post(empty);    /* V(empty): wake up the producer */

                    /* The consumer that takes the last item exits; the others stay blocked in sem_wait(full) */
                    if(item_used == NR_ITEMS)
                        goto OK;
                }
            }
        }
    }
OK:
    close(fi);
    close(fo);
    return 0;
}

2. Mount pc.c

Copy pc.c to the /usr/root/ directory of the Linux 0.11 virtual machine.

// in the oslab directory
sudo ./mount-hdc
cp ./exp_06/pc.c ./hdc/usr/root/
sudo umount hdc/

(2) Implement the semaphore

For this part, refer to the system call experiment (Experiment 3): "Operating System" by Li Zhijun | Experiment 3 - System Call.

1. Add system call API

Add the following code to pc.c (it was already added above).

_syscall2(sem_t *,sem_open,const char *,name,unsigned int,value)
_syscall1(int,sem_wait,sem_t *,sem)
_syscall1(int,sem_post,sem_t *,sem)
_syscall1(int,sem_unlink,const char *,name)

2. Create sem.h

Create a new file sem.h under the linux-0.11/include/linux directory to define the semaphore data structure, which holds the semaphore's name, its value, and a queue of waiting processes.

【sem.h】

#ifndef _SEM_H
#define _SEM_H

#include <linux/sched.h>

#define SEMTABLE_LEN    20
#define SEM_NAME_LEN    20

typedef struct semaphore
{
    char name[SEM_NAME_LEN];    /* semaphore name */
    int value;                  /* semaphore value */
    struct task_struct *queue;  /* queue of processes waiting on the semaphore */
} sem_t;

extern sem_t semtable[SEMTABLE_LEN];  /* the semaphore table, defined in sem.c */

#endif

The #ifndef, #define, and #endif lines here form an include guard: they keep the header's contents from being compiled more than once when the header is included repeatedly. For the underlying principle, see the article "Why add #ifndef #define #endif to the header file".

3. Create sem.c

Under the linux-0.11/kernel directory, create a new source file sem.c that implements the four semaphore functions.

【sem.c】

#include <linux/sem.h>
#include <linux/sched.h>
#include <unistd.h>
#include <asm/segment.h>
#include <linux/tty.h>
#include <linux/kernel.h>
#include <linux/fdreg.h>
#include <asm/system.h>
#include <asm/io.h>
//#include <string.h>

sem_t semtable[SEMTABLE_LEN];  /* the semaphore table */
int cnt = 0;                   /* number of table entries in use */

sem_t *sys_sem_open(const char *name,unsigned int value)
{
    char kernelname[100];
    int isExist = 0;
    int i = 0;
    int name_cnt = 0;
    sem_t *p = NULL;

    /* Measure the name in user space; it must fit in name[] together with its '\0' */
    while( get_fs_byte(name+name_cnt) != '\0' )
    {
        name_cnt++;
        if( name_cnt >= SEM_NAME_LEN )
            return NULL;
    }

    /* Copy the name from user space into kernel space and terminate it */
    for(i=0;i<name_cnt;i++)
        kernelname[i] = get_fs_byte(name+i);
    kernelname[name_cnt] = '\0';

    /* Look for an existing semaphore with the same name */
    for(i=0;i<cnt;i++)
    {
        if( !strcmp(kernelname,semtable[i].name) )
        {
            isExist = 1;
            break;
        }
    }

    if(isExist == 1)
    {
        p = (sem_t*)(&semtable[i]);
        //printk("find previous name!\n");
    }
    else
    {
        if( cnt >= SEMTABLE_LEN )    /* the table is full */
            return NULL;
        for(i=0;i<=name_cnt;i++)     /* copy the name including its '\0' */
        {
            semtable[cnt].name[i] = kernelname[i];
        }
        semtable[cnt].value = value;
        semtable[cnt].queue = NULL;  /* nobody is waiting yet */
        p = (sem_t*)(&semtable[cnt]);
        //printk("creat name!\n");
        cnt++;
    }
    return p;
}


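int sys_sem_wait(sem_t *sem)
{
    cli();   /* disable interrupts */

    while( sem->value <= 0 )
        sleep_on( &(sem->queue) );    /* no resource available: sleep on the semaphore's queue */
    sem->value--;

    sti();   /* enable interrupts */
    return 0;
}


int sys_sem_post(sem_t *sem)
{
    cli();   /* disable interrupts */
    sem->value++;
    if( (sem->value) <= 1 )           /* the value was <= 0, so processes may be waiting */
        wake_up( &(sem->queue) );
    sti();   /* enable interrupts */
    return 0;
}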


int sys_sem_unlink(const char *name)
{
    char kernelname[100];   /* should be large enough */
    int isExist = 0;
    int i = 0;
    int name_cnt = 0;

    /* Measure the name in user space; longer names cannot be in the table */
    while( get_fs_byte(name+name_cnt) != '\0' )
    {
        name_cnt++;
        if( name_cnt >= SEM_NAME_LEN )
            return -1;
    }

    /* Copy the name from user space into kernel space and terminate it */
    for(i=0;i<name_cnt;i++)
        kernelname[i] = get_fs_byte(name+i);
    kernelname[name_cnt] = '\0';

    for(i=0;i<cnt;i++)
    {
        if( !strcmp(kernelname,semtable[i].name) )
        {
            isExist = 1;
            break;
        }
    }

    if(isExist == 1)
    {
        int tmp = 0;

        /* Close the gap by shifting the later entries down one slot */
        for(tmp=i;tmp<cnt-1;tmp++)
        {
            semtable[tmp] = semtable[tmp+1];
        }
        cnt = cnt-1;
        return 0;
    }
    else
        return -1;
}
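
The table handling in sys_sem_open() and sys_sem_unlink() can be exercised outside the kernel. The following user-space model is a sketch only: it replaces get_fs_byte() copying with ordinary strings and returns table indices instead of pointers, and the names `demo_entry_t`, `demo_table`, `demo_find`, `demo_open`, and `demo_unlink` are hypothetical.

```c
#include <assert.h>
#include <string.h>

#define TABLE_LEN 20
#define NAME_LEN  20

/* Hypothetical user-space stand-in for the kernel's semaphore table. */
typedef struct {
    char name[NAME_LEN];
    int value;
} demo_entry_t;

demo_entry_t demo_table[TABLE_LEN];
int demo_used = 0;

/* Returns the index of `name` in the table, or -1 if it is absent. */
int demo_find(const char *name)
{
    int i;
    for (i = 0; i < demo_used; i++)
        if (strcmp(demo_table[i].name, name) == 0)
            return i;
    return -1;
}

/* Opens or creates an entry; returns its index, or -1 on error. */
int demo_open(const char *name, int value)
{
    int i;
    assert(name != NULL);
    if (strlen(name) >= NAME_LEN)       /* no room for the '\0' */
        return -1;
    i = demo_find(name);
    if (i >= 0)
        return i;                       /* existing entry: value is ignored */
    if (demo_used >= TABLE_LEN)
        return -1;                      /* table is full */
    strcpy(demo_table[demo_used].name, name);
    demo_table[demo_used].value = value;
    return demo_used++;
}

/* Removes an entry and shifts later ones down; 0 on success, -1 if absent. */
int demo_unlink(const char *name)
{
    int i = demo_find(name);
    if (i < 0)
        return -1;
    for (; i < demo_used - 1; i++)
        demo_table[i] = demo_table[i + 1];
    demo_used--;
    return 0;
}
```

One consequence of the shifting scheme is visible here: after an unlink, every later entry moves to a new slot, so an identifier handed out earlier for one of those entries no longer refers to it. A variant that marks a slot as free instead of shifting would avoid this, at the cost of scanning for free slots when creating.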

4. Modify unistd.h

Four new system calls have been added. Go to the linux-0.11/include directory, open unistd.h, and add the new system call numbers.

#define __NR_sem_open	xx
#define __NR_sem_wait	xx
#define __NR_sem_post	xx
#define __NR_sem_unlink	xx

5. Modify system_call.s

Go to the linux-0.11/kernel directory, open system_call.s, and update the total number of system calls.

6. Modify sys.h

Go to linux-0.11/include/linux, open sys.h, declare the functions for the four new system calls, and add them to the system call table.

Note that the position of each function name in the sys_call_table array must match the value of the corresponding __NR_name in unistd.h.
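
The reason for this rule is that the table is indexed by call number. The user-space model below is a sketch of that dispatch; the `DEMO_NR_*` numbers, the `demo_*` handlers, and their return values are all hypothetical and are not the real Linux 0.11 values.

```c
#include <assert.h>

/* Hypothetical call numbers for this demo only; not the Linux 0.11 values. */
#define DEMO_NR_open    0
#define DEMO_NR_wait    1
#define DEMO_NR_post    2
#define DEMO_NR_unlink  3
#define DEMO_NR_CALLS   4

typedef int (*demo_fn)(void);

/* Stand-in handlers; each returns a distinct value so dispatch is visible. */
int demo_sem_open(void)   { return 100; }
int demo_sem_wait(void)   { return 101; }
int demo_sem_post(void)   { return 102; }
int demo_sem_unlink(void) { return 103; }

/* Entry i must be the handler for call number i. */
demo_fn demo_call_table[DEMO_NR_CALLS] = {
    demo_sem_open, demo_sem_wait, demo_sem_post, demo_sem_unlink
};

/* Dispatches by number, the way a system call table is indexed. */
int demo_dispatch(int nr)
{
    if (nr < 0 || nr >= DEMO_NR_CALLS)
        return -1;                      /* call number out of range */
    assert(demo_call_table[nr] != 0);
    return demo_call_table[nr]();
}
```

If two entries in the table were swapped, a call made with DEMO_NR_wait would silently run the post handler, which is the kind of mismatch the note warns about. The range check plays the role of the total call count that system_call.s compares against.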

7. Modify the Makefile

Make the following changes to the Makefile under the linux-0.11/kernel directory.

First, append to [OBJS]:

sem.o

Second, append to [Dependencies]:

sem.s sem.o: sem.c ../include/linux/sem.h ../include/linux/kernel.h \
../include/unistd.h

8. Mount the file

Copy the new sem.h and the modified unistd.h into the Linux 0.11 system, for the same reason as in the "system call" experiment (Experiment 3).

// in the oslab directory
sudo ./mount-hdc
cp ./linux-0.11/include/unistd.h ./hdc/usr/include/
cp ./linux-0.11/include/linux/sem.h ./hdc/usr/include/linux/
sudo umount hdc/

9. Recompile

// in the linux-0.11 directory
make all

(3) Run the producer-consumer program

1. Compile and run pc.c

In the oslab directory, run ./run to start Linux 0.11, then compile and run pc.c there, redirecting the output to the file pc.txt.

gcc -o pc pc.c
./pc > pc.txt
sync

Note that you must run sync at the end!

2. View output

Copy pc.txt to Ubuntu and view it there.

sudo ./mount-hdc
sudo cp ./hdc/usr/root/pc.txt ./exp_06
sudo chmod 777 exp_06/pc.txt
cat exp_06/pc.txt | more

You can view it with the cat command, or simply double-click pc.txt to open it.

Note that if "You do not have the required permissions to open the file" is displayed, modify the permissions by issuing the following command:

sudo chmod 777 exp_06/pc.txt

3. Output the result

……

【Experiment Tips】

1. Dealing with the chaotic bochs virtual screen

Whether it is a bug in Linux 0.11 or in bochs is unclear, but when a lot of information is output to the terminal, the bochs virtual screen becomes garbled. Pressing Ctrl+L re-initializes the screen, but with too much output it will still be garbled. For example, running ./pc directly at first displays the results as follows.

Therefore, it is recommended to redirect the output to a file (./pc > pc.txt) and then page through the file with vi, more, or a similar tool, which basically solves the problem. You can also copy the file to the Ubuntu system for viewing.

vi pc.txt:

2. Tips about string.h

The problem described below may not apply universally; it is only a reminder for those doing the experiment.

include/string.h implements a full set of C string operations, all optimized with inline assembly. In use, however, strange problems can occur in some cases; for example, someone found that strcmp() corrupted the contents of its arguments. If you run into such "weird" behavior while debugging, try not including string.h, which usually resolves it: without string.h, these functions are not inlined, and they behave more normally.

"Operating System" by Li Zhijun | Experiment 6 - Implementation and Application of Semaphore

1. Purpose of the experiment

2. Experimental content

(1) Use semaphores to solve the producer-consumer problem