Linux VFS and Read/Write system calls

I. Introduction

VFS (Virtual File System) is the interface layer between the concrete file systems and the system call layer. Downward, it gives file system implementations a standard interface, which makes it easier to port other file systems to Linux. Upward, it gives the application layer a standard file operation interface, so that open(), read(), write() and other system calls work across different file systems and different storage media.
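
From user space, the effect of this layering is that the same calls work no matter which file system backs a path. The following is a minimal sketch, assuming a POSIX system; the helper name copy_file and its buffer size are illustrative and do not come from the kernel sources.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper: copy src to dst using only open/read/write/close.
 * Nothing here depends on which file system holds either path; the VFS
 * routes each call to the matching implementation.
 * Returns the number of bytes copied, or -1 on error. */
long copy_file(const char *src, const char *dst)
{
    char buf[4096];
    long total = 0;
    ssize_t n;

    assert(src != NULL && dst != NULL);

    int in = open(src, O_RDONLY);
    if (in < 0)
        return -1;
    int out = open(dst, O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (out < 0) {
        close(in);
        return -1;
    }

    while ((n = read(in, buf, sizeof buf)) != 0) {
        if (n < 0) {
            if (errno == EINTR)
                continue;              /* interrupted: retry the read */
            total = -1;
            break;
        }
        /* write() may accept fewer bytes than requested, so loop */
        ssize_t off = 0;
        while (off < n) {
            ssize_t w = write(out, buf + off, (size_t)(n - off));
            if (w < 0) {
                if (errno == EINTR)
                    continue;
                total = -1;
                break;
            }
            off += w;
        }
        if (total < 0)
            break;
        total += n;
    }

    close(in);
    close(out);
    return total;
}
```

Calling copy_file with a source on one file system and a destination on another (for example a disk file and a tmpfs path) needs no change to the code.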

II. VFS objects and data structures

The VFS has four main object types:

  • super block object super_block, corresponding to a mounted file system
  • inode object inode, corresponding to a file on the medium
  • directory entry object dentry, corresponding to a directory entry
  • file object file, corresponding to a file opened by a process

(super_block, inode and file are defined in linux/fs.h; dentry is defined in linux/dcache.h)
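
How these four objects refer to each other can be sketched with heavily simplified stand-ins. This is only an illustration: every sketch_* name is hypothetical, and the real kernel structures carry many more fields.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-ins for the kernel structures (hypothetical names). */
struct sketch_super_block {
    const char *fs_type;               /* e.g. "ext3" */
};

struct sketch_inode {
    unsigned long i_ino;               /* inode number */
    struct sketch_super_block *i_sb;   /* file system this inode lives on */
};

struct sketch_dentry {
    const char *d_name;                /* one path component */
    struct sketch_inode *d_inode;      /* NULL if the name has no inode */
    struct sketch_dentry *d_parent;    /* parent directory entry */
};

struct sketch_file {
    struct sketch_dentry *f_dentry;    /* which name was opened */
    long long f_pos;                   /* current read/write offset */
};

/* Walk file -> dentry -> inode -> super block.
 * Returns NULL if any link in the chain is missing. */
const char *sketch_fs_type_of(const struct sketch_file *f)
{
    assert(f != NULL);
    if (!f->f_dentry || !f->f_dentry->d_inode || !f->f_dentry->d_inode->i_sb)
        return NULL;
    return f->f_dentry->d_inode->i_sb->fs_type;
}
```

The direction of the pointers is the point of the sketch: an open file leads to a name, the name leads to an inode, and the inode leads to its file system.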

1. Superblock object

  • The superblock is used to describe the information of the entire file system
  • Each specific filesystem has its own superblock
  • All superblock objects are linked together in a circular doubly linked list
  • A superblock object is created in memory when the file system is mounted and freed automatically when the file system is unmounted
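
The list handling described above can be sketched in user space as follows. This is a simplified model, not kernel code: the sketch_* names are hypothetical, and the functions only mimic the idea of linking a superblock in at mount time and unlinking it at unmount time.

```c
#include <assert.h>
#include <stddef.h>

/* Minimal circular doubly linked list node (hypothetical names). */
struct sketch_list {
    struct sketch_list *next, *prev;
};

struct sketch_sb {
    struct sketch_list s_list;   /* links this superblock into the global list */
    const char *name;
};

/* An empty circular list: the head points to itself in both directions. */
struct sketch_list sketch_super_blocks = { &sketch_super_blocks, &sketch_super_blocks };

/* "Mount": link the superblock in at the tail of the list. */
void sketch_mount(struct sketch_sb *sb)
{
    struct sketch_list *head = &sketch_super_blocks;

    assert(sb != NULL);
    sb->s_list.prev = head->prev;
    sb->s_list.next = head;
    head->prev->next = &sb->s_list;
    head->prev = &sb->s_list;
}

/* "Unmount": unlink the superblock; its neighbours now point past it. */
void sketch_umount(struct sketch_sb *sb)
{
    assert(sb != NULL);
    sb->s_list.prev->next = sb->s_list.next;
    sb->s_list.next->prev = sb->s_list.prev;
    sb->s_list.next = sb->s_list.prev = NULL;
}

/* Count entries by walking until we are back at the head. */
int sketch_count(void)
{
    int n = 0;
    for (struct sketch_list *p = sketch_super_blocks.next;
         p != &sketch_super_blocks; p = p->next)
        n++;
    return n;
}
```

Because the list is circular, insertion and removal need no special case for the first or last element.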

2. Inode object

  • The inode object contains all the information the kernel needs to operate on a file or directory

3. Directory entry object

  • The directory entry object has no corresponding on-disk data structure
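  • A dentry is in one of three states: used (it points to a valid inode and has active users), unused (it points to a valid inode but has no current users, and is kept in the cache), or negative (it has no valid inode)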
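
The three states can be told apart from two pieces of information: whether the dentry has an inode, and whether anyone is using it. A minimal sketch, assuming a simplified dentry with only a usage count and an inode pointer; the dstate_* names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

enum dstate { DSTATE_USED, DSTATE_UNUSED, DSTATE_NEGATIVE };

struct dstate_inode {
    unsigned long i_ino;
};

/* Simplified dentry: only the two fields that decide the state. */
struct dstate_dentry {
    int d_count;                    /* number of current users */
    struct dstate_inode *d_inode;   /* NULL for a negative dentry */
};

enum dstate dstate_of(const struct dstate_dentry *d)
{
    assert(d != NULL);
    if (d->d_inode == NULL)
        return DSTATE_NEGATIVE;     /* the name resolves to no inode */
    return d->d_count > 0 ? DSTATE_USED : DSTATE_UNUSED;
}
```

A negative dentry is still worth caching: it lets a repeated lookup of a missing name fail quickly without going back to the disk.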

4. File object

  • A file object represents a file that a process has opened
  • It is created by the open() system call and released by the close() system call
  • When several processes open and operate on the same file at the same time, there is one file object per open
  • Like the directory entry object, the file object has no corresponding on-disk data. Its dentry pointer (f_dentry) points to the relevant directory entry object, the directory entry points to the relevant inode, and the inode records whether the file is dirty.

III. read and write

The kernel entry points for the read and write system calls are sys_read and sys_write, both defined in fs/read_write.c. Each one calls fget_light to look up the file structure for fd, then calls vfs_read or vfs_write. These check permissions and file locks and then invoke the read or write operation of the specific file system, which is where synchronous and asynchronous I/O, block I/O and so on are handled. Finally, fput_light releases the file object, sys_read or sys_write returns, and the system call ends.
sys_read:

asmlinkage ssize_t sys_read(unsigned int fd, char __user * buf, size_t count)
{
    struct file *file;
    ssize_t ret = -EBADF;
    int fput_needed;

    file = fget_light(fd, &fput_needed); // look up the file object for fd in the current process's file table
    if (file) {
        ret = vfs_read(file, buf, count, &file->f_pos); // vfs_read calls the specific file system's file_operations->read
        fput_light(file, fput_needed); // release the file object
    }

    return ret;
}

sys_write:

asmlinkage ssize_t sys_write(unsigned int fd, const char __user * buf, size_t count)
{
    struct file *file;
    ssize_t ret = -EBADF;
    int fput_needed;

    file = fget_light(fd, &fput_needed); // look up the file object for fd in the current process's file table
    if (file) {
        ret = vfs_write(file, buf, count, &file->f_pos); // vfs_write calls the specific file system's file_operations->write
        fput_light(file, fput_needed); // release the file object
    }

    return ret;
}

vfs_read:

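ssize_t vfs_read(struct file *file, char __user *buf, size_t count, loff_t *pos)
{
    struct inode *inode = file->f_dentry->d_inode;
    ssize_t ret;

    // the file must have been opened for reading
    if (!(file->f_mode & FMODE_READ))
        return -EBADF;
    // the file system must provide a synchronous or an asynchronous read method
    if (!file->f_op || (!file->f_op->read && !file->f_op->aio_read))
        return -EINVAL;

    // check for conflicting file locks on the requested range
    ret = locks_verify_area(FLOCK_VERIFY_READ, inode, file, *pos, count);
    if (!ret) {
        // security module permission check
        ret = security_file_permission (file, MAY_READ);
        if (!ret) {
            // prefer the file system's own read; otherwise fall back to do_sync_read
            if (file->f_op->read)
                ret = file->f_op->read(file, buf, count, pos);
            else
                ret = do_sync_read(file, buf, count, pos);
            // on success, send a dnotify access event
            if (ret > 0)
                dnotify_parent(file->f_dentry, DN_ACCESS);
        }
    }

    return ret;
}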

vfs_write:

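ssize_t vfs_write(struct file *file, const char __user *buf, size_t count, loff_t *pos)
{
    struct inode *inode = file->f_dentry->d_inode;
    ssize_t ret;

    // the file must have been opened for writing
    if (!(file->f_mode & FMODE_WRITE))
        return -EBADF;
    // the file system must provide a synchronous or an asynchronous write method
    if (!file->f_op || (!file->f_op->write && !file->f_op->aio_write))
        return -EINVAL;

    // check for conflicting file locks on the requested range
    ret = locks_verify_area(FLOCK_VERIFY_WRITE, inode, file, *pos, count);
    if (!ret) {
        // security module permission check
        ret = security_file_permission (file, MAY_WRITE);
        if (!ret) {
            // prefer the file system's own write; otherwise fall back to do_sync_write
            if (file->f_op->write)
                ret = file->f_op->write(file, buf, count, pos);
            else
                ret = do_sync_write(file, buf, count, pos);
            // on success, send a dnotify modify event
            if (ret > 0)
                dnotify_parent(file->f_dentry, DN_MODIFY);
        }
    }

    return ret;
}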
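
The whole read path (descriptor lookup, mode check, dispatch through the operations table) can be modelled in user space. The sketch below is a toy model under stated assumptions, not kernel code: all sk_* names are hypothetical, the "file system" is a single in-memory buffer, and lock and security checks are left out.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

#define SK_MODE_READ 1u
#define SK_MAX_FD    8u

struct sk_file;

/* Per-file-system method table, standing in for file_operations. */
struct sk_file_ops {
    long (*read)(struct sk_file *f, char *buf, size_t count, long *pos);
};

struct sk_file {
    unsigned mode;                    /* open mode bits */
    long pos;                         /* current offset, like f_pos */
    const struct sk_file_ops *ops;    /* like f_op */
    const char *data;                 /* backing bytes of the toy file */
    size_t size;
};

/* Per-process descriptor table: index is the fd. */
struct sk_file *sk_fd_table[SK_MAX_FD];

/* read method of a toy in-memory file system */
long sk_mem_read(struct sk_file *f, char *buf, size_t count, long *pos)
{
    assert(*pos >= 0);
    if ((size_t)*pos >= f->size)
        return 0;                     /* end of file */
    size_t left = f->size - (size_t)*pos;
    if (count > left)
        count = left;
    memcpy(buf, f->data + *pos, count);
    *pos += (long)count;
    return (long)count;
}

/* Models vfs_read: generic checks, then dispatch to the file system. */
long sk_vfs_read(struct sk_file *f, char *buf, size_t count, long *pos)
{
    if (!(f->mode & SK_MODE_READ))
        return -EBADF;
    if (!f->ops || !f->ops->read)
        return -EINVAL;
    return f->ops->read(f, buf, count, pos);
}

/* Models sys_read: translate fd to a file object, then call the VFS layer. */
long sk_sys_read(unsigned fd, char *buf, size_t count)
{
    long ret = -EBADF;
    struct sk_file *f = fd < SK_MAX_FD ? sk_fd_table[fd] : NULL;

    if (f)
        ret = sk_vfs_read(f, buf, count, &f->pos);
    return ret;
}
```

The generic layer never looks at how the bytes are stored. Supporting another "file system" in this model only means supplying a different sk_file_ops table.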

Please correct me if there are any mistakes.

