Linux System Programming - Chapter 4 Advanced File IO

Scatter / gather IO: a single system call reads into or writes from multiple buffers at once, which is useful for bundling different data structures into one IO operation.

epoll: an improved version of poll () and select (), useful when a single program has to handle hundreds of file descriptors.

Memory mapped file IO: maps a file into memory so that the file can be handled with ordinary memory operations.

File IO advice: lets a process give the kernel hints about how it intends to use a file, which can improve IO performance.

Asynchronous IO: lets a process issue multiple IO requests without waiting for them to complete; suited to handling a heavy IO load without using threads.

 

Scatter / gather IO (vector IO):

#include <sys/uio.h>
struct iovec {
void *iov_base;
size_t iov_len;
};
ssize_t readv (int fd, const struct iovec *iov, int count);
ssize_t writev (int fd, const struct iovec *iov, int count);

Function description: readv () and writev () read into or write from each segment in order (each segment is described by one iovec structure) and return the total number of bytes read or written.
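As a rough illustration of the description above, the following sketch writes three buffers with one writev () call and reads them back into three buffers with one readv () call. The function name vector_io_demo and the use of a pipe in place of a real file are assumptions made for this example only.

```c
#include <string.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <unistd.h>

/*
 * Hypothetical demo: gather three buffers into one writev() call, then
 * scatter the same bytes into three buffers with one readv() call.
 * A pipe stands in for a file. Returns the total number of bytes read
 * back, or -1 on error.
 */
ssize_t vector_io_demo(void)
{
    char s0[] = "scatter ";   /* 8 bytes without the terminator */
    char s1[] = "gather ";    /* 7 bytes */
    char s2[] = "io\n";       /* 3 bytes */
    char b0[8], b1[7], b2[3];
    struct iovec out[3] = {
        { .iov_base = s0, .iov_len = sizeof(s0) - 1 },
        { .iov_base = s1, .iov_len = sizeof(s1) - 1 },
        { .iov_base = s2, .iov_len = sizeof(s2) - 1 },
    };
    struct iovec in[3] = {
        { .iov_base = b0, .iov_len = sizeof(b0) },
        { .iov_base = b1, .iov_len = sizeof(b1) },
        { .iov_base = b2, .iov_len = sizeof(b2) },
    };
    int fds[2];
    ssize_t nw, nr;

    if (pipe(fds) == -1)
        return -1;

    nw = writev(fds[1], out, 3);   /* segments are written in array order */
    nr = (nw == -1) ? -1 : readv(fds[0], in, 3);   /* and filled in array order */

    close(fds[0]);
    close(fds[1]);

    if (nr == -1 || nr != nw)
        return -1;
    if (memcmp(b0, s0, sizeof(b0)) || memcmp(b1, s1, sizeof(b1)) ||
        memcmp(b2, s2, sizeof(b2)))
        return -1;
    return nr;
}
```

Calling vector_io_demo () from main () and printing the result is enough to try it; on a regular file the same calls work with an fd returned by open ().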

 

Event Poll Interface:

https://blog.csdn.net/weixin_38812277/article/details/90634146
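For orientation only, here is a minimal sketch of the usual epoll sequence (create an instance, register a descriptor, wait for events). It is not taken from the linked article; the function name epoll_demo, the pipe used as the watched descriptor, and the 1000 ms timeout are all assumptions of this example.

```c
#include <string.h>
#include <sys/epoll.h>
#include <unistd.h>

/*
 * Hypothetical demo: watch the read end of a pipe with epoll, make it
 * readable, and wait for the event. Returns the number of ready
 * descriptors reported by epoll_wait(), or -1 on error.
 */
int epoll_demo(void)
{
    struct epoll_event ev, ready[8];
    int fds[2], epfd, n = -1;

    if (pipe(fds) == -1)
        return -1;

    epfd = epoll_create1(0);
    if (epfd == -1)
        goto out_pipe;

    memset(&ev, 0, sizeof(ev));
    ev.events = EPOLLIN;          /* interested in "readable" */
    ev.data.fd = fds[0];          /* handed back to us by epoll_wait() */
    if (epoll_ctl(epfd, EPOLL_CTL_ADD, fds[0], &ev) == -1)
        goto out;

    if (write(fds[1], "x", 1) != 1)   /* make the read end readable */
        goto out;

    n = epoll_wait(epfd, ready, 8, 1000);   /* wait at most 1000 ms */
    if (n == 1 && ready[0].data.fd != fds[0])
        n = -1;
out:
    close(epfd);
out_pipe:
    close(fds[0]);
    close(fds[1]);
    return n;
}
```

A real server would register many descriptors and call epoll_wait () in a loop; the single-descriptor case here only shows the order of the calls.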

Memory map:

Memory mapped file:

A file's data usually already sits in the kernel page cache. A memory mapping maps those cached pages, that is, a portion of the file, into a linear region of the process's address space. The process can then read and write the file's data directly through ordinary memory accesses to that region; the mapping mechanism is transparent to the user process. [These accesses bypass the read () / write () system calls. Modified pages are written back to the file at a time of the kernel's choosing, so to be sure the changes have reached the file, use msync () to synchronize.]

Shared (MAP_SHARED): writes to the linear region modify the file on disk, and if one process modifies the mapped region, the changes are visible to other processes that map the same file.

Private (MAP_PRIVATE): generally used when the mapping is created only to read the file. A write to such a mapping does not modify the file on disk; instead the page stops mapping the file and is remapped to a new private copy (copy-on-write), which is not visible to other processes.

#include <sys/mman.h>
void * mmap (void *addr, size_t len, int prot, int flags, int fd, off_t offset);
int munmap (void *addr, size_t len);

int mprotect (const void *addr, size_t len, int prot);
int msync (void *addr, size_t len, int flags);
int madvise (void *addr, size_t len, int advice); /* gives the kernel advice about how the mapping will be used, so it can optimize */

[Note] A page is the smallest unit of memory that can have its own permissions and behavior.
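To make the shared-mapping behavior concrete, the sketch below modifies a file through a MAP_SHARED mapping, flushes it with msync (), and checks the result with an ordinary pread (). The function name shared_mapping_demo, the temporary file under /tmp, and the 4-byte contents are assumptions chosen for the example.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE   /* expose mkstemp() and pread() under strict -std= modes */
#endif
#include <fcntl.h>
#include <stdlib.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Hypothetical demo: change a file through a MAP_SHARED mapping and
 * confirm the change with a normal read of the same file.
 * Returns 0 if the file holds the new data, -1 otherwise.
 */
int shared_mapping_demo(void)
{
    char path[] = "/tmp/mmap_demo_XXXXXX";
    char buf[4];
    char *p;
    int fd, ret = -1;

    fd = mkstemp(path);
    if (fd == -1)
        return -1;
    unlink(path);                        /* file disappears once fd is closed */

    if (write(fd, "aaaa", 4) != 4)
        goto out;

    p = mmap(NULL, 4, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    if (p == MAP_FAILED)
        goto out;

    memcpy(p, "bbbb", 4);                /* plain memory write, no write() call */

    if (msync(p, 4, MS_SYNC) == 0 &&     /* force the change out to the file */
        munmap(p, 4) == 0 &&
        pread(fd, buf, 4, 0) == 4 &&
        memcmp(buf, "bbbb", 4) == 0)
        ret = 0;
out:
    close(fd);
    return ret;
}
```

Replacing MAP_SHARED with MAP_PRIVATE in this sketch should leave the file contents as "aaaa", since writes to a private mapping go to a copy of the page rather than to the file.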

 

Anonymous memory mapping: (for larger memory allocations, use an anonymous memory mapping instead of the heap)

An anonymous memory mapping is simply a large block of memory, already initialized to 0, ready for the user. Think of it as a separate heap dedicated to a single allocation; because it is not part of the heap, it does not cause fragmentation within the heap.

Advantages: no need to worry about fragmentation; an anonymous mapping can be resized, can have its permissions set, and can accept advice just like an ordinary mapping; each allocation lives in its own memory mapping.

Disadvantages (relative to the heap): a mapping's size is always an integer multiple of the page size, so some space is wasted; it is more complicated than allocating from the heap.

glibc's malloc () satisfies small allocations from the heap (using brk or sbrk) and satisfies large allocations with anonymous memory mappings (mmap).

Implementation 1: use the MAP_ANONYMOUS flag

p= mmap (NULL, /* do not care where */
512 * 1024, /* 512 KB */
PROT_READ | PROT_WRITE, /* read/write */
MAP_ANONYMOUS | MAP_PRIVATE, /*anonymous, private */
-1, /* fd (ignored) */
0); /* offset (ignored) */

ret = munmap (p, 512 * 1024);

One benefit of allocating through an anonymous mapping is that all pages are already initialized to 0 (the memory block can be viewed as a "file", since everything is a file). The kernel achieves this with copy-on-write: the memory is initially mapped to an all-zero page, so no extra overhead is incurred until a page is actually written. There is also no need to call memset () to initialize the memory. This is in fact why calloc () can perform better than malloc () followed by memset ().

Implementation 2: map the /dev/zero device file (this device provides the same semantics as anonymous memory)
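The zero-initialization claim can be checked directly. The sketch below creates an anonymous private mapping and verifies that every byte is 0 without calling memset (); the function name anon_zero_demo is an assumption of this example.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE   /* expose MAP_ANONYMOUS under strict -std= modes */
#endif
#include <stddef.h>
#include <sys/mman.h>

/*
 * Hypothetical demo: map len bytes of anonymous memory and confirm
 * that it arrives zero-filled. Returns 0 if every byte is 0,
 * -1 on a mapping error or if a non-zero byte is found.
 */
int anon_zero_demo(size_t len)
{
    unsigned char *p;
    size_t i;
    int ret = 0;

    p = mmap(NULL, len, PROT_READ | PROT_WRITE,
             MAP_ANONYMOUS | MAP_PRIVATE, -1, 0);
    if (p == MAP_FAILED)
        return -1;

    for (i = 0; i < len; i++) {
        if (p[i] != 0) {
            ret = -1;
            break;
        }
    }

    if (munmap(p, len) == -1)
        ret = -1;
    return ret;
}
```

For instance, anon_zero_demo (512 * 1024) checks a mapping of the same size as the 512 KB example above.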

fd = open ("/dev/zero", O_RDWR);
p = mmap (NULL, /* do not care where */
getpagesize (), /* map one page */
PROT_READ | PROT_WRITE, /* map read/write */
MAP_PRIVATE, /* private mapping */
fd, /* map /dev/zero */
0); /* no offset */

[After the mapping is created, fd can be closed with close (fd). Anonymous mappings can also be used for communication between parent and child processes.]

Advice for normal file IO:
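A sketch of the parent/child case follows. Note one assumption that differs from the snippets above: for the parent to see the child's write, the mapping has to be MAP_SHARED rather than MAP_PRIVATE. The function name shared_anon_demo and the value 42 are made up for the example.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE   /* expose MAP_ANONYMOUS under strict -std= modes */
#endif
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical demo: the child stores a value in a shared anonymous
 * mapping and the parent reads it after the child exits.
 * Returns the value the parent sees, or -1 on error.
 */
int shared_anon_demo(void)
{
    int *shared;
    int value = -1;
    pid_t pid;

    /* MAP_SHARED (not MAP_PRIVATE) so both processes use the same page */
    shared = mmap(NULL, sizeof(*shared), PROT_READ | PROT_WRITE,
                  MAP_ANONYMOUS | MAP_SHARED, -1, 0);
    if (shared == MAP_FAILED)
        return -1;

    pid = fork();
    if (pid == -1) {
        munmap(shared, sizeof(*shared));
        return -1;
    }
    if (pid == 0) {
        *shared = 42;      /* child writes into the shared page */
        _exit(0);
    }

    if (waitpid(pid, NULL, 0) == pid)
        value = *shared;   /* parent reads what the child wrote */

    munmap(shared, sizeof(*shared));
    return value;
}
```

Waiting for the child with waitpid () is the only synchronization here; two processes that run concurrently would need something stronger, such as a semaphore placed in the same mapping.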
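One way a process can pass such advice to the kernel is posix_fadvise (). The sketch below is only an illustration under assumptions: the function name fadvise_demo, the temporary file under /tmp, and the choice of POSIX_FADV_SEQUENTIAL are not from the original notes.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE   /* expose mkstemp() and posix_fadvise() under strict -std= modes */
#endif
#include <fcntl.h>
#include <stdlib.h>
#include <unistd.h>

/*
 * Hypothetical demo: tell the kernel that a file will be read
 * sequentially. Returns 0 on success, a positive errno value if
 * posix_fadvise() fails, or -1 if the temporary file cannot be created.
 */
int fadvise_demo(void)
{
    char path[] = "/tmp/fadvise_demo_XXXXXX";
    int fd, err;

    fd = mkstemp(path);
    if (fd == -1)
        return -1;
    unlink(path);                 /* file disappears once fd is closed */

    /* offset 0 with len 0 means "from the start to the end of the file" */
    err = posix_fadvise(fd, 0, 0, POSIX_FADV_SEQUENTIAL);

    close(fd);
    return err;
}
```

The advice is only a hint: the kernel may act on it (for example by reading ahead more aggressively) or ignore it, and the program's results are the same either way.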

 

Asynchronous IO: the user initiates an IO operation and the call returns immediately; the kernel is responsible for copying the data into user memory and then notifying the process.

Asynchronous IO requires support from the underlying kernel.

 


Origin blog.csdn.net/weixin_38812277/article/details/93318049