coreutils-4.5.1 head.c source code analysis 03

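Today I finally got a real grasp of head.c.
It is really cold today. I ordered an electric heater online, but it has not arrived yet. Still, reading code while listening to music is quite pleasant.
Reading code cannot be rushed. As Hao Peiqiang put it, you have to start slowly; only then can you read faster and faster later on.
head has several options:
-n number of lines
-c number of bytes
-q never print the file name header
-v always print the file name header
Get the command usage straight first. That is important.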

./head -q -n 3 head.c
/* head -- output first part of file(s)
Copyright (C) 89, 90, 91, 1995-2002 Free Software Foundation, Inc.
./head -v -n 3 head.c
==> head.c <==
/* head -- output first part of file(s)
Copyright (C) 89, 90, 91, 1995-2002 Free Software Foundation, Inc.
./head -v -n 3 head.c cat.c
==> head.c <==
/* head -- output first part of file(s)
Copyright (C) 89, 90, 91, 1995-2002 Free Software Foundation, Inc.

==> cat.c <==
/* cat -- concatenate files and print on the standard output.
Copyright (C) 88, 90, 91, 1995-2002 Free Software Foundation, Inc.

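Being able to copy Ubuntu command-line output straight into gvim is a real joy.
To build the command locally, I go into the src directory and run make
(I used sudo make, although a plain make should not normally need root), then run the local binary; note the leading ./
After that, read on slowly.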

static void
write_header (const char *filename)
{
  static int first_file = 1;

  printf ("%s==> %s <==\n", (first_file ? "" : "\n"), filename);
  first_file = 0;
}
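This function prints the file name header; whether it is called at all is decided elsewhere, through print_headers. The static variable first_file puzzled me at first: it looked as if it were set to 1 before every use and then changed to 0 afterwards, which seemed redundant. In fact a static local variable is initialized only once and keeps its value between calls. So only the first header is printed without a leading blank line, and every later header is preceded by "\n". That is exactly the blank line before ==> cat.c <== in the output above.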
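To convince myself of how the static flag behaves, here is a minimal sketch. format_header is a hypothetical helper, not part of head.c: it writes the header into a buffer instead of printing it, so the two calls can be compared.

```c
#include <stdio.h>

/* Hypothetical helper (not in head.c): same logic as write_header,
   but the header is formatted into buf so it can be inspected.  */
int
format_header (char *buf, size_t size, const char *filename)
{
  static int first_file = 1;    /* initialized once, not on every call */
  int n = snprintf (buf, size, "%s==> %s <==\n",
                    first_file ? "" : "\n", filename);

  first_file = 0;               /* every later call sees 0 */
  return n;
}
```

Calling it twice gives a header without a leading newline the first time and with one the second time, which is what separates the files in the output.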
int
main (int argc, char **argv)
{

  have_read_stdin = 0;

  print_headers = 0;

  if (1 < argc && argv[1][0] == '-' && ISDIGIT (argv[1][1]))
    {
    }
  while ((c = getopt_long (argc, argv, "c:n:qv", long_options, NULL)) != -1)
    {
      /* Option handling follows the usual pattern.  Note that it sets
         global variables here, and the functions then act on them.  */
    }

  if (header_mode == always
      || (header_mode == multiple_files && optind < argc - 1))
    print_headers = 1;

  if (optind == argc)
    exit_status |= head_file ("-", n_units, count_lines);

  for (; optind < argc; ++optind)
    exit_status |= head_file (argv[optind], n_units, count_lines);

  if (have_read_stdin && close (STDIN_FILENO) < 0)
    error (EXIT_FAILURE, errno, "-");

  exit (exit_status == 0 ? EXIT_SUCCESS : EXIT_FAILURE);
}
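As you can see, several file names can be passed at once, so head has to handle multiple files in a single invocation, for example
./head h1 h2 h3 h4
This also fits Unix convention: the shell expands wildcards such as *, so the command itself only has to cope with a list of files. When no file is given at all (optind == argc), it reads standard input, passed as "-".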
for (; optind < argc; ++optind)
exit_status |= head_file (argv[optind], n_units, count_lines);
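How -q, -v and the number of files decide whether headers are printed can be sketched as below. The names always and multiple_files come from the code in main; the enum layout, the value never, and the function should_print_headers are my assumptions for illustration.

```c
/* Assumed layout; only always and multiple_files appear in the excerpt.  */
enum header_mode { multiple_files, always, never };

/* Hypothetical helper: mirrors the condition in main, where
   "optind < argc - 1" means more than one file operand remains.  */
int
should_print_headers (enum header_mode mode, int n_files)
{
  return mode == always || (mode == multiple_files && n_files > 1);
}
```

With the default mode a single file gets no header, two or more files get headers, and the always mode (set by -v) prints one even for a single file, as in the ./head -v -n 3 head.c run above.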
Next, head_file.
Depending on the -n or -c option, it outputs either by lines or by bytes. The byte-count case is simpler and has less code.
static int
head_bytes (const char *filename, int fd, uintmax_t bytes_to_write)
{
  char buffer[BUFSIZE];
  int bytes_read;
  size_t bytes_to_read = BUFSIZE;
  SET_BINARY2 (fd, fileno (stdout));
  /* I removed the read error handling here.  */
  while (bytes_to_write)
    {
      if (bytes_to_write < bytes_to_read)
        bytes_to_read = bytes_to_write;
      bytes_read = safe_read (fd, buffer, bytes_to_read);
      if (bytes_read == 0)
        break;
      if (fwrite (buffer, 1, bytes_read, stdout) == 0)
        error (EXIT_FAILURE, errno, _("write error"));
      bytes_to_write -= bytes_read;
    }
  return 0;
}
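This code is quite compact. Roughly:
bytes_to_write bytes remain to be output
while some bytes remain
   read at most one buffer's worth (fewer if less than that remains)
   store the number of bytes actually read in bytes_read
   if 0 bytes were read, the end of the file has been reached, so stop
   write the bytes_read bytes to standard output
   bytes_to_write -= bytes_read
I did not dig into safe_read; as far as I know it is a wrapper around read that retries when a signal interrupts the call.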
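The loop above can be tried out as a standalone sketch. This is not the coreutils code: copy_head_bytes and SKETCH_BUFSIZE are names I made up, plain POSIX read stands in for safe_read (so an interrupted read is not retried), and errors are returned to the caller instead of being reported through error.

```c
#include <stdint.h>
#include <stdio.h>
#include <unistd.h>

enum { SKETCH_BUFSIZE = 4096 };  /* assumed size; the real BUFSIZE may differ */

/* Copy at most n bytes from fd to out.
   Returns 0 on success, -1 on a read or write error.  */
int
copy_head_bytes (int fd, FILE *out, uintmax_t n)
{
  char buffer[SKETCH_BUFSIZE];

  while (n > 0)
    {
      size_t want = sizeof buffer;
      if (n < want)
        want = (size_t) n;      /* never read past the requested count */

      ssize_t got = read (fd, buffer, want);
      if (got < 0)
        return -1;              /* read error */
      if (got == 0)
        break;                  /* end of file before n bytes were seen */

      if (fwrite (buffer, 1, (size_t) got, out) != (size_t) got)
        return -1;              /* short write */
      n -= (uintmax_t) got;
    }
  return 0;
}
```

Something like copy_head_bytes (STDIN_FILENO, stdout, 5) would then behave roughly like head -c 5 reading from standard input.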

The line-based function
head_lines, I did not fully understand at first. It contains this piece, which includes some error handling:
while (bytes_to_write < bytes_read)
  if (buffer[bytes_to_write++] == '\n' && --lines_to_write == 0)
    {
      /* If we have read more data than that on the specified number
         of lines, try to seek back to the position we would have
         gotten to had we been reading one byte at a time. */
      if (lseek (fd, bytes_to_write - bytes_read, SEEK_CUR) < 0)
        {
          int e = errno;
          struct stat st;
          if (fstat (fd, &st) != 0 || S_ISREG (st.st_mode))
            error (0, e, _("cannot reposition file pointer for %s"),
                   filename);
        }
      break;
    }
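At first I could not see why the author calls lseek. The comment explains it: the data is read a whole buffer at a time, so when the last wanted line ends in the middle of the buffer, more bytes have been consumed from the file than will be printed. bytes_to_write - bytes_read is negative, so the lseek moves the file offset back to just after the last printed line, and anything else that shares the descriptor, for example (./head -n 1; cat) < file, continues from the right place. If the seek fails, an error is reported only when fstat fails or the descriptor is a regular file; on a pipe, where seeking is impossible, the failure is silently ignored. The key point is that each '\n' counts as one line. I like to leave a few loose ends. Take it slowly; reading code is a way of life too.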
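As for the handling of

while ((c = getopt_long (argc, argv, "c:n:qv", long_options, NULL)) != -1)
{
I have never managed to understand the source of getopt_long itself. When I was reading sed and got to this part, it made me feel sick. Did you understand it? Ha ha. Next time, then.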
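Even without reading the getopt_long source, using it is manageable. Below is a minimal sketch of a loop with the same option string "c:n:qv". Everything else is my assumption for illustration: struct head_opts, parse_head_opts, the long option names, and the default of 10 lines. Numbers are parsed with plain strtol, so size suffixes are not handled.

```c
#include <getopt.h>
#include <stdlib.h>

/* Hypothetical container for the parsed settings.  */
struct head_opts
{
  int count_lines;      /* 1 = count lines (-n), 0 = count bytes (-c) */
  long n_units;         /* how many lines or bytes */
  int quiet;            /* -q seen */
  int verbose;          /* -v seen */
};

/* Returns 0 on success, -1 on an unknown option or missing argument.
   On return, optind is the index of the first file operand.  */
int
parse_head_opts (int argc, char **argv, struct head_opts *o)
{
  /* Assumed long names, listed only to show the table format.  */
  static const struct option long_options[] = {
    {"bytes", required_argument, NULL, 'c'},
    {"lines", required_argument, NULL, 'n'},
    {"quiet", no_argument, NULL, 'q'},
    {"verbose", no_argument, NULL, 'v'},
    {NULL, 0, NULL, 0}
  };
  int c;

  o->count_lines = 1;
  o->n_units = 10;
  o->quiet = 0;
  o->verbose = 0;

  /* A colon after a letter in "c:n:qv" means the option takes an
     argument, which getopt_long hands over in optarg.  */
  while ((c = getopt_long (argc, argv, "c:n:qv", long_options, NULL)) != -1)
    switch (c)
      {
      case 'c':
        o->count_lines = 0;
        o->n_units = strtol (optarg, NULL, 10);
        break;
      case 'n':
        o->count_lines = 1;
        o->n_units = strtol (optarg, NULL, 10);
        break;
      case 'q':
        o->quiet = 1;
        break;
      case 'v':
        o->verbose = 1;
        break;
      default:
        return -1;      /* getopt_long has already printed a message */
      }
  return 0;
}
```

For the command line ./head -v -n 3 head.c cat.c this leaves verbose set, n_units equal to 3, and argv[optind] pointing at head.c, which is where the for loop in main takes over.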
