12 The readdir function

Preface

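In the earlier post on the ls command, we saw that ls relies on interfaces provided by the operating system (library functions and system calls such as opendir, readdir, stat, and lstat) to do its work.

So here we take a closer look at these calls.

The focus here is the readdir function; the system call through which it enters the kernel is getdents.

The debugging below is based on the command "ls -l /jerry"

The debugging below is based on Linux 4.10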

The readdir man page entry

getdents 

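getdents wraps its arguments in a getdents_callback and then iterates over the entries of the directory f; the getdents_callback carries the callback (ctx.actor).

The data ultimately ends up in buf.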
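The callback arrangement described above can be sketched roughly as follows. This is only a small Java model, not kernel code: Actor, GetdentsBuffer, and iterateDir are hypothetical names standing in for ctx.actor, getdents_callback, and the filesystem's iteration step, and the "buffer" simply counts entries instead of bytes.

```java
import java.util.ArrayList;
import java.util.List;

final class GetdentsSketch {

    /** Hypothetical stand-in for ctx.actor: returns false to stop the iteration. */
    interface Actor {
        boolean emit(String name, long ino);
    }

    /** Hypothetical stand-in for getdents_callback: holds the output and the space left. */
    static final class GetdentsBuffer implements Actor {
        final List<String> out = new ArrayList<>();
        int remaining;

        GetdentsBuffer(int capacity) {
            this.remaining = capacity;
        }

        @Override
        public boolean emit(String name, long ino) {
            if (remaining == 0) {
                return false; // buffer full: ask the iterator to stop
            }
            out.add(ino + " " + name);
            remaining--;
            return true;
        }
    }

    /** Hypothetical stand-in for the filesystem's iterate step: feeds each entry to the actor. */
    static int iterateDir(String[] names, long[] inos, Actor actor) {
        int emitted = 0;
        for (int i = 0; i < names.length; i++) {
            if (!actor.emit(names[i], inos[i])) {
                break;
            }
            emitted++;
        }
        return emitted;
    }

    public static void main(String[] args) {
        GetdentsBuffer buf = new GetdentsBuffer(2);
        int n = iterateDir(new String[] {".", "..", "1.txt"}, new long[] {2, 1, 12}, buf);
        System.out.println(n + " entries: " + buf.out);
    }
}
```

The point of the design is that the iterator knows nothing about the output format; it only hands each entry to whatever actor the caller packed into the context, and stops when the actor says so.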

ext4_dx_readdir

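The first if block fills info for the iteration; roughly, it stores the information of each file in the current directory into info.root.

The rest of the function walks the whole info.root tree and calls call_filldir to fill in each file's information; call_filldir in turn delegates to the getdents_callback.ctx.actor described above.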

htree_dirblock_to_tree

This function iterates over the entries in dir and calls ext4_htree_store_dirent to put each entry's information into info.root.

ext4_htree_store_dirent 

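Copies the information of the given entry into info.root.

info.root is carried in file->private_data; the surrounding logic lives in ext4_dx_readdir.

Here the current entry's hash, minor_hash, inode, name, name_len, and file_type are packed into fname, which is then inserted into info.root (a red-black tree), ordered by hash and then minor_hash.

We will verify this ordering with a case later on.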
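To make the ordering concrete, here is a minimal sketch that uses java.util.TreeMap (itself a red-black tree) in place of info.root. The Key class and namesInHashOrder are hypothetical names for illustration; the three hash and minor_hash pairs are taken from the gdb breakpoint output later in this post. It does not model how the kernel handles entries whose hash and minor_hash both collide.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;

final class HashOrderSketch {

    /** Hypothetical key: (hash, minor_hash) stored as longs so unsigned 32-bit values compare correctly. */
    static final class Key implements Comparable<Key> {
        final long hash;
        final long minorHash;

        Key(long hash, long minorHash) {
            this.hash = hash;
            this.minorHash = minorHash;
        }

        @Override
        public int compareTo(Key other) {
            int byHash = Long.compare(hash, other.hash);
            return byHash != 0 ? byHash : Long.compare(minorHash, other.minorHash);
        }
    }

    /** Inserts the entries into a TreeMap and returns the names in (hash, minor_hash) order. */
    static List<String> namesInHashOrder(String[] names, long[] hashes, long[] minorHashes) {
        TreeMap<Key, String> root = new TreeMap<>();
        for (int i = 0; i < names.length; i++) {
            root.put(new Key(hashes[i], minorHashes[i]), names[i]);
        }
        return new ArrayList<>(root.values());
    }

    public static void main(String[] args) {
        String[] names = {"1.txt", "..", "lost+found"};
        long[] hashes = {4147007512L, 1798131950L, 2638309314L};
        long[] minorHashes = {1467808689L, 3795156168L, 220112255L};
        System.out.println(namesInHashOrder(names, hashes, minorHashes));
    }
}
```

Whatever order the entries are inserted in, walking the tree yields them sorted by hash, which is why the on-disk order of the directory block does not show up in the readdir result.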

call_filldir

Back in the outer ext4_dx_readdir, it iterates over the information of each file in the directory and invokes the callback to fill the data into buf.

filldir 

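filldir is called to fill the information of each file in the current dir into buf.current_dir.

buf.current_dir is a user-space dirent passed in as an argument, which is why the __put_user function is used here.

It fills the dirent with inode_no, record_len, d->name, a 0 (the string terminator), file_type, offset, and so on.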
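The record_len written here can be reproduced with a little arithmetic. The sketch below assumes the x86_64 layout of struct linux_dirent (8-byte d_ino, 8-byte d_off, 2-byte d_reclen, then the name) and assumes the record is the header plus the name plus two extra bytes (the terminator and file_type), rounded up to 8 bytes; ReclenSketch and its methods are hypothetical names, so treat this as an illustration rather than the kernel's code.

```java
final class ReclenSketch {

    // Assumed offset of d_name on x86_64: d_ino (8) + d_off (8) + d_reclen (2).
    static final int NAME_OFFSET = 18;
    // Assumed sizeof(long) on x86_64.
    static final int LONG_SIZE = 8;

    /** Rounds value up to the next multiple of align (align must be a power of two). */
    static int alignUp(int value, int align) {
        return (value + align - 1) & ~(align - 1);
    }

    /** Header + name + terminator byte + file_type byte, rounded up to LONG_SIZE. */
    static int recordLength(int nameLength) {
        return alignUp(NAME_OFFSET + nameLength + 2, LONG_SIZE);
    }

    public static void main(String[] args) {
        String name = "Test02Malloc";
        System.out.println(name + " -> " + recordLength(name.length()));
    }
}
```

For the 12-character name Test02Malloc this gives 18 + 12 + 2 = 32, already a multiple of 8.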

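Let's look at the data after it has been filled in.

The in-memory data can be matched against the fields one by one; the details are omitted here.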

(gdb) x /30bc 0xaa3a40
0xaa3a40:	18 '\022'	0 '\000'	0 '\000'	0 '\000'	0 '\000'	0 '\000'	0 '\000'	0 '\000'
0xaa3a48:	0 '\000'	0 '\000'	0 '\000'	0 '\000'	0 '\000'	0 '\000'	0 '\000'	0 '\000'
0xaa3a50:	32 ' '	0 '\000'	84 'T'	101 'e'	115 's'	116 't'	48 '0'	50 '2'
0xaa3a58:	77 'M'	97 'a'	108 'l'	108 'l'	111 'o'	99 'c'

To confirm, the inode_no of Test02Malloc is indeed 18
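For readers who want to match the bytes to the fields, the dump above can be decoded with a few lines of Java. This is a sketch under the assumption of a little-endian x86_64 layout (8-byte inode number, 8-byte offset, 2-byte record length, then the name); DirentDumpSketch and decode are hypothetical names.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.charset.StandardCharsets;

final class DirentDumpSketch {

    // The 30 bytes printed by gdb at 0xaa3a40.
    static final byte[] DUMP = {
        18, 0, 0, 0, 0, 0, 0, 0,
        0, 0, 0, 0, 0, 0, 0, 0,
        32, 0, 84, 101, 115, 116, 48, 50,
        77, 97, 108, 108, 111, 99
    };

    /** Decodes one record, assuming the layout described in the lead-in. */
    static String decode(byte[] bytes) {
        ByteBuffer buf = ByteBuffer.wrap(bytes).order(ByteOrder.LITTLE_ENDIAN);
        long ino = buf.getLong();                // inode_no
        long off = buf.getLong();                // offset
        int reclen = buf.getShort() & 0xFFFF;    // record_len, read as unsigned
        int nameStart = buf.position();
        int nameEnd = nameStart;
        // The name runs until the 0 terminator or the end of the dumped bytes.
        while (nameEnd < bytes.length && bytes[nameEnd] != 0) {
            nameEnd++;
        }
        String name = new String(bytes, nameStart, nameEnd - nameStart, StandardCharsets.US_ASCII);
        return "ino=" + ino + " off=" + off + " reclen=" + reclen + " name=" + name;
    }

    public static void main(String[] args) {
        System.out.println(decode(DUMP));
    }
}
```

The first 8 bytes give inode 18, bytes 16 and 17 give a record length of 32, and the remaining bytes spell the name. The offset reads 0 in this dump.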

(initramfs) ls -ail /jerry/
     29 -rwxr-xr-x    1      8912 Test19UdpClient02
     32 -rw-r--r--    1       860 Test19UdpServer.c
     21 -rw-r--r--    1      1036 Test05SocketClient.c
     12 -rw-r--r--    1         7 1.txt
     35 -rw-r--r--    1         2 4.txt
     16 -rwxr-xr-x    1    913944 Test01SumStatic
     22 -rwxr-xr-x    1      9232 Test05SocketServer
     14 -rwxr-xr-x    1      9784 Test01Sum
     11 drwx------    2     12288 lost+found
      2 drwxr-xr-x    3      1024 .
     25 -rw-r--r--    1      2213 Test18UdpClient.c
     30 -rw-r--r--    1       790 Test19UdpClient.c
     28 -rwxr-xr-x    1      8912 Test19UdpClient
     24 -rwxr-xr-x    1     13656 Test18UdpClient
     13 -rwxr-xr-x    1     44168 ping
     23 -rw-r--r--    1      1839 Test05SocketServer.c
     34 -rw-r--r--    1         2 3.txt
     17 -rw-r--r--    1      8828 Test01Sum.txt
     19 -rw-r--r--    1       112 Test02Malloc.c
     36 -rw-r--r--    1         0 2.xml
      1 drwxr-xr-x   18         0 ..
     33 -rw-r--r--    1         2 2.txt
     31 -rwxr-xr-x    1      9008 Test19UdpServer
     20 -rwxr-xr-x    1      9208 Test05SocketClient
     15 -rw-r--r--    1       127 Test01Sum.c
     27 -rw-r--r--    1      2139 Test18UdpServer.c
     26 -rwxr-xr-x    1     13656 Test18UdpServer
     18 -rwxr-xr-x    1      9898 Test02Malloc
(initramfs) 

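Recap of how ls uses this

ls uses the file_type, file_name, inode_no, and so on obtained from the system call.

File order returned by readdir

Below we extract each file in /jerry together with its hash.

We then sort by hash, print the files in that order, and compare the result with the order of "ls -l /jerry" to see how the two relate.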

import java.util.Map;
import java.util.TreeMap;

/**
 * Test13ResolveFileAndHash
 *
 * @author Jerry.X.He
 * @version 1.0
 * @date 2022-08-06 10:58
 */
public class Test13ResolveFileAndHash {

    // Test13ResolveFileAndHash
    public static void main(String[] args) {

        String lines = "(gdb) printf \"%s %ld\", ent_name->name, hash\n" +
                ". 2361201130\n" +
                "Breakpoint 6, ext4_htree_store_dirent (dir_file=<optimized out>, hash=1798131950, minor_hash=3795156168, dirent=<optimized out>, ent_name=0xffffc9000074bcb8) at fs/ext4/dir.c:459\n" +
                "459\t\tnew_fn->name[ent_name->len] = 0;\n" +
                "(gdb) printf \"%s %ld\", ent_name->name, hash\n" +
                ".. 1798131950\n" +
                "Breakpoint 6, ext4_htree_store_dirent (dir_file=<optimized out>, hash=2638309314, minor_hash=220112255, dirent=<optimized out>, ent_name=0xffffc9000074bcb8) at fs/ext4/dir.c:459\n" +
                "459\t\tnew_fn->name[ent_name->len] = 0;\n" +
                "(gdb) printf \"%s %ld\", ent_name->name, hash\n" +
                "lost+found 2638309314\n" +
                "Breakpoint 6, ext4_htree_store_dirent (dir_file=<optimized out>, hash=4147007512, minor_hash=1467808689, dirent=<optimized out>, ent_name=0xffffc9000074bcb8) at fs/ext4/dir.c:459\n" +
                "459\t\tnew_fn->name[ent_name->len] = 0;\n" +
                "(gdb) printf \"%s %ld\", ent_name->name, hash\n" +
                "1.txt 4147007512\n" +
                "Breakpoint 6, ext4_htree_store_dirent (dir_file=<optimized out>, hash=2218817754, minor_hash=2900089684, dirent=<optimized out>, ent_name=0xffffc9000074bcb8) at fs/ext4/dir.c:459\n" +
                "459\t\tnew_fn->name[ent_name->len] = 0;\n" +
                "(gdb) printf \"%s %ld\", ent_name->name, hash\n" +
                "ping\u000E 2218817754\n" +
                "Breakpoint 6, ext4_htree_store_dirent (dir_file=<optimized out>, hash=2722591116, minor_hash=3228507950, dirent=<optimized out>, ent_name=0xffffc9000074bcb8) at fs/ext4/dir.c:459\n" +
                "459\t\tnew_fn->name[ent_name->len] = 0;\n" +
                "(gdb) printf \"%s %ld\", ent_name->name, hash\n" +
                "Test01Sum 2722591116\n" +
                "Breakpoint 6, ext4_htree_store_dirent (dir_file=<optimized out>, hash=582633220, minor_hash=3262287479, dirent=<optimized out>, ent_name=0xffffc9000074bcb8) at fs/ext4/dir.c:459\n" +
                "459\t\tnew_fn->name[ent_name->len] = 0;\n" +
                "(gdb) printf \"%s %ld\", ent_name->name, hash\n" +
                "Test01Sum.c 582633220\n" +
                "Breakpoint 6, ext4_htree_store_dirent (dir_file=<optimized out>, hash=3631608018, minor_hash=2725415301, dirent=<optimized out>, ent_name=0xffffc9000074bcb8) at fs/ext4/dir.c:459\n" +
                "459\t\tnew_fn->name[ent_name->len] = 0;\n" +
                "(gdb) printf \"%s %ld\", ent_name->name, hash\n" +
                "Test01SumStatic 3631608018\n" +
                "Breakpoint 6, ext4_htree_store_dirent (dir_file=<optimized out>, hash=2142992528, minor_hash=928223017, dirent=<optimized out>, ent_name=0xffffc9000074bcb8) at fs/ext4/dir.c:459\n" +
                "459\t\tnew_fn->name[ent_name->len] = 0;\n" +
                "(gdb) printf \"%s %ld\", ent_name->name, hash\n" +
                "Test01Sum.txt 2142992528\n" +
                "Breakpoint 6, ext4_htree_store_dirent (dir_file=<optimized out>, hash=220227180, minor_hash=2471538305, dirent=<optimized out>, ent_name=0xffffc9000074bcb8) at fs/ext4/dir.c:459\n" +
                "459\t\tnew_fn->name[ent_name->len] = 0;\n" +
                "(gdb) printf \"%s %ld\", ent_name->name, hash\n" +
                "Test02Malloc\u0013 220227180\n" +
                "Breakpoint 6, ext4_htree_store_dirent (dir_file=<optimized out>, hash=1840751322, minor_hash=137392396, dirent=<optimized out>, ent_name=0xffffc9000074bcb8) at fs/ext4/dir.c:459\n" +
                "459\t\tnew_fn->name[ent_name->len] = 0;\n" +
                "(gdb) printf \"%s %ld\", ent_name->name, hash\n" +
                "Breakpoint 6, ext4_htree_store_dirent (dir_file=<optimized out>, hash=857331192, minor_hash=434642936, dirent=<optimized out>, ent_name=0xffffc9000074bcb8) at fs/ext4/dir.c:459\n" +
                "459\t\tnew_fn->name[ent_name->len] = 0;\n" +
                "Test05SocketClient 857331192\n" +
                "Breakpoint 6, ext4_htree_store_dirent (dir_file=<optimized out>, hash=4158704484, minor_hash=312643960, dirent=<optimized out>, ent_name=0xffffc9000074bcb8) at fs/ext4/dir.c:459\n" +
                "459\t\tnew_fn->name[ent_name->len] = 0;\n" +
                "Test05SocketClient.c\u0016 4158704484\n" +
                "Breakpoint 6, ext4_htree_store_dirent (dir_file=<optimized out>, hash=3462659828, minor_hash=2883930437, dirent=<optimized out>, ent_name=0xffffc9000074bcb8) at fs/ext4/dir.c:459\n" +
                "459\t\tnew_fn->name[ent_name->len] = 0;\n" +
                "Test05SocketServer 3462659828\n" +
                "Breakpoint 6, ext4_htree_store_dirent (dir_file=<optimized out>, hash=2205009970, minor_hash=314116339, dirent=<optimized out>, ent_name=0xffffc9000074bcb8) at fs/ext4/dir.c:459\n" +
                "459\t\tnew_fn->name[ent_name->len] = 0;\n" +
                "Test05SocketServer.c\u0018 2205009970\n" +
                "Breakpoint 6, ext4_htree_store_dirent (dir_file=<optimized out>, hash=2263127060, minor_hash=2266183803, dirent=<optimized out>, ent_name=0xffffc9000074bcb8) at fs/ext4/dir.c:459\n" +
                "459\t\tnew_fn->name[ent_name->len] = 0;\n" +
                "Test18UdpClient 2263127060\n" +
                "Breakpoint 6, ext4_htree_store_dirent (dir_file=<optimized out>, hash=2342703042, minor_hash=2944388140, dirent=<optimized out>, ent_name=0xffffc9000074bcb8) at fs/ext4/dir.c:459\n" +
                "459\t\tnew_fn->name[ent_name->len] = 0;\n" +
                "Test18UdpClient.c 2342703042\n" +
                "Breakpoint 6, ext4_htree_store_dirent (dir_file=<optimized out>, hash=428460190, minor_hash=2134201002, dirent=<optimized out>, ent_name=0xffffc9000074bcb8) at fs/ext4/dir.c:459\n" +
                "459\t\tnew_fn->name[ent_name->len] = 0;\n" +
                "Test18UdpServer 428460190\n" +
                "Breakpoint 6, ext4_htree_store_dirent (dir_file=<optimized out>, hash=493987328, minor_hash=2169830099, dirent=<optimized out>, ent_name=0xffffc9000074bcb8) at fs/ext4/dir.c:459\n" +
                "459\t\tnew_fn->name[ent_name->len] = 0;\n" +
                "Test18UdpServer.c 493987328\n" +
                "Breakpoint 6, ext4_htree_store_dirent (dir_file=<optimized out>, hash=2302229242, minor_hash=322069301, dirent=<optimized out>, ent_name=0xffffc9000074bcb8) at fs/ext4/dir.c:459\n" +
                "459\t\tnew_fn->name[ent_name->len] = 0;\n" +
                "Test19UdpClient 2302229242\n" +
                "Breakpoint 6, ext4_htree_store_dirent (dir_file=<optimized out>, hash=4262479554, minor_hash=1848801320, dirent=<optimized out>, ent_name=0xffffc9000074bcb8) at fs/ext4/dir.c:459\n" +
                "459\t\tnew_fn->name[ent_name->len] = 0;\n" +
                "Test19UdpClient02 4262479554\n" +
                "Breakpoint 6, ext4_htree_store_dirent (dir_file=<optimized out>, hash=2304662458, minor_hash=1793028086, dirent=<optimized out>, ent_name=0xffffc9000074bcb8) at fs/ext4/dir.c:459\n" +
                "459\t\tnew_fn->name[ent_name->len] = 0;\n" +
                "Test19UdpClient.c 2304662458\n" +
                "Breakpoint 6, ext4_htree_store_dirent (dir_file=<optimized out>, hash=888054094, minor_hash=304564512, dirent=<optimized out>, ent_name=0xffffc9000074bcb8) at fs/ext4/dir.c:459\n" +
                "459\t\tnew_fn->name[ent_name->len] = 0;\n" +
                "Test19UdpServer 888054094\n" +
                "Breakpoint 6, ext4_htree_store_dirent (dir_file=<optimized out>, hash=4198063328, minor_hash=3673609025, dirent=<optimized out>, ent_name=0xffffc9000074bcb8) at fs/ext4/dir.c:459\n" +
                "459\t\tnew_fn->name[ent_name->len] = 0;\n" +
                "Test19UdpServer.c 4198063328\n" +
                "Breakpoint 6, ext4_htree_store_dirent (dir_file=<optimized out>, hash=1673380854, minor_hash=394531314, dirent=<optimized out>, ent_name=0xffffc9000074bcb8) at fs/ext4/dir.c:459\n" +
                "459\t\tnew_fn->name[ent_name->len] = 0;\n" +
                "2.txt 1673380854\n" +
                "Breakpoint 6, ext4_htree_store_dirent (dir_file=<optimized out>, hash=2170292718, minor_hash=758117636, dirent=<optimized out>, ent_name=0xffffc9000074bcb8) at fs/ext4/dir.c:459\n" +
                "459\t\tnew_fn->name[ent_name->len] = 0;\n" +
                "3.txt 2170292718\n" +
                "Breakpoint 6, ext4_htree_store_dirent (dir_file=<optimized out>, hash=4053642864, minor_hash=2642363966, dirent=<optimized out>, ent_name=0xffffc9000074bcb8) at fs/ext4/dir.c:459\n" +
                "459\t\tnew_fn->name[ent_name->len] = 0;\n" +
                "4.txt 4053642864\n" +
                "Breakpoint 6, ext4_htree_store_dirent (dir_file=<optimized out>, hash=1832402012, minor_hash=2399099147, dirent=<optimized out>, ent_name=0xffffc9000074bcb8) at fs/ext4/dir.c:459\n" +
                "459\t\tnew_fn->name[ent_name->len] = 0;\n" +
                "2.xml 1832402012 \n";

        Map<Long, String> hash2FileName = new TreeMap<>();
        for (String line : lines.split("\n")) {
            if (line.contains("ext4_htree_store_dirent")) {
                continue;
            }
            if (line.contains("new_fn->name")) {
                continue;
            }
            if (line.contains("printf")) {
                continue;
            }

//            System.out.println(line);
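            // gdb's %s can print a stray byte after a name that is not NUL-terminated; strip control characters
            String[] splits = line.trim().split("\\s+");
            hash2FileName.put(Long.parseLong(splits[1]), splits[0].replaceAll("\\p{Cntrl}", ""));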
        }

        for (Map.Entry<Long, String> entry : hash2FileName.entrySet()) {
            System.out.println(entry.getValue());
        }

    }

}

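The resulting file order is listed below. It is almost identical to the order of "ls -l /jerry", only reversed. (Test02Malloc.c is absent from the list, presumably because the captured gdb output has no name line for the hash=1840751322 breakpoint.)

The order in which readdir returns the files matches the Test13ResolveFileAndHash output above, so the difference in order is probably introduced by the processing in ls itself.

For comparison, ls in coreutils can sort by file name, extension, file size, version, and time.

The ls in the QEMU virtual machine just uses some other ordering, and a somewhat odd one.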

Test02Malloc
Test18UdpServer
Test18UdpServer.c
Test01Sum.c
Test05SocketClient
Test19UdpServer
2.txt
..
2.xml
Test01Sum.txt
3.txt
Test05SocketServer.c
ping
Test18UdpClient
Test19UdpClient
Test19UdpClient.c
Test18UdpClient.c
.
lost+found
Test01Sum
Test05SocketServer
Test01SumStatic
4.txt
1.txt
Test05SocketClient.c
Test19UdpServer.c
Test19UdpClient02

The readdir order, sampling the first two entries

The order of "ls -ail /jerry"

(initramfs) ls -ail /jerry
     29 -rwxr-xr-x    1      8912 Test19UdpClient02
     32 -rw-r--r--    1       860 Test19UdpServer.c
     21 -rw-r--r--    1      1036 Test05SocketClient.c
     12 -rw-r--r--    1         7 1.txt
     35 -rw-r--r--    1         2 4.txt
     16 -rwxr-xr-x    1    913944 Test01SumStatic
     22 -rwxr-xr-x    1      9232 Test05SocketServer
     14 -rwxr-xr-x    1      9784 Test01Sum
     11 drwx------    2     12288 lost+found
      2 drwxr-xr-x    3      1024 .
     25 -rw-r--r--    1      2213 Test18UdpClient.c
     30 -rw-r--r--    1       790 Test19UdpClient.c
     28 -rwxr-xr-x    1      8912 Test19UdpClient
     24 -rwxr-xr-x    1     13656 Test18UdpClient
     13 -rwxr-xr-x    1     44168 ping
     23 -rw-r--r--    1      1839 Test05SocketServer.c
     34 -rw-r--r--    1         2 3.txt
     17 -rw-r--r--    1      8828 Test01Sum.txt
     19 -rw-r--r--    1       112 Test02Malloc.c
     36 -rw-r--r--    1         0 2.xml
      1 drwxr-xr-x   18         0 ..
     33 -rw-r--r--    1         2 2.txt
     31 -rwxr-xr-x    1      9008 Test19UdpServer
     20 -rwxr-xr-x    1      9208 Test05SocketClient
     15 -rw-r--r--    1       127 Test01Sum.c
     27 -rw-r--r--    1      2139 Test18UdpServer.c
     26 -rwxr-xr-x    1     13656 Test18UdpServer
     18 -rwxr-xr-x    1      9898 Test02Malloc
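The "same order, only reversed" observation can be checked mechanically. The sketch below copies the hash-sorted list and the "ls -ail /jerry" listing from this post and compares one against the reverse of the other; OrderCompareSketch, isReverseOf, and missingFrom are hypothetical helper names.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

final class OrderCompareSketch {

    // Names in hash order, copied from the Test13ResolveFileAndHash output.
    static final List<String> HASH_ORDER = Arrays.asList(
        "Test02Malloc", "Test18UdpServer", "Test18UdpServer.c", "Test01Sum.c",
        "Test05SocketClient", "Test19UdpServer", "2.txt", "..", "2.xml",
        "Test01Sum.txt", "3.txt", "Test05SocketServer.c", "ping",
        "Test18UdpClient", "Test19UdpClient", "Test19UdpClient.c",
        "Test18UdpClient.c", ".", "lost+found", "Test01Sum",
        "Test05SocketServer", "Test01SumStatic", "4.txt", "1.txt",
        "Test05SocketClient.c", "Test19UdpServer.c", "Test19UdpClient02");

    // Names in the order printed by "ls -ail /jerry" in the VM.
    static final List<String> LS_ORDER = Arrays.asList(
        "Test19UdpClient02", "Test19UdpServer.c", "Test05SocketClient.c", "1.txt",
        "4.txt", "Test01SumStatic", "Test05SocketServer", "Test01Sum",
        "lost+found", ".", "Test18UdpClient.c", "Test19UdpClient.c",
        "Test19UdpClient", "Test18UdpClient", "ping", "Test05SocketServer.c",
        "3.txt", "Test01Sum.txt", "Test02Malloc.c", "2.xml", "..", "2.txt",
        "Test19UdpServer", "Test05SocketClient", "Test01Sum.c",
        "Test18UdpServer.c", "Test18UdpServer", "Test02Malloc");

    /** Names that appear in lsOrder but not in hashOrder. */
    static List<String> missingFrom(List<String> hashOrder, List<String> lsOrder) {
        List<String> missing = new ArrayList<>(lsOrder);
        missing.removeAll(hashOrder);
        return missing;
    }

    /** True if lsOrder, reversed and restricted to the names in hashOrder, equals hashOrder. */
    static boolean isReverseOf(List<String> hashOrder, List<String> lsOrder) {
        List<String> reversed = new ArrayList<>(lsOrder);
        Collections.reverse(reversed);
        reversed.retainAll(hashOrder);
        return reversed.equals(hashOrder);
    }

    public static void main(String[] args) {
        System.out.println("reverse of ls order: " + isReverseOf(HASH_ORDER, LS_ORDER));
        System.out.println("only in ls output: " + missingFrom(HASH_ORDER, LS_ORDER));
    }
}
```

Names that appear only in the ls output are reported separately, so a gap in the captured data does not hide the overall pattern.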

The help text of the ls command in the debugging VM is as follows

(initramfs) ls -help
ls: invalid option -- 'h'
BusyBox v1.22.1 (Ubuntu 1:1.22.0-15ubuntu1) multi-call binary.

Usage: ls [-1AaCxdLHFplins] [FILE]...

List directory contents

	-1	One column output
	-a	Include entries which start with .
	-A	Like -a, but exclude . and ..
	-C	List by columns
	-x	List by lines
	-d	List directory entries instead of contents
	-L	Follow symlinks
	-H	Follow symlinks on command line
	-p	Append / to dir entries
	-F	Append indicator (one of */=@|) to entries
	-l	Long listing format
	-i	List inode numbers
	-n	List numeric UIDs and GIDs instead of names
	-s	List allocated blocks

Interestingly, the order obtained on the Ubuntu host is different yet again

root@ubuntu:/jerryDisk/linux-4.10.14# ls -ail images/share/
total 1075
     2 drwxr-xr-x 3 root root   1024 May  4 00:57 .
414752 drwxr-xr-x 5 root root   4096 May  4 00:58 ..
    12 -rw-r--r-- 1 root root      7 May  4 00:57 1.txt
    33 -rw-r--r-- 1 root root      2 Jul 30 19:03 2.txt
    36 -rw-r--r-- 1 root root      0 Jul 30 20:13 2.xml
    34 -rw-r--r-- 1 root root      2 Jul 30 19:03 3.txt
    35 -rw-r--r-- 1 root root      2 Jul 30 19:03 4.txt
    14 -rwxr-xr-x 1 root root   9784 May  4 00:57 Test01Sum
    15 -rw-r--r-- 1 root root    127 May  4 00:57 Test01Sum.c
    17 -rw-r--r-- 1 root root   8828 May  4 00:57 Test01Sum.txt
    16 -rwxr-xr-x 1 root root 913944 May  4 00:57 Test01SumStatic
    18 -rwxr-xr-x 1 root root   9898 Jul 30 19:05 Test02Malloc
    19 -rw-r--r-- 1 root root    112 May  4 00:57 Test02Malloc.c
    20 -rwxr-xr-x 1 root root   9208 May  4 00:57 Test05SocketClient
    21 -rw-r--r-- 1 root root   1036 May  4 00:57 Test05SocketClient.c
    22 -rwxr-xr-x 1 root root   9232 May  4 00:57 Test05SocketServer
    23 -rw-r--r-- 1 root root   1839 May  4 00:57 Test05SocketServer.c
    24 -rwxr-xr-x 1 root root  13656 May  4 00:57 Test18UdpClient
    25 -rw-r--r-- 1 root root   2213 May  4 00:57 Test18UdpClient.c
    26 -rwxr-xr-x 1 root root  13656 May  4 00:57 Test18UdpServer
    27 -rw-r--r-- 1 root root   2139 May  4 00:57 Test18UdpServer.c
    28 -rwxr-xr-x 1 root root   8912 May  4 00:57 Test19UdpClient
    30 -rw-r--r-- 1 root root    790 May  4 00:57 Test19UdpClient.c
    29 -rwxr-xr-x 1 root root   8912 May  4 00:57 Test19UdpClient02
    31 -rwxr-xr-x 1 root root   9008 May  4 00:57 Test19UdpServer
    32 -rw-r--r-- 1 root root    860 May  4 00:57 Test19UdpServer.c
    11 drwx------ 2 root root  12288 May  4 00:56 lost+found
    13 -rwxr-xr-x 1 root root  44168 May  4 00:57 ping

The hash mentioned above is computed in ext4fs_dirhash (fs/ext4/hash.c)

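Where does the file order of readdir/ls matter?

It ties into the question "How does WebappClassloader load classes?" in the post below; its class loading order depends on File.list

40 How a classloader loads a class when several jars on the classpath contain a class with the same qualified name (蓝风9's blog, CSDN)

File.list is implemented on top of FileSystem.list

Its implementation in turn depends on the concrete readdir implementation

End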
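Since File.list makes no promise about ordering, code that needs a stable order has to sort the result itself. A minimal sketch, with SortedListingSketch and sortedNames as hypothetical names:

```java
import java.io.File;
import java.util.Arrays;

final class SortedListingSketch {

    /** Returns the entry names of dir sorted lexicographically, or an empty array if dir cannot be listed. */
    static String[] sortedNames(File dir) {
        // File.list returns names in no guaranteed order, and null if dir is not a readable directory.
        String[] names = dir.list();
        if (names == null) {
            return new String[0];
        }
        Arrays.sort(names);
        return names;
    }

    public static void main(String[] args) {
        File dir = new File(args.length > 0 ? args[0] : ".");
        for (String name : sortedNames(dir)) {
            System.out.println(name);
        }
    }
}
```

With the sort in place, the result no longer depends on the filesystem or on how the directory hashes its entries.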

Reposted from blog.csdn.net/u011039332/article/details/126192001