How to use JuiceFS performance tools for file system analysis and tuning

JuiceFS is a high-performance POSIX file system designed for cloud-native environments, released under the AGPL v3.0 open source license. As a distributed file system built for the cloud, JuiceFS splits any data stored in it into blocks and saves them in object storage (such as Amazon S3) according to certain rules, while persisting the corresponding metadata in a separate database. This architecture lets the storage capacity of JuiceFS scale elastically with the amount of data and store large-scale data reliably. It also supports shared mounting across multiple hosts, enabling data sharing and migration across clouds and regions.

Since the release of v0.13, JuiceFS has added a number of features for performance monitoring and analysis. To some extent, the development team hopes that JuiceFS can not only deliver excellent performance in large-scale distributed computing scenarios, but also cover a much wider range of everyday use cases.

In this article, we start with single-machine applications to see whether applications that normally rely on a local file system can also run on JuiceFS, and analyze their access patterns for targeted tuning.

Test environment

The following tests are all performed on the same Amazon EC2 instance, configured as follows:

  • Server configuration: Amazon c5d.xlarge: 4 vCPUs, 8 GiB memory, 10 Gigabit network, 100 GB SSD
  • JuiceFS: uses a locally deployed Redis as the metadata engine, with S3 in the same region as the server as the object storage (see the setup sketch after this list)
  • EXT4: created directly on the local SSD
  • Data sample: the Redis source code repository is used as the test sample
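For reference, in such an environment the JuiceFS volume would be created and mounted roughly as follows. This is a minimal sketch: the bucket URL, access keys, and volume name are placeholders, not the exact values used in this test:

$ juicefs format --storage s3 \
    --bucket https://mybucket.s3.us-east-1.amazonaws.com \
    --access-key <ACCESS_KEY> --secret-key <SECRET_KEY> \
    localhost myjfs
$ juicefs mount -d localhost /mnt/jfs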

Test project 1: Git

Commonly used Git commands include clone, status, add, diff, etc. Among them, clone is similar to a compilation workload in that it writes a large number of small files. Here we mainly test the status command.

Clone the repository to local EXT4 and JuiceFS respectively, then measure the time taken by the git status command (a reproduction sketch follows the results):

  • EXT4: 0m0.005s
  • JuiceFS: 0m0.091s
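These timings can be reproduced with ordinary shell commands; a rough sketch, in which the repository URL and paths are illustrative:

$ git clone https://github.com/redis/redis.git /mnt/jfs/redis
$ cd /mnt/jfs/redis
$ time git status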

There is an order-of-magnitude difference in elapsed time. With a sample as small as this test repository, the difference is tiny and almost imperceptible to users. But with a larger repository the gap gradually becomes apparent. For example, if both times were multiplied by 100, the local file system would take about 0.5s, still within an acceptable range, while JuiceFS would take about 9.1s, a delay that users can clearly feel. To figure out where most of the time goes, we can use the profile tool provided by the JuiceFS client:

$ juicefs profile /mnt/jfs

For this kind of analysis, it is more practical to record the access log first and then replay it in playback mode. JuiceFS exposes a virtual file named .accesslog at the root of the mount point, which streams every file system operation while it is being read:

$ cat /mnt/jfs/.accesslog > a.log
# In another session: run git status
# Press Ctrl-C to stop cat
$ juicefs profile --interval 0 a.log
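Besides the profile tool, the recorded log can also be summarized manually with standard shell tools, for example to count how many requests of each type were issued. A rough sketch, assuming the operation name is the fourth whitespace-separated field of each log line (the exact layout may vary between JuiceFS versions):

$ awk '{print $4}' a.log | sort | uniq -c | sort -rn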

The result is as follows:

The output clearly shows a large number of lookup requests. We can extend the kernel's metadata cache time by adjusting the FUSE mount parameters, for example by remounting the file system as follows:

$ juicefs mount -d --entry-cache 300 localhost /mnt/jfs

Then we analyze again with the profile tool; the results are as follows:

As you can see, lookup requests have been reduced a lot, but they have all turned into getattr requests, so attribute caching is needed as well:

$ juicefs mount -d --entry-cache 300 --attr-cache 300 localhost /mnt/jfs

Testing again at this point, the status command's elapsed time dropped to 0m0.034s, and the profile tool reports the following:

The lookup requests that were the most time-consuming at the beginning have been reduced dramatically, and readdir requests have become the new bottleneck. We can also try setting --dir-entry-cache, but the improvement may be less noticeable.
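For completeness, a mount command with all three kernel metadata caches enabled might look like the following. The 300-second value simply matches the one used above; longer cache times improve performance at the cost of metadata freshness when the volume is shared by multiple clients:

$ juicefs mount -d --entry-cache 300 --attr-cache 300 --dir-entry-cache 300 localhost /mnt/jfs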

Test project 2: Make

Compilation times for large projects are often measured in hours, so compile-time performance matters even more. Still using the Redis project as an example, the measured compilation times are as follows (a reproduction sketch comes after the results):

  • EXT4: 0m29.348s
  • JuiceFS: 2m47.335s
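The measurement itself is straightforward; a rough sketch follows, where the -j4 flag matching the 4 vCPUs of the test machine is our assumption rather than the exact flags behind the published numbers:

$ cd /mnt/jfs/redis
$ make distclean
$ time make -j4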

We tried increasing the metadata cache parameters, and the overall time was reduced by about 10s. The profile tool's analysis shows the following:

Clearly, data reads and writes are now the key to performance. We can use the stats tool for further analysis:

$ juicefs stats /mnt/jfs

Part of the output is shown below:

The figure above shows that the ops of the fuse layer is close to that of the meta layer, but its average latency is far higher, so the main bottleneck can be judged to be on the object storage side. This is not hard to explain: the early stages of compilation generate a large number of temporary files, which are read back in the later stages, and the latency of typical object storage makes it difficult to serve such a workload directly. Fortunately, JuiceFS provides a writeback mode that first stages writes in a cache on the local disk, which suits compilation scenarios well (data is then uploaded to object storage asynchronously, trading some durability for write performance):

$ juicefs mount -d --entry-cache 300 --attr-cache 300 --writeback localhost /mnt/jfs

With this, the total compilation time dropped to 0m38.308s, very close to that of the local disk. The stats tool's monitoring during the later stages of compilation shows:

Basically all read requests are served from blockcache, the local data cache, and there is no longer any need to access the object storage; the ops statistics on the fuse and meta sides have also improved greatly, in line with expectations.
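One related tip: the writeback cache lives under the client's local cache directory, so if that default location is not on the fast local SSD, it can be relocated with --cache-dir. A sketch, in which the cache path is an assumption:

$ juicefs mount -d --writeback --cache-dir /mnt/ssd/jfscache \
    --entry-cache 300 --attr-cache 300 localhost /mnt/jfs

Keep in mind that with --writeback, data not yet uploaded to object storage exists only in this local cache, so the cache directory should sit on reliable storage.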

Summary

This article takes Git repository management and Make compilation, two tasks at which local file systems excel, evaluates how they perform on JuiceFS storage, and uses the profile and stats tools that ship with JuiceFS to analyze them and perform targeted optimization by adjusting the file system mount parameters.

There is no doubt that local file systems and distributed file systems such as JuiceFS differ by nature, and their application scenarios are also quite different. This article chose two atypical scenarios simply to show how to tune JuiceFS performance in different situations, in the hope of sparking further ideas. If you have relevant thoughts or experiences, please share and discuss them in the JuiceFS forum or user groups.

Recommended reading: Zhihu x JuiceFS: Using JuiceFS to Accelerate Flink Container Startup

Project address: https://github.com/juicedata/juicefs. If you find it helpful, please follow us! (0ᴗ0✿)
