JuiceFS Performance Evaluation Guide

JuiceFS is a high-performance POSIX file system designed for cloud-native environments. Data stored in JuiceFS is split into data blocks and stored in object storage (such as Amazon S3), while the corresponding metadata is persisted in a separate database. This architecture allows the storage space of JuiceFS to scale elastically with the amount of data and to store large-scale data reliably. It also supports shared mounting across multiple hosts, enabling data sharing and migration across clouds and regions.

In actual operation, JuiceFS performance varies with hardware, software, system configuration, file size, and other factors. We previously shared "How to use JuiceFS performance tools for analysis and tuning"; this article goes a step further and introduces how to perform an accurate performance evaluation of JuiceFS, hoping to help you.

Foreword

Before doing performance testing, it's a good idea to write down a rough description of the usage scenario, including:

  1. What application will be connected? For example Apache Spark, PyTorch, or a program you wrote yourself
  2. The resource configuration for running the application, including CPU, memory, network, and the number of nodes
  3. The estimated data size, including the number and size of files
  4. The file size and access pattern (large or small files, sequential or random access)
  5. The performance requirements, such as the amount of data to be written or read per second, the QPS of accesses, or the latency of operations

The clearer and more detailed the above description is, the easier it will be to formulate an appropriate test plan and identify the performance indicators that need attention, so as to determine the application's requirements for each aspect of the storage system, including the JuiceFS metadata engine configuration, network bandwidth, and configuration parameters. Of course, it is not easy to write all of this down clearly at the beginning, and some of it can be clarified gradually during testing; but at the end of a complete test, the usage scenario description, the corresponding test methods, the test data, and the test results should all be complete.

Even if the above is not yet clear, it does not matter: the built-in test tool of JuiceFS can produce the core indicators of single-machine benchmark performance with a single command. This article also introduces two built-in performance analysis tools that, when doing more complex tests, can help you analyze the reasons behind JuiceFS performance simply and clearly.

Quick start with performance testing

The following example describes the basic usage of the bench tool built into JuiceFS.

Environment configuration

  • Test host: one Amazon EC2 c5.xlarge
  • OS: Ubuntu 20.04.1 LTS (Kernel 5.4.0-1029-aws)
  • Metadata engine: Redis 6.2.3, with its storage (dir) configured on the system disk
  • Object Storage: Amazon S3
  • JuiceFS version: 0.17-dev (2021-09-23 2ec2badf)

JuiceFS Bench

The juicefs bench command helps you quickly complete a single-machine performance test and judge from the results whether the environment configuration and performance are normal. Assuming you have mounted JuiceFS at /mnt/jfs (if you need help initializing and mounting JuiceFS, please refer to the Quick Start Guide; a minimal sketch also follows the command below), execute the following command, setting the -p parameter to the number of CPU cores of the test machine (recommended):

$ juicefs bench /mnt/jfs -p 4
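
For reference, creating and mounting a test file system might look like the following sketch (the Redis address, bucket URL, and volume name "myjfs" are placeholders; object storage credentials are supplied via flags or environment variables):

# a minimal sketch, assuming a local Redis instance for metadata and an S3 bucket for data
$ juicefs format --storage s3 --bucket https://mybucket.s3.amazonaws.com redis://127.0.0.1:6379/1 myjfs
# mount the volume in the background at /mnt/jfs
$ juicefs mount -d redis://127.0.0.1:6379/1 /mnt/jfs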

The test results display each performance indicator in green, yellow, or red. If there are red indicators in your results, please check the relevant configuration first; if you need help, you can describe your problem in detail in GitHub Discussions.

The specific flow of the juicefs bench benchmark is as follows (its implementation logic is very simple; if you are interested in the details, you can read the source code directly):

  1. N threads concurrently write one 1 GiB large file each, with an IO size of 1 MiB
  2. N threads concurrently read the 1 GiB large files written earlier, with an IO size of 1 MiB
  3. N threads concurrently write 100 small files of 128 KiB each, with an IO size of 128 KiB
  4. N threads concurrently read the 128 KiB small files written earlier, with an IO size of 128 KiB
  5. N threads concurrently stat the 128 KiB small files written earlier
  6. Clean up the temporary test directory

The concurrency N is specified by the -p parameter of the bench command.
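
As a rough illustration (not the actual implementation), the big-file write stage with -p 4 is approximately equivalent to the following shell sketch:

# four concurrent writers, each writing one 1 GiB file in 1 MiB blocks
$ for i in 0 1 2 3; do dd if=/dev/zero of=/mnt/jfs/bigfile.$i bs=1M count=1024 & done; wait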

Here is a performance comparison with several common storage types provided by AWS:

  • EFS 1TiB capacity, read 150MiB/s, write 50MiB/s, the price is $0.08/GB-month
  • EBS st1 is a throughput-optimized HDD, with a maximum throughput of 500MiB/s, a maximum IOPS (1MiB I/O) of 500, and a maximum capacity of 16TiB. The price is $0.045/GB-month
  • EBS gp2 is a general-purpose SSD with a maximum throughput of 250MiB/s, a maximum IOPS (16KiB I/O) of 16000, and a maximum capacity of 16TiB. The price is $0.10/GB-month

It is not difficult to see that in the above test, the sequential read and write capability of JuiceFS is significantly better than that of AWS EFS, and its throughput also exceeds the commonly used EBS volumes. However, writing small files is not fast, because every file written must be persisted to S3, and calling the object storage API usually carries a fixed overhead of 10-30 ms.

Note 1: The performance of Amazon EFS scales linearly with capacity (see the AWS official documentation), which makes it unsuitable for scenarios with a small data volume but high throughput requirements.

Note 2: Prices refer to the AWS US East (Ohio) Region; prices in other regions differ slightly.

Note 3: The above data comes from the AWS official documentation, and the performance figures are maximum values. The actual performance of EBS depends on the volume capacity and the type of EC2 instance it is attached to: generally, the larger the capacity and the higher the EC2 instance specification, the better the EBS performance, but never beyond the maximums above.

Performance observation and analysis tools

Next, two performance observation and analysis tools are introduced; they are essential tools in the process of testing, using, and tuning JuiceFS.

JuiceFS Stats

juicefs stats is a tool for real-time statistics of JuiceFS performance indicators. Similar to the dstat command, it displays the metric changes of the JuiceFS client in real time (see the performance monitoring documentation for details and usage). While juicefs bench is running, execute the following command in another session:

$ juicefs stats /mnt/jfs --verbosity 1

The output can be understood more easily by comparing it with the benchmark flow described above. The specific meanings of the indicators are as follows:

  • usage
    • cpu: CPU consumed by the JuiceFS process
    • mem: physical memory occupied by the JuiceFS process
    • buf: size of the internal read/write buffer of the JuiceFS process, limited by the --buffer-size mount option
    • cache: an internal indicator that can be ignored
  • fuse
    • ops/lat: number of requests processed by the FUSE interface per second and their average latency (in milliseconds, same below)
    • read/write: bandwidth of read and write requests handled by the FUSE interface per second
  • meta
    • ops/lat: number of requests processed by the metadata engine per second and their average latency (note that requests served directly from the cache are not counted, so as to better reflect the time the client spends interacting with the metadata engine)
    • txn/lat: number of write transactions processed by the metadata engine per second and their average latency (read-only requests such as getattr count toward ops but not txn)
    • retry: number of times per second the metadata engine retries write transactions
  • blockcache
    • read/write: read and write traffic per second of the client's local data cache
  • object
    • get/get_c/lat: bandwidth of read requests processed by the object storage per second, the number of requests, and their average latency
    • put/put_c/lat: bandwidth of write requests processed by the object storage per second, the number of requests, and their average latency
    • del_c/lat: number of delete requests processed by the object storage per second and their average latency

JuiceFS Profile

On the one hand, juicefs profile outputs all access logs of the JuiceFS client in real time, including information about every request; on the other hand, it can also replay and aggregate JuiceFS access logs, so that users can intuitively understand what JuiceFS is doing (see the performance diagnosis documentation for detailed instructions and usage). While juicefs bench is running, execute the following command in another session:

$ cat /mnt/jfs/.accesslog > access.log
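
Each captured line looks roughly like the following illustrative example (format only; the values here are made up), recording the time, process information, the operation with its arguments, the result, and the elapsed time:

2021.09.23 13:15:52.002547 [uid:0,gid:0,pid:1234] write (17669,8192,1017856): OK <0.000010>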

Here .accesslog is a virtual file: it normally produces no data, and JuiceFS access log entries are output only while it is being read (for example by cat). Use Ctrl-C to end the cat command when finished, and then run:

$ juicefs profile access.log --interval 0

The --interval parameter sets the sampling interval of the access log. Setting it to 0 quickly replays the specified log file and generates statistics.
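
Incidentally, juicefs profile can also be pointed directly at the mount point to aggregate statistics in real time instead of replaying a saved log:

$ juicefs profile /mnt/jfs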

As can be seen from the description of the benchmark flow above, a total of (1 + 100) * 4 = 404 files were created during this test, and each file went through the process of "create → write → close → open → read → close → delete", so there are in total:

  • 404 create, open, and unlink requests
  • 808 flush requests: flush is called automatically every time a file is closed, and each file is closed twice (once after writing and once after reading)
  • 33168 write/read requests: each large file is written with 1024 IOs of 1 MiB, but the default maximum request size at the FUSE layer is 128 KiB, so each application IO is split into 8 FUSE requests, giving (1024 * 8 + 100) * 4 = 33168 requests in total; reads are similar, and the count is the same

All of the above values correspond exactly to the profile results. The results also show that the average latency of write is very small (45 microseconds), with most of the time spent in flush. This is because JuiceFS writes go to a memory buffer first by default, and the data is uploaded to the object storage by flush when the file is closed, which is as expected.

Other Test Tool Configuration Examples

Fio stand-alone performance test

Fio is a performance testing tool commonly used in the industry. After completing JuiceFS bench, you can use it for more complex performance tests.

Environment configuration

Consistent with the JuiceFS Bench test environment.

Test tasks

Perform the following four Fio tasks to perform sequential write, sequential read, random write, and random read tests:

# Sequential
$ fio --name=jfs-test --directory=/mnt/jfs --ioengine=libaio --rw=write --bs=1m --size=1g --numjobs=4 --direct=1 --group_reporting
$ fio --name=jfs-test --directory=/mnt/jfs --ioengine=libaio --rw=read --bs=1m --size=1g --numjobs=4 --direct=1 --group_reporting

# Random
$ fio --name=jfs-test --directory=/mnt/jfs --ioengine=libaio --rw=randwrite --bs=1m --size=1g --numjobs=4 --direct=1 --group_reporting
$ fio --name=jfs-test --directory=/mnt/jfs --ioengine=libaio --rw=randread --bs=1m --size=1g --numjobs=4 --direct=1 --group_reporting

Parameter Description:

  • --name: user-specified test name, which affects the test file names
  • --directory: the test directory
  • --ioengine: how IO is issued during the test; libaio is usually a good choice
  • --rw: commonly read, write, randread, or randwrite, representing sequential read, sequential write, random read, and random write respectively
  • --bs: the size of each IO
  • --size: the total IO size per thread, usually equal to the size of the test file
  • --numjobs: the number of concurrent test threads; by default each thread operates on a separate test file
  • --direct: add the O_DIRECT flag to bypass system buffering, which makes the test results more stable and accurate

The result is as follows:

# Sequential
WRITE: bw=703MiB/s (737MB/s), 703MiB/s-703MiB/s (737MB/s-737MB/s), io=4096MiB (4295MB), run=5825-5825msec
READ: bw=817MiB/s (856MB/s), 817MiB/s-817MiB/s (856MB/s-856MB/s), io=4096MiB (4295MB), run=5015-5015msec

# Random
WRITE: bw=285MiB/s (298MB/s), 285MiB/s-285MiB/s (298MB/s-298MB/s), io=4096MiB (4295MB), run=14395-14395msec
READ: bw=93.6MiB/s (98.1MB/s), 93.6MiB/s-93.6MiB/s (98.1MB/s-98.1MB/s), io=4096MiB (4295MB), run=43773-43773msec
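
To examine small-IO behavior as well, you can vary --bs (and --size); for example, a 4 KiB random read sketch (the parameters are illustrative and were not run in this test):

$ fio --name=jfs-test --directory=/mnt/jfs --ioengine=libaio --rw=randread --bs=4k --size=1g --numjobs=4 --direct=1 --group_reporting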

Vdbench multi-machine performance test

Vdbench is also a common file system evaluation tool in the industry, and supports multi-machine concurrent testing well.

Test environment

Similar to the JuiceFS Bench test environment, except that two more hosts with the same configuration are added, for a total of three.

Preparation

You need to install vdbench in the same path on each node:

  1. Download version 50406 from the official website
  2. Install Java: apt-get install openjdk-8-jre
  3. Verify the installation: ./vdbench -t (a sample setup transcript follows this list)
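
On each node, the setup might look like the following transcript (the archive name and install path are assumptions; adjust them to where you actually place vdbench):

# assumed setup transcript; archive name and path are placeholders
$ apt-get install -y openjdk-8-jre
$ unzip vdbench50406.zip -d /root/vdbench50406
$ cd /root/vdbench50406 && ./vdbench -t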

Then, assuming the three nodes are named node0, node1, and node2, create a configuration file on node0 as follows (for testing reads and writes of a large number of small files):

$ cat jfs-test
hd=default,vdbench=/root/vdbench50406,user=root
hd=h0,system=node0
hd=h1,system=node1
hd=h2,system=node2

fsd=fsd1,anchor=/mnt/jfs/vdbench,depth=1,width=100,files=3000,size=128k,shared=yes

fwd=default,fsd=fsd1,operation=read,xfersize=128k,fileio=random,fileselect=random,threads=4
fwd=fwd1,host=h0
fwd=fwd2,host=h1
fwd=fwd3,host=h2

rd=rd1,fwd=fwd*,fwdrate=max,format=yes,elapsed=300,interval=1

Parameter Description:

  • vdbench=/root/vdbench50406: specifies the installation path of the vdbench tool
  • anchor=/mnt/jfs/vdbench: specifies the path to run the test task on each node
  • depth=1,width=100,files=3000,size=128k: defines the file tree for the test, that is, 100 directories are created under the test directory, each containing 3,000 files of 128 KiB, 300,000 files in total
  • operation=read,xfersize=128k,fileio=random,fileselect=random: defines the actual test task, that is, randomly selecting files and issuing 128 KiB read requests
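
With the configuration ready, launch the test from node0 by passing the parameter file to vdbench (using the install path assumed above):

$ /root/vdbench50406/vdbench -f jfs-test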

The result is as follows:

FILE_CREATES        Files created:                              300,000        498/sec
READ_OPENS          Files opened for read activity:             188,317        627/sec

The overall system creates 128 KiB files at 498 per second and reads at 627 per second.

Summary

This article shares the overall process of evaluating JuiceFS performance, covering environment configuration, tool introductions, and example tests. An accurate performance evaluation can help tune JuiceFS to better fit your application scenario. Finally, everyone is welcome to record and share their testing process and results with the JuiceFS forum or user group.

Recommended reading: Zhihu x JuiceFS: Using JuiceFS to Accelerate Flink Container Startup

If this article helps, please follow our project Juicedata/JuiceFS! (0ᴗ0✿)
