Amazon S3 storage: exploring the relationship between AWS CLI upload speed and the size of each file

1. Background

The company recently unified its storage environment, converting all FTP file storage over to Ceph. A business group reported that bulk-uploading 300,000 files totaling 1.3 GB took only about 16 minutes over FTP, but after switching to Ceph storage the same upload takes over an hour, i.e. only about 369 KB/s. They asked how to improve this.

Related material: stress tests of the company's internal network and storage environments, and a research report on read/write performance testing of the Ceph storage system.

We upload files through the S3 interface, i.e. the Ceph RGW interface covered in the reports above, which tested at roughly 20 MB/s.

2. First, verification with a Java program

Uploading this data set was indeed very slow, with about a tenfold gap compared with ordinary file uploads. Resizing the aws-java-s3 SDK's thread pool had no obvious effect; I used jconsole to observe the threads.
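For concreteness, here is a minimal sketch of what that adjustment can look like, assuming the SDK v1 TransferManager (the region, credentials, and path-style setting below are placeholder assumptions for a Ceph RGW environment, not values from the original test):

import com.amazonaws.auth.AWSStaticCredentialsProvider;
import com.amazonaws.auth.BasicAWSCredentials;
import com.amazonaws.client.builder.AwsClientBuilder;
import com.amazonaws.services.s3.AmazonS3;
import com.amazonaws.services.s3.AmazonS3ClientBuilder;
import com.amazonaws.services.s3.transfer.TransferManager;
import com.amazonaws.services.s3.transfer.TransferManagerBuilder;
import java.util.concurrent.Executors;

public class S3UploadCheck {
    public static void main(String[] args) throws Exception {
        // Point the client at the Ceph RGW endpoint; path-style access is
        // an assumption typical for RGW, not stated in the article.
        AmazonS3 s3 = AmazonS3ClientBuilder.standard()
                .withEndpointConfiguration(new AwsClientBuilder.EndpointConfiguration(
                        "http://oss.ts-pfecs.epay", "us-east-1")) // region is a placeholder
                .withPathStyleAccessEnabled(true)
                .withCredentials(new AWSStaticCredentialsProvider(
                        new BasicAWSCredentials("ACCESS_KEY", "SECRET_KEY"))) // placeholders
                .build();

        // Resize the TransferManager's upload thread pool; 50 mirrors the
        // CLI concurrency setting used later in this article.
        TransferManager tm = TransferManagerBuilder.standard()
                .withS3Client(s3)
                .withExecutorFactory(() -> Executors.newFixedThreadPool(50))
                .build();

        // Upload a local directory recursively, comparable to `aws s3 cp --recursive`.
        tm.uploadDirectory("cu-ibas", "oss/public",
                new java.io.File("C:\\app\\qolfile"), true).waitForCompletion();
        tm.shutdownNow();
    }
}

As noted above, in my test enlarging the pool alone made no obvious difference, which foreshadows the per-file overhead measured below.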

3. Quick verification with the AWS CLI

The AWS CLI is written in Python and has fairly complete logging.
Installation guide: https://docs.amazonaws.cn/cli/latest/userguide/install-windows.html
Adjusting concurrency and configuring logging: https://amazonaws-china.com/cn/blogs/china/amazon-s3-depth-of-practice-series-s3-cli-depth-parsing-and-performance-testing/
To batch-generate files of different sizes I used Bandizip's split-archive volumes (7-Zip only supports volume sizes of 1 MB and up): http://www.bandisoft.com/bandizip/ (a scripted alternative is sketched below).
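As a scripted alternative to an archiver, a few lines of Java can batch-generate fixed-size test files; a throwaway sketch where the directory, file count, and file size are arbitrary illustrative values:

import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Random;

public class MakeTestFiles {
    public static void main(String[] args) throws IOException {
        Path dir = Paths.get("C:\\app\\qolfile"); // same directory uploaded below
        Files.createDirectories(dir);

        int count = 1000;            // number of files to generate
        int sizeBytes = 10 * 1024;   // per-file size, e.g. 10 KB

        byte[] payload = new byte[sizeBytes];
        new Random().nextBytes(payload); // random bytes so nothing compresses away

        for (int i = 0; i < count; i++) {
            Files.write(dir.resolve("test-" + i + ".bin"), payload);
        }
    }
}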
I set the number of concurrent requests and the multipart segment sizes in the CLI's s3 configuration:
s3 =
  max_concurrent_requests = 50
  multipart_threshold = 10MB
  multipart_chunksize = 6MB
Upload command:
aws s3 cp C:\app\qolfile\ s3://cu-ibas/oss/public --endpoint-url http://oss.ts-pfecs.epay --recursive
The command's parameters are as follows:
cp: copy files
C:\app\qolfile\: the local directory
s3://cu-ibas/oss/public: the remote S3 address, comprising the bucket name and path
--endpoint-url http://oss.ts-pfecs.epay: specifies the remote endpoint address
--recursive: recursively upload the files inside the folder

Upload while saving the log to a file:
aws s3 cp C:\app\qolfile\ s3://cu-ibas/oss/public --endpoint-url http://oss.ts-pfecs.epay --recursive --debug > upload.txt 2>&1

An excerpt from upload.txt:
2019-07-17 15:34:50,100 - ThreadPoolExecutor-1_2 - s3transfer.tasks - DEBUG - Executing task UploadSubmissionTask(transfer_id=2, {'transfer_future': <s3transfer.futures.TransferFuture object at 0x0000026005c559e8>}) with kwargs {'client': <botocore.client.S3 object at 0x0000026005b972b0>, 'config': <s3transfer.manager.TransferConfig object at 0x0000026005bf8e48>, 'osutil': <s3transfer.utils.OSUtils object at 0x0000026005bf8eb8>, 'request_executor': <s3transfer.futures.BoundedExecutor object at 0x0000026005c43080>, 'transfer_future': <s3transfer.futures.TransferFuture object at 0x0000026005c559e8>}
2019-07-17 15:34:50,101 - ThreadPoolExecutor-1_3 - s3transfer.tasks - DEBUG - UploadSubmissionTask(transfer_id=3, {'transfer_future': <s3transfer.futures.TransferFuture object at 0x0000026005c6b128>}) about to wait for the following futures []
Thread names such as ThreadPoolExecutor-1_49 and ThreadPoolExecutor-1_50 in the log confirm that the concurrent request setting took effect. I then measured upload speeds for different file sizes:

Split-volume file size    Observed upload speed
1MB 1.5MB/s
500KB 800KB/s
10KB 390KB/s
1KB 100KB/s

Clearly the size of each individual file affects the upload speed. If every file is 1MB or larger, an appropriately configured S3 upload can fill the available bandwidth. If each file is under 1KB, the upload is very slow: at 100KB/s aggregate with 1KB files, roughly 100 requests complete per second across 50 concurrent connections, i.e. about half a second of per-request overhead regardless of payload size.

4. Summary

In this case, given a data set of small files, there is no way to raise the upload speed directly. At the application layer you can pack files and upload concurrently (packing while uploading) to speed up batch processing, i.e. use a producer-consumer model. The queue can be an in-memory queue, or a Redis queue can be used to hold the consumers' tasks; a sketch follows.
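A minimal in-memory sketch of that producer-consumer idea, assuming a java.util.concurrent.BlockingQueue and the AWS SDK for Java v1 TransferManager; the batch size, archive naming, and zipBatch helper are illustrative choices, not the author's actual design:

import com.amazonaws.services.s3.AmazonS3ClientBuilder;
import com.amazonaws.services.s3.transfer.TransferManager;
import com.amazonaws.services.s3.transfer.TransferManagerBuilder;
import java.io.File;
import java.io.FileOutputStream;
import java.nio.file.Files;
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.zip.ZipEntry;
import java.util.zip.ZipOutputStream;

public class PackAndUpload {
    // Producer zips batches of small files into archives and queues them;
    // the consumer drains the queue and uploads each archive, so packing
    // and uploading overlap.
    private static final BlockingQueue<File> QUEUE = new ArrayBlockingQueue<>(16);
    private static final File POISON = new File("EOF"); // end-of-work marker

    public static void main(String[] args) throws Exception {
        // Assumes the directory exists and is non-empty.
        File[] smallFiles = new File("C:\\app\\qolfile").listFiles();
        TransferManager tm = TransferManagerBuilder.standard()
                .withS3Client(AmazonS3ClientBuilder.defaultClient()) // endpoint/creds omitted
                .build();

        Thread consumer = new Thread(() -> {
            try {
                for (File zip = QUEUE.take(); zip != POISON; zip = QUEUE.take()) {
                    tm.upload("cu-ibas", "oss/public/" + zip.getName(), zip)
                      .waitForCompletion();
                }
            } catch (Exception e) {
                throw new RuntimeException(e);
            }
        });
        consumer.start();

        int batchSize = 1000; // files per archive, keeping each object well above 1MB
        for (int i = 0; i < smallFiles.length; i += batchSize) {
            int end = Math.min(i + batchSize, smallFiles.length);
            QUEUE.put(zipBatch(List.of(smallFiles).subList(i, end), i / batchSize));
        }
        QUEUE.put(POISON);
        consumer.join();
        tm.shutdownNow();
    }

    // Pack one batch of files into a temporary zip archive.
    private static File zipBatch(List<File> batch, int index) throws Exception {
        File zip = Files.createTempFile("batch-" + index + "-", ".zip").toFile();
        try (ZipOutputStream out = new ZipOutputStream(new FileOutputStream(zip))) {
            for (File f : batch) {
                out.putNextEntry(new ZipEntry(f.getName()));
                Files.copy(f.toPath(), out);
                out.closeEntry();
            }
        }
        return zip;
    }
}

Swapping the in-memory queue for a Redis list would let the packing producer and the uploading consumers run on different machines, as suggested above.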

 
