YoMo
YoMo is an open source real-time edge computing gateway, development framework, and microservice platform. Its communication layer is based on the QUIC protocol (updated to Draft-31 on 2020-09-25), which makes better use of next-generation low-latency networks such as 5G. The yomo-codec codec, designed for streaming computing, can greatly improve the throughput of computing services. With the plug-in development model, an IoT real-time edge computing system can be up and running in 5 minutes. YoMo has been deployed in the industrial Internet field.
Official website: https://yomo.run
Introduction to YoMo Codec
yomo-codec-golang is a Go implementation of the YoMo Codec SPEC. It encodes and decodes basic data types and gives YoMo the codec tooling that its message processing relies on. You can extend it to handle more data types, or apply it to other frameworks that need a codec.
Project introduction: README.md
Why do you need YoMo-Codec?
In HTTP communication, JSON is the usual message codec: its format is simple, it is easy to read and write, and it is supported in many languages, which makes it very popular in Internet applications. So why develop YoMo Codec to support YoMo applications?
- YoMo processes messages as a stream and extracts the monitored key-value pairs for business logic. With JSON, the decoder has to wait for the complete data packet, deserialize it into an object, and only then extract the value of the wanted key. YoMo Codec instead describes the object as a group of TLV structures. While decoding a packet, the T (tag) tells the decoder early on whether the current key is a monitored one, so it can skip directly to the next TLV structure without decoding data that is not monitored, which improves decoding efficiency.
- JSON decoding usually relies heavily on reflection, which hurts performance. Because YoMo Codec only decodes the key-values that are actually monitored, far less reflection is used.
- In the industrial Internet, or in network applications where computing resources are tightly constrained, the same encoding and decoding work needs less CPU, so limited computing resources can be used more fully.
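The skip-ahead behaviour described in the list above can be sketched in a few lines of Go. This is a minimal illustration under an assumed layout (1-byte tag, 1-byte length, then the value bytes); it is not the actual Y3 wire format, and `findValue` is a hypothetical helper, not part of yomo-codec-golang.

```go
package main

import (
	"errors"
	"fmt"
)

// findValue scans simplified TLV records (1-byte tag, 1-byte length, value)
// and returns the value of the first record whose tag matches.
// Records with other tags are stepped over without decoding their values.
// NOTE: this layout is an assumption for illustration, not the Y3 format.
func findValue(buf []byte, tag byte) ([]byte, error) {
	for pos := 0; pos < len(buf); {
		if pos+2 > len(buf) {
			return nil, errors.New("truncated header")
		}
		t, length := buf[pos], int(buf[pos+1])
		start := pos + 2
		end := start + length
		if end > len(buf) {
			return nil, errors.New("truncated value")
		}
		if t == tag {
			return buf[start:end], nil
		}
		pos = end // jump straight over the unmonitored value
	}
	return nil, errors.New("tag not found")
}

func main() {
	buf := []byte{
		0x01, 0x02, 'h', 'i',
		0x02, 0x03, 'a', 'b', 'c',
		0x03, 0x01, '!',
	}
	v, err := findValue(buf, 0x02)
	fmt.Println(string(v), err)
}
```

The decoder reads only two header bytes for each record it does not care about, which is the property a streaming decoder needs and a whole-document JSON parse lacks.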
This performance test verifies that YoMo Codec decodes data faster and consumes fewer resources than JSON, and so gives YoMo more real-time, efficient, low-loss message processing.
Test introduction
1. Test method
- The tests use Go benchmarks in both serial and parallel modes; the parallel mode shows performance when CPU resources are fully utilized.
- The test data packets are generated by the program, which guarantees that the Codec and JSON tests use exactly the same key-value pairs and values.
- The test data comes in four sizes: 3, 16, 32, and 63 key-value pairs. The monitored key is the one in the middle position; for example, K08 means the value of the 8th key is monitored. This gives the following dimensions, which are used as labels in the result graphs:

| Symbol | Number of key-value pairs | Monitored key position |
| --- | --- | --- |
| C63-K32 | 63 pairs | Extract the value of the 32nd key |
| C32-K16 | 32 pairs | Extract the value of the 16th key |
| C16-K08 | 16 pairs | Extract the value of the 8th key |
| C03-K02 | 3 pairs | Extract the value of the 2nd key |

- The test results include:
- A performance comparison for decoding a data packet and extracting the value of the monitored key.
- A CPU time comparison for the same decode-and-extract scenario.
2. Data structure
- Structure of the Y3 test data:

    0x80 0x01 value ... 0x3f value

- Structure of the JSON test data:

    { "k1": value, ... "k63": value }
3. Data processing logic
4. Test project
- The code for this test report is available from the yomo-y3-stress-testing project.
- Main code structure (only the files directly related to this test are listed):
5. Test environment
- Hardware environment:
- CPU: 2.6 GHz 6-core Intel Core i7, GOMAXPROCS=12
- RAM: 32GB
- Hard Disk: SSD
- Software Environment:
- macOS Catalina
- go version go1.14.1 darwin/amd64
- yomo-y3-stress-testing
Benchmark test
1. Serial test process
- Tested code: ./internal/decoder/report_serial/report_benchmark_test.go, for example:

    // Benchmark for YoMo Codec Y3
    func Benchmark_Codec_C63_K32(b *testing.B) {
        var key byte = 0x20
        data := generator.NewCodecTestData().GenDataBy(63)
        b.ResetTimer()
        for i := 0; i < b.N; i++ {
            if decoder.TakeValueFromCodec(key, data) == nil {
                panic(errors.New("take is failure"))
            }
        }
    }

    // Benchmark for JSON
    func Benchmark_Json_C63_K32(b *testing.B) {
        key := "k32"
        data := generator.NewJsonTestData().GenDataBy(63)
        data = append(data, decoder.TokenEnd)
        b.ResetTimer()
        for i := 0; i < b.N; i++ {
            if decoder.TakeValueFromJson(key, data) == nil {
                panic(errors.New("take is failure"))
            }
        }
    }

- Benchmark_Codec_C63_K32: a serial benchmark that extracts the value of the 32nd key from a data set of 63 key-value pairs.
- Default: GOMAXPROCS=12
- Start the test script: ./internal/decoder/report_serial/report_benchmark_test.sh

    temp_file="../../../docs/temp.out"
    report_file="../../../docs/report.out"
    go test -bench=. -benchtime=3s -benchmem -run=none | grep Benchmark > ${temp_file} \
      && echo 'finished bench' \
      && cat ${temp_file} \
      && cat ${temp_file} | awk '{print $1,$3}' | awk -F "_" '{print $2,$3"-"substr($4,1,3),substr($4,7)}' | awk -v OFS=, '{print $1,$2,$3}' > ${report_file} \
      && echo 'finished analyse' \
      && cat ${report_file}

  The script runs the benchmarks in report_benchmark_test.go and saves the result set to ./docs/report.out.
- Generate the result graph: ./docs/report_graphics.ipynb

    python --version   # Python version > 3.2.x
    pip install runipy
    bar_ylim=70000 barh_xlim=20 runipy ./report_graphics.ipynb

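The awk pipeline in the test script turns each benchmark output line into a `codec,dimension,ns/op` record. As a rough equivalent, here is the same transformation sketched in Go. `toCSV` is a hypothetical helper, and the iteration count in the sample line is made up for illustration; only the name format and the 2077 ns/op figure come from this report.

```go
package main

import (
	"fmt"
	"strings"
)

// toCSV converts one `go test -bench` output line such as
//
//	Benchmark_Codec_C63_K32-12   1730416   2077 ns/op
//
// into "Codec,C63-K32,2077". It returns false for lines that do not
// follow the Benchmark_<codec>_<count>_<key> naming scheme.
func toCSV(line string) (string, bool) {
	fields := strings.Fields(line)
	if len(fields) < 3 {
		return "", false
	}
	name := fields[0]
	// Drop the "-<GOMAXPROCS>" suffix that go test appends to the name.
	if i := strings.LastIndex(name, "-"); i >= 0 {
		name = name[:i]
	}
	parts := strings.Split(name, "_")
	if len(parts) != 4 || parts[0] != "Benchmark" {
		return "", false
	}
	// fields[1] is the iteration count, fields[2] the ns/op value.
	return fmt.Sprintf("%s,%s-%s,%s", parts[1], parts[2], parts[3], fields[2]), true
}

func main() {
	// Sample line; the iteration count is illustrative only.
	line := "Benchmark_Codec_C63_K32-12   1730416   2077 ns/op"
	if rec, ok := toCSV(line); ok {
		fmt.Println(rec)
	}
}
```

Unlike the `substr` calls in the awk version, this sketch does not depend on the GOMAXPROCS suffix being exactly two digits wide.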
2. Parallel testing process
To maximize CPU utilization and observe how the decoder performs on multiple cores, a parallel test is added.
- Tested code: ./internal/decoder/report_parallel/report_benchmark_test.go, for example:

    func Benchmark_Codec_C63_K32(b *testing.B) {
        var key byte = 0x20
        data := generator.NewCodecTestData().GenDataBy(63)
        b.ResetTimer()
        b.RunParallel(func(pb *testing.PB) {
            for pb.Next() {
                if decoder.TakeValueFromCodec(key, data) == nil {
                    panic(errors.New("take is failure"))
                }
            }
        })
    }

- The body is the same as in the serial test; the difference is that RunParallel is used to run it in parallel.
- Default: GOMAXPROCS=12
- Start the test script: ./internal/decoder/report_parallel/report_benchmark_test.sh
  The script generates the test result set and saves it to ./docs/report.out.
- Generate the result graph:

    bar_ylim=18000 barh_xlim=25 runipy ./report_graphics.ipynb

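The difference between the serial loop and `RunParallel` can also be tried outside `go test` with `testing.Benchmark` from the standard library. The sketch below is a standalone illustration: `work` is a hypothetical stand-in for a decode call, not the project's decoder, and the numbers it prints depend entirely on the machine.

```go
package main

import (
	"fmt"
	"testing"
)

// work stands in for a decode call (hypothetical placeholder):
// it sums the bytes of the input.
func work(data []byte) int {
	sum := 0
	for _, b := range data {
		sum += int(b)
	}
	return sum
}

func main() {
	testing.Init() // registers the testing flags when not run via go test

	data := make([]byte, 4096)
	for i := range data {
		data[i] = byte(i)
	}

	// Serial: one goroutine runs all b.N iterations.
	serial := testing.Benchmark(func(b *testing.B) {
		for i := 0; i < b.N; i++ {
			if work(data) == 0 {
				panic("unexpected zero sum")
			}
		}
	})

	// Parallel: b.N iterations are spread over GOMAXPROCS goroutines.
	parallel := testing.Benchmark(func(b *testing.B) {
		b.RunParallel(func(pb *testing.PB) {
			for pb.Next() {
				if work(data) == 0 {
					panic("unexpected zero sum")
				}
			}
		})
	})

	fmt.Println("serial:  ", serial)
	fmt.Println("parallel:", parallel)
}
```

In the parallel case ns/op is wall-clock time divided by the total number of iterations across all goroutines, which is why it drops as more cores are used.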
3. Test results
- Serial benchmark results:
- Time per single decode-and-extract operation: Figure 3.1
- Ratio of JSON to Y3 decoding time: Figure 3.2
- Chart Description:
- X axis of Figure 3.1: labels such as C63-K32, meaning the data packet contains 63 key-value pairs and the 32nd key is monitored to extract its value.
- Y axis of Figure 3.1: the number of nanoseconds a single operation takes.
- X axis of Figure 3.2: the ratio (JSON decoding time / Y3 decoding time), for example 43010/2077 = 20.7
- Parallel benchmark results:
- Time per single decode-and-extract operation: Figure 3.3
- Ratio of JSON to Y3 decoding time: Figure 3.4
4. Test Analysis
The test results above show that:
- Y3 decodes much faster than JSON. The more key-value pairs a data packet contains, the larger the gain; the average speedup is more than 10 times: (20.7+15.8+6.2+3.3)/4 = 11.5
- Decoding in parallel on multiple cores also greatly reduces ns/op. Parallel is about 3 times faster than serial:

| | C63-K32 | C32-K16 | C16-K08 | C03-K02 |
| --- | --- | --- | --- | --- |
| Serial test (ns/op) | 2077 | 1361 | 1667 | 610 |
| Parallel test (ns/op) | 706 | 505 | 515 | 175 |
| Increase | 290% | 260% | 320% | 350% |
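The arithmetic behind these figures is easy to reproduce. The following sketch recomputes the serial/parallel ratios from the ns/op values in the table and the average JSON/Y3 ratio from the four per-dimension ratios quoted above; `ratio` and `mean` are illustrative helper names.

```go
package main

import "fmt"

// ratio returns how many times larger slow is than fast (both in ns/op).
func ratio(slow, fast float64) float64 { return slow / fast }

// mean returns the arithmetic mean of xs.
func mean(xs []float64) float64 {
	sum := 0.0
	for _, x := range xs {
		sum += x
	}
	return sum / float64(len(xs))
}

func main() {
	labels := []string{"C63-K32", "C32-K16", "C16-K08", "C03-K02"}
	serial := []float64{2077, 1361, 1667, 610} // ns/op, serial test
	parallel := []float64{706, 505, 515, 175}  // ns/op, parallel test

	for i, label := range labels {
		fmt.Printf("%s serial/parallel = %.2f\n", label, ratio(serial[i], parallel[i]))
	}

	// JSON/Y3 ratios per dimension, as quoted in the analysis.
	fmt.Printf("mean JSON/Y3 ratio = %.1f\n", mean([]float64{20.7, 15.8, 6.2, 3.3}))
}
```

The computed serial/parallel ratios land close to the rounded percentages in the table, all in the neighbourhood of 3x.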
CPU resource analysis
1. Test process
- Tested code: ./cpu/cpu_pprof.go

    func main() {
        dataCodec := generator.NewCodecTestData().GenDataBy(63)
        dataJson := generator.NewJsonTestData().GenDataBy(63)
        dataJson = append(dataJson, decoder.TokenEnd)

        // pprof
        fmt.Printf("start pprof\n")
        go pprof.Run()
        time.Sleep(5 * time.Second)

        fmt.Printf("start testing...\n")
        for {
            if decoder.TakeValueFromCodec(0x20, dataCodec) == nil {
                panic(errors.New("take is failure"))
            }
            if decoder.TakeValueFromJson("k32", dataJson) == nil {
                panic(errors.New("take is failure"))
            }
        }
    }

- pprof.Run(): starts pprof
- The program decodes Y3 and JSON in an endless loop; the share of CPU each one uses is read from the sampled CPU profile.
- Run the test:

    # Run the observed code; pprof listens on port 6060 by default
    go run ./cpu_pprof.go
    # Take a sample and view the analysis graph on port 8081
    go tool pprof -http=":8081" http://localhost:6060/debug/cpu/profile

2. Test results
3. Test analysis
As the figure above shows, YoMo Codec Y3 needs far less CPU for decoding than JSON; the difference is more than 10 times (0.73/0.07 = 10.4). This matches the benchmark results: Y3 uses less CPU and at the same time decodes substantially faster.
Test conclusion
Compared with JSON, Y3 improves decoding performance by an order of magnitude, and the more keys a data packet contains, the larger the gain. Y3's CPU usage is also lower by an order of magnitude. This performance test verifies that YoMo Codec Y3 can provide real-time, efficient, low-loss message processing for YoMo and for other scenarios that require high-performance decoding.