Protobuf (Protocol Buffers) is a high-performance serialization framework developed by Google. Its advantages are that serialized messages are small and that it supports many programming languages (C/C++, Java, PHP, Python, and other mainstream languages). Its main disadvantage is that the binary format is not human-readable.
1. Installation
Download the source code and compile it.
2. Development process
2.1 Prepare the helloworld.proto file
package com;

message helloworld {
    required int32 id = 1;
    required string str = 2;
    optional int32 age = 3;
}
required: a required field
optional: an optional field; used most often because it makes later, smooth upgrades of the system easier
repeated: a repeated field, equivalent to passing an array
1, 2, 3: field numbers; they must be unique within a message
package: the package name; corresponds to a namespace in C++ and to the package name in Java
Common data types, corresponding to C/C++ data types:
bool
int32/uint32, int64/uint64
float, double
string: text that must be UTF-8 encoded or 7-bit ASCII
bytes: an arbitrary sequence of bytes, suitable for raw binary data or for text in a non-UTF-8 encoding (for example, some Chinese encodings)
enum: enumeration
2.2 Generate code for each language
protoc -I=. --cpp_out=. helloworld.proto
protoc -I=. --java_out=./java helloworld.proto
2.3 Network test
The biggest advantage of protobuf is that it is cross-language. In this test, a Java client sends a message that a C++ UDP server receives and parses.
C++ UDP Server:
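#include <arpa/inet.h>
#include <netinet/in.h>
#include <sys/socket.h>
#include <cstdio>
#include <cstring>
#include "helloworld.pb.h"  // generated by protoc in step 2.2

using com::helloworld;  // "package com" maps to the C++ namespace com

// PORT_SERV and BUFF_LEN are assumed to be defined elsewhere.

/**
 * C++ UDP server
 */
void udpServer()
{
    int s;
    struct sockaddr_in addr_serv;
    struct sockaddr_in client;

    s = socket(AF_INET, SOCK_DGRAM, 0);
    memset(&addr_serv, 0, sizeof(addr_serv));
    addr_serv.sin_family = AF_INET;
    addr_serv.sin_addr.s_addr = htonl(INADDR_ANY);
    addr_serv.sin_port = htons(PORT_SERV);
    bind(s, (struct sockaddr*)&addr_serv, sizeof(addr_serv));

    int n;
    char buff[BUFF_LEN];
    socklen_t len;
    while (1)
    {
        len = sizeof(client);
        n = recvfrom(s, buff, BUFF_LEN, 0, (struct sockaddr*)&client, &len);
        if (n < 0)
        {
            continue;  // receive error, wait for the next datagram
        }

        // Deserialize: parse only the n bytes actually received, not the whole buffer
        helloworld rmsg;
        if (!rmsg.ParseFromArray(buff, n))
        {
            printf("Recv: failed to parse %d bytes\n", n);
            continue;
        }
        printf("Recv: %s\n", rmsg.DebugString().c_str());
    }
}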
Java UDP Client:
import java.net.DatagramPacket;
import java.net.DatagramSocket;
import java.net.InetAddress;
import com.google.protobuf.InvalidProtocolBufferException;

/**
 * Send a protobuf packet over UDP
 */
public static void sendPbPacket()
{
    // Build the message
    Helloworld.helloworld.Builder builder = Helloworld.helloworld.newBuilder();
    builder.setId(505100).setStr("hello world");
    builder.setAge(18);
    Helloworld.helloworld hw = builder.build();
    System.out.println(hw.toString());

    // Serialize
    byte[] buf = hw.toByteArray();

    // Deserialize locally as a sanity check
    try {
        Helloworld.helloworld hw1 = Helloworld.helloworld.parseFrom(buf);
        System.out.println(hw1.toString());
    } catch (InvalidProtocolBufferException e) {
        e.printStackTrace();
    }

    // Send over UDP; host and port are assumed to be fields defined elsewhere
    DatagramSocket client = null;
    try {
        client = new DatagramSocket();
        InetAddress addr = InetAddress.getByName(host);
        DatagramPacket sendPacket = new DatagramPacket(buf, buf.length, addr, port);
        client.send(sendPacket);
    } catch (Exception e) {
        e.printStackTrace();
    } finally {
        // client stays null if the socket could not be created
        if (client != null) {
            client.close();
        }
    }
}
Server prints:
Recv: id: 505100
str: "hello world"
age: 18
This shows that a protobuf message serialized in one language can be parsed correctly in another.
3. protobuf format analysis
Protobuf encoding is similar to TLV (tag, length, value) encoding: a message is a sequence of (tag, length, value) records, where the length is present only for length-delimited types such as string and bytes. The tag is calculated as (field_number << 3) | wire_type, and field_number is the number we assign in the .proto file.
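The tag formula can be sketched in a few lines of C++. This is only an illustration: make_tag, tag_field_number, and tag_wire_type are hypothetical helper names, not part of the protobuf library. The wire type values are the ones defined by the protobuf encoding.

```cpp
#include <cstdint>

// Wire types defined by the protobuf encoding.
enum WireType : uint32_t {
    kVarint = 0,           // int32, int64, uint32, uint64, bool, enum
    kFixed64 = 1,          // fixed64, sfixed64, double
    kLengthDelimited = 2,  // string, bytes, nested messages
    kFixed32 = 5           // fixed32, sfixed32, float
};

// Hypothetical helper: combine a field number and a wire type into a tag.
constexpr uint32_t make_tag(uint32_t field_number, WireType wire_type) {
    return (field_number << 3) | wire_type;
}

// Hypothetical helpers: split a tag back into its two parts.
constexpr uint32_t tag_field_number(uint32_t tag) { return tag >> 3; }
constexpr uint32_t tag_wire_type(uint32_t tag) { return tag & 0x7; }
```

For example, make_tag(1, kVarint) gives 0x08 and make_tag(2, kLengthDelimited) gives 0x12. For field numbers up to 15 the tag fits in a single byte; larger tags are themselves written as varints.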
The exchange above was captured with Wireshark.
Data segment of the packet, 19 bytes in total:
08 8c ea 1e 12 0b 68 65 6c 6c 6f 20 77 6f 72 6c 64 18 12
1. int32 id = 505100
08: 0x08 = (1<<3)|0. The field number of id is 1, and the wire type for int32 is 0 (varint).
8c ea 1e: three bytes that represent the number 505100
Why is 505100 encoded as 0x8c 0xea 0x1e? The conversion goes as follows:
Decimal: 505100
Binary: 1111011010100001100
Split into 7-bit groups from the low end, padding the top group with zeros: 0011110 1101010 0001100
Reverse the group order (lowest group first) and prefix each group with a continuation bit (1 if more bytes follow, 0 for the last byte): 10001100 11101010 00011110
Hex: 0x8c 0xea 0x1e
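The conversion above is the base-128 varint encoding. A minimal sketch in C++ follows; encode_varint is a hypothetical helper name used for illustration, not a function from the protobuf library.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical helper: encode an unsigned integer as a base-128 varint.
// Each output byte carries 7 bits of the value, lowest group first.
// The top bit of a byte is 1 when more bytes follow and 0 on the last byte.
std::vector<uint8_t> encode_varint(uint64_t value) {
    std::vector<uint8_t> out;
    while (value >= 0x80) {
        // Take the low 7 bits and set the continuation bit.
        out.push_back(static_cast<uint8_t>((value & 0x7F) | 0x80));
        value >>= 7;
    }
    // Last group: continuation bit stays 0.
    out.push_back(static_cast<uint8_t>(value));
    return out;
}
```

Calling encode_varint(505100) yields the three bytes 0x8c 0xea 0x1e, and encode_varint(18) yields the single byte 0x12, matching the capture.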
2. string str = "hello world"
12: 0x12 = (2<<3)|2, field number 2 with wire type 2 (length-delimited)
0b: the length, 11 bytes
68 65 6c 6c 6f 20 77 6f 72 6c 64: the bytes of "hello world"
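A length-delimited field such as this string can be reproduced by hand. The sketch below assumes a hypothetical helper, encode_string_field, that writes the tag, the length, and then the raw bytes; it is not part of the protobuf library.

```cpp
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical helper: encode one string field as tag + length + bytes.
std::vector<uint8_t> encode_string_field(uint32_t field_number,
                                         const std::string& text) {
    std::vector<uint8_t> out;

    // Local varint writer: 7 bits per byte, lowest group first.
    auto put_varint = [&out](uint64_t v) {
        while (v >= 0x80) {
            out.push_back(static_cast<uint8_t>((v & 0x7F) | 0x80));
            v >>= 7;
        }
        out.push_back(static_cast<uint8_t>(v));
    };

    put_varint((static_cast<uint64_t>(field_number) << 3) | 2);  // wire type 2
    put_varint(text.size());                                     // length in bytes
    out.insert(out.end(), text.begin(), text.end());             // payload
    return out;
}
```

With field number 2 and the text "hello world", the result starts with 0x12 0x0b and continues with the 11 bytes of the text.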
3. int32 age = 18
18: 0x18 = (3<<3)|0, field number 3 with wire type 0 (varint)
12: 0x12 is decimal 18
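Putting the three fields together, the 19 captured bytes can be decoded without the protobuf library. The following is a rough sketch only: HelloFields, read_varint, and decode_helloworld are hypothetical names, the field numbers (1 = id, 2 = str, 3 = age) are taken from the tags in the capture, and only wire types 0 and 2 are handled.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical holder for the decoded values.
struct HelloFields {
    uint64_t id = 0;
    std::string str;
    uint64_t age = 0;
};

// Read one varint at pos and advance pos. Returns false on truncated input.
bool read_varint(const std::vector<uint8_t>& buf, size_t& pos, uint64_t& value) {
    value = 0;
    for (int shift = 0; shift < 64 && pos < buf.size(); shift += 7) {
        uint8_t byte = buf[pos++];
        value |= static_cast<uint64_t>(byte & 0x7F) << shift;
        if ((byte & 0x80) == 0) return true;  // continuation bit clear: done
    }
    return false;
}

// Walk the buffer record by record: tag, then value (with a length for type 2).
bool decode_helloworld(const std::vector<uint8_t>& buf, HelloFields& out) {
    size_t pos = 0;
    while (pos < buf.size()) {
        uint64_t tag = 0;
        if (!read_varint(buf, pos, tag)) return false;
        uint64_t field = tag >> 3;
        uint64_t wire = tag & 0x7;
        if (wire == 0) {  // varint
            uint64_t v = 0;
            if (!read_varint(buf, pos, v)) return false;
            if (field == 1) out.id = v;
            else if (field == 3) out.age = v;
        } else if (wire == 2) {  // length-delimited
            uint64_t len = 0;
            if (!read_varint(buf, pos, len)) return false;
            if (len > buf.size() - pos) return false;  // length runs past the end
            if (field == 2) {
                out.str.assign(buf.begin() + pos, buf.begin() + pos + len);
            }
            pos += len;
        } else {
            return false;  // fixed32/fixed64 are not handled in this sketch
        }
    }
    return true;
}
```

Fed the 19 bytes from the capture, decode_helloworld fills in id = 505100, str = "hello world", and age = 18. In real code the generated ParseFromArray method does this work, including the field types this sketch leaves out.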