[Internship weekly report] Using ProtoBuf on Android and its serialization principles

I. Overview

Protobuf (Protocol Buffers) is a language-neutral, platform-neutral, extensible mechanism from Google for serializing structured data. It offers better compatibility and transmission efficiency than JSON and XML.

II. Using Protobuf on Android

1. Adding protobuf to the project

(1) Add the plugin dependency to build.gradle in the project root directory:

     dependencies {
        classpath 'com.google.protobuf:protobuf-gradle-plugin:0.8.8'
     }

(2) Apply the protobuf plugin in the build.gradle of the module that uses protobuf:

     apply plugin: 'com.google.protobuf'

(3) Add the protobuf configuration to the module's build.gradle:

          protobuf {
             protoc {
                  // You still need protoc like in the non-Android case
                  artifact = 'com.google.protobuf:protoc:3.6.1'
             }
             generateProtoTasks {
                  all().each { task ->
                    task.builtins {
                      java {}
                    }
                  }
             }
          }

(4) Add the proto file path to the module's build.gradle:

         android {
            sourceSets {
                main {
                   proto {
                     srcDir 'src/main/proto'  // path to the .proto files
                     include '**/*.proto'
                   }
                   java {
                     srcDir 'src/main/java'
                   }
               }
            }
         }

(5) Add the protobuf-java and protoc dependencies to the module's build.gradle. The protoc dependency is important: the Lite version does not need it, so it is easy to miss.

dependencies {
             compile 'com.google.protobuf:protobuf-java:3.6.1'
             compile 'com.google.protobuf:protoc:3.1.0'
}

(6) ProGuard (obfuscation) rules:

# Package path of the protobuf-generated classes
-keep class com.pl.longlink.** {*;}
# Keep Any if it is used
-keep class com.google.protobuf.Any {*;}

2. Creating a .proto file

.proto file syntax rules
(1) syntax: specifies the proto version
e.g.: syntax = "proto3";
(2) package: the default package name for the generated file
e.g.: package com.leeduo.protobuf;
(3) option java_outer_classname: specifies the class name of the generated class
e.g.: option java_outer_classname = "Person";
(4) option java_package: specifies the package name of the generated file
e.g.: option java_package = "com.example.leeduo";
(5) message: a class that can be instantiated
(6) Field format within a message: [modifier] type name = number [default = value] [packed = true];
     a) Modifiers:
                  optional: the field may or may not be present; removed in proto3
                  required: the field must be present; removed in proto3
                  repeated: the field can occur multiple times; can represent an array
     b) default: sets a default value for the field, written inside the square brackets; removed in proto3
     c) packed: when this option is set on a repeated field, it changes how the repeated values are laid out on the wire.
     e.g.: string name = 1 [default = "Jack"];
     d) Types:
[Table of field types: image missing from the source]
     e) Number: the unique identifier (tag) of each field within its message. It must be greater than or equal to 1; 0 is reserved.

3. Using protobuf in code

1. Write the .proto file
2. Compile it to generate the intermediate class
3. Call the intermediate class's methods to build a message object
4. Serialization: writeTo(); deserialization: parseFrom();
   Notes: Objects are constructed through the provided builder.
               Normally, use the set methods to fill in data.
               For fields marked repeated, add values with the add methods.
               When defining an enum in a .proto file, values start from 0; field numbers otherwise start at 1.

III. Protobuf serialization principles

1. Serialization

After encoding each field of a message, Protocol Buffers stores the data in T-L-V form; the resulting byte stream is binary.
Serialization = encoding + storage.
Protocol Buffers uses a different encoding and storage scheme for each data type.

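2. Data storage format: TLV

Tag - Length - Value: identifier, length, field value
(1) Tag: the field identifier, composed of the field's data type (wire_type) and its field number (field_number)
                  Tag formula: Tag = (field_number << 3) | wire_type
                  The Tag usually occupies one byte; if the field number is greater than 15, it occupies at least one more byte
(2) Length: the byte length of Value, Varint-encoded
(3) Value: the encoded value of the message field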
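
The Tag formula above can be checked with a few lines of Java. This is only a sketch: the class name TagDemo and its helper methods are made up for illustration and are not part of the protobuf library. It computes the Tag for several field numbers and counts how many bytes the Tag needs once it is Varint-encoded.

```java
final class TagDemo {
    // Wire type 0 is Varint in the protobuf encoding.
    static final int WIRETYPE_VARINT = 0;

    // Tag = (field_number << 3) | wire_type
    static int makeTag(int fieldNumber, int wireType) {
        return (fieldNumber << 3) | wireType;
    }

    // Bytes needed to Varint-encode a value (7 payload bits per byte).
    static int varintSize(int value) {
        int size = 1;
        while ((value & ~0x7F) != 0) {
            value >>>= 7;
            size++;
        }
        return size;
    }

    public static void main(String[] args) {
        for (int fieldNumber : new int[] {1, 15, 16, 2047, 2048}) {
            int tag = makeTag(fieldNumber, WIRETYPE_VARINT);
            System.out.println("field " + fieldNumber + " -> tag " + tag
                    + ", " + varintSize(tag) + " byte(s)");
        }
    }
}
```

Field numbers 1 to 15 give a one-byte Tag, 16 to 2047 give two bytes, and 2048 is the first field number that needs three.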

3. Encoding methods

(1) 64-bit and 32-bit

The encoded data has a fixed size: 64 bits (8 bytes) or 32 bits (4 bytes).
In both cases the low-order byte comes first and the high-order byte last (little-endian).
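
As a rough illustration of the fixed-width layout, the sketch below uses the standard java.nio.ByteBuffer in little-endian order. The class FixedWidthDemo and its method names are hypothetical; the protobuf runtime has its own writer for these types.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

final class FixedWidthDemo {
    // 32-bit fixed-width value: always 4 bytes, low-order byte first.
    static byte[] encodeFixed32(int value) {
        return ByteBuffer.allocate(4).order(ByteOrder.LITTLE_ENDIAN).putInt(value).array();
    }

    // 64-bit fixed-width value: always 8 bytes, low-order byte first.
    static byte[] encodeFixed64(long value) {
        return ByteBuffer.allocate(8).order(ByteOrder.LITTLE_ENDIAN).putLong(value).array();
    }

    static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            sb.append(String.format("%02x ", b & 0xFF));
        }
        return sb.toString().trim();
    }

    public static void main(String[] args) {
        System.out.println(hex(encodeFixed32(1)));        // 01 00 00 00
        System.out.println(hex(encodeFixed64(0x0102L)));  // 02 01 00 00 00 00 00 00
    }
}
```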

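(2) Varint

Take the lowest 7 bits of the value and set the most significant bit to 1 to form one byte.
If this is the last group, set the most significant bit to 0 instead.
Shift the value right by 7 bits and keep taking the lowest 7 bits until nothing is left.
Concatenate the resulting bytes in order; that byte string is the Varint encoding.
a) Shortcoming of Varint encoding
Computers store the sign of a negative number in the highest bit, so a negative number always uses the maximum number of bytes: a 32-bit value needs 5 bytes, and because protobuf sign-extends int32 / int64 to 64 bits, a negative int32 or int64 needs 10 bytes.
b) Solution
Represent negative numbers with the sint32 / sint64 types: the value is first ZigZag-encoded (mapping signed numbers to unsigned ones) and then Varint-encoded, which reduces the number of encoded bytes.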
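
The Varint steps above translate almost line for line into code. The following is a minimal sketch, not the protobuf runtime's implementation; VarintDemo and its methods are illustrative names. It also shows the shortcoming: a negative value sign-extended to 64 bits, as protobuf does for int32 and int64, takes 10 bytes.

```java
import java.io.ByteArrayOutputStream;

final class VarintDemo {
    // Emit 7 bits at a time, low-order group first; the top bit of each byte
    // is 1 while more groups follow and 0 on the last group.
    static byte[] encodeVarint(long value) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        while ((value & ~0x7FL) != 0) {
            out.write((int) ((value & 0x7F) | 0x80));
            value >>>= 7;
        }
        out.write((int) value);
        return out.toByteArray();
    }

    // Reverse of encodeVarint: reassemble the 7-bit groups.
    static long decodeVarint(byte[] bytes) {
        long result = 0;
        int shift = 0;
        for (byte b : bytes) {
            result |= (long) (b & 0x7F) << shift;
            shift += 7;
            if ((b & 0x80) == 0) {
                break;
            }
        }
        return result;
    }

    public static void main(String[] args) {
        byte[] encoded = encodeVarint(300);
        System.out.printf("300 -> %02x %02x%n", encoded[0] & 0xFF, encoded[1] & 0xFF); // ac 02
        System.out.println("-1 takes " + encodeVarint(-1L).length + " bytes");          // 10
    }
}
```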

(3) ZigZag

Encoding: sint32: (n << 1) ^ (n >> 31)
            sint64: (n << 1) ^ (n >> 63)
Decoding: (n >>> 1) ^ -(n & 1)
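
The formulas above are already valid Java expressions, since >> is an arithmetic shift and >>> a logical one. A small sketch (ZigZagDemo is a made-up class name) shows how small negative numbers map to small unsigned numbers:

```java
final class ZigZagDemo {
    static int encode32(int n) {
        return (n << 1) ^ (n >> 31);
    }

    static long encode64(long n) {
        return (n << 1) ^ (n >> 63);
    }

    static int decode32(int n) {
        return (n >>> 1) ^ -(n & 1);
    }

    static long decode64(long n) {
        return (n >>> 1) ^ -(n & 1);
    }

    public static void main(String[] args) {
        for (int n : new int[] {0, -1, 1, -2, 2}) {
            // Prints 0 -> 0, -1 -> 1, 1 -> 2, -2 -> 3, 2 -> 4
            System.out.println(n + " -> " + encode32(n));
        }
    }
}
```

Because -1 becomes 1, its Varint encoding is a single byte.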

(4) UTF-8

The string type is encoded as UTF-8.
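
To tie UTF-8 back to the T-L-V layout, here is a hand-rolled sketch that encodes one string field. StringFieldDemo, writeVarint and encodeStringField are illustrative names, not protobuf APIs, and the field number 2 is an arbitrary choice for the example. Wire type 2 is the length-delimited type used for strings.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;

final class StringFieldDemo {
    static void writeVarint(ByteArrayOutputStream out, long value) {
        while ((value & ~0x7FL) != 0) {
            out.write((int) ((value & 0x7F) | 0x80));
            value >>>= 7;
        }
        out.write((int) value);
    }

    // Tag (wire type 2), then the UTF-8 byte length, then the UTF-8 bytes.
    static byte[] encodeStringField(int fieldNumber, String value) {
        byte[] utf8 = value.getBytes(StandardCharsets.UTF_8);
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        writeVarint(out, (fieldNumber << 3) | 2);
        writeVarint(out, utf8.length);
        out.write(utf8, 0, utf8.length);
        return out.toByteArray();
    }

    public static void main(String[] args) {
        StringBuilder hex = new StringBuilder();
        for (byte b : encodeStringField(2, "testing")) {
            hex.append(String.format("%02x ", b & 0xFF));
        }
        System.out.println(hex.toString().trim()); // 12 07 74 65 73 74 69 6e 67
    }
}
```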

4. Storage format

T-V or T-L-V
Notes:
Fields marked repeated
Without packed: stored as multiple T-V pairs
With packed: stored as T-L-V-V-V-...
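
The difference between the two layouts can be seen by encoding the same repeated int32 field both ways. This is a sketch under assumptions: RepeatedFieldDemo and its methods are hypothetical, and field number 4 with the values 3, 270 and 86942 is just sample data.

```java
import java.io.ByteArrayOutputStream;

final class RepeatedFieldDemo {
    static void writeVarint(ByteArrayOutputStream out, long value) {
        while ((value & ~0x7FL) != 0) {
            out.write((int) ((value & 0x7F) | 0x80));
            value >>>= 7;
        }
        out.write((int) value);
    }

    // Without packed: one T-V pair per element (wire type 0, Varint).
    static byte[] encodeUnpacked(int fieldNumber, int[] values) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        for (int v : values) {
            writeVarint(out, (fieldNumber << 3) | 0);
            writeVarint(out, v);
        }
        return out.toByteArray();
    }

    // With packed: a single Tag (wire type 2), the payload length, then all values.
    static byte[] encodePacked(int fieldNumber, int[] values) {
        ByteArrayOutputStream payload = new ByteArrayOutputStream();
        for (int v : values) {
            writeVarint(payload, v);
        }
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        writeVarint(out, (fieldNumber << 3) | 2);
        writeVarint(out, payload.size());
        out.write(payload.toByteArray(), 0, payload.size());
        return out.toByteArray();
    }

    public static void main(String[] args) {
        int[] values = {3, 270, 86942};
        System.out.println("unpacked: " + encodeUnpacked(4, values).length + " bytes"); // 9
        System.out.println("packed:   " + encodePacked(4, values).length + " bytes");   // 8
    }
}
```

The saving grows with the number of elements, because the packed form pays for the Tag only once.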

5. Usage recommendations

(1) Keep field numbers (field_number) within 1-15 where possible and do not skip numbers, because the field number is part of the Tag and takes up byte space. When field_number is 16 or greater, the Varint-encoded Tag occupies 2 bytes or more, so every occurrence of the field costs extra bytes.

(2) If a field may hold negative values, use sint32 / sint64 rather than int32 / int64. Values of type sint32 / sint64 are ZigZag-encoded first and then Varint-encoded, which compresses negative numbers far more effectively.

Origin blog.csdn.net/LeeDuoZuiShuai/article/details/101280030