Protocol Buffers Developer's Guide

Welcome to the developer's guide for protocol buffers, a language-neutral, platform-neutral, extensible way of serializing structured data for use in communications protocols, data storage, and more.

This document is aimed primarily at Java, C++, or Python developers who want to use protocol buffers in their applications. This overview introduces protocol buffers and tells you how to start using them. For a more in-depth treatment, go on to the tutorials or the protocol buffer encoding page.

For API reference documentation, see the Reference Documentation page, which covers all three languages and also provides language and style guides for writing .proto files.

What are protocol buffers?

Protocol buffers are a flexible, efficient, automated mechanism for serializing structured data: think of XML, but smaller, faster, and simpler.

You define how you want your data to be structured once, then use specially generated source code to easily write and read that structured data to and from a variety of data streams, using a variety of programming languages. You can even update your data structure without breaking programs that are already deployed and were compiled against the old format.

How do protocol buffers work?

You specify how you want the information you are serializing to be structured by defining protocol buffer message types in .proto files.

Each protocol buffer message is a small logical record of information, containing a series of name-value pairs. Here is a very basic example of a .proto file that defines a message containing information about a person:

message Person {
  required string name = 1;
  required int32 id = 2;
  optional string email = 3;

  enum PhoneType {
    MOBILE = 0;
    HOME = 1;
    WORK = 2;
  }

  message PhoneNumber {
    required string number = 1;
    optional PhoneType type = 2 [default = HOME];
  }

  repeated PhoneNumber phone = 4;
}

As you can see, the message format is simple: each message type has one or more uniquely numbered fields, and each field has a name and a value type.

Value types can be numbers (integer or floating-point), booleans, strings, raw bytes, or even other protocol buffer message types (as in the example above), which lets you structure your data hierarchically.

You can specify optional fields, required fields, and repeated fields. You can find more information about writing .proto files in the Protocol Buffer Language Guide.

Once you have defined your messages, you run the protocol buffer compiler for your application's language on your .proto file to generate data access classes.

These classes provide simple accessors for each field (such as name() and set_name()), as well as methods to serialize the whole structure to raw bytes and to parse it back from raw bytes.

For instance, if your chosen language is C++, running the compiler on the above example generates a class called Person. You can then use this class in your application to populate, serialize, and retrieve Person protocol buffer messages.

You might then write some code like this:

// Populate a Person and write it to a file in binary format.
Person person;
person.set_name("John Doe");
person.set_id(1234);
person.set_email("jdoe@example.com");
fstream output("myfile", ios::out | ios::binary);
person.SerializeToOstream(&output);

Later on, you can read the message back in:

// Read the same file back and print two of its fields.
fstream input("myfile", ios::in | ios::binary);
Person person;
person.ParseFromIstream(&input);
cout << "Name: " << person.name() << endl;
cout << "E-mail: " << person.email() << endl;

You can add new fields to your message formats without breaking backwards compatibility, because old binaries simply ignore the new fields when parsing. So if you have a communications protocol that uses protocol buffers as its data format, you can extend the protocol without having to worry about breaking existing code or making it fail to compile.
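
One way to picture why old readers tolerate new fields: in the binary encoding, every field starts with a key that carries its field number and a wire type, so a reader can tell how many bytes to skip even when it does not recognize the number. The following is a minimal hand-written sketch of that idea, not the parser that the protocol buffer library generates. The function names ReadVarint and ParseKnownStrings are hypothetical, and the sketch assumes a reader that only knows the string fields name (1) and email (3) from the Person example.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>
#include <string>

// Reads one base-128 varint starting at `pos` and advances `pos` past it.
// Returns false if the input ends before the varint does.
bool ReadVarint(const std::string& buf, std::size_t& pos, std::uint64_t& out) {
  out = 0;
  for (int shift = 0; shift < 64 && pos < buf.size(); shift += 7) {
    const std::uint8_t byte = static_cast<std::uint8_t>(buf[pos++]);
    out |= static_cast<std::uint64_t>(byte & 0x7F) << shift;
    if ((byte & 0x80) == 0) return true;  // high bit clear: last byte
  }
  return false;
}

// Hypothetical "old" reader: keeps string fields 1 (name) and 3 (email)
// and steps over every other field, whatever its number.
bool ParseKnownStrings(const std::string& buf,
                       std::map<int, std::string>& known) {
  std::size_t pos = 0;
  while (pos < buf.size()) {
    std::uint64_t key = 0;
    if (!ReadVarint(buf, pos, key)) return false;
    const int field = static_cast<int>(key >> 3);    // field number
    const int wire_type = static_cast<int>(key & 7); // how to read the value

    if (wire_type == 0) {  // varint value
      std::uint64_t ignored = 0;
      if (!ReadVarint(buf, pos, ignored)) return false;
    } else if (wire_type == 2) {  // length-delimited value
      std::uint64_t len = 0;
      if (!ReadVarint(buf, pos, len) || len > buf.size() - pos) return false;
      const std::size_t n = static_cast<std::size_t>(len);
      if (field == 1 || field == 3) known[field] = buf.substr(pos, n);
      pos += n;  // unknown fields are skipped, not rejected
    } else if (wire_type == 1) {  // fixed 64-bit value
      if (buf.size() - pos < 8) return false;
      pos += 8;
    } else if (wire_type == 5) {  // fixed 32-bit value
      if (buf.size() - pos < 4) return false;
      pos += 4;
    } else {
      return false;  // groups (wire types 3 and 4) are not handled here
    }
  }
  return true;
}
```

If a newer writer adds, say, a varint field numbered 5 to the message, this reader still recovers the name and email and simply steps over the extra field.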

You will find a complete reference for using generated protocol buffer code in the API Reference section.

You can also find out more about how protocol buffer messages are encoded on the Protocol Buffer Encoding page.

Why not just use XML?

Protocol buffers have several advantages over XML for serializing structured data. Protocol buffers are:

  • Simpler
  • 3 to 10 times smaller
  • 20 to 100 times faster
  • More loosely coupled
  • Able to generate data access classes that are easier to use programmatically

For example, suppose you want to model a person with a name and an email address. In XML, you need to write:

<person>
  <name>John Doe</name>
  <email>jdoe@example.com</email>
</person>

The corresponding protocol buffer message, shown here in protocol buffer text format, is:

# Textual representation of a protocol buffer.
# This is *not* the binary format used on the wire.
person {
  name: "John Doe"
  email: "jdoe@example.com"
}

When this message is encoded in the protocol buffer binary format, it is about 28 bytes long and takes roughly 100-200 ns (nanoseconds) to parse.

The text format shown above is only a human-readable representation, intended mainly for debugging and editing.

The XML version is at least 69 bytes even after you remove all whitespace, and takes roughly 5,000-10,000 ns (nanoseconds) to parse.
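
The byte counts can be checked by hand. The sketch below is an illustration under stated assumptions, not output from the protocol buffer library: it hand-encodes the two string fields of the example (name = 1, email = 3, with the id field left unset as in the text-format example) using one key byte and one length byte per field, and builds the XML with all whitespace between tags removed. The helper names are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Appends one length-delimited string field: key byte, length byte, payload.
// Assumes field_number < 16 and value.size() < 128, so that the key and the
// length each fit in a single byte.
void AppendStringField(std::string& out, int field_number,
                       const std::string& value) {
  assert(field_number > 0 && field_number < 16);
  assert(value.size() < 128);
  out.push_back(static_cast<char>((field_number << 3) | 2));  // wire type 2
  out.push_back(static_cast<char>(value.size()));
  out += value;
}

// Hand-encodes the example Person (name = 1, email = 3) in binary form.
std::string EncodePersonSketch(const std::string& name,
                               const std::string& email) {
  std::string out;
  AppendStringField(out, 1, name);
  AppendStringField(out, 3, email);
  return out;
}

// The same record as XML with all whitespace between tags removed.
std::string PersonAsCompactXml(const std::string& name,
                               const std::string& email) {
  return "<person><name>" + name + "</name><email>" + email +
         "</email></person>";
}
```

With the name "John Doe" and the email "jdoe@example.com", EncodePersonSketch returns 28 bytes (2 + 8 for the name, 2 + 16 for the email) and PersonAsCompactXml returns 69 bytes.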

Manipulating a protocol buffer is also much easier:

cout << "Name: " << person.name() << endl;
cout << "E-mail: " << person.email() << endl;

With XML, you would have to do something like this:

cout << "Name: "
     << person.getElementsByTagName("name")->item(0)->innerText()
     << endl;
cout << "E-mail: "
     << person.getElementsByTagName("email")->item(0)->innerText()
     << endl;

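However, protocol buffers are not always a better choice than XML. For instance, they are not a good way to model a text-based document with markup (such as HTML), because you cannot easily interleave structure with text. In addition, XML can be read and edited by people. Protocol buffers are not impossible for people to read, but in their native binary format they cannot be read or edited by hand.

Like HTML, XML is to some extent self-describing. A protocol buffer is only meaningful together with the message definition in the .proto file that describes it.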

Sounds like the solution for me. How do I get started?

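Download the package: it contains the source code for the Java, Python, and C++ protocol buffer compilers, along with the classes you need for I/O and testing. To build and install the compiler, follow the instructions in the README file included with the code.

Once you are all set up, follow the tutorial for your chosen language. It walks you through creating a simple application that uses protocol buffers.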

Introducing proto3

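Our most recent version 3 release introduces a new language version, Protocol Buffers language version 3 (also known as proto3), which adds some new features on top of the existing language version (proto2).

Proto3 simplifies the protocol buffer language, making it easier to use and available in more programming languages: the current proto3 release lets you generate code for Java, C++, Python, Java Lite, Ruby, JavaScript, Objective-C, and C#.

In addition, you can generate Go code from proto3 by using the Go protoc plugin, which you can download from the golang/protobuf GitHub repository. Support for more languages is being added gradually.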

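Note that the APIs of the two language versions are not completely compatible. To accommodate users who are still on the older version, new protocol buffers releases will continue to support it.

You can see the main differences between the two versions in the release notes. For proto3 syntax, see the Proto3 Language Guide. Full documentation for proto3 is not yet complete and will follow later.

The names proto2 and proto3 can seem a little confusing. That is because the original open source protocol buffers were actually the second version of the language used inside Google, so the open source project started at v2.0.0. In short, the first version of proto was used only inside Google; Google decided to open-source the second version, so open source proto starts with proto2.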

A bit of history

Protocol buffers were initially developed inside Google to handle the request/response protocol of an index server.

Before protocol buffers, requests and responses were marshalled and unmarshalled by hand, in a format that had to support a number of versions of the protocol. This resulted in some very ugly code, for example:

if (version == 3) {
  ...
} else if (version > 4) {
  if (version == 5) {
    ...
  }
  ...
}

Explicitly formatted protocols also made new protocol versions harder to roll out, because developers had to make sure that every server between the originator of a request and the server actually handling it understood the new protocol as well as the old one. Only then could they gradually switch from the old protocol to the new one.

Protocol buffers were designed to solve many of these problems:

  • New fields could be introduced easily, and intermediate servers that did not need to inspect the data could simply parse it and pass it through without needing to know how every field was defined.
  • Formats were more self-describing and could be handled from a variety of languages (C++, Java, etc.).

However, users still needed to hand-write their own parsing code.

As the system evolved, it acquired a number of other features and uses:

  • Automatically generated serialization and deserialization code removed the need for hand-written parsing.
  • In addition to short-lived RPC (Remote Procedure Call) requests, people started to use protocol buffers as a handy self-describing format for storing data persistently (for example, in Bigtable).
  • Server RPC interfaces started to be declared as part of protocol files, with the protocol compiler generating stub classes that users could override with actual implementations of the server's interface.

Protocol buffers have become Google's common language for data: over time, more than 348,952 .proto files have been defined within Google. They are used both in RPC systems and for storing data in storage systems.

https://www.cwiki.us/display/ProtocolBuffers/Developer+Guide


Origin www.cnblogs.com/huyuchengus/p/11241849.html