Serialization: interprocess communication and persistent storage.
Desired features: compact, fast, scalable, interoperable (cross-language).

Java serialization: ObjectInput(Output)Stream
Hadoop Writable:    e.g. PersonWritable   // Java only, not cross-language

Avro
Created by Doug Cutting, the father of Hadoop.

Avro and Hadoop serialization comparison:
===============================
writable: not cross-language
avro:     cross-language; supported languages:
          C / C++ / C# / Java / JS / Perl / PHP / Python 2 / Python 3 / Ruby

1. Create the emp.avsc schema file with the following contents:

       {
           "namespace": "tutorialspoint.com",
           "type": "record",
           "name": "Emp",
           "fields": [
               {"name": "name",    "type": "string"},
               {"name": "id",      "type": "int"},
               {"name": "salary",  "type": "int"},
               {"name": "age",     "type": "int"},
               {"name": "address", "type": "string"}
           ]
       }

2. Put avro-1.8.2.jar and avro-tools-1.8.2.jar in the same directory as emp.avsc.

3. Compile the schema file:

       java -jar avro-tools-1.8.2.jar compile schema emp.avsc .

4. View the generated file tutorialspoint\com\Emp.java. Its contents include:
       - constructors
       - a builder
       - getters and setters
       - serialization and deserialization methods

5. Load this file into the IDE:
   1. Add the dependencies to the pom file:

          <dependency>
              <groupId>org.apache.avro</groupId>
              <artifactId>avro</artifactId>
              <version>1.8.2</version>
          </dependency>
          <dependency>
              <groupId>org.apache.avro</groupId>
              <artifactId>avro-tools</artifactId>
              <version>1.8.2</version>
          </dependency>
          <dependency>
              <groupId>junit</groupId>
              <artifactId>junit</artifactId>
              <version>4.12</version>
          </dependency>

   2. Create a new package named tutorialspoint.com.
   3. Copy the Emp.java file into the package.
   4. Resolve any compile errors.
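As an aside, the generated Emp class is not reproduced here, but its builder (step 4) follows the ordinary Java builder pattern. The sketch below is a hand-written illustration of roughly the same shape (constructor, getters/setters, nested Builder), not the actual Avro-generated code, and uses only two fields for brevity:

```java
// Illustrative only: mimics the shape of Avro codegen output
// (no-arg constructor, getters/setters, static newBuilder()).
public class EmpSketch {
    private String name;
    private int id;

    public EmpSketch() {}                      // no-arg constructor

    public String getName() { return name; }
    public void setName(String name) { this.name = name; }
    public int getId() { return id; }
    public void setId(int id) { this.id = id; }

    // Builder, analogous to Emp.newBuilder()...build()
    public static Builder newBuilder() { return new Builder(); }

    public static class Builder {
        private final EmpSketch e = new EmpSketch();
        public Builder setName(String name) { e.name = name; return this; }
        public Builder setId(int id) { e.id = id; return this; }
        public EmpSketch build() { return e; }
    }

    public static void main(String[] args) {
        EmpSketch e = EmpSketch.newBuilder().setName("tom").setId(10).build();
        System.out.println(e.getName() + ":" + e.getId()); // prints tom:10
    }
}
```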
6. Write the serialization code:

       @Test
       public void testAvroSerial() throws Exception {
           Emp e = new Emp();
           e.setId(10);
           e.setName("tom");
           e.setAge(20);
           e.setSalary(1000);
           e.setAddress("shahe");
           // initialize the datum writer
           DatumWriter<Emp> dw = new SpecificDatumWriter<Emp>(Emp.class);
           // initialize the file writer
           DataFileWriter<Emp> dfw = new DataFileWriter<Emp>(dw);
           // create the output file
           dfw.create(Emp.SCHEMA$, new File("F:/avro/emp.avro"));
           // append the object to the container file
           dfw.append(e);
           dfw.close();
           System.out.println("ok");
       }

7. Test: compare the serialized size and speed of 1,000,000 objects.

                  java        writable     avro
       -------------------------------------------------
       size       4,883 KB    23,438 KB    13,677 KB
       serial     3025 ms     29410 ms     1384 ms

8. Write the deserialization code:

       @Test
       public void testAvroDeSerial() throws Exception {
           long start = System.currentTimeMillis();
           // initialize the datum reader
           DatumReader<Emp> dr = new SpecificDatumReader<Emp>(Emp.class);
           // initialize the file reader
           DataFileReader<Emp> dfr = new DataFileReader<Emp>(new File("F:/avro/emp.avro"), dr);
           while (dfr.hasNext()) {
               Emp emp = dfr.next();
               //System.out.println(emp.toString());
           }
           System.out.println(System.currentTimeMillis() - start);
       }

9. Test: compare the deserialization speed of 1,000,000 objects.

                  java        writable     avro
       -------------------------------------------------
       size       4,883 KB    23,438 KB    13,677 KB
       serial     3025 ms     29410 ms     1384 ms / 1802 ms
       deserial   3860 ms     26232 ms     1972 ms / 1689 ms

10. Avro can also serialize objects directly from the schema, without generating code.

Google Protobuf
================================
A simple and efficient serialization technology, published by Google in 2008.
Cross-language support: Java, C++ and Python, plus C, C#, Erlang, Perl, PHP, Ruby.

Data definition in each framework:
       java:           javaBean
       avro:           schema (json)
       pb (protobuf):  proto
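One reason protobuf (and Avro) output is compact: small integers are written as variable-length "varints" rather than fixed 4-byte ints. A minimal sketch of base-128 varint encoding, as used by protobuf's wire format (pure JDK, no protobuf dependency; method name is illustrative):

```java
import java.io.ByteArrayOutputStream;

public class VarintDemo {
    // Encode a non-negative value as a base-128 varint: 7 data bits per
    // byte, high bit set on every byte except the last.
    static byte[] encodeVarint(long v) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        while ((v & ~0x7FL) != 0) {
            out.write((int) ((v & 0x7F) | 0x80)); // low 7 bits, continuation flag
            v >>>= 7;
        }
        out.write((int) v);                       // final byte, high bit clear
        return out.toByteArray();
    }

    public static void main(String[] args) {
        // id = 10 needs a single byte instead of the 4 a fixed int32 costs:
        System.out.println(encodeVarint(10).length);    // 1
        System.out.println(encodeVarint(1000).length);  // 2
        System.out.println(encodeVarint(300).length);   // 2
    }
}
```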
1. Create the emp.proto self-describing file (not a Java file):

       package tutorial;
       option java_package = "tutorialspoint.com";
       option java_outer_classname = "Emp2";

       message Emp {
           required int32  id      = 1;
           required string name    = 2;
           required int32  age     = 3;
           required int32  salary  = 4;
           required string address = 5;
       }

2. Put emp.proto and protobuf\src\protoc.exe in the same folder (F:/avro).

3. Open cmd and compile emp.proto:

       protoc --java_out=. emp.proto

4. Place the generated Emp2.java from F:\avro\tutorialspoint\com into the IDEA project, package name tutorialspoint.com.
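For reference, the "java" column of the comparison tables above can be reproduced along these lines with plain JDK serialization (class name and object count here are illustrative, the notes use 1,000,000 objects, and absolute times depend on the machine):

```java
import java.io.ByteArrayOutputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

public class JavaSerialBaseline {
    // A plain Serializable bean mirroring the Emp record.
    static class Emp implements Serializable {
        int id, salary, age;
        String name, address;
        Emp(int id, String name, int age, int salary, String address) {
            this.id = id; this.name = name; this.age = age;
            this.salary = salary; this.address = address;
        }
    }

    public static void main(String[] args) throws Exception {
        int n = 100_000; // reduced from 1,000,000 for a quick run
        long start = System.currentTimeMillis();
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        ObjectOutputStream oos = new ObjectOutputStream(bos);
        // One shared stream: class metadata is written once, then
        // back-referenced, which keeps the total size down.
        for (int i = 0; i < n; i++) {
            oos.writeObject(new Emp(i, "tom", 20, 1000, "shahe"));
        }
        oos.close();
        System.out.println("objects: " + n + ", bytes: " + bos.size()
                + ", ms: " + (System.currentTimeMillis() - start));
    }
}
```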