Avro is an Apache project that started as a sub-project of Hadoop. It provides serialization, deserialization, and RPC. Its serialization is more efficient than the JDK's built-in serialization, comparable to Google's Protocol Buffers, and better than Thrift, which Facebook open-sourced and which is now managed by Apache.
Because Avro is schema-based, when you serialize a large number of objects of the same type you only need to store one copy of the structure information (the schema) plus the data of each object, which greatly reduces the amount of data sent over the network or written to storage.
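To make the "one copy of the structure plus the data" idea concrete, here is a minimal sketch that uses only the JDK. It is not Avro's actual wire format: the class name SchemaOnceSketch, the two layouts, and the sample values are assumptions made purely for illustration. It compares a layout that repeats the field names in every record with one that writes them once in a header.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

public class SchemaOnceSketch {
    // Hypothetical field names, mirroring the User record used later in this post.
    static final String[] FIELD_NAMES = {"username", "age", "address"};

    // Self-describing layout: every record repeats its field names.
    static int selfDescribingSize(int recordCount) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            DataOutputStream out = new DataOutputStream(bytes);
            for (int i = 0; i < recordCount; i++) {
                out.writeUTF(FIELD_NAMES[0]);
                out.writeUTF("user" + i);
                out.writeUTF(FIELD_NAMES[1]);
                out.writeInt(30);
                out.writeUTF(FIELD_NAMES[2]);
                out.writeUTF("Barcelona");
            }
            out.flush();
            return bytes.size();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Schema-once layout: field names are written once, records hold values only.
    static int schemaOnceSize(int recordCount) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            DataOutputStream out = new DataOutputStream(bytes);
            for (String name : FIELD_NAMES) {
                out.writeUTF(name); // the "schema", stored a single time
            }
            for (int i = 0; i < recordCount; i++) {
                out.writeUTF("user" + i);
                out.writeInt(30);
                out.writeUTF("Barcelona");
            }
            out.flush();
            return bytes.size();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        int n = 1000;
        System.out.println("self-describing: " + selfDescribingSize(n) + " bytes");
        System.out.println("schema once:     " + schemaOnceSize(n) + " bytes");
    }
}
```

In this toy layout the field names cost 24 bytes per record, so the saving grows linearly with the number of records. Avro's real encoding is more compact still, but the principle is the same.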
Example:
Create a new Maven project.
pom.xml:
<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
  <modelVersion>4.0.0</modelVersion>
  <groupId>com.jv</groupId>
  <artifactId>avro</artifactId>
  <version>0.0.1-SNAPSHOT</version>
  <packaging>jar</packaging>
  <name>avro</name>
  <url>http://maven.apache.org</url>
  <properties>
    <project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
    <compiler-plugin.version>2.3.2</compiler-plugin.version>
    <avro.version>1.7.5</avro.version>
  </properties>
  <dependencies>
    <dependency>
      <groupId>junit</groupId>
      <artifactId>junit</artifactId>
      <version>4.10</version>
      <scope>test</scope>
    </dependency>
    <dependency>
      <groupId>org.slf4j</groupId>
      <artifactId>slf4j-simple</artifactId>
      <version>1.6.4</version>
      <scope>compile</scope>
    </dependency>
    <dependency>
      <groupId>org.apache.avro</groupId>
      <artifactId>avro</artifactId>
      <version>1.7.5</version>
    </dependency>
    <dependency>
      <groupId>org.apache.avro</groupId>
      <artifactId>avro-ipc</artifactId>
      <version>1.7.5</version>
    </dependency>
  </dependencies>
  <build>
    <plugins>
      <plugin>
        <groupId>org.apache.maven.plugins</groupId>
        <artifactId>maven-compiler-plugin</artifactId>
        <version>${compiler-plugin.version}</version>
      </plugin>
      <plugin>
        <groupId>org.apache.avro</groupId>
        <artifactId>avro-maven-plugin</artifactId>
        <version>1.7.5</version>
        <executions>
          <execution>
            <id>schemas</id>
            <phase>generate-sources</phase>
            <goals>
              <goal>schema</goal>
              <goal>protocol</goal>
              <goal>idl-protocol</goal>
            </goals>
            <configuration>
              <sourceDirectory>${project.basedir}/src/main/avro/</sourceDirectory>
              <outputDirectory>${project.basedir}/src/main/java/</outputDirectory>
            </configuration>
          </execution>
        </executions>
      </plugin>
    </plugins>
  </build>
</project>
The avro-maven-plugin section configures code generation. You need to create a src/main/avro directory in the project to hold the schema source files.
The specific steps are:
Write the schema file user.avsc:
{
  "namespace": "com.jv.avro",
  "type": "record",
  "name": "User",
  "fields": [
    {"name": "username", "type": "string"},
    {"name": "age", "type": ["int", "null"]},
    {"name": "address", "type": ["string", "null"]}
  ]
}
namespace: the namespace; when the plugin generates code, it becomes the package name of the User class
type: one of record, enum, array, map, union, or fixed; a record is equivalent to an ordinary class
name: the class name; the full name of the class is namespace + "." + name
doc: a comment
aliases: alternative names that other places can use to refer to this type
fields: the attributes of the record; each field has:
  name: the attribute name
  type: the attribute type; a union such as ["int", "null"] allows the field to hold either an int or null (a default value cannot be written inside the union itself)
  default: the default value of the field; for a union field, the Avro 1.7 specification requires the default to match the first type listed in the union
  doc: a comment
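The attributes described above can also be inspected at runtime. The following is a minimal sketch, assuming the avro dependency from the pom.xml is on the classpath. It inlines the same schema as user.avsc as a string so that it is self-contained; the class name InspectUserSchema and the "doc" text are illustrative additions, not part of the original schema file.

```java
import org.apache.avro.Schema;

public class InspectUserSchema {
    // Same shape as user.avsc; the "doc" attribute is added here only for illustration.
    static final String USER_SCHEMA_JSON =
        "{\"namespace\": \"com.jv.avro\", \"type\": \"record\", \"name\": \"User\","
      + " \"doc\": \"A user record\","
      + " \"fields\": ["
      + "  {\"name\": \"username\", \"type\": \"string\"},"
      + "  {\"name\": \"age\", \"type\": [\"int\", \"null\"]},"
      + "  {\"name\": \"address\", \"type\": [\"string\", \"null\"]}"
      + " ]}";

    // Parse the JSON schema definition into an Avro Schema object.
    static Schema parse() {
        return new Schema.Parser().parse(USER_SCHEMA_JSON);
    }

    public static void main(String[] args) {
        Schema schema = parse();
        // Full name = namespace + "." + name
        System.out.println(schema.getFullName());
        System.out.println(schema.getDoc());
        // Each field reports its name and the kind of type it holds.
        for (Schema.Field field : schema.getFields()) {
            System.out.println(field.name() + " : " + field.schema().getType());
        }
    }
}
```

With this schema, the full name is com.jv.avro.User, username is reported as STRING, and the two nullable fields are reported as UNION.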
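Generate code from the schema definition
Run the Maven generate-sources phase on the project, which is the phase the avro-maven-plugin is bound to in the pom.xml above (for example, mvn generate-sources from the project root).
Check the console output: if it shows SUCCESS, the code was generated successfully.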
Test code:
package com.jv.test;

import java.io.File;
import java.io.IOException;

import org.apache.avro.file.DataFileReader;
import org.apache.avro.file.DataFileWriter;
import org.apache.avro.io.DatumReader;
import org.apache.avro.io.DatumWriter;
import org.apache.avro.specific.SpecificDatumReader;
import org.apache.avro.specific.SpecificDatumWriter;

import com.jv.avro.User;

public class TestAvro {
    public static void main(String[] args) throws IOException {
        // Instantiation option 1: no-arg constructor plus setters
        User user1 = new User();
        user1.setUsername("Messi");
        user1.setAddress("Barcelona");
        user1.setAge(30);
        // Instantiation option 2: all-args constructor
        User user2 = new User("Messi", 30, "巴塞罗那");
        // Instantiation option 3: builder (newBuilder() is static, so call it on the class)
        User user3 = User.newBuilder().setUsername("Havi").setAge(34).setAddress("卡塔尔").build();

        // Serialize the objects and save them to a file
        DatumWriter<User> userDatumWriter = new SpecificDatumWriter<User>(User.class);
        DataFileWriter<User> dataFileWriter = new DataFileWriter<User>(userDatumWriter);
        dataFileWriter.create(user1.getSchema(), new File("users.avro"));
        dataFileWriter.append(user1);
        dataFileWriter.append(user2);
        dataFileWriter.append(user3);
        dataFileWriter.close();

        // Deserialize the objects from the file and print them
        DatumReader<User> userDatumReader = new SpecificDatumReader<User>(User.class);
        DataFileReader<User> dataFileReader = new DataFileReader<User>(new File("users.avro"), userDatumReader);
        User user = null;
        while (dataFileReader.hasNext()) {
            // Passing the previous instance lets Avro reuse it instead of allocating a new object
            user = dataFileReader.next(user);
            System.out.println(user);
        }
        dataFileReader.close();
    }
}
Output after running: