5.3.4 Hadoop serialization frameworks

  Serialization frameworks

Writable is not the only option: any type that can be converted to and from a binary stream can serve as a Hadoop serialization type. To make this possible, Hadoop provides a pluggable serialization framework API in the package org.apache.hadoop.io.serializer. Writable can be used as a MapReduce type precisely because an implementation of this framework interface exists for it. The framework is used as follows: write a class that implements the serialization framework interfaces -> add the class name to the io.serializations configuration parameter, which is a comma-separated list of class names -> the SerializationFactory constructor reads the configuration, creates the serialization objects from the class names by reflection, and stores them in a list -> call the SerializationFactory method getSerializer(Class<T> c) to obtain a serializer, where the argument is the class of the objects to be serialized.

Let us look at the interfaces the serialization framework exposes, and at how Writable implements them.

(1) The serialization interface: Serializer

It defines three operations: open the stream, serialize, close the stream.

public interface Serializer<T> {
    void open(java.io.OutputStream outputStream) throws java.io.IOException;
    void serialize(T t) throws java.io.IOException;
    void close() throws java.io.IOException;
}

(2) The deserialization interface: Deserializer

It defines three operations: open the stream, deserialize, close the stream.

public interface Deserializer<T> {
    void open(java.io.InputStream inputStream) throws java.io.IOException;
    T deserialize(T t) throws java.io.IOException;
    void close() throws java.io.IOException;
}
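
The open / serialize / close lifecycle of these two interfaces can be tried out without Hadoop on the classpath. The sketch below is only an illustration: the nested Serializer and Deserializer interfaces are stand-ins copied from the shapes shown above, and StringSerializer, StringDeserializer and roundTrip are hypothetical names that do not exist in Hadoop.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

public class SerializerLifecycleDemo {

    // Stand-ins mirroring the shape of Hadoop's Serializer and Deserializer (assumption).
    interface Serializer<T> {
        void open(OutputStream out) throws IOException;
        void serialize(T t) throws IOException;
        void close() throws IOException;
    }

    interface Deserializer<T> {
        void open(InputStream in) throws IOException;
        T deserialize(T reuse) throws IOException;
        void close() throws IOException;
    }

    // Hypothetical serializer: writes each string as a length-prefixed modified UTF-8 record.
    static class StringSerializer implements Serializer<String> {
        private DataOutputStream out;
        public void open(OutputStream stream) { out = new DataOutputStream(stream); }
        public void serialize(String s) throws IOException { out.writeUTF(s); }
        public void close() throws IOException { out.close(); }
    }

    // Hypothetical deserializer: reads the records back in the same order.
    static class StringDeserializer implements Deserializer<String> {
        private DataInputStream in;
        public void open(InputStream stream) { in = new DataInputStream(stream); }
        // String is immutable, so the reuse argument is ignored here.
        public String deserialize(String reuse) throws IOException { return in.readUTF(); }
        public void close() throws IOException { in.close(); }
    }

    // Serializes all values into one buffer, then reads them back.
    static String[] roundTrip(String... values) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        Serializer<String> serializer = new StringSerializer();
        serializer.open(buffer);                 // open once
        for (String v : values) {
            serializer.serialize(v);             // serialize many records
        }
        serializer.close();                      // close once

        Deserializer<String> deserializer = new StringDeserializer();
        deserializer.open(new ByteArrayInputStream(buffer.toByteArray()));
        String[] result = new String[values.length];
        for (int i = 0; i < result.length; i++) {
            result[i] = deserializer.deserialize(null);
        }
        deserializer.close();
        return result;
    }

    public static void main(String[] args) throws IOException {
        System.out.println(String.join(",", roundTrip("key", "value"))); // prints key,value
    }
}
```

One stream is opened once and then carries many records, which is why open and close are separate from serialize and deserialize.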

(3) The interface for checking support and obtaining instances: Serialization

accept checks whether serialization of the requested class is supported (the Writable implementation, for example, checks whether the class is a subclass of Writable). getSerializer returns a Serializer instance and getDeserializer returns a Deserializer instance. Serialization and deserialization are then performed by calling the methods of the returned instances.

public interface Serialization<T> {
    boolean accept(java.lang.Class<?> aClass);
    org.apache.hadoop.io.serializer.Serializer<T> getSerializer(java.lang.Class<T> tClass);
    org.apache.hadoop.io.serializer.Deserializer<T> getDeserializer(java.lang.Class<T> tClass);
}

(4) The Writable serialization class

WritableSerialization implements the three interfaces above for Writable types. The actual serialization and deserialization work is done inside the interface implementations.

import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.conf.Configured;
import org.apache.hadoop.io.Writable;
import org.apache.hadoop.util.ReflectionUtils;
// Serialization, Serializer and Deserializer live in org.apache.hadoop.io.serializer

public class WritableSerialization extends Configured
  implements Serialization<Writable> {

  // Static nested class: the deserializer for Writable types
  static class WritableDeserializer extends Configured
    implements Deserializer<Writable> {

    private Class<?> writableClass;
    private DataInputStream dataIn;

    // Constructor: remember the configuration and the target class
    public WritableDeserializer(Configuration conf, Class<?> c) {
      setConf(conf);
      this.writableClass = c;
    }

    // Open the input stream, wrapping it in a DataInputStream if necessary
    public void open(InputStream in) {
      if (in instanceof DataInputStream) {
        dataIn = (DataInputStream) in;
      } else {
        dataIn = new DataInputStream(in);
      }
    }

    // Deserialize: reuse w if given, otherwise create an instance by reflection,
    // then read its fields from the stream
    public Writable deserialize(Writable w) throws IOException {
      Writable writable;
      if (w == null) {
        writable
          = (Writable) ReflectionUtils.newInstance(writableClass, getConf());
      } else {
        writable = w;
      }
      writable.readFields(dataIn);
      return writable;
    }

    // Close the input stream
    public void close() throws IOException {
      dataIn.close();
    }
  }

  // Static nested class: the serializer for Writable types
  static class WritableSerializer implements Serializer<Writable> {

    private DataOutputStream dataOut;

    // Open the output stream, wrapping it in a DataOutputStream if necessary
    public void open(OutputStream out) {
      if (out instanceof DataOutputStream) {
        dataOut = (DataOutputStream) out;
      } else {
        dataOut = new DataOutputStream(out);
      }
    }

    // Serialize: the Writable writes its own fields to the stream
    public void serialize(Writable w) throws IOException {
      w.write(dataOut);
    }

    // Close the output stream
    public void close() throws IOException {
      dataOut.close();
    }
  }

  // Accept only Writable and its subclasses
  public boolean accept(Class<?> c) {
    return Writable.class.isAssignableFrom(c);
  }

  // Return a deserializer for the given class
  public Deserializer<Writable> getDeserializer(Class<Writable> c) {
    return new WritableDeserializer(getConf(), c);
  }

  // Return a serializer for the given class
  public Serializer<Writable> getSerializer(Class<Writable> c) {
    return new WritableSerializer();
  }
}
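
The null check in WritableDeserializer.deserialize means a caller can either ask for a fresh object or hand in an existing one to be refilled. The following sketch shows that behaviour in isolation, under stated assumptions: the nested Writable interface is a stand-in with the same two methods as org.apache.hadoop.io.Writable, and Point, encode and the static deserialize helper are hypothetical names made up for the example.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;

public class WritableReuseDemo {

    // Stand-in with the same two methods as org.apache.hadoop.io.Writable (assumption).
    interface Writable {
        void write(DataOutput out) throws IOException;
        void readFields(DataInput in) throws IOException;
    }

    // Hypothetical record type used only for this illustration.
    static class Point implements Writable {
        int x;
        int y;
        Point() { }
        Point(int x, int y) { this.x = x; this.y = y; }
        public void write(DataOutput out) throws IOException {
            out.writeInt(x);
            out.writeInt(y);
        }
        public void readFields(DataInput in) throws IOException {
            x = in.readInt();
            y = in.readInt();
        }
    }

    // Same decision as in the listing above: create when null, otherwise refill.
    static Point deserialize(Point reuse, DataInput in) throws IOException {
        Point p = (reuse == null) ? new Point() : reuse;
        p.readFields(in);
        return p;
    }

    // Writes the points back to back into one byte array.
    static byte[] encode(Point... points) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        DataOutputStream out = new DataOutputStream(buffer);
        for (Point p : points) {
            p.write(out);
        }
        out.close();
        return buffer.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        byte[] bytes = encode(new Point(1, 2), new Point(3, 4));
        DataInputStream in = new DataInputStream(new ByteArrayInputStream(bytes));
        Point first = deserialize(null, in);    // allocates a new Point
        Point second = deserialize(first, in);  // refills the same object
        System.out.println(bytes.length);                // prints 16 (four 4-byte ints)
        System.out.println(second == first);             // prints true
        System.out.println(second.x + "," + second.y);   // prints 3,4
    }
}
```

Reusing one object avoids an allocation per record, at the cost that the caller must copy any value it wants to keep before the next call.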

(5) The serialization factory

import java.util.ArrayList;
import java.util.List;
import org.apache.commons.logging.Log;
import org.apache.commons.logging.LogFactory;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.conf.Configured;
import org.apache.hadoop.io.serializer.avro.AvroReflectSerialization;
import org.apache.hadoop.io.serializer.avro.AvroSpecificSerialization;
import org.apache.hadoop.util.ReflectionUtils;

public class SerializationFactory extends Configured {

    private static final Log LOG = LogFactory.getLog(SerializationFactory.class.getName());

    private List<Serialization<?>> serializations = new ArrayList<Serialization<?>>();

    public SerializationFactory(Configuration conf) {
        super(conf);
        // The serializations are taken from the io.serializations parameter in conf,
        // a comma-separated list of class names. The default covers Writable and Avro.
        String[] names = conf.getStrings("io.serializations", new String[] {
            WritableSerialization.class.getName(),
            AvroSpecificSerialization.class.getName(),
            AvroReflectSerialization.class.getName()});
        for (String serializerName : names) {
            add(conf, serializerName);
        }
    }

    // Instantiate one serialization by class name and add it to the list
    @SuppressWarnings("unchecked")
    private void add(Configuration conf, String serializationName) {
        try {
            Class<? extends Serialization> serializationClass =
                (Class<? extends Serialization>) conf.getClassByName(serializationName);
            // Create the object by reflection and store it in the serializations list
            serializations.add((Serialization) ReflectionUtils.newInstance(serializationClass, getConf()));
        } catch (ClassNotFoundException e) {
            LOG.warn("Serialization class not found: ", e);
        }
    }

    public <T> Serializer<T> getSerializer(Class<T> c) {
        Serialization<T> serialization = getSerialization(c);
        return serialization != null ? serialization.getSerializer(c) : null;
    }

    public <T> Deserializer<T> getDeserializer(Class<T> c) {
        Serialization<T> serialization = getSerialization(c);
        return serialization != null ? serialization.getDeserializer(c) : null;
    }

    // Return the first registered serialization that accepts class c, or null if none does
    @SuppressWarnings("unchecked")
    public <T> Serialization<T> getSerialization(Class<T> c) {
        for (Serialization<?> serialization : serializations) {
            if (serialization.accept(c)) {
                return (Serialization<T>) serialization;
            }
        }
        return null;
    }
}
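
The lookup performed by the factory (read a comma-separated list of class names, instantiate each by reflection, return the first entry whose accept method says yes) can be reproduced in a few lines of plain Java. This is a simplified sketch and not Hadoop code: the nested Serialization interface is reduced to accept, and FactoryLookupDemo, NumberSerialization and CharSequenceSerialization are hypothetical names. The string passed to the constructor plays the role of the io.serializations value.

```java
import java.util.ArrayList;
import java.util.List;

public class FactoryLookupDemo {

    // Stand-in for Hadoop's Serialization interface, reduced to accept() (assumption).
    public interface Serialization {
        boolean accept(Class<?> c);
    }

    // Two hypothetical implementations that accept different type families.
    public static class NumberSerialization implements Serialization {
        public boolean accept(Class<?> c) { return Number.class.isAssignableFrom(c); }
    }

    public static class CharSequenceSerialization implements Serialization {
        public boolean accept(Class<?> c) { return CharSequence.class.isAssignableFrom(c); }
    }

    private final List<Serialization> serializations = new ArrayList<>();

    // classNames is a comma-separated list, like the io.serializations parameter.
    public FactoryLookupDemo(String classNames) {
        for (String name : classNames.split(",")) {
            try {
                Class<?> cls = Class.forName(name.trim());
                serializations.add((Serialization) cls.getDeclaredConstructor().newInstance());
            } catch (ClassNotFoundException e) {
                // Unknown names are skipped with a warning, as in the listing above.
                System.err.println("Serialization class not found: " + name);
            } catch (ReflectiveOperationException e) {
                throw new IllegalStateException("Cannot instantiate " + name, e);
            }
        }
    }

    // Returns the first registered serialization that accepts c, or null.
    public Serialization getSerialization(Class<?> c) {
        for (Serialization s : serializations) {
            if (s.accept(c)) {
                return s;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        String config = NumberSerialization.class.getName() + ","
            + CharSequenceSerialization.class.getName() + ",no.such.Serialization";
        FactoryLookupDemo factory = new FactoryLookupDemo(config);
        // prints NumberSerialization
        System.out.println(factory.getSerialization(Integer.class).getClass().getSimpleName());
        // prints CharSequenceSerialization
        System.out.println(factory.getSerialization(String.class).getClass().getSimpleName());
        // prints null, because nothing accepts Object
        System.out.println(factory.getSerialization(Object.class));
    }
}
```

Because the first match wins, the order of the class names in the list matters when more than one serialization accepts the same class.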

 

 

Comparison with other serialization frameworks

 

ObjectInputStream / ObjectOutputStream (Java built-in serialization)

1. Does not work across languages, because it uses a private internal protocol.
2. The serialized stream is too large. Java serialization output is more than five times the size of a plain binary encoding.
3. Serialization performance is too low. Java serialization reportedly reaches only about 6.17% of the performance of a plain binary encoding.
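
The size difference is easy to observe with the JDK alone. The sketch below serializes the same small object once with ObjectOutputStream and once field by field with DataOutputStream. The User class and its field values are made up for the illustration, and the exact ratio depends on the class and its data, so this does not reproduce the figures quoted above.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.Serializable;

public class SizeComparisonDemo {

    // Hypothetical value class used only for the comparison.
    static class User implements Serializable {
        private static final long serialVersionUID = 1L;
        final int id;
        final String name;
        User(int id, String name) {
            this.id = id;
            this.name = name;
        }
    }

    // Built-in Java serialization: the stream also carries class metadata.
    static byte[] javaSerialize(User u) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
            out.writeObject(u);
        }
        return buffer.toByteArray();
    }

    // Hand-written binary encoding: only the field values are written.
    static byte[] binaryEncode(User u) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buffer)) {
            out.writeInt(u.id);     // 4 bytes
            out.writeUTF(u.name);   // 2-byte length + encoded characters
        }
        return buffer.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        User user = new User(100, "Welcome to Hadoop");
        System.out.println("java serialization: " + javaSerialize(user).length + " bytes");
        System.out.println("binary encoding:    " + binaryEncode(user).length + " bytes"); // 23 bytes
    }
}
```

The extra bytes in the first stream are the stream header, the class name, the field descriptors and the serialVersionUID, all of which a hand-written encoding leaves out.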

Google's Protobuf

1. A structured data storage format (comparable to XML, JSON, etc.)
2. High-performance encoding and decoding
3. Language-independent and platform-independent, with good extensibility
4. Supports three languages: Java, C++ and Python.

For Java, protostuff may be the best choice. Protostuff has one more advantage over Kryo: if a field is added to the Java class after serialization and before deserialization (which is unavoidable in real business), Kryo breaks. Protostuff keeps working as long as the new field is added at the end of the class and a Sun-series JDK is used.

Facebook's Thrift

1. Thrift supports many languages (C++, C#, Cocoa, Erlang, Haskell, Java, OCaml, Perl, PHP, Python, Ruby and Smalltalk).
2. Thrift is suited to building large-scale data exchange and storage tools. For internal data transfer in large systems it has a clear advantage over JSON and XML in both performance and transfer size.
3. Thrift supports three typical encodings: a general binary encoding, a compressed binary encoding, and an optimized compressed encoding with optional fields.

Kryo

1. Fast, with a small serialized size
2. Cross-language support is relatively complicated

Hessian

1. Supports cross-language use by default
2. Relatively slow

fst

fst is a serialization framework that is fully compatible with the JDK serialization protocol. It serializes about 4 to 10 times faster than the JDK, and its output is about one third the size of the JDK's.

Gson

Gson is currently the most full-featured JSON parsing library. Its use centers on two conversion functions, toJson and fromJson. It has no dependencies and needs no extra jars, so it runs directly on the JDK. Before converting, however, you must first define the object's type and its members so that a JSON string can be converted into the corresponding object. For ordinary bean classes with get and set methods, Gson can convert complex types between JSON and beans in both directions.

FastJson

Fastjson is a high-performance JSON processor written in Java and developed by Alibaba. It has no dependencies and needs no extra jars, so it runs directly on the JDK. Fastjson can run into problems when converting complex bean types to JSON: reference types may appear in the output and cause conversion errors, so references have to be specified explicitly. Fastjson uses its own original algorithm to push parsing speed to the limit, and claims to be faster than all other JSON libraries.

 

 

 


 
