Wu Yuxiong - Natural-Born HADOOP Hands-On Lab Study Notes: Introduction to Distributed Computing and RPC Communication

Purpose

Master the GoF proxy design pattern

Understand socket programming, Java reflection, and dynamic proxies

Learn NIO and multithreading

Master the RPC API provided by the Hadoop framework

Principle

1. What is RPC
  Before Hadoop, the programs we wrote were generally standalone and could only process data on a single machine, but one machine's processing power is always limited. Hadoop lets us write distributed programs that join multiple nodes together to do the processing. Communication between the nodes of a distributed program relies on the network. A simple idea is to deploy a web server such as Tomcat, but that would make the overall architecture far too heavy and redundant. What we really need is a more lightweight communication framework of our own, and that is what we call an RPC communication framework.
  RPC (Remote Procedure Call) is a protocol for requesting a service from a program on a remote computer over the network, without having to understand the underlying network technology. The RPC protocol assumes the existence of some transport protocol, such as TCP or UDP, to carry the data exchanged between the communicating programs. In the OSI model of network communication, RPC spans the transport layer and the application layer. RPC makes it easier to develop applications, including distributed, multi-program network applications. Logically, an RPC framework is divided into a client and a server: the client sends a request to the server, the server calls the corresponding method and returns the result to the client. The process sounds simple, but implementing it requires considering many factors, such as network bandwidth and execution performance.

2. How to implement an RPC framework
  The proxy pattern is a design pattern frequently used in aspect-oriented programming, and dynamic proxies are the JDK's tool for it: with the InvocationHandler interface and the Proxy class we can intercept method calls on a dynamically generated proxy class.
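As a minimal, self-contained illustration of that mechanism (this is not the course code; the Hello interface and the printed strings are made up for this sketch), a JDK dynamic proxy can be built like this:

    import java.lang.reflect.InvocationHandler;
    import java.lang.reflect.Proxy;

    public class ProxyDemo {
        public interface Hello {
            String sayHello(String name);
        }

        public static void main(String[] args) {
            InvocationHandler handler = (proxy, method, callArgs) -> {
                // every call on the proxy lands here; a real RPC framework would
                // serialize the method name and arguments and send them to a server
                System.out.println("intercepted call: " + method.getName());
                return "hello, " + callArgs[0];
            };

            Hello hello = (Hello) Proxy.newProxyInstance(
                    Hello.class.getClassLoader(),
                    new Class<?>[]{Hello.class},
                    handler);

            System.out.println(hello.sayHello("hongya"));  // handled by the InvocationHandler
        }
    }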
  The flow of our simple RPC implementation is as follows:

1. The RPC client obtains a proxy object.

2. Method calls invoked on the proxy are captured by the invoke() method of the class that implements InvocationHandler.

3. The invoke() method packages the call into an Invocation instance and sends the RPC request to the server.

4. The server receives RPC requests in a loop and creates a Handler thread for each request.

5. The Handler thread deserializes an Invocation instance from the input stream and then invokes the target method on the server side.

6. The call completes and the result is returned to the client side.

3. Hadoop's RPC framework
   Hadoop's built-in RPC mechanism is based on Java dynamic proxies, NIO, and socket programming. With it, we can call server-side methods from the client just as if they were local calls, while the complex machinery in between is hidden from us.

Lab environment

1. OS
  Operating machine: Windows 7
  Default user name on the operating machine: Hongya; password: 123456

2. Experimental tool: IntelliJ IDEA

IDEA is short for IntelliJ IDEA, an integrated development environment for the Java language. IntelliJ IDEA is widely recognized in the industry as one of the best Java development tools, especially for intelligent code completion, code refactoring, J2EE support, integration with Ant, JUnit, and CVS, code analysis, and its innovative GUI design. IDEA is a product of JetBrains, a company headquartered in Prague, the capital of the Czech Republic, whose developers are mostly Eastern European programmers known for their rigor.

Advantages:
1. Its most prominent feature is debugging: it can debug Java code, JavaScript, jQuery, Ajax, and other technologies.
2. Leaving the other editing features aside, the debugger alone is far better than Eclipse's.
3. When inspecting a Map object, if the implementation class is a HashMap, empty Entry instances are filtered out automatically; in Eclipse, by default, you can only hunt for the key you want through toString().
4. It also offers dynamic expression evaluation (Evaluate): for example, given an instance of a class whose API you do not know, code completion can list the methods it supports, something Eclipse cannot match.
5. Finally, when debugging multiple threads, logging to the console helps you check how the threads execute.

Disadvantages:
1. Plug-in development is weak. Compared with Eclipse, IDEA's plug-in ecosystem is tiny: fewer than 400 official plug-ins at present, and many of them add nothing substantial, perhaps because IDEA itself is already so capable.
2. Only a single project is supported per window, which is inconvenient for development, especially for programmers who build a test project alongside the one under development and find this workflow hard to accept.
3. Technical articles are scarce; at present there is basically no technical support to be found online, and write-ups are few and far between.
4. Resource consumption is relatively high: a medium or large J2EE project needs more than 200 MB of memory just to get started, and the installation takes roughly 500 MB of disk space (since many of the intelligent features work in real time, all classes, including system classes, are stored in IDEA's working directory).

Features: intelligent selection, rich navigation modes, history, excellent JUnit support, superior refactoring support, coding assistance, flexible layout, excellent XML support, dynamic syntax checking, code inspection, and so on.

Step 1: Modify the hosts mapping and check the development environment

  1.1 On the operating machine, first open C:\Windows\System32\drivers\etc\hosts and modify the mapping (use the IP address of your own test environment). See the figure below:
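As an illustration only (the IP address and hostname below are placeholders, not the lab's actual values), a hosts entry has the form:

    192.168.1.100    hongya01    # hypothetical test-environment IP and hostname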

1.2 Start the machine and double-click the idea64 icon to open the editor, then find the local course code package at hellohadoop | src | com.hongya | day009. Inside are two packages corresponding to the two test exercises. The code is already implemented; you can also create your own code package and re-implement everything yourself.

Step 2: Implement your own RPC framework

 2.1 Define the class whose operation we want to call from the client: Login. Open the code at "com.hongya | day009 | myrpc | client | Login" and you can see that Login is an interface with only one method, which is not implemented!
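The exact signature in the course code is not reproduced in these notes; as an assumption used by the sketches that follow, the interface might look like this:

    public interface Login {
        // a single abstract method; the client never implements it locally
        String login(String username, String password);
    }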

2.2 Define the server-side implementation class of Login: LoginImpl. Open the code at "com.hongya | day009 | myrpc | server | LoginImpl" and you can see that it implements the login method!
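A matching sketch of the implementation class, again with an assumed method body rather than the course's actual logic:

    public class LoginImpl implements Login {
        @Override
        public String login(String username, String password) {
            // server-side business logic; a placeholder response for this sketch
            return "user " + username + " logged in successfully";
        }
    }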

2.3 Define Invocation. This class carries the call information between the client and the server; the code is at "com.hongya | day009 | myrpc | server | Invocation", and its contents are the class, the method name, the parameters, and the return value.
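A sketch of such a request envelope; the field names and the constructor are assumptions made for these notes, not the lab's exact code:

    import java.io.Serializable;

    // carries which interface, which method, and which arguments are being called,
    // plus a slot for the result that the server sends back
    public class Invocation implements Serializable {
        private final Class<?> interfaceClass;
        private final String methodName;
        private final Class<?>[] parameterTypes;
        private final Object[] parameters;
        private Object result;

        public Invocation(Class<?> interfaceClass, String methodName,
                          Class<?>[] parameterTypes, Object[] parameters) {
            this.interfaceClass = interfaceClass;
            this.methodName = methodName;
            this.parameterTypes = parameterTypes;
            this.parameters = parameters;
        }

        public Class<?> getInterfaceClass()   { return interfaceClass; }
        public String getMethodName()         { return methodName; }
        public Class<?>[] getParameterTypes() { return parameterTypes; }
        public Object[] getParameters()       { return parameters; }
        public Object getResult()             { return result; }
        public void setResult(Object result)  { this.result = result; }
    }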

 2.4 Define MyInvocationHandler, which implements Java's InvocationHandler interface; the code is at "com.hongya | day009 | myrpc | server | MyInvocationHandler". The key part is its invoke method: it writes the class, method name, and parameters of the call we want to make out to the socket through an ObjectOutputStream, and then obtains the return value from the socket.
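A sketch of what that handler could look like, reusing the Invocation sketch above; the host/port constructor and the stream handling are assumptions for illustration:

    import java.io.ObjectInputStream;
    import java.io.ObjectOutputStream;
    import java.lang.reflect.InvocationHandler;
    import java.lang.reflect.Method;
    import java.net.Socket;

    public class MyInvocationHandler implements InvocationHandler {
        private final String host;
        private final int port;

        public MyInvocationHandler(String host, int port) {
            this.host = host;
            this.port = port;
        }

        @Override
        public Object invoke(Object proxy, Method method, Object[] args) throws Exception {
            // wrap the call into an Invocation and ship it to the server over a socket
            Invocation request = new Invocation(method.getDeclaringClass(),
                    method.getName(), method.getParameterTypes(), args);

            try (Socket socket = new Socket(host, port)) {
                ObjectOutputStream out = new ObjectOutputStream(socket.getOutputStream());
                out.writeObject(request);                       // send the request
                out.flush();                                    // make sure it reaches the server
                ObjectInputStream in = new ObjectInputStream(socket.getInputStream());
                Invocation response = (Invocation) in.readObject();
                return response.getResult();                    // the remote return value
            }
        }
    }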

2.5 Define RPCServer, which keeps listening for requests on a port; the code is at "com.hongya | day009 | myrpc | server | RPCServer". This part is relatively complex: it mainly starts a thread that parses the data written by the client, calls the corresponding method, and writes the return value back into the socket stream.
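A compressed sketch of such a server, consistent with the Invocation and handler sketches above; the register() method, the thread-per-request loop, and all names here are assumptions rather than the course's exact code:

    import java.io.ObjectInputStream;
    import java.io.ObjectOutputStream;
    import java.lang.reflect.Method;
    import java.net.ServerSocket;
    import java.net.Socket;
    import java.util.HashMap;
    import java.util.Map;

    public class RPCServer {
        private final int port;
        // registry mapping an interface to the implementation that serves it
        private final Map<Class<?>, Object> services = new HashMap<>();

        public RPCServer(int port) {
            this.port = port;
        }

        public void register(Class<?> interfaceClass, Object implementation) {
            services.put(interfaceClass, implementation);
        }

        public void start() {
            // listening thread: accept connections and hand each one to a worker thread
            new Thread(() -> {
                try (ServerSocket serverSocket = new ServerSocket(port)) {
                    while (true) {
                        Socket socket = serverSocket.accept();
                        new Thread(() -> handle(socket)).start();
                    }
                } catch (Exception e) {
                    e.printStackTrace();
                }
            }).start();
        }

        private void handle(Socket socket) {
            // deserialize the Invocation, reflectively call the target method,
            // store the result in the Invocation, and write it back to the client
            try (Socket s = socket) {
                ObjectInputStream in = new ObjectInputStream(s.getInputStream());
                Invocation invocation = (Invocation) in.readObject();
                Object service = services.get(invocation.getInterfaceClass());
                Method method = invocation.getInterfaceClass().getMethod(
                        invocation.getMethodName(), invocation.getParameterTypes());
                invocation.setResult(method.invoke(service, invocation.getParameters()));
                ObjectOutputStream out = new ObjectOutputStream(s.getOutputStream());
                out.writeObject(invocation);
                out.flush();
            } catch (Exception e) {
                e.printStackTrace();
            }
        }
    }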

2.6 Define ClientTest and ServerTest, both under the "com.hongya | day009 | myrpc" package, each with a main method. The server side starts listening on a port, while the client calls the RPC class's getProxy method to obtain a proxy object and then calls the login method.
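In the course code the client goes through an RPC.getProxy helper; the sketch below inlines that step with Proxy.newProxyInstance and reuses the sketches above, so the class bodies and the port 8888 are assumptions, not the lab sources:

    // ServerTest.java - register the implementation and start listening
    public class ServerTest {
        public static void main(String[] args) {
            RPCServer server = new RPCServer(8888);
            server.register(Login.class, new LoginImpl());
            server.start();
            System.out.println("custom RPC server listening on port 8888 ...");
        }
    }

    // ClientTest.java - obtain a proxy backed by MyInvocationHandler and call login()
    public class ClientTest {
        public static void main(String[] args) {
            Login login = (Login) java.lang.reflect.Proxy.newProxyInstance(
                    Login.class.getClassLoader(),
                    new Class<?>[]{Login.class},
                    new MyInvocationHandler("localhost", 8888));
            System.out.println(login.login("hongya", "123456"));
        }
    }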

Step 3: Test the custom RPC framework

  3.1 Sort out the code logic. This part is fairly complex and requires a clear picture of the flow; the overall code is shown below:

Our call sequence is: ServerTest starts, creates a new RPCServer, and calls its start method to launch the listening thread. The client starts, obtains the proxy class, and calls the login method; the request is wrapped into an Invocation and sent to the socket through an ObjectOutputStream. When the server-side thread picks it up, it reconstructs the Invocation, executes the corresponding method, and returns the result to the client.

  3.2 Start the server side. Open ServerTest, click the green triangle next to it, and run it. The result is shown in the figure:

An exception may be reported after the program starts; this depends on the local environment. As long as the program has not stopped, you can continue the experiment. To check whether a process has stopped, see the last step, Step 6: Close the processes.

3.3 Client call. Open ClientTest, click the green triangle next to it, and click "run". The client does not implement the Login method itself, yet it still produces output, because the method on the server side is called remotely. The same would work even if the server were not on the local machine.

Close the processes; see Step 6.

Step 4: Use the API of Hadoop's built-in RPC framework

  4.1 Define the interface on the client side. The fully qualified name of the interface is com.hongya | day009 | hadooprpc | LoginProtocol. Note that, unlike our hand-written RPC framework, a versionID field has to be added here; this is required by the Hadoop framework.
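A sketch of such a protocol interface; the login signature is an assumption, while the versionID field follows the convention expected by org.apache.hadoop.ipc.RPC:

    // fields declared in an interface are implicitly public static final
    public interface LoginProtocol {
        long versionID = 1L;   // version identifier required by Hadoop RPC

        String login(String username, String password);
    }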

4.2 The implementation class on the server side. The implementation class is com.hongya | day009 | hadooprpc | LoginProtocolImpl; just as with the hand-written framework, only the concrete method needs to be implemented.
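A matching sketch, again with a placeholder method body rather than the course's actual logic:

    public class LoginProtocolImpl implements LoginProtocol {
        @Override
        public String login(String username, String password) {
            // server-side logic; a placeholder response for this sketch
            return "hello " + username + ", login succeeded";
        }
    }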

4.3 Server-side code that starts listening. The implementation class is com.hongya | day009 | hadooprpc | Server, and its content is similar to the hand-written server side. Note that the code has to follow Hadoop's API; the framework source for this part lives in org.apache.hadoop.ipc.RPC:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.ipc.RPC;

// build and start an RPC server bound to localhost:8888 that serves LoginProtocol
RPC.Server server = new RPC.Builder(new Configuration())
        .setBindAddress("localhost")
        .setPort(8888)
        .setProtocol(LoginProtocol.class)
        .setInstance(new LoginProtocolImpl())
        .build();
server.start();

4.4 The client sends the remote request. Note that the client and the server both use localhost as the IP here, so remote communication is simulated locally. If you can work together with someone else, you can also change the IP address here, but make sure the client and the server stay consistent.
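The client-side call is not reproduced in these notes, so here is a sketch of what it might look like with Hadoop's RPC.getProxy; the LoginClient class name matches the one run in step 5.3, while the login arguments and port 8888 are assumptions:

    import java.net.InetSocketAddress;

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.ipc.RPC;

    public class LoginClient {
        public static void main(String[] args) throws Exception {
            // obtain a proxy for LoginProtocol from the server at localhost:8888
            LoginProtocol proxy = RPC.getProxy(
                    LoginProtocol.class,
                    LoginProtocol.versionID,
                    new InetSocketAddress("localhost", 8888),
                    new Configuration());
            System.out.println(proxy.login("hongya", "123456"));
            RPC.stopProxy(proxy);   // release the connection when done
        }
    }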

Step 5: Test the Hadoop-based RPC communication example

  5.1 Sort out the logic. At this point we only have these four classes: an interface and its implementation class, a client program, and a server program. Because we use Hadoop's built-in RPC framework, the calling process is the same as with our hand-written framework above; the only difference is that the dynamic proxying, socket communication, and multithreaded listening in the middle are implemented by Hadoop.

5.2 Start the server side: click the green triangle next to the Server class to start the program:

5.3 Start the client: click the green triangle next to the LoginClient class to start the client program. Once the server from step 5.2 has been started, it keeps running as long as you do not stop it, so you can run the client multiple times, and each run outputs the logic we implemented in LoginProtocolImpl. Any exception messages reported here are related to the local environment and can be ignored.

Step 6: Close the processes

  6.1 Note that every time a program runs, the console shows output. If there is still a red dot on the left, the process has not stopped; at the end of the experiment you need to click that red dot to stop the process:
