Remote Submission of hadoop shell Commands

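I. How remote submission of hadoop shell commands works
    In most scenarios today, hadoop shell commands are run interactively from a Linux shell. That is painful both for remote operation and for developers who are used to working on Windows or through a web interface.
    The implementation of CommandExecutor.java, found in the src\test\org\apache\hadoop\cli\util directory of the hadoop distribution, may offer some inspiration.
    The following shows how it executes a hadoop dfsadmin command.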
public static int executeDFSAdminCommand(final String cmd, final String namenode) {
      exitCode = 0;

      // Redirect stdout and stderr into an in-memory buffer.
      ByteArrayOutputStream bao = new ByteArrayOutputStream();
      PrintStream origOut = System.out;
      PrintStream origErr = System.err;

      System.setOut(new PrintStream(bao));
      System.setErr(new PrintStream(bao));

      DFSAdmin shell = new DFSAdmin();
      String[] args = getCommandAsArgs(cmd, "NAMENODE", namenode);
      cmdExecuted = cmd;

      try {
        ToolRunner.run(shell, args);
      } catch (Exception e) {
        e.printStackTrace();
        lastException = e;
        exitCode = -1;
      } finally {
        // Restore the original streams even if the command failed.
        System.setOut(origOut);
        System.setErr(origErr);
      }

      // Everything the command printed is now available as a string.
      commandOutput = bao.toString();

      return exitCode;
  }
   
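    At the start, System.setOut and System.setErr redirect the application's standard output and standard error to a ByteArrayOutputStream.
    After the DFSAdmin shell is created, ToolRunner.run executes the command given by args.
    Once the call completes, standard output and standard error are restored to the original streams.
    A DFSAdmin object corresponds to the hadoop shell command "hadoop dfsadmin".
    An MRAdmin object corresponds to the hadoop shell command "hadoop mradmin".
    An FsShell object corresponds to the hadoop shell command "hadoop fs".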
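    The redirect, run, restore pattern described above can be tried without a hadoop installation. The following is a minimal sketch, not part of hadoop: the class name CaptureDemo and the method capture are hypothetical, and a plain Runnable stands in for the ToolRunner.run call. Because System.setOut and System.setErr change process-wide state, the sketch also makes the method synchronized so that two callers cannot mix their output; that locking is an assumption added here, not something CommandExecutor does.

```java
import java.io.ByteArrayOutputStream;
import java.io.PrintStream;

public class CaptureDemo {

    /**
     * Runs the task with System.out and System.err redirected to one
     * in-memory buffer and returns everything the task printed.
     * synchronized: the redirect is global to the JVM, so only one
     * capture may be active at a time (an assumption of this sketch).
     */
    public static synchronized String capture(Runnable task) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        PrintStream origOut = System.out;
        PrintStream origErr = System.err;
        PrintStream redirected = new PrintStream(buffer, true);

        System.setOut(redirected);
        System.setErr(redirected);
        try {
            // In the hadoop case this is where ToolRunner.run(shell, args) goes.
            task.run();
        } finally {
            // Always put the original streams back, even on failure.
            System.setOut(origOut);
            System.setErr(origErr);
        }
        redirected.flush();
        return buffer.toString();
    }

    public static void main(String[] args) {
        String output = capture(() -> {
            System.out.println("printed to stdout by the task");
            System.err.println("printed to stderr by the task");
        });
        // Back on the real console: show what was captured.
        System.out.print(output);
    }
}
```

    Sending both streams to the same buffer keeps normal output and error messages in the order they were printed, which is convenient when the whole text is shown to a remote user.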
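II. Using embedded jetty, develop a servlet that provides basic web-based remote submission of hadoop shell commands
    1. Design an HTML page that submits the command parameters to the servlet, as shown in the figure below:
    [Figure: HTML form with the select_type and txt_command fields]
    2. Write the servlet, as follows: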
  
        // Set the content type before obtaining the writer so that it takes effect.
        response.setContentType("text/html");
        PrintWriter writer = response.getWriter();
        String type = request.getParameter("select_type");
        String command = request.getParameter("txt_command");
        if (type == null) {
            writer.write("select is null");
            return;
        }
        if (command == null) {
            writer.write("command is null");
            return;
        }

        // Map the selected type to the matching hadoop tool
        // (Tool is org.apache.hadoop.util.Tool): 1 = dfsadmin, 2 = mradmin, 3 = fs.
        Tool shell;
        if (type.equals("1")) {
            shell = new DFSAdmin();
        } else if (type.equals("2")) {
            shell = new MRAdmin();
        } else if (type.equals("3")) {
            shell = new FsShell();
        } else {
            // Return before redirecting, so the streams are never left redirected.
            writer.write("unknown select_type");
            return;
        }
        // Split on any run of whitespace so repeated spaces do not yield empty arguments.
        String[] items = command.trim().split("\\s+");

        // Capture everything the tool prints to stdout and stderr.
        ByteArrayOutputStream bao = new ByteArrayOutputStream();
        PrintStream origOut = System.out;
        PrintStream origErr = System.err;
        System.setOut(new PrintStream(bao));
        System.setErr(new PrintStream(bao));
        try {
            ToolRunner.run(shell, items);
        } catch (Exception e) {
            // The stack trace goes to the redirected stderr, so the client sees it too.
            e.printStackTrace();
        } finally {
            // Restore the original streams whichever tool ran.
            System.setOut(origOut);
            System.setErr(origErr);
        }
        writer.write(bao.toString().replaceAll("\n", "<br>"));
    The program above handles the dfsadmin, mradmin, and fs hadoop shell commands in a simple way and writes the captured output back to the client as a string.
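    The servlet sends the captured text as text/html and only replaces newlines with <br>. Command output can itself contain characters such as < and & (usage messages like "-ls <path>", for example), which a browser would then interpret as markup. The following is a minimal sketch of a helper that escapes those characters before converting line breaks. The class name HtmlOutput and the method toHtml are hypothetical and do not appear in the original program.

```java
public class HtmlOutput {

    /**
     * Escapes the characters that are special in HTML and turns each
     * newline into a <br> tag, so raw command output displays as typed.
     */
    public static String toHtml(String text) {
        StringBuilder sb = new StringBuilder(text.length() + 16);
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            switch (c) {
                case '&':  sb.append("&amp;");  break;
                case '<':  sb.append("&lt;");   break;
                case '>':  sb.append("&gt;");   break;
                case '"':  sb.append("&quot;"); break;
                case '\n': sb.append("<br>");   break;
                case '\r': break;               // drop carriage returns
                default:   sb.append(c);
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Example input resembling a usage message followed by a second line.
        System.out.println(toHtml("Usage: -ls <path>\nDFS Used%: 4.53%"));
    }
}
```

    With such a helper, the last line of the servlet would become writer.write(HtmlOutput.toHtml(bao.toString())) in place of the replaceAll call.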
   
    A simple test with -report produces the following result (partial excerpt):
Configured Capacity: 7633977958400 (6.94 TB)
Present Capacity: 7216439562240 (6.56 TB)
DFS Remaining: 6889407496192 (6.27 TB)
DFS Used: 327032066048 (304.57 GB)
DFS Used%: 4.53%
Under replicated blocks: 42
Blocks with corrupt replicas: 0
Missing blocks: 0

-------------------------------------------------
Datanodes available: 4 (4 total, 0 dead)

Name: 10.16.45.226:50010
Decommission Status : Normal
Configured Capacity: 1909535137792 (1.74 TB)
DFS Used: 103113867264 (96.03 GB)
Non DFS Used: 97985679360 (91.26 GB)
DFS Remaining: 1708435591168(1.55 TB)
DFS Used%: 5.4%
DFS Remaining%: 89.47%
Last contact: Wed Mar 21 14:37:24 CST 2012
   
   

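The code above is built on embedded jetty. At run time it also needs the hadoop dependency jars and the hadoop config files on the classpath, as the following script illustrates: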

#!/bin/sh
CLASSPATH="/usr/local/hadoop/conf"
for f in $HADOOP_HOME/hadoop-core-*.jar; do
  CLASSPATH=${CLASSPATH}:$f;
done
# add libs to CLASSPATH
for f in $HADOOP_HOME/lib/*.jar; do
  CLASSPATH=${CLASSPATH}:$f;
done
for f in $HADOOP_HOME/lib/jsp-2.1/*.jar; do
  CLASSPATH=${CLASSPATH}:$f;
done

echo $CLASSPATH
java -cp "$CLASSPATH:executor.jar" RunServer


Reposted from blog.csdn.net/renzhehongyi/article/details/8093916