Recently, out of curiosity, I spent some time studying YARN and took a quick look at its source code. I downloaded the source directly from the hadoop-common trunk and built it myself, which keeps me in sync with the community. If you are familiar with Maven, the build is straightforward.
1. Service
Reading the hadoop 3.0-SNAPSHOT source, you can see that every functional component of the system is abstracted as a service. Each service has a state machine with four states: not initialized (NOTINITED), initialized (INITED), started (STARTED), and stopped (STOPPED).
public interface Service extends Closeable {

  /**
   * Service states
   */
  public enum STATE {
    /** Constructed but not initialized */
    NOTINITED(0, "NOTINITED"),

    /** Initialized but not started or stopped */
    INITED(1, "INITED"),

    /** started and not stopped */
    STARTED(2, "STARTED"),

    /** stopped. No further state transitions are permitted */
    STOPPED(3, "STOPPED");

    // rest omitted...
  }

  // rest omitted...
}
A service moves between these four states only along permitted transitions. The progression is one-way: once a service has advanced past a state it cannot return to it, and STOPPED is terminal.
public class ServiceStateModel {

  /**
   * Map of all valid state transitions
   *    [current] [proposed1, proposed2, ...]
   */
  private static final boolean[][] statemap = {
    //               uninited inited started stopped
    /* uninited  */ {false,   true,  false,  true},
    /* inited    */ {false,   true,  true,   true},
    /* started   */ {false,   false, true,   true},
    /* stopped   */ {false,   false, false,  true},
  };

  // rest omitted...
}
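To make the transition matrix concrete, here is a minimal, self-contained sketch that enforces the same matrix. The class and method names (`MiniServiceState`, `enterState`) are mine for illustration, not Hadoop's, though `enterState` mirrors the role of the real `ServiceStateModel`'s transition check:

```java
public class MiniServiceState {
    public enum STATE { NOTINITED, INITED, STARTED, STOPPED }

    // Rows: current state; columns: proposed state (same matrix as above).
    private static final boolean[][] STATEMAP = {
        /* NOTINITED */ {false, true,  false, true},
        /* INITED    */ {false, true,  true,  true},
        /* STARTED   */ {false, false, true,  true},
        /* STOPPED   */ {false, false, false, true},
    };

    private STATE state = STATE.NOTINITED;

    /** Returns true if moving from the current state to proposed is allowed. */
    public boolean isValidTransition(STATE proposed) {
        return STATEMAP[state.ordinal()][proposed.ordinal()];
    }

    /** Moves to proposed, throwing if the matrix forbids the transition. */
    public STATE enterState(STATE proposed) {
        if (!isValidTransition(proposed)) {
            throw new IllegalStateException(state + " -> " + proposed + " is not allowed");
        }
        STATE old = state;
        state = proposed;
        return old;
    }
}
```

Walking a fresh instance through INITED, STARTED, STOPPED succeeds, while skipping straight from NOTINITED to STARTED throws, which is exactly the one-way discipline described above.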
2. Client
On the server side of hadoop 3.0, the notion of a job has been replaced by an application. The client-side API, however, stays essentially compatible with the hadoop 1.0 MapReduce interface; only the implementations behind it have changed. When writing Map/Reduce code you still work in terms of a Job.
The client submits a job with the following code:
/**
 * Submit the job to the cluster and return immediately.
 * @throws IOException
 */
public void submit()
    throws IOException, InterruptedException, ClassNotFoundException {
  ensureState(JobState.DEFINE);
  setUseNewAPI();
  connect();
  final JobSubmitter submitter =
      getJobSubmitter(cluster.getFileSystem(), cluster.getClient());
  status = ugi.doAs(new PrivilegedExceptionAction<JobStatus>() {
    public JobStatus run() throws IOException, InterruptedException,
        ClassNotFoundException {
      return submitter.submitJobInternal(Job.this, cluster);
    }
  });
  state = JobState.RUNNING;
  LOG.info("The url to track the job: " + getTrackingURL());
}
The connect() call above builds the cluster information; in hadoop 1.0 terms, this is where the connection to the jobtracker was established. In hadoop 3.0 there is no jobtracker any more; it has been replaced by the resourcemanager. So in practice the client creates a YARNRunner object to submit the job to the YARN cluster, and YARNRunner submits it to the resourcemanager through its ResourceMgrDelegate proxy. The relevant code of YARNRunner.submitJob(...) follows.
@Override
public JobStatus submitJob(JobID jobId, String jobSubmitDir, Credentials ts)
    throws IOException, InterruptedException {
  addHistoryToken(ts);

  // Construct necessary information to start the MR AM
  ApplicationSubmissionContext appContext =
      createApplicationSubmissionContext(conf, jobSubmitDir, ts);

  // Submit to ResourceManager
  try {
    ApplicationId applicationId =
        resMgrDelegate.submitApplication(appContext);

    ApplicationReport appMaster =
        resMgrDelegate.getApplicationReport(applicationId);
    String diagnostics =
        (appMaster == null ? "application report is null"
            : appMaster.getDiagnostics());
    if (appMaster == null
        || appMaster.getYarnApplicationState() == YarnApplicationState.FAILED
        || appMaster.getYarnApplicationState() == YarnApplicationState.KILLED) {
      throw new IOException("Failed to run job : " + diagnostics);
    }
    return clientCache.getClient(jobId).getJobStatus(jobId);
  } catch (YarnException e) {
    throw new IOException(e);
  }
}
3. Inter-process communication
Exchanging data between processes requires two pieces: serialization/deserialization and an RPC framework. Hadoop implements its own RPC framework. For serialization, hadoop 3.0-SNAPSHOT relies heavily on protobuf, a marked departure from the Writable mechanism used in hadoop 1.0.
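For contrast, the Writable idea can be sketched in a few lines: a record serializes itself field by field to a DataOutput and reads itself back in the same order, so the field order is the implicit schema (which is exactly why protobuf's explicit, evolvable schemas are attractive). The `MiniWritable` interface and `TaskStatus` record below are illustrative stand-ins modeled on `org.apache.hadoop.io.Writable`, not actual Hadoop classes:

```java
import java.io.*;

public class MiniWritableDemo {

    /** Same shape as org.apache.hadoop.io.Writable, reduced to the essentials. */
    public interface MiniWritable {
        void write(DataOutput out) throws IOException;
        void readFields(DataInput in) throws IOException;
    }

    /** A hypothetical record; field order is the wire format. */
    public static class TaskStatus implements MiniWritable {
        public String taskId;
        public float progress;

        public void write(DataOutput out) throws IOException {
            out.writeUTF(taskId);
            out.writeFloat(progress);
        }

        public void readFields(DataInput in) throws IOException {
            taskId = in.readUTF();
            progress = in.readFloat();
        }
    }

    /** Serializes a record to bytes and deserializes it into a fresh copy. */
    public static TaskStatus roundTrip(TaskStatus original) {
        try {
            ByteArrayOutputStream buf = new ByteArrayOutputStream();
            original.write(new DataOutputStream(buf));
            TaskStatus copy = new TaskStatus();
            copy.readFields(
                new DataInputStream(new ByteArrayInputStream(buf.toByteArray())));
            return copy;
        } catch (IOException e) {
            throw new UncheckedIOException(e);  // byte-array streams never throw
        }
    }
}
```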
3.1. hadoop 1.x.x
In hadoop 1.x.x, org.apache.hadoop.ipc.RPC is the single entry point to the RPC framework. It exposes two important families of static methods: getServer and getProxy. getProxy creates an Invoker, a remote-access proxy to the jobtracker, while getServer news up a Server that listens on the configured port and handles requests. The two sides exchange data through the Writable serialization/deserialization mechanism.
The getServer method:
/** Construct a server for a protocol implementation instance listening on a
 * port and address, with a secret manager. */
public static Server getServer(final Object instance,
    final String bindAddress, final int port,
    final int numHandlers,
    final boolean verbose, Configuration conf,
    SecretManager<? extends TokenIdentifier> secretManager)
    throws IOException {
  return new Server(instance, conf, bindAddress, port, numHandlers, verbose,
      secretManager);
}
The getProxy method:
/** Construct a client-side proxy object that implements the named protocol,
 * talking to a server at the named address. */
public static VersionedProtocol getProxy(
    Class<? extends VersionedProtocol> protocol,
    long clientVersion, InetSocketAddress addr,
    UserGroupInformation ticket, Configuration conf,
    SocketFactory factory, int rpcTimeout) throws IOException {

  if (UserGroupInformation.isSecurityEnabled()) {
    SaslRpcServer.init(conf);
  }
  VersionedProtocol proxy = null;
  String strAddr = addr.getHostName() + ":" + addr.getPort();
  String jtServers = conf.get("jobtracker.servers", "");
  if (jtServers.contains(strAddr)) {
    proxy = (VersionedProtocol) Proxy.newProxyInstance(
        protocol.getClassLoader(), new Class[] { protocol },
        new RPCRetryAndSwitchInvoker(protocol, addr, ticket, conf, factory,
            rpcTimeout));
  } else {
    proxy = (VersionedProtocol) Proxy.newProxyInstance(
        protocol.getClassLoader(), new Class[] { protocol },
        new Invoker(protocol, addr, ticket, conf, factory, rpcTimeout));
  }
  long serverVersion = proxy.getProtocolVersion(protocol.getName(),
      clientVersion);
  if (serverVersion == clientVersion) {
    return proxy;
  } else {
    throw new VersionMismatch(protocol.getName(), clientVersion,
        serverVersion);
  }
}
In hadoop 1.x.x, JobTracker and TaskTracker implement the methods of the corresponding protocol interfaces such as ClientProtocol and TaskUmbilicalProtocol; the business logic behind each remote call lives in those methods.
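The server half of this arrangement can be sketched like so: the Server holds the protocol implementation instance handed to getServer, and after deserializing a request it dispatches to that instance by method name via reflection. Everything below (the class names, the simplified `call` method) is a hypothetical stand-in for what `org.apache.hadoop.ipc.Server` does after decoding an Invocation, not actual Hadoop code:

```java
import java.lang.reflect.Method;

public class MiniRpcServerDemo {

    /** A toy protocol interface, standing in for e.g. TaskUmbilicalProtocol. */
    public interface UmbilicalProtocol {
        String ping(String taskId);
    }

    /** A toy implementation, standing in for the TaskTracker side. */
    public static class TaskTrackerImpl implements UmbilicalProtocol {
        public String ping(String taskId) {
            return "pong:" + taskId;
        }
    }

    private final Object instance;  // the protocol implementation being hosted

    public MiniRpcServerDemo(Object instance) {
        this.instance = instance;
    }

    /**
     * Dispatches a (already deserialized) call: looks up methodName on the
     * hosted instance by reflection and invokes it with the given args.
     */
    public Object call(String methodName, Object... args) {
        try {
            Class<?>[] types = new Class<?>[args.length];
            for (int i = 0; i < args.length; i++) {
                types[i] = args[i].getClass();
            }
            Method m = instance.getClass().getMethod(methodName, types);
            return m.invoke(instance, args);
        } catch (ReflectiveOperationException e) {
            throw new RuntimeException(e);
        }
    }
}
```

In the real Server the method name and Writable-encoded arguments arrive over the socket; the reflective dispatch at the end is the same idea.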
3.2. hadoop 3.x.x
org.apache.hadoop.ipc.RPC in hadoop 3.x.x likewise offers getServer and getProxy, but its implementation is considerably more involved than in 1.x.x. The unified entry point of this RPC framework is the org.apache.hadoop.yarn.factories.RecordFactory interface, which has a single method, newRecordInstance; org.apache.hadoop.yarn.factories.impl.pb.RecordFactoryPBImpl implements the org.apache.hadoop.yarn.factories.RecordFactory interface.
It can construct an org.apache.hadoop.yarn.factories.impl.pb.RpcClientFactoryPBImpl object or an org.apache.hadoop.yarn.factories.impl.pb.RpcServerFactoryPBImpl object.
- org.apache.hadoop.yarn.factories.impl.pb.RpcClientFactoryPBImpl creates the client side. Its getClient and stopClient methods, keyed on the protocol interface name, produce an object from the org.apache.hadoop.yarn.api.impl.pb.client package; those objects create the RPC proxies used for communication between nodes.
- org.apache.hadoop.yarn.factories.impl.pb.RpcServerFactoryPBImpl creates the server side. Its getServer method, again keyed on the interface name, constructs an object from the org.apache.hadoop.yarn.api.impl.pb.service package; the RPC handling logic is implemented in those classes.
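This "produce a class keyed on the interface name" trick is convention-based reflection: derive the implementation class name from the protocol interface name, then instantiate it with Class.forName. Here is a minimal sketch of that assumption. The `Greeter` interface and the `PBClientImpl` naming suffix are hypothetical, loosely modeled on the convention used by the PB factory classes, and the sketch assumes the classes live in the default package so that the derived binary name resolves:

```java
import java.lang.reflect.Constructor;

public class MiniFactoryDemo {

    /** A toy protocol interface (hypothetical). */
    public interface Greeter {
        String greet();
    }

    /** Stands in for a generated <Protocol>PBClientImpl class. */
    public static class GreeterPBClientImpl implements Greeter {
        public String greet() {
            return "hello";
        }
    }

    /**
     * Derives "<protocol binary name>PBClientImpl", loads that class, and
     * instantiates it through its no-arg constructor.
     */
    public static Object getClient(Class<?> protocol) {
        try {
            String implName = protocol.getName() + "PBClientImpl";
            Class<?> implClass = Class.forName(implName);
            Constructor<?> ctor = implClass.getConstructor();
            return ctor.newInstance();
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("no impl found for " + protocol, e);
        }
    }
}
```

The payoff of the convention is that registering a new protocol needs no factory changes; only a class with the right name has to exist on the classpath.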
Whether on the server or the client side, the two ends ultimately talk to each other through the RpcEngine interface; org.apache.hadoop.ipc.WritableRpcEngine is one implementation of RpcEngine.
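Both engines build the client side on java.lang.reflect.Proxy: every call made on the proxy object lands in an InvocationHandler, and in the real Invoker that handler serializes the method name and arguments onto the wire and blocks for the reply. The hypothetical handler below skips the networking and returns a canned value, so only the proxy mechanism itself is on display:

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Method;
import java.lang.reflect.Proxy;

public class MiniRpcProxyDemo {

    /** A toy slice of the protocol, enough to show the dispatch. */
    public interface ClientProtocol {
        long getProtocolVersion(String protocol, long clientVersion);
    }

    /** Stand-in for Invoker: intercepts every call on the proxy. */
    public static class CannedInvoker implements InvocationHandler {
        public Object invoke(Object proxy, Method method, Object[] args) {
            // A real Invoker would serialize method.getName() and args,
            // send them to the server, and deserialize the response.
            return 2L;  // pretend the server answered "version 2"
        }
    }

    /** Mirrors the Proxy.newProxyInstance call seen in both getProxy versions. */
    public static ClientProtocol getProxy() {
        return (ClientProtocol) Proxy.newProxyInstance(
            ClientProtocol.class.getClassLoader(),
            new Class<?>[] { ClientProtocol.class },
            new CannedInvoker());
    }
}
```

This is why the client never needs generated stubs: the JDK synthesizes a class implementing the protocol interface at runtime, and all the per-call work lives in the handler.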
The implementation of org.apache.hadoop.ipc.WritableRpcEngine.getProxy(Class&lt;T&gt;, long, InetSocketAddress, UserGroupInformation, Configuration, SocketFactory, int, RetryPolicy) is as follows.
/** Construct a client-side proxy object that implements the named protocol,
 * talking to a server at the named address.
 * @param <T> */
@Override
@SuppressWarnings("unchecked")
public <T> ProtocolProxy<T> getProxy(Class<T> protocol, long clientVersion,
    InetSocketAddress addr, UserGroupInformation ticket,
    Configuration conf, SocketFactory factory,
    int rpcTimeout, RetryPolicy connectionRetryPolicy)
    throws IOException {

  if (connectionRetryPolicy != null) {
    throw new UnsupportedOperationException(
        "Not supported: connectionRetryPolicy=" + connectionRetryPolicy);
  }

  T proxy = (T) Proxy.newProxyInstance(protocol.getClassLoader(),
      new Class[] { protocol },
      new Invoker(protocol, addr, ticket, conf, factory, rpcTimeout));
  return new ProtocolProxy<T>(protocol, proxy, true);
}
The implementation of org.apache.hadoop.ipc.WritableRpcEngine.getServer(Class&lt;?&gt;, Object, String, int, int, int, int, boolean, Configuration, SecretManager&lt;? extends TokenIdentifier&gt;, String) is as follows.
/* Construct a server for a protocol implementation instance listening on a
 * port and address. */
@Override
public RPC.Server getServer(Class<?> protocolClass,
    Object protocolImpl, String bindAddress, int port,
    int numHandlers, int numReaders, int queueSizePerHandler,
    boolean verbose, Configuration conf,
    SecretManager<? extends TokenIdentifier> secretManager,
    String portRangeConfig)
    throws IOException {
  return new Server(protocolClass, protocolImpl, conf, bindAddress, port,
      numHandlers, numReaders, queueSizePerHandler, verbose, secretManager,
      portRangeConfig);
}
The RpcEngine idea adopted in hadoop 3.0 is not entirely new to the hadoop ecosystem: hbase 0.94.1 had already taken the same approach, back when hbase did not yet support hadoop versions above 2.0.