Giraph source code analysis (1) - starting the ZooKeeper service

Author | White Pine

[Note: This is an original article. Please contact the author before quoting or reprinting it.]

Giraph description:

Apache Giraph is an iterative graph processing system built for high scalability. For example, it is currently used at Facebook to analyze the social graph formed by users and their connections. Giraph originated as the open-source counterpart to Pregel, the graph processing architecture developed at Google and described in a 2010 paper. Both systems are inspired by the Bulk Synchronous Parallel model of distributed computation introduced by Leslie Valiant. Giraph adds several features beyond the basic Pregel model, including master computation, sharded aggregators, edge-oriented input, out-of-core computation, and more. With a steady development cycle and a growing community of users worldwide, Giraph is a natural choice for unleashing the potential of structured datasets at a massive scale.

Principle:

Giraph is built on Hadoop. It wraps its computation in the MapReduce Mapper and does not use the reducer. The Mapper runs multiple iterations, and each iteration corresponds to one superstep in the BSP model, so one Hadoop job corresponds to one BSP job. The architecture is shown in the figure.

[Figure: Giraph architecture]

The function of each part is as follows:

1. ZooKeeper: responsible for computation state

–partition/worker mapping

–global state: #superstep

–checkpoint paths, aggregator values, statistics

2. Master: responsible for coordination

–assigns partitions to workers

–coordinates synchronization

–requests checkpoints

–aggregates aggregator values

–collects health statuses

3. Worker: responsible for vertices

–invokes the compute() function of active vertices

–sends, receives and assigns messages

–computes local aggregation values
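The division of labour above can be illustrated with a toy, single-process BSP loop. This is only a sketch under stated assumptions, not Giraph code: the class name ToyBsp and its methods are hypothetical, there is a single in-memory "worker", and the "master" role is reduced to deciding at the barrier whether another superstep is needed. It computes single-source shortest paths, the same problem as the SSSP example used later.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Toy single-process BSP loop (hypothetical names; not Giraph's API). */
final class ToyBsp {
    /** edges[v] holds {target, weight} pairs for vertex v; weights must be non-negative. */
    static double[] shortestPaths(int[][][] edges, int source) {
        double[] dist = new double[edges.length];
        Arrays.fill(dist, Double.POSITIVE_INFINITY);

        // Messages delivered at the start of the current superstep.
        Map<Integer, List<Double>> inbox = new HashMap<>();
        inbox.put(source, new ArrayList<>(Arrays.asList(0.0)));

        // "Master": keep starting supersteps while any vertex has messages.
        while (!inbox.isEmpty()) {
            Map<Integer, List<Double>> outbox = new HashMap<>();
            // "Worker": call compute() on every vertex that received messages.
            for (Map.Entry<Integer, List<Double>> e : inbox.entrySet()) {
                compute(e.getKey(), e.getValue(), dist, edges, outbox);
            }
            // Barrier: messages sent in superstep s become visible in s + 1.
            inbox = outbox;
        }
        return dist;
    }

    private static void compute(int v, List<Double> messages, double[] dist,
                                int[][][] edges, Map<Integer, List<Double>> outbox) {
        double best = dist[v];
        for (double m : messages) {
            best = Math.min(best, m);
        }
        if (best < dist[v]) {
            dist[v] = best;
            // Only an improved vertex sends messages; otherwise it stays silent.
            for (int[] edge : edges[v]) {
                outbox.computeIfAbsent(edge[0], k -> new ArrayList<>()).add(best + edge[1]);
            }
        }
    }
}
```

The loop ends when a superstep produces no messages, which plays the role of every vertex having voted to halt.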


Explanation

(1) Experimental environment

Three servers: test165, test62, test63. test165 runs both the JobTracker and a TaskTracker.

Test case: the SSSP example program that ships with the official distribution, run on simulated data that I generated myself.

Run command: hadoop jar giraph-examples-1.0.0-for-hadoop-0.20.203.0-jar-with-dependencies.jar org.apache.giraph.GiraphRunner org.apache.giraph.examples.SimpleShortestPathsVertex -vif org.apache.giraph.io.formats.JsonLongDoubleFloatDoubleVertexInputFormat -vip /user/giraph/SSSP -of org.apache.giraph.io.formats.IdWithValueTextOutputFormat -op /user/giraph/output-sssp-debug-7 -w 5
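Simulated input of this kind can be produced with a few lines of Java. The sketch below assumes the line layout that JsonLongDoubleFloatDoubleVertexInputFormat reads, as I understand it: one JSON array per vertex of the form [vertexId, vertexValue, [[targetId, edgeWeight], ...]]. The class name, the ring-shaped graph and the weights are made up for illustration; the article does not say what the author's data looked like.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical generator for a small SSSP test graph. */
final class SsspInputSketch {
    /** Builds one line per vertex: [id,value,[[target,weight]]]. */
    static List<String> ringGraph(int vertexCount) {
        List<String> lines = new ArrayList<>();
        for (int id = 0; id < vertexCount; id++) {
            int next = (id + 1) % vertexCount;  // each vertex points to its successor
            float weight = id + 1;              // arbitrary illustrative edge weight
            lines.add("[" + id + ",0,[[" + next + "," + weight + "]]]");
        }
        return lines;
    }
}
```

Writing the lines returned by ringGraph(n) to a text file and uploading that file to the -vip path would give input of the assumed shape.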

(2) To save space, only the core code snippets are shown below.

(3) In core-site.xml, hadoop.tmp.dir is set to /home/hadoop/hadooptmp

(4) This article was written over several debugging runs, so the JobID is not the same throughout the text. Readers may treat all of them as the same JobID.

(5) Follow-up articles follow the same rules.

The org.apache.giraph.graph.GraphMapper class

Giraph's custom class org.apache.giraph.graph.GraphMapper extends Hadoop's org.apache.hadoop.mapreduce.Mapper<Object, Object, Object, Object> class and overrides the setup(), map(), cleanup() and run() methods. The GraphMapper class is described as follows:

“This mapper that will execute the BSP graph tasks alloted to this worker. All tasks will be performed by calling the GraphTaskManager object managed by this GraphMapper wrapper classs. Since this mapper will not be passing data by key-value pairs through the MR framework, the Mapper parameter types are irrelevant, and set to Object type.”

The BSP computation logic is encapsulated in the GraphMapper class, which holds a GraphTaskManager object that handles the job's tasks. Each GraphMapper object corresponds to one compute node in BSP.

In GraphMapper's setup() method, a GraphTaskManager object is created and its setup() method is called to do some initialization work, as follows:

[Figure: code of GraphMapper.setup()]

The map() method is empty, because all operations are encapsulated in the GraphTaskManager class. The run() method calls the execute() method of the GraphTaskManager object to perform the BSP iterative computation.

[Figure: code of GraphMapper.run()]
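The delegation pattern described above can be sketched without Hadoop on the classpath. Everything here is a stand-in: ToyTaskManager and ToyGraphMapper are hypothetical classes that only mirror the shape of the calls the text describes (setup() creates the manager, map() does nothing, run() drives the task). Giraph's real classes take a Hadoop Context and do far more, and the try/finally is my own choice rather than something taken from the source.

```java
import java.util.ArrayList;
import java.util.List;

/** Stand-in for GraphTaskManager: records the calls it receives. */
final class ToyTaskManager {
    final List<String> calls = new ArrayList<>();
    void setup()   { calls.add("setup"); }    // ZooKeeper start-up would happen here
    void execute() { calls.add("execute"); }  // the superstep loop would run here
    void cleanup() { calls.add("cleanup"); }
}

/** Stand-in for GraphMapper: a thin wrapper that delegates everything. */
final class ToyGraphMapper {
    private ToyTaskManager manager;

    void setup() {
        manager = new ToyTaskManager();
        manager.setup();
    }

    void map(Object key, Object value) {
        // Intentionally empty: no key-value pairs flow through this mapper.
    }

    void cleanup() {
        manager.cleanup();
    }

    /** Drives the whole task and returns the recorded calls for inspection. */
    List<String> run() {
        setup();
        try {
            manager.execute();
        } finally {
            cleanup();
        }
        return manager.calls;
    }
}
```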

The org.apache.giraph.graph.GraphTaskManager class

Function: The Giraph-specific business logic for a single BSP compute node in whatever underlying type of cluster our Giraph job will run on. Owning object will provide the glue into the underlying cluster framework and will call this object to perform Giraph work.

The setup() method is described next. The code is as follows:

[Figure: code of GraphTaskManager.setup()]

### The function of each method, in order:

1. locateZookeeperClasspath(zkPathList)

Finds the local copy of the ZK jar, which is used to start the ZooKeeper service. Its path is /home/hadoop/hadooptmp/mapred/local/taskTracker/root/jobcache/job_201403270456_0001/jars/job.jar.
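As a rough sketch of what locating the ZK jar can amount to: if the framework hands the task a list of local classpath entries, join them into one classpath string; if the list is empty, fall back to the job jar itself, which here is a jar-with-dependencies that bundles ZooKeeper. The class and method names below are hypothetical, and the fallback rule is my assumption about the general idea, not a transcription of Giraph's method.

```java
import java.util.List;

/** Hypothetical helper; not Giraph's locateZookeeperClasspath(). */
final class ZkClasspathSketch {
    static String zkClasspath(List<String> localPaths, String jobJar) {
        if (localPaths == null || localPaths.isEmpty()) {
            // Nothing was distributed separately: assume a fat jar that bundles ZooKeeper.
            return jobJar;
        }
        // Otherwise build a ':'-separated classpath from the local copies.
        return String.join(":", localPaths);
    }
}
```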

2. startZooKeeperManager(): initializes and configures the ZooKeeperManager.

It is defined as follows:

[Figure: code of startZooKeeperManager()]

3. The org.apache.giraph.zk.ZooKeeperManager class

Function: Manages the election of ZooKeeper servers, starting/stopping the services, etc.

The setup() method of the ZooKeeperManager class is defined as follows:

[Figure: code of ZooKeeperManager.setup()]

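The createCandidateStamp() method creates one file per task in the _bsp/_defaultZkManagerDir/job_201403301409_0006/_task directory on HDFS. The files are empty, and each is named after the local Hostname plus the taskPartition, as in the following screenshot: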

[Figure: screenshot of the _task directory listing]

Five workers were specified at run time (-w 5); adding one master gives the 6 tasks shown above.
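The stamping step can be sketched with java.nio.file, using a local directory as a stand-in for HDFS. The helper name and the single space between hostname and partition are assumptions (the separator is inferred from the file name test162 0 quoted later in the article); the real code goes through Hadoop's FileSystem API.

```java
import java.io.IOException;
import java.nio.file.FileAlreadyExistsException;
import java.nio.file.Files;
import java.nio.file.Path;

/** Hypothetical local-filesystem version of the candidate stamp. */
final class CandidateStampSketch {
    /** Creates an empty marker file named "<hostname> <taskPartition>". */
    static Path createCandidateStamp(Path taskDirectory, String hostname, int taskPartition)
            throws IOException {
        Files.createDirectories(taskDirectory);
        Path stamp = taskDirectory.resolve(hostname + " " + taskPartition);
        try {
            Files.createFile(stamp);  // content stays empty; the name carries the data
        } catch (FileAlreadyExistsException e) {
            // A restarted task may find its own stamp already present; that is fine.
        }
        return stamp;
    }
}
```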

In the getZooKeeperServerList() method, the task whose taskPartition is 0 calls createZooKeeperServerList() to create the ZooKeeper server list. This too is done by creating an empty file, whose name describes the ZooKeeper servers.

[Figure: code of createZooKeeperServerList()]

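First, the files under taskDirectory (_bsp/_defaultZkManagerDir/job_201403301409_0006/_task) are listed. If the directory contains files, the Hostname and taskPartition parts of each file name (Hostname+taskPartition) are stored in hostNameTaskMap. After taskDirectory has been scanned, the outer loop stops once the size of hostNameTaskMap has reached serverCount (which equals the ZOOKEEPER_SERVER_COUNT variable in GiraphConstants.java, defined as 1). The outer loop is needed because the task files under taskDirectory are created by multiple tasks running in a distributed setting, so when task 0 builds the server list, other tasks may not yet have created their task files. By default Giraph starts one ZooKeeper service per job, which means that only one task starts the ZooKeeper service.

Over many tests, task 0 was always chosen as the ZooKeeper server. Because the stamp is written and the directory scanned in the same process, only task 0's own task file is there when taskDirectory is scanned (the other tasks' files have not been created yet). The for loop then ends, hostNameTaskMap is found to have size 1, and the while loop exits immediately. So test162 0 is chosen here.

Finally, the file _bsp/_defaultZkManagerDir/job_201403301409_0006/zkServerList_test162 0 is created.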
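The scan-and-retry logic described above can be sketched as follows, again on a local directory instead of HDFS. The names (ServerListSketch, chooseServers, pollMillis, maxPolls) are hypothetical, as is the rule of taking whichever candidates the directory scan returns first; the article only establishes that the loop repeats until enough candidates exist and that the result is encoded in the name of an empty file.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical local-filesystem version of the server list election. */
final class ServerListSketch {
    /** Polls taskDirectory until at least serverCount candidate stamps exist. */
    static Map<String, Integer> chooseServers(Path taskDirectory, int serverCount,
                                              long pollMillis, int maxPolls)
            throws IOException, InterruptedException {
        Map<String, Integer> hostNameTaskMap = new TreeMap<>();
        for (int poll = 0; poll < maxPolls; poll++) {       // outer retry loop
            try (DirectoryStream<Path> files = Files.newDirectoryStream(taskDirectory)) {
                for (Path file : files) {                    // inner directory scan
                    String[] parts = file.getFileName().toString().split(" ");
                    if (parts.length != 2) {
                        continue;                            // not a "<host> <partition>" stamp
                    }
                    hostNameTaskMap.put(parts[0], Integer.parseInt(parts[1]));
                    if (hostNameTaskMap.size() >= serverCount) {
                        break;
                    }
                }
            }
            if (hostNameTaskMap.size() >= serverCount) {
                return hostNameTaskMap;
            }
            Thread.sleep(pollMillis);                        // other tasks may not have stamped yet
        }
        throw new IllegalStateException("not enough ZooKeeper candidates");
    }

    /** Encodes the chosen servers in a file name such as "zkServerList_test162 0". */
    static String serverListFileName(Map<String, Integer> servers) {
        StringBuilder name = new StringBuilder("zkServerList_");
        for (Map.Entry<String, Integer> e : servers.entrySet()) {
            name.append(e.getKey()).append(' ').append(e.getValue()).append(' ');
        }
        return name.toString().trim();
    }
}
```

With serverCount equal to 1, the first scan that sees any stamp at all ends the loop, which matches the observation that task 0 always elects itself.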


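onlineZooKeeperServers(): based on the zkServerList_test162 0 file, Task 0 first generates the zoo.cfg configuration file and uses ProcessBuilder to create the ZooKeeper service process. Task 0 then connects to the ZooKeeper service process through a socket, and finally creates the file _bsp/_defaultZkManagerDir/job_201403301409_0006/_zkServer/test162 0 to mark that the master's part is done. The workers keep polling to check whether the master has created _bsp/_defaultZkManagerDir/job_201403301409_0006/_zkServer/test162 0; in other words, the workers wait until the ZooKeeper service on the master has finished starting.

The command that starts the ZooKeeper service is as follows: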

[Figure: command that starts the ZooKeeper service]
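A hedged sketch of this start-up step: it writes a minimal zoo.cfg and assembles a java command line for ProcessBuilder without actually launching it. The configuration keys (tickTime, dataDir, clientPort) and the main class org.apache.zookeeper.server.quorum.QuorumPeerMain are standard ZooKeeper; the values, the helper names and the exact JVM flags that Giraph passes are assumptions, because the exact command is not preserved in this text.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;
import java.util.List;

/** Hypothetical helpers for launching a standalone ZooKeeper server. */
final class ZkLaunchSketch {
    /** Writes a minimal standalone zoo.cfg and returns its path. */
    static Path writeZooCfg(Path zkDir, int clientPort) throws IOException {
        Files.createDirectories(zkDir);
        List<String> lines = Arrays.asList(
                "tickTime=2000",
                "dataDir=" + zkDir.toAbsolutePath(),
                "clientPort=" + clientPort);
        return Files.write(zkDir.resolve("zoo.cfg"), lines, StandardCharsets.UTF_8);
    }

    /** Builds, but does not start, the process that would run ZooKeeper. */
    static ProcessBuilder zkProcess(String classpath, Path zooCfg) {
        ProcessBuilder pb = new ProcessBuilder(
                "java", "-cp", classpath,
                "org.apache.zookeeper.server.quorum.QuorumPeerMain",
                zooCfg.toString());
        pb.redirectErrorStream(true);  // merge stderr into stdout for logging
        return pb;
    }

    /** Returns true once something accepts TCP connections on host:port. */
    static boolean isUp(String host, int port, int timeoutMillis) {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMillis);
            return true;
        } catch (IOException e) {
            return false;
        }
    }
}
```

A caller would invoke zkProcess(...).start(), poll isUp(...) until it returns true, and only then create the _zkServer marker file that the workers are waiting for.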

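4. determineGraphFunctions()

The GraphTaskManager class has a CentralizedServiceMaster object and a CentralizedServiceWorker object, corresponding to the master and the worker respectively. The role played by each BSP compute node is decided as follows: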

a) If not split master, everyone does the everything and/or running ZooKeeper.
b) If split master/worker, masters also run ZooKeeper
c) If split master/worker == true and giraph.zkList is set, the master will not instantiate a ZK instance, but will assume a quorum is already active on the cluster for Giraph to use.

This decision is made in the static method determineGraphFunctions() of the GraphTaskManager class. The code fragment is as follows:

[Figure: code of determineGraphFunctions()]

By default, Giraph distinguishes between master and worker. The ZooKeeper service is started on the master and is not started on the workers. So Task 0 is the master + ZooKeeper, and the other tasks are workers.
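Rules a) to c) reduce to a small decision function. The sketch below is written from the three rules quoted above, not copied from Giraph: the enum and its constant names are illustrative, and letting the first masterCount partitions act as masters when an external quorum is used is my assumption.

```java
/** Illustrative role names; not necessarily Giraph's own constants. */
enum Role { ALL, ALL_EXCEPT_ZOOKEEPER, MASTER_ZOOKEEPER_ONLY, MASTER_ONLY, WORKER_ONLY }

final class RoleSketch {
    static Role determine(boolean splitMasterWorker, boolean zkListProvided,
                          boolean runsZooKeeper, int taskPartition, int masterCount) {
        if (!splitMasterWorker) {
            // a) everyone does everything, with or without a local ZooKeeper
            return runsZooKeeper ? Role.ALL : Role.ALL_EXCEPT_ZOOKEEPER;
        }
        if (zkListProvided) {
            // c) external quorum: masters do not start ZooKeeper themselves
            return taskPartition < masterCount ? Role.MASTER_ONLY : Role.WORKER_ONLY;
        }
        // b) the task that runs ZooKeeper is also the master
        return runsZooKeeper ? Role.MASTER_ZOOKEEPER_ONLY : Role.WORKER_ONLY;
    }
}
```

For the run in this article (split master/worker, no giraph.zkList), task 0 runs ZooKeeper and so maps to MASTER_ZOOKEEPER_ONLY, while the remaining tasks map to WORKER_ONLY.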

Origin blog.51cto.com/14463231/2422571