Big Data: Spark GraphX

Spark GraphX

Spark GraphX is the Spark module for graph-centric, distributed graph computation. Under the hood, GraphX is built on RDDs and shares their storage model, so the same data can be viewed either as collections (RDDs of vertices and edges) or as a graph.

Spark GraphX abstractions

(1) Vertex

Represented as RDD[(VertexId, VD)].

VertexId is the vertex ID, which is of type Long.

VD is the vertex attribute type; it can be any type.
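
As a minimal sketch of the vertex representation above, the snippet below builds a small vertex RDD in which VD is a (name, role) pair. The object name VertexSketch, the method run, and the sample data are illustrative assumptions, not part of GraphX.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.graphx.VertexId
import org.apache.spark.rdd.RDD

// Hypothetical helper object, used only for illustration.
object VertexSketch {
  // Builds a small vertex RDD and returns its vertex IDs in ascending order.
  def run(): Seq[VertexId] = {
    val conf = new SparkConf().setAppName("VertexSketch").setMaster("local")
    val sc = new SparkContext(conf)
    try {
      // VertexId is an alias for Long; VD is (String, String) here.
      val vertices: RDD[(VertexId, (String, String))] = sc.parallelize(Seq(
        (2L, ("Lily", "post")),
        (3L, ("Tom", "student"))
      ))
      vertices.map(_._1).collect().sorted.toSeq
    } finally {
      sc.stop()
    }
  }
}
```

Because VertexId is only a type alias, plain Long literals such as 2L can be used wherever a vertex ID is expected.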

(2) Edge

Represented as RDD[Edge[ED]].

Edge represents an edge; its type parameter ED is the type of the edge attribute. Each edge also holds the source vertex ID and the destination vertex ID.
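
To make the three parts of an edge concrete, here is a small sketch that reads the srcId, dstId, and attr fields of an Edge. The object EdgeSketch and the helper describe are hypothetical names introduced for this example; no SparkContext is needed because a single Edge is an ordinary value.

```scala
import org.apache.spark.graphx.Edge

// Hypothetical helper object, used only for illustration.
object EdgeSketch {
  // Formats an edge as "source -[attribute]-> destination".
  def describe(e: Edge[String]): String =
    s"${e.srcId} -[${e.attr}]-> ${e.dstId}"

  def main(args: Array[String]): Unit = {
    // ED is String here: the attribute is a relationship label.
    val advisor = Edge(5L, 3L, "Advisor")
    println(describe(advisor))
  }
}
```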

(3) Triplet

Represented as RDD[EdgeTriplet[VD, ED]].

A triplet contains an edge together with its edge attribute, source vertex ID, source vertex attribute, destination vertex ID, and destination vertex attribute.
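
The triplet view can be sketched as follows: graph.triplets joins each edge with the attributes of its two endpoints, exposed as srcAttr, attr, and dstAttr. The object TripletSketch, the method run, and the sample data are assumptions made for this illustration.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.graphx.{Edge, Graph}

// Hypothetical helper object, used only for illustration.
object TripletSketch {
  // Returns one sentence per triplet, sorted alphabetically.
  def run(): Seq[String] = {
    val conf = new SparkConf().setAppName("TripletSketch").setMaster("local")
    val sc = new SparkContext(conf)
    try {
      val vertices = sc.parallelize(Seq((2L, "Lily"), (5L, "Andy"), (3L, "Tom")))
      val edges = sc.parallelize(Seq(
        Edge(2L, 5L, "Colleague"),
        Edge(5L, 3L, "Advisor")
      ))
      val graph = Graph(vertices, edges)
      // Each triplet carries the edge attribute plus both vertex attributes.
      graph.triplets
        .map(t => s"${t.srcAttr} is the ${t.attr} of ${t.dstAttr}")
        .collect()
        .sorted
        .toSeq
    } finally {
      sc.stop()
    }
  }
}
```

Mapping each triplet to a String before collecting keeps only the fields that are needed instead of shipping whole triplet objects to the driver.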

(4) Graph

Represented by Graph, which is built from vertices and edges.
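
One detail worth illustrating when building a Graph from vertices and edges: the Graph constructor accepts an optional default vertex attribute, which is assigned to any vertex that appears in an edge but not in the vertex RDD. The sketch below assumes a hypothetical object DefaultAttrSketch and sample data in which vertex 7 is referenced only by an edge.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.graphx.{Edge, Graph, VertexId}

// Hypothetical helper object, used only for illustration.
object DefaultAttrSketch {
  // Returns all vertices of the graph, sorted by vertex ID.
  def run(): Seq[(VertexId, String)] = {
    val conf = new SparkConf().setAppName("DefaultAttrSketch").setMaster("local")
    val sc = new SparkContext(conf)
    try {
      val vertices = sc.parallelize(Seq((2L, "Lily"), (5L, "Andy")))
      // Vertex 7 is mentioned by an edge but has no entry in the vertex RDD.
      val edges = sc.parallelize(Seq(
        Edge(2L, 5L, "Colleague"),
        Edge(5L, 7L, "PI")
      ))
      // The third argument is the attribute given to such missing vertices.
      val graph = Graph(vertices, edges, "unknown")
      graph.vertices.collect().sortBy(_._1).toSeq
    } finally {
      sc.stop()
    }
  }
}
```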

Example of Spark Graphx

Scala code
package Spark

import org.apache.log4j.{Level, Logger}
import org.apache.spark.graphx.{Edge, Graph}
import org.apache.spark.{SparkConf, SparkContext}

object SparkGraph {
  def main(args: Array[String]): Unit = {
    // Only show Spark log messages at ERROR level
    Logger.getLogger("org.apache.spark").setLevel(Level.ERROR)
    // Create the Spark configuration
    val conf = new SparkConf().setAppName("Graph").setMaster("local")
    // Instantiate the SparkContext
    val sc = new SparkContext(conf)
    // Define the vertices: (id, (name, position))
    val spot = sc.parallelize(Array(
      (2L, ("Lily", "post")),
      (3L, ("Tom", "student")),
      (5L, ("Andy", "post")),
      (7L, ("Mary", "student"))
    ))
    // Define the edges: Edge(source id, destination id, relationship)
    val edge = sc.parallelize(Array(
      Edge(2L, 5L, "Colleague"),
      Edge(5L, 3L, "Advisor"),
      Edge(5L, 7L, "PI"),
      Edge(3L, 7L, "Coll")
    ))
    // Build the graph
    val graph = Graph(spot, edge)
    // Count the vertices whose position is "post"
    val post_Count = graph.vertices.filter { case (id, (name, pos)) => pos == "post" }.count()
    // Print the result
    println("post count is " + post_Count)
    // Count the edges whose source ID is greater than the destination ID
    val edge_Count = graph.edges.filter(e => e.srcId > e.dstId).count()
    // Print the result
    println("the value is " + edge_Count)
    // Release Spark resources
    sc.stop()
  }
}

Result

For the data above, the program prints:

post count is 2
the value is 1

Origin blog.csdn.net/JavaDestiny/article/details/98470564