SparkConf Source Code Interpretation

1. Main features: SparkConf is Spark's configuration class. It holds the configuration of a Spark application and stores the configuration information as (key, value) pairs.

2. Main form: val conf = new SparkConf() reads any configuration whose key starts with spark., including configuration provided by the developer, because SparkConf contains the auxiliary constructor def this() = this(true); the Boolean value true tells it to read external configuration information. In unit tests, the instance can be created with new SparkConf(false) so that external configuration information is skipped.
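  A minimal sketch of the two constructor forms (the class is the real Spark API; the test-style usage is my illustration):

import org.apache.spark.SparkConf

// Default constructor: loadDefaults = true, so spark.* system properties are read.
val conf = new SparkConf()

// In a unit test, skip external configuration and set everything explicitly.
val testConf = new SparkConf(false)
  .set("spark.master", "local")
  .set("spark.app.name", "unit-test")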

3. How Spark stores its configuration:

  Every Spark component, directly or indirectly, uses the configuration attributes stored in SparkConf. These attributes are kept in a ConcurrentHashMap:

private val settings = new ConcurrentHashMap[String, String]()

4. How SparkConf obtains its configuration (a combined sketch follows the list):

  (1) From the system properties (obtained via System.getProperties), taking the part prefixed with spark.

  (2) By using the API that SparkConf provides

  (3) By cloning another SparkConf
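  A combined sketch of all three sources (the property values are illustrative):

import org.apache.spark.SparkConf

// (1) a spark.* system property, picked up by the default constructor
System.setProperty("spark.app.name", "from-system-properties")

val conf = new SparkConf()   // loads the system property above
  .setMaster("local[2]")     // (2) set through the SparkConf API

val copy = conf.clone        // (3) clone an existing SparkConf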

(1) System Configuration Properties

  SparkConf has a Boolean property loadDefaults. When loadDefaults is true, Spark loads the spark.-prefixed properties from the JVM system properties, as follows:

/** Create a SparkConf that loads defaults from system properties and the classpath */
def this() = this(true) // auxiliary constructor
 
if (loadDefaults) {
    loadFromSystemProperties(false)
}
 
private[spark] def loadFromSystemProperties(silent: Boolean): SparkConf = {
  // Load any spark.* system properties
  for ((key, value) <- Utils.getSystemProperties if key.startsWith("spark.")) {
    set(key, value, silent)
  }
  this
}

  The code calls the utility method Utils.getSystemProperties, which retrieves the JVM system properties. loadFromSystemProperties filters them with a Scala for-comprehension guard that keeps only keys carrying the "spark." prefix, and calls the set method on each (key, value) pair; the method ultimately stores the configuration properties in the in-memory settings map.
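  The same guard/filter pattern can be illustrated in plain Scala (my sketch, not Spark source):

import scala.collection.JavaConverters._

// Keep only the spark.* entries of the JVM system properties,
// mirroring the guard in loadFromSystemProperties.
val sparkProps: Map[String, String] =
  System.getProperties.asScala.toMap.filter { case (k, _) => k.startsWith("spark.") }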

(2) Configuring SparkConf with the API

  A common way to add configuration to SparkConf is through the API it provides; these API calls ultimately invoke the overloaded set method, for example:

Overloaded set method
/** Set a configuration variable. */
def set(key: String, value: String): SparkConf = {
  set(key, value, false)
}
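  For context, the two-argument overload delegates to a private three-argument set. The following is a sketch based on the Spark 2.x source (details vary by version); the silent flag suppresses the deprecation-warning log:

private[spark] def set(key: String, value: String, silent: Boolean): SparkConf = {
  if (key == null) {
    throw new NullPointerException("null key")
  }
  if (value == null) {
    throw new NullPointerException("null value for " + key)
  }
  if (!silent) {
    logDeprecationWarning(key)  // skipped by loadFromSystemProperties and clone
  }
  settings.put(key, value)
  this
}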

  SparkConf methods such as setMaster, setAppName, setJars, setExecutorEnv, setSparkHome, and setAll all complete Spark's configuration through the set method above. Take setMaster and setAppName:

/**
 * The master URL to connect to, such as "local" to run locally with one thread, "local[4]" to
 * run locally with 4 cores, or "spark://master:7077" to run on a Spark standalone cluster.
 */
def setMaster(master: String): SparkConf = {
  set("spark.master", master)
}
 
/** Set a name for your application. Shown in the Spark web UI. */
def setAppName(name: String): SparkConf = {
  set("spark.app.name", name)
}
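  Because every setter returns the SparkConf itself, calls can be chained; a typical usage sketch:

val conf = new SparkConf()
  .setMaster("local[4]")
  .setAppName("My Application")
  .set("spark.executor.memory", "1g")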

(3) Cloning a SparkConf

  In some cases the same SparkConf configuration needs to be shared by several components. The first approaches that come to mind are to make the SparkConf instance a global variable or to pass it as a parameter to the other components, but both introduce concurrency concerns: although settings uses the thread-safe ConcurrentHashMap, and ConcurrentHashMap is known to perform well under high concurrency, any concurrent access still carries some performance cost. Alternatively, one could create a new SparkConf instance b and copy all of a's configuration into it by hand, but that wastes memory and scatters the copying code throughout the program.

  Instead, SparkConf extends the Cloneable trait and implements the clone method, which lets any part of the code obtain its own copy and improves usability:

The Cloneable trait and clone method:
import scala.collection.JavaConverters._

class SparkConf(loadDefaults: Boolean) extends Cloneable with Logging {
  def this() = this(true)

  /** Copy this object */
  override def clone: SparkConf = {
    // Start from an empty conf (false: do not reload system properties),
    // then copy every entry; silent = true suppresses deprecation warnings.
    val cloned = new SparkConf(false)
    settings.entrySet().asScala.foreach { e =>
      cloned.set(e.getKey(), e.getValue(), true)
    }
    cloned
  }
}
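  A usage sketch (the property values are illustrative): each component clones the shared conf, so later modifications neither contend on nor leak into the original settings map.

val base = new SparkConf(false).set("spark.app.name", "shared")

val forComponentA = base.clone.set("spark.executor.memory", "2g")
val forComponentB = base.clone

println(base.get("spark.executor.memory", "unset"))  // unset: base is untouched
println(forComponentA.get("spark.executor.memory"))  // 2g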

 

