Spark Big Data Analysis and Practical Notes (Chapter 1 Scala Language Basics-2)

Chapter Summary

Spark is a fast, general-purpose computing engine designed for large-scale data processing, and it is implemented in Scala. Big data technology is, at its core, about computing over data, and Scala combines object-oriented capabilities for organizing project engineering with functional capabilities for computing data. Because Spark and Scala are tightly integrated, this book uses Scala to develop Spark programs, so learning Scala well will help us better grasp the Spark framework.

1.2 Basic syntax of Scala

1.2.1 Declaring values and variables

Scala has two kinds of variables. One is declared with the keyword var, and its value is mutable; the other is declared with the keyword val, is also called a constant, and its value is immutable.

  • Variables declared with the keyword var
    var myVar:String="Hello"
  • Variables declared with the keyword val
    val age:Int=10

There are a few things to note:

  1. Variables in Scala must be initialized when they are declared.
  2. A variable declared with var can be assigned a new value after initialization.
  3. A constant declared with val cannot be reassigned.
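
The three points above can be illustrated with a minimal sketch. The object name VarValDemo and the method name greeting are hypothetical names chosen for this example only.

```scala
// Hypothetical demo object; the names are illustrative and not from the original text.
object VarValDemo {
  // Returns a string built from a reassigned var and an unchanged val.
  def greeting(): String = {
    var myVar: String = "Hello" // var: initialized at declaration, may be reassigned
    myVar = "World"             // allowed, because myVar is a var
    val age: Int = 10           // val: initialized once
    // age = 11                 // would not compile: reassignment to val
    s"$myVar $age"
  }

  def main(args: Array[String]): Unit =
    println(greeting()) // prints "World 10"
}
```

Uncommenting the line that reassigns age should make the compiler reject the program, which is the behavior described in point 3.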

When declaring a variable, we do not have to state its type, because Scala's type inference mechanism can infer it from the initial value.

The above code for declaring the variables myVar and age is equivalent to the following code:

var myVar = "Hello"   // variable declared with the keyword var

val age = 10           // variable declared with the keyword val

Note: When declaring a variable with var or val, the variable name that follows must not be a Scala reserved word. A variable name can start with a letter or an underscore, and variable names are strictly case-sensitive.

1.2.2 Data Types

  • Any programming language has specific data types, and Scala is no exception.
  • Unlike in some other languages, every value in Scala has a type, including numbers and functions.

[Figure: the hierarchy of data types in Scala]
As the figure shows, Any is the supertype of all types, also known as the top type. It has two direct subclasses, as follows:

  • AnyVal: Represents value types. A value type describes data that is a non-null value rather than an object. Scala predefines 9 value types: Double, Float, Long, Int, Short, Byte, Unit, Char and Boolean. Among them, Unit carries no meaningful information, and its role is similar to void in Java.

  • AnyRef: Represents reference types. All types other than value types are reference types and inherit from AnyRef.

At the bottom of the Scala data type hierarchy, there are two more data types, namely Nothing and Null, as follows:

  • Nothing: A subtype of all types, also known as the bottom type. It signals that an expression never returns normally, as when it throws an exception, exits the program, or loops forever.
  • Null: A subtype of all reference types. Its main purpose is interoperability with other JVM languages, and it is hardly used in Scala code.
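
The relationships described above can be sketched as follows. The object TypeHierarchyDemo and the helpers fail and safeDivide are hypothetical names introduced for this example only.

```scala
// Hypothetical demo object; the names are illustrative and not from the original text.
object TypeHierarchyDemo {
  // Value types are subtypes of AnyVal.
  val number: AnyVal = 42
  val flag: AnyVal = true
  val unit: Unit = () // the single value of type Unit

  // Reference types are subtypes of AnyRef.
  val text: AnyRef = "Spark"
  val items: AnyRef = List(1, 2, 3)

  // Null is a subtype of every reference type, so null can be assigned to a String.
  val missing: String = null

  // A method that never returns normally has the result type Nothing.
  def fail(message: String): Nothing =
    throw new IllegalArgumentException(message)

  // Because Nothing is a subtype of Int, fail(...) type-checks in the else branch.
  def safeDivide(a: Int, b: Int): Int =
    if (b != 0) a / b else fail("division by zero")

  def main(args: Array[String]): Unit =
    println(safeDivide(10, 2)) // prints 5
}
```

Assigning a value type to a variable of type AnyRef (for example, val wrong: AnyRef = 42) should be rejected by the compiler, which reflects the split between the two branches of the hierarchy.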

1.2.3 Arithmetic and operator overloading

Arithmetic operators (+, -, *, /, %) in Scala work the same as in Java, as do the bitwise operators (&, |, >>, <<). Note in particular that these operators are actually methods in Scala. For example, a+b is shorthand for a.+(b).
Note: Scala does not provide the operators ++ and --. To increment or decrement a value, use "+=1" or "-=1" instead.
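
A minimal sketch of operators as methods is shown below. The object OperatorDemo and the class Vec2 are hypothetical names introduced for this example; Vec2 only illustrates how a user-defined type might define its own + method.

```scala
// Hypothetical demo object; the names are illustrative and not from the original text.
object OperatorDemo {
  // A small 2D vector type that defines "+" as an ordinary method.
  case class Vec2(x: Int, y: Int) {
    def +(other: Vec2): Vec2 = Vec2(x + other.x, y + other.y)
  }

  // Infix notation and explicit method-call notation are the same call.
  def infixSum(a: Int, b: Int): Int = a + b
  def methodSum(a: Int, b: Int): Int = a.+(b)

  // Scala has no ++ operator, so "+= 1" is used to increment.
  def incrementTwice(start: Int): Int = {
    var counter = start
    counter += 1
    counter += 1
    counter
  }

  def main(args: Array[String]): Unit = {
    println(infixSum(3, 4) == methodSum(3, 4)) // prints true
    println(incrementTwice(0))                 // prints 2
    println(Vec2(1, 2) + Vec2(3, 4))           // prints Vec2(4,6)
  }
}
```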

1.2.4 Control Structure Statements

In Scala, control structure statements include conditional branch statements and loop statements. The conditional branch statements are the if statement, the if...else statement, the if...else if...else statement and the nested if...else statement; the loop statements are the for loop, the while loop and the do...while loop.

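  1. Conditional branch statement
  • if conditional statement
if (Boolean expression) {
       statement block
}
  • if-else conditional statement
if (Boolean expression) {
       statement block
} else {
       statement block
}
  • if-else-if-else statement
if (Boolean expression 1) {
       statement block
} else if (Boolean expression 2) {
       statement block
} else if (Boolean expression 3) {
       statement block
} else {
       statement block
}
  • if-else nested statement
if (Boolean expression 1) {
       statement block
       if (Boolean expression 2) {
              statement block
       }
} else if (Boolean expression 3) {
       statement block
       if (Boolean expression 4) {
              statement block
       }
} else {
       statement block
}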
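
The branch forms above can be sketched in a small example. The object ConditionalDemo, the method grade and the score thresholds are hypothetical and chosen only for illustration. The sketch also relies on the fact that if is an expression in Scala, so each branch yields the value of the method.

```scala
// Hypothetical demo object; names and thresholds are illustrative only.
object ConditionalDemo {
  // Maps a score to a label using if...else if...else with one nested if...else.
  def grade(score: Int): String = {
    if (score >= 90) {
      "excellent"
    } else if (score >= 60) {
      // Nested if...else inside a branch.
      if (score >= 75) "good" else "pass"
    } else {
      "fail"
    }
  }

  def main(args: Array[String]): Unit = {
    println(grade(95)) // prints excellent
    println(grade(80)) // prints good
    println(grade(10)) // prints fail
  }
}
```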

  2. Loop statement
The for statement in Scala differs considerably in syntax from the for loop in Java, so we introduce Scala's for loop first.

  • for loop statement
for (variable <- expression/array/collection) {
         loop statement;
}

Below, we loop from 0 to 9 and print the value on each iteration. In Scala, "0 to 9" represents the range from 0 to 9, and the range includes 9. Whether the loop is run from the DOS command line or in IDEA, the results are as follows:

    0 1 2 3 4 5 6 7 8 9
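
A minimal sketch of such a loop is shown below. The object ForLoopDemo and the method rangeLine are hypothetical names; the values are collected into a string so the result can be checked as well as printed.

```scala
// Hypothetical demo object; the names are illustrative and not from the original text.
object ForLoopDemo {
  // Visits every value in the inclusive range 0 to 9 and joins them with spaces.
  def rangeLine(): String = {
    val line = new StringBuilder
    for (i <- 0 to 9) { // "0 to 9" includes both 0 and 9
      line.append(i.toString).append(" ")
    }
    line.toString.trim
  }

  def main(args: Array[String]): Unit =
    println(rangeLine()) // prints "0 1 2 3 4 5 6 7 8 9"
}
```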

Scala can filter elements by adding if conditions to the for loop statement, with multiple filter conditions separated by semicolons. For example, restricting the range 0 to 9 to even numbers greater than 5 outputs 6 and 8.
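
The filtering described above might look like the following sketch. The object ForFilterDemo and the method evenAboveFive are hypothetical names introduced for this example.

```scala
import scala.collection.mutable.ListBuffer

// Hypothetical demo object; the names are illustrative and not from the original text.
object ForFilterDemo {
  // Keeps only the even numbers greater than 5 from the range 0 to 9.
  def evenAboveFive(): List[Int] = {
    val kept = ListBuffer.empty[Int]
    // Two filter conditions, separated by semicolons.
    for (i <- 0 to 9; if i % 2 == 0; if i > 5) {
      kept += i
    }
    kept.toList
  }

  def main(args: Array[String]): Unit =
    println(evenAboveFive().mkString(" ")) // prints "6 8"
}
```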

  • while loop statement
    The while loop statement in Scala works exactly as it does in Java. The syntax format is as follows:

while (Boolean expression) {
         loop statement;
}

Below, we demonstrate the use of while by printing odd numbers.

Suppose there is a variable x=1. While x is less than 10, print it and then add 2 to it. The loop prints 1, 3, 5, 7 and 9.
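
The odd-number loop can be sketched as follows. The object WhileDemo and the method oddsBelowTen are hypothetical names; the values are collected into a string so the result can be checked as well as printed.

```scala
// Hypothetical demo object; the names are illustrative and not from the original text.
object WhileDemo {
  // Starts at x = 1 and adds 2 on each pass while x is less than 10.
  def oddsBelowTen(): String = {
    val line = new StringBuilder
    var x = 1
    while (x < 10) {
      line.append(x.toString).append(" ")
      x += 2 // Scala has no ++ operator, so += is used
    }
    line.toString.trim
  }

  def main(args: Array[String]): Unit =
    println(oddsBelowTen()) // prints "1 3 5 7 9"
}
```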

  • do-while statement
do {
       loop statement;
} while (Boolean expression)

The main difference between the do...while statement and the while statement is that the body of a do...while loop is executed at least once, because the condition is checked only after the body has run.

1.2.5 Methods and functions

Scala has both methods and functions. A Scala method is part of a class, whereas a function is an object that can be assigned to a variable. In other words, a function defined in a class is a method.

In Scala, functions can be defined with def statements or val statements, while methods can only be defined with def statements. Scala's methods and functions are explained below.

  1. Methods
    The definition format of a Scala method is as follows:
def functionName([parameter list]): [return type] = {
         function body
         return [expr]
}

The following defines a method add to realize the addition and summation of two numbers. The sample code is as follows:

def add(a:Int,b:Int):Int={
  var sum:Int = 0
  sum = a + b
  return sum
}

The format of a Scala method call is as follows:

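// Call format without an instance object
functionName(parameter list)

// When a method is called on an instance object, we can use the Java-like format (the "." notation)
[instance.]functionName(parameter list)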

Next, in the object Test, we define a method addInt that adds two integers. Here, we call it in the form "object name.method name(parameter list)". The sample code is as follows:

scala> :paste                         # command that enters multi-line input mode
// Entering paste mode (ctrl-D to finish)
object Test {
  def addInt(a: Int, b: Int): Int = {
    var sum: Int = 0
    sum = a + b
    return sum
  }
}
// Exiting paste mode, now interpreting. # Ctrl+D ends multi-line input mode
defined object Test
scala> Test.addInt(4,5)      # method call
res0: Int = 9
  2. Functions
    A function in Scala is an object that can be assigned to a variable. The value of the last expression in the function body is the return value.
    The definition syntax of a Scala function is as follows:
val functionName = ([parameter list]) => {
           function body
 }

Next, define a function addInt to realize the addition and summation of two integers. The sample code is as follows:

scala> val addInt =(a:Int,b:Int) => a+b
addInt: (Int, Int) => Int = <function2>
scala> addInt(6,7)
res1: Int = 13
  • Converting a method into a function
    The format of converting a method into a function is as follows:
val f1 = m _

In the above format, the method name m is followed by a space and an underscore, which tells the compiler to convert the method into a function instead of calling it. Next, we define a method m and convert it into a function. The sample code is as follows:

scala> def m(x:Int,y:Int):Int=x*y          # method
m: (x: Int, y: Int)Int
scala> val f = m _
f: (Int, Int) => Int = <function2>          # function
scala> f(2,3)
res2: Int = 6

Note: The return type of a Scala method can be omitted, because the compiler can infer it automatically. For a recursive method, however, the return type must be specified.
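
The note above can be illustrated with a minimal sketch. The object RecursionDemo and the methods square and factorial are hypothetical names introduced for this example.

```scala
// Hypothetical demo object; the names are illustrative and not from the original text.
object RecursionDemo {
  // Non-recursive method: the return type Int is inferred by the compiler.
  def square(x: Int) = x * x

  // Recursive method: the return type must be written explicitly.
  def factorial(n: Int): Int =
    if (n <= 1) 1 else n * factorial(n - 1)

  def main(args: Array[String]): Unit = {
    println(square(4))    // prints 16
    println(factorial(5)) // prints 120
  }
}
```

Removing ": Int" from factorial should cause a compile error, because the compiler cannot infer the result type of a method from a call to itself.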

Reprinted from: https://blog.csdn.net/u014727709/article/details/132031799