Learn the Big Data Programming Language Scala in 2 Hours

Author: Magic Good. Source: Hang Seng LIGHT Cloud Community

Scala series:

Learn the Big Data Programming Language Scala in 2 Hours

Big Data Programming Language Scala Advanced

Foreword

To understand the source code of big data frameworks such as Spark and Flink, you need to learn the Scala programming language. Scala's design builds on Java but works at a higher level of abstraction: it adds a layer on top of the Java platform that lets programmers write programs in a functional style. If you already know Java or another programming language, learning the basics of Scala in 2 hours is realistic. This article covers installing Scala and its basic syntax, uses examples to help you pick up the syntax quickly, and should help you follow the program logic of related open source frameworks.

Introduction to Scala

Scala is a multi-paradigm programming language designed to integrate various features of object-oriented programming and functional programming. Scala runs on the Java Virtual Machine and is compatible with existing Java programs. Scala source code is compiled to Java bytecode, so it can run on the JVM and can call existing Java class libraries.

If you have already learned the basics of Java, getting started with Scala will be faster.


Features

Object-oriented: Scala is a pure object-oriented language. Every value is an object, and the type and behavior of an object are described by classes and traits.

Class abstractions are extended either through subclassing or through a flexible mixin-based composition mechanism.
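
The two extension mechanisms can be sketched as follows. This is only a minimal illustration; the names Animal, Swimmer, Runner, and Duck are hypothetical and do not come from any framework.

```scala
// Hypothetical names for illustration: Animal, Swimmer, Runner, Duck.
object MixinDemo {
  abstract class Animal(val name: String) {
    def describe: String = name
  }

  // Traits bundle behavior that can be mixed into a class.
  trait Swimmer { def swim: String = "swims" }
  trait Runner { def run: String = "runs" }

  // Subclassing (extends) combined with mixin composition (with).
  class Duck extends Animal("duck") with Swimmer with Runner {
    override def describe: String = s"$name $swim and $run"
  }

  def main(args: Array[String]): Unit = {
    println(new Duck().describe) // duck swims and runs
  }
}
```

Duck inherits its name from Animal and picks up swim and run from the two traits without any of them knowing about each other.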

Functional programming: Scala is also a functional language. Functions are values; Scala provides a lightweight syntax for defining anonymous functions, supports higher-order functions, allows functions to be nested, and supports currying.
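
Each of these functional features can be shown in a few lines. The sketch below is illustrative only; all names in it (double, applyTwice, sumOfSquares, add, addFive) are made up for this example.

```scala
object FunctionalDemo {
  // An anonymous function stored as a value.
  val double: Int => Int = x => x * 2

  // A higher-order function: it takes another function as a parameter.
  def applyTwice(f: Int => Int, n: Int): Int = f(f(n))

  // A nested function: square is visible only inside sumOfSquares.
  def sumOfSquares(a: Int, b: Int): Int = {
    def square(n: Int): Int = n * n
    square(a) + square(b)
  }

  // A curried method: arguments are supplied one parameter list at a time.
  def add(x: Int)(y: Int): Int = x + y
  val addFive: Int => Int = add(5)

  def main(args: Array[String]): Unit = {
    println(applyTwice(double, 3)) // 12
    println(sumOfSquares(3, 4))    // 25
    println(addFive(10))           // 15
  }
}
```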

Install Scala

Because Scala runs on the JVM, you need to install a JDK before installing Scala.

Scala currently has two major version lines, 2.x and 3.x, and 2.x is the more widely used. This article uses version 2.13 as the example for download and installation. Official download link: https://www.scala-lang.org/download/scala2.html Find the installation package that matches your environment and install Scala.

Install in Windows

The system used here is Windows 10. Download the scala-2.13.8.zip installation file, unzip it, and then open the system properties to configure the environment variables:

  • First, add a variable SCALA_HOME and set it to the extracted directory (the parent of the bin directory), in this example D:\Soft_install\scala\scala-2.13.8
  • In the Path variable, add the Scala bin path: %SCALA_HOME%\bin
  • In the CLASSPATH system variable, add the Scala bin path: ;%SCALA_HOME%\bin
  • After setting these system variables, verify the installation: open a command window and enter: scala
  • This prints the installed Scala version and enters the Scala command-line environment (an interactive shell similar to Python's)

Once installation and verification are complete, Scala is installed successfully and you can start development.
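
As an additional check, the versions can also be read from inside a program. The sketch below assumes the standard library helper scala.util.Properties; the object name VersionCheck is made up for this example.

```scala
object VersionCheck {
  // Version of the Scala standard library on the classpath, for example "2.13.8".
  val scalaVersion: String = scala.util.Properties.versionNumberString
  // Version of the Java runtime that is executing the program.
  val javaVersion: String = System.getProperty("java.version")

  def main(args: Array[String]): Unit = {
    println(s"Scala $scalaVersion on Java $javaVersion")
  }
}
```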

Scala basic syntax

Syntax Description

For people with programming experience, especially in Java, getting started with Scala is very fast. First, a code example:

// Declare the package
package demo01
// import brings in objects or classes from other packages; HashMap => JavaHashMap renames the member
import java.util.{HashMap => JavaHashMap}
// Import all members of java.util
import java.util._

/**
 * object declares HelloWorld as a singleton object
 * def defines the main method, the entry point that can be run directly
 * println prints the string "Hello, world!"
 */
object HelloWorld {
  def main(args: Array[String]): Unit = {
    println("Hello, world!")
  }
}

Scala source files use the .scala suffix. To run a Scala program, the source code is first compiled into .class bytecode files, which are then executed by the JVM. Points to note about the syntax:

  • Class names should start with an uppercase letter and method names with a lowercase letter; by convention, the source file is named after the class or object it defines.
  • By default, Scala programs are executed from the main method.
  • Scala programs can directly reference Java libraries for use.
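
The last point, using Java libraries directly, can be sketched with java.util.HashMap. The object JavaInteropDemo and the method wordLengths are hypothetical names used only for this illustration.

```scala
import java.util.{HashMap => JavaHashMap}

object JavaInteropDemo {
  // Builds a plain Java HashMap using Java's own put method.
  // wordLengths is a made-up helper for this example.
  def wordLengths(words: Seq[String]): JavaHashMap[String, Int] = {
    val lengths = new JavaHashMap[String, Int]()
    for (word <- words) lengths.put(word, word.length)
    lengths
  }

  def main(args: Array[String]): Unit = {
    val result = wordLengths(Seq("spark", "flink"))
    println(result.get("spark")) // 5
    println(result.size())       // 2
  }
}
```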

For beginners, a few Scala concepts are worth understanding:

Concept   Explanation
Class     An abstraction of objects: a template that describes the properties and behavior of its objects
Object    An instance of a class, that is, a concrete example of the class; it occupies memory
Method    Basically the same as a function; describes a behavior or a procedure

Data types

Scala is an object-oriented language, and its data types are very similar to Java's. The main types are as follows:

Type      Description
Byte      8-bit signed two's complement integer. The value range is -128 to 127
Short     16-bit signed two's complement integer. The value range is -32768 to 32767
Int       32-bit signed two's complement integer. The value range is -2147483648 to 2147483647
Long      64-bit signed two's complement integer. The value range is -9223372036854775808 to 9223372036854775807
Float     32-bit IEEE 754 single-precision floating-point number
Double    64-bit IEEE 754 double-precision floating-point number
Char      16-bit unsigned Unicode character in the range U+0000 to U+FFFF
String    A string, written between double quotes "; multi-line strings are written between triple quotes """
Boolean   true or false
Unit      Represents no value, equivalent to void in other languages. Used as the result type of methods that return no result. Unit has exactly one instance value, written as ()
Null      null, the empty reference
Nothing   The type at the bottom of Scala's class hierarchy; it is a subtype of every other type
Any       The superclass of all other classes
AnyRef    The base class of all reference classes in Scala

A data type is usually given in the declaration, after a colon: var str: String = "ok". Alternatively, the type can be left out and inferred from the assigned value: var str = "ok".
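
A short sketch of explicit and inferred types follows, together with a few of the value ranges from the table. The object name TypeDemo and its member names are made up for this example.

```scala
object TypeDemo {
  // Explicit type annotation after the colon.
  val explicit: Long = 42L
  // Types inferred from the assigned value: Int, Double, String.
  val inferredInt = 42
  val inferredDouble = 3.14
  val inferredString = "ok"
  // A multi-line string in triple quotes; stripMargin removes the leading | markers.
  val multiLine: String =
    """line one
      |line two""".stripMargin

  def main(args: Array[String]): Unit = {
    println(Byte.MinValue + " to " + Byte.MaxValue) // -128 to 127
    println(Int.MaxValue)                           // 2147483647
    println(multiLine)
  }
}
```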

Strings

In Scala, the string type is actually the java.lang.String class; Scala does not define a String class of its own, so if you know Java you can skip this section. Because Scala uses Java's String class directly, strings are immutable objects. For details, see the official API documentation: https://docs.oracle.com/javase/8/docs/api/java/lang/String.html
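
The following sketch shows what this means in practice: Java's String methods are called directly, and each call returns a new string while the original stays unchanged. The object and value names are hypothetical.

```scala
object StringDemo {
  val original: String = "scala"
  // Java String methods are available directly on Scala strings.
  val upper: String = original.toUpperCase
  val joined: String = original.concat(" lang")
  // A Scala string literal is an instance of java.lang.String.
  val isJavaString: Boolean = (original: Any).isInstanceOf[java.lang.String]

  def main(args: Array[String]): Unit = {
    println(upper)    // SCALA
    println(joined)   // scala lang
    println(original) // scala (unchanged, because strings are immutable)
  }
}
```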

Arrays

Scala provides arrays, which store a fixed number of elements of the same type. Typical usage is as follows:

// Define arrays
var arr1: Array[String] = new Array[String](3)
var arr2 = new Array[String](3)
var arr3 = Array("a", "b", "c")
// Access and assign array elements by index
arr1(0) = "x"
println(arr1(0))
// Define a multidimensional array; ofDim specifies the size of each dimension
val arrMultia = Array.ofDim[Int](2, 2)
// Concatenate arrays
arr2 = Array.concat(arr1, arr3)

Variables

Scala distinguishes between mutable and immutable variables. A variable refers to a memory address and occupies memory once it is created. Mutable variables are declared with the var modifier: var x = "temp". Immutable variables (constants) are declared with the val modifier: val x = 1

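object BaseExample {
  def main(args: Array[String]): Unit = {

    // Declaring variables
    var change: String = "mutable"      // a var can be reassigned after initialization
    val noChange: String = "immutable"  // reassigning a val is a compile error

    // The type may be omitted, but an initial value is required; otherwise compilation fails
    var changes = "mutable"
    val noChanges = "immutable"

    // Declare several variables at once
    var x, y = 20  // both x and y are assigned 20

  }
}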

Points to note about variable syntax:

  • When declaring variables and constants, the data type may be omitted, but an initial value must be assigned; the type is then inferred from that value.
  • If a value will not be modified, declare it with val to prevent bugs caused by accidental modification.

Operators

Operators in Scala are basically the same as in other programming languages; if you already know another language, you can skip this section. Scala's operators include:

  • Arithmetic operators
Operator  Description
+         Addition
-         Subtraction
*         Multiplication
/         Division
%         Remainder
  • Relational operators
Operator  Description
==        Equal to
!=        Not equal to
>         Greater than
<         Less than
>=        Greater than or equal to
<=        Less than or equal to
  • Logical operators
Operator  Description
&&        Logical AND
||        Logical OR
!         Logical NOT
  • Bitwise operators
Operator  Description
&         Bitwise AND
|         Bitwise OR
^         Bitwise exclusive OR
~         Bitwise negation
<<        Left shift
>>        Right shift
>>>       Unsigned right shift
  • Assignment operators
Operator  Description
=         Simple assignment: assigns the right operand to the left operand
+=        Add, then assign: adds the two operands and assigns the result to the left operand
-=        Subtract, then assign: subtracts the right operand from the left and assigns the result to the left operand
*=        Multiply, then assign: multiplies the two operands and assigns the result to the left operand
/=        Divide, then assign: divides the left operand by the right and assigns the result to the left operand
%=        Remainder, then assign: computes the remainder of the division and assigns the result to the left operand
<<=       Left shift, then assign
>>=       Right shift, then assign
&=        Bitwise AND, then assign
^=        Bitwise exclusive OR, then assign
|=        Bitwise OR, then assign
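
A few of the less obvious operators are worked through below on small integers. This is only a sketch; the object OperatorDemo and the helper compound are made-up names.

```scala
object OperatorDemo {
  val remainder = 7 % 3        // 1
  val intDivision = 7 / 2      // 3, integer division drops the fraction
  val bitAnd = 6 & 3           // 110 & 011 = 010 = 2
  val bitOr = 6 | 3            // 110 | 011 = 111 = 7
  val bitXor = 6 ^ 3           // 110 ^ 011 = 101 = 5
  val bitNot = ~6              // -7
  val shiftLeft = 1 << 4       // 16
  val signedShift = -8 >> 1    // -4, the sign bit is kept
  val unsignedShift = -8 >>> 28 // 15, zeros are shifted in from the left

  // Compound assignment only works on a var.
  def compound(start: Int): Int = {
    var n = start
    n += 5  // start + 5
    n *= 2  // (start + 5) * 2
    n <<= 1 // doubled again
    n
  }

  def main(args: Array[String]): Unit = {
    println(compound(1)) // 24
  }
}
```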

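Conditional statements

Conditional logic in Scala uses the if...else statement, which chooses the code to execute next depending on whether a condition is true or false. The main syntax to know:

if (condition1) {
   // runs if condition 1 is true
} else if (condition2) {
   // runs if condition 2 is true
} else if (condition3) {
   // runs if condition 3 is true
} else {
   // runs if all of the conditions above are false
}

This is basically the same as the if statement in Java, so readers who know Java can skip it.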
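
One difference from Java is worth a small sketch: in Scala, if...else is an expression that produces a value, so it can be used directly as a method body or on the right-hand side of an assignment. The grade method and its thresholds are hypothetical.

```scala
object IfDemo {
  // The value of the if...else chain is the value of the branch that runs.
  // grade and its thresholds are made up for this example.
  def grade(score: Int): String =
    if (score >= 90) "A"
    else if (score >= 60) "B"
    else "C"

  def main(args: Array[String]): Unit = {
    val label = if (grade(75) == "B") "pass" else "check"
    println(label) // pass
  }
}
```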

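Loops

Scala provides for, while, and do...while loops. Note that Scala has no built-in break statement; breaking out of a loop is supported only from version 2.8 onward, through the scala.util.control.Breaks utility.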
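
Breaking out of a loop with scala.util.control.Breaks can be sketched as follows. The helper firstOver is a made-up example; break() only works inside a surrounding breakable block.

```scala
import scala.util.control.Breaks.{break, breakable}

object BreakDemo {
  // Returns the first value greater than limit, if there is one.
  // firstOver is a hypothetical helper for this example.
  def firstOver(limit: Int, values: Seq[Int]): Option[Int] = {
    var found: Option[Int] = None
    breakable {
      for (v <- values) {
        if (v > limit) {
          found = Some(v)
          break() // leaves the enclosing breakable block
        }
      }
    }
    found
  }

  def main(args: Array[String]): Unit = {
    println(firstOver(3, Seq(1, 5, 2, 9))) // Some(5)
  }
}
```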

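for loop

For the for loop, the following examples cover what you need. Syntax examples:

/*
 * for loops
 */
// Basic for loop
// Use a to b (inclusive) or a until b (exclusive) to iterate over a range
// The left arrow <- binds each value of the range to x in turn;
// the generator declares x itself, so no separate var is needed
for (x <- 1 to 10) {
  println("x = " + x)
}

// Iterating over a collection
// The loop visits each value in the List
val loopList = List(0, 1, 2, 3, 4, 5)
for (y <- loopList) {
  println("y = " + y)
}

// for loop with if filters
// Guards restrict the loop to elements that satisfy every condition;
// here only y = 5 is printed
for (y <- loopList; if y != 1; if y == 5) {
  println("y = " + y)
}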
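
The difference between to and until, and the effect of a guard, can be checked with a small sketch. The names RangeDemo, inclusive, exclusive, and sumOfEvens are made up for this example.

```scala
object RangeDemo {
  // to includes the upper bound, until excludes it.
  val inclusive: List[Int] = (1 to 5).toList    // List(1, 2, 3, 4, 5)
  val exclusive: List[Int] = (1 until 5).toList // List(1, 2, 3, 4)

  // A guard (if ...) skips the values that do not satisfy the condition.
  def sumOfEvens(upTo: Int): Int = {
    var total = 0
    for (n <- 1 to upTo; if n % 2 == 0) {
      total += n
    }
    total
  }

  def main(args: Array[String]): Unit = {
    println(sumOfEvens(10)) // 2 + 4 + 6 + 8 + 10 = 30
  }
}
```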

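while loop

The while loop has the same syntax as in other programming languages:

// while loop
var z = 0
while (z < 5) {
  println("z = " + z)
  z += 1
}

do...while loop

The difference between do...while and while is that do...while runs the loop body first and then checks the condition, so the body always executes at least once. (The do...while syntax was dropped in Scala 3; this example targets Scala 2.)

// do...while loop
var d = 0
do {
  println("d = " + d)
  d += 1
} while (d < 5)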

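Methods and functions

Methods and functions differ only slightly in Scala, mainly in how they are defined: a function defined as a member of a class is usually called a method, while a function is a value that can be assigned to a variable. In Scala, functions can be defined with val and methods with def, so the two terms are often used interchangeably. Understanding the following method examples is enough:

// addInt is the method name
// x and y are the method parameters
// The Int after the parameter list is the return type
def addInt(x: Int, y: Int): Int = {
  var sum: Int = 0
  sum = x + y
  return sum
}

def main(args: Array[String]): Unit = {
  // Pass the result of calling getName() as the argument
  printName(getName())
}

def getName(): String = {
  return "Scala"
}

// String = "defName" declares a default argument, used when no argument is passed
def printName(name: String = "defName"): Unit = {
  println("name is " + name)
}

// * marks a variable-length (varargs) parameter, so any number of arguments can be passed
def canChangVar(strs: String*) = {
  for (str <- strs) {
    println("str = " + str)
  }
}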
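
The distinction between a method defined with def and a function value defined with val can be sketched as follows. All names here (addMethod, addFunction, addFromMethod, combine) are hypothetical.

```scala
object FunctionValueDemo {
  // A method, defined with def as a member of the object.
  def addMethod(x: Int, y: Int): Int = x + y

  // A function value, defined with val; it is an object of type (Int, Int) => Int.
  val addFunction: (Int, Int) => Int = (x, y) => x + y

  // A method can be converted to a function value where a function type is expected.
  val addFromMethod: (Int, Int) => Int = addMethod

  // Function values can be passed to other methods as arguments.
  def combine(values: Seq[Int], f: (Int, Int) => Int): Int = values.reduce(f)

  def main(args: Array[String]): Unit = {
    println(combine(Seq(1, 2, 3), addFunction))   // 6
    println(combine(Seq(1, 2, 3), addFromMethod)) // 6
  }
}
```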

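Closures

A closure in Scala can be understood as a function that accesses variables defined outside its own body, such as the local variables of another function.

def main(args: Array[String]): Unit = {
  println(getName("scala"))
}
// Closure: the function literal captures prefix, a variable defined outside it
val prefix = "name is "
val getName = (name: String) => prefix + name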
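
A slightly fuller sketch of a closure is a counter whose state lives in a local variable of the enclosing method. makeCounter is a made-up name for this illustration.

```scala
object ClosureDemo {
  // The returned function closes over count, a local variable of makeCounter.
  // count stays alive for as long as the returned function does.
  def makeCounter(): () => Int = {
    var count = 0
    () => {
      count += 1
      count
    }
  }

  def main(args: Array[String]): Unit = {
    val next = makeCounter()
    println(next()) // 1
    println(next()) // 2
  }
}
```

Each call to makeCounter creates a fresh count, so separate counters do not share state.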

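Scala collections

Scala provides a complete set of collection implementations. Collections are divided into mutable and immutable, which is different from the collections in Java.

List

A List is an immutable collection that allows duplicate elements, and it is a subtype of Seq. The values of a List cannot be modified after initialization, so keep this in mind when using it.

// Define lists: a regular list, an empty list, and a nested (multidimensional) list
val strList: List[String] = List("a", "b", "c")
var nullList: List[Nothing] = List()
nullList = Nil // Nil is the empty list
val mulList: List[List[Int]] = List(List(1, 1), List(2, 2))
// head returns the first element of the list
println("head:" + strList.head)
// tail returns a list of all elements except the first
println("tail:" + strList.tail)
// :: prepends an element to a list, producing a new list
// (the parentheses are needed because + binds more tightly than ::)
println("head::tail =" + (strList.head :: strList.tail))

For other operations, see the API documentation: http://www.scala-lang.org/api/current/scala/collection/immutable/List.html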
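
Because a List is immutable, operations that appear to change it actually return a new list. A minimal sketch, with hypothetical value names:

```scala
object ListDemo {
  val base: List[String] = List("b", "c")
  // :: prepends one element and returns a new list.
  val extended: List[String] = "a" :: base
  // ::: concatenates two lists and returns a new list.
  val joined: List[String] = base ::: List("d")

  def main(args: Array[String]): Unit = {
    println(extended) // List(a, b, c)
    println(joined)   // List(b, c, d)
    println(base)     // List(b, c), the original list is unchanged
  }
}
```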

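Set

A Set is a collection with no duplicate elements. Sets come in mutable and immutable variants depending on the package used, and the immutable one is the default. The mutable version is scala.collection.mutable.Set and the immutable version is scala.collection.immutable.Set; you choose between them by importing the corresponding package. Basic operations on scala.collection.immutable.Set:

// Define a set; by default this is the immutable scala.collection.immutable.Set
var set = Set(1, 2, 3)
println(set.getClass.getName)
// Drop the first element and return the rest as a new set
println(set.drop(1))

Basic operations on scala.collection.mutable.Set:

// Import the mutable Set
import scala.collection.mutable.Set

// Define a set
var set = Set(1, 2, 3)
println(set.getClass.getName) // scala.collection.mutable.HashSet
// Drop the first element and return the rest as a new set
// println(set.drop(1))
// Basic operations on a mutable set
set.add(6)     // add an element
set += 7       // add an element
set.remove(3)  // remove an element
set -= 1       // remove an element
println(set)   // HashSet(2, 6, 7)
// Convert the mutable set to an immutable set
val noSet = set.toSet
println(noSet.getClass.getName) // an immutable Set implementation, e.g. scala.collection.immutable.Set$Set3 on Scala 2.13

For other operations, see the API documentation: http://www.scala-lang.org/api/current/scala/collection/immutable/Set.html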
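
The practical difference between the two variants can be sketched side by side: adding to an immutable set returns a new set, while a mutable set is changed in place. The object and member names are made up for this example.

```scala
import scala.collection.mutable

object SetDemo {
  val immutableSet: Set[Int] = Set(1, 2, 3)
  // + returns a new set; immutableSet itself is not changed.
  val withFour: Set[Int] = immutableSet + 4
  // Adding an element that is already present changes nothing.
  val withDuplicate: Set[Int] = immutableSet + 3

  // A mutable set is modified in place by += and -=.
  def buildMutable(): mutable.Set[Int] = {
    val s = mutable.Set(1, 2, 3)
    s += 4
    s -= 1
    s
  }

  def main(args: Array[String]): Unit = {
    println(withFour.size)       // 4
    println(immutableSet.size)   // 3
    println(buildMutable().size) // 3
  }
}
```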

Map

Like the Map in Java, a Scala Map is a hash table that stores key/value pairs. Map also comes in mutable (scala.collection.mutable.Map) and immutable (scala.collection.immutable.Map) variants, and the immutable one is the default. Basic operations on scala.collection.immutable.Map:

// A Map; by default this is the immutable scala.collection.immutable.Map
// -> expresses a key/value pair
var map: Map[String, Int] = Map("a" -> 1, "b" -> 2)
// Look up a value by key
println(map("b"))               // 2
println(map.getClass.getName)   // an immutable Map implementation, e.g. scala.collection.immutable.Map$Map2 on Scala 2.13

Basic operations on scala.collection.mutable.Map:

import scala.collection.mutable.Map

// A Map
// -> expresses a key/value pair
var map: Map[String, Int] = Map("a" -> 1, "b" -> 2)
// Look up a value by key
println(map("b"))               // 2
println(map.getClass.getName)   // scala.collection.mutable.HashMap
// Check whether the map contains the given key
println(map.contains("c"))      // false
// Get the keys and values of the map
println(map.keys)               // Set(a, b)
println(map.values)             // Iterable(1, 2)
// Add an element to the map
map += ("c" -> 3)
println(map)                    // HashMap(a -> 1, b -> 2, c -> 3)

For other operations, see the API documentation: http://www.scala-lang.org/api/current/scala/collection/immutable/Map.html
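
Looking up a key that does not exist with map("key") throws a NoSuchElementException. A minimal sketch of the safer alternatives, get and getOrElse, follows; the map contents and names are made up for this example.

```scala
object MapDemo {
  val ports: Map[String, Int] = Map("http" -> 80, "https" -> 443)

  // get returns an Option instead of throwing for a missing key.
  val known: Option[Int] = ports.get("http")  // Some(80)
  val missing: Option[Int] = ports.get("ftp") // None
  // getOrElse supplies a fallback value for a missing key.
  val fallback: Int = ports.getOrElse("ftp", -1)
  // + on an immutable Map returns a new Map; ports itself is unchanged.
  val updated: Map[String, Int] = ports + ("ftp" -> 21)

  def main(args: Array[String]): Unit = {
    println(known)        // Some(80)
    println(missing)      // None
    println(fallback)     // -1
    println(updated.size) // 3
  }
}
```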

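Iterator

In Scala, an Iterator is not a collection container but a way of accessing the elements of a collection. If you know Java, you will recognize its most common operations, next and hasNext. Basic operations:

// Define an iterator
var iterator = Iterator(1, 2, 3, 4, 5)
// Traverse the iterator
while (iterator.hasNext) {
  print(iterator.next())
}
// An iterator can be traversed only once: after the loop it is exhausted,
// so iterator.size would now return 0. Create a fresh iterator to count
// the elements (calling size consumes that iterator as well)
iterator = Iterator(1, 2, 3, 4, 5)
println("Length: " + iterator.size)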

For other detailed operations, refer to the API documentation: http://www.scala-lang.org/api/current/scala/collection/Iterator.html
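
Since an iterator is a way of walking through a collection, it is usually obtained from a collection rather than built by hand. The sketch below assumes a List as the source; sumOnce is a hypothetical helper.

```scala
object IteratorDemo {
  val numbers: List[Int] = List(1, 2, 3)

  // Consumes the iterator with hasNext and next and returns the sum.
  def sumOnce(it: Iterator[Int]): Int = {
    var total = 0
    while (it.hasNext) {
      total += it.next()
    }
    total
  }

  def main(args: Array[String]): Unit = {
    val it = numbers.iterator
    println(sumOnce(it))               // 6
    println(it.hasNext)                // false, the iterator is exhausted
    println(sumOnce(numbers.iterator)) // 6, a new iterator starts from the beginning
  }
}
```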

Summary

This article summarized Scala's basic syntax and basic operations, which makes it easier to read the source code of other frameworks and to understand their underlying design. Later articles will cover advanced Scala programming, so stay tuned.


Origin my.oschina.net/u/3620858/blog/5471020