Big data: spark environment construction, local mode, standalone mode, zookeeper standby, yarn mode

Big data: spark environment construction, local mode

2022找工作是学历、能力和运气的超强结合体,遇到寒冬,大厂不招人,可能很多算法学生都得去找开发,测开
测开的话,你就得学数据库,sql,oracle,尤其sql要学,当然,像很多金融企业、安全机构啥的,他们必须要用oracle数据库
这oracle比sql安全,强大多了,所以你需要学习,最重要的,你要是考网络警察公务员,这玩意你不会就别去报名了,耽误时间!
与此同时,既然要考网警之数据分析应用岗,那必然要考数据挖掘基础知识,今天开始咱们就对数据挖掘方面的东西好生讲讲 最最最重要的就是大数据,什么行测和面试都是小问题,最难最最重要的就是大数据技术相关的知识笔试


Big data: spark environment construction, local mode

insert image description here
insert image description here
insert image description here
insert image description here
local is a process
that uses many threads to simulate a cluster

insert image description here
The team leader not only manages people, it can work by itself, and it works as an officer in local mode.
Got it

insert image description here
A jvm process
is only responsible for one task,
plus tasks, a second process is required

spark-standalone mode

insert image description here
insert image description here
insert image description here
Spark is also a master-slave architecture
. Note that the driver runs in the master process, which
is different from the concept of logical space.

insert image description here
Standalone is a fixed cluster, so many more drivers are required for multiple tasks. Anyway, it is enough for the department head to assign a few more drivers.

spark-standalone HA:zookeeper standby

Single-point master, if something goes wrong, gg
if multiple masters appear, if another boss is sick, we can change the chairman as soon as possible

insert image description hereinsert image description here
insert image description here
insert image description here
Understood.
Backup chairman, always on call

spark: yarn is the most frequently used by enterprises

insert image description hereinsert image description here
insert image description here
We are in the yarn of the original enterprise hadoop, just add spark.
This is just to replace the computing framework. It is
easy
to understand.


The previous role of yarn was resourcemanager, now let it serve as the master in spark

The role of worker is the head of the department, just let nodemanager take it directly
.

Do not engage in spark processes separately in yarn

When the task is running, the driver still has to be the team leader alone, and the executor has to do the work alone. It’s enough to run it in the yarn container. Here, the driver still goes in the worker process, not the master
insert image description hereinsert image description here
process
.
Anyway, it can be improved directly on yarn, giving it the ability to spark

insert image description here


Summarize

提示:重要经验:

1)
2) Learn oracle well, even if the economy is cold, the whole test offer is definitely not a problem! At the same time, it is also the only way for you to test the public Internet police.
3) When seeking AC in the written test, space complexity may not be considered, but the interview must consider both the optimal time complexity and the optimal space complexity.

Guess you like

Origin blog.csdn.net/weixin_46838716/article/details/131022840
Recommended