Big data: wordcount case review and RDD action operators: countByKey, reduce, fold, first, take, top, count, takeSample, takeOrdered

Big data: wordcount case review RDD programming operator

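Job hunting in 2022 is a combination of education, ability, and luck. In a downturn the big companies stop hiring, so many students who studied algorithms may have to look for development or test-development jobs instead.
For test development you have to learn databases: SQL and Oracle. SQL in particular is a must, and many financial firms and security agencies insist on Oracle databases.
In the author's view Oracle is more secure and far more powerful than SQL, so you need to learn it. Most importantly, if you plan to take the civil-service exam for the cyber police, don't bother registering if you can't use it; you would be wasting your time!
Meanwhile, since the cyber police data-analysis post tests the basics of data mining, starting today we will go through data mining properly. The most important part by far is big data: the aptitude test and the interview are minor issues, and the hardest, most important part is the written exam on big data technology.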



Filter first, and then combine: just take the two-element tuples (key-value pairs) and concatenate them.

countByKey counts key occurrences

The text is split into words, each word is paired with a count of 1, and then the keys are counted.
The result is a dict, not an RDD: countByKey is already an action operator.
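The counting step above can be sketched in plain Python. This is only a model of what `rdd.countByKey()` returns, not Spark itself; the helper `count_by_key` and the sample `lines` are hypothetical names made up for illustration.

```python
from collections import defaultdict

def count_by_key(pairs):
    """Plain-Python model of rdd.countByKey(): count how often each key occurs."""
    counts = defaultdict(int)
    for key, _value in pairs:      # the value is ignored; only keys are counted
        counts[key] += 1
    return dict(counts)            # a plain dict, not an RDD

lines = ["hello spark", "hello python hello"]   # hypothetical input text
# Split each line into words and pair every word with the count 1.
pairs = [(word, 1) for line in lines for word in line.split(" ")]
result = count_by_key(pairs)
print(result)   # {'hello': 3, 'spark': 1, 'python': 1}
```

Because the result is an ordinary dict, no further RDD operators can be chained after it.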

The reduce operator does not return an RDD; it returns a single aggregated value.
This is different from the earlier reduceByKey, which returns an RDD of (key, value) pairs.
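To make the difference concrete, here is a minimal plain-Python sketch. It models `rdd.reduce(f)` with `functools.reduce` and models `rdd.reduceByKey(f)` with a hypothetical helper `reduce_by_key`; neither is Spark's implementation, and the sample data is invented.

```python
from functools import reduce
from operator import add

numbers = [1, 2, 3, 4, 5]      # stands in for the elements of an RDD

# Model of rdd.reduce(add): the whole dataset collapses into one plain value.
total = reduce(add, numbers)

def reduce_by_key(pairs, func):
    """Model of rdd.reduceByKey(func): one aggregated value per key."""
    merged = {}
    for key, value in pairs:
        merged[key] = func(merged[key], value) if key in merged else value
    return list(merged.items())   # still a collection of (key, value) pairs

per_key = reduce_by_key([("a", 1), ("b", 1), ("a", 1)], add)
print(total)     # 15
print(per_key)   # [('a', 2), ('b', 1)]
```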
The fold operator aggregates the same way as reduce, but with an initial value. The initial value is applied both when aggregating within each partition and when combining the results between partitions.

fold is used less often.
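The effect of the initial value is easiest to see with explicit partitions. The sketch below is a plain-Python model of `rdd.fold(zeroValue, op)`; the helper `fold` and the three-partition layout are assumptions chosen for illustration, not Spark code.

```python
from functools import reduce
from operator import add

def fold(partitions, zero_value, op):
    """Plain-Python model of rdd.fold(zero_value, op) over explicit partitions."""
    # Step 1: aggregate inside each partition, starting from the initial value.
    partials = [reduce(op, part, zero_value) for part in partitions]
    # Step 2: aggregate the per-partition results, again starting from the initial value.
    return reduce(op, partials, zero_value)

partitions = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]   # hypothetical 3-partition layout
result = fold(partitions, 10, add)
print(result)   # 85
```

The plain sum of 1 to 9 is 45. The initial value 10 is added once per partition (3 times) and once more when the partitions are merged, giving 45 + 40 = 85, so the result depends on the number of partitions.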
first: the return value is not an RDD, but a specific element.
take: the first n elements are returned as a list.
top: the elements are sorted in descending order, and then the first n are taken. Elements are compared using their built-in ordering; other cases need a custom comparison method.
takeSample: the first parameter says whether repeated sampling is allowed (False means it is not). "Repeated" refers to the same position being drawn again, rather than the element itself.
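The point about positions versus elements can be illustrated with a small plain-Python model of `rdd.takeSample(withReplacement, num, seed)`. The helper `take_sample` is hypothetical and uses Python's `random` module rather than Spark's sampler, so the actual values drawn will differ from Spark's.

```python
import random

def take_sample(data, with_replacement, num, seed=None):
    """Plain-Python model of takeSample: sampling is done by position."""
    rng = random.Random(seed)
    positions = range(len(data))
    if with_replacement:
        # The same position may be drawn more than once.
        picked = rng.choices(positions, k=num)
    else:
        # Each position is drawn at most once, so the result is capped at len(data).
        picked = rng.sample(positions, k=min(num, len(data)))
    return [data[i] for i in picked]

data = [1, 1, 1, 2, 3]   # hypothetical data with duplicate values
no_repeat = take_sample(data, False, 5, seed=1)
with_repeat = take_sample(data, True, 8, seed=1)
print(no_repeat)
print(with_repeat)
```

Even without replacement, the value 1 appears three times in `no_repeat`, because three different positions hold it. With replacement, the sample can be longer than the data itself.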

takeOrdered: the default order is ascending.
Under normal circumstances, you can change the order by transforming the number in the comparison function, for example by negating it.
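The ordering behaviour of first, take, top, and takeOrdered can be compared side by side. This is a plain-Python sketch using list slicing and `heapq` as stand-ins for the RDD calls named in the comments; the sample data is made up.

```python
import heapq

data = [3, 1, 4, 1, 5, 9, 2, 6]   # stands in for the elements of an RDD

first_item = data[0]                          # like rdd.first(): one element
first_three = data[:3]                        # like rdd.take(3): no sorting
top_three = heapq.nlargest(3, data)           # like rdd.top(3): descending
smallest_three = heapq.nsmallest(3, data)     # like rdd.takeOrdered(3): ascending
# Negating the number in the key function reverses the order,
# like rdd.takeOrdered(3, key=lambda x: -x).
largest_via_key = heapq.nsmallest(3, data, key=lambda x: -x)

print(first_item, first_three, top_three, smallest_three, largest_via_key)
```

Here `top_three` and `largest_via_key` hold the same elements, which is why a negated key is enough to make takeOrdered behave like top for numeric data.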


Summary

Tip: key lessons learned:

1)
2) Learn Oracle well; even in a cold economy, landing a test-development offer will definitely not be a problem! It is also the only route if you want to take the civil-service exam for the cyber police.
3) When aiming for AC (accepted) in the written test, you may ignore space complexity, but in the interview you must consider both the optimal time complexity and the optimal space complexity.

Origin blog.csdn.net/weixin_46838716/article/details/131032679