Big data: sparkSQL programming syntax, DSL style, SQL style, select, filter, where, groupBy, createTempView, sql.functions


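Job hunting in 2022 takes a powerful mix of academic credentials, ability, and luck. In a hiring winter, with the big companies not recruiting, many students who trained in algorithms may have to look for development or test-development jobs instead.
For test development you have to learn databases: SQL and Oracle. SQL in particular is a must, and many financial companies, security agencies, and the like have to use Oracle databases.
Oracle is more secure and far more powerful than SQL, so you need to learn it. Most importantly, if you plan to take the civil service exam for cyber police, don't bother signing up without it; you would be wasting your time!
Also, since the exam for the cyber police data analysis and application post covers the basics of data mining, starting today we will go through data mining properly. The most important part by far is big data: the aptitude test and the interview are minor issues, while the written test on big data technology is the hardest and most important.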


Big data: sparkSQL programming syntax

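DataFrame code comes in two styles: DSL style and SQL style.
DSL style means calling API methods directly on the DataFrame.
select()
extracts the columns you want, and show() prints the result.
select() takes column names as strings, or Column objects.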
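filter() filters rows.
You can write the condition directly as an expression string, or build it from Column objects the way you would write a Python comparison.
where() can also be used;
where and filter are equivalent.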
groupBy() groups the rows by a field, and then you can count each group.
The return value of df.groupBy() is not a DataFrame object.
It is a GroupedData object, a data structure that records the grouping relationship. GroupedData has APIs for aggregating the groups:
count, sum, avg, min, and max.
It cannot be shown directly;
it needs to be aggregated first, and then you can call show().
Got it?
A df can be registered as a table directly:
createTempView registers a temporary table that is usable under the current SparkSession object, much like a local variable.
createOrReplaceTempView registers the temporary table, or replaces it if the name is already taken.
createGlobalTempView registers a global table that can be used across SparkSession objects; put global_temp. in front of the table name when you query it.
Use spark.sql() to write SQL statements with the table name in them; what comes back is a DataFrame.
To use this on a file: read the file, convert it to a DataFrame, and split out the words so they form a two-dimensional table.





The DataFrame (DSL) form
can be done entirely through the API.

The withColumn method
operates on the value column:
use F (pyspark.sql.functions) to split on spaces, which gives an array,
and then explode the array into one row per word.
The result, df2, can then be
grouped directly, followed by count and show.
withColumnRenamed
renames a column.


Summary

Tips: key lessons:

1)
2) Learn Oracle well; even in a cold economy, landing a test-development offer is definitely not a problem! It is also the only route if you want to sit the civil service exam for cyber police.
3) When going for AC (all test cases accepted) in a written test, you can ignore space complexity, but in an interview you must consider both the optimal time complexity and the optimal space complexity.


Origin blog.csdn.net/weixin_46838716/article/details/131103610