Big Data: Birth of Big Data, Overview, Big Data Software Ecosystem, Apache Hadoop Overview
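In 2022, landing a job takes a strong mix of credentials, ability, and luck. In a hiring freeze, when the big tech companies stop recruiting, many students who trained for algorithm roles may have to look for development or test-development positions instead.
For test development you need to learn databases: SQL and Oracle. SQL above all is a must, and many financial firms and security agencies require Oracle databases.
Oracle is widely regarded as more secure and more powerful than lighter-weight SQL databases, so it is worth learning. Most importantly, if you plan to sit the civil service exam for the Internet police, do not bother registering unless you know it; you would be wasting your time!
Meanwhile, since the target is the Internet police data analysis and application post, the exam is bound to cover data mining fundamentals, so starting today we will go through data mining properly. The single most important topic is big data: the aptitude test and the interview are minor by comparison, and the hardest and most important part is the written exam on big data technology.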
Big Data
Big data starts as records of users' various operations and behaviors.
From those records you can work out what kind of person a user is.
Whatever the user wants to buy can largely be inferred from the data.
Birth of big data
Before the computer was invented, records were kept on paper; later they were kept on computers.
In the last century, computers were all standalone machines. Small-scale networks came next, and then global interconnection.
As the global Internet developed, it gained more and more users, and the data grew bigger and bigger.
That is the "big" in big data.
With this much data, can you handle it?
A single computer cannot solve this problem.
Distributed processing technology
When the amount of data is large, clusters of many servers are used to handle it. The data needs to be both stored and computed on.
Before 2008, small companies could not afford to do this; only large companies could.
Later, Aliyun (Alibaba Cloud) appeared, and Hadoop appeared as open source.
Open source is awesome: the technology gradually blossomed and bore fruit.
The core is distributed computing, distributed storage, and resource scheduling.
Apache Hadoop is superb at this.
Big Data Overview
The essence of big data is processing massive distributed data and mining the value behind it.
Big data in the digital age is described by five characteristics:
Volume: the amount of data is large.
Variety: the data comes from many kinds of sources.
Value: the value density is low, so the value has to be mined.
Velocity: the data grows fast, is acquired fast, and must be used fast, which demands high performance.
Veracity: the data quality has to support accurate, credible, and reliable conclusions.
In short, useful, high-quality results are mined from massive, fast-growing, multi-category, low-information-density data.
Big data software ecosystem
This part is the theoretical focus of the Internet police exam.
The 2023 special recruitment exam for the Internet police will test the following.
Storage technology:
HDFS is a distributed storage technology.
HBase is a NoSQL database technology.
HBase is built on top of HDFS.
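To make the idea of distributed storage concrete, here is a minimal sketch of the two things a system like HDFS does with a file: cut it into fixed-size blocks, and keep several copies of each block on different machines. The function names, the node names, and the tiny 8-byte block size are all assumptions made for illustration; real HDFS uses much larger blocks and a rack-aware placement policy rather than the simple round-robin shown here.

```python
def split_into_blocks(data: bytes, block_size: int) -> list[bytes]:
    """Cut a byte string into fixed-size blocks (the last one may be shorter)."""
    return [data[i:i + block_size] for i in range(0, len(data), block_size)]


def place_replicas(num_blocks: int, nodes: list[str], replication: int) -> dict[int, list[str]]:
    """Assign every block to `replication` distinct nodes, round-robin.

    This is a toy placement rule (an assumption for this sketch),
    not the policy HDFS actually uses.
    """
    if replication > len(nodes):
        raise ValueError("replication factor exceeds number of nodes")
    placement = {}
    for block_id in range(num_blocks):
        placement[block_id] = [
            nodes[(block_id + r) % len(nodes)] for r in range(replication)
        ]
    return placement


# Hypothetical 30-byte "file" and a hypothetical four-node cluster.
data = b"0123456789" * 3
nodes = ["node-a", "node-b", "node-c", "node-d"]

blocks = split_into_blocks(data, block_size=8)
placement = place_replicas(len(blocks), nodes, replication=3)

for block_id, holders in placement.items():
    print(block_id, len(blocks[block_id]), holders)
```

Because every block lives on three different nodes in this sketch, losing any single node still leaves two copies of each block, which is the reason replication is used at all.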
Next is the computing technology.
The core computing technology is MapReduce, and Hive is a database-style (SQL) computing technology built on MapReduce.
This is required material for the Internet police special recruitment exam.
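The MapReduce model is easiest to remember through the classic word-count example. The sketch below runs in a single Python process and only imitates the three phases (map, shuffle, reduce); it is not Hadoop's API, and the function names are assumptions chosen for illustration. In a real cluster the map and reduce calls would run in parallel on many machines.

```python
from collections import defaultdict


def map_phase(line):
    """Map: emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle_phase(pairs):
    """Shuffle: group all values that share the same key."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Reduce: collapse the list of values for one key into a single result."""
    return key, sum(values)


def word_count(lines):
    mapped = [pair for line in lines for pair in map_phase(line)]
    grouped = shuffle_phase(mapped)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


counts = word_count(["big data is big", "data is stored and computed"])
print(counts)
```

A Hive query such as a `GROUP BY` with `COUNT(*)` expresses the same grouping and summing in SQL, which is why Hive can be described as a computing layer on top of MapReduce.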
What about data transfer?
The ecosystem is rich in all three areas: storage, computing, and transmission.
Apache is the organization behind these projects.
Apache Hadoop Overview
Hadoop is a project of the Apache Software Foundation.
It provides distributed storage, distributed computing, and resource scheduling.
Great truths are simple: the design is simple, and it is important.
Resource scheduling is handled by YARN, a forward-looking component. It is very important.
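As a rough picture of what resource scheduling means, the sketch below hands out "containers" (a slice of memory and CPU cores) to the first node that still has room. Everything here is a hypothetical toy: the `Node` class, the `allocate` function, and the first-fit rule are assumptions for illustration, and YARN's real schedulers are considerably more elaborate.

```python
from dataclasses import dataclass


@dataclass
class Node:
    name: str
    free_memory_mb: int
    free_vcores: int


def allocate(nodes, memory_mb, vcores):
    """Place one container on the first node with enough free resources.

    Returns the chosen node's name, or None if no node can host it.
    """
    for node in nodes:
        if node.free_memory_mb >= memory_mb and node.free_vcores >= vcores:
            node.free_memory_mb -= memory_mb
            node.free_vcores -= vcores
            return node.name
    return None


# Hypothetical two-node cluster.
cluster = [Node("node-a", 4096, 2), Node("node-b", 8192, 4)]

first = allocate(cluster, memory_mb=3072, vcores=2)    # fits on node-a
second = allocate(cluster, memory_mb=2048, vcores=1)   # node-a is now too full
third = allocate(cluster, memory_mb=16384, vcores=1)   # larger than any node

print(first, second, third)
```

The point of a central scheduler is visible even in this toy: the computing jobs do not pick machines themselves, they ask for resources and are told where to run.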
Google's three well-known papers cover GFS (distributed storage), MapReduce (distributed computing), and Bigtable (structured storage).
Based on these papers, Hadoop was designed and released as open source.
Awesome work by some true masters of the field.
Hadoop comes in an open-source community edition and in commercial distributions.
Google is still the most impressive here: it developed this technology in-house.
Summary
Tip: key takeaways:
1)
2) Learn Oracle well. Even when the job market is cold, getting a test-development offer should not be a problem. It is also a prerequisite for sitting the Internet police civil service exam.
3) When you only need an AC (accepted) in a written test, you can often ignore space complexity, but in an interview you must consider both the optimal time complexity and the optimal space complexity.