This Time, Big Data Engineers Win!

The big data era has arrived. Data has become a strategic resource and a key element in improving competitiveness. Industries are therefore beginning to use data to guide decision-making: product recommendations in WeChat Moments and in e-commerce apps such as Taobao and JD.com, news and video feeds on Toutiao, Douyin, Kuaishou and other media platforms, and even travel route optimization all depend heavily on data-driven decisions.

With the explosion of big data, China's IT industry faces a new round of reshuffling. This is a rare opportunity, not only for companies but even more for practitioners who want to change direction.

As the figure shows, in a survey of future technology directions the most promising are big data, artificial intelligence, mobile development, and cloud computing. There is no clear winner among them; all are the directions that technical staff are most optimistic about. Big data technology in China is still in its infancy, so now is the best time to learn it.

In the industry these days, this is what we keep hearing:

Look at XXX: after switching to big data, their salary multiplied several times over;

Look at XXX: they were in a mid-life crisis, but after switching to big data they even moved into management;

Look at XXX: after switching to big data, he is now the one being pursued by girls ......

First, what is big data?

Big data is, in essence, still data, but it has new characteristics: wide-ranging data sources, diverse data formats (structured data, unstructured data, Excel files, text files, etc.), large data volume (at least TB level, and possibly PB level), and fast data growth.

Given these four characteristics, we need to consider the following questions:

Data sources are wide-ranging, so how do we collect and aggregate the data? Tools such as Sqoop, Camel, and DataX emerged in response.

Once the data is collected, how do we store it? Distributed file storage systems such as GFS, HDFS, and TFS emerged in response.

Because data grows quickly, storage must be able to scale horizontally.

Once the data is stored, how do we quickly transform it into a consistent format, and how do we quickly compute the results we want?

Distributed computing frameworks such as MapReduce address this problem. Writing MapReduce jobs takes a lot of Java code, however, so engines such as Hive and Pig emerged, which translate SQL (or, in Pig's case, Pig Latin scripts) into MapReduce jobs.
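To make the MapReduce idea concrete, here is a minimal single-machine sketch of the classic word count. It is not Hadoop code; the function names (map_phase, shuffle_phase, reduce_phase, word_count) are hypothetical and only illustrate the three phases that the framework would distribute across a cluster.

```python
from collections import defaultdict


def map_phase(line):
    """Map: emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle_phase(pairs):
    """Shuffle: group all emitted values by key."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Reduce: combine the values for one key into a single result."""
    return key, sum(values)


def word_count(lines):
    """Run the three phases in sequence on one machine (illustration only)."""
    pairs = [pair for line in lines for pair in map_phase(line)]
    grouped = shuffle_phase(pairs)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


if __name__ == "__main__":
    print(word_count(["big data", "Big cluster"]))
```

In a real cluster the map and reduce calls run in parallel on different nodes and the shuffle moves data over the network; a SQL engine such as Hive would generate an equivalent job from a GROUP BY query instead of hand-written code.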

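Plain MapReduce can only process data one batch at a time, so latency is too high. To get a result for every record as soon as it arrives, low-latency stream computing frameworks such as Storm/JStorm emerged.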

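However, if both batch processing and stream processing are needed, the approach above requires two clusters, a Hadoop cluster (HDFS + MapReduce + YARN) and a Storm cluster, which is hard to manage. One-stop computing frameworks such as Spark therefore emerged, which handle both batch processing and stream processing (in essence, micro-batch processing).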
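The difference between per-record streaming and micro-batching can be sketched in a few lines of plain Python. This is only an illustration under simplifying assumptions: the helper names (micro_batches, running_counts) are hypothetical, and batches are cut by record count here, whereas Spark Streaming cuts them by a time interval.

```python
from collections import Counter
from itertools import islice


def micro_batches(records, batch_size):
    """Split a (possibly unbounded) iterable of records into small lists."""
    iterator = iter(records)
    while True:
        batch = list(islice(iterator, batch_size))
        if not batch:
            return
        yield batch


def running_counts(records, batch_size):
    """Yield the cumulative per-key counts after each micro-batch."""
    totals = Counter()
    for batch in micro_batches(records, batch_size):
        totals.update(batch)  # ordinary batch-style computation on a small batch
        yield dict(totals)    # a fresh result is available after every batch


if __name__ == "__main__":
    for snapshot in running_counts(["a", "b", "a", "c", "a"], batch_size=2):
        print(snapshot)
```

With batch_size=1 the loop behaves like a per-record stream processor; larger batches trade latency for throughput, which is the compromise that micro-batching makes.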

Later, the Lambda architecture and the Kappa architecture emerged, providing general-purpose architectures for business data processing.
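As a rough sketch of the Lambda architecture, the snippet below counts page views with a batch layer, a speed layer, and a serving layer that merges the two. The event shape ({"page": ...}) and all function names are assumptions made for illustration and do not come from any particular framework.

```python
from collections import Counter


def batch_view(master_dataset):
    """Batch layer: recompute counts from the full, immutable history."""
    return Counter(event["page"] for event in master_dataset)


def speed_view(recent_events):
    """Speed layer: count only events the batch layer has not absorbed yet."""
    return Counter(event["page"] for event in recent_events)


def query(page, batch, speed):
    """Serving layer: answer a query by merging both views."""
    return batch[page] + speed[page]


if __name__ == "__main__":
    history = [{"page": "home"}, {"page": "home"}, {"page": "cart"}]
    recent = [{"page": "home"}]
    print(query("home", batch_view(history), speed_view(recent)))
```

The Kappa architecture drops the separate batch layer: everything is treated as a stream, and historical results are rebuilt by replaying the event log through the same streaming code.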

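To improve productivity and speed up computation, a number of auxiliary tools emerged:

Oozie, Azkaban: tools for scheduling timed jobs.

Hue, Zeppelin: graphical tools for managing job execution and viewing results.

Scala: the best language for writing Spark programs, although Python is also an option.

Python: used when writing scripts.

Alluxio, Kylin, etc.: tools that speed up computation by preprocessing (Kylin) or caching (Alluxio) stored data.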

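The above roughly covers the problems solved by the tools in the big data ecosystem. Once you know why each tool emerged, or what problem it was created to solve, you can study with a clear target.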

What skills does a big data engineer need?

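In the words of Xue Guirong, a researcher at Alibaba Group, big data engineers are people who "play with data": they draw out its commercial value and turn data into productivity. The biggest difference between big data and traditional data is that big data is online, real-time, massive in scale, and irregular in form, with no fixed pattern to follow, so people who know how to "play with" this data are very important. Shen Zhiyong holds that if big data is imagined as a mine that keeps accumulating, the work of a big data engineer is: "First, locate and extract the dataset that contains the information, which is like prospecting and mining. Second, turn it into information that can be used directly for decisions, which is like smelting. Finally comes application, such as data visualization." Analyzing history, predicting the future, and optimizing choices are therefore the three most important tasks of big data engineers when they "play with data". Through these three lines of work, they help companies make better business decisions.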

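Big data engineer skill map:

11 must-have skills:

Advanced Java (JVM, concurrency)

Basic Linux operations

Hadoop (HDFS + MapReduce + YARN)

HBase (Java API operations + Phoenix)

Hive (basic HQL operations and an understanding of how it works)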

Kafka

Storm/JStorm

Scala

Python

Spark (Core + Spark SQL + Spark Streaming)

Auxiliary tools (Sqoop/Flume/Oozie/Hue, etc.)

Advanced skills:

Machine learning algorithms, plus the Mahout and MLlib libraries

R

Lambda architecture

Kappa architecture

Kylin

Alluxio

How well are big data development engineers paid?

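Big data development engineers are the "giant pandas" of IT careers, and their pay is very high; this field shows once again that scarcity creates value. In domestic recruiting for the IT and communications industries, 10% of openings are related to big data, and the proportion is still rising.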

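In the United States, big data engineers earn an average of up to USD 175,000 per year. In China, at top Internet companies, big data engineers are paid roughly 20% to 30% more than other positions at the same level, and they are highly valued by employers.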

After reading all those reports of annual salaries that easily reach a million, are you full of anticipation?

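But for those of us with a relatively weak foundation, after looking at the skills to master, it is honestly difficult both in terms of expertise and in terms of study time. So how can those of us who are set on becoming big data engineers get there?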

 

Origin blog.csdn.net/lele989/article/details/91577915