Is becoming a big data development engineer really that good?

Since the World Cup in Russia kicked off, veteran powerhouse teams have been upset again and again. Judging by my WeChat Moments feed, the rooftop is already crowded with fans.

True fans will notice that the World Cup in Russia features a lot of new technology. For example, in the France vs. Australia match, the video assistant referee (VAR) had its moment in the spotlight, effectively avoiding a repeat of the "goal-line tragedy" from the Germany vs. England match years ago.

Moreover, at the World Cup in Russia FIFA has allowed teams to access data during matches, which means the value of big data in football has received another weighty official endorsement.

Put simply, "big data" gives a God's-eye view of football: from analyzing opposing teams, to making tactical suggestions, to adjusting players' condition, it can play a role at every step.

Why is big data so often a trending topic?

From mobile payments to the sharing economy, from the Internet of Everything to smart cities, from first accepting the concept of big data to the annual bills and music listening reports that flood our feeds, the value created by big data is showing itself step by step. Internet, finance, telecommunications, healthcare, transportation, public services: all of these industries have begun applying big data, and the possibilities for future big data scenarios are limitless.

The real value of big data lies in its applications.

According to the "Big Data Industry Development Plan (2016-2020)" issued by the Ministry of Industry and Information Technology, revenue from big data-related products and services is targeted to exceed 1 trillion yuan by 2020.

Of course, the privacy questions this raises also deserve attention.

Are big data salaries artificially inflated?

Good development prospects mean high mobility and strong demand for talent, so it follows that big data jobs pay more than ordinary ones. Put simply, the salary is high because the work creates greater value.

Take Amazon, for example. With its patented "predictive logistics", logistics work can begin before the user places an order. Amazon predicts your shopping behavior from your browsing history, search history, and even how long your mouse hovers over an item, and recommends products to you; this alone is said to bring in an additional 10% to 30% in profit. It also ships those goods in advance to the appropriate warehouse, which greatly shortens delivery time and makes users more willing to keep buying.

It is precisely because of this practical value that "big data talent" has become a hard requirement for companies. One executive search firm estimates that within the next five years, 94% of companies will need data scientists.

Data from Lagou (拉勾网)

As the chart shows, companies are offering good salaries to win over talent, yet the big data talent gap remains real.

According to the recently released "Big Data Talent Report", the country has only 460,000 big data professionals, and within the next 3 to 5 years the talent gap will reach 1.5 million.

According to statistics from the Data Analysis Professional Committee of the China General Chamber of Commerce, China's shortage of basic data analysis talent will reach 14 million in the future, while more than 60% of the positions that BAT (Baidu, Alibaba, Tencent) is recruiting for are big data roles.

How do you become a big data professional?

Whether from the perspective of personal development or of the industry, entering the big data field is a good choice.

Wang Qi, founder of Keen, one of China's top white-hat teams, has said: "In the era of big data, data is money."

Big data processing currently has two main software frameworks, Hadoop and Spark. Judging from companies' recruitment requirements, if you want to work in big data development, mastering Hadoop or Spark is indispensable.

Hadoop has three core components: HDFS, Yarn and MapReduce. HDFS stores large amounts of data. Yarn is Hadoop's own resource management framework. MapReduce is a distributed computing framework that runs on Yarn and computes over the data distributed across HDFS.
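
To make the MapReduce programming model more concrete, here is a minimal single-machine sketch of its three phases (map, shuffle, reduce) applied to word counting. It is plain Python, not Hadoop code: the function names and the sample input are made up for illustration, and on a real cluster the input would be file blocks in HDFS with the phases scheduled by Yarn across many machines.

```python
from collections import defaultdict
from itertools import chain


def map_phase(line):
    """Map: emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle_phase(pairs):
    """Shuffle: group emitted values by key, as the framework does between map and reduce."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Reduce: combine all values for one key into a single result."""
    return key, sum(values)


def word_count(lines):
    """Run the three phases in sequence on an in-memory list of lines."""
    mapped = chain.from_iterable(map_phase(line) for line in lines)
    grouped = shuffle_phase(mapped)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


# Hypothetical sample input standing in for lines of a file stored in HDFS.
sample = ["Spark runs on Yarn", "MapReduce runs on Yarn"]
print(word_count(sample))
```

The programmer only writes the map and reduce functions; the grouping step in the middle is what the framework provides, and it is also where data moves between machines in a real deployment.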

Spark is roughly an improved version of MapReduce. Distributed computing tasks written with it are more concise in code, and it supports Python, which everyone likes, so it is quicker to get started with.

Judging by current technical trends, Spark has strong momentum, while some Hadoop components and Storm are on the wane. Hadoop's distributed computing framework MapReduce is known for its stability, but it is a computing framework based on disk I/O and performs poorly in iterative computation and interactive data mining. Spark, a memory-based computing framework, came into being precisely to address these MapReduce pain points.
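
The difference between the two access patterns can be illustrated with a toy, single-machine sketch. It is not Hadoop or Spark code: the dataset, the update rule and all function names are assumptions chosen for illustration. One loop re-reads its input from disk on every iteration, as a chain of disk-based jobs would; the other reads once and reuses the data in memory.

```python
import os
import tempfile


def load_from_disk(path, counter):
    """Read the whole dataset from disk and record that a read happened."""
    counter["disk_reads"] += 1
    with open(path) as f:
        return [float(line) for line in f]


def step(data, estimate, rate=0.5):
    """One iteration: move the estimate toward the mean of the data."""
    gradient = sum(estimate - x for x in data) / len(data)
    return estimate - rate * gradient


def iterate_disk_based(path, iterations):
    """Disk-based pattern: every iteration starts by reading its input from disk."""
    counter = {"disk_reads": 0}
    estimate = 0.0
    for _ in range(iterations):
        data = load_from_disk(path, counter)
        estimate = step(data, estimate)
    return estimate, counter["disk_reads"]


def iterate_in_memory(path, iterations):
    """In-memory pattern: read once, keep the dataset in memory, reuse it."""
    counter = {"disk_reads": 0}
    data = load_from_disk(path, counter)
    estimate = 0.0
    for _ in range(iterations):
        estimate = step(data, estimate)
    return estimate, counter["disk_reads"]


def demo(iterations=10):
    """Write a tiny hypothetical dataset to a temp file and run both patterns on it."""
    with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as f:
        f.write("\n".join(str(x) for x in [1.0, 2.0, 3.0, 4.0]))
        path = f.name
    try:
        return iterate_disk_based(path, iterations), iterate_in_memory(path, iterations)
    finally:
        os.remove(path)


(disk_estimate, disk_reads), (mem_estimate, mem_reads) = demo()
print("disk-based:", disk_estimate, "reads:", disk_reads)
print("in-memory: ", mem_estimate, "reads:", mem_reads)
```

Both loops arrive at the same estimate, but the disk-based one performs one read per iteration while the in-memory one performs a single read in total. With a tiny file the cost is negligible; the gap matters when the dataset is large and the algorithm needs many passes.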

With Spark so "hot" in computing-systems circles, learning Spark is indispensable if you want to become a big data engineer.

Before you start, you need some foundations:

1. Understand the Linux operating system and get familiar with some basic commands. There is no need for rote memorization; after using them more you will remember them naturally. You should also know some Java SE; you can find material online, or buy a "from beginner to master" style book.

2. Focus on mastering Spark, with a working understanding of Hadoop (HDFS, Yarn and MapReduce). Spark has grown into an ecosystem with many technologies. Early on you need to understand Spark Core for offline processing and Spark Streaming for real-time processing; the likes of Spark MLlib and Spark GraphX can wait until you actually need them. A working understanding of Hadoop is mentioned here because, in practice, Spark also uses HDFS when loading and storing data, and understanding the basics of Yarn is a must: Cloudera officially recommends running Spark on Yarn in cluster mode.

3. Learn Python. Python is easy to learn, and because Spark has a Python interface, you can use it to develop Spark programs.
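
As an illustration of Spark's Python interface, here is a minimal word-count sketch using the RDD API. It assumes the pyspark package is installed and, for simplicity, runs Spark in local mode rather than on a Yarn cluster; the function names, the application name and the sample input are made up for this example.

```python
def tokenize(line):
    """Split one line of text into lowercase words."""
    return line.lower().split()


def spark_word_count(lines):
    """Count words with Spark's RDD API. Requires the pyspark package."""
    # Imported inside the function so tokenize() can be used without pyspark.
    from pyspark.sql import SparkSession

    spark = (
        SparkSession.builder
        .master("local[*]")            # assumption: local mode, not a Yarn cluster
        .appName("word-count-sketch")  # hypothetical application name
        .getOrCreate()
    )
    try:
        counts = (
            spark.sparkContext.parallelize(lines)  # distribute the input lines
            .flatMap(tokenize)                     # one record per word
            .map(lambda word: (word, 1))           # emit (word, 1) pairs
            .reduceByKey(lambda a, b: a + b)       # sum the counts per word
            .collect()                             # bring results back to the driver
        )
        return dict(counts)
    finally:
        spark.stop()
```

Calling spark_word_count(["Spark runs on Yarn", "MapReduce runs on Yarn"]) should return a dictionary of word counts. The whole job is a short chain of method calls, which reflects the conciseness that makes Spark quick to get started with.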

Origin blog.csdn.net/mnbvxiaoxin/article/details/104654089