How do you switch to big data from zero? A systematic learning roadmap

Everyone knows that big data offers high salaries and good prospects, and that it calls for a Java foundation. So how should people who know only a little Java make the switch to big data? Today I will share a concrete learning roadmap for becoming a big data engineer. [PS: you can also learn big data without a Java background]

Sharing a career-change route from experience

The mainstream big data platform, Hadoop, is developed in Java, so Java programmers moving into big data development have a smoother transition as far as the language environment goes. Many big data application frameworks are also Java-based, so a good command of Java is an advantage in many big data projects.

Of course, Hadoop's core value is that it provides a distributed file system and a distributed computing engine, and most companies have no need to modify that engine. So besides being comfortable with programming, you usually also need to learn some data processing and data mining. In particular, if you are heading toward data mining engineering, you need to know more about the algorithms involved.

Data mining engineers do need to master programming tools, but in most cases they treat Hadoop as a platform and a tool: they use the interfaces it provides, through various scripting languages, to do data processing and data mining.

So if you are heading toward data mining engineering, mastering a language and library for distributed computing, such as Scala and Spark MLlib, may be more important.

A learning roadmap for Java programmers moving to big data engineering:

Step one: distributed computing frameworks

Master the Hadoop and Spark distributed computing frameworks, and understand distributed file systems, message queues, NoSQL databases, and related components. For example, learn Hadoop, MapReduce (MR), Spark, Hive, HBase, Redis, Kafka, and the like.
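To make the MapReduce idea more concrete before touching a real cluster, here is a minimal single-machine sketch of the map, shuffle, and reduce phases in plain Python. It does not use Hadoop or any of its APIs; the function names (`map_phase`, `shuffle_phase`, `reduce_phase`, `word_count`) and the sample text are made up for illustration only.

```python
from collections import defaultdict


def map_phase(line):
    """Map: emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle_phase(pairs):
    """Shuffle: group all values that share the same key."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Reduce: collapse the values for one key into a single count."""
    return key, sum(values)


def word_count(lines):
    """Run the three phases in order on an in-memory list of lines."""
    mapped = [pair for line in lines for pair in map_phase(line)]
    grouped = shuffle_phase(mapped)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


if __name__ == "__main__":
    # Hypothetical sample input, standing in for files on a distributed file system.
    sample = ["big data needs Hadoop", "Spark and Hadoop process big data"]
    print(word_count(sample))
```

On a real cluster the framework runs the map and reduce functions on many machines and handles the shuffle over the network; the sketch only shows the shape of the computation.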

Step two: algorithms and tools

Learn the various data mining algorithms, such as classification, clustering, association rules, regression, decision trees, and neural networks, and master one data mining programming tool: Python or Scala. The current mainstream platforms and frameworks already provide algorithm libraries, such as Mahout on Hadoop and MLlib on Spark, so you can start learning the algorithms through these interfaces and scripting languages.
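As one example of what a clustering algorithm actually does, here is a minimal k-means sketch in plain Python. It is a teaching sketch, not the Mahout or MLlib implementation: the helper names are invented here, and seeding the centroids with the first `k` points is a simplifying assumption (library implementations typically choose starting points more carefully).

```python
import math


def closest(point, centroids):
    """Return the index of the centroid nearest to the point."""
    distances = [math.dist(point, c) for c in centroids]
    return distances.index(min(distances))


def mean(points):
    """Coordinate-wise mean of a non-empty list of points."""
    n = len(points)
    return tuple(sum(coords) / n for coords in zip(*points))


def kmeans(points, k, iterations=10):
    """Cluster points into k groups; returns (centroids, labels)."""
    # Assumption for this sketch: seed the centroids with the first k points.
    centroids = [tuple(p) for p in points[:k]]
    for _ in range(iterations):
        # Assignment step: attach every point to its nearest centroid.
        labels = [closest(p, centroids) for p in points]
        # Update step: move each centroid to the mean of its members.
        for i in range(k):
            members = [p for p, label in zip(points, labels) if label == i]
            if members:
                centroids[i] = mean(members)
    # Final assignment so the labels match the returned centroids.
    labels = [closest(p, centroids) for p in points]
    return centroids, labels


if __name__ == "__main__":
    # Hypothetical 2-D data with two visible groups.
    data = [(1.0, 1.0), (8.0, 8.0), (1.2, 0.8), (8.2, 7.9), (0.9, 1.1), (7.8, 8.1)]
    centroids, labels = kmeans(data, k=2)
    print(centroids)
    print(labels)
```

Once the assignment and update steps are clear, calling the equivalent routine in a library is mostly a matter of learning its interface.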

Step three: mathematics

Fill in the mathematical background: advanced mathematics (calculus), probability theory, and linear algebra.
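To see why this math matters for the algorithms in step two, consider fitting a straight line by gradient descent: the update rule comes straight from the derivatives of the mean squared error. The sketch below is illustrative only; the function name `fit_line`, the learning rate, and the sample data are assumptions made for this example.

```python
def fit_line(xs, ys, learning_rate=0.05, epochs=2000):
    """Fit y = w * x + b by gradient descent on the mean squared error."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        # Prediction error for every sample under the current w and b.
        errors = [(w * x + b) - y for x, y in zip(xs, ys)]
        # Partial derivatives of mean((w*x + b - y)^2) with respect to w and b.
        grad_w = 2.0 / n * sum(e * x for e, x in zip(errors, xs))
        grad_b = 2.0 / n * sum(errors)
        # Step against the gradient to reduce the error.
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return w, b


if __name__ == "__main__":
    # Hypothetical noise-free data generated from y = 2x + 1.
    xs = [0.0, 1.0, 2.0, 3.0, 4.0]
    ys = [1.0, 3.0, 5.0, 7.0, 9.0]
    print(fit_line(xs, ys))
```

The gradient is the calculus part, the sums over samples become matrix products in linear algebra once there are many features, and choosing squared error as the loss is usually justified with probability theory.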

Step four: practice projects

1) Open source projects, for example TensorFlow: Google's open source library has more than 40,000 stars, which is remarkable, and it supports mobile devices;

2) Take part in data competitions;

3) Gain internship experience through company projects.

If you only do big data development or operations and maintenance, you can skip steps two and three. If you focus on applying existing algorithms to data mining, you can skip step three.

Origin blog.51cto.com/14296550/2411870