How to Get Started Learning Big Data

In recent years big data has taken off, and more and more people are joining the ranks of those learning it. Many of them are complete beginners, and some have never even touched a programming language, so some students lose confidence and decide they cannot learn big data. That is simply wrong. So here I will walk you through how big data should actually be learned.

I. Theoretical knowledge of big data

Before anything else, you need to understand what big data actually is. What is it built on, and in what scenarios is it used? Knowing this tells you whether you really want to learn it. If you take it up only because the name sounds impressive, you may find after a while that you do not enjoy it and that it does not suit you, and you will have wasted both time and money. So first work out what big data really is!

II. Basic programming language

Once you understand what big data is and find that it interests you, congratulations: you are ready to start the big data journey. To get into the big data industry you will have to write programs, which means learning a programming language. Which one should you learn? Java. Because our road is big data, you do not need to study Java in depth; finishing Java SE is enough.

If you want a good environment for learning big data, you can join QQ group 251956502, where it is easier to learn together, ask questions, and share materials.

III. Linux operating system and database

After a programming language, the next thing to learn is databases, because the data has to be stored somewhere. Where should you start? Begin with MySQL, the simplest relational database to pick up. If you still have time and energy, you can go on to learn Oracle as well. Once you have finished these two databases, learn the Linux operating system, because most enterprise servers run Linux, typically in a version without a graphical interface.
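To give a feel for what working with a relational database looks like, here is a minimal sketch. It uses SQLite through Python's built-in sqlite3 module as a stand-in for MySQL, only because it needs no server; the orders table, its columns, and the top_customers helper are made up for illustration. The SQL itself (CREATE TABLE, INSERT, GROUP BY, ORDER BY) is the kind of thing you would practise in MySQL too, although the dialects differ in details.

```python
import sqlite3


def top_customers(orders, limit=2):
    """Load (customer, amount) rows into a table and return the customers
    with the highest total spend, computed with a SQL GROUP BY query."""
    conn = sqlite3.connect(":memory:")  # throwaway in-memory database
    try:
        # Hypothetical table used only for this example.
        conn.execute(
            "CREATE TABLE orders (customer TEXT NOT NULL, amount REAL NOT NULL)"
        )
        conn.executemany(
            "INSERT INTO orders (customer, amount) VALUES (?, ?)", orders
        )
        rows = conn.execute(
            "SELECT customer, SUM(amount) AS total "
            "FROM orders GROUP BY customer "
            "ORDER BY total DESC, customer ASC LIMIT ?",
            (limit,),
        ).fetchall()
    finally:
        conn.close()
    return rows


if __name__ == "__main__":
    sample = [("alice", 30.0), ("bob", 12.5), ("alice", 20.0), ("carol", 40.0)]
    print(top_customers(sample))
```

The point is that the grouping and sorting happen inside the database, not in your program, which is the habit you will carry over to Hive and Spark SQL later.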

IV. The big data ecosystem

Hadoop

With those foundations in place, we move on to the big data ecosystem, starting with Hadoop, an offline distributed processing framework. Begin with its four core components: HDFS (solves the problem of how big data is stored), MapReduce (solves the problem of how big data is computed), YARN (resource scheduler), and Common (shared libraries). After the four core components, learn the external components that work alongside Hadoop, such as ZooKeeper (distributed coordination service), Sqoop (data migration), Hive (data warehouse), and HBase (column-oriented database).
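The MapReduce idea is easier to grasp with a tiny example. The sketch below is plain Python running on one machine, not Hadoop's Java API, and the function names (map_phase, shuffle, reduce_phase, word_count) are made up for illustration. It only shows the three conceptual steps of the classic word-count job: map each line to (word, 1) pairs, group the pairs by key, then reduce each group to a total.

```python
from collections import defaultdict


def map_phase(line):
    """Map step: turn one line of text into (word, 1) pairs."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle(pairs):
    """Shuffle step: group all values that share the same key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce step: collapse one key's values into a single count."""
    return key, sum(values)


def word_count(lines):
    """Run map, shuffle, and reduce in sequence over an iterable of lines."""
    mapped = [pair for line in lines for pair in map_phase(line)]
    grouped = shuffle(mapped)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


if __name__ == "__main__":
    print(word_count(["big data", "Big Data tools"]))
```

In a real Hadoop job the map and reduce steps run in parallel on many machines and the framework does the shuffle over the network; here everything happens in one process so you can follow the data flow.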

Spark

After Hadoop, we learn another distributed computing framework for offline processing: Spark. Spark is faster than Hadoop's MapReduce, first because it computes in memory, and second because it plans each job as a DAG (directed acyclic graph). To work with Spark you need to learn:

Scala (functional programming language), Kafka (message-queue middleware), Spark SQL, Spark Core, Spark Streaming (real-time micro-batch processing), Spark Structured Streaming (Spark's unified batch and stream processing), and Redis (in-memory database)
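The DAG idea mentioned above can be sketched in a few lines. This is a toy in plain Python, not Spark's API: the LazyDataset class and its plan method are hypothetical names, and the plan here is just a straight chain of steps, the simplest possible DAG. What it does show is the behaviour Spark relies on: transformations such as map and filter only record what should happen, and nothing is computed until an action such as collect asks for the result.

```python
class LazyDataset:
    """Toy dataset that records transformations and runs them only on collect()."""

    def __init__(self, source, steps=()):
        self._source = source
        self._steps = tuple(steps)  # recorded plan, nothing executed yet

    def map(self, fn):
        # Transformation: return a new dataset with one more planned step.
        return LazyDataset(self._source, self._steps + (("map", fn),))

    def filter(self, predicate):
        # Transformation: also lazy, only extends the plan.
        return LazyDataset(self._source, self._steps + (("filter", predicate),))

    def plan(self):
        """Return the names of the recorded steps, in execution order."""
        return [kind for kind, _ in self._steps]

    def collect(self):
        """Action: execute the whole plan in one pass and return a list."""
        items = iter(self._source)
        for kind, fn in self._steps:
            # Inside a method, map/filter refer to Python's built-ins.
            items = map(fn, items) if kind == "map" else filter(fn, items)
        return list(items)


if __name__ == "__main__":
    squares_of_odds = (
        LazyDataset(range(1, 6))
        .filter(lambda x: x % 2 == 1)
        .map(lambda x: x * x)
    )
    print(squares_of_odds.plan())     # the plan exists before any work is done
    print(squares_of_odds.collect())  # the work happens here
```

Because the whole plan is known before anything runs, an engine can chain steps without writing intermediate results to disk, which is part of why Spark is faster than MapReduce for multi-step jobs.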

Flink

Flink is the hot framework right now. In 2016 Alibaba began promoting Blink, its own internal branch of Flink, and it continues to use Flink this year. Flink is supported across Alibaba's systems: the entire computing model at Ant Financial, Amap (Gaode), Taobao, and Cainiao logistics runs on Flink. Last year Alibaba also acquired the company behind Flink, so for the moment it is very hot.

V. Hands-on projects (the key part)

This hardly needs saying: when you go out looking for work, you will see that companies routinely require hands-on experience in the field. Practical work is very important, and it also consolidates what you have learned and puts it into practice.

Origin blog.51cto.com/14296550/2426364