Big data has been hot in recent years: good jobs, high salaries. Many people want to switch careers into it, and beginners often ask which technologies to learn and what learning path to follow. If that uncertainty is what is holding you back, let me ask you first: what is your background, and what interests you in computing and software? Do you come from a computer science background, with an interest in operating systems, hardware, networks, and servers? Do you come from software engineering, with an interest in software development, programming, and writing code? Or do you come from mathematics or statistics, with a particular interest in data and numbers?
These questions map onto the three main directions in big data: platform building, optimization, operations, and monitoring; big data development, design, and architecture; and data analysis and mining. Please do not ask me which is easiest, which has the best prospects, or which pays the most.
First, let me go over the 4V characteristics of big data:
Volume: the amount of data is large, growing from the TB scale to the PB scale;
Variety: there are many types of data, including structured data, unstructured text, logs, video, images, and location data;
Value: the commercial value is high, but it sits on top of massive amounts of data and has to be extracted quickly through data analysis and machine learning;
Velocity: processing must be timely, and massive data processing is no longer limited to offline computation.
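To make the last point concrete, here is a minimal sketch in plain Python of the difference between offline (batch) computation and streaming computation. It does not use any of the frameworks named in this post; the function names and the sample readings are made up for illustration only.

```python
from typing import Iterable, Iterator, List


def batch_mean(values: List[float]) -> float:
    """Offline style: wait until all data has been collected, then compute once."""
    return sum(values) / len(values)


def streaming_mean(values: Iterable[float]) -> Iterator[float]:
    """Streaming style: update the result as each record arrives."""
    count = 0
    total = 0.0
    for v in values:
        count += 1
        total += v
        # An up-to-date answer is available after every record.
        yield total / count


# Hypothetical sample data, for illustration only.
readings = [10.0, 20.0, 30.0, 40.0]

print(batch_mean(readings))            # one answer, after everything is in
print(list(streaming_mean(readings)))  # one answer per incoming record
```

The batch version gives a single result at the end, while the streaming version always has a current result. Real streaming systems deal with far more than this (distribution, fault tolerance, ordering), so treat it only as an intuition aid.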
Today there are more and more open-source big data frameworks, and they keep getting more capable. Below I list some of the common ones by technical area:
File Storage: Hadoop HDFS, Tachyon, KFS
Offline computation: Hadoop MapReduce, Spark
Streaming and real-time computation: Storm, Spark Streaming, S4, Heron
KV and NoSQL databases: HBase, Redis, MongoDB
Resource Management: YARN, Mesos
Log collection: Flume, Scribe, Logstash, Kibana
Messaging systems: Kafka, StormMQ, ZeroMQ, RabbitMQ
Query and analysis: Hive, Impala, Pig, Presto, Phoenix, Spark SQL, Drill, Flink, Kylin, Druid
Distributed coordination service: ZooKeeper
Cluster management and monitoring: Ambari, Ganglia, Nagios, Cloudera Manager
Data mining and machine learning: Mahout, Spark MLlib
Data synchronization: Sqoop
Task scheduling: Oozie
That is a lot to take in. Not sure how to start or what to learn? Do not worry: the QQ group for sharing big data learning materials (group number 142974151) can show you how to work with these tools. There is a free live big data course every evening at 20:10, covering big data analysis, big data programming, data warehousing, big data case studies, artificial intelligence, and data mining. It is all practical, substantive material, and both beginners and more advanced learners are welcome.