Can you learn big data development? What do you need to pay attention to?

Before learning big data development, you need to find an approach that suits you. Start by looking at your own situation: is your interest in big data genuine, how much do you currently know about it, and are your learning ability and comprehension suited to the subject? If you are switching from another field, prepare thoroughly. By starting level, learners fall into three categories:

The first category: complete beginners, who know nothing about the big data industry or its technology;

The second category: people with some programming background who know a little about the big data industry but have never really applied it;

The third category: engineers with work experience who understand the big data industry and want to switch to big data development.

Besides understanding your own situation, the study plan should also differ by stage and by the learner's background.

For complete beginners who want to teach themselves big data, it is not impossible, but many fail. The objective reason is a poor learning environment. The subjective reasons are a weak foundation, not being able to follow the material, not being able to learn it, finding it dull, and giving up.

For complete beginners who want to learn big data, the best approach is this: first follow developments in the big data field and immerse yourself in the big data environment. Then look for material on programming languages (an essential basic skill for big data), along with introductory big data videos and books. The basic technical knowledge still has to be learned.

If, after studying for a while, you feel you can cope, continue with foundational big data videos and books, one step at a time. If even getting started feels hard, either give up or be willing to invest in yourself and choose a reliable training institution.

 

Characteristics of Data Science and Misconceptions in Learning Big Data

1. Learning big data should be business-driven, not technology-driven: the core competence of data science is solving problems.

The core goal of big data is data-driven intelligence that solves specific problems, whether they are scientific research problems, business decision problems, or governance problems.

So before studying, be clear about the problem and understand it. This is what being problem-oriented and goal-oriented means: once the problem is clear, study and choose the appropriate technology and apply it, so that the effort is targeted. Equating big data with Hadoop, or big data analytics with Spark, is not rigorous. Different business domains need support from different theories, techniques, and tools. For example, text and web pages call for natural language modeling, data streams that change over time call for time series modeling, and images, audio, and video call for mixed spatiotemporal modeling.

Each step of big data processing has its own needs: acquisition needs web crawlers, import and export, and preprocessing support; storage needs distributed cloud storage and cloud computing resource management; computation needs models for classification, prediction, description, and so on; and applications need support for visualization, knowledge, and decision evaluation.
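The stages listed above can be sketched as a toy pipeline. This is only an illustration under loud assumptions: every function name and every sample record is hypothetical, an in-memory list stands in for distributed storage, and a keyword rule stands in for a real classification model.

```python
from collections import Counter


def acquire():
    """Stand-in for a crawler or import step: returns made-up raw records."""
    return [
        "  Spark job FAILED on node-3 ",
        "hadoop job finished on node-1",
        "",
        "Spark job finished on node-2",
    ]


def preprocess(raw_records):
    """Clean the data: trim whitespace, lowercase, drop empty records."""
    cleaned = (record.strip().lower() for record in raw_records)
    return [record for record in cleaned if record]


def store(records, storage):
    """Stand-in for distributed storage: append to an in-memory list."""
    storage.extend(records)
    return storage


def classify(record):
    """Toy keyword rule standing in for a real classification model."""
    return "failure" if "failed" in record else "success"


def compute(storage):
    """Computation step: count stored records per predicted class."""
    return Counter(classify(record) for record in storage)


def present(counts):
    """Stand-in for visualization: one plain-text summary line per class."""
    return [f"{label}: {count}" for label, count in sorted(counts.items())]


def run_pipeline():
    """Chain the stages: acquire, preprocess, store, compute, present."""
    storage = []
    store(preprocess(acquire()), storage)
    return present(compute(storage))


if __name__ == "__main__":
    for line in run_pipeline():
        print(line)
```

Each stage only depends on the output of the one before it, so in a real project any stand-in could be swapped for the tool the business problem actually calls for.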

So the business determines the technology, rather than the technology determining which business to consider. This is the first misconception to avoid when learning big data.

2. Learning big data should make good use of open source and not reinvent the wheel: open source is in the technical DNA of data science.

Open source has become an irreversible trend at the frontier of IT. Open source Android made smartphones affordable to ordinary people and brought us into the mobile Internet era. Open source smart hardware will lead us into the Internet of Things era. The open source big data ecosystem represented by Hadoop and Spark has accelerated the move away from IOE (IBM, Oracle, EMC) and forced traditional IT giants to embrace open source. Open source deep learning from Google and the OpenAI alliance (represented by TensorFlow, Torch, and Caffe) is accelerating the development of artificial intelligence.

R and Python became the standard languages of data science because of open source: they were born from open source and prospered because of it. Nokia declined because it failed to grasp the open source trend.

Why open source? Thanks to the industrialization and componentization of IT development, the basic technology stacks and tool libraries in every major field are already mature. The next stage is about assembling quickly, stacking building blocks quickly, and producing output quickly. Whether it is Linux, Android, or TensorFlow, the basic components largely use existing open source libraries, combined with new techniques and methods. They are built by assembly, and the wheel is rarely reinvented.

In addition, the crowdsourced development model of open source is an expression of collective-intelligence programming. A single company cannot gather the development intelligence of engineers around the world, but a starred open source project on GitHub can. So make good use of open source and collective-intelligence programming, and do not reinvent the wheel. This is the second misconception to avoid when learning big data.

3. Learning big data should start from one point and expand to the whole, not try to cover everything at once: data science has to balance fragmented and systematic learning. From the preceding analysis of the big data technology system, we can see that the depth and breadth of big data technology are beyond what traditional information technology can match.

Big data only produces value when it is combined with a specific application domain. Whether you are doing data science or data engineering is the key question to settle when learning big data.

When learning big data, be clear about whether you are doing data science or data engineering, what technical capabilities each requires, and which stage you are at now. Otherwise you are pursuing technology for technology's sake, and it will be hard to learn big data well and put it to use.

Origin blog.csdn.net/mnbvxiaoxin/article/details/104227238