Big data processing workflow

Background of the big data era:

The global consulting firm McKinsey was the first to declare that the era of big data had arrived. Big data had already existed for some time in physics, biology, environmental ecology, the military, finance, communications and other fields, but it only drew wide attention in recent years because of the growth of the Internet and the information industry.

Big data is regarded as another disruptive technology revolution in the IT industry, following cloud computing and the Internet. Cloud computing mainly provides the place and the channels for storing and accessing data assets, while the data itself is the truly valuable asset. Internal business transaction records, commodity and logistics information on the Internet, and information about human interaction and location online will grow far beyond the carrying capacity of existing enterprise IT architecture and infrastructure, and their real-time requirements will far exceed existing computing power.

How to put these data assets to work in the service of national governance, corporate decision-making and personal life is the core issue of big data. It is also the inner soul of cloud computing and its inevitable direction of evolution.

Background description:

Since 2012, the term "big data" has been mentioned more and more often. People use it to describe and define the massive volumes of data generated in the era of information explosion, and to name the related technological development and innovation.

It has appeared on the column covers of The New York Times and The Wall Street Journal, made its way into the news on the official White House website, been the theme of a number of Internet salons and seminars in China, and has even been written into the investment recommendation reports of keen-nosed securities firms such as State Securities, Guotai Junan and Galaxy Securities.

Data is expanding rapidly, and it will shape the future development of enterprises. Many companies may not yet be aware of the hidden problems that explosive data growth brings, but as time goes on, more and more people will realize how important data is to an enterprise.

As a New York Times column put it in February 2012, the age of "big data" has arrived: in business, the economy and other fields, decisions will increasingly be based on data and analysis rather than on experience and intuition.

Harvard University professor Gary King said: "This is a revolution. Vast data resources are allowing every field to begin a process of quantification. Whether in academia, business or government, all areas will begin this process."

 

I: Source:

1. web crawlers; 2. integration of heterogeneous data sources (Kettle); and so on.
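As a rough illustration of the crawler step, the sketch below extracts the links from one page so they could be queued for later fetching. It uses only the Python standard library and parses an inline HTML string instead of making a network request; the `LinkExtractor` class name, the sample page and the example.com URLs are all assumptions made for the example, not part of any particular crawler framework.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkExtractor(HTMLParser):
    """Collect absolute URLs from the <a href="..."> tags of one page."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url  # used to resolve relative links
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        for name, value in attrs:
            if name == "href" and value:
                # Turn relative paths into absolute URLs
                self.links.append(urljoin(self.base_url, value))


# Hypothetical page content; a real crawler would download this
page = (
    '<html><body><a href="/a">A</a> '
    '<a href="https://example.org/b">B</a><a>no href</a></body></html>'
)
extractor = LinkExtractor("https://example.com/index.html")
extractor.feed(page)
links = extractor.links
```

A real crawler would also need download logic, deduplication of visited URLs and rate limiting, which are left out here.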

 

II: Storage:

1. HDFS; 2. HBase; 3. Hive; and so on.

III: Processing (computation):

(Filter the data to reduce the data volume and the memory footprint.)

MapReduce, Spark, and so on.
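To make the filter-then-compute idea concrete, here is a minimal single-machine sketch of the MapReduce pattern in plain Python. It is not Hadoop or Spark code: the function names (`map_phase`, `shuffle`, `reduce_phase`, `word_count`) and the sample records are assumptions for illustration, and a real cluster would distribute each phase across many nodes.

```python
from collections import defaultdict
from itertools import chain


def map_phase(record):
    """Map: emit a (word, 1) pair for every non-empty token in a record."""
    for word in record.lower().split():
        word = word.strip(".,;:!?")
        if word:
            yield word, 1


def shuffle(pairs):
    """Shuffle: group all emitted values by key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(groups):
    """Reduce: sum the values collected for each key."""
    return {key: sum(values) for key, values in groups.items()}


def word_count(records):
    # Filter first, so blank records never reach the map phase
    kept = (r for r in records if r.strip())
    pairs = chain.from_iterable(map_phase(r) for r in kept)
    return reduce_phase(shuffle(pairs))


# Hypothetical input records, including one blank record to be filtered out
records = ["big data big ideas", "", "data moves fast"]
counts = word_count(records)
```

Filtering before the map phase is the point of the note above: the less data that flows into the later phases, the less memory the grouping step needs.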

IV: Storing the results again: MySQL:

One development direction: visualization, built in one of three ways:

1. J2EE; 2. a system built on Node.js; 3. a Python Flask setup.
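The sketch below shows one way the processed results could be written to a relational table and then read back as JSON for a visualization front end. So that it runs without a database server, it uses Python's built-in `sqlite3` module as a stand-in for MySQL. The table name `word_counts`, the helper names and the sample data are assumptions. The SQL is also SQLite-flavored: MySQL would need a different upsert form such as `REPLACE INTO`, and common MySQL drivers use `%s` placeholders instead of `?`.

```python
import json
import sqlite3


def store_counts(conn, counts):
    """Write aggregated results into a table (SQLite standing in for MySQL)."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS word_counts ("
        "word TEXT PRIMARY KEY, total INTEGER NOT NULL)"
    )
    conn.executemany(
        "INSERT OR REPLACE INTO word_counts (word, total) VALUES (?, ?)",
        counts.items(),
    )
    conn.commit()


def top_words_json(conn, limit=10):
    """Return the highest counts as a JSON string a chart page could consume."""
    rows = conn.execute(
        "SELECT word, total FROM word_counts "
        "ORDER BY total DESC, word ASC LIMIT ?",
        (limit,),
    ).fetchall()
    return json.dumps([{"word": w, "total": t} for w, t in rows])


# Hypothetical processed results from the computation step
conn = sqlite3.connect(":memory:")
store_counts(conn, {"big": 2, "data": 2, "fast": 1})
payload = top_words_json(conn, limit=2)
```

In a Flask or Node.js setup, a route handler would return a payload like this to the browser, where a charting library would render it.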

 

Another development direction: machine learning and artificial intelligence.

 

         

 


Origin www.cnblogs.com/xuezu2018/p/11564326.html