Mastering Big Data: seven core concepts you need to know

The concept of Big Data:

What exactly is Big Data? Many people may still be confused, so this article looks at some of the main definitions of big data. The first thing to note is that the industry generally agrees that Big Data is not just more data.

(1) Big Data: the original definition

Big data can be characterized in many ways. In 2001 Doug Laney first proposed the "3V" model: Volume, Velocity and Variety. Since then, many people in the industry have expanded the 3Vs to as many as 11 Vs, adding validity, veracity, value, visibility and so on.

(2) Big Data: Technology

Why was a 12-year-old term suddenly thrust into the spotlight? Not only because we now have more volume, velocity and variety than a decade ago, but because big data is driven by new technologies, especially the rapid development of open source technologies such as Hadoop and other NoSQL ways of storing and processing data.

Users of these new technologies needed a term to distinguish them from earlier technologies, and "big data" became the term of choice. If you go to a big data conference, you will find very few sessions involving relational databases, no matter how many Vs their vendors preach.

(3) The difference between big data and data

The problem with defining big data by technology is that the term is somewhat ambiguous, so every vendor in the industry can claim that its technology counts as big data technology. The following are two good ways to help businesses understand the difference between today's big data and the plain "lots of data" of the past.

Transactions, interactions and observations: this view was proposed by Shaun Connolly, vice president of corporate strategy at Hortonworks. Transactions are the primary data we have collected, stored and analyzed in the past. Interactions are the data obtained from page clicks and similar actions. Observations are data collected automatically.

(4) Big Data: Signals

SAP's Steve Lucas believes the world should be divided according to intention and timing, rather than by type of data. The "old world" is mainly about transactions: by the time such transactions are recorded, we can no longer take any action on them, so companies are constantly managing "stale data." In the "new world", companies can use new "signal" data to predict what will happen and intervene to improve the situation.

Examples include tracking people's attitudes towards a brand on social media, and predictive maintenance (using complex algorithms to help decide when to replace parts).

(5) Big Data: Opportunities

This definition comes from Matt Aslett of 451 Research, who describes big data as "data that was previously ignored because of technical limitations." (Technically, Matt uses the term "dark data" rather than big data, but the meaning is very close.) This is my favorite definition, because it is consistent with how the term is used in most articles and discussions.

(6) Big Data: Metaphor

Rick Smolan writes in his book that big data is "the process of helping the planet grow a nervous system, one in which we humans are just another type of sensor." Very deep, right?

(7) Big Data: old wine in new bottles

Many projects basically use existing technologies: what used to be known as BI or analytics has suddenly jumped into the ranks of big data.

The bottom line: although there is a lot of controversy over how to define Big Data, everyone agrees on one fact: Big Data is a big deal that will bring great opportunities in the coming years.

How to master Big Data:

As technology advances, the amount of data in daily work and life keeps climbing, and we have entered the big data era.

Data-intensive science, represented by big data, will become the cornerstone of a new wave of technological change. As data becomes more centralized and its volume grows further, securing massive data becomes more difficult, and distributed processing of data also increases the risk of data leakage.

The development of new technologies such as the Internet of Things, cloud computing and the mobile Internet has spread phones, tablets, PCs and sensors to every corner of the earth, where they act as sources and carriers of data. BYOD has emerged along with them.

(1) What is big data

According to analyst forecasts, by 2013 the amount of data carried by the Internet each year will reach 667 EB. What does that mean? 1 EB = 2^30 GB, so the volume is obviously enormous. The vast majority of this data is "unstructured data", which usually cannot be handled by a traditional database, but innovation in big data technology will bring great changes to our lives.
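To make the scale concrete, the unit arithmetic can be sketched in a few lines of Python. This is only an illustration: it assumes the binary convention used in this article (each unit is 2^10 times the previous one, so 1 EB = 2^30 GB), and the helper name `convert` is made up for the example.

```python
# Binary storage units as used in the article: each step up is a factor of 2**10.
# The list and the helper below are illustrative, not part of any library.
UNITS = ["GB", "TB", "PB", "EB", "ZB"]

def convert(value, from_unit, to_unit):
    """Convert a quantity between binary storage units."""
    steps = UNITS.index(from_unit) - UNITS.index(to_unit)
    return value * 2 ** (10 * steps)

one_eb_in_gb = convert(1, "EB", "GB")      # three steps down: 2**30
traffic_in_gb = convert(667, "EB", "GB")   # the 667 EB forecast, in GB

print(one_eb_in_gb == 2 ** 30)
print(f"667 EB = {traffic_in_gb:,} GB")
```

The same helper covers the other conversions quoted in this article, for example `convert(1, "PB", "TB")` or `convert(1, "ZB", "EB")`, which both come out as 2^10.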

(2) The four characteristics of big data

Huge data volume: all the printed material ever produced by humans amounts to about 200 PB (1 PB = 2^10 TB), and all the words ever spoken in human history amount to about 5 EB (1 EB = 2^10 PB). At present, a typical personal computer's hard drive holds on the order of terabytes, while the data held by some large enterprises is approaching the order of exabytes. Analyzing such a large volume of data is as difficult as one would imagine, and therefore calls for big data solutions.

Low value density: this is also a problem that needs to be solved in today's big data context. Value density is inversely proportional to the total amount of data. For example, in an hour of continuous surveillance video, the useful data may last only a very short time, perhaps a few seconds, so robust algorithms are needed to let computers "purify" the data very quickly.

Many data types: this needs little explanation. By type, data is divided into structured and unstructured data. Compared with the earlier structured data, which was mainly text and convenient to store, there is more and more unstructured data, including web logs, audio, video, images, geographic information and so on.

Fast processing speed: according to an IDC report, by 2020 global data usage will reach 35.2 ZB (1 ZB = 2^10 EB). Analyzing that much data requires equipment with greatly improved data processing speed.
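The "low value density" point among the four characteristics above can be illustrated with a toy calculation. The sketch below is hypothetical throughout: the per-second "activity score", the threshold and the function name `purify` are invented for the example and do not come from any real video analytics system.

```python
def purify(frames, threshold=0.5):
    """Keep only the (second, score) pairs whose score reaches the threshold."""
    return [(t, score) for t, score in frames if score >= threshold]

# Hypothetical hour of surveillance footage: one activity score per second,
# quiet everywhere except a 5-second event starting at second 1800.
frames = [(t, 0.9 if 1800 <= t < 1805 else 0.0) for t in range(3600)]

useful = purify(frames)
density = len(useful) / len(frames)  # fraction of the hour that carries value

print(len(useful), "useful seconds out of", len(frames))
print(f"value density: {density:.4%}")
```

Even in this tiny example almost all of the stored data is discarded, which is why the filtering step has to be both fast and robust when the input is measured in petabytes rather than seconds.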

(3) The benefits of big data for enterprises

In March 2012, the United States announced plans to invest $200 million to launch the "Big Data Research and Development Initiative", in order to enhance the ability to collect massive amounts of data and to analyze and extract information from it. The report "Big Data, Big Impact", released at the 2012 forum in Davos, Switzerland, said that data has become a new class of economic asset, like currency or gold, and many governments have even elevated big data to the strategic level.

(4) Global big data market trends

For enterprises, in the big data context data assets will replace talent as the key asset of companies and industries. Data can effectively help enterprises run the business, develop operations and supervise company processes, and analyzing it helps business leaders make decisions.

Big data is also reshaping core business assets. Companies must become familiar with and make good use of vast amounts of data, and the Internet industry has been the first to feel the deep changes that big data brings. Some Internet companies have already redefined their core competitiveness.

(5) Information security cannot be overlooked

Big data has become a significant target of cyber attacks. In cyberspace, big data is a big target that is more likely to be "discovered". On the one hand, big data means not only vast amounts of data but also more complex and more sensitive data, which attracts more potential attackers. On the other hand, because so much data is gathered in one place, a single successful attack yields more data, which effectively lowers the attacker's cost and increases the "yield."

Big data contains a great deal of personal and even private information, and storing it centrally inevitably raises the risk of significant data loss and damage. The ownership and usage rights of some sensitive data are not clearly defined, and many big data analyses do not take the individual privacy issues involved into account.

Companies differ widely in their awareness of big data, so shortcomings appear in their big data management and operations. When the pace of security tool upgrades cannot keep up with the non-linear growth of data, big data security vulnerabilities are exposed.

Big data technologies can easily become tools for attackers. While enterprises use data mining, data analysis and other big data technologies to obtain commercial value, hackers are using the same technologies to attack businesses and to collect as much useful information as possible.

Traditional threat detection is real-time detection based on signature matching at a single point in time, but an advanced persistent threat (APT) is a sustained process that cannot be detected in real time. In addition, the low value density of big data makes it difficult for security analysis tools to focus on the valuable points, so attackers can hide their attacks within big data.
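The contrast between point-in-time matching and a sustained attack can be sketched with a toy example. Everything here is an assumption made for illustration: the event fields, the signature list, the IP addresses and the "active on at least three distinct days" rule are invented, and real detection systems are far more involved.

```python
from collections import defaultdict

# Hypothetical signature list for point-in-time matching.
SIGNATURES = {"known-malware-hash"}

def point_in_time_alerts(events):
    """Flag events that individually match a known signature."""
    return [e for e in events if e["indicator"] in SIGNATURES]

def sustained_activity(events, min_days=3):
    """Flag sources whose low-level activity spans at least min_days distinct days."""
    days_seen = defaultdict(set)
    for e in events:
        days_seen[e["source"]].add(e["day"])
    return sorted(src for src, days in days_seen.items() if len(days) >= min_days)

# Invented event log: no single event matches a signature, but one source
# keeps coming back over several days.
events = [
    {"source": "10.0.0.5", "day": 1, "indicator": "odd-login"},
    {"source": "10.0.0.5", "day": 4, "indicator": "small-upload"},
    {"source": "10.0.0.5", "day": 9, "indicator": "odd-login"},
    {"source": "10.0.0.7", "day": 2, "indicator": "odd-login"},
]

print(point_in_time_alerts(events))  # nothing matches a signature
print(sustained_activity(events))    # the recurring source stands out
```

The second function only works because it looks at history across many days, which is exactly the kind of analysis that has to sift through large, low-density logs.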

(6) Avoid following the herd

Wu Hequan of the Chinese Academy of Engineering has suggested that, to develop big data, China needs to enact an information protection law and an information disclosure law as soon as possible. The aim is both to encourage data mining for the public and for social services and to prevent invasions of individual privacy, both to promote data sharing and to prevent data from being abused.

It is understood that China's newly stored data in 2010 was 250 PB, only 60% of Japan's and 7% of North America's. China has also not paid enough attention to the use and storage of big data, and some data is simply discarded after a certain time. Some departments and agencies hold a lot of data but, at the expense of others, would rather leave it unused than share it with the relevant departments, resulting in incomplete information or duplicated investment.

(7) The development of big data in China

Wu Hequan said that China does not place enough emphasis on information security for big data. In 2012 China's data storage capacity reached 364 EB; 55% of the data requires some degree of protection, but at present less than half of the data is protected.

In manufacturing and online commerce, big data analytics is used to understand customer needs and market trends. After analyzing the data, companies can manage procurement effectively and keep inventory at reasonable levels, greatly reducing the sales losses caused by blind purchasing. Big data is a highly application-driven service whose standards and industry structure have not yet formed. This is our opportunity for leapfrog development, but we must avoid the herd instinct.

 


Origin blog.csdn.net/sdddddddddddg/article/details/90952715