What is big data? An explanation that non-technical readers can follow

In recent years, big data has become a fashionable concept that comes up constantly and attracts a great deal of attention. Many people, on first hearing the term "big data", naturally take it literally: big data is a lot of data, and big data technology is the technology for storing large amounts of data.

But that is not the case.

Big data is more complicated than you might expect. It is not just a data storage technology but a whole family of techniques for extracting, integrating, managing, analyzing, and interpreting massive amounts of data. It is a huge framework system.

Beyond that, big data is also a new way of thinking and a new business model.

In this article, let's take five minutes to understand what big data really is.

Big Data Definition

First, we need to look again at the definition of big data.

The industry has many definitions of big data, some broad and some narrow.

The broad definition has a philosophical flavor: big data is the mapping and refinement of the physical world into the digital world, in which the features of the data are discovered and used to make decisions more efficiently.

The narrow definition is the one technical engineers give: big data is a new technical architecture that extracts value from large-capacity data through acquisition, storage, and analysis.

Personally, I still prefer the technical definition, ha ha.

Note the keywords in that sentence:

What does it do? - acquire data, store data, analyze data

What does it act on? - large-capacity data

What is the purpose? - to extract value

Acquiring, storing, and analyzing data are nothing new. We do these things every day whenever we use a computer.

For example, at the beginning of each month an attendance administrator collects each employee's attendance records, enters them into an Excel spreadsheet, saves it on a computer, counts how many people were late or absent, and then docks their pay.
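The attendance example can be sketched in a few lines of Python. This is only an illustration: the employee names and records are made up, and an in-memory CSV stands in for the Excel spreadsheet.

```python
import csv
import io

# Acquire: hypothetical attendance records (name, minutes late, absent flag)
records = [("Li", 0, False), ("Wang", 12, False), ("Zhao", 0, True), ("Chen", 3, False)]

# Store: write the records as CSV text, standing in for the spreadsheet file
buffer = io.StringIO()
writer = csv.writer(buffer)
writer.writerow(["name", "minutes_late", "absent"])
writer.writerows(records)

# Analyze: read the stored data back and count who was late or absent
buffer.seek(0)
rows = list(csv.DictReader(buffer))
late = [row["name"] for row in rows if int(row["minutes_late"]) > 0]
absent = [row["name"] for row in rows if row["absent"] == "True"]

print("late:", late)
print("absent:", absent)
```

At this scale a single machine and a spreadsheet are plenty; the three steps stay the same for big data, only the tools change.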

However, the same approach stops working once the volume of data gets large. In other words, data at a scale that a traditional PC and conventional software cannot cope with is what we call "big data".

How big is big data?

A traditional personal computer handles data at the GB / TB level. For example, hard drives today typically hold 1 TB, 2 TB, or 4 TB.

The relationship between TB, GB, MB, and KB should be familiar:

1 KB = 1024 B (KB - kilobyte)

1 MB = 1024 KB (MB - megabyte)

1 GB = 1024 MB (GB - gigabyte)

1 TB = 1024 GB (TB - terabyte)

So what level counts as big data? The PB / EB level.

Most people have never heard of these units. In fact, they simply continue the pattern, multiplying by 1024 each time:

1 PB = 1024 TB (PB - petabyte)

1 EB = 1024 PB (EB - exabyte)

The letters alone are not very intuitive, so let me give some examples.

1 TB fits on a single hard disk. It holds roughly 200,000 photos, or 200,000 MP3 songs, or 671 copies of the novel "Dream of Red Mansions".

1 PB takes about two storage cabinets. It holds roughly 200 million photos or 200 million MP3 songs. If one person listened to that music without stopping, it would last about 1,900 years.

1 EB takes about 2,000 storage cabinets. Lined up side by side, these cabinets would stretch 1.2 kilometers. Placed in a machine room, they would need a space the size of 21 standard basketball courts.

Internet giants such as Alibaba, Baidu, and Tencent are said to hold data volumes approaching the EB level.

EB is not the largest unit either. The total amount of data held by all of humanity is now at the ZB level.

1 ZB = 1024 EB (ZB - zettabyte)

In 2011, the amount of data created and replicated worldwide was 1.8 ZB.

By 2020, global electronic data storage is expected to reach 35 ZB. A machine room built to hold that much data would cover more area than 42 Bird's Nest stadiums.

The volume of data is not only large but also growing quickly, by about 50% per year. In other words, it roughly doubles every two years.
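As a quick check on the growth figure, the compounding can be worked out directly. The sketch below takes the 50% annual rate quoted above as its only input; the function names are just for illustration. Strictly, 50% a year compounds to a factor of 2.25 over two years, so "doubling every two years" is a slightly conservative rule of thumb.

```python
import math

ANNUAL_GROWTH = 0.50  # the 50% per year figure quoted in the text


def growth_factor(years, rate=ANNUAL_GROWTH):
    """How many times larger the data volume is after the given number of years."""
    return (1 + rate) ** years


def doubling_time(rate=ANNUAL_GROWTH):
    """Years needed for the data volume to double at a constant annual rate."""
    return math.log(2) / math.log(1 + rate)


print(f"growth over two years: x{growth_factor(2):.2f}")
print(f"doubling time: {doubling_time():.2f} years")
```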

Current big data applications have not yet reached the ZB level; they mostly operate at the PB / EB level.

Where big data sits on the scale of units:

1 KB = 1024 B (KB - kilobyte)

1 MB = 1024 KB (MB - megabyte)

1 GB = 1024 MB (GB - gigabyte)

1 TB = 1024 GB (TB - terabyte)

1 PB = 1024 TB (PB - petabyte)

1 EB = 1024 PB (EB - exabyte)

1 ZB = 1024 EB (ZB - zettabyte)
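The unit ladder above is easy to express in code. The following is a minimal sketch using the article's convention that each step is a factor of 1024; the helper names `to_bytes` and `humanize` are made up for this example.

```python
# Unit ladder from the article: each step up is a factor of 1024.
UNITS = ["B", "KB", "MB", "GB", "TB", "PB", "EB", "ZB"]


def to_bytes(value, unit):
    """Convert a quantity expressed in the given unit to a number of bytes."""
    return value * 1024 ** UNITS.index(unit)


def humanize(num_bytes):
    """Express a byte count in the largest unit that keeps the value at 1 or above."""
    value = float(num_bytes)
    for unit in UNITS:
        if value < 1024 or unit == UNITS[-1]:
            return f"{value:.2f} {unit}"
        value /= 1024


print(to_bytes(1, "PB"))               # number of bytes in one petabyte
print(humanize(to_bytes(1.8, "ZB")))   # the 2011 worldwide figure quoted above
print(to_bytes(1, "EB") // to_bytes(1, "TB"))  # how many 1 TB disks make up 1 EB
```

The last line shows why the jump from a personal hard drive to the EB level is so large: it takes more than a million 1 TB disks to hold one exabyte.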

Sources of data

Why is data growing so fast?

To answer that, we should look at the key stages in which human society has generated data.

Roughly speaking, there are three important stages.

The first stage came after the invention of the computer, and especially after the invention of the database, which greatly reduced the complexity of managing data. Businesses began recording the data they generated in databases.

Data in this stage was mostly structured data ("structured data" is explained later). It was generated passively.

The second stage came with the Web 2.0 era. The defining feature of Web 2.0 is user-generated content.

As the Internet and mobile devices became widespread, people started using blogs and social networks such as Facebook and YouTube, and actively generated large amounts of data.

The third stage is the stage of sensing systems. With the development of the Internet of Things, all kinds of sensing-layer nodes, such as the sensors and cameras now found in every corner of the world, began generating large amounts of data automatically.

Passing through these three stages, "passive - active - automatic", is what ultimately made the total amount of human data expand so rapidly.

The 4 Vs of big data

The industry sums up the characteristics of big data as the 4 Vs. The huge amount of data discussed above is Volume. The remaining three are Variety, Velocity, and Value.

Let's go through them one by one.

  • Variety (diversity)

Data comes in many forms: numbers (prices, transaction figures, weights, quantities, and so on), text (e-mail, web pages, and so on), images, audio, video, location information (latitude and longitude, altitude, and so on), and more. All of it is data.

Data is divided into structured data and unstructured data.

As the name suggests, structured data is data that can be represented with a predefined data model, or that can be stored in a relational database.

For example, the ages of everyone in a class, or the prices of all goods in a supermarket, are structured data.

Web articles, message content, images, audio, video, and the like are unstructured data.

On the Internet, unstructured data accounts for more than 80% of all data.

Big data fits this profile: diverse forms of data, with a high proportion of unstructured data.
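To make the difference between structured and unstructured data concrete, here is a small sketch. The product list and the review text are invented for illustration. The structured records follow a model defined in advance, so a question about them is a one-line filter; the free text has no such model, so even a simple question needs text processing.

```python
from dataclasses import dataclass


@dataclass
class Product:
    """Structured data: the fields are fixed by a model defined in advance."""
    name: str
    price: float


# Hypothetical supermarket price list
shelf = [Product("milk", 3.5), Product("rice", 12.0), Product("tea", 8.0)]

# Because every record has a price field, this query is trivial
cheap = [product.name for product in shelf if product.price < 10]

# Unstructured data: free text with no predefined fields (hypothetical review)
review = "Bought the tea last week, great smell but the box arrived dented."

# Answering "was anything damaged?" means searching the text itself
mentions_damage = any(word in review.lower() for word in ("dented", "broken"))

print(cheap)
print(mentions_damage)
```

Keyword matching is only the crudest way to handle text; the point is that unstructured data has to be interpreted before it can be queried at all.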

  • Velocity (timeliness)

Big data is also time-sensitive. The window between when data is generated and when it is consumed is very small. Data changes, and has to be processed, faster and faster. Data that used to change day by day now changes by the second, or even by the millisecond.

Again, let the numbers speak:

What happened in the data world in the minute that just passed?

Email: 204 million messages sent

Google: 200 million search requests submitted

YouTube: 2,880 minutes of video uploaded

Facebook: 695,000 status updates posted

Twitter: 98,000 tweets sent

12306: 1,840 tickets sold

……

How about that? Isn't it changing every moment?

  • Value (value density)

The last characteristic is value density.

Big data is large in volume, but that comes with low value density: only a small part of the data is truly valuable.

For example, when searching surveillance video for a criminal's appearance, out of several TB of video files the truly valuable part may be only a few seconds.

After the 2013 Boston Marathon bombing, investigators collected 10 TB of data from the scene (including mobile base station communication records, surveillance footage from nearby shops, gas stations, and newsstands, and video provided by volunteers) and eventually found photos of the suspects.
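A rough calculation shows just how low value density can be in the surveillance case. Every figure in this sketch is an assumption picked for illustration (footage size, video data rate, length of the useful clip); none of them comes from a real investigation.

```python
# All figures below are hypothetical, chosen only to illustrate the ratio.
TB = 1024 ** 4
MB = 1024 ** 2

footage_bytes = 2 * TB       # assumed total size of the footage to search
bytes_per_second = 2 * MB    # assumed video data rate
useful_seconds = 10          # assumed length of the clip that actually matters

# How long the footage runs, and what share of it is valuable
total_seconds = footage_bytes / bytes_per_second
value_density = useful_seconds / total_seconds

print(f"footage length: {total_seconds / 86400:.1f} days")
print(f"value density: {value_density:.6%}")
```

Under these assumptions the footage runs for more than a week, while the part worth finding is a few seconds long, which is why extracting value depends on analysis rather than on storage alone.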

Value of Big Data

The mention of value density brings us to the core of big data: value.

The main reason people proposed and study big data is to dig out the value inside it.

So what is big data worth?

As early as 1980, the famous futurist Alvin Toffler stated clearly in his book "The Third Wave" that "data is wealth", and called big data "the cadenza of the Third Wave".

  • First Wave: agriculture stage, about 10,000 years ago
  • Second Wave: industrial stage, beginning of the 17th century
  • Third Wave: Information stage, since the late 1950s

After the turn of the 21st century, as the second and third stages described earlier unfolded, along with the rise of the mobile Internet and a leap in the storage and computing capacity of cloud computing, big data began to be put into practice and attracted more and more attention.

In 2012, the World Economic Forum stated: "Data has become a new class of economic asset, just like currency and gold." This pushed the perceived value of big data to an unprecedented height.

Today, big data applications are entering our lives and affecting our everyday necessities.

Big data has developed so quickly because more and more industries and businesses have recognized its value and are trying to take part in extracting it.

In summary, the value of big data comes mainly from two aspects:

1. Helping companies understand their users

Through correlation analysis, big data links customers with products and services and pinpoints user preferences, so that companies can offer more accurate, better-targeted products and services and improve sales.

A typical example is e-commerce.

E-commerce platforms such as Alibaba's Taobao have accumulated huge amounts of user purchase data. In the early days this data was a nuisance and a burden, because storing it required a lot of costly hardware. Today, however, this data is Alibaba's most valuable asset.

With this data, a company can analyze user behavior and pinpoint the consumption patterns, brand preferences, and geographical distribution of its target customers, and use that to guide operations management, brand positioning, marketing, and promotion.

Big data can affect business performance directly. Its efficiency and accuracy far exceed those of traditional user research.

Besides e-commerce, sectors including energy, film and television, securities, finance, agriculture, industry, transportation, and public utilities all make use of big data.

2. Helping companies understand themselves

Besides helping a company understand its users, big data can help it understand itself.

Production and operations require large amounts of resources. Big data can analyze and pin down the specifics of those resources, such as reserves, distribution, and demand trends. Visualizing these resources helps managers see the company's operating state more intuitively, spot problems sooner, adjust operating strategy in time, and reduce business risk.

In the end it all comes down to one thing: big data serves decision-making.

Cloud computing and big data

This is a good place to answer a question many people have: what exactly is the relationship between big data and cloud computing?

One way to put it: data itself is an asset, and cloud computing provides the right tools for extracting the value of that asset.

Technically, big data depends on cloud computing. The massive data storage, massive data management, and distributed computing models within cloud computing are the foundation of big data technology.

Cloud computing is like an excavator, and big data is the mine. Without cloud computing, the value of big data cannot be realized.

Conversely, the processing demands of big data also drive the development and practical deployment of cloud computing technology.

That is, without the mine of big data, many of the powerful capabilities of the cloud computing excavator would have nothing to work on.

To borrow an old saying: cloud computing and big data complement each other.

Big data and the Internet of Things (and 5G)

The second question: what do big data and the Internet of Things have to do with each other?

I think this one is quick to figure out; in fact, it was already touched on earlier.

The Internet of Things is "an internet in which things are connected to one another". Its sensing layer generates vast amounts of data, which greatly promotes the development of big data.

Likewise, big data applications bring out the value of the Internet of Things and in turn stimulate demand for it. As more and more companies discover the value of the big data obtained through the Internet of Things, they become willing to invest in it.

In fact, this question can be extended further to "the relationship between big data and 5G".

The upcoming 5G raises connection speeds, which strengthens the "internet of people" and encourages people to actively create data.

On the other hand, 5G serves the Internet of Things even more. Its requirements, such as low latency and massive numbers of connected terminals, are all Internet of Things scenarios.

5G stimulates the development of the Internet of Things, and the Internet of Things stimulates the development of big data. All this powerful communications infrastructure is paving the way for the rise of big data.

The big data industry chain

Next, let me talk about the big data industry chain.

The industry chain is closely tied to the big data processing flow. Simply put, it consists of producing data, aggregating data, analyzing data, and consuming data.

Each link has its corresponding players.
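The four links of the chain can be pictured as a small pipeline. This is a toy sketch with made-up click events and hypothetical function names, meant only to show how each stage hands its output to the next, not how a real big data platform is built.

```python
from collections import Counter


def produce():
    """Produce: hypothetical click events as (user, product) pairs."""
    return [("u1", "tea"), ("u2", "rice"), ("u1", "tea"), ("u3", "tea"), ("u2", "milk")]


def aggregate(events):
    """Aggregate: gather the raw events into one count per product."""
    return Counter(product for _, product in events)


def analyze(counts):
    """Analyze: rank products from most to least popular."""
    return counts.most_common()


def consume(ranking):
    """Consume: turn the result of the analysis into a business decision."""
    top_product, _ = ranking[0]
    return f"promote {top_product}"


decision = consume(analyze(aggregate(produce())))
print(decision)
```

In the real industry chain each of these functions corresponds to a different group of companies, from those whose products generate the data to those who act on the analysis.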

As things stand, foreign vendors hold the larger share of the big data industry; the upstream segments in particular are dominated almost entirely by foreign companies. Domestic IT companies, by comparison, still lag well behind.

The challenges of big data

Having said so many good things about big data does not mean that it is perfect.

Big data also faces many challenges.

Besides the technical difficulty of managing the data, the biggest challenge for big data is security.

Data is an asset, but it is also private. No one wants their privacy exposed, so people pay more and more attention to protecting it. Governments are also steadily strengthening the protection of citizens' privacy and have introduced a number of laws.

In this climate, a company that collects user data needs to consider carefully whether doing so is ethical and legal. If it breaks the law, it will pay a very heavy price.

Moreover, even a company that obtains data legally still has to worry about malicious attacks and theft. That risk cannot be ignored.

Besides security, big data also faces problems such as energy consumption.

In other words, big data that is not properly protected and put to use is a hot potato, and you might be better off without it.


Origin blog.csdn.net/mnbvxiaoxin/article/details/104887379