Introduction to basic terms of big data


Preface

With the arrival of the big data era, people in all kinds of industries are becoming increasingly familiar with big data technology. Many follow its development and application closely and have high expectations for its prospects and for the wealth and opportunities it can create. Many people have already started their big data journey to explore and experience the beauty of big data. This article introduces some basic big data terms to help you understand and learn big data more quickly.

1. What is CYBER space?

CYBER space refers to the space formed by all man-made networks and devices, such as computer networks, communication networks, the Internet of Things, and the mobile Internet.

2. What is informatization?

Informatization is the process of transforming everything in the real world into data and storing it in the CYBER space.

3. What is data?

In computer science, data is the general term for all media carrying symbols that can be input into a computer and processed by a computer program.
In big data science, data refers to anything that can be input into the CYBER space; it is the only thing in the CYBER space that exists, can be measured and processed, and occupies space.

4. What is the data world?

The data world is composed of all the data in the CYBER space. CYBER space only serves as the carrier of the data world.

  • The growth rate of data in the data world is not controlled by humans.
  • The data world contains a large amount of unknown data, which holds unknown phenomena and unknown laws. This is one of the reasons data science has developed.
  • As science and technology advance, data is generated in more and more ways, so the types of data in the data world have become diverse and complex.

Because of these characteristics, the data world is sometimes called "data nature".

5. What is data science?

Data science is a science that takes data as its foundation and studies what data has in common, drawing on the breadth and diversity of data. It is the theory of exploring and discovering the data world in the CYBER space (I like to call it a different-dimensional space). Unlike the natural and social sciences, the research object of data science is the data in the data world of that different-dimensional space. (Data science will be introduced in detail later; this is only a brief description.)

6. The connection between data, information, knowledge and wisdom

To understand the connection between Data, Information, Knowledge, and Wisdom, we can use the DIKW pyramid model.

[Figures: three DIKW pyramid diagrams, sourced from the Internet]

From the three charts above, we can see that data can be transformed into information, knowledge, and wisdom, which shows its importance.

Here is an example to help make this concrete:

  1. Data: I have a huge collection of books on water-related topics, such as water quality, water recycling, water purification, and seawater desalination technology.
  2. Information: By reading and studying these books, I built a small, simple water purification system.
  3. Knowledge: Through continuous improvement and refinement, I built a large-scale, fully functional seawater desalination system.
  4. Wisdom: I compiled the implementation process and methods of this large-scale, fully functional seawater desalination system into a book, in which I explained the related technologies in detail and extended their applications.

7. What is big data?

So far, there is no authoritative, widely accepted definition of big data. It is undeniable, however, that big data is a very important strategic resource. Its status may come to surpass that of oil, because data is generated continuously and never runs out, and it contains great value.

8. Description of big data (4V and 5V)

4V characteristics of big data:

  1. Volume: large (data at the PB scale, 10^15 bytes, and above)
  2. Variety: many types (structured data, semi-structured data, unstructured data)
  3. Velocity: fast (data is generated quickly and also changes quickly)
  4. Value: large overall (note that the value density is low)

Compared with the 4V characteristics, the 5V characteristics add one more: Veracity, meaning that the truthfulness of the data is difficult to judge.
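
To make the Volume and Variety items a little more concrete, here is a minimal Python sketch. It assumes the decimal definition of a petabyte (10^15 bytes) given in the list above; the function names `to_petabytes` and `is_pb_scale` and the three sample values are made up purely for illustration and do not come from any big data library.

```python
import json

# One petabyte in bytes, using the decimal value (10^15) from the 4V list.
PB = 10 ** 15


def to_petabytes(num_bytes: int) -> float:
    """Convert a byte count to petabytes."""
    return num_bytes / PB


def is_pb_scale(num_bytes: int) -> bool:
    """Return True when a data set reaches the PB scale named under Volume."""
    return num_bytes >= PB


# Hypothetical samples of the three varieties of data.
# Structured: fixed columns, like one row of a database table.
structured_row = ("2020-11-15", "sensor-01", 21.5)
# Semi-structured: self-describing fields that may differ between records.
semi_structured = json.loads('{"id": 1, "tags": ["a", "b"]}')
# Unstructured: no predefined schema (free text, images, audio, video).
unstructured = "Free-form text has no predefined schema."

print(to_petabytes(3 * 10 ** 15))  # size of a 3 * 10^15 byte data set, in PB
print(is_pb_scale(5 * 10 ** 14))   # is half a petabyte already "PB scale"?
```

This only illustrates the size threshold and the three data categories; real systems measure volume and classify data in far more involved ways.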

Summary

That is all I want to share today. This article only briefly introduces some basic big data terms, so stay tuned for more on big data!

Origin blog.csdn.net/weixin_46658699/article/details/109694173