Big data (1) Definition and characteristics

Contents of this article:

1. The words written in front

2. Definition of big data

3. Big data characteristics

3.1. Volume characteristics of big data

3.2. Velocity characteristics of big data

3.3. Variety characteristics of big data

3.4. Value characteristics of big data

3.5. Veracity characteristics of big data

4. Units of big data

5. Data types involved in big data

6. Five core areas of big data

7. Big Data Trends


1. The words written in front

   A week ago, while watering the flowers on the roof of our building, I ran into a neighbor. She mentioned in passing that two people from a household on the fifth floor had passed away this year. Hearing this, I felt very sad and could not help shedding tears.

   In fact, I did not know the family on the fifth floor well, and our few exchanges took place more than ten years ago. At that time I was still working, and when I came home I would often meet the lady of the house walking her dog near my door. She was a teacher in the Propaganda Department of Huagong University, with short hair and a very cheerful manner. The dog was a white, long-haired Pekingese, already a bit old. It would pant after running a couple of steps and sometimes lie down on the ground to rest. Whenever this happened, she would stand patiently in the road and wait, her eyes gentle and doting, as if looking at her own child. I liked to tease the dog, pat its head, and ask whether it had been good that day and whether it had eaten well.

   Her husband was also a teacher at Huagong University. I never had the chance to meet him, but I know that he did a great deal for our building, such as handling many elevator-related matters. To this day, Teacher Wang's name is still the one listed as the owner of the elevator.

   Later, because of an accident, I stopped working and stayed at home to keep house. I hardly ever went out, so I had little chance to meet the two neighbors on the fifth floor. I remember Teacher Wang's signature line: "Give me a little sunshine and I will shine; give me a basket and I will lay eggs." He was a very humorous and optimistic person. After the epidemic began, Teacher Wang also posted funny epidemic-related pictures in the building's group chat to cheer everyone up.

   It all seems like only yesterday. I did not expect the two elders to leave us so suddenly.

   Good people will be rewarded, and the suffering and cultivation of this life will be exchanged for well-being in the next. I write this article in memory of my old neighbors.

   Sober in adversity

2023.8.24

2. Definition of big data

   Big data refers to a collection of data that cannot be captured, managed, and processed by conventional software tools within an acceptable period of time. Such data are massive, fast-growing, and diverse information assets.

   Big data refers to the collection of data with huge scale and various types, which cannot be efficiently processed by traditional data processing tools. It can be generated in a number of ways, including the internet, social media, sensors, financial transactions, etc.

   Big data usually has three characteristics: large data scale, diverse data types, and fast data processing speed.

   Applications of big data include business intelligence, finance, healthcare, energy, agriculture, transportation, and more. Big data technology can extract valuable information and knowledge from data to support decision-making, predictive analysis, marketing, and other work.

How major organizations define "big data":

(1) Oracle's definition:

   In short, big data refers to very large and complex data sets, especially those from new data sources. These data sets are so voluminous that traditional data processing software cannot manage them, but they can be used to address business problems that were previously very difficult to solve.

(2) Gartner's definition:

   "Big data" refers to massive, high-growth, and diverse information assets that require new processing models in order to deliver stronger decision-making power, insight, and process optimization capabilities.

(3) The McKinsey Global Institute's definition:

   A data set so large that its acquisition, storage, management, and analysis exceed the capabilities of traditional database software tools. It has four characteristics: massive data scale, fast data flow, diverse data types, and low value density.

3. Big data characteristics

Big data features:

   ♦ Volume: the size of the data, which determines its value and the information it may contain;
   ♦ Velocity: the speed at which data is obtained;
   ♦ Variety: the diversity of data types;
   ♦ Value: reasonable use of big data creates high value at low cost;
   ♦ Veracity: the quality of the data.

3.1. Volume characteristics of big data

   The massive nature of big data refers to the sheer volume of data. This data is often generated by data sources such as sensors, mobile devices, social media, financial data, medical records, etc.

The volume characteristic of big data includes the following aspects:

   ♦ Huge amount of data: the amount of data ranges from several gigabytes to hundreds of petabytes, and the scale is very huge.

   ♦ Rapid data growth: Data is growing exponentially, requiring constantly upgraded technologies and architectures to handle large-scale data.

   ♦ Diverse data sources: Data sources include sensors, mobile devices, social media, financial data, medical records and other fields.

   ♦ Various types of data: Data includes not only structured data, but also unstructured and semi-structured data, such as text, image, audio and video.

   ♦ High data complexity: Data is often highly complex and contains a large number of associations, interactions and changes, so efficient processing and analysis techniques are required.

   A large amount of data brings great challenges to data processing and analysis, and advanced techniques and tools are required to process and analyze these data.
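
   One common way to cope with data that does not fit in memory is to process it in fixed-size chunks and keep only running totals. The following is a minimal sketch of that idea, not a production design; the function names `chunked` and `streaming_mean` and the default chunk size are assumptions made for illustration.

```python
from itertools import islice


def chunked(iterable, size):
    """Yield successive lists of at most `size` items from any iterable."""
    it = iter(iterable)
    while True:
        chunk = list(islice(it, size))
        if not chunk:
            return
        yield chunk


def streaming_mean(values, chunk_size=1000):
    """Compute the mean of `values` while holding only one chunk in memory."""
    total = 0.0
    count = 0
    for chunk in chunked(values, chunk_size):
        total += sum(chunk)
        count += len(chunk)
    return total / count if count else 0.0


# The input can be any iterator, for example lines read lazily from a file.
print(streaming_mean(range(1, 101), chunk_size=7))
```

   Real big data systems distribute such chunks across many machines, but the principle of never materializing the full data set at once is the same.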

3.2. Velocity characteristics of big data

   The high-speed (Velocity) characteristic of big data means that data is generated, transmitted, stored, and processed very quickly and in huge quantities, with response times often measured in milliseconds or microseconds.

The high-speed characteristics of big data are mainly reflected in the following aspects:

   ♦ Data generated in real time: Big data is often generated in real time, such as user behavior on social media, sensor data generated by IoT devices, etc. These data need to be acquired and processed in real time.

   ♦ Fast data transmission speed: With the continuous improvement of network bandwidth and data transmission technology, a large amount of data can be quickly transmitted to the target system in a short time, such as cloud storage and data processing platform.

   ♦ Fast data storage: large volumes of client data can be written into the database quickly and processed in real time.

   ♦ Fast data processing speed: Big data processing adopts distributed computing and parallel computing technology, which can quickly process a large amount of data, such as real-time data mining, real-time analysis and reporting, etc.

   ♦ Fast data update speed: Big data processing requires a very high data update speed in order to ensure the timeliness and accuracy of data.

   To sum up, the high-speed characteristics of big data refer to the very fast speed of data generation, transmission, storage and processing, which can quickly respond to user needs and realize real-time data analysis and decision-making.
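
   Real-time processing is often expressed as a computation over a sliding time window of recent events. The sketch below illustrates the idea with a simple in-memory counter; the class name `SlidingWindowCounter` is hypothetical, and the code assumes that event timestamps (in seconds) arrive in non-decreasing order.

```python
from collections import deque


class SlidingWindowCounter:
    """Count events whose timestamps fall within the last `window_seconds`."""

    def __init__(self, window_seconds):
        self.window_seconds = window_seconds
        self._timestamps = deque()

    def add(self, timestamp):
        """Record one event and return the number of events in the window."""
        self._timestamps.append(timestamp)
        cutoff = timestamp - self.window_seconds
        # Evict events that are at or before the cutoff (too old).
        while self._timestamps and self._timestamps[0] <= cutoff:
            self._timestamps.popleft()
        return len(self._timestamps)


counter = SlidingWindowCounter(window_seconds=10)
for ts in [0, 5, 9, 12, 30]:
    print(ts, counter.add(ts))
```

   Each call does only a small amount of work per event, which is what allows this kind of computation to keep up with a fast stream.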

3.3. Variety characteristics of big data

The diversification of big data is mainly reflected in the following aspects:

   ♦ Diversified sources of data: Big data can come from a variety of sources, such as sensors, social media, logs, traditional databases, etc.

   ♦ Diversified data types: Big data types can be structured data (such as tabular data in relational databases), semi-structured data (such as XML files) and unstructured data (such as pictures, videos and sounds, etc.).

   ♦ Diversified data formats: Big data can be stored and transmitted in various standards and formats, such as CSV, JSON, XML, Avro, ORC, etc.

   ♦ Diversified data content: Big data can contain various types of information, such as text, numbers, images, audio, etc., and even intangible things, such as voice, emotion, opinion, etc.

   ♦ Diversified data scale: Big data can be massive or ultra-large in scale and may even grow exponentially, which also brings great challenges to data analysis and processing.
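
   To make the point about diverse formats concrete, the sketch below reads the same record from CSV, JSON, and XML using only the Python standard library and normalizes it to one common shape. The sample record (a sensor reading with `id`, `name`, and `temperature` fields) and the helper names are assumptions invented for illustration.

```python
import csv
import io
import json
import xml.etree.ElementTree as ET

# The same hypothetical sensor reading in three formats.
CSV_TEXT = "id,name,temperature\n1,sensor-a,21.5\n"
JSON_TEXT = '{"id": 1, "name": "sensor-a", "temperature": 21.5}'
XML_TEXT = (
    "<reading><id>1</id><name>sensor-a</name>"
    "<temperature>21.5</temperature></reading>"
)


def normalize(raw):
    """Convert field values to consistent Python types."""
    return {
        "id": int(raw["id"]),
        "name": str(raw["name"]),
        "temperature": float(raw["temperature"]),
    }


def from_csv(text):
    # DictReader uses the header row as field names; take the first data row.
    return normalize(next(csv.DictReader(io.StringIO(text))))


def from_json(text):
    return normalize(json.loads(text))


def from_xml(text):
    root = ET.fromstring(text)
    # Each child element becomes one field: tag name -> text content.
    return normalize({child.tag: child.text for child in root})


print(from_csv(CSV_TEXT))
print(from_json(JSON_TEXT))
print(from_xml(XML_TEXT))
```

   Columnar and binary formats such as Avro and ORC need third-party libraries and are not shown here.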

3.4. Value characteristics of big data

The value characteristics of big data include the following aspects:

   ♦ Volume: Big data has a huge amount of data, which enables people to understand and predict the changing trend of things more comprehensively and accurately, so as to make better decisions.

   ♦ Diversity: Big data can come from various sources, such as sensors, social media, and mobile devices, and can therefore contain different types of data, both structured (such as data in databases) and unstructured (such as text, logs, images, and videos). Data from different sources, types, and formats can be integrated and analyzed together, providing a more comprehensive information basis for decision-making and leading to more precise conclusions.

   ♦ Speed: The speed characteristic of big data means that the data processing speed and update speed are very fast, even in real time. Big data has the ability to process data at high speed, and can process a large amount of data in a short period of time, so as to quickly obtain information. Such data can help companies make decisions quickly, seize the market and gain a competitive advantage in the market.

   ♦ Scale: Big data is very large, containing billions or tens of billions of data points. This scale of data allows businesses to derive better information from larger data sets to better predict market and customer needs.

   ♦ Value: The true value of big data lies in extracting useful information from the data for analysis and application. This is very important for businesses as it can help them make better business decisions, improve products and services, optimize marketing, etc.

   ♦ Accuracy: The accuracy of big data refers to the accuracy and reliability of data. Ensuring the quality of data will help companies make better decisions and improve efficiency and effectiveness.

   ♦ Visualization: Data visualization allows people to better understand the data and thus discover patterns and trends in the data.

   ♦ Openness: Big data needs to be shared and accessed in an open manner so that more people can use and analyze the data.

   Overall, the value characteristic of big data helps enterprises better understand their business, customers, and markets, and formulate and execute strategies based on the results of data analysis, so as to obtain greater business value.

3.5. Veracity characteristics of big data

   The veracity characteristic of big data refers to the accuracy and reliability of data. Since big data often comes from a variety of sources and formats, it can have quality issues such as errors, missing values, duplicates, and ambiguities. Therefore, a big data system must ensure the veracity of its data in order to produce accurate and reliable results.

In order to ensure the authenticity of the data, the following measures can be taken:

   ♦ Data cleaning: remove errors, repetitions and unnecessary information by cleaning the data, so as to improve the quality and accuracy of the data.

   ♦ Data Validation: Validate data to ensure it complies with business rules and standards, ensuring data correctness and reliability.

   ♦ Data monitoring: monitor data sources, discover and correct data quality problems in time to ensure data authenticity.

   ♦ Database management: manage the database, including backup, recovery and maintenance, to ensure data security and consistency.

   ♦ Data sharing: When sharing data externally, it is necessary to ensure the authenticity and security of the data to ensure that the data will not be tampered with or misused.

   In short, the veracity feature of big data is a key element to ensure the quality and reliability of data and the correctness and reliability of big data systems.
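
   The cleaning and validation measures listed above can be sketched in a few lines. This is a minimal illustration under assumed rules, not a general-purpose tool: the record fields `user_id` and `amount`, and the rule that `amount` must be a non-negative number, are hypothetical business rules chosen for the example.

```python
def clean_records(records):
    """Drop records that fail validation and remove exact duplicates."""
    seen = set()
    cleaned = []
    for rec in records:
        user_id = rec.get("user_id")
        amount = rec.get("amount")
        # Validation: both required fields must be present.
        if user_id is None or amount is None:
            continue
        # Validation (assumed rule): amount is a non-negative number.
        if not isinstance(amount, (int, float)) or amount < 0:
            continue
        # Deduplication: keep only the first occurrence of each record.
        key = (user_id, amount)
        if key in seen:
            continue
        seen.add(key)
        cleaned.append({"user_id": user_id, "amount": amount})
    return cleaned


raw = [
    {"user_id": 1, "amount": 10.0},
    {"user_id": 1, "amount": 10.0},   # duplicate
    {"user_id": 2, "amount": -5},     # violates the non-negative rule
    {"user_id": None, "amount": 3},   # missing required field
    {"user_id": 3, "amount": 7},
]
print(clean_records(raw))
```

   In practice the rejected records would usually be logged or counted rather than silently dropped, so that data monitoring can detect quality problems at the source.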

4. Units of big data

   The smallest basic unit is the bit. The units in ascending order are: bit, Byte, KB, MB, GB, TB, PB, EB, ZB, YB, BB, NB, DB. The last three (BB, NB, DB) are informal names that are not part of any official standard.

   Each unit is 1,024 (2 to the tenth power) times the previous one:

1 Byte = 8 bits

1 KB = 1,024 Bytes = 8,192 bits

1 MB = 1,024 KB = 1,048,576 Bytes

1 GB = 1,024 MB = 1,048,576 KB

1 TB = 1,024 GB = 1,048,576 MB

1 PB = 1,024 TB = 1,048,576 GB

1 EB = 1,024 PB = 1,048,576 TB

1 ZB = 1,024 EB = 1,048,576 PB

1 YB = 1,024 ZB = 1,048,576 EB

1 BB = 1,024 YB = 1,048,576 ZB

1 NB = 1,024 BB = 1,048,576 YB

1 DB = 1,024 NB = 1,048,576 BB
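
   The conversions above follow one rule: each unit is 1,024 times the previous one. The sketch below applies that rule in both directions. The function names `human_readable` and `to_bytes` are hypothetical, and only units up to YB are included; the BB, NB, and DB labels from the list above could be appended to `UNITS` in the same way.

```python
# Unit labels in ascending order; each step is a factor of 1024 (2**10).
UNITS = ["Byte", "KB", "MB", "GB", "TB", "PB", "EB", "ZB", "YB"]


def human_readable(num_bytes):
    """Express a byte count in the largest unit that keeps the value >= 1."""
    if num_bytes < 0:
        raise ValueError("num_bytes must be non-negative")
    value = float(num_bytes)
    for unit in UNITS:
        # Stop at the first unit where the value is below 1024,
        # or at the last unit in the table.
        if value < 1024 or unit == UNITS[-1]:
            return f"{value:.2f} {unit}"
        value /= 1024


def to_bytes(amount, unit):
    """Convert an amount in the given unit to a number of bytes."""
    return int(amount * 1024 ** UNITS.index(unit))


print(human_readable(1_048_576))  # 1.00 MB
print(to_bytes(1, "GB"))          # 1073741824
```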

5. Data types involved in big data

Data types involved in big data

| Type of data | Concept | Manifestations | Typical scenarios |
|---|---|---|---|
| Structured data | Also known as row data: data that has a unified structure and can be expressed and managed in a two-dimensional form of rows and columns, such as relational database data. | Database tables, etc. | Enterprise ERP, finance, and HR databases, etc. |
| Semi-structured data | A data model suitable for database integration; it can also serve as a basic model for markup services that share information on the Web. | Mail, HTML, reports, etc. | Mail systems, web page information, report systems, etc. |
| Unstructured data | Data whose structure is irregular and that is inconvenient to express in a two-dimensional form of rows and columns, such as pictures, text, audio, and video. | Video, audio, etc. | Online video content, audio content, graphic images, etc. |

6. Five core areas of big data

   ♦  Data storage and computing,

   ♦  Data management,

   ♦  Data circulation,

   ♦  Data application,

   ♦  Data security.

7. Big Data Trends

   ♦ Cloud computing: Cloud computing has become the preferred method for enterprises to store and process large amounts of data.
   ♦ Artificial intelligence and machine learning: Artificial intelligence and machine learning techniques are increasingly being applied to big data analysis and prediction.
   ♦ Blockchain: Blockchain technology can be used for data security and privacy protection.
   ♦ Data Science: Professionals in the field of data science are working with big data analysts to better understand and utilize big data.
   ♦ Data Quality Management: Data quality management has become an important area in big data management to ensure the accuracy and consistency of data.
   ♦ Data visualization: A large amount of data needs to be presented through data visualization tools in order to better understand and utilize the data.
   ♦  Edge computing: Edge computing technology can process large amounts of data on-site, thereby reducing data transmission and processing time.


Origin blog.csdn.net/weixin_69553582/article/details/132483962