Does the era make the technology, or does the technology make the era?

Table of contents

1. Thinking questions

2. Internet Era

2.1 Web 1.0

2.2 Web 2.0

2.3 Web 3.0

3. Problems of the times

3.1 Big data

3.2 High concurrency

4. Who is making waves?

5. NoSQL

5.1 Concept

5.2 Classification

5.3 Features

5.4 Usage Scenarios

6. Conclusion


1. Thinking questions

A question: we often say that this is the Internet age, or the mobile Internet age. So what is the Internet age? When did it start, and when does it end?

Another question: the times change, and so does IT. Do changing times drive changes in technology, or do changes in technology drive the changing times?

2. Internet Era

2.1 Web 1.0

The first generation of the Internet (Web 1.0) is the PC (Personal Computer) Internet. Since it began developing in 1994, it has made global information transmission more efficient and lowered the barrier to obtaining information.

The strength of the first-generation Internet lies in efficient information transmission. As a result, applications such as online news, online search, e-mail, instant messaging, e-commerce, MMS and ringtones, client applications, and web games became popular, and Internet users were quickly connected.

Representative companies of this era include Yahoo, Google, Amazon, Sina, Sohu, Netease, Tencent, Baidu, Alibaba, JD.com, etc.

2.2 Web 2.0

The second generation of the Internet (Web 2.0) is the mobile Internet, which started around 2008 and is still thriving today.

The salient feature of the second-generation Internet is digitization. The popularity of smartphones has led to close interaction between online and offline. Mobile Internet services such as social networks, O2O (online-to-offline) services, mobile games, short videos, live streaming, news feeds, app distribution, and Internet finance have become mainstream. At this stage, Apple, Facebook, Airbnb, Uber, Xiaomi, ByteDance, Didi, Meituan, Ant Financial, Pinduoduo, and Kuaishou have risen rapidly and become leaders in their respective fields.

2.3 Web 3.0

The third generation Internet (Web 3.0) will be a decentralized Internet, aiming to create a new contract system and subvert the way individuals and institutions reach agreements.

In the future we will see an Internet different from Web 1.0 and 2.0:

  • Blockchain makes data an asset

  • Smart contracts create a programmable, intelligent economic system

  • Artificial intelligence builds a global intelligent brain and creates "digital people"

  • The Internet of Things maps real objects in the physical world extensively into the digital space

  • AR overlays the digital world on the physical world

  • 5G networks, cloud computing, and edge computing will build a far grander digital space

At this stage of development, a series of new "killer applications" will also appear, and a group of great new economic organizations, rather than monopoly giants, will be born.

3. Problems of the times

The times keep moving. Hidden beneath the trend of the times, what creates the opening for new technology to emerge?

3.1 Big data

The Web 3.0 era has brought explosive data growth. Traditional data scales have long since been left behind, and the industry has introduced the concept of big data.

The definition given by the McKinsey Global Institute is: a data collection so large that it greatly exceeds the capabilities of traditional database software tools in acquisition, storage, management, and analysis. It has four major characteristics: massive data scale, fast data flow, diverse data types, and low value density.

[Figure: "A Day in Data" infographic]

Here are the key figures generated daily highlighted in the infographic:

  • 500 million tweets

  • 294 billion emails sent

  • 4 petabytes (PB) of data newly created on Facebook

  • Every connected car creates 4 terabytes of data

  • 65 billion messages sent on WhatsApp

  • 5 billion searches

To get a feel for the units used above, here is how each data unit is defined:

1B (Byte) = 8b (bit)

1KB (Kilobyte) = 1024B

1MB (Megabyte, "mega" for short) = 1024KB

1GB (Gigabyte, "giga" for short) = 1024MB

1TB (Terabyte) = 1024GB

1PB (Petabyte) = 1024TB

1EB (Exabyte) = 1024PB

1ZB (Zettabyte) = 1024EB

1YB (Yottabyte) = 1024ZB

Facebook alone generates 4PB, that is 4,194,304GB, of new data per day, which gives a sense of how much data is produced every day.
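The conversion above is easy to check by hand. Here is a minimal sketch, assuming the 1024-based unit ladder listed earlier; the `convert` helper is a hypothetical name made up for this illustration.

```python
# Binary unit ladder from the list above: each step is a factor of 1024.
UNITS = ["B", "KB", "MB", "GB", "TB", "PB", "EB", "ZB", "YB"]


def convert(value: float, src: str, dst: str) -> float:
    """Convert between units, assuming 1024 of one unit per next-larger unit."""
    steps = UNITS.index(src) - UNITS.index(dst)
    return value * 1024 ** steps


facebook_daily_gb = convert(4, "PB", "GB")
print(f"{facebook_daily_gb:,.0f} GB")  # 4,194,304 GB
```

PB sits two steps above GB on the ladder, so the factor is 1024 × 1024.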

In recent years, with the rapid development of cloud computing, big data, the Internet of Things, artificial intelligence, and other information technologies, together with the digital transformation of traditional industries, the amount of data has grown geometrically. According to the "Data Age 2025" report released by IDC, the data generated globally each year will grow from 33ZB in 2018 to 175ZB in 2025, equivalent to 491EB of data generated every day. 【2017】

In 2021, the global real-time data volume will be 16ZB (1ZB is approximately 1 trillion GB), accounting for 19.6% of the total global data volume. In 2025, the real-time data volume will reach 51ZB, accounting for 29% of the total. "The proportion of data that needs to be processed quickly is rising rapidly."
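The figures quoted above can be cross-checked with a few lines of arithmetic. This is only a sketch: the variable names are made up for the illustration, and it assumes the same 1024-based units used earlier in this section.

```python
ZB_2025 = 175        # annual data volume in ZB (IDC figure quoted above)
REALTIME_2025 = 51   # real-time data volume in ZB (figure quoted above)

daily_eb = ZB_2025 * 1024 / 365           # 1 ZB = 1024 EB, spread over a year
realtime_share = REALTIME_2025 / ZB_2025  # fraction of data that is real-time
gb_per_zb = 1024 ** 4                     # ZB -> EB -> PB -> TB -> GB

print(f"{daily_eb:.0f} EB per day")             # 491 EB per day
print(f"{realtime_share:.1%} real-time share")  # 29.1% real-time share
print(f"{gb_per_zb:,} GB in 1 ZB")              # 1,099,511,627,776 GB in 1 ZB
```

So the 491EB-per-day figure follows from 175ZB a year, and "1 trillion GB" is a rounded-down version of roughly 1.1 trillion.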

3.2 High concurrency

Besides sheer data volume, the Web 3.0 era poses another tough challenge: high concurrency.

Generally speaking, high concurrency means that many users access the same API or URL at the same moment. It typically arises in business scenarios with a large number of active users who are highly concentrated. Let's get an intuitive feel for high-concurrency scenarios:

Scenario 1: Tmall Double Eleven

500,000+ transactions per second~

And each transaction = browse + order + settlement + payment...

Scenario 2: Buying tickets on 12306

In 2015, Alibaba Cloud partnered with 12306, providing it with free technical support and moving the query and access traffic of the 12306 website onto Alibaba Cloud. After this overhaul, the 12306 system sells more than 3.5 billion tickets a year, making it the largest real-time ticketing system in the world. Average daily sales exceed 9 million tickets, the highest daily sales volume was 10 million, and peak sales reached 1,000 tickets per second, which accounted for 80% of the total ticket sales volume. Page views on the peak day exceeded 150 billion, equivalent to every person in China visiting the ticketing page more than 100 times in a day. 【Data for 2023】

What else could make hundreds of millions of people stare at a website every day, refreshing hard, refreshing desperately, even paying for software to refresh for them? Only 12306. Madara would be willing to call it the strongest.
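To make "many users hitting the same resource at the same moment" concrete, here is a toy simulation of a ticket rush. It is a minimal sketch, not how 12306 or Tmall actually work: `TicketPool` and `simulate` are hypothetical names, the inventory lives in memory, and a plain lock stands in for whatever the real systems use.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


class TicketPool:
    """Hypothetical in-memory ticket inventory shared by all buyers."""

    def __init__(self, tickets: int) -> None:
        self.remaining = tickets
        self._lock = threading.Lock()

    def buy(self) -> bool:
        # The check and the decrement must happen atomically; without the
        # lock, two buyers could both see the last ticket and oversell it.
        with self._lock:
            if self.remaining > 0:
                self.remaining -= 1
                return True
            return False


def simulate(tickets: int, buyers: int, workers: int = 64) -> int:
    """Run `buyers` concurrent purchase attempts; return how many succeeded."""
    pool = TicketPool(tickets)
    with ThreadPoolExecutor(max_workers=workers) as executor:
        results = list(executor.map(lambda _: pool.buy(), range(buyers)))
    return sum(results)


sold = simulate(tickets=1_000, buyers=10_000)
print(sold)  # 1000: every ticket sold exactly once, 9000 attempts rejected
```

Even in this tiny model, every buyer queues on the same lock. That single point of contention is exactly what becomes painful when the "lock" is a database row and the buyers number in the hundreds of millions.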

4. Who is making waves?

Against the backdrop of this era, who are the ones making waves?

Storage: Ceph, Swift, HDFS, RocksDB, LevelDB, Memcached, Redis, HBase, MySQL, PostgreSQL, MongoDB...

Computing: Spark, MapReduce, Storm, OLAP, Flume, Kafka...

Cluster management: YARN, Mesos...

Virtualization: KVM, Xen, VMware, OpenStack, CloudStack...

Platform architecture: IaaS, PaaS, SaaS...

Microservices: Hessian, Motan, rpcx, gRPC, Thrift, Dubbo, Dubbox, Spring Cloud...

The technologies above are all trendsetters of this era. This column focuses on one from the storage direction: Redis.

5. NoSQL

When it comes to Redis, one concept is unavoidable: NoSQL, the trump card that put traditional relational databases on the spot.

[Figure: Traditional usage of Redis]

5.1 Concept

NoSQL refers to non-relational databases. With the rise of Web 2.0 websites, traditional relational databases have struggled to cope, especially with ultra-large-scale, high-concurrency, purely dynamic SNS-type Web 2.0 sites, and have exposed many problems that are hard to overcome. For example:

1. High performance: the need for highly concurrent database reads and writes

Web 2.0 websites need to generate dynamic pages and serve dynamic information in real time based on each user's personalized data, so static-page techniques are essentially unusable. The concurrent load on the database is therefore very high, often reaching tens of thousands of read and write requests per second. A relational database can barely withstand tens of thousands of SQL queries, but for tens of thousands of SQL writes, disk I/O can no longer keep up. In fact, even ordinary BS websites often face a need for highly concurrent writes.

2. Huge storage: the need for efficient storage of and access to massive data

On large-scale SNS websites, users generate a huge amount of activity data every day. Take FriendFeed as an example: it produced 250 million user activity records in a single month. For a relational database, running SQL queries against a table with 250 million records is extremely inefficient, even unbearable. Another example is the user login system of a large website such as Tencent or Shanda, which routinely has hundreds of millions of accounts; relational databases struggle to cope with this as well.

3. High scalability and high availability: the need for a database that scales and stays available

In a web-based architecture, the database is the hardest part to scale horizontally. As an application's users and traffic grow day by day, the database cannot simply gain performance and load capacity by adding more hardware and service nodes, the way web servers and app servers can. For the many websites that must provide uninterrupted 24-hour service, upgrading and expanding the database system is very painful: it often requires downtime for maintenance and data migration, and that downtime cuts into the company's revenue.

However, even a technology as powerful as NoSQL has its limits. When it first came out, the name was read as "No SQL": break away from the restrictions of SQL, stand on its own, and take the pure in-memory route. Countless facts have since slapped that idea in the face. Under the curse of Moore's Law, memory hardware still cannot make a substantial breakthrough, so No SQL and SQL will go on loving and fighting each other forever. That is why the second reading of NoSQL is the one generally accepted today: Not Only SQL.

5.2 Classification

| Classification | Examples | Typical application scenarios | Data model | Advantages | Disadvantages |
| --- | --- | --- | --- | --- | --- |
| Key-value (KV) | Tokyo Cabinet/Tyrant, Redis, Voldemort, Oracle BDB | Content caching; mainly used to handle high access loads on large amounts of data, and also used in some logging systems | A key pointing to a value (key-value pair), usually implemented with a hash table | Fast lookups | Data is unstructured and usually treated only as string or binary data |
| Column store database | Cassandra, HBase, Riak | Distributed file systems | Stored by column family, with data from the same column stored together | Fast lookups, strong scalability, easier distributed expansion | Relatively limited functionality |
| Document database | CouchDB, MongoDB | Web applications (similar to key-value, but the value is structured and the database can understand its content) | Key-value pairs in which the value is structured data | Loose data structure requirements; the table structure is variable and does not need to be predefined as in a relational database | Query performance is not high, and there is no unified query syntax |
| Graph database | Neo4J, InfoGrid, Infinite Graph | Social networks, recommender systems, etc.; focused on building relationship graphs | Graph structure | Graph algorithms can be applied, such as shortest-path addressing and N-degree relationship search | Often the entire graph must be computed to obtain the required information, and the structure is not well suited to distributed cluster solutions |
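The four data models in the table are easier to tell apart with toy data. The sketch below uses plain Python containers as stand-ins; it does not use the API of any of the databases named above, and all keys, records, and helper names (`find`, `shortest_path`) are invented for the illustration.

```python
from collections import deque

# Key-value: an opaque value looked up by key (hash-table access).
kv_store = {"session:42": b"\x01\x02opaque-bytes"}

# Column family: values of the same column are stored together.
column_store = {
    "name": {"row1": "Alice", "row2": "Bob"},
    "city": {"row1": "Paris", "row2": "Tokyo"},
}

# Document: the value is structured, so the store can query inside it.
documents = [
    {"_id": 1, "name": "Alice", "tags": ["admin"]},
    {"_id": 2, "name": "Bob", "email": "bob@example.com"},  # different fields
]


def find(docs, field, value):
    """Return the documents whose `field` equals `value`."""
    return [d for d in docs if d.get(field) == value]


# Graph: nodes plus relationships, here as an adjacency list.
friends = {"A": ["B", "C"], "B": ["D"], "C": ["D"], "D": ["E"], "E": []}


def shortest_path(graph, start, goal):
    """Breadth-first search; returns the shortest path as a list, or None."""
    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        if path[-1] == goal:
            return path
        for neighbor in graph[path[-1]]:
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append(path + [neighbor])
    return None


print(kv_store["session:42"])
print(list(column_store["city"].values()))  # ['Paris', 'Tokyo']
print(find(documents, "name", "Bob"))
print(shortest_path(friends, "A", "E"))     # ['A', 'B', 'D', 'E']
```

Note how the key-value store can only hand back the bytes, while the document store can answer a question about a field inside the value, and the graph store answers a question about relationships.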

5.3 Features

There is no clear scope or definition for NoSQL, but NoSQL databases share the following common characteristics:

Easy to expand

There are many kinds of NoSQL databases, but one thing they share is that they drop the relational features of relational databases. With no relationships between data items, they are very easy to scale out, which in turn brings scalability at the architectural level.
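Why does dropping relationships make scaling out easy? If every record can be found by its key alone, the key can decide which machine holds it. Here is a minimal sketch of that idea using hash-based sharding; `shard_for` and `ShardedStore` are hypothetical names, and the "nodes" are just in-memory dictionaries.

```python
import hashlib


def shard_for(key: str, shard_count: int) -> int:
    """Map a key to a shard index with a stable hash (md5 is used here only
    for its even distribution, not for security)."""
    digest = hashlib.md5(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % shard_count


class ShardedStore:
    """Hypothetical key-value store spread over several in-memory nodes."""

    def __init__(self, shard_count: int) -> None:
        self.shards = [{} for _ in range(shard_count)]

    def put(self, key: str, value: object) -> None:
        self.shards[shard_for(key, len(self.shards))][key] = value

    def get(self, key: str) -> object:
        return self.shards[shard_for(key, len(self.shards))].get(key)


store = ShardedStore(shard_count=4)
for i in range(1000):
    store.put(f"user:{i}", {"id": i})

print(store.get("user:7"))             # {'id': 7}
print([len(s) for s in store.shards])  # records spread across the 4 shards
```

No lookup ever needs to touch two shards, because no record refers to another. One caveat of this simple modulo scheme: changing the shard count remaps most keys, which is why consistent hashing is a common alternative.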

Large amount of data, high performance

NoSQL databases have very high read and write performance, and they hold up well even with large amounts of data. This comes from their non-relational nature and simple database structure. MySQL generally uses the Query Cache, which is coarse-grained: any update to a table invalidates the cached results for that table. A NoSQL cache is record-level, a fine-grained cache, so NoSQL performs much better at this level.
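The difference in cache granularity can be shown with a toy model. This is a sketch under stated assumptions, not the real implementation of MySQL or any NoSQL database: the coarse cache is assumed to drop everything whenever the table is written (as MySQL's Query Cache does for a modified table), the fine cache drops only the record that was written, and the read/write pattern is made up.

```python
class TableLevelCache:
    """Coarse-grained: any write to the table drops every cached entry."""

    def __init__(self) -> None:
        self.entries = {}

    def on_write(self, key: str) -> None:
        self.entries.clear()


class RecordLevelCache:
    """Fine-grained: a write drops only the entry for the record it touched."""

    def __init__(self) -> None:
        self.entries = {}

    def on_write(self, key: str) -> None:
        self.entries.pop(key, None)


def hit_rate(cache, reads_per_write: int = 9, rounds: int = 100, keys: int = 50) -> float:
    """Replay a fixed read/write pattern and return the fraction of cache hits."""
    hits = total = 0
    for i in range(rounds):
        for j in range(reads_per_write):
            key = f"k{(i + j) % keys}"
            total += 1
            if key in cache.entries:
                hits += 1
            else:
                cache.entries[key] = "value loaded from the database"
        cache.on_write(f"k{i % keys}")  # one write per round
    return hits / total


print(hit_rate(TableLevelCache()))   # 0.0 under this pattern
print(hit_rate(RecordLevelCache()))  # 0.88 under this pattern
```

With one write for every nine reads, the coarse cache never gets a chance to serve a hit, while the record-level cache keeps serving everything except the record that just changed. Real workloads are less extreme, but the direction of the effect is the same.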

Flexible data model

NoSQL does not require fields to be defined in advance for the data to be stored, and custom data formats can be stored at any time. In a relational database, adding and removing fields is very troublesome; on a table with a very large amount of data, adding a field is simply a nightmare. This is especially evident in the Web 2.0 era of large data volumes.

High availability

NoSQL can easily implement a highly available architecture without much impact on performance. For example, Cassandra and HBase can achieve high availability through their replication models.
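The core idea behind availability through replication fits in a few lines. The sketch below is deliberately naive and is not how Cassandra or HBase replicate data: `ReplicatedStore` is a hypothetical class, every write is copied synchronously to all healthy in-memory replicas, and node recovery and consistency are ignored.

```python
class ReplicatedStore:
    """Hypothetical store that copies every write to all healthy replicas."""

    def __init__(self, replica_count: int) -> None:
        self.replicas = [{} for _ in range(replica_count)]
        self.alive = [True] * replica_count

    def put(self, key: str, value: object) -> None:
        for replica, up in zip(self.replicas, self.alive):
            if up:
                replica[key] = value

    def get(self, key: str) -> object:
        # Serve the read from the first replica that is still up.
        for replica, up in zip(self.replicas, self.alive):
            if up:
                return replica.get(key)
        raise RuntimeError("no replica available")

    def fail(self, index: int) -> None:
        """Simulate losing one node."""
        self.alive[index] = False


store = ReplicatedStore(replica_count=3)
store.put("ticket:1", "sold")
store.fail(0)                 # the first node goes down
print(store.get("ticket:1"))  # sold
```

Because the data has no cross-record relationships to keep in sync, copying it to another node is just copying key-value pairs, which is a large part of why replication comes cheaply to these systems.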

5.4 Usage Scenarios

NoSQL databases are more suitable in the following situations:

1. The data model is relatively simple;

2. More flexible IT systems are needed;

3. Higher requirements on database performance;

4. A high degree of data consistency is not required;

5. Complex values need to be mapped easily to a given key.

6. Conclusion

That is basically it for this article, which has covered the background behind the emergence of NoSQL. In the next article, our protagonist, Redis, makes its debut, so stay tuned~

Origin blog.csdn.net/langfeiyes/article/details/129350978