IPFS Series 02: IPFS and web3.0

IPFS aims to reshape the Internet's underlying protocol and carry the Internet into the web3.0 era.

In this article, we discuss how IPFS can help bring the Internet into the web3.0 era.

1. Features of web3.0

Although the industry has no settled definition of web3.0, some experts have proposed that it should have the following characteristics:

  1. web3.0 should be a distributed, decentralized, and trusted network: a truly public carrier
  2. In the web3.0 era, the entire Internet will become one huge public database
  3. web1.0 connected machines to machines, web2.0 connects people to people, and web3.0 should connect everything to everything

2. Defects of HTTP

HTTP is the foundational protocol of web1.0 and web2.0. The Internet was originally meant to be open, interconnected, and decentralized, and it was built on HTTP; yet
HTTP itself is a fragile, highly centralized, inefficient protocol that depends far too heavily on the backbone network.

2.1 HTTP is fragile

Let's start with a picture:

[image: the first HTTP server, with a warning note taped to it]

This is said to be the first HTTP server in history. The note taped to it may look familiar: "This machine is a server, don't shut it down!" It perfectly captures the fragility of HTTP servers.

Why can't it be shut down? Because the moment this one server goes down, the entire service is paralyzed.

You have probably seen results like this many times:

[image: a "404 Not Found" error page]

If you know even a little about HTTP, you will recognize what this means: the resource you tried to access has been deleted.

The reason is simple: centrally managed web servers inevitably go offline. The domain changes hands, the company running it goes bankrupt, or the machine crashes and there is no backup to restore from.

Some time ago I cleaned up my bookmarks and found that about 30% of the saved pages no longer open.

2.2 HTTP encourages centralization

Whether we admit it or not, nearly 90% of the Internet services we use come from less than 1% of websites, and billions of users depend on services provided by a handful of companies. HTTP has made the web
more and more centralized.


HTTP's heavy dependence on the backbone network makes Internet surveillance and censorship cheap: intercepting traffic at just a few backbone links is enough.
Moreover, the backbone itself is not robust. It is easy to attack, and services suffer whenever a few important fiber lines are cut.

As the Internet's influence has grown, governments and corporations alike have exploited HTTP's flaws to spy on users and block access to anything they consider a threat.

On January 28, 2011, Egypt cut off the Internet across the entire country.

As the figure below shows, around midnight on January 27 Egypt's international data traffic suddenly dropped to nearly zero.

[image: chart of Egypt's international data traffic collapsing to zero]

On the backbone, the roughly 3,500 BGP routes connecting Egypt to the outside world went dead. For Egyptians the Internet became completely unusable: no website would open.
For visitors abroad, Egyptian websites not only failed to load, their IP addresses could not even be resolved, as if they did not exist at all.

In the global routing table, every entry related to Egypt became invalid. The Egyptian government had managed to disconnect the country from the Internet overnight, "wiping itself off the world map".

Another risk of centralization is that it leaves our communications exposed to disruption by DDoS attacks.

2.3 HTTP is inefficient and expensive

First, efficiency. Suppose you are in Beijing, watching the 4K version of "Avengers 4" on Tencent Video, while Tencent's server is in Shenzhen (disregarding CDNs for now). The video has to travel from Shenzhen to Beijing. If your connection is fast enough and you are the only viewer, you may not notice any lag. But if 1,000 other users in Beijing are watching "Avengers 4" at the same time, Tencent's server must stream 1,000 copies from Shenzhen simultaneously, and your viewing experience degrades sharply into "buffer for five minutes, watch for three".

Meanwhile, Tencent pays the ISP a high price to distribute this video data. If the 4K version of "Avengers 4" is 2GB, then serving 1,000 users means distributing nearly 2TB of data.
At 0.1 RMB per GB, that comes to about 200 RMB, which does not seem like much; but a blockbuster like "Avengers 4" typically exceeds 100 million views, so Tencent would need to pay around 20 million RMB to complete the distribution.
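
The arithmetic above can be checked in a few lines, using only the figures quoted in the article:

```python
# Back-of-envelope check of the distribution-cost figures above.
file_size_gb = 2          # one 4K copy of the movie
price_per_gb = 0.1        # RMB charged by the ISP per GB delivered

viewers = 1_000
total_gb = file_size_gb * viewers            # 2,000 GB, i.e. ~2 TB
cost = total_gb * price_per_gb               # 200.0 RMB

views = 100_000_000
total_cost = file_size_gb * views * price_per_gb   # 20,000,000.0 RMB

print(total_gb, cost, total_cost)            # 2000 200.0 20000000.0
```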

What if, instead of always serving content from a distant data center, every computer on an ISP's network could act as a streaming CDN?
Popular videos could then be fetched entirely from within the ISP's own network, without many hops across the Internet backbone. This is the problem IPFS sets out to solve.

3. How IPFS achieves its goals

Now that we've discussed the shortcomings of HTTP (and the problem of hyper-centralization), let's talk about how IPFS can help improve the web.

First, IPFS fundamentally changes how we locate files on the Internet. It is a distributed storage and transfer protocol based on content addressing, versioning, and peer-to-peer hypermedia.

When we access a file over HTTP, we must first locate the server by its IP address and then know the file's path on that server. This model is called location addressing (IP + path).

In an IPFS network, when a file is added to a node, IPFS generates a cryptographic hash (starting with Qm) from the file's content. Anyone who knows the hash can use it to locate the file;
this model is called content addressing. Cryptography guarantees that a hash corresponds only to that exact content: change even a single bit of the file and the hash becomes completely different.
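
A minimal sketch of content addressing, using plain SHA-256 from Python's standard library. Real IPFS wraps the digest in a multihash and base58-encodes it (hence the "Qm" prefix), so these hex digests are illustrative, not real CIDs:

```python
import hashlib

def content_address(data: bytes) -> str:
    # Plain SHA-256 hex digest as a stand-in for a real IPFS CID.
    return hashlib.sha256(data).hexdigest()

original = b"Hello, IPFS!"
tampered = b"Hello, IPFS?"   # a single byte differs

print(content_address(original))
print(content_address(tampered))
# The two digests bear no resemblance to each other: flipping even
# one bit produces a completely different address.
```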

4. How IPFS works

When we add a file to an IPFS node, if the file exceeds 256KB (this threshold is configurable), IPFS automatically splits it into 256KB chunks and, in principle, scatters the chunks across the nodes of the network
(although in the author's tests, the current implementation keeps all chunks on the local node). Each chunk gets its own unique hash, and the file's hash is then computed from the concatenation of all the chunk hashes.
Every IPFS node maintains a distributed hash table (DHT) recording which nodes hold which data blocks; whenever any node adds new data, the DHT is updated accordingly.

When we need to access the file, IPFS uses the DHT to quickly locate the nodes that hold the data (about 20 hops even in a network of 10,000,000 nodes), fetches all of the file's chunks,
reassembles them into the complete file, and verifies the result against the hash.
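
The chunk-and-hash scheme described above can be sketched as follows. This is a simplification: real IPFS builds a Merkle DAG rather than hashing a flat concatenation of chunk hashes, and the hop count is just the Kademlia-style O(log n) estimate:

```python
import hashlib
import math

CHUNK_SIZE = 256 * 1024   # the 256KB default chunk size mentioned above

def chunk(data: bytes) -> list:
    # Split the file into fixed-size 256KB chunks.
    return [data[i:i + CHUNK_SIZE] for i in range(0, len(data), CHUNK_SIZE)]

def root_hash(data: bytes) -> str:
    # Hash each chunk, then hash the concatenation of the chunk hashes --
    # a flat stand-in for the Merkle DAG that IPFS actually builds.
    chunk_hashes = [hashlib.sha256(c).hexdigest() for c in chunk(data)]
    return hashlib.sha256("".join(chunk_hashes).encode()).hexdigest()

data = bytes(600 * 1024)            # a 600KB file -> 3 chunks
print(len(chunk(data)))             # 3

# A Kademlia-style DHT resolves a key in roughly log2(n) hops,
# which is the same order as the "~20 hops for 10,000,000 nodes"
# figure quoted above:
print(math.log2(10_000_000))        # ~23.25
```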

If you start the IPFS daemon,
you can easily list all the chunks of a file through this API: http://127.0.0.1:5001/api/v0/object/get?arg={hash}

[image: JSON response from the object/get API, listing the file's chunks]
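
As a sketch, the request URL for that endpoint can be built like this (the CID is a made-up placeholder; an actual call requires a local daemon, and newer go-ipfs versions expect POST rather than GET):

```python
from urllib.parse import urlencode

API_BASE = "http://127.0.0.1:5001/api/v0"

def object_get_url(cid: str) -> str:
    # Build the /object/get request used above to list a file's chunks.
    return f"{API_BASE}/object/get?{urlencode({'arg': cid})}"

url = object_get_url("QmExampleHashOnly")   # hypothetical CID, for illustration
print(url)
# With a local daemon running, the JSON response's "Links" array
# lists the hashes and sizes of the file's chunks.
```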

Moreover, IPFS does not require every node to store everything published on IPFS. Each node stores only the data it wants, and a file is downloaded (synchronized) to a node only when that node accesses it.

From IPFS's design we can see how it fixes several defects of the existing HTTP protocol:

  1. Taking a site offline becomes difficult. If someone hacks Wikipedia's web servers, or Wikipedia's engineers make a mistake that sets their servers on fire, you can still fetch the same pages from other nodes.
  2. Files on IPFS can only be added, never deleted, so HTTP-404-style errors cannot occur. If you modify a file and add it again, IPFS generates a new hash different from the original;
     the modified file simply becomes a new file in the network, and whatever you fetch through the original hash is guaranteed to be the file you originally added, not a tampered version.
  3. Censorship by authorities becomes much harder, because files on IPFS may come from many places, some of them nearby, and retrieval does not depend on the backbone network,
     so it is nearly impossible for a government or organization to intercept the data. Incidents like Turkey blocking Wikipedia or Spain blocking independent Catalan websites would no longer be possible.
  4. In theory, once the network is large enough, the IPFS distributed network becomes the fastest, most available, and largest data store in the world. No single node going down affects the service the network provides,
     and no one can shut down every node, so data need never be lost.
  5. Because the IPFS network is content-addressed, it naturally resists DDoS attacks: you cannot tell where the data is stored, so there is no target to attack.
  6. IPFS stores data in chunks, and identical chunks have identical hashes, so the same chunk is never stored twice in the network, saving both storage space and bandwidth.
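
Point 6, deduplication by chunk hash, falls out naturally from content addressing. A toy block store illustrates it (names and sizes are illustrative):

```python
import hashlib

class BlockStore:
    """Content-addressed store: identical chunks are kept only once."""

    def __init__(self):
        self.blocks = {}

    def put(self, chunk: bytes) -> str:
        # The key is derived from the content, so adding the same chunk
        # twice overwrites the same slot instead of storing a second copy.
        key = hashlib.sha256(chunk).hexdigest()
        self.blocks[key] = chunk
        return key

store = BlockStore()
a = store.put(b"popular 256K chunk")
b = store.put(b"popular 256K chunk")   # added again, e.g. by another file
print(a == b, len(store.blocks))       # True 1
```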

5. IPNS

An IPFS hash represents immutable data: the content cannot change without the hash changing too. That is a good thing, because it encourages data persistence,
but if a website's content is updated every day, its address would change every day as well, which is unacceptable for users.

IPFS offers a dedicated solution, IPNS (the InterPlanetary Name System). It lets a user publish, under a public key, a reference to the hash of a site's root directory, signed with the corresponding private key.
Since the public key never changes, users never need a new address: each time the site is updated, the owner simply generates a new reference and signs it again.
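
The publish-and-verify cycle can be sketched as below. A loud caveat: real IPNS records are signed with the node's key pair (e.g. Ed25519) and named by the public key; Python's standard library has no public-key crypto, so an HMAC secret stands in for both keys here, and the record structure is simplified:

```python
import hashlib
import hmac

SECRET = b"node-private-key"   # hypothetical; stands in for a real key pair

def publish(root_cid: str, seq: int) -> dict:
    # Sign a (root hash, sequence number) pair; higher seq wins on update.
    record = f"{root_cid}:{seq}".encode()
    sig = hmac.new(SECRET, record, hashlib.sha256).hexdigest()
    return {"value": root_cid, "sequence": seq, "signature": sig}

def verify(rec: dict) -> bool:
    record = f"{rec['value']}:{rec['sequence']}".encode()
    expected = hmac.new(SECRET, record, hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, rec["signature"])

rec = publish("QmSiteRootV1", seq=1)   # hypothetical CID: site launch
print(verify(rec))                     # True
rec["value"] = "QmEvilRoot"            # tampering breaks the signature
print(verify(rec))                     # False
# Updating the site just publishes a new record with seq=2;
# the name (the public key) never changes.
```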

You can think of it simply as binding the IPFS hash of the site's root directory to the node ID; afterwards, visitors can reach the site directly through the node ID.

Although this solves the site-update problem, IPFS hashes are random-looking, hard to remember, and barely readable. IPNS therefore also lets you use existing DNS to give websites human-readable addresses,
by inserting the hash into a TXT record on your nameserver:

dig TXT www.r9it.com

Then the site can be visited at http://ipfs.io/ipns/www.r9it.com/.
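
What the gateway actually reads from that TXT record is a DNSLink value of the form dnslink=/ipfs/<hash> (or /ipns/<name>). A small parser sketch, with a made-up CID:

```python
def parse_dnslink(txt_value: str):
    # A DNSLink TXT record looks like: dnslink=/ipfs/<cid> or dnslink=/ipns/<name>.
    # Returns the IPFS path, or None if the record is not a DNSLink.
    txt_value = txt_value.strip('"')
    if not txt_value.startswith("dnslink="):
        return None
    path = txt_value[len("dnslink="):]
    if path.startswith("/ipfs/") or path.startswith("/ipns/"):
        return path
    return None

# Hypothetical record values, for illustration:
print(parse_dnslink('dnslink=/ipfs/QmExampleSiteRoot'))  # /ipfs/QmExampleSiteRoot
print(parse_dnslink('v=spf1 -all'))                      # None (unrelated TXT record)
```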

6. IPFS Gateway, a bridge between the old and new web

To access files in the IPFS network, you have three ways:

  1. Via the IPFS command-line client
  2. Via the IPFS Companion browser extension
  3. Via an IPFS node's built-in HTTP gateway (http://127.0.0.1:8080)

The first two methods take some effort, but the IPFS gateway gives users a seamless bridge between HTTP and IPFS, so existing services can be migrated to the IPFS network gradually.

7. IPFS and web3.0

Given the analysis above, IPFS's ultimate goal is to replace HTTP and become the foundational protocol of the third-generation Internet. Combined with the characteristics of a web3.0 network, we can foresee that
IPFS paves part of the way for the arrival of web3.0.

  • First, IPFS combined with blockchain can realize a distributed, trusted network.
  • Second, IPFS can turn the entire Internet into one huge storage system.
  • Third, the peer-to-peer hypermedia protocol that IPFS proposes offers an underlying-protocol solution for the interconnection of everything.

8. Reference Links

This article was first published by the younger generation of proletarian code farmers


Origin blog.csdn.net/yangjian8801/article/details/124919925