[Blockchain | IPFS] Building an IPFS node: file upload, node storage-space settings, and upload chunking settings

1. Create ipfs node

  • Set up an IPFS node on your local computer by running ipfs init.

  • Some commands in this article were executed earlier and have not been re-run, and some screenshots are copied from previous documents, so the actual output on your machine may differ slightly.

ipfs init
initializing IPFS node at /Users/CHY/.ipfs
generating 2048-bit RSA keypair...done
peer identity: QmdKXkeEWcuRw9oqBwopKUa8CgK1iBktPGYaMoJ4UNt1MP
to get started, enter:

ipfs cat /ipfs/QmVLDAhCY3X9P2uRudKAryuQFPM5zqA3Yij1dY8FpGbL7T/readme

cd ~/.ipfs
ls
blocks datastore version config keystore
open ./
  • After executing ipfs init to initialize the node, an .ipfs folder will be generated to store related information, such as node ID, environment configuration information, data storage, etc.
  • If you are using a Mac, press shift+command+. to show hidden files

  • View the created node's ID information with ipfs id

2. Start the node server

  • Use the command ipfs daemon to start the node server
  • Once started, the current terminal stays in a listening state, so open a new tab to continue entering commands.

3. Simple verification

  • Use the following command to perform a simple test

    ipfs cat /ipfs/QmYwAPJzv5CZsnA625s3Xf2nemtYgPpHdWEz79ojWnPbdG/readme

4. Detailed explanation of related issues

1. The storage location of ipfs

  • IPFS data storage: each user's data is stored on their own hard drive, i.e. local storage. After storing, the node broadcasts on the IPFS network: "I have stored data whose hash is Qm...". Because hashes are unique, if data is split in a given way, the network holds only one copy of identical data, stored only on the local node. When a user retrieves data, the hash of the requested data is the key. The node first checks whether the key exists in its own DHT table (a key/value store). If not, it queries the nodes in the k-bucket closest to the key by XOR distance; if a node in the k-bucket holds a value for the key, it returns it, otherwise it returns the nodes it considers most likely to hold the value, and the lookup recurses until the value corresponding to the key is found. The value is a node ID; the requester connects to that node, requests the data, and stores the key/value pair in its own DHT table. The requesting node saves the received data in its IPFS cache, and the retrieval succeeds. While the cached data remains valid, the requesting node can also serve it to the IPFS network as a backup of the original data.
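The XOR-distance lookup described above can be sketched in a few lines of Python. This is an illustrative simplification, not IPFS's actual libp2p Kademlia code, and node IDs are shortened to single bytes for readability:

```python
# Kademlia-style lookup sketch (simplified; real IPFS node IDs are
# multihash peer IDs and routing uses full k-bucket tables).

def xor_distance(a: bytes, b: bytes) -> int:
    """XOR distance between two equal-length IDs, as an integer."""
    return int.from_bytes(bytes(x ^ y for x, y in zip(a, b)), "big")

def closest_nodes(key: bytes, node_ids, k: int = 3):
    """Return the k node IDs closest to `key` by XOR distance."""
    return sorted(node_ids, key=lambda n: xor_distance(key, n))[:k]

# Four known peers, shortened to one-byte IDs:
nodes = [bytes([0x10]), bytes([0x55]), bytes([0x9c]), bytes([0xe3])]
key = bytes([0x54])
# 0x55 ^ 0x54 = 1, so 0x55 is the closest peer to this key:
assert closest_nodes(key, nodes, k=2) == [bytes([0x55]), bytes([0x10])]
```

A real lookup repeats this step: each queried peer returns the closest peers it knows, and the requester recurses toward the key until a peer holding the value is found.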

2. Redundant backup measures for ipfs

  • IPFS adopts erasure coding as its redundancy measure: a cluster keeps n shards of original data plus m shards of verification (parity) data, i.e. n+m shards in total.
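A minimal sketch of the idea, assuming the simplest case of m = 1 with XOR parity (real erasure-coding deployments typically use Reed-Solomon codes, which can tolerate up to m lost shards):

```python
# Erasure-coding illustration with n data shards and m = 1 parity shard.
from functools import reduce

def make_parity(shards):
    """One parity shard: byte-wise XOR across all data shards."""
    return bytes(reduce(lambda a, b: a ^ b, col) for col in zip(*shards))

def recover(surviving_shards, parity):
    """Rebuild the single missing shard from the survivors plus parity."""
    return make_parity(surviving_shards + [parity])

data = [b"abcd", b"efgh", b"ijkl"]   # n = 3 data shards
parity = make_parity(data)           # m = 1 parity shard -> n + m = 4 total

# Lose the middle shard, then rebuild it from the other shards + parity:
rebuilt = recover([data[0], data[2]], parity)
assert rebuilt == data[1]
```

With Reed-Solomon, the same principle generalizes: any n of the n+m shards suffice to reconstruct the original data.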

3. Modify the node default storage space

  • The default storage space of an ipfs node is 10 GB

Method 1: open a terminal and execute the following commands

export EDITOR=/usr/bin/vim
ipfs config edit

  • Find the StorageMax entry (marked with a red box in the original screenshot) and change it to the size you want.
  • PS: press i to start editing. When finished, press esc, then type :wq and press enter to save and exit.
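For reference, the storage limit lives under the Datastore section of the ~/.ipfs/config file. A minimal excerpt (field names as in the standard go-ipfs config; values illustrative):

```json
{
  "Datastore": {
    "StorageMax": "10GB",
    "StorageGCWatermark": 90,
    "GCPeriod": "1h"
  }
}
```

Raising StorageMax is what increases the node's contributed space; the garbage collector begins cleaning cached blocks once usage passes StorageGCWatermark percent of that limit.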

Method 2: modify via the web interface

  • Modify the corresponding information and click Save

The impact of an ipfs node going offline on the entire network

  • IPFS's fault-tolerance mechanism ensures that enough copies of the data are stored across different regions. Even if the data in one place is completely destroyed by force majeure, it can be fully restored from backups in other regions, which greatly improves the security of data stored on IPFS.
  • MerkleDAG is used because it has the following characteristics: 1. Content addressable: all content is uniquely identified by its hash checksum, including links. 2. Tamper-proof: all content is verified against its checksum; if data is tampered with or corrupted, IPFS will detect it. 3. Deduplication: duplicate content is stored only once.
    In the IPFS network, stored data may still be duplicated; the number of duplicates depends on the chunking method used when users upload.
  • As mentioned before, data is stored in blocks in IPFS. ipfs provides several ways to split data; the chunking methods are described in the ipfs source file core/commands/add.go:
  1. Default mode: the block size is 256 KB, i.e. 256 * 1024 = 262144 bytes (size=262144). The command needs no extra parameters: ipfs add file.

  2. Fixed block-size mode: the command is ipfs add --chunker=size-1000 file; the number 1000 can be any value less than 262144.

  3. rabin variable block-size mode: the command is ipfs add --chunker=rabin-[min]-[avg]-[max] file, where min, avg, and max are the minimum, average, and maximum block sizes respectively; each value is smaller than 262144 and can be set as you like.

    The chunker option, '-s', specifies the chunking strategy that dictates how to break files into blocks. Blocks with same content can be deduplicated. The default is a fixed block size of 256 * 1024 bytes, 'size-262144'. Alternatively, you can use the rabin chunker for content defined chunking by specifying rabin-[min]-[avg]-[max] (where min/avg/max refer to the resulting chunk sizes). Using other chunking strategies will produce different hashes for the same file.

    ipfs add ipfs-logo.svg
    ipfs add --chunker=size-2048 ipfs-logo.svg
    ipfs add --chunker=rabin-512-1024-2048 ipfs-logo.svg

  • When the same file is stored in ipfs with different chunking methods, the hash values returned differ. So there is no duplication at the level of IPFS blocks, but the data assembled from those block files may be duplicated; in other words, the same file can be stored in the IPFS network multiple times under different chunking methods.

As shown in the picture above, a 6.8 KB file was stored with the shard size set to 1024 bytes; after sharding completed, the file had been split into 7 shards.
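The two chunking strategies can be sketched in Python. The fixed-size splitter mirrors --chunker=size-N (and reproduces the 7-shard count above); the content-defined one only imitates the spirit of the rabin chunker, with a toy rolling-window sum standing in for the real Rabin fingerprint:

```python
# Chunking sketch (simplified; ipfs's rabin chunker uses a true rolling
# Rabin fingerprint, not the toy window-sum condition used here).

def fixed_chunks(data: bytes, size: int):
    """Fixed-size chunking, like --chunker=size-N."""
    return [data[i:i + size] for i in range(0, len(data), size)]

# A ~6.8 KB file split at 1024-byte boundaries yields 7 chunks:
blob = bytes(6963)  # ~6.8 KB of zero bytes
assert len(fixed_chunks(blob, 1024)) == 7

def content_defined_chunks(data, min_s=512, avg_mask=0x3FF, max_s=2048, win=16):
    """Toy content-defined chunking in the spirit of rabin-[min]-[avg]-[max]:
    cut where a rolling window hash hits a boundary condition."""
    chunks, start = [], 0
    for i in range(len(data)):
        length = i - start + 1
        if length >= min_s:
            window = data[max(start, i - win + 1):i + 1]
            if length >= max_s or sum(window) & avg_mask == 0:
                chunks.append(data[start:i + 1])
                start = i + 1
    if start < len(data):
        chunks.append(data[start:])
    return chunks

cds = content_defined_chunks(blob)
assert b"".join(cds) == blob                 # chunks reassemble the file
assert all(len(c) <= 2048 for c in cds)      # max bound is respected
```

The practical payoff of content-defined chunking is that inserting bytes near the start of a file shifts fixed-size chunk boundaries everywhere, while content-defined boundaries resynchronize, so most chunks keep their old hashes and stay deduplicated.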

  • How is backup implemented? Take a very popular movie: if everyone habitually stores it on their computer's E drive or another disk, and 100 million people worldwide keep a copy, isn't that an enormous waste of storage? In the IPFS network the movie is initially stored on only one node; when a user needs to read it, a new backup is created: whoever uses the data gets a copy. When a node joins the IPFS network, it contributes part of its disk space (10 GB by default, configurable) to the whole network. So normally, storing files into the space you contributed yourself is always fastest, because no network hop is needed. Once stored, any node on the network can access the file; when another node accesses it, that node will often copy your data into its cache space, so there are now two copies in the entire network. When many people are interested in the file, the number of copies on the network grows accordingly.
  • Note that these copies are generally cached, i.e. stored temporarily, and are automatically deleted after a while. This temporary caching solves distributed data distribution very well. A social hot topic, for instance, typically goes through a warm-up phase, a hot phase, and an ebb phase; with IPFS, the distribution and number of copies of the data match those phases exactly. The more visits, the more copies; when the popularity fades, the number of copies drops, naturally balancing space utilization against access efficiency. If you want a file stored permanently, it must be pinned, i.e. kept on the hard disk.
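The content-addressing, deduplication, and tamper-detection properties discussed above can be sketched as follows. This assumes a plain SHA-256 digest, whereas IPFS actually wraps digests in a multihash/CID, but the properties are the same:

```python
# Content-addressed store sketch: the address IS the hash of the content.
import hashlib

def address(content: bytes) -> str:
    return hashlib.sha256(content).hexdigest()

store = {}

def put(content: bytes) -> str:
    key = address(content)
    store[key] = content  # identical content maps to the same key: one copy
    return key

k1 = put(b"hello ipfs")
k2 = put(b"hello ipfs")   # same content, same key -> deduplicated
assert k1 == k2 and len(store) == 1

# Tampering changes the address, so corruption is detectable:
assert address(b"hello ipfs!") != k1
```

This is why a cached copy served by another node is just as trustworthy as the original: the requester re-hashes what it receives and checks it against the key it asked for.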

4. Use of ipfs

upload txt file

Upload files in other formats

  • pdf
  • docx
  • jpg
  • mp4
  • mp3

Precautions

  1. The format of the downloaded file needs to be converted back, otherwise it cannot be used. The conversion can be done manually or with commands.
  2. You can also specify the downloaded file name with -o filename; in addition, -a packs the download into .tar format, and -C compresses it into .gz format.

pdf

ipfs get QmZJBKrLFPvn8zEatZsxSJTtJkCFm4YeMwChDLRPPPerZ6 -o 1.pdf

  • Use the command open hh.pdf to open the pdf file. The open command here is an operating-system built-in (on macOS) and has nothing to do with ipfs.

docx

mp3

jpg

mp4

upload entire folder

  • The files in the folder uploaded here are the same files used in the earlier tests, so their hash values are identical. This is how IPFS prevents the same file from being uploaded repeatedly by users.

View subfiles contained in an uploaded file

View the referenced hash

  • The concept of a referenced hash: for a folder, however many files it contains is how many times the folder is referenced; each referenced hash is the hash of the file behind the corresponding file name.
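A toy model of the reference idea (assumption: this is a simplified folder object; the real command, ipfs refs folder-hash, lists the child hashes the folder object links to):

```python
# Toy folder object: the folder stores (file name -> child hash) links.
import hashlib

def h(content: bytes) -> str:
    """Shortened content hash, standing in for a real IPFS CID."""
    return hashlib.sha256(content).hexdigest()[:8]

files = {"a.txt": b"same bytes", "b.txt": b"same bytes", "c.txt": b"other"}
links = {name: h(content) for name, content in files.items()}

# Identical file contents yield the identical child hash, even under
# different names, so the folder links to that block twice but the
# network stores it once:
assert links["a.txt"] == links["b.txt"]
assert links["a.txt"] != links["c.txt"]
```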

If you upload a folder and then pull the folder back to the local machine, the files inside will be in their normal storage format and need no format conversion.

Enter the web visualization interface, type the hash into the search box, and query the file. If the file does not support preview, click download to fetch and view it.

Problems discovered

  • Running ipfs id as the root user and as an ordinary user returns different node information.

Moreover, the two nodes cannot exchange files with each other and do not belong to the same cluster.



Origin blog.csdn.net/qq_28505809/article/details/132737166