Advanced patent (2): Summary of common techniques and algorithms for patent writing (continuously updated)

1. Introduction

It is very common to use existing technologies or algorithms to solve new problems when writing patents. This blog post collects the common technologies and algorithms involved in drafting software invention patents.

2. Common techniques and algorithms

2.1 Blockchain cross-chain technology

2.2 Clustering algorithm

2.3 Edge algorithm

2.4 Ant Colony Algorithm

The ant colony algorithm is inspired by studies of the foraging behavior of real ant colonies. Biological research has shown that a group of cooperating ants can find the shortest path between a food source and the nest, while a single ant cannot. After careful observation, biologists found that individual ants influence one another's behavior: as an ant moves, it deposits a substance called pheromone on the path it travels, and this substance is the carrier of communication between ants. Moving ants can sense the pheromone and tend to follow trails where it is present, depositing more pheromone as they crawl. The thicker the pheromone trail on a path, the higher the probability that other ants will follow that path, which in turn reinforces its trail. The collective behavior of a colony composed of many ants therefore exhibits a positive feedback of information: the more ants have traveled a path, the more likely latecomers are to choose it. Through this indirect communication mechanism, individual ants cooperatively achieve the goal of finding the shortest path.

The two core steps of the ant colony algorithm are path construction and pheromone update.

2.4.1 Path construction

Each ant randomly selects a city as its starting city and maintains a path memory vector that stores, in order, the cities it has passed through. At each step of path construction, the ant chooses the next city to visit according to a random-proportional rule.

p_ij = (τ_ij^α · η_ij^β) / Σ_{l ∈ allowed} (τ_il^α · η_il^β)

where τ_ij is the pheromone intensity on edge (i, j), η_ij is the visibility (typically the reciprocal of the edge length), allowed is the set of cities the ant has not yet visited, and α, β weight the relative influence of pheromone and visibility.

The formula above gives the probability of moving from the current city to each candidate next city. The numerator is the product of powers of the pheromone intensity and the visibility, while the denominator is the sum of those numerators over all candidate cities. This may be hard to grasp at first, but it becomes clear when working through a concrete example and then reading the formula again. Note that each time a city is selected, it is removed from the set of available cities.
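The random-proportional selection step can be sketched as follows; this is a minimal illustration, and the names `tau` (pheromone matrix), `eta` (visibility matrix), `alpha`, and `beta` are assumptions, not fixed by the text:

```python
import random

def choose_next_city(current, unvisited, tau, eta, alpha=1.0, beta=2.0):
    """Pick the next city with the random-proportional rule:
    weight_ij = tau_ij^alpha * eta_ij^beta, normalized over unvisited cities."""
    weights = [(tau[current][j] ** alpha) * (eta[current][j] ** beta)
               for j in unvisited]
    total = sum(weights)
    # Roulette-wheel selection: draw a point in [0, total) and find the
    # city whose cumulative weight covers it.
    r = random.random() * total
    acc = 0.0
    for j, w in zip(unvisited, weights):
        acc += w
        if acc >= r:
            return j
    return unvisited[-1]
```

A city with zero pheromone and zero-weight edges is never chosen, and cities already visited are simply excluded from `unvisited` by the caller, matching the removal rule described above.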

2.4.2 Pheromone update

Pheromone update is the core of the ant colony algorithm. The pheromone concentration starts at a fixed initial value. After each iteration completes, every ant evaluates the route it has traveled, and the pheromone concentration on the corresponding edges is updated. This update value must be related to the length of the ant's tour. After repeated iterations, the concentration on short routes becomes very high, so an approximately optimal solution can be obtained.

The role of pheromone update:

  1. Pheromone evaporation: the pheromone trail on every edge automatically and gradually weakens over time. Evaporation mainly prevents the algorithm from converging too quickly on a locally optimal region and helps expand the search area.
  2. Pheromone reinforcement: an optional part of ant colony optimization, known as the offline update method (there is also an online update method). This approach enables a concentrated effect that a single ant cannot achieve. In the basic ant colony algorithm, the offline update deposits pheromone uniformly after all m ants in the colony have completed their visits to the n cities.
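The two effects above, evaporation followed by offline reinforcement, can be sketched as one update step; the parameter names `rho` (evaporation rate) and `Q` (deposit constant) are conventional assumptions, not fixed by the text:

```python
def update_pheromone(tau, ant_tours, tour_lengths, rho=0.5, Q=1.0):
    """Offline pheromone update: first evaporate every trail by a factor
    (1 - rho), then let each ant deposit Q / L on every edge of its closed
    tour, so shorter tours deposit more pheromone."""
    n = len(tau)
    # Evaporation: weaken all trails uniformly.
    for i in range(n):
        for j in range(n):
            tau[i][j] *= (1.0 - rho)
    # Reinforcement: deposit along each ant's tour (treated as a closed loop).
    for tour, length in zip(ant_tours, tour_lengths):
        deposit = Q / length
        for a, b in zip(tour, tour[1:] + tour[:1]):
            tau[a][b] += deposit
            tau[b][a] += deposit  # symmetric (undirected) edges
    return tau
```

Running the two steps in this order is what keeps the positive feedback bounded: evaporation shrinks all trails, and only edges actually used by short tours are topped back up.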

2.4.3 Iteration and stopping

The iteration can be stopped after a chosen number of iterations, outputting the best path found, or by checking whether a specified optimality condition is met and stopping once a satisfactory solution is found. One point worth stressing: when I first studied this algorithm, I thought that each ant traversing one edge counted as an iteration, which is wrong. One iteration here means that all m ants have each completed their own full path and returned to their starting points.

2.5 Hash algorithm

A Hash algorithm (hash algorithm) maps a binary value of arbitrary length to a shorter, fixed-length binary value; this mapped value is called the hash value.

For example, compute the SHA-256 hash of the sentence "hello blockchain world, this is yeasy@github".

This means that for a given file, there is no need to inspect its content: as long as its SHA-256 hash is also db8305d71a9f2f90a3e118a9b49a4c381d2b80cf7bcef81930f30ab1832a3c90, its content is highly likely to be "hello blockchain world, this is yeasy@github".
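Such a hash can be computed with Python's standard `hashlib` module; the helper name `sha256_hex` is just for illustration:

```python
import hashlib

def sha256_hex(text: str) -> str:
    """Return the SHA-256 digest of a UTF-8 string as lowercase hex."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()

# Prints the 64-hex-character (256-bit) digest of the example sentence.
print(sha256_hex("hello blockchain world, this is yeasy@github"))
```

Whatever the input length, the output is always 64 hex characters (256 bits), which is exactly the fixed-length mapping described above.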

In applications, the Hash value is often called a fingerprint or a digest. The core idea of the Hash algorithm is also often applied to content-based addressing or naming schemes.

An excellent Hash algorithm achieves the following properties:

  • Forward fast: given the plaintext and the Hash algorithm, the hash value can be computed in limited time with limited resources;

  • Reverse difficulty: given one (or several) hash values, it is difficult (essentially impossible) to recover the plaintext in limited time;

  • Input sensitivity: any change to the original input should produce a very different hash value;

  • Collision avoidance: it is difficult to find two plaintexts with different content whose hash values are identical (a collision).

Collision avoidance is sometimes called "collision resistance" and is divided into "weak collision resistance" and "strong collision resistance". If, given a plaintext, no other plaintext colliding with it can be found, the algorithm has weak collision resistance; if no two colliding plaintexts whatsoever can be found, the algorithm has strong collision resistance.
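The input-sensitivity property can be observed directly: flipping a single byte of the input changes roughly half of the output bits. A small sketch (the helper name `bit_diff` is illustrative):

```python
import hashlib

def bit_diff(a: bytes, b: bytes) -> int:
    """Count differing bits between two equal-length byte strings."""
    return sum(bin(x ^ y).count("1") for x, y in zip(a, b))

d1 = hashlib.sha256(b"hello world").digest()
d2 = hashlib.sha256(b"hello worle").digest()  # one byte changed
print(bit_diff(d1, d2), "of 256 bits differ")
```

This avalanche behavior is what makes the digest useful as a tamper-evident fingerprint: even the smallest modification is glaring.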

2.5.1 Common algorithms

Currently common Hash algorithms include MD5 and the SHA family.

MD4 (RFC 1320) was designed by Ronald L. Rivest of MIT in 1990; MD is an acronym for Message Digest. Its output is 128 bits. MD4 has since been shown to be insecure.

MD5 (RFC 1321) is Rivest's 1991 improvement on MD4. It still processes input in 512-bit blocks, and its output is still 128 bits. MD5 is more secure than MD4, but the process is more complex and computation is slower. MD5 has been proven not to be strongly collision-resistant.

SHA (Secure Hash Algorithm) is not a single algorithm but a family of Hash functions. NIST (National Institute of Standards and Technology) released the first implementation in 1993. The well-known SHA-1 algorithm was launched in 1995; its output is a 160-bit hash value, giving better resistance to brute-force search. SHA-1's design imitates the MD4 algorithm and uses similar principles. SHA-1 has been shown not to be strongly collision-resistant.

To improve security, NIST also designed the SHA-224, SHA-256, SHA-384 and SHA-512 algorithms (collectively referred to as SHA-2), which are similar in principle to SHA-1. SHA-3 algorithms have also been proposed.

Since MD5 and SHA-1 have been broken, it is generally recommended to use at least SHA-256 or a more secure algorithm.

Tips ⚠️: MD5 is a classic Hash algorithm; together with SHA-1, it is considered insufficiently secure for commercial scenarios.

Hash algorithms are generally compute-bound: computing resources are the bottleneck, and a CPU with a higher clock speed runs the Hash algorithm faster. Hash throughput can therefore be improved through hardware acceleration; for example, using an FPGA to compute MD5 values, a throughput of tens of Gbps can easily be achieved.

Some Hash algorithms, however, are not compute-bound. For example, the scrypt algorithm requires a large amount of memory during computation, so a node cannot improve Hash performance simply by adding more CPUs. Such Hash algorithms are often used in scenarios that must resist computing-power attacks.
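Python's `hashlib` exposes scrypt directly (when built against a recent OpenSSL). A minimal sketch; the password, salt, and cost parameters here are illustrative values, not recommendations:

```python
import hashlib

# scrypt is memory-hard: the n and r parameters control the memory cost
# (roughly 128 * r * n bytes, ~16 MiB here), so raw CPU speed alone cannot
# accelerate it the way it accelerates MD5 or SHA.
key = hashlib.scrypt(b"my-password", salt=b"random-salt",
                     n=2**14, r=8, p=1, dklen=32)
print(key.hex())
```

The same password and salt always yield the same key, so the output can be stored and later recomputed for verification, while brute-force attempts remain expensive in both memory and time.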

2.6 Digital digest

A digital digest turns a message of arbitrary length into a short message of fixed length, similar to a function whose argument is the message, that is, a Hash function. The digital digest uses a one-way Hash function to "digest" the plaintext into a fixed-length (for example 128-bit) output. This output is also called a digital fingerprint: it has a fixed length, different plaintexts digest to different values, and the same plaintext always produces the same digest.

As the name implies, a digital digest performs a Hash operation on digital content to obtain a unique digest value that stands for the complete original content. The digital digest is one of the most important uses of the Hash algorithm. Using the collision resistance of the Hash function, the digital digest can ensure that content has not been tampered with.

Careful readers may notice that when downloading software or files from a website, a corresponding digest value is sometimes provided. After downloading the file, the user can compute the digest locally and compare it with the provided value to check whether the file content has been tampered with.
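That download check can be sketched as follows; the function name `file_sha256` and the file/digest names are illustrative, and files are read in chunks so even large downloads need little memory:

```python
import hashlib

def file_sha256(path: str, chunk_size: int = 8192) -> str:
    """Compute the SHA-256 digest of a file without loading it all into memory."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()

# Hypothetical usage: compare against the digest published on the download page.
# published = "..."  # value provided by the website
# ok = file_sha256("downloaded.iso") == published
```

If the locally computed digest matches the published one, the file is, with overwhelming probability, exactly what the site intended to serve.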

A typical usage flow for digital digests is as follows:

  1. The sender hashes the file with SHA to generate a 128-bit digital digest.
  2. The sender then encrypts the digest with its own private key, forming a digital signature.
  3. The sender transmits the original file and the encrypted digest (the digital signature) to the other party together.
  4. The receiver decrypts the digest with the sender's public key, and at the same time hashes the received file with SHA to generate another digest.
  5. The receiver compares the decrypted digest with the digest computed from the received file. If the two are identical, the information was not destroyed or tampered with in transit; otherwise it was.
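The five steps above can be sketched end to end with a toy RSA key pair. The tiny primes, the per-byte signing, and the `sign`/`verify` names are purely illustrative assumptions (real systems use proper RSA/ECDSA libraries and padding; this sketch is completely insecure), and SHA-256 stands in for the 128-bit digest mentioned in the text:

```python
import hashlib

# Toy RSA key pair (tiny primes -- illustration ONLY, not secure).
p, q = 61, 53
n = p * q                           # public modulus
e = 17                              # public exponent
d = pow(e, -1, (p - 1) * (q - 1))   # private exponent (Python 3.8+)

def digest(data: bytes) -> bytes:
    # Step 1: hash the message to a fixed-length digest.
    return hashlib.sha256(data).digest()

def sign(data: bytes) -> list:
    # Step 2: "encrypt" the digest with the private key, byte by byte.
    return [pow(b, d, n) for b in digest(data)]

def verify(data: bytes, signature: list) -> bool:
    # Steps 4-5: decrypt the signature with the public key and compare it
    # with a freshly computed digest of the received data.
    recovered = bytes(pow(s, e, n) for s in signature)
    return recovered == digest(data)
```

Step 3 is simply transmitting the data and the signature list together; the receiver then runs `verify`, which fails if either the data or the signature was altered in transit.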

2.7

2.8

2.9

2.10

3. Extended reading


Origin blog.csdn.net/sunhuaqiang1/article/details/130438408