Tree breadth-first search (Part 1): Is the six-degree theory of interpersonal relationships true?

Friends in social networks

There are a large number of users on social networking platforms such as LinkedIn, Facebook, WeChat and QQ. In these social networks, a very important part is the "friend" relationship between people.

In mathematics, a friend relationship like this is usually modeled as a graph: each node represents a person, and each edge represents an acquaintance between two people. A social network can therefore be described with graph theory. An acquaintance relationship can be either one-way or two-way.

In a one-way relationship between two people a and b, a knows b but b does not know a. A one-way relationship needs a directed edge to distinguish whether a knows b or b knows a. In a two-way relationship both people know each other, so an undirected edge is enough.
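The difference between directed and undirected edges can be sketched with an adjacency set per person. This is only an illustration: the class name AcquaintanceGraph and its methods are hypothetical and do not come from the original article.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical helper for illustration; not code from the original article.
class AcquaintanceGraph {
    // For each person, the set of people that person knows.
    private final Map<String, Set<String>> adjacency = new HashMap<>();

    // One-way relationship: a knows b, but b does not necessarily know a.
    void addDirected(String a, String b) {
        adjacency.computeIfAbsent(a, k -> new HashSet<>()).add(b);
        adjacency.computeIfAbsent(b, k -> new HashSet<>()); // make sure b exists as a node
    }

    // Two-way relationship: stored as a pair of directed edges.
    void addUndirected(String a, String b) {
        addDirected(a, b);
        addDirected(b, a);
    }

    // True if there is an edge from a to b.
    boolean knows(String a, String b) {
        return adjacency.getOrDefault(a, Collections.emptySet()).contains(b);
    }
}
```

Storing an undirected edge as two directed edges keeps the lookup code identical for both kinds of relationship.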

img

As the example above shows, there can be more than one path between two people. For example, Zhang San can reach Zhao Liu directly, or he can reach Zhao Liu through Wang Wu. Of these two paths, the shortest has length 1, so Zhang San and Zhao Liu are first-degree friends. In other words, I use the length of the shortest path between two people to define the degree of their friendship. By this definition, in the social graph above, Zhang San, Wang Wu and Zhao Liu are first-degree friends of one another, while Li Si is a second-degree friend of both Zhao Liu and Wang Wu.

img

Given a user, how can we find that user's second-degree friends first?

Problems faced by depth-first search

With depth-first search, we have to filter out any edge that would create a loop. Concretely, we check whether the node about to be visited already appears on the current path, and if it does, we do not visit it again.
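A minimal sketch of that loop check, assuming an undirected friendship graph keyed by integer user IDs and a cap on the number of degrees to explore. The class DepthFirstFriends and its methods are hypothetical names chosen for this example, not the article's code.

```java
import java.util.ArrayDeque;
import java.util.Collections;
import java.util.Deque;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical sketch for illustration; not the original article's code.
class DepthFirstFriends {
    // Builds an undirected graph from pairs of user IDs.
    static Map<Integer, Set<Integer>> undirected(int[][] edges) {
        Map<Integer, Set<Integer>> graph = new HashMap<>();
        for (int[] e : edges) {
            graph.computeIfAbsent(e[0], k -> new HashSet<>()).add(e[1]);
            graph.computeIfAbsent(e[1], k -> new HashSet<>()).add(e[0]);
        }
        return graph;
    }

    // Explores at most `remaining` more hops from `current`.
    static void dfs(Map<Integer, Set<Integer>> graph, int current, int remaining,
                    Deque<Integer> path, Set<Integer> found) {
        if (remaining == 0) return;
        path.push(current);
        for (int friend : graph.getOrDefault(current, Collections.emptySet())) {
            if (path.contains(friend)) continue; // already on the current path: this edge closes a loop
            found.add(friend);
            dfs(graph, friend, remaining - 1, path, found);
        }
        path.pop();
    }

    // Returns every user within maxDegree hops of start, in ascending ID order.
    static Set<Integer> friendsWithin(Map<Integer, Set<Integer>> graph, int start, int maxDegree) {
        Set<Integer> found = new TreeSet<>();
        dfs(graph, start, maxDegree, new ArrayDeque<>(), found);
        return found;
    }
}
```

Note that this version can reach the same user several times along different paths, and it does not record which degree each friend belongs to.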

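The six degrees of separation theory tells us that your social circle expands exponentially as the degree of the relationship increases. This means that in a depth-first search, every additional degree of relationship adds a large number of friends. Because depth-first search follows each path to its end before backtracking, it may work through many distant friends before it has found all of the nearby ones.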

What is breadth-first search?

Breadth-first search (BFS) starts from one node of the graph, follows the edges attached to it, and first finds all the other nodes at distance 1 from that node. Only when all nodes at distance 1 from the starting point have been found does it look for nodes at distance 2. When all nodes at distance 2 have been found, it looks for nodes at distance 3, and so on.
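The definition can be read almost literally as code: keep the set of nodes at the current distance, and build the set at the next distance from their unvisited neighbors. The sketch below assumes an undirected graph keyed by integer IDs; LevelByLevel and its methods are hypothetical names used only for this example.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical sketch for illustration; not the original article's code.
class LevelByLevel {
    // Builds an undirected graph from pairs of node IDs.
    static Map<Integer, Set<Integer>> undirected(int[][] edges) {
        Map<Integer, Set<Integer>> graph = new HashMap<>();
        for (int[] e : edges) {
            graph.computeIfAbsent(e[0], k -> new HashSet<>()).add(e[1]);
            graph.computeIfAbsent(e[1], k -> new HashSet<>()).add(e[0]);
        }
        return graph;
    }

    // Index d of the result holds the nodes at distance d from start, sorted by ID.
    static List<List<Integer>> levels(Map<Integer, Set<Integer>> graph, int start) {
        List<List<Integer>> levels = new ArrayList<>();
        Set<Integer> visited = new HashSet<>();
        visited.add(start);
        List<Integer> current = new ArrayList<>();
        current.add(start);
        while (!current.isEmpty()) {
            levels.add(current);
            Set<Integer> next = new TreeSet<>();
            for (int node : current) {
                for (int neighbor : graph.getOrDefault(node, Collections.emptySet())) {
                    // add() returns false if the neighbor was already visited
                    if (visited.add(neighbor)) next.add(neighbor);
                }
            }
            current = new ArrayList<>(next); // all nodes one step further away
        }
        return levels;
    }
}
```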

img

Breadth-first search is really a horizontal, level-by-level search of a tree!
Although breadth-first and depth-first search visit nodes in a different order, they have two things in common.
First, as the search advances we do not want to revisit nodes and edges, so we mark the nodes that have been visited and afterwards visit only unmarked nodes. On this point breadth-first and depth-first agree. The difference is what happens next: in breadth-first search, once all the nodes directly connected to a node have been visited, the next step is to look at the nodes directly connected to that node's sibling and see whether any of them is new.

In the figure above, after visiting the two child nodes 580 and 762 of node 945, the breadth-first strategy finds that 945 has no other children, so it checks 945's sibling 131 for children that can be visited, and the next node visited is 906. In depth-first search, when you reach a node and find that every node directly connected to it has been visited, you do not look at its sibling. Instead you fall back to its parent and check whether any node directly connected to the parent is new. In the figure above, for example, after visiting the two children of node 945, the depth-first strategy falls back to node 110 and then visits 110's child 131.
Second, breadth-first search also lets us visit every node that is connected to the starting point, which is why it is also called breadth-first traversal. If a graph contains several subgraphs that are not connected to each other, a breadth-first search from the starting point covers only one of them. In that case we pick a new starting point that has not been visited yet and continue with a breadth-first traversal of another subgraph. Repeating this lets breadth-first search traverse a graph made up of several connected subgraphs.
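Restarting from an unvisited node might look like the sketch below. It assumes nodes are numbered 0 to n-1 and that adjacency[i] lists the neighbors of node i; it also uses a first-in, first-out queue for each traversal, which the next section explains. ComponentTraversal is a hypothetical name used only here.

```java
import java.util.ArrayList;
import java.util.LinkedList;
import java.util.List;
import java.util.Queue;

// Hypothetical sketch for illustration; not the original article's code.
class ComponentTraversal {
    // Returns one list per connected subgraph, each in breadth-first visiting order.
    static List<List<Integer>> components(int[][] adjacency) {
        boolean[] visited = new boolean[adjacency.length];
        List<List<Integer>> result = new ArrayList<>();
        for (int start = 0; start < adjacency.length; start++) {
            if (visited[start]) continue; // already covered by an earlier traversal
            List<Integer> component = new ArrayList<>();
            Queue<Integer> queue = new LinkedList<>();
            queue.offer(start);
            visited[start] = true;
            while (!queue.isEmpty()) {
                int node = queue.poll();
                component.add(node);
                for (int neighbor : adjacency[node]) {
                    if (!visited[neighbor]) {
                        visited[neighbor] = true; // mark on discovery so it is queued only once
                        queue.offer(neighbor);
                    }
                }
            }
            result.add(component);
        }
        return result;
    }
}
```

The visited array is shared by all traversals, so the total work stays proportional to the number of nodes plus edges.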

How to implement social friend recommendation?

How can we make sure that nodes closer to the starting point are visited first while still keeping track of every node we have found? If you look closely, you will see that nodes closer to the starting point are discovered earlier. In other words, the earlier a node is discovered, the earlier it should be processed.

This calls for a queue, a first-in, first-out (FIFO) data structure.

So how does the queue work in breadth-first search? The process breaks down into the following steps.
First, put the initial node in the queue. Then, each time, take a node from the head of the queue and look up all the nodes on the level directly below it. Next, add the newly discovered nodes to the tail of the queue. Repeat these steps until no new nodes are found and the queue is empty.

img

Step 1: Add the initial node 110 to the queue.
Step 2: Take node 110 out of the queue and look up its next-level nodes 123, 879, 945 and 131.
Step 3: Add nodes 123, 879, 945 and 131 to the tail of the queue.
Step 4: Repeat steps 2 and 3 for node 123, adding the newly discovered nodes 162 and 587 to the tail of the queue.
Step 5: Repeat steps 2 and 3 for node 879. No new nodes are found.
Step 6: Repeat steps 2 and 3 for node 945, adding the newly discovered nodes 580 and 762 to the tail of the queue.
...
Step n-1: Repeat steps 2 and 3 for node 906. No new nodes are found.
Step n: Repeat steps 2 and 3 for node 681. No new nodes are found and no nodes are left to process, so the whole process ends.
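The steps above can be replayed in a few lines of Java. The tree below contains only the parent-child links that the text states explicitly. Node 681 is left out because its parent is not given, so the order printed here ends at 906 rather than 681. QueueTrace and its methods are hypothetical names for this sketch.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.LinkedList;
import java.util.List;
import java.util.Map;
import java.util.Queue;

// Hypothetical sketch for illustration; not the original article's code.
class QueueTrace {
    // Only the links named in the text; the real figure may contain more nodes.
    static Map<Integer, int[]> exampleTree() {
        Map<Integer, int[]> children = new HashMap<>();
        children.put(110, new int[]{123, 879, 945, 131});
        children.put(123, new int[]{162, 587});
        children.put(945, new int[]{580, 762});
        children.put(131, new int[]{906});
        return children;
    }

    // Returns the order in which nodes are taken out of the queue.
    static List<Integer> visitOrder(Map<Integer, int[]> children, int root) {
        List<Integer> order = new ArrayList<>();
        Queue<Integer> queue = new LinkedList<>();
        queue.offer(root);                    // step 1: enqueue the initial node
        while (!queue.isEmpty()) {
            int node = queue.poll();          // step 2: take a node from the head
            order.add(node);
            for (int child : children.getOrDefault(node, new int[0])) {
                queue.offer(child);           // step 3: append its children to the tail
            }
        }
        return order;
    }

    public static void main(String[] args) {
        // prints [110, 123, 879, 945, 131, 162, 587, 580, 762, 906]
        System.out.println(visitOrder(exampleTree(), 110));
    }
}
```

Because the input is a tree, no node can be reached twice, so this sketch skips the visited marks that a general graph would need.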

The implementation needs the following data structures:

  • User node Node. The user node designed here differs slightly from the prefix tree node TreeNode: it contains the user ID user_id and the set of the user's friends. I use a HashSet for the friends so that duplicate friends are easy to detect while the user relationship graph is being generated.
  • Node array Node[] representing the whole graph. Since each user is identified by user_id, a contiguous array can hold all users, with the user's user_id as the array index.
  • Queue. Because Queue is an interface in Java, a class that implements it is needed, such as LinkedList.
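Put together, these three structures might be used as in the sketch below. It is one possible arrangement, not the article's actual code: the class FriendSearch, the methods buildGraph and degreesFrom, and the convention of returning -1 for users that were not reached are all assumptions made for this example.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedList;
import java.util.Queue;

// Hypothetical sketch for illustration; not the original article's code.
class FriendSearch {
    // User node: the user ID plus the set of that user's friends.
    static class Node {
        final int user_id;
        final HashSet<Integer> friends = new HashSet<>(); // a set ignores duplicate friendships
        Node(int id) { user_id = id; }
    }

    // Builds the whole graph as an array indexed by user_id.
    static Node[] buildGraph(int userCount, int[][] friendships) {
        Node[] nodes = new Node[userCount];
        for (int i = 0; i < userCount; i++) nodes[i] = new Node(i);
        for (int[] f : friendships) {
            nodes[f[0]].friends.add(f[1]);
            nodes[f[1]].friends.add(f[0]);
        }
        return nodes;
    }

    // degrees[i] is the degree of friendship between start and user i,
    // or -1 if user i is further away than maxDegree or not reachable.
    static int[] degreesFrom(Node[] nodes, int start, int maxDegree) {
        int[] degrees = new int[nodes.length];
        Arrays.fill(degrees, -1);
        degrees[start] = 0;
        Queue<Integer> queue = new LinkedList<>();
        queue.offer(start);
        while (!queue.isEmpty()) {
            int current = queue.poll();
            if (degrees[current] == maxDegree) continue; // do not expand past the limit
            for (int friend : nodes[current].friends) {
                if (degrees[friend] == -1) {             // not discovered yet
                    degrees[friend] = degrees[current] + 1;
                    queue.offer(friend);
                }
            }
        }
        return degrees;
    }
}
```

For example, with six users and the friendships {0,1}, {0,2}, {1,3}, {2,3}, {3,4}, calling degreesFrom(nodes, 0, 2) marks users 1 and 2 as first-degree friends and user 3 as a second-degree friend, and leaves users 4 and 5 at -1.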

Summary

When traversing a tree or graph with a depth-first strategy, the number of nodes found may grow exponentially. If we care most about the nearest connections, such as second-degree friends in a social network, the breadth-first strategy is more efficient. Precisely because breadth-first search processes nodes in the order they are discovered, we can no longer implement it with recursion or a stack, and need a queue with first-in, first-out behavior instead.

img

Origin www.cnblogs.com/liugangjiayou/p/12712077.html