Getting Started with P2P Protocol

1. P2P definition

        P2P (Peer to Peer) refers to peer-to-peer computing, or a peer-to-peer network.

        The core of P2P is that data is stored locally on the clients, and by querying stored index information (name, address, block), data can be transferred directly between terminals. A P2P network spreads data traffic across the network instead of concentrating it. The management point is relieved of service-capacity pressure: it stores only the index and the links to the data and is not responsible for the data itself, which avoids copyright and management trouble.

        The idea of "one for all, all for one" builds on the P2P network, but it is not the same thing as P2P. A P2P network is in effect a "tribal" network with no central government. You join by logging in from a client; most networks require no authentication, and there is no restriction on leaving. Letting others "take" your files is voluntary, and you do not pay for other people's resources: "the P2P world is harmonious". To encourage users to give while they acquire (the more you give, the more you can share), developers split files into blocks so that the part you have just downloaded is immediately shared with others. This sharing does not need your permission. Networks also award points based on your behavior, to encourage "good" peers and reward contribution: you help others, and others will help you. Because the protocols of many P2P networks are open, there are many ways to join, and different P2P networks can also communicate with each other, which provides a basis for further information sharing.

        The biggest problem with a free and loose P2P network is that it is very difficult for the authorities to supervise, because they face not one organization but every individual user. After a user joins a P2P network, he contributes the processing power and storage capacity of his computer to the network without knowing who is using it. The designer of the network provides a way to pool everyone's resources and defines the rules, but the specific content depends on the individual users. P2P is a new technology and a development model that stands alongside C/S and B/S. The technology is good, but it was born as a product of ordinary people resisting big companies, so some people do not welcome it.

 

2. My understanding of P2P networks

        P2P applications quickly became popular around the world once MP3 downloading was "recognized" by people. At present, more than half of the backbone traffic of the network is P2P traffic, and applications in file sharing, live and on-demand video, instant messaging (Internet telephony), online chat, network storage, grid computing and other fields are developing rapidly. The P2P networking model and development model have become the network model best suited to the "free community" on the Internet. As the Internet has become widespread, it has passed through the stages of information access and information search, and "community-based" information search may become the mark of the next stage.

In my view, the key to the development of P2P is its business model, because P2P truly embodies the advantages of mesh networks and also addresses, at the level of the network itself, the service-guarantee problem that has long plagued TCP/IP. Applications that can guarantee quality of service have been realized on the Internet, such as Skype calls and PPLive live video. The network is the conducting nerve of the information society, and the mode best suited to this nerve is P2P.

        The main problem of P2P technology still centers on information search. Search technology is tied directly to the P2P network structure, so it is necessary to understand the network structure first.

 

3. P2P network structure

1. Centralized P2P network: representatives are Napster, QQ

        A central server records the shared information (the index) and answers queries against it. The difference from the C/S mode is that in the C/S structure no data flows between clients; all data is exchanged through the central server. In a centralized P2P network, login and information queries go to the central server, but once the data has been located, the requester establishes a connection directly with the client that stores it.
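The split between "index on the server, data on the peers" can be sketched as follows. This is a minimal in-memory illustration, not the Napster or QQ protocol: the class names `IndexServer` and `Peer`, the `download` helper, and the addresses are all hypothetical, and real network connections are replaced by direct method calls.

```python
from collections import defaultdict


class IndexServer:
    """Central server: records who shares what, never the file data itself."""

    def __init__(self):
        self._index = defaultdict(set)  # file name -> set of peer addresses

    def register(self, peer_addr, file_names):
        for name in file_names:
            self._index[name].add(peer_addr)

    def unregister(self, peer_addr):
        for peers in self._index.values():
            peers.discard(peer_addr)

    def lookup(self, file_name):
        # .get() avoids creating empty entries for unknown names
        return sorted(self._index.get(file_name, ()))


class Peer:
    """A client that keeps its shared files locally."""

    def __init__(self, addr, files):
        self.addr = addr
        self.files = dict(files)  # file name -> bytes held locally

    def serve(self, file_name):
        return self.files[file_name]


def download(server, peers_by_addr, file_name):
    """Query the server for holders, then fetch directly from one of them."""
    holders = server.lookup(file_name)
    if not holders:
        return None
    # The transfer bypasses the server (stand-in for a direct connection).
    return peers_by_addr[holders[0]].serve(file_name)


if __name__ == "__main__":
    server = IndexServer()
    alice = Peer("10.0.0.1:6000", {"song.mp3": b"audio-bytes"})
    bob = Peer("10.0.0.2:6000", {})
    peers = {alice.addr: alice, bob.addr: bob}
    server.register(alice.addr, alice.files)
    print(download(server, peers, "song.mp3"))
```

Note that `IndexServer` never sees the bytes of `song.mp3`; it only answers "who has it", which is the property the paragraph above describes.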

 

2. Distributed unstructured P2P network: the representative is Gnutella

        The network is organized as a random graph, forming a loose network with no central server. Search uses flooding together with a random forwarding mechanism bounded by a TTL (time to live). Each node has the same function and serves as both server and client.

        Node management is somewhat like routing management: a query travels through the network like ripples on water until its "energy" (the TTL) is exhausted.
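A TTL-bounded flood can be sketched as a breadth-first walk over the neighbor graph. This is only an illustration under assumptions: the function name `flood_search`, the dictionary-based graph, and the node names are hypothetical, and the walk runs in one process rather than as messages between Gnutella nodes.

```python
from collections import deque


def flood_search(graph, files, start, wanted, ttl):
    """Return the nodes holding `wanted` that are within `ttl` hops of `start`.

    graph: node -> list of neighbor nodes
    files: node -> collection of file names shared by that node
    """
    seen = {start}
    hits = set()
    queue = deque([(start, ttl)])
    while queue:
        node, remaining = queue.popleft()
        if wanted in files.get(node, ()):
            hits.add(node)
        if remaining == 0:
            continue  # "energy" exhausted: the query is not forwarded further
        for neighbor in graph.get(node, ()):
            if neighbor not in seen:  # a node ignores a query it already saw
                seen.add(neighbor)
                queue.append((neighbor, remaining - 1))
    return hits


if __name__ == "__main__":
    # A simple chain A - B - C - D; B and D share the file.
    graph = {"A": ["B"], "B": ["A", "C"], "C": ["B", "D"], "D": ["C"]}
    files = {"B": {"x.mp3"}, "D": {"x.mp3"}}
    print(flood_search(graph, files, "A", "x.mp3", ttl=1))
    print(flood_search(graph, files, "A", "x.mp3", ttl=3))
```

With a small TTL the query finds only nearby copies; raising the TTL reaches more nodes at the cost of more messages, which is the main trade-off of flooding.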



3. Distributed structured P2P network: representatives are Pastry, Tapestry, Chord, CAN

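        "Structured" describes how the network is managed: the structure lies in the logical organization of queries, not in any change to the physical connections. The structure exists to make the search algorithm fast, and a lookup is generally comparable to a binary (halving) search.

        The DHT (Distributed Hash Table) routing algorithm uses a distributed hash function to map an input keyword uniquely to a node, and then establishes a connection with that node through a specific routing algorithm. Each network node is assigned a unique node identifier (Node ID), and each resource object is hashed to produce a unique resource identifier (Object ID). The resource is stored on the node whose NID is equal or closest to that Object ID, and a query uses the same method to locate the node that stores the resource.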
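The mapping from Object ID to responsible node can be sketched as below. Several things here are assumptions for illustration: "closest" is interpreted as the first node ID at or after the Object ID on a ring (the Chord-style successor rule), the ID space is shrunk to 16 bits, and the `Ring` class holds a global view of all nodes, so the per-hop routing that a real DHT performs is left out.

```python
import hashlib
from bisect import bisect_left

ID_BITS = 16  # deliberately small ID space for readability (assumption)


def make_id(text):
    """Hash a node name or a resource key into the shared ID space."""
    digest = hashlib.sha1(text.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % (2 ** ID_BITS)


class Ring:
    """Global view of a DHT: which node is responsible for which key."""

    def __init__(self, node_names):
        self.nodes = sorted((make_id(n), n) for n in node_names)
        self.ids = [nid for nid, _ in self.nodes]
        self.storage = {name: {} for name in node_names}

    def responsible_node(self, key):
        oid = make_id(key)
        # First node ID >= Object ID; wrap to the smallest ID past the end.
        i = bisect_left(self.ids, oid) % len(self.ids)
        return self.nodes[i][1]

    def put(self, key, value):
        self.storage[self.responsible_node(key)][key] = value

    def get(self, key):
        # Lookup hashes the key the same way, so it lands on the same node.
        return self.storage[self.responsible_node(key)].get(key)


if __name__ == "__main__":
    ring = Ring(["node-a", "node-b", "node-c", "node-d"])
    ring.put("song.mp3", "10.0.0.1:6000")
    print(ring.responsible_node("song.mp3"), ring.get("song.mp3"))
```

Because both `put` and `get` hash the key identically, no central index is needed: any participant that knows the hash function can compute where a resource lives.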

 

4. Hybrid (semi-distributed) P2P network: the third generation of P2P, represented by Skype

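        Building on the distributed model, user nodes are classified by capability so that certain nodes take on special tasks. User node: can obtain the addresses of nearby search nodes from an index node. Search node: handles search requests, needs a speed of at least 128k, and searches the file lists of its child nodes. Index node: a node with a fast connection and large memory that stores information about the available search nodes, collects status information, and maintains the network structure. An index node can also be a search node at the same time. A user node can choose three search nodes as its parent nodes and submit its shared-file list to them. One parent node can maintain 500 child nodes.

        First, because index nodes do not connect directly to copyrighted material, introducing them sidesteps the copyright problem. Second, search nodes are introduced: for a query, the user node connects directly to a search node; if the search returns fewer than 100 results, the request is sent on to neighboring search nodes, and if that is still not enough, the request keeps spreading until all search nodes have been visited.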
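The expanding query between search nodes can be sketched as follows. This is an assumption-laden illustration rather than Skype's actual protocol: `SearchNode`, `search`, and the substring match are hypothetical, and only the two limits (500 children per parent, 100 results per query) come from the text.

```python
from collections import deque

MIN_RESULTS = 100   # result threshold quoted in the text
MAX_CHILDREN = 500  # child nodes one parent can maintain, per the text


class SearchNode:
    """A search node that holds the shared-file lists of its child nodes."""

    def __init__(self, name):
        self.name = name
        self.children = {}   # user node name -> list of shared file names
        self.neighbors = []  # adjacent search nodes

    def add_child(self, user, shared_files):
        if len(self.children) >= MAX_CHILDREN:
            raise RuntimeError("parent node is full")
        self.children[user] = list(shared_files)

    def local_search(self, keyword):
        return [(user, name)
                for user, files in self.children.items()
                for name in files if keyword in name]


def search(entry, keyword, min_results=MIN_RESULTS):
    """Query `entry`, then spread to neighbors until enough results arrive."""
    results, queried = [], []
    seen = {entry.name}
    queue = deque([entry])
    while queue and len(results) < min_results:
        node = queue.popleft()
        queried.append(node.name)
        results.extend(node.local_search(keyword))
        for neighbor in node.neighbors:
            if neighbor.name not in seen:
                seen.add(neighbor.name)
                queue.append(neighbor)
    return results, queried


if __name__ == "__main__":
    s1, s2 = SearchNode("s1"), SearchNode("s2")
    s1.neighbors.append(s2)
    s1.add_child("u1", ["jazz-01.mp3"])
    s2.add_child("u2", ["jazz-02.mp3", "rock-01.mp3"])
    print(search(s1, "jazz", min_results=2))
```

The loop stops as soon as the threshold is met, so a well-stocked first search node answers alone, while a rare file causes the request to reach every search node.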

 

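Appendix 1: A discussion of using the P2P protocol to build a platform for running artificial intelligence

1. Using the P2P protocol to build a distributed computing system

        a. P2P is a distributed communication protocol on today's Internet; BT downloading and P2P chat tools are both applications of the P2P protocol.

        b. The P2P protocol has been used to build PC clusters for large-scale parallel protein-simulation computation.

        c. A distributed system is not required for artificial intelligence, but it is a very good platform for running it.

        d. The DHT protocol is one kind of P2P protocol, and the most decentralized kind; many eMule clients use this protocol.

        With the DHT protocol, thousands of computers can be linked together into a distributed computing system.

        DHT has no event notification, so an alternative is to choose another framework such as JXTA. In my view, C really is poorly suited to distributed programming, although it is fine for server programming; Go, Java and Python are better choices for distributed work.

2. There are several prerequisites for doing this:

        a. The AI system in question must have a distributed architecture; otherwise there is little to gain.

        b. To distribute data with a DHT, the data must take the form of a hash table. Some existing AI systems may need to change their architecture before they can use this technique.

        c. DHT does not appear to support event notification, so other P2P protocols will be needed to meet the communication needs of this kind of distributed system.

Most such protocols support multiple languages, so there should be no need to be limited to C.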

 

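Appendix 2: A brief introduction to P2P and DHT networks

        In its ideas, P2P can be said to be a highly concentrated expression of the thinking, spirit and philosophy of the Internet: shared participation, transparent openness, and equal sharing (which reminds me of the "centralized power" of cloud computing, which I studied earlier and which is currently being hyped wildly). There are many applications based on P2P technology, including file sharing, instant messaging, collaborative processing, and streaming media. From using, analyzing and understanding these applications, I see P2P as essentially a new network distribution technology, one that breaks with the traditional C/S architecture and gradually decentralizes and flattens the network, which perhaps bears out, to some degree, the trend that "the world is flat". P2P file-sharing applications (BT, eMule and the like) are the most concentrated expression of P2P technology. The development of P2P file-sharing networks has gone through roughly the following stages: networks with tracker servers, pure DHT networks with no servers at all, and hybrid P2P networks. The development of DHT networks reflects both a "development" in "ideas/culture" and certain commercial needs (copyright management).

        DHT stands for Distributed Hash Table, a distributed storage method in which information that can be uniquely identified by a key is stored, according to some convention or protocol, dispersed across multiple nodes. This effectively avoids the whole network being paralyzed by a single point of failure in a "centralized" server (for example, a tracker). There are many techniques and algorithms for implementing a DHT; common ones include Chord, Pastry and Kademlia. P2P file-sharing software such as BT and its derivatives (Mainline, Btspilits, Btcomet, uTorrent…) and eMule and its various mods (verycd, easy emules, xtreme…) all implement their DHT networks on the Kademlia algorithm. BT's Python implementation of Kademlia is called khashmir; eMule's C++ implementation of Kademlia is simply called Kad. There are of course some differences between them, but both are based on Kademlia.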
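Kademlia measures how "close" two identifiers are by the XOR of their bits, which the text above does not spell out. The sketch below illustrates only that metric and the resulting choice of nodes for a key; the function names, the default `k=3`, and the use of SHA-1 over a name are assumptions for illustration, not the khashmir or Kad implementation.

```python
import hashlib


def make_id(name):
    """Derive a 160-bit identifier from a name (illustrative choice)."""
    return int.from_bytes(hashlib.sha1(name.encode("utf-8")).digest(), "big")


def xor_distance(a, b):
    """Kademlia's distance between two IDs: their bitwise XOR as an integer."""
    return a ^ b


def closest_nodes(key_id, node_ids, k=3):
    """Return the k node IDs nearest to key_id under the XOR metric."""
    return sorted(node_ids, key=lambda nid: xor_distance(nid, key_id))[:k]


def bucket_index(own_id, other_id):
    """Index of the highest differing bit, used to group contacts by distance.

    Returns -1 when both IDs are identical.
    """
    return xor_distance(own_id, other_id).bit_length() - 1


if __name__ == "__main__":
    nodes = [make_id("node-%d" % i) for i in range(8)]
    key = make_id("song.mp3")
    for nid in closest_nodes(key, nodes):
        print(format(nid, "040x"))
```

A value would be stored on the nodes returned by `closest_nodes`, and a lookup for the same key converges on the same nodes, which is what removes the need for a tracker.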

 

Article sources:

http://www.2cto.com/net/201306/221922.html

http://tieba.baidu.com/p/3047618339

http://blog.csdn.net/mergerly/article/details/7989281
