1. Introduction

1.1 Edge computing vs. cloud computing

Over the past decade, cloud computing has developed rapidly on the back of big data, but it also faces several problems. As Internet applications and user numbers grow explosively, the spread of 5G and the resulting increase in bandwidth put pressure on cloud storage. Deploying large-scale neural networks in online systems has become increasingly common, which puts heavy pressure on cloud compute. For applications with demanding real-time requirements, the interaction and communication overhead with a large cloud becomes a bottleneck. Finally, the cloud's "centralized" computing model brings operation and maintenance costs and the risk of failures.

The concept of edge computing is not new either. With the rapid growth of storage and computing power in recent years, the performance of terminal devices, especially smartphones (CPU and GPU benchmark scores, ever larger memory), has become a main selling point, yet this computing power is still far from fully utilized. Edge computing offers four advantages: 1) data stays local, which addresses cloud storage and privacy issues; 2) computation is local, which relieves cloud compute overload; 3) communication cost is low, which addresses interaction and user experience problems; 4) computation is decentralized, which avoids failures and enables extreme personalization.

1.2 Pain points of recommendation systems

In the mobile era, more and more recommendation scenarios have emerged to address information overload, especially feed recommendation presented as a list. Take the Mobile Taobao feed as an example. Users who enter the "Guess You Like" scenario often have no clear interest or specific product need when they start browsing; they gradually discover what they want to buy as they browse. While the user browses, the recommendation system delivers different types of products to the device and presents them for the user to choose from. In the process, the system captures changes in the user's interest and recommends products that match that interest better. But can the recommendation system respond immediately when the user's interest changes?

In the past, a recommendation system worked as follows: the client sends a request, the cloud server ranks the products and sends the ranked list to the client, and the client then renders the products in order. This leads to the following two problems:

Delayed recommendation decisions: because of QPS limits on the cloud servers, feed recommendation uses paged requests. This gives the cloud recommendation system few opportunities to adjust the recommendations already delivered to the device, so it cannot respond to changes in the user's interest in time. In the example figure, the user's interaction with the first 4 products shows a dislike of "motorcycles", but because the next page is requested only after 50 products, the remaining "motorcycle" products in the current page cannot be adjusted in time.
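The effect of paging on decision frequency can be illustrated with a small sketch. The page size of 50 comes from the example above; the function names are hypothetical and the code only counts decision points, it does not describe the production request logic.

```python
def cloud_decision_points(items_viewed: int, page_size: int = 50) -> int:
    """Number of times a paged cloud recommender gets to decide the ranking.

    Assumption: the cloud decides only when a page is requested, so it
    decides once per page the user has started to view.
    """
    if items_viewed <= 0:
        return 0
    return (items_viewed + page_size - 1) // page_size  # ceiling division


def items_locked_after(position: int, page_size: int = 50) -> int:
    """Items in the current page that are already fixed after `position` (1-based)."""
    return page_size - ((position - 1) % page_size) - 1


# After the 4th item the user has signalled a dislike, but the rest of the
# current page was ranked before that signal existed.
decisions = cloud_decision_points(4)
locked = items_locked_after(4)
print(decisions, locked)
```

With these assumptions, a signal given at position 4 cannot influence the remaining items of the first page. An on-device reranker could, in principle, re-order those items as soon as the signal appears.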

Delayed perception of real-time user behavior: current personalized recommendation systems build features from users' interactions with products, but these behaviors actually occur on the client. To obtain the behavior features, the recommendation model needs the client to upload the data to the server, which introduces delay; as the figure shows, the delay in user behavior can reach 10 s to 1 min. In addition, because of network bandwidth and latency limits, a large amount of fine-grained user behavior (such as real-time product exposure and swipe gestures) cannot be modeled.

Overall, the pain point of current recommendation systems is that the moment a user's preference changes does not match the moment the system perceives the change and adjusts its recommendations. As a result, the recommended content is not what the user wants at that moment, and the user's willingness to browse and click declines.

1.3 Edge computing + recommendation systems

The advantage of edge computing is that it gives the edge node (here, the mobile device) the ability to "think independently", so that part of the decision-making and computation no longer depends on the cloud, and the device can produce results in a more real-time and more strategic way. Speaking of real time, the low latency of the coming 5G era greatly reduces the interaction time between device and cloud, but this does not diminish the value of using on-device intelligence for lower-cost decisions and fast responses; on the contrary, on-device intelligence and the cloud can be bound together more closely. Moreover, because the device can perceive user intent and make decisions within seconds, the product gets closer to the user, which opens up more real-time interaction designs. The product is no longer limited to fixed moments, such as paged requests, for the cloud to return new content; instead, it can consider what content to provide to match a specific intent whenever the user expresses one.

The EdgeRec on-device recommendation system uses the real-time perception and real-time feedback that edge computing provides to address the weak real-time perception and real-time feedback of a client-server recommendation system. EdgeRec provides on-device user intent perception, on-device reranking, and on-device card capabilities. It perceives user intent and makes decisions on the device within seconds, and returns feedback that matches the intent, which increases users' willingness to click and browse and changes the overall feel of the waterfall feed.

2. On-device algorithm model

2.1 Overview

As shown in figure (a), the EdgeRec on-device recommendation model consists mainly of two modules: "on-device real-time user perception" and "on-device real-time reranking". On-device real-time user perception is modeled as Heterogeneous User Behavior Sequence Modeling, which has two parts: Item Exposure (IE) Behavior Sequence Modeling and Item Page-View (IPV) Behavior Sequence Modeling. On-device real-time reranking is modeled as Reranking with Behavior Attention Networks (BAN). The two modules are described in detail below.

2.2 On-device real-time user perception

2.2.1 Significance

First, in personalized search and recommendation, the "a thousand faces for a thousand users" effect comes from features, and personalization depends on user behavior data. Work such as DIN [1] models the sequence of items a user recently interacted with and uses it as input to the personalized model. However, previous work generally considers only "positive feedback" interactions between users and items (such as clicks and transactions) and rarely considers "negative feedback" interactions (exposures). Positive-feedback features are indeed clearer and less noisy, but we believe users' real-time negative-feedback interactions with items also matter. An intuitive example: after a category of items has been exposed many times in real time, the click-through rate of that category drops significantly.

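On the other hand, previous work on personalized models generally considers only the features of the items the user "interacted" with; the key phrase here is "the items interacted with". However, the user's "interaction actions" on an item also matter. For example, a user's behavior on the detail page after clicking an item reflects the real preference for that item, and real data may contain "pseudo" clicks. Similarly, if a user does not click an item but the exposure is highly focused, that is, the item stays exposed for a very long time, the unclicked exposure does not necessarily mean that the user dislikes the item. This is especially true now that feed recommendation pages show ever larger item images, surface various keywords, and can even auto-play video; for some users a click may have become a very "luxurious" form of positive feedback.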

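Finally, we believe that users' real-time behavior in the recommendation scenario is also very important. For example, a user may tap "dislike" or give other negative feedback in real time, or a category may be exposed many times in real time without being clicked. These signals reflect the user's preference at that moment, so the recommendation system needs to be able to model user preference in real time and adjust promptly.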

In summary, the significance of on-device real-time user perception lies in the following 5 points:

2.2.2 Real-time behavior feature system

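Based on the analysis above, compared with user perception modeling in current cloud recommendation algorithms, on-device real-time user perception should have the following characteristics: 1) move from "relying on positive-feedback interactions" to "attending to both positive- and negative-feedback interactions"; 2) move from "which item was interacted with" to "to what degree the item was interacted with"; 3) move from "near-real-time interaction" to "ultra-real-time interaction". These three characteristics have to be expressed through on-device features. Based on them, we worked with the Taobao client BehaviX team to design an on-device real-time user behavior feature system for the feed recommendation system. As shown in the figure below, the on-device real-time user behavior features consist mainly of two parts: "(a) item exposure behavior" and "(b) item detail page behavior".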

2.2.3 Heterogeneous behavior sequence modeling

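There are two kinds of heterogeneity here. First, "user actions (Action)" and "interacted items (Item)" are heterogeneous. Second, "waterfall feed (exposure) behavior (Item Exposure (IE) Behavior)" and "detail page (click) behavior (Item Page-View (IPV) Behavior)" are heterogeneous. We first describe how the model input is organized: 1) a single user behavior is defined as a Pair <Item, Action>, and a behavior sequence is defined as List (<Item, Action>); 2) in the Item Exposure (IE) Behavior Sequence, the "item" is an exposed item and the "action" is the user's interaction with that item in the waterfall feed, such as exposure duration, scroll speed, and scroll direction; 3) in the Item Page-View (IPV) Behavior Sequence, the "item" is a clicked item and the "action" is the user's interaction with that item on the detail page, such as dwell time, whether it was added to the cart, and whether it was added to favorites.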
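The input organization described above can be sketched as plain data types. This is only an illustration: the class names, field names, units, and the encoding of scroll direction are assumptions, and the real feature system contains more fields than the few examples named in the text.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Item:
    item_id: str
    category: str  # hypothetical item feature, for illustration only


@dataclass(frozen=True)
class ExposureAction:
    """Action on an exposed item in the waterfall feed (IE behavior)."""
    exposure_seconds: float
    scroll_speed: float      # unit is an assumption, e.g. pixels per second
    scroll_direction: int    # assumed encoding: +1 = down, -1 = up


@dataclass(frozen=True)
class PageViewAction:
    """Action on a clicked item in its detail page (IPV behavior)."""
    dwell_seconds: float
    added_to_cart: bool
    favorited: bool


# A behavior is a pair <Item, Action>; a behavior sequence is a list of pairs.
IEBehaviorSequence = List[Tuple[Item, ExposureAction]]
IPVBehaviorSequence = List[Tuple[Item, PageViewAction]]

ie_sequence: IEBehaviorSequence = [
    (Item("i1", "motorcycle"), ExposureAction(0.4, 1200.0, 1)),
    (Item("i2", "helmet"), ExposureAction(3.5, 150.0, 1)),
]
ipv_sequence: IPVBehaviorSequence = [
    (Item("i2", "helmet"), PageViewAction(25.0, True, False)),
]
```

Keeping the two sequences as separate types mirrors the separation between exposure behavior and detail page behavior, and keeping `Item` apart from the action types mirrors the separation between what was interacted with and how.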

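Model figure (a) above includes the framework of our network structure for Heterogeneous User Behavior Sequence Modeling. Two points deserve emphasis: 1) The IE Behavior Sequence and the IPV Behavior Sequence are first modeled separately and then fused at the end (if needed). The main consideration is that click behavior is generally sparse while exposure behavior is abundant; if the two were merged into a single behavior sequence before modeling, the model would likely be dominated by exposure behavior. 2) Item features (Item) and action features (Action) are first encoded separately and then fused. The main consideration is that item features and action features are heterogeneous inputs; if a downstream task needs to apply attention over specific items, attention is only meaningful over homogeneous inputs. We return to this point in the discussion of the on-device reranking model.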

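Here, the item feature sequences (IE Item Sequence and IPV Item Sequence) are encoded with a GRU network, and the action feature sequences (IE Action Sequence and IPV Action Sequence) are encoded directly with the identity function. The item sequence embeddings (IE Item Embedding and IPV Item Embedding) and the action sequence embeddings (IE Action Embedding and IPV Action Embedding) are fused with a simple Concat operation, which yields the behavior sequence embeddings (IE Behavior Embedding and IPV Behavior Embedding).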
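The encode-then-fuse step can be sketched in pure Python. This is a minimal illustration under several assumptions: the GRU cell uses random, untrained weights and no biases, the per-step hidden state is taken as the item embedding, and all names (`GRUCell`, `encode_behavior_sequence`) are hypothetical rather than taken from EdgeRec.

```python
import math
import random
from typing import List

Vector = List[float]
Matrix = List[Vector]


def matvec(m: Matrix, v: Vector) -> Vector:
    return [sum(w * x for w, x in zip(row, v)) for row in m]


def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


class GRUCell:
    """Minimal GRU cell with random (untrained) weights; biases are omitted."""

    def __init__(self, input_dim: int, hidden_dim: int, seed: int = 0) -> None:
        rng = random.Random(seed)

        def init(rows: int, cols: int) -> Matrix:
            return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]

        self.hidden_dim = hidden_dim
        # One weight pair per gate: update (z), reset (r), candidate state (n).
        self.w = {g: init(hidden_dim, input_dim) for g in "zrn"}
        self.u = {g: init(hidden_dim, hidden_dim) for g in "zrn"}

    def step(self, x: Vector, h: Vector) -> Vector:
        z = [sigmoid(a + b) for a, b in zip(matvec(self.w["z"], x), matvec(self.u["z"], h))]
        r = [sigmoid(a + b) for a, b in zip(matvec(self.w["r"], x), matvec(self.u["r"], h))]
        rh = [ri * hi for ri, hi in zip(r, h)]
        n = [math.tanh(a + b) for a, b in zip(matvec(self.w["n"], x), matvec(self.u["n"], rh))]
        return [(1 - zi) * ni + zi * hi for zi, ni, hi in zip(z, n, h)]


def encode_behavior_sequence(items: List[Vector], actions: List[Vector],
                             gru: GRUCell) -> List[Vector]:
    """Return one behavior embedding per step: concat(GRU item state, action)."""
    h = [0.0] * gru.hidden_dim
    behavior_embeddings = []
    for item_vec, action_vec in zip(items, actions):
        h = gru.step(item_vec, h)                          # item sequence -> GRU
        behavior_embeddings.append(h + list(action_vec))   # action -> identity, then concat
    return behavior_embeddings


gru = GRUCell(input_dim=3, hidden_dim=4)
items = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]   # toy item feature vectors
actions = [[0.4, 1.0], [3.5, 1.0]]           # toy action features, e.g. exposure seconds and direction
embeddings = encode_behavior_sequence(items, actions, gru)
```

Each behavior embedding here has the GRU hidden size plus the action feature size as its length. Because the item part and the action part stay in separate slices until the concat, a downstream module can still use the item embeddings alone.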

2.3 On-device reranking

2.3.1 Significance

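On-device reranking is the foundation of on-device recommendation. It can change the order of recommended items in real time, and can be viewed as recommendation optimization in the user's local domain, that is, optimization within the current page of recommendation results. Relying on real-time user perception, on-device reranking uses real-time positive and negative feedback (exposures, detail page views) and finer-grained user behavior features to continually rerank the candidate items in the feed, achieving real-time perception plus real-time recommendation in the feed.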

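Reranking has a large body of prior work in both search and recommendation [2]. The core of this work is context-aware ranking, where the context is the context among the items to be ranked. This context can be modeled in many ways, for example with an RNN, a Transformer, or hand-crafted global features plus a DNN.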

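EdgeRerank, our on-device real-time reranking work, also builds on context-aware ranking, but here the context includes not only the context among the items to be ranked but also the context of the user's real-time behavior (items exposed in real time, items clicked in real time, and user interaction actions). With this context, EdgeRerank can answer the following question: knowing what has already been ranked and how the user behaved on it, and given a set of candidate items as context, what ordering is optimal? Below we focus on the model framework for on-device reranking, which we call Reranking with Behavior Attention Networks (BAN).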

2.3.2 Reranking with Behavior Attention Networks

Model figure (a) above includes the framework of our network structure for Reranking with Behavior Attention Networks. As mentioned in the background, EdgeRerank considers two kinds of context. For the context among the items to be ranked, we use a common sequence modeling approach and introduce a GRU network to encode the candidate item set. To take the context of real-time user behavior into account, we again use a common approach, namely attention (sometimes called target attention). Recall the input to heterogeneous behavior sequence modeling in real-time user perception: a user behavior is defined as a Pair <Item, Action> and a behavior sequence as List (<Item, Action>), where "item" is the item the user interacted with and "action" is the user's interaction with that item. As the network diagram shows, the attention operates between the candidate items and the items in the behavior sequences, that is, between items and items. Readers familiar with attention will know the (Query, Key, Value) triple. In this model, the Query is the encoded candidate item (Candidate Item Embedding), the Key is the encoded items of the behavior sequences (IE Item Embedding and IPV Item Embedding), and the Value is the fused behavior sequence embedding (IE Behavior Embedding and IPV Behavior Embedding). In plain terms, the motivation is as follows: for a given item in the candidate set, look at what the items the user has interacted with look like, focus on those with similar characteristics, and at the same time look at how the user behaved on those items; together these serve as a reference for ranking the candidate.
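The (Query, Key, Value) assignment can be sketched as a single attention step in pure Python. The text does not specify the scoring function, so the scaled dot product used here is an assumption, as are the function names and the toy vectors.

```python
import math
from typing import List

Vector = List[float]


def dot(a: Vector, b: Vector) -> float:
    return sum(x * y for x, y in zip(a, b))


def softmax(scores: List[float]) -> List[float]:
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def behavior_attention(query: Vector, keys: List[Vector], values: List[Vector]) -> Vector:
    """Attend from one candidate item over the user's behavior sequence.

    query  : candidate item embedding
    keys   : item embeddings of the behavior sequence (same space as query)
    values : fused behavior embeddings (item + action), one per key
    """
    if not keys or len(keys) != len(values):
        raise ValueError("keys and values must be non-empty and of equal length")
    scale = math.sqrt(len(query))  # assumption: scaled dot-product scoring
    weights = softmax([dot(query, k) / scale for k in keys])
    dim = len(values[0])
    return [sum(w * v[i] for w, v in zip(weights, values)) for i in range(dim)]


candidate = [1.0, 0.0]
behavior_items = [[1.0, 0.0], [0.0, 1.0]]        # first behavior item resembles the candidate
behaviors = [[1.0, 0.0, 5.0], [0.0, 1.0, 0.5]]   # item part plus one action feature
context = behavior_attention(candidate, behavior_items, behaviors)
```

Because the keys are item embeddings, the weights express item-to-item similarity, while the values carry the action information. The resulting context therefore describes how the user behaved on items similar to the candidate.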

3. Experimental results

3.1 Offline Experiments

To verify the effectiveness of introducing on-device real-time user perception as context for on-device reranking, we first ran offline experiments. The comparison methods and experimental results are shown in the table below:

Experimental results
