[Repost] A simple product recommendation system based on HBase


Looking across e-commerce websites from a user's point of view, I see several kinds of product recommendation:

 

One kind is based on the item I am searching for or viewing: the system counts which other items were viewed by the users who also searched for or viewed that item, and recommends the top-ranked ones to me. Take Dangdang as an example:
When I look at "Hadoop: The Definitive Guide", the system recommends a bunch of other books to me:
 

 

Another kind is based on the items I have recently searched for or viewed: the system recommends items it thinks I may be interested in. Take Taobao as an example:

There are also a few others that I find quite interesting:
 
 

In particular, among the users who searched for Hadoop, a lot of people ended up buying a down jacket. Why is that? Are people who study Hadoop all very cold? Or are they all northerners?
 



But I digress; back to the topic. This post only discusses a design I came up with on my own: a simple product recommendation system based on HBase.

Leaving the other processes aside, for this kind of recommendation alone an HBase-based design needs just two tables. The user_item table records all the items each user has viewed; the item_user table records all the users who have viewed each item.

user_item: userid is the row key, and the column family and column are item:itemid. Sample data:

    user1    item:itemid    timestamp=1234567891, value=item1
    user1    item:itemid    timestamp=1234567892, value=item2
    user1    item:itemid    timestamp=1234567893, value=item3
    user2    item:itemid    timestamp=1234567894, value=item4
    user2    item:itemid    timestamp=1234567895, value=item5
    user3    item:itemid    timestamp=1234567881, value=item1
    user3    item:itemid    timestamp=1234567832, value=item2
    user4    item:itemid    timestamp=1234567843, value=item3
    user4    item:itemid    timestamp=1234567854, value=item4
    user4    item:itemid    timestamp=1234567895, value=item5
    ......

item_user: itemid is the row key, and the column family and column are user:userid. Sample data:

    item1    user:userid    timestamp=1234567891, value=user1
    item1    user:userid    timestamp=1234567892, value=user2
    item1    user:userid    timestamp=1234567893, value=user4
    item2    user:userid    timestamp=1234567894, value=user3
    item3    user:userid    timestamp=1234567895, value=user2
    item3    user:userid    timestamp=1234567881, value=user4
    item6    user:userid    timestamp=1234567832, value=user5
    item6    user:userid    timestamp=1234567843, value=user6
    item8    user:userid    timestamp=1234567854, value=user5
    item8    user:userid    timestamp=1234567895, value=user4
    ......

The rough flow is: when I view "Hadoop: The Definitive Guide" (item1), the system uses item1 as the row key to look up in item_user all the users who have viewed item1, then uses each of those userids as a row key to look up in user_item all the items those users have viewed, and finally deduplicates, counts, sorts, and displays the results.
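
The lookup flow described above can be sketched in a few lines of Python. This is only a minimal in-memory illustration, not HBase client code: plain dictionaries stand in for the two tables, and the names `VIEW_EVENTS`, `build_tables`, and `recommend` are hypothetical. Excluding the viewed item itself and breaking ties by item id are assumptions the post does not state. Both tables are built from the user_item sample rows so that they agree with each other.

```python
from collections import Counter, defaultdict

# (userid, itemid, timestamp) view events, copied from the user_item sample rows.
VIEW_EVENTS = [
    ("user1", "item1", 1234567891),
    ("user1", "item2", 1234567892),
    ("user1", "item3", 1234567893),
    ("user2", "item4", 1234567894),
    ("user2", "item5", 1234567895),
    ("user3", "item1", 1234567881),
    ("user3", "item2", 1234567832),
    ("user4", "item3", 1234567843),
    ("user4", "item4", 1234567854),
    ("user4", "item5", 1234567895),
]


def build_tables(events):
    """Build dict stand-ins for the two tables: row key -> [(timestamp, value)]."""
    user_item = defaultdict(list)  # row key: userid, values: viewed itemids
    item_user = defaultdict(list)  # row key: itemid, values: viewing userids
    for userid, itemid, ts in events:
        user_item[userid].append((ts, itemid))
        item_user[itemid].append((ts, userid))
    return user_item, item_user


def recommend(itemid, user_item, item_user, top_n=5):
    """Rank other items by how many viewers of `itemid` also viewed them."""
    # Step 1: look up item_user by itemid to find everyone who viewed the item.
    viewers = {userid for _, userid in item_user.get(itemid, [])}

    counts = Counter()
    for userid in viewers:
        # Step 2: look up user_item by userid; the set removes repeat views.
        seen = {other for _, other in user_item.get(userid, [])}
        seen.discard(itemid)  # assumption: do not recommend the item being viewed
        counts.update(seen)   # each viewer contributes at most 1 per item

    # Step 3: sort by viewer count (descending), then by item id for stable ties.
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return ranked[:top_n]


user_item, item_user = build_tables(VIEW_EVENTS)
print(recommend("item1", user_item, item_user))
```

With these sample rows, viewing item1 ranks item2 first (also viewed by both user1 and user3) and item3 second (viewed only by user1). In a real deployment each dictionary lookup would presumably become a read against the corresponding HBase table by row key.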

 

http://f.dataguru.cn/thread-33415-1-1.html


Origin www.cnblogs.com/shujuxiong/p/11261840.html