Redis and Memcached: advantages, disadvantages, and main use cases


  1. Showing the latest list of items
  2. Deletion and filtering
  3. Leaderboards
  4. Sorting by user votes and time
  5. Handling expired items
  6. Counting
  7. Specific items within a specific time window
  8. Real-time analysis of what is happening, for statistics and spam prevention
  9. Pub/Sub
  10. Queues
  11. Caching

1. Problems with the MySQL + Memcached architecture

  MySQL is well suited to storing massive amounts of data, and Memcached is used to load hot data into a cache for faster access. Many companies have used this architecture, but as business data volume and traffic keep growing, several problems appear:

  1. MySQL has to be split into more databases and tables again and again, and Memcached has to be scaled out along with it; this expansion and maintenance work consumes a lot of development time.

  2. Data consistency between Memcached and the MySQL database.

  3. When the Memcached hit rate is low or a cache machine goes down, a large number of requests fall straight through to the DB, and MySQL cannot sustain the load.

  4. Cache synchronization across data centers.

  With so many NoSQL products flourishing, how do we choose?

  In recent years the industry has produced a wide variety of NoSQL products. To use them properly and play to their strengths, we need to study them in depth. In the final analysis, the most important thing is to understand the positioning of each product and its tradeoffs, so that in practice we exploit its strengths and avoid its weaknesses. Broadly, these NoSQL products are used to solve the following problems:

  1. Small amounts of data with high-speed read and write access. Such products keep all data in memory to guarantee fast access, while also providing a way to persist data to disk. This is in fact the main application scenario for Redis.

  2. Massive data storage with distributed-system support, data consistency guarantees, and convenient addition and removal of cluster nodes.

  3. The most representative ideas in this area are set out in the Dynamo and Bigtable papers. The former is a completely decentralized design: nodes exchange cluster information by gossip and data is eventually consistent. The latter is a centralized design that guarantees strong consistency through something like a distributed lock service; data is first written to memory and a redo log, then periodically compacted and merged onto disk, which turns random writes into sequential writes and improves write performance.

  4. Schema-free storage, auto-sharding, and so on. For example, some common document databases are schema-free, store data directly in JSON format, and support features such as auto-sharding; MongoDB is one example.

  Faced with these different types of NoSQL products, we need to choose the most suitable product according to our business scenario.

Redis is best suited to scenarios where all data fits in memory. Although Redis also provides persistence, it is really more of a disk-backed feature and differs considerably from persistence in the traditional sense. This raises a question: Redis looks like an enhanced version of Memcached, so when should we use Memcached and when should we use Redis?

If you simply compare Redis with Memcached, most people arrive at the following points:

1. Redis supports not only simple k/v data but also storage of data structures such as list, set, zset, and hash.
2. Redis supports data backup, i.e. backup in master-slave mode.
3. Redis supports data persistence: data in memory can be saved to disk and loaded again after a restart.

2. Common Redis data types

The most commonly used Redis data types are the following:

  1. String
  2. Hash
  3. List
  4. Set
  5. Sorted set
  6. pub/sub
  7. Transactions

Before describing these data types, let us first look at how Redis's internal memory management represents them.

Internally, Redis uses a redisObject to represent every key and value. The main fields of redisObject are:

type indicates which data type the value object is;

encoding indicates how that data type is stored inside Redis.

For example, type = string means the value stores an ordinary string, and the corresponding encoding can be raw or int. An encoding of int means Redis actually stores and represents the string internally as a numeric type, which of course requires that the string itself can be represented as a number, such as "123" or "456".
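The raw versus int decision can be illustrated with a small Python model. This is only a simplified sketch of the idea, not Redis's actual C implementation; the function name `choose_encoding` and the exact rules (canonical decimal text that fits a signed 64-bit integer) are assumptions made for illustration.

```python
INT64_MIN, INT64_MAX = -2**63, 2**63 - 1

def choose_encoding(value: str) -> str:
    """Return "int" if the string is a canonical 64-bit integer, else "raw"."""
    try:
        number = int(value)
    except ValueError:
        return "raw"
    # Only canonical decimal text qualifies: "007" or " 7" must stay strings,
    # otherwise the original text could not be rebuilt from the integer.
    if str(number) != value or not INT64_MIN <= number <= INT64_MAX:
        return "raw"
    return "int"
```

Under this model "123" and "456" would be stored as integers, while "hello" or "007" would remain raw strings.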

The vm field needs a special note: memory is only really allocated for this field when Redis's virtual memory feature is turned on, and the feature is off by default. We can also see that representing every key/value with a redisObject wastes some memory. This management overhead is paid mainly so that Redis can offer a unified management interface across its different data types, and the author also provides several ways to help reduce memory usage.

3. Use cases and implementation of each data type

Let us go through the use and internal implementation of these seven data types one by one:

  • String:

The String data structure is the simple key-value type. The value is in fact not limited to a string; it can also be a number.

Commonly used commands: set, get, decr, incr, mget and so on.

Scenarios: String is the most commonly used data type, and ordinary key/value storage falls into this category. It can fully replace the functionality of Memcached, and more efficiently. You also get Redis's scheduled persistence, operation log, and replication. Besides the get, set, incr, decr and similar operations that Memcached also offers, Redis provides the following:

    • Get the length of a string

    • Append content to a string

    • Set and get a substring of a string

    • Set and get a single bit of a string

    • Set the contents of a series of strings in a batch

Implementation: a String is stored inside Redis as a string by default and is referenced by a redisObject. When an operation such as incr or decr is applied, the value is converted to a number for the calculation, and the encoding field of the redisObject is then int.

  • Hash

Commonly used commands: hget, hset, hgetall and so on.

Scenario: With Memcached, we often pack structured information into a HashMap, serialize it on the client, and store it as a string value, for example a user's nickname, age, gender, and points. When one item needs to change, we usually have to fetch the whole value, deserialize it, modify the one item, then serialize it and store it back. This not only adds overhead, it is also unsuitable where concurrent operations are possible (for example, two concurrent operations that both need to modify the points). The Redis Hash structure lets you modify a single attribute value, much like updating one column with UPDATE in a database.

Here is a simple example of the Hash use case. Suppose we want to store user objects with the following information:

The user ID is the lookup key, and the stored value is a user object containing name, age, date of birth, and so on. With an ordinary key/value structure there are two main ways to store it:



The first way uses the user ID as the key and stores the remaining information packed into a serialized object. The drawbacks are the serialization/deserialization overhead, the need to fetch the whole object to modify one item, and the need to protect modifications against concurrency, which introduces complications such as CAS.

The second way stores one key-value pair per member of the user object, using user ID + attribute name as the unique key to fetch the corresponding attribute value. This removes the serialization overhead and the concurrency problem, but the user ID is stored repeatedly; with a large amount of such data the wasted memory is considerable.

The Redis Hash solves this problem well. The value of a Redis Hash is actually stored internally as a HashMap, and Redis provides an interface for accessing the members of this Map directly.

In other words, the key is still the user ID and the value is a Map whose keys are the attribute names and whose values are the attribute values. Data can then be read and modified directly through the key of the inner Map (Redis calls the inner Map's key a field): key (user ID) + field (attribute name) is enough to operate on the corresponding attribute. No data is stored redundantly, and the serialization and concurrent-modification problems do not arise.
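As a rough illustration of the key + field idea, here is a minimal Python sketch. It assumes `r` is a redis-py client (`redis.Redis`); the `user:<id>` key layout and the helper names are hypothetical and not taken from the original article.

```python
def save_user(r, user_id, name, age, points=0):
    """Store one user as a Redis hash: key = user:<id>, fields = attributes."""
    r.hset(f"user:{user_id}", mapping={"name": name, "age": age, "points": points})

def add_points(r, user_id, delta):
    """Change a single field atomically (HINCRBY) without touching the others."""
    return r.hincrby(f"user:{user_id}", "points", delta)

def get_name(r, user_id):
    """Read a single field (HGET) instead of deserializing a whole object."""
    return r.hget(f"user:{user_id}", "name")
```

Because HINCRBY runs atomically on the server, two clients adding points at the same time do not need a read-modify-write cycle or CAS.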

Note also that Redis provides an interface (hgetall) that fetches all attribute data at once. If the inner Map has many members, this means traversing the whole Map, and because of Redis's single-threaded model the traversal can be time-consuming, during which requests from other clients get no response at all. This needs particular attention.

Implementation:

As noted above, the value of a Redis Hash corresponds internally to a HashMap, but there are actually two implementations. When the Hash has few members, Redis saves memory by storing it compactly in something like a one-dimensional array rather than a real HashMap structure, and the encoding of the value's redisObject is zipmap. When the number of members grows, it is automatically converted into a real HashMap, and the encoding becomes ht.

  • List

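Commonly used commands: lpush, rpush, lpop, rpop, lrange and so on.

Scenarios:

The Redis list has many use cases and is one of the most important Redis data structures. For example, Twitter-style following lists and follower lists can both be implemented with the Redis list structure.

Lists are linked lists, a structure that anyone with a little knowledge of data structures should understand. With Lists we can easily implement features such as a latest-messages ranking. Another use of Lists is as a message queue: PUSH operations store tasks in a List, and worker threads then use POP operations to take tasks out and execute them. Redis also provides APIs that operate on a segment of a List, so you can directly query or delete the elements in a given range.

Implementation:

A Redis list is implemented as a doubly linked list, which supports reverse lookup and traversal and is more convenient to operate on, at the cost of some extra memory overhead. Many internal Redis implementations, including the send buffer queue, use this same data structure.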

  • Set

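Commonly used commands:

sadd, spop, smembers, sunion and so on.

Scenarios:

A Redis set offers functionality similar to a list, with the difference that a set removes duplicates automatically. When you need to store a collection and do not want duplicates, a set is a good choice. A set also provides an important interface for testing whether a member belongs to the set, which a list cannot offer.

A set is a collection of distinct values. The Sets data structure can store collection-like data: in a microblogging application, for instance, you can keep all the accounts a user follows in one set and all of the user's followers in another. Redis also provides intersection, union, and difference operations on sets, which makes features such as common follows, shared interests, and second-degree friends easy to implement. For all of these set operations you can choose, by using different commands, whether to return the result to the client or store it in a new set.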
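The following/followers example might look like this in Python. It is a sketch under assumptions: `r` is taken to be a redis-py client, and the `following:<id>` and `followers:<id>` key names are hypothetical.

```python
def follow(r, user_id, target_id):
    """Record that user_id follows target_id, in both directions."""
    r.sadd(f"following:{user_id}", target_id)
    r.sadd(f"followers:{target_id}", user_id)

def common_following(r, user_a, user_b):
    """Return the intersection to the client (SINTER)."""
    return r.sinter(f"following:{user_a}", f"following:{user_b}")

def store_common_following(r, user_a, user_b, dest):
    """Store the intersection in a new set instead (SINTERSTORE)."""
    return r.sinterstore(dest, f"following:{user_a}", f"following:{user_b}")
```

The last two helpers show the choice described above: SINTER sends the result back, while SINTERSTORE keeps it on the server under a new key.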

Implementation:

Internally a set is a HashMap whose values are always null. Duplicates are detected quickly by hashing, which is also why a set can test whether a member belongs to it.

  • Sorted Set

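Commonly used commands:

zadd, zrange, zrem, zcard and so on.

Scenarios:

The use cases for a Redis sorted set are similar to those for a set. The difference is that a set is not ordered, whereas a sorted set orders its members by an extra priority (score) parameter supplied by the user, and it stays ordered on insertion, i.e. it sorts automatically. When you need an ordered collection without duplicates, choose the sorted set: a Twitter-style public timeline, for example, can be stored with the posting time as the score, so that it comes back already sorted by time.

Sorted Sets can also serve as a weighted queue: ordinary messages get a score of 1 and important messages a score of 2, and worker threads fetch tasks in descending score order so that important tasks run first.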
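A weighted queue along these lines could be sketched as follows. The assumptions: `r` is a redis-py client, the server supports ZPOPMAX (Redis 5.0 or later), and the key name `tasks` and helper names are hypothetical.

```python
def enqueue(r, task_id, priority=1):
    """Add a task; a higher priority (score) means it is taken earlier."""
    r.zadd("tasks", {task_id: priority})

def dequeue(r):
    """Pop the highest-scored task, or return None if the queue is empty."""
    popped = r.zpopmax("tasks")  # list of (member, score) pairs
    return popped[0][0] if popped else None
```

Note that members with equal scores are ordered by member name, not by insertion time, so this is not strictly first-in-first-out within one priority level.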

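Implementation:

Internally a Redis sorted set uses a HashMap and a skip list (SkipList) to store the data and keep it ordered. The HashMap holds the mapping from member to score, while the skip list holds all the members, ordered by the score stored in the HashMap. The skip list gives fairly efficient lookups and is relatively simple to implement.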

  • Pub/Sub

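Pub/Sub literally means publish and subscribe. In Redis you can publish messages to, and subscribe to messages on, a given key (channel); once a message is published on it, every client subscribed to it receives the message. The most obvious use is a real-time messaging system, such as ordinary instant chat or group chat.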

  • Transactions

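Who says NoSQL never supports transactions? Redis Transactions do not provide strict ACID transactions (for example, if the server crashes while executing a batch of commands committed with EXEC, some commands will have run and the rest will not), but they do provide basic packaging of commands for execution together: as long as the server has no problems, a series of commands is guaranteed to execute sequentially as a unit, with no commands from other clients inserted in between. Redis also provides a WATCH feature: you can WATCH a key and then execute a Transaction, and if the watched value is modified in the meantime, the Transaction detects this and refuses to execute.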
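The WATCH plus MULTI/EXEC pattern can be sketched with redis-py's `transaction` helper, which watches the given keys, runs a callback, and retries when a watched key changed. The assumptions here: `r` is a `redis.Redis` client, and the `transfer_points` function and its key names are hypothetical examples.

```python
def transfer_points(r, src, dst, amount):
    """Move points between two counters as one WATCH-guarded transaction."""
    def txn(pipe):
        # After WATCH the pipeline executes reads immediately.
        balance = int(pipe.get(src) or 0)
        if balance < amount:
            raise ValueError("insufficient points")
        pipe.multi()               # from here on commands are queued
        pipe.decrby(src, amount)
        pipe.incrby(dst, amount)
    # The helper WATCHes src and dst, calls txn, then EXECs the queue.
    return r.transaction(txn, src, dst)
```

If another client modifies `src` or `dst` between the read and EXEC, the queued commands are not applied and the helper runs the callback again with fresh values.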

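4. Practical Redis application scenarios

Redis differs from other database solutions in many ways: it uses memory as its primary storage and the disk only for persistence; its data model is quite unique; and it is single-threaded. Another big difference is that you can use Redis's features in your existing environment without having to switch over to Redis.

Switching to Redis is of course also an option, and many developers use Redis as their primary database from the start. But if your environment is already set up and your application is already running on it, changing the database framework is clearly not easy. Redis is also unsuitable for applications that need very large data sets, because its data set cannot exceed the memory available on the system. So if you have a big-data application with a mainly read-oriented access pattern, Redis is not the right choice.

What I like about Redis, however, is that you can add it to your existing system, and that solves many problems, such as tasks your current database handles slowly. You can optimize these with Redis, or build new features for your application. In this article I want to explore how to add Redis to an existing environment and use its primitives to solve some common problems of traditional environments. In none of these examples is Redis the primary database.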

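1. Showing the latest list of items

The following statement is commonly used to show the latest items. As the data grows, the query will undoubtedly become slower and slower.

SELECT * FROM foo WHERE ... ORDER BY time DESC LIMIT 10

In web applications, queries like "list the latest replies" are very common, and they usually bring scalability problems. This is frustrating, because the items were created in exactly this order, yet producing that order requires a sort.

Problems like this can be solved with Redis. Say our web application wants to list the latest 20 comments posted by users. Next to the latest comments there is a "show all" link that leads to more comments.

We assume every comment in the database has a unique, incrementing ID field.

We can build the home page and the comments page with pagination using the following Redis pattern. Every time a new comment is posted, we add its ID to a Redis list: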

LPUSH latest.comments <ID>


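We trim the list to a fixed length so that Redis only has to keep the latest 5000 comments (LTRIM indexes are inclusive, so 0 to 4999 keeps 5000 entries):

LTRIM latest.comments 0 4999

Every time we need a range of the latest comments, we call a function like this (in pseudocode):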

  FUNCTION get_latest_comments(start, num_items):
      id_list = redis.lrange("latest.comments", start, start + num_items - 1)
      IF id_list.length < num_items
          id_list = SQL_DB("SELECT ... ORDER BY time LIMIT ...")
      END
      RETURN id_list
  END
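The pseudocode above could be written in Python roughly as follows. This is a sketch under assumptions: `r` is a redis-py client, and `fetch_from_db` is a hypothetical callable standing in for the SQL query, whose details the article leaves open.

```python
MAX_CACHED = 5000  # how many of the newest comment IDs Redis keeps

def add_comment(r, comment_id):
    """Push a new comment ID and trim the list to the newest MAX_CACHED."""
    r.lpush("latest.comments", comment_id)
    r.ltrim("latest.comments", 0, MAX_CACHED - 1)  # inclusive indexes

def get_latest_comments(r, fetch_from_db, start, num_items):
    """Serve a page of IDs from Redis; fall back to the database if short."""
    id_list = r.lrange("latest.comments", start, start + num_items - 1)
    if len(id_list) < num_items:
        # Requested range is not fully cached: ask the SQL database.
        id_list = fetch_from_db(start, num_items)
    return id_list
```

A caller would pass its own query function, for example `get_latest_comments(r, query_comment_ids, 0, 20)` for the home page, where `query_comment_ids` is whatever the application uses to run the SQL query.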


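What we do here is simple. Redis holds a resident cache of the latest IDs that is always kept up to date. Because we limit it to 5000 IDs, the function that fetches IDs always asks Redis first, and only needs to go to the database when the start/count parameters fall outside that range.

Our system never "refreshes" the cache in the traditional way; the information in the Redis instance is always consistent. The SQL database (or another kind of on-disk database) is only hit when a user asks for data that is "far back", while the home page and the first comments page never bother the on-disk database.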

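2. Deletion and filtering

We can use LREM to delete a comment. If deletions are rare, another option is simply to skip the comment's entry and report that the comment no longer exists.

Sometimes you want different lists with different filters applied. If the number of filters is limited, you can simply use a separate Redis list for each filter. After all, each list holds only 5000 items, and Redis can handle millions of items using very little memory.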

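3. Leaderboards

Another very common need arises because most databases do not keep their data in memory, so their performance is not ideal for sorting by score and for updates that must happen almost every second in real time.

A typical example is the leaderboard of an online game, such as a Facebook game, where based on scores you usually want to:

- List the top 100 players by score

- Show a given user's current global rank

These operations are trivial for Redis, even with millions of users and millions of new scores per minute.

The pattern is this: every time a new score comes in, we run: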

ZADD leaderboard <score> <username>

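You may want to use the user ID instead of the username, depending on your design.

Getting the top 100 players is simple: ZREVRANGE leaderboard 0 99.

A user's global rank is just as easy: ZREVRANK leaderboard <username> (ZRANK would count from the lowest score instead).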
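Put together, the leaderboard commands might be wrapped like this. The sketch assumes `r` is a redis-py client; the helper names are hypothetical. It uses ZREVRANK so that rank 1 is the highest score, matching the ZREVRANGE call for the top players.

```python
def record_score(r, username, score):
    """ZADD: insert the player or overwrite the previous score."""
    r.zadd("leaderboard", {username: score})

def top_players(r, count=100):
    """ZREVRANGE: the `count` highest scores, best first."""
    return r.zrevrange("leaderboard", 0, count - 1, withscores=True)

def global_rank(r, username):
    """ZREVRANK is 0-based; return a 1-based rank, or None if unknown."""
    rank = r.zrevrank("leaderboard", username)
    return None if rank is None else rank + 1
```

Both lookups run against the sorted set's existing order, so no sort is performed at read time.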

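4. Sorting by user votes and time

A common variant of the leaderboard pattern is the one used by Reddit or Hacker News, where news items are ordered by a score computed with a formula like this:

score = points / time^alpha

User votes thus push a news item up, while time buries it according to some exponent. Here is our pattern; the exact algorithm is of course up to you.

The pattern starts by looking only at items that are likely to be the newest. For example, only the latest 1000 news items are candidates for the home page, so we ignore everything else, which is easy to implement.

Every time a new item is posted, we add its ID to a list using LPUSH + LTRIM, making sure it holds only the latest 1000 items.

A background job fetches this list and continuously computes the final score of each of the 1000 items. The results are written with ZADD into a sorted set in the new order, and old news is removed. The key idea is that the sorting is done by the background job.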
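One pass of such a background job might look like the sketch below. Everything specific in it is an assumption for illustration: `r` is a redis-py client, `load_item` is a hypothetical callable returning `(points, posted_at)` for an ID, the key names are invented, age is measured in hours with a floor of one hour, and alpha defaults to 1.5.

```python
import time

def rank_score(points, posted_at, now, alpha=1.5):
    """score = points / time^alpha, with time as the item's age in hours."""
    age_hours = max((now - posted_at) / 3600.0, 1.0)  # floor avoids tiny divisors
    return points / age_hours ** alpha

def refresh_front_page(r, load_item, now=None, alpha=1.5):
    """Re-score the newest 1000 candidates and replace the ranked set."""
    now = time.time() if now is None else now
    scores = {}
    for item_id in r.lrange("news.latest", 0, 999):
        points, posted_at = load_item(item_id)
        scores[item_id] = rank_score(points, posted_at, now, alpha)
    if scores:
        # Build the new ranking under a temporary key, then swap it in,
        # which also drops items that are no longer candidates.
        r.delete("news.ranked.tmp")
        r.zadd("news.ranked.tmp", scores)
        r.rename("news.ranked.tmp", "news.ranked")
    return scores
```

The home page can then read the ranking with a single ZREVRANGE on `news.ranked`, since all the scoring work has already happened in the background.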

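5. Handling expired items

Another common ordering is by time. We simply use the unix time as the score.

The pattern is:

- Every time a new item is added to our non-Redis database, we also add it to a sorted set. The score is built from the time attributes: current_time plus time_to_live, i.e. the time at which the item expires.

- A background job queries the sorted set with ZRANGE ... WITHSCORES and takes the latest 10 items. If it finds unix times that have already passed, it deletes the corresponding entries from the database.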
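The two steps above might be sketched as follows. The assumptions: `r` is a redis-py client, `delete_from_db` is a hypothetical callable that removes one item from the primary database, the key name `items.expiry` is invented, and the job looks at the 10 entries with the lowest scores, i.e. those that expire soonest.

```python
import time

def track_item(r, item_id, time_to_live, now=None):
    """Score = expiry time (current_time + time_to_live) in unix seconds."""
    now = time.time() if now is None else now
    r.zadd("items.expiry", {item_id: now + time_to_live})

def purge_expired(r, delete_from_db, now=None, batch=10):
    """Check the `batch` soonest-expiring items and delete the expired ones."""
    now = time.time() if now is None else now
    removed = []
    for item_id, expires_at in r.zrange("items.expiry", 0, batch - 1,
                                        withscores=True):
        if expires_at <= now:
            delete_from_db(item_id)
            r.zrem("items.expiry", item_id)
            removed.append(item_id)
    return removed
```

Because the sorted set is ordered by expiry time, the job never has to scan items that are still far from expiring.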

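6. Counting

Redis makes a very good counter, thanks to INCRBY and similar commands.

I am sure you have wanted many times to add new counters to your database, to gather statistics or show new information, only to give them up because they were too write-intensive.

With Redis you no longer need to worry. With atomic increments you can add all kinds of counters, reset them with GETSET, or let them expire.

For example:

INCR user:<id>

EXPIRE user:<id> 60

This counts the page views of a user whose recent pauses between pages never exceed 60 seconds. When the count reaches, say, 20, you can show a banner or whatever else you want to display.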
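The INCR plus EXPIRE pair could be wrapped as below. This is a sketch assuming `r` is a redis-py client; the function names and the threshold parameter are hypothetical.

```python
def count_page_view(r, user_id, window=60):
    """Increment the user's counter and restart its idle timeout."""
    key = f"user:{user_id}"
    views = r.incr(key)      # atomic increment, creates the key at 1
    r.expire(key, window)    # counter disappears after `window` idle seconds
    return views

def should_show_banner(views, threshold=20):
    """True once the user has viewed `threshold` pages without a long pause."""
    return views >= threshold
```

Each page view pushes the expiry forward, so the counter only survives while the user keeps browsing with gaps shorter than the window.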

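7. Specific items within a specific time window

Another thing that is hard for other databases but easy for Redis is counting how many distinct users visited a specific resource during a specific period. For example, I want to know how many distinct registered users or IP addresses accessed a given article.

Every time I get a new page view I just do:

SADD page:day1:<page_id> <user_id>

Of course you may want to replace day1 with a unix time, such as time()-(time()%(3600*24)), and so on.

Want to know the number of distinct users? Just use SCARD page:day1:<page_id>.

Need to test whether a specific user visited the page? SISMEMBER page:day1:<page_id> <user_id>.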
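A Python version of this pattern might look like the following. It assumes `r` is a redis-py client; the helper names are hypothetical, and the day bucket is the unix time rounded down to midnight UTC, as in the expression above.

```python
import time

SECONDS_PER_DAY = 3600 * 24

def day_bucket(now=None):
    """Unix time rounded down to the start of its (UTC) day."""
    now = int(time.time() if now is None else now)
    return now - (now % SECONDS_PER_DAY)

def record_visit(r, page_id, user_id, now=None):
    """SADD: repeated visits by the same user are stored only once."""
    r.sadd(f"page:{day_bucket(now)}:{page_id}", user_id)

def unique_visitors(r, page_id, now=None):
    """SCARD: number of distinct users seen for this page today."""
    return r.scard(f"page:{day_bucket(now)}:{page_id}")

def has_visited(r, page_id, user_id, now=None):
    """SISMEMBER: did this user visit the page today?"""
    return bool(r.sismember(f"page:{day_bucket(now)}:{page_id}", user_id))
```

Each day gets its own set, so old days can be dropped or given an expiry without touching the current one.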

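8. Real-time analysis of what is happening, for statistics and spam prevention

We have shown only a few examples, but if you study the Redis command set and combine the commands, you get a large number of real-time analysis methods that are effective and take very little effort. With Redis primitives it becomes much easier to implement spam filtering systems or other real-time tracking systems.

9. Pub/Sub

Redis Pub/Sub is very simple, stable, and fast. It supports pattern matching and lets clients subscribe to and unsubscribe from channels in real time.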

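10. Queues

You have probably noticed that Redis commands such as list push and list pop make queue operations easy, but there is more: Redis also has blocking variants of list pop that block when the list is empty.

Modern internet applications make heavy use of message queues (messaging). Message queues are used not only for communication between the internal components of a system but also for interaction between the system and other services. Using message queues can improve a system's scalability, flexibility, and user experience. In a system not based on message queues, overall speed is determined by the slowest component (the weakest-link effect). A message queue decouples the components, so the system is no longer bound by its slowest component, and each component can run asynchronously and finish its own work faster.

In addition, when a server is under highly concurrent load, for example writing log files frequently, a message queue can be used to process the work asynchronously and so sustain high-performance concurrent operation.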
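A minimal producer and consumer built on list push and blocking pop could look like this. The sketch assumes `r` is a redis-py client; the queue name `jobs`, the JSON payload format, and the helper names are hypothetical choices.

```python
import json

def enqueue_job(r, payload, queue="jobs"):
    """Producer: push a JSON-encoded job onto the head of the list."""
    r.lpush(queue, json.dumps(payload))

def next_job(r, queue="jobs", timeout=5):
    """Consumer: block up to `timeout` seconds for a job from the tail."""
    item = r.brpop(queue, timeout=timeout)  # (queue_name, value) or None
    return None if item is None else json.loads(item[1])
```

Pushing on one end and popping from the other gives first-in-first-out order, and the blocking pop means idle workers wait on the server instead of polling in a loop.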

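11. Caching

Caching with Redis deserves an article of its own, so I will only touch on it here. Redis can replace memcached and turn your cache from something that can only store data into something that can also update data, so you no longer need to regenerate the data every time.

Reposted from: https://blog.csdn.net/wangxiaoxue99/article/details/78563150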

Origin www.cnblogs.com/jokmangood/p/11705947.html