Understanding and Analyzing the Locust Source Code

Preface

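Many of you probably pick Locust as your load-testing tool. This post starts from the Locust source code and weighs its strengths and weaknesses. The conclusion is at the end: in the end I still chose JMeter.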

The analysis focuses on two files in the Locust source, main.py and runners.py, with a short look at events.py in between.

main.py

This is the program's entry point. The excerpts below include only the core code.

parse_options()

It parses the command-line arguments. Reading it shows which options Locust accepts and roughly what each one does.

"""
Handle command-line options with optparse.OptionParser.

Return list of arguments, largely for use in `parse_arguments`.
"""

# Initialize. -H and --host are equivalent, the default is None, and help is the text shown in the usage message.
parser = OptionParser(usage="locust [options] [LocustClass [LocustClass2 ... ]]")

parser.add_option(
    '-H', '--host',
    dest="host",
    default=None,
    help="Host to load test in the following format: http://10.21.32.33"
)
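To see what this option gives you at runtime, here is a minimal sketch that rebuilds only the -H/--host option quoted above and parses an explicit argument list. The class name MyLocust and the build_parser helper are made up for illustration; the real parse_options defines many more options.

```python
from optparse import OptionParser


def build_parser():
    # Only the -H/--host option from the excerpt; everything else is omitted.
    parser = OptionParser(usage="locust [options] [LocustClass [LocustClass2 ... ]]")
    parser.add_option(
        '-H', '--host',
        dest="host",
        default=None,
        help="Host to load test in the following format: http://10.21.32.33",
    )
    return parser


# Pass an explicit argument list instead of reading sys.argv.
options, args = build_parser().parse_args(["-H", "http://10.21.32.33", "MyLocust"])
print(options.host)  # the value given after -H
print(args)          # leftover positional arguments, here the class names
```

Anything that is not an option ends up in args, which is how the class names from the usage string would reach the program.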

find_locustfile(locustfile) and load_locustfile(path)

Purpose: find and load the locustfile we wrote by hand, i.e. the .py file passed with -f.
Core code (excerpt):

# Perform the import (trimming off the .py)
imported = __import__(os.path.splitext(locustfile)[0])
# Return our two-tuple
locusts = dict(filter(is_locust, vars(imported).items()))

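The code above does two things: 1. it imports the locustfile as a module using the built-in __import__; 2. it collects the test classes defined in that module. is_locust is a predicate that returns a boolean; it checks whether a module attribute is a class derived from Locust that defines a task_set.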
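The filtering step can be tried without Locust installed. The sketch below uses a stand-in Locust base class and an in-memory module in place of a real locustfile; both are assumptions for illustration, and the exact checks in the real is_locust may differ.

```python
import inspect
import types


class Locust(object):
    """Stand-in for Locust's base class (not the real one)."""
    task_set = None


def is_locust(item):
    # item is a (name, value) pair, as produced by vars(module).items().
    name, value = item
    return bool(
        inspect.isclass(value)
        and issubclass(value, Locust)
        and getattr(value, "task_set", None)
        and not name.startswith("_")
    )


class WebsiteUser(Locust):
    task_set = object  # any truthy placeholder stands in for a TaskSet class


class Helper(object):
    pass


# Build a throwaway module in memory instead of importing a file from disk.
fake_locustfile = types.ModuleType("fake_locustfile")
fake_locustfile.Locust = Locust            # excluded: task_set is None
fake_locustfile.WebsiteUser = WebsiteUser  # kept
fake_locustfile.Helper = Helper            # excluded: not a Locust subclass

locusts = dict(filter(is_locust, vars(fake_locustfile).items()))
print(locusts)
```

Only WebsiteUser survives the filter, which is why helper classes and the base class itself never get spawned as users.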

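main()

A long chain of conditional branches that runs different logic depending on the arguments passed in.

The options object represents the parsed arguments.
The locusts object holds our test classes, i.e. the Locust subclasses (each carrying a task_set) found in the locustfile.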

If the --run-time option is used, the following code is called; it schedules the stop on a greenlet:

def timelimit_stop():
    logger.info("Time limit reached. Stopping Locust.")
    runners.locust_runner.quit()
gevent.spawn_later(options.run_time, timelimit_stop)

gevent.spawn_later runs timelimit_stop in a greenlet once options.run_time seconds have passed.

If the --no-web option is not given:

main_greenlet = gevent.spawn(web.start, locust_classes, options)
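This also uses a greenlet, this time to start a web application, which is built on Flask.
locust_classes and options are the arguments of the web application; options carries the host and port.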

If running as master:

# spawn client spawning/hatching greenlet
if options.no_web:
    runners.locust_runner.start_hatching(wait=True)
    main_greenlet = runners.locust_runner.greenlet
if options.run_time:
    spawn_run_time_limit_greenlet()

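This runs the corresponding runner for the master. "Hatching" means spawning, i.e. starting the users.
main_greenlet is the main body of the greenlets. It is a pool of greenlets, a Group(), which I understand as a collection of many tasks (from gevent.pool import Group).
I won't explain coroutines here. One main_greenlet is the main body of the greenlets in one process. If you have a 4-core CPU, the best setup is one per core, and that is achieved by defining and starting 4 slaves yourself; the code does not check any of this.
runners.locust_runner comes from another important file, which is explained later.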

The rest of the code is very similar.
The master runner and the slave runner both inherit from the LocustRunner class, and both are implemented through its methods.

events.py

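This is Locust's event framework. Put simply, you declare a function and add it to a given event.
Any number of functions can be added to the same event, as long as each one accepts the keyword arguments the event is fired with.
Calling the event's fire(self, **kwargs) then invokes every function registered earlier, which completes the trigger.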

class EventHook(object):
    """
    Simple event class used to provide hooks for different types of events in Locust.

    Here's how to use the EventHook class::

        my_event = EventHook()
        def on_my_event(a, b, **kw):
            print "Event was fired with arguments: %s, %s" % (a, b)
        my_event += on_my_event
        my_event.fire(a="foo", b="bar")
    """

    def __init__(self):
        self._handlers = []

    def __iadd__(self, handler):
        self._handlers.append(handler)
        return self

    def __isub__(self, handler):
        self._handlers.remove(handler)
        return self

    def fire(self, **kwargs):
        for handler in self._handlers:
            handler(**kwargs)

# An example
request_success = EventHook()

Example of code that uses it:

# register listener that resets stats when hatching is complete
def on_hatch_complete(user_count):
    self.state = STATE_RUNNING
    if self.options.reset_stats:
        logger.info("Resetting stats\n")
        self.stats.reset_all()
events.hatch_complete += on_hatch_complete

As shown above, events.hatch_complete acts as a chain of handlers to trigger (handlers are added with +=).
It is invoked with the following code:

events.hatch_complete.fire(user_count=self.num_clients)
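The register-then-fire flow is easy to try in isolation. The sketch below copies the EventHook class from the excerpt and wires up two made-up handlers (log_count and tolerant are hypothetical names, not part of Locust).

```python
class EventHook(object):
    """Same logic as the class quoted above."""

    def __init__(self):
        self._handlers = []

    def __iadd__(self, handler):
        self._handlers.append(handler)
        return self

    def __isub__(self, handler):
        self._handlers.remove(handler)
        return self

    def fire(self, **kwargs):
        for handler in self._handlers:
            handler(**kwargs)


hatch_complete = EventHook()
seen = []


def log_count(user_count):
    # Accepts exactly the keyword the event is fired with.
    seen.append("log: %d users" % user_count)


def tolerant(**kwargs):
    # Accepts any keywords, so it keeps working if the event gains arguments.
    seen.append("tolerant: %s" % sorted(kwargs))


hatch_complete += log_count
hatch_complete += tolerant
hatch_complete.fire(user_count=100)  # both handlers run, in registration order

hatch_complete -= log_count
hatch_complete.fire(user_count=5)    # only tolerant is still registered
print(seen)
```

Because fire passes keyword arguments only, a handler whose signature does not accept them raises a TypeError when the event fires, not when it is registered.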

runners.py

weight_locusts(self, amount, stop_timeout = None)

Computes, from the weights, how many users to create for each locust class.

def weight_locusts(self, amount, stop_timeout = None):
    """
    Distributes the amount of locusts for each WebLocust-class according to it's weight
    returns a list "bucket" with the weighted locusts
    """
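    # The return value is a list holding one copy of the locust class per simulated user.
    bucket = []
    # weight_sum is the sum of the weight values of all locust classes that have a task_set.
    weight_sum = sum((locust.weight for locust in self.locust_classes if locust.task_set))
    # There can be several locust classes.
    for locust in self.locust_classes:
        # Skip classes that have no task_set.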
        if not locust.task_set:
            warnings.warn("Notice: Found Locust class (%s) got no task_set. Skipping..." % locust.__name__)
            continue

        if self.host is not None:
            locust.host = self.host
        if stop_timeout is not None:
            locust.stop_timeout = stop_timeout

        # create locusts depending on weight
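        # Inside the loop, locust is one class; percent is its share of the total weight.
        percent = locust.weight / float(weight_sum)
        # For example, with 1000 users this computes how many of them run this class.
        num_locusts = int(round(amount * percent))
        # Copy the class num_locusts times into the result.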
        bucket.extend([locust for x in xrange(0, num_locusts)])
    return bucket
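The arithmetic is worth running once by hand. The sketch below keeps only the weight calculation from weight_locusts; the helper name weighted_counts and the class names are made up for illustration.

```python
def weighted_counts(weights, amount):
    """weights maps a class name to its weight; returns class name -> user count."""
    weight_sum = sum(weights.values())
    return {
        name: int(round(amount * (weight / float(weight_sum))))
        for name, weight in weights.items()
    }


print(weighted_counts({"Browse": 3, "Checkout": 1}, 100))
# {'Browse': 75, 'Checkout': 25}

print(weighted_counts({"A": 1, "B": 1, "C": 1}, 10))
# {'A': 3, 'B': 3, 'C': 3}
```

In the second case the counts add up to 9, not 10, because each share is rounded on its own. That explains why spawn_locusts takes the length of the returned bucket as the real user count.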

spawn_locusts(self, spawn_count=None, stop_timeout=None, wait=False)

It uses sleep between spawns to achieve the effect of starting a given number of users per second.

def spawn_locusts(self, spawn_count=None, stop_timeout=None, wait=False):
    if spawn_count is None:
        spawn_count = self.num_clients

    # The user count after weighting, i.e. the number of users actually spawned.
    bucket = self.weight_locusts(spawn_count, stop_timeout)
    spawn_count = len(bucket)
    if self.state == STATE_INIT or self.state == STATE_STOPPED:
        self.state = STATE_HATCHING
        self.num_clients = spawn_count
    else:
        self.num_clients += spawn_count
    # hatch_rate, per its option help: "The rate per second in which clients are spawned. Only used together with --no-web"
    logger.info("Hatching and swarming %i clients at the rate %g clients/s..." % (spawn_count, self.hatch_rate))
    occurence_count = dict([(l.__name__, 0) for l in self.locust_classes])

    # Define the function that does the hatching
    def hatch():
        sleep_time = 1.0 / self.hatch_rate
        while True:
            if not bucket:
                logger.info("All locusts hatched: %s" % ", ".join(["%s: %d" % (name, count) for name, count in six.iteritems(occurence_count)]))
                events.hatch_complete.fire(user_count=self.num_clients)
                return

            # Pop a random locust class out of the bucket
            locust = bucket.pop(random.randint(0, len(bucket)-1))
            occurence_count[locust.__name__] += 1
            # Define the start function; note that it calls run()
            def start_locust(_):
                try:
                    locust().run()
                except GreenletExit:
                    pass

            # Start the greenlet through the Group()'s spawn
            new_locust = self.locusts.spawn(start_locust, locust)
            if len(self.locusts) % 10 == 0:
                logger.debug("%i locusts hatched" % len(self.locusts))
            # Sleep, i.e. wait for the computed interval.
            gevent.sleep(sleep_time)

    hatch()
    if wait:
        self.locusts.join()
        logger.info("All locusts dead\n")
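The pacing loop can be sketched without gevent. Here spawn and sleep are passed in as plain callables (an assumption made so the example runs instantly); the real code uses self.locusts.spawn and gevent.sleep.

```python
import random


def hatch(bucket, hatch_rate, spawn, sleep):
    """Start one user per iteration, pausing 1/hatch_rate seconds after each."""
    sleep_time = 1.0 / hatch_rate
    slept = 0.0
    while bucket:
        # Pop a random entry, as the original does.
        user_class = bucket.pop(random.randint(0, len(bucket) - 1))
        spawn(user_class)
        sleep(sleep_time)
        slept += sleep_time
    return slept


started = []
total = hatch(
    ["A"] * 6 + ["B"] * 4,
    hatch_rate=5,
    spawn=started.append,    # stand-in for self.locusts.spawn(...)
    sleep=lambda secs: None  # stand-in for gevent.sleep; does not actually wait
)
print(len(started), "users started, about", round(total, 2), "seconds of sleeping")
```

With 10 users and a hatch rate of 5 the loop sleeps about 2 seconds in total, so ramp-up time is roughly the user count divided by the hatch rate.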

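kill_locusts(self, kill_count)

1. Compute from the weights how many users of each class to kill.
2. Each user picked for killing is stopped in the greenlet pool, and its entry is removed from the weighted bucket.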

bucket = self.weight_locusts(kill_count)
kill_count = len(bucket)
self.num_clients -= kill_count
logger.info("Killing %i locusts" % kill_count)
dying = []
for g in self.locusts:
    for l in bucket:
        if l == g.args[0]:
            dying.append(g)
            bucket.remove(l)
            break
for g in dying:
    self.locusts.killone(g)
# Wrap-up: mainly notifies the web page and writes logs
events.hatch_complete.fire(user_count=self.num_clients)
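The matching loop is the least obvious part, so here is a sketch with fake greenlets. FakeGreenlet and pick_dying are made-up names; only the args attribute, which the excerpt reads as g.args[0], is modelled.

```python
class FakeGreenlet(object):
    """Stand-in for a gevent greenlet; only .args is modelled."""

    def __init__(self, user_class):
        self.args = (user_class,)


def pick_dying(running, bucket):
    """Return one running greenlet per bucket entry, matched by class."""
    bucket = list(bucket)  # work on a copy so the caller's list is untouched
    dying = []
    for g in running:
        for cls in bucket:
            if cls == g.args[0]:
                dying.append(g)
                bucket.remove(cls)  # each bucket entry kills exactly one user
                break
    return dying


running = [FakeGreenlet(c) for c in ["A", "A", "A", "B", "B"]]
dying = pick_dying(running, ["A", "B"])
print([g.args[0] for g in dying])
# ['A', 'B']
```

The first running greenlet of each requested class is chosen, so which users die depends on iteration order over the pool, not on age or load.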

Some characteristics of Locust and my thoughts, compared with JMeter

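Anyone who has done performance testing knows JMeter is a tool you cannot avoid. So how does Locust compare with it?

JMeter is updated almost every day, while Locust gets hardly any updates.

Locust's charts are implemented on the front end, in chart.js (LocustLineChart), and are fairly basic.
JMeter can show charts by installing plugins, and those are basic too.

JMeter also relies on plugins to monitor server-side performance metrics, again basic.
Locust has nothing for this.

Locust has no test report either.
JMeter has supported report generation since 3.0, but it has serious flaws.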

Test cases:

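Python scripting is the highlight; after all, code can implement any requirement.
But the drawbacks are obvious:
1. There is no utility package, so complex test cases take a lot of code to write and a lot of effort to maintain, and they test your coding skill.
2. There is no way to record or save test cases. Even the recording and saving supported by HttpRunner only produces basic cases.
In practice, performance-testing essentials such as parameterization still have to be hand-written in Python.
For testing work on a tight schedule, choosing Locust clearly means hitting a wall.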

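JMeter is clearly much better here. Its GUI is simple and clear, and its many built-in functions help you write scripts.
Even when a test case gets complicated, it offers BeanShell, so you can implement the logic in Java code (although debugging is a pain).
JMeter also has plugins for all kinds of protocols, which is nice.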

Concurrency

With 4 slaves, Locust generated a load of about 1.3k, while JMeter reached 13k, a 10x difference.

Locust is too weak as a load generator. The conclusion from my experiments is that a single core can sustain only about 500 RPS.
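As a rough planning aid, the per-core figure can be turned into a core count. This sketch assumes load scales linearly with cores and one slave process per core, which real tests may not achieve; the 500 RPS default is the measurement quoted above, and the target values are only examples.

```python
import math


def cores_needed(target_rps, rps_per_core=500):
    # Assumes linear scaling, one slave process per core.
    return int(math.ceil(target_rps / float(rps_per_core)))


print(cores_needed(13000))  # 26
print(cores_needed(1300))   # 3
```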

Conclusion: think carefully, very carefully, before using Locust.