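Rasa Chinese Chatbot

Environment: Ubuntu 16.04

This post mainly walks through running the GitHub project https://github.com/zqhZY/_rasa_chatbot , which answers questions about mobile carrier services. The project ships with its own training data.

If you spot any problems, corrections are welcome. Thank you very much!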

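1. Rasa introduction

Rasa is an open-source toolkit consisting of Rasa Core and Rasa NLU. Official site: https://rasa.com/

Rasa NLU handles intent recognition and entity extraction. It turns a raw sentence such as "I am looking for a Mexican restaurant in the center of town" into structured data like this: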

{
  "intent": "search_restaurant",
  "entities": {
    "cuisine" : "Mexican",
    "location" : "center"
  }
}
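
To make the structure concrete, here is a minimal sketch of reading the intent and entities out of a result shaped like the example above. It uses only the standard library. The helper name summarize_nlu_result is hypothetical, and the real output of a given Rasa NLU version may carry more fields or a different layout.

```python
import json

# The example result shown above, as a JSON string.
raw = """
{
  "intent": "search_restaurant",
  "entities": {
    "cuisine": "Mexican",
    "location": "center"
  }
}
"""


def summarize_nlu_result(text):
    """Return (intent, entities) from a result shaped like the example."""
    result = json.loads(text)
    return result["intent"], result.get("entities", {})


intent, entities = summarize_nlu_result(raw)
print(intent)
for name, value in entities.items():
    print(name, "=", value)
```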

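Rasa Core handles configuring and training the dialogue flow.

2. Running the GitHub project above

First, follow the steps in the project's readme.md in order: install the dependencies ("install dependency"), then train nlu model, then test rasa nlu.

The train-nlu step failed with an error saying that the file "models/ivr/demo/metadata.json" does not exist. So I pulled train_nlu() out of bot.py, ran it on its own, and changed it as follows:

# Train the intent recognition and entity extraction model and save it under models/ivr/demo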
def train_nlu():
    from rasa_nlu.training_data import load_data
    from rasa_nlu import config
    from rasa_nlu.model import Trainer

    training_data = load_data("data/mobile_nlu_data.json")
    trainer = Trainer(config.load("mobile_nlu_model_config.json"))
    trainer.train(training_data)
    model_directory = trainer.persist("models/", project_name="ivr", fixed_model_name="demo")

    return model_directory

After running it, check that "models/ivr/demo/metadata.json" has been generated.
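
The check can also be scripted. The sketch below only tests whether metadata.json is present in the model folder; the helper name nlu_model_exists is hypothetical, and the default path assumes the script runs from the project root.

```python
import os


def nlu_model_exists(model_dir="models/ivr/demo"):
    """Return True if metadata.json is present in the persisted model folder."""
    return os.path.isfile(os.path.join(model_dir, "metadata.json"))


if nlu_model_exists():
    print("NLU model found")
else:
    print("metadata.json is missing; run train_nlu() first")
```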

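Running python bot.py train-dialogue then failed with an error saying that Agent.train() does not support the featurizer and max_history arguments. I do not know the proper fix for this error. As a workaround, I changed train_dialogue() so that max_history and the featurizer are set on the policies instead: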


from rasa_core.agent import Agent
from rasa_core.policies.keras_policy import KerasPolicy
from rasa_core.policies.memoization import MemoizationPolicy
from rasa_core.featurizers import MaxHistoryTrackerFeaturizer, BinarySingleStateFeaturizer

def train_dialogue(domain_file="mobile_domain.yml",
                   model_path="models/dialogue",
                   training_data_file="data/stories1.md"):
    # max_history and the featurizer are set on the policies, not passed to Agent.train()
    agent = Agent(domain_file,
                  policies=[MemoizationPolicy(max_history=6),
                            KerasPolicy(MaxHistoryTrackerFeaturizer(BinarySingleStateFeaturizer(), max_history=6))])
    training_data = agent.load_data(training_data_file)
    # Train the agent's policies
    agent.train(training_data, epochs=100)
    agent.persist(model_path)
    return agent

The dialogue model is saved under the "models/dialogue" folder.
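
Once both models exist, they can be loaded together from Python. The following is a minimal sketch, assuming a rasa_core release of the same generation as the code above, in which Agent.load, RasaNLUInterpreter and Agent.handle_message are available. The helper names load_agent and reply_to are hypothetical, and the shape of the returned reply may differ between versions.

```python
def load_agent(dialogue_model="models/dialogue", nlu_model="models/ivr/demo"):
    """Load the trained dialogue model together with the NLU interpreter."""
    # Imported inside the function, like train_nlu() above, so this file
    # can still be imported where rasa_core is not installed.
    from rasa_core.agent import Agent
    from rasa_core.interpreter import RasaNLUInterpreter

    return Agent.load(dialogue_model, interpreter=RasaNLUInterpreter(nlu_model))


def reply_to(agent, text="你好"):
    """Send one message to the agent and return whatever it answers."""
    return agent.handle_message(text)
```

With rasa_core installed and both models trained, calling reply_to(load_agent()) should return the bot's answer to "你好".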

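Run python bot.py online_train for online training: you can type sentences, correct the intent and action that come back, and save the result to stories.md.

Run python bot.py run to chat with the bot and test it.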

3. Using the HTTP API [Reference: http://www.rasa.com/docs/core/server/]

Run python -m rasa_core.run --enable_api -d models/dialogue -u models/ivr/demo -o out.log

Where:

--enable_api: enables the additional API endpoints
-d: path to the Rasa Core model
-u: path to the Rasa NLU model
-o: path to the log file

Run curl -XPOST localhost:5005/conversations/default/parse -d '{"query":"你好"}'

This shows the intent and entity recognition results. Here default is the user ID, which you can change, for example:

curl -XPOST localhost:5005/conversations/1/parse -d '{"query":"你好"}'

Run curl -XPOST localhost:5005/conversations/default/respond -d '{"query":"你好"}'

This shows the bot's reply to "你好" ("hello").
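
The same calls can be made from Python, which may be a useful starting point for a web front end. This is a minimal sketch using only the standard library. The URL layout and the {"query": ...} body are taken from the curl commands above; the helper names build_request and call_bot are hypothetical, and host and port are parameters so they can be matched to however the server was started.

```python
import json
import urllib.request


def build_request(endpoint, query, sender_id="default", host="localhost", port=5005):
    """Build the POST request for /conversations/<sender_id>/<endpoint>."""
    url = "http://{}:{}/conversations/{}/{}".format(host, port, sender_id, endpoint)
    body = json.dumps({"query": query}).encode("utf-8")
    return urllib.request.Request(url, data=body, method="POST")


def call_bot(endpoint, query, sender_id="default"):
    """Send the request and decode the JSON reply (needs the server running)."""
    with urllib.request.urlopen(build_request(endpoint, query, sender_id)) as resp:
        return json.loads(resp.read().decode("utf-8"))


# Building the request does not contact the server, so this runs anywhere.
req = build_request("respond", "你好")
print(req.full_url)
```

With the server from the previous step running, call_bot("parse", "你好") and call_bot("respond", "你好") correspond to the two kinds of curl request.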

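Later I will try to build a front-end web page on top of the HTTP API. To be updated.

4. Shortcomings

The shortcomings I have run into so far:

Stories have to anticipate many possible paths and must be defined by hand, and a conversation cannot switch freely between stories, so it is not feasible to define every possible flow manually.

I do not know how to replace intent and entity recognition with a model I trained myself, and when using Rasa NLU I do not know how to improve its accuracy.

With the HTTP API, once a user has sent some input, I do not know how to clear that user's state so that the conversation starts over from the beginning of the story-defined flow.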
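
One possible workaround, which I have not confirmed against the Rasa documentation: since a conversation is addressed by the user ID in the URL (default or 1 in the curl examples), a client could switch to a new, unused ID whenever it wants to start over. This assumes the server keeps state per ID and starts a fresh conversation for an ID it has not seen before. It does not clear the old user's state, it only sidesteps it. The helper name new_conversation_url is hypothetical.

```python
import uuid


def new_conversation_url(endpoint="respond", host="localhost", port=5005):
    """Return a URL whose user ID has not been used before."""
    sender_id = uuid.uuid4().hex  # random ID, so no earlier state is attached to it
    return "http://{}:{}/conversations/{}/{}".format(host, port, sender_id, endpoint)


first = new_conversation_url()
second = new_conversation_url()
print(first)
print(second)
```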
