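Rasa Study Notes (2): Training a Chinese Dialogue System

The previous post showed how to train a question-answering bot, but that setup no longer works when the training corpus is Chinese, presumably because Chinese text has no spaces between words and needs a dedicated tokenizer. The fix I tried: add the following lines to config.yml and train again.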

    - name: JiebaTokenizer
    - name: CRFEntityExtractor
    - name: CountVectorsFeaturizer
      OOV_token: oov
      token_pattern: '(?u)\b\w+\b'
    - name: EmbeddingIntentClassifier
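
The `token_pattern` option deserves a note. The sketch below uses only Python's `re` module to compare the pattern from the config with scikit-learn's default `CountVectorizer` pattern, which requires at least two word characters per token. It assumes the featurizer would otherwise fall back to a scikit-learn-style default, and the space-separated sentence is a hand-segmented, hypothetical stand-in for tokenizer output, not something produced by running JiebaTokenizer. It only illustrates what each regex matches, not how Rasa applies it internally.

```python
import re

# Default pattern in scikit-learn's CountVectorizer: a token needs 2+ word characters.
DEFAULT_PATTERN = r"(?u)\b\w\w+\b"
# Pattern from the config above: single-character tokens are kept as well.
CONFIG_PATTERN = r"(?u)\b\w+\b"

# Hypothetical, hand-segmented sentence (tokens joined by spaces);
# an assumption for illustration, not real JiebaTokenizer output.
segmented = "我 想 去 北京 玩"

default_tokens = re.findall(DEFAULT_PATTERN, segmented)
config_tokens = re.findall(CONFIG_PATTERN, segmented)

print(default_tokens)  # ['北京']
print(config_tokens)   # ['我', '想', '去', '北京', '玩']
```

Many Chinese words are a single character, so a pattern that demands two or more characters would silently drop them before they reach the classifier.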

Running the training with this configuration, however, produces the following error:

    ComponentNotFoundException: Failed to load the component 'EmbeddingIntentClassifier'. Cannot find class 'EmbeddingIntentClassifier' in global namespace. Please check that there is no typo in the class name and that you have imported the class into the global namespace. Either your pipeline configuration contains an error or the module you are trying to import is broken (e.g. the module is trying to import a package that is not installed).
    Traceback (most recent call last):
      File "d:\programs\python\python38\lib\site-packages\rasa\nlu\registry.py", line 121, in get_component_class
        return rasa.shared.utils.common.class_from_module_path(component_name)
      File "d:\programs\python\python38\lib\site-packages\rasa\shared\utils\common.py", line 45, in class_from_module_path
        raise ImportError(f"Cannot retrieve class from path {module_path}.")
    ImportError: Cannot retrieve class from path EmbeddingIntentClassifier.

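While debugging I also came across EmbeddingBertIntentClassifier, which I will set aside for now. After a long read through the documentation, I found that Rasa stopped using EmbeddingIntentClassifier as of version 1.8. The relevant note reads: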

## Rasa 1.7 to Rasa 1.8

* The Embedding Intent Classifier is now deprecated and will be replaced by [DIETClassifier](./components.mdx#dietclassifier)
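
To make the version dependence concrete, here is a minimal sketch that picks a classifier name from a Rasa version string. The helper names `intent_classifier_for` and `installed_rasa_version` are hypothetical, and the 1.8 cutoff is taken only from the note quoted above. It is not part of Rasa and does not inspect which components a given install actually ships.

```python
from importlib.metadata import PackageNotFoundError, version


def intent_classifier_for(rasa_version: str) -> str:
    """Pick a classifier name for config.yml from a 'major.minor[.patch]' string."""
    major, minor = (int(part) for part in rasa_version.split(".")[:2])
    # Assumption from the quoted note: DIETClassifier replaces the old class from 1.8 on.
    if (major, minor) >= (1, 8):
        return "DIETClassifier"
    return "EmbeddingIntentClassifier"


def installed_rasa_version():
    """Return the installed Rasa version string, or None if Rasa is not installed."""
    try:
        return version("rasa")
    except PackageNotFoundError:
        return None


found = installed_rasa_version()
if found is None:
    print("rasa is not installed in this environment")
else:
    print(found, "->", intent_classifier_for(found))
```

Checking the installed version first would have pointed at the cause sooner than reading the traceback.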

I replaced EmbeddingIntentClassifier with DIETClassifier, so that the configuration reads as follows, and trained again.

    - name: JiebaTokenizer
    - name: CRFEntityExtractor
    - name: CountVectorsFeaturizer
      OOV_token: oov
      token_pattern: '(?u)\b\w+\b'
    - name: DIETClassifier

With the resulting model I ran a few tests. The conversation is not entirely satisfactory, but the bot now handles Chinese input.

Hey! How are you?
Your input ->  我想去北京玩
想去玩哪些地方呢?
Your input ->  北京怎么样
想去玩哪些地方呢?
Your input ->  南锣鼓巷
Your input ->  从来都没去过南锣鼓巷附近。
Your input ->  你好
Bye
Your input ->  你会
Bye
Your input ->  你好
Bye
Your input ->  hello
Hey! How are you?
Your input ->  想去北京玩
想去玩哪些地方呢?
Your input ->  从来都没去过南锣鼓巷附近。
包括南锣鼓巷、什刹海、烟袋斜街、鼓楼大街等等。

Reposted from blog.csdn.net/dragon_T1985/article/details/114390912