RASA Selector Component (ResponseSelector)

Selector components predict a bot response from a set of candidate responses.

ResponseSelector

The parsed output from NLU will have a property named response_selector that contains the output of every response selector component. Each response selector is identified by its retrieval_intent parameter and stores two properties:

  1. response: The predicted response key under the corresponding retrieval intent, the prediction confidence, and the associated responses.

  2. ranking: A ranking with the confidences of the top 10 candidate response keys.

ResponseSelector takes sparse and/or dense features of the user message as input. Its output is a dictionary whose keys are the retrieval intents of the response selectors and whose values contain the predicted responses, the confidence, and the response key under the retrieval intent.

{
    "response_selector": {
      "faq": {
        "response": {
          "id": 1388783286124361986,
          "confidence": 0.7,
          "intent_response_key": "chitchat/ask_weather",
          "responses": [
            {
              "text": "It's sunny in Berlin today",
              "image": "https://i.imgur.com/nGF1K8f.jpg"
            },
            {
              "text": "I think it's about to rain."
            }
          ],
          "utter_action": "utter_chitchat/ask_weather"
         },
        "ranking": [
          {
            "id": 1388783286124361986,
            "confidence": 0.7,
            "intent_response_key": "chitchat/ask_weather"
          },
          {
            "id": 1388783286124361986,
            "confidence": 0.3,
            "intent_response_key": "chitchat/ask_name"
          }
        ]
      }
    }
}

If the retrieval_intent parameter of a particular response selector is left at its default value, the corresponding response selector will be identified as default in the returned output.

{
    "response_selector": {
      "default": {
        "response": {
          "id": 1388783286124361986,
          "confidence": 0.7,
          "intent_response_key": "chitchat/ask_weather",
          "responses": [
            {
              "text": "It's sunny in Berlin today",
              "image": "https://i.imgur.com/nGF1K8f.jpg"
            },
            {
              "text": "I think it's about to rain."
            }
          ],
          "utter_action": "utter_chitchat/ask_weather"
         },
        "ranking": [
          {
            "id": 1388783286124361986,
            "confidence": 0.7,
            "intent_response_key": "chitchat/ask_weather"
          },
          {
            "id": 1388783286124361986,
            "confidence": 0.3,
            "intent_response_key": "chitchat/ask_name"
          }
        ]
      }
    }
}

The response selector component can be used to build a response retrieval model that directly predicts a bot response from a set of candidate responses. The dialogue manager uses the prediction of this model to utter the predicted response. The component embeds user inputs and response labels into the same space and follows the exact same neural network architecture and optimization as the DIETClassifier.

To use this component, your training data should contain retrieval intents. To define these, see the documentation on NLU training examples and the documentation on defining response utterances for retrieval intents.
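As a minimal sketch, retrieval intents are written as intent/response_key in the NLU data, the matching responses go under utter_<retrieval intent>/<response key> in the domain, and the ResponseSelector is added to the pipeline after a tokenizer and featurizer. The intent/response keys and the Berlin weather text below come from the output example above; the additional training sentences and the name response are illustrative assumptions:

# nlu.yml -- retrieval-intent training examples (sentences are illustrative)
nlu:
- intent: chitchat/ask_weather
  examples: |
    - What's the weather like today?
    - Is it sunny outside?
- intent: chitchat/ask_name
  examples: |
    - What is your name?
    - Who are you?

# domain.yml -- responses keyed by utter_<retrieval intent>/<response key>
responses:
  utter_chitchat/ask_weather:
  - text: "It's sunny in Berlin today"
  utter_chitchat/ask_name:
  - text: "You can call me Sara."      # illustrative response text

# config.yml -- ResponseSelector needs tokenized and featurized user messages
pipeline:
  - name: WhitespaceTokenizer
  - name: CountVectorsFeaturizer
  - name: ResponseSelector
    epochs: 100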

The algorithm includes almost all the hyperparameters that DIETClassifier uses. If you want to adapt your model, start by modifying the following parameters (a combined configuration sketch follows the parameter descriptions):

epochs: This parameter sets the number of times the algorithm will see the training data (default: 300). One epoch equals one forward pass and one backward pass over all training examples. Sometimes the model needs more epochs to learn properly; sometimes more epochs do not influence performance. The lower the number of epochs, the faster the model is trained.

hidden_layers_sizes: This parameter lets you define the number of feed-forward layers and their output dimensions for user messages and intents (default: text: [], label: []). Every entry in the list corresponds to a feed-forward layer. For example, if you set text: [256, 128], two feed-forward layers are added in front of the transformer. The vectors of the input tokens (coming from the user message) are passed to these layers. The first layer has an output dimension of 256 and the second layer 128. If an empty list is used (the default behaviour), no feed-forward layer is added. Make sure to use only positive integer values. Usually, powers of two are used. It is also common practice to have decreasing values in the list: the next value is smaller than or equal to the value before it.

embedding_dimension: This parameter defines the output dimension of the embedding layers used inside the model (default: 20). Multiple embedding layers are used in the model architecture. For example, the vectors of the complete utterance and the intent are passed through an embedding layer before they are compared and the loss is calculated.

number_of_transformer_layers: This parameter sets the number of transformer layers to use (default: 2). The number of transformer layers corresponds to the transformer blocks used for the model.

transformer_size: This parameter sets the number of units in the transformer (default: 256). The vectors coming out of the transformer will have the given transformer_size.

connection_density: This parameter defines the fraction of kernel weights that are set to non-zero values for all feed-forward layers in the model (default: 0.2). The value should be between 0 and 1. If you set connection_density to 1, no kernel weights will be set to 0 and the layer acts as a standard feed-forward layer. You should not set connection_density to 0, as this would result in all kernel weights being 0, i.e. the model would not be able to learn.

constrain_similarities: When this parameter is set to True, a sigmoid cross entropy loss is applied over all similarity terms. This helps keep the similarities between the input and negative labels at smaller values and should help the model generalize better to real-world test sets.

model_confidence: This parameter lets the user configure how confidences are computed during inference. It can take only one value, softmax. With softmax, confidences are in the range [0, 1]. The computed similarities are normalized with the softmax activation function.
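As a rough sketch only (the concrete values below are illustrative, not tuned recommendations), these hyperparameters are set directly on the ResponseSelector entry in config.yml:

# config.yml -- illustrative hyperparameter values for ResponseSelector
pipeline:
  - name: ResponseSelector
    epochs: 200                        # passes over the training data
    hidden_layers_sizes:
      text: [256, 128]                 # two feed-forward layers before the transformer
      label: []                        # no extra feed-forward layers for labels
    embedding_dimension: 20
    number_of_transformer_layers: 2
    transformer_size: 256
    connection_density: 0.2            # fraction of non-zero kernel weights
    constrain_similarities: True       # sigmoid cross entropy over all similarity terms
    model_confidence: softmax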

+---------------------------------+------------------+--------------------------------------------------------------+
| Parameter                       | Default Value    | Description                                                  |
+=================================+==================+==============================================================+
| hidden_layers_sizes             | text: []         | Hidden layer sizes for layers before the embedding layers    |
|                                 | label: []        | for user messages and labels. The number of hidden layers is |
|                                 |                  | equal to the length of the corresponding list.               |
+---------------------------------+------------------+--------------------------------------------------------------+
| share_hidden_layers             | False            | Whether to share the hidden layer weights between user       |
|                                 |                  | messages and labels.                                         |
+---------------------------------+------------------+--------------------------------------------------------------+
| transformer_size                | 256              | Number of units in transformer.                              |
+---------------------------------+------------------+--------------------------------------------------------------+
| number_of_transformer_layers    | 2                | Number of transformer layers.                                |
+---------------------------------+------------------+--------------------------------------------------------------+
| number_of_attention_heads       | 4                | Number of attention heads in transformer.                    |
+---------------------------------+------------------+--------------------------------------------------------------+
| use_key_relative_attention      | False            | If 'True' use key relative embeddings in attention.          |
+---------------------------------+------------------+--------------------------------------------------------------+
| use_value_relative_attention    | False            | If 'True' use value relative embeddings in attention.        |
+---------------------------------+------------------+--------------------------------------------------------------+
| max_relative_position           | None             | Maximum position for relative embeddings.                    |
+---------------------------------+------------------+--------------------------------------------------------------+
| unidirectional_encoder          | False            | Use a unidirectional or bidirectional encoder.               |
+---------------------------------+------------------+--------------------------------------------------------------+
| batch_size                      | [64, 256]        | Initial and final value for batch sizes.                     |
|                                 |                  | Batch size will be linearly increased for each epoch.        |
|                                 |                  | If constant `batch_size` is required, pass an int, e.g. `8`. |
+---------------------------------+------------------+--------------------------------------------------------------+
| batch_strategy                  | "balanced"       | Strategy used when creating batches.                         |
|                                 |                  | Can be either 'sequence' or 'balanced'.                      |
+---------------------------------+------------------+--------------------------------------------------------------+
| epochs                          | 300              | Number of epochs to train.                                   |
+---------------------------------+------------------+--------------------------------------------------------------+
| random_seed                     | None             | Set random seed to any 'int' to get reproducible results.    |
+---------------------------------+------------------+--------------------------------------------------------------+
| learning_rate                   | 0.001            | Initial learning rate for the optimizer.                     |
+---------------------------------+------------------+--------------------------------------------------------------+
| embedding_dimension             | 20               | Dimension size of embedding vectors.                         |
+---------------------------------+------------------+--------------------------------------------------------------+
| dense_dimension                 | text: 128        | Dense dimension for sparse features to use.                  |
|                                 | label: 20        |                                                              |
+---------------------------------+------------------+--------------------------------------------------------------+
| concat_dimension                | text: 128        | Concat dimension for sequence and sentence features.         |
|                                 | label: 20        |                                                              |
+---------------------------------+------------------+--------------------------------------------------------------+
| number_of_negative_examples     | 20               | The number of incorrect labels. The algorithm will minimize  |
|                                 |                  | their similarity to the user input during training.          |
+---------------------------------+------------------+--------------------------------------------------------------+
| similarity_type                 | "auto"           | Type of similarity measure to use, either 'auto' or 'cosine' |
|                                 |                  | or 'inner'.                                                  |
+---------------------------------+------------------+--------------------------------------------------------------+
| loss_type                       | "cross_entropy"  | The type of the loss function, either 'cross_entropy'        |
|                                 |                  | or 'margin'. If type 'margin' is specified,                  |
|                                 |                  | "model_confidence=cosine" will be used which is deprecated   |
|                                 |                  | as of 2.3.4. See footnote (1).                               |
+---------------------------------+------------------+--------------------------------------------------------------+
| ranking_length                  | 10               | Number of top intents to report. Set to 0 to report all      |
|                                 |                  | intents.                                                     |
+---------------------------------+------------------+--------------------------------------------------------------+
| renormalize_confidences         | False            | Normalize the reported top intents. Applicable only with loss|
|                                 |                  | type 'cross_entropy' and 'softmax' confidences.              |
+---------------------------------+------------------+--------------------------------------------------------------+
| maximum_positive_similarity     | 0.8              | Indicates how similar the algorithm should try to make       |
|                                 |                  | embedding vectors for correct labels.                        |
|                                 |                  | Should be 0.0 < ... < 1.0 for 'cosine' similarity type.      |
+---------------------------------+------------------+--------------------------------------------------------------+
| maximum_negative_similarity     | -0.4             | Maximum negative similarity for incorrect labels.            |
|                                 |                  | Should be -1.0 < ... < 1.0 for 'cosine' similarity type.     |
+---------------------------------+------------------+--------------------------------------------------------------+
| use_maximum_negative_similarity | True             | If 'True' the algorithm only minimizes maximum similarity    |
|                                 |                  | over incorrect intent labels, used only if 'loss_type' is    |
|                                 |                  | set to 'margin'.                                             |
+---------------------------------+------------------+--------------------------------------------------------------+
| scale_loss                      | False            | Scale loss inverse proportionally to confidence of correct   |
|                                 |                  | prediction.                                                  |
+---------------------------------+------------------+--------------------------------------------------------------+
| regularization_constant         | 0.002            | The scale of regularization.                                 |
+---------------------------------+------------------+--------------------------------------------------------------+
| negative_margin_scale           | 0.8              | The scale of how important it is to minimize the maximum     |
|                                 |                  | similarity between embeddings of different labels.           |
+---------------------------------+------------------+--------------------------------------------------------------+
| connection_density              | 0.2              | Connection density of the weights in dense layers.           |
|                                 |                  | Value should be between 0 and 1.                             |
+---------------------------------+------------------+--------------------------------------------------------------+
| drop_rate                       | 0.2              | Dropout rate for encoder. Value should be between 0 and 1.   |
|                                 |                  | The higher the value the higher the regularization effect.   |
+---------------------------------+------------------+--------------------------------------------------------------+
| drop_rate_attention             | 0.0              | Dropout rate for attention. Value should be between 0 and 1. |
|                                 |                  | The higher the value the higher the regularization effect.   |
+---------------------------------+------------------+--------------------------------------------------------------+
| use_sparse_input_dropout        | True             | If 'True' apply dropout to sparse input tensors.             |
+---------------------------------+------------------+--------------------------------------------------------------+
| use_dense_input_dropout         | True             | If 'True' apply dropout to dense input tensors.              |
+---------------------------------+------------------+--------------------------------------------------------------+
| evaluate_every_number_of_epochs | 20               | How often to calculate validation accuracy.                  |
|                                 |                  | Set to '-1' to evaluate just once at the end of training.    |
+---------------------------------+------------------+--------------------------------------------------------------+
| evaluate_on_number_of_examples  | 0                | How many examples to use for hold out validation set.        |
|                                 |                  | Large values may hurt performance, e.g. model accuracy.      |
+---------------------------------+------------------+--------------------------------------------------------------+
| intent_classification           | True             | If 'True' intent classification is trained and intents are   |
|                                 |                  | predicted.                                                   |
+---------------------------------+------------------+--------------------------------------------------------------+
| entity_recognition              | True             | If 'True' entity recognition is trained and entities are     |
|                                 |                  | extracted.                                                   |
+---------------------------------+------------------+--------------------------------------------------------------+
| use_masked_language_model       | False            | If 'True' random tokens of the input message will be masked  |
|                                 |                  | and the model has to predict those tokens. It acts like a    |
|                                 |                  | regularizer and should help to learn a better contextual     |
|                                 |                  | representation of the input.                                 |
+---------------------------------+------------------+--------------------------------------------------------------+
| tensorboard_log_directory       | None             | If you want to use tensorboard to visualize training         |
|                                 |                  | metrics, set this option to a valid output directory. You    |
|                                 |                  | can view the training metrics after training in tensorboard  |
|                                 |                  | via 'tensorboard --logdir <path-to-given-directory>'.        |
+---------------------------------+------------------+--------------------------------------------------------------+
| tensorboard_log_level           | "epoch"          | Define when training metrics for tensorboard should be       |
|                                 |                  | logged. Either after every epoch ('epoch') or for every      |
|                                 |                  | training step ('batch').                                     |
+---------------------------------+------------------+--------------------------------------------------------------+
| featurizers                     | []               | List of featurizer names (alias names). Only features        |
|                                 |                  | coming from the listed names are used. If list is empty      |
|                                 |                  | all available features are used.                             |
+---------------------------------+------------------+--------------------------------------------------------------+
| checkpoint_model                | False            | Save the best performing model during training. Models are   |
|                                 |                  | stored to the location specified by `--out`. Only the one    |
|                                 |                  | best model will be saved.                                    |
|                                 |                  | Requires `evaluate_on_number_of_examples > 0` and            |
|                                 |                  | `evaluate_every_number_of_epochs > 0`                        |
+---------------------------------+------------------+--------------------------------------------------------------+
| split_entities_by_comma         | True             | Splits a list of extracted entities by comma to treat each   |
|                                 |                  | one of them as a single entity. Can either be `True`/`False` |
|                                 |                  | globally, or set per entity type, such as:                   |
|                                 |                  | ```                                                          |
|                                 |                  | ...                                                          |
|                                 |                  | - name: DIETClassifier                                       |
|                                 |                  |   split_entities_by_comma:                                   |
|                                 |                  |     address: True                                            |
|                                 |                  |     ...                                                      |
|                                 |                  | ...                                                          |
|                                 |                  | ```                                                          |
+---------------------------------+------------------+--------------------------------------------------------------+
| constrain_similarities          | False            | If `True`, applies sigmoid on all similarity terms and adds  |
|                                 |                  | it to the loss function to ensure that similarity values are |
|                                 |                  | approximately bounded. Used only if `loss_type=cross_entropy`|
+---------------------------------+------------------+--------------------------------------------------------------+
| model_confidence                | "softmax"        | Affects how model's confidence for each intent               |
|                                 |                  | is computed. Currently, only one value is supported:         |
|                                 |                  | 1. `softmax` - Similarities between input and intent         |
|                                 |                  | embeddings are post-processed with a softmax function,       |
|                                 |                  | as a result of which confidence for all intents sum up to 1. |
|                                 |                  | This parameter does not affect the confidence for entity     |
|                                 |                  | prediction.                                                  |
+---------------------------------+------------------+--------------------------------------------------------------+

The component can also be configured to train a response selector for a particular retrieval intent. The parameter retrieval_intent sets the name of the retrieval intent for which this response selector model is trained. The default is None, i.e. the model is trained for all retrieval intents.
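For example, a selector dedicated to the faq retrieval intent from the output example above could be configured like this (a sketch):

pipeline:
  - name: ResponseSelector
    retrieval_intent: faq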

In its default configuration, the component uses the retrieval intent together with the response key (for example faq/ask_name) as the training label. Alternatively, it can be configured to use the text of the responses as the training label by switching use_text_as_label to True. In this mode, the component uses the first available response that has a text attribute for training. If none is found, it falls back to using the retrieval intent combined with the response key as the label.
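A sketch of that alternative labelling mode:

pipeline:
  - name: ResponseSelector
    use_text_as_label: True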

Reposted from blog.csdn.net/fzz97_/article/details/128959733