今天我学习了DeepLearning.AI的 Building Systems with LLM 的在线课程，我想和大家一起分享一下该门课程的一些主要内容。今天我们来学习输出结果检查。输出结果检查包含以下两部分内容：

检查输出是否存在潜在有害内容
检查输出是否基于提供的产品信息

下面是我们访问大型语言模(LLM)的主要代码：

import openai
 
#您的openai的api key
openai.api_key ='YOUR-OPENAI-API-KEY' 
 
def get_completion_from_messages(messages, 
                                 model="gpt-3.5-turbo", 
                                 temperature=0, 
                                 max_tokens=500):
    response = openai.ChatCompletion.create(
        model=model,
        messages=messages,
        temperature=temperature, 
        max_tokens=max_tokens, 
    )
    return response.choices[0].message["content"]

检查输出是否存在潜在有害内容

之前我们学习了如何让LLM对用户提交的propmt进行内容审核，这样可以防止出现有害内容，下面我们来看一个内容审核的例子，在这个例子中我们让LLM对一段电子产品的功能描述信息进行内容审核，很明显电子产品的功能描述信息不应该属于有害信息.

final_response_to_customer = f"""
The SmartX ProPhone has a 6.1-inch display, 128GB storage, \
12MP dual camera, and 5G. The FotoSnap DSLR Camera \
has a 24.2MP sensor, 1080p video, 3-inch LCD, and \
interchangeable lenses. We have a variety of TVs, including \
the CineView 4K TV with a 55-inch display, 4K resolution, \
HDR, and smart TV features. We also have the SoundMax \
Home Theater system with 5.1 channel, 1000W output, wireless \
subwoofer, and Bluetooth. Do you have any specific questions \
about these products or any other products we offer?
"""
response = openai.Moderation.create(
    input=final_response_to_customer
)
moderation_output = response["results"][0]
print(moderation_output)

从上面的输出结果来看，我们的LLM对这段信息做出来正确的判断，即它不属于有害信息(flagged被标记为false)。

检查输出是否基于提供的产品信息

有时候我们需要LLM基于指定的内容来回答客户的问题，比如说，当客户询问有关产品的问题时，我们需要LLM能基于现有的产品的信息来回答客户的问题，此时检查LLM返回的结果是否基于特定的产品信息就非常重要了，我们这样做的目的是为了防止LLM出现“幻觉”而给出错误的答案。在下面的例子中，我们有一堆电子产品的信息包括名称，类别，品牌，价格等，当客户询问相关电子产品的问题时，我们为LLM准备了它需要回复的内容(final_response_to_customer )，然后我们让LLM检查回复的内容是否是基于现有的电子产品信息。

product_information = """{ "name": "SmartX ProPhone", "category": "Smartphones and Accessories", "brand": "SmartX", "model_number": "SX-PP10", "warranty": "1 year", "rating": 4.6, "features": [ "6.1-inch display", "128GB storage", "12MP dual camera", "5G" ], "description": "A powerful smartphone with advanced camera features.", "price": 899.99 } { "name": "FotoSnap DSLR Camera", "category": "Cameras and Camcorders", "brand": "FotoSnap", "model_number": "FS-DSLR200", "warranty": "1 year", "rating": 4.7, "features": [ "24.2MP sensor", "1080p video", "3-inch LCD", "Interchangeable lenses" ], "description": "Capture stunning photos and videos with this versatile DSLR camera.", "price": 599.99 } { "name": "CineView 4K TV", "category": "Televisions and Home Theater Systems", "brand": "CineView", "model_number": "CV-4K55", "warranty": "2 years", "rating": 4.8, "features": [ "55-inch display", "4K resolution", "HDR", "Smart TV" ], "description": "A stunning 4K TV with vibrant colors and smart features.", "price": 599.99 } { "name": "SoundMax Home Theater", "category": "Televisions and Home Theater Systems", "brand": "SoundMax", "model_number": "SM-HT100", "warranty": "1 year", "rating": 4.4, "features": [ "5.1 channel", "1000W output", "Wireless subwoofer", "Bluetooth" ], "description": "A powerful home theater system for an immersive audio experience.", "price": 399.99 } { "name": "CineView 8K TV", "category": "Televisions and Home Theater Systems", "brand": "CineView", "model_number": "CV-8K65", "warranty": "2 years", "rating": 4.9, "features": [ "65-inch display", "8K resolution", "HDR", "Smart TV" ], "description": "Experience the future of television with this stunning 8K TV.", "price": 2999.99 } { "name": "SoundMax Soundbar", "category": "Televisions and Home Theater Systems", "brand": "SoundMax", "model_number": "SM-SB50", "warranty": "1 year", "rating": 4.3, "features": [ "2.1 channel", "300W output", "Wireless subwoofer", "Bluetooth" ], "description": "Upgrade your TV's audio with this sleek and powerful soundbar.", "price": 199.99 } { "name": "CineView OLED TV", "category": "Televisions and Home Theater Systems", "brand": "CineView", "model_number": "CV-OLED55", "warranty": "2 years", "rating": 4.7, "features": [ "55-inch display", "4K resolution", "HDR", "Smart TV" ], "description": "Experience true blacks and vibrant colors with this OLED TV.", "price": 1499.99 }"""

system_message = f"""
You are an assistant that evaluates whether \
customer service agent responses sufficiently \
answer customer questions, and also validates that \
all the facts the assistant cites from the product \
information are correct.
The product information and user and customer \
service agent messages will be delimited by \
3 backticks, i.e. ```.
Respond with a Y or N character, with no punctuation:
Y - if the output sufficiently answers the question \
AND the response correctly uses product information
N - otherwise

Output a single letter only.
"""
customer_message = f"""
tell me about the smartx pro phone and \
the fotosnap camera, the dslr one. \
Also tell me about your tvs"""

q_a_pair = f"""
Customer message: ```{customer_message}```
Product information: ```{product_information}```
Agent response: ```{final_response_to_customer}```

Does the response use the retrieved information correctly?
Does the response sufficiently answer the question

Output Y or N
"""
messages = [
    {'role': 'system', 'content': system_message},
    {'role': 'user', 'content': q_a_pair}
]

response = get_completion_from_messages(messages, max_tokens=1)
print(response)

这里我们的产品信息存储在product_information 变量里面，而LLM回复客户的内容存储在final_response_to_customer变量里，我们要LLM做的是检查final_response_to_customer变量里面的内容是否正确引用了现有的产品信息，如果是则返回Y, 否则返回N。下面我把system_message的内容翻译成中文，这样便于大家理解：

system_message = f"""
你是一名助理，负责评估客户服务代理是否充分回答了客户的问题，
并验证助理从产品信息中引用的所有事实是否正确。 
产品信息以及用户和客户服务代理消息将由 3 个反引号分隔，即```。

用 Y 或 N 字符响应，不要使用标点符号：
Y - 如果输出足以回答问题
并且响应正确使用了产品信息
N - 否则
"""

customer_message = f"""
跟我说说smartx pro手机和fotosnap相机，数码单反相机。也和我说说你们的电视机。
"""

下面我们给LLM产生一段与现有产品毫不相干的回复，我们看看LLM能检查出自己的回复与现有产品不相关吗？：


# 生活就像一盒巧克力
another_response = "life is like a box of chocolates"
q_a_pair = f"""
Customer message: ```{customer_message}```
Product information: ```{product_information}```
Agent response: ```{another_response}```

Does the response use the retrieved information correctly?
Does the response sufficiently answer the question?

Output Y or N
"""
messages = [
    {'role': 'system', 'content': system_message},
    {'role': 'user', 'content': q_a_pair}
]

response = get_completion_from_messages(messages)
print(response)

这里我们给LLM生产了一个与现在产品毫不相干的回复：life is like a box of chocolates，很显然这样的回复没有引用任何产品的信息，并且与客户的问题customer_message毫不相关，所以LLM最终检查后给出了“N”的回答，这也是完全正确的。

总结

今天我们学习了如何让LLM来检查自己的输出结果是否正确，输出结果检查一般分为两种：1.有害内容检查。2.回复的内容是否基于特定产品。这是两种非常实用的LLM开发技巧，在各种LLM的应用场景中基本都会用到。也希望你的内容能帮助到大家。

参考资料

DLAI - Learning Platform Beta

使用大型语言模(LLM)构建系统(五)：输出结果检查

检查输出是否存在潜在有害内容

检查输出是否基于提供的产品信息

总结

参考资料

猜你喜欢