2024-08-17 01:04
It all started when I found a small bug in Llama 3 on Hugging Face. Following the official tutorial to get the model to do a function call raises an error.
But that's not the point.
I spent most of a day debugging it and eventually traced it to a small problem in the prompt-template generation logic inside the model's configuration file. I wrote up a detailed report, with reproduction code, the root cause, and a fix, posted it in the discussion board of the Hugging Face Llama 3.1 model, and @-mentioned the people in charge.
But that's not the point either.
The point is that I asked ChatGPT to make the tone of my post lighter and more playful. Nothing too serious. It's not like I work for them!
ChatGPT really delivered. Below is the full text of my post, with the before and after versions side by side. Let's learn some casual, conversational English together!
Original: I was trying the tool use example from the tutorial page on the Meta-Llama-3.1-8B-Instruct model with code (mostly also from the tutorial page):
Revised: I was playing around with the tutorial example of tool use on the Meta-Llama-3.1-8B-Instruct model and noticed something off. Here’s what I ran (mostly lifted from the tutorial page):
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "meta-llama/Meta-Llama-3.1-8B-Instruct"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.bfloat16, device_map="auto")

def get_current_temperature(location: str, unit: str) -> float:
    """
    Get the current temperature at a location.

    Args:
        location: The location to get the temperature for, in the format "City, Country"
        unit: The unit to return the temperature in. (choices: ["celsius", "fahrenheit"])
    Returns:
        The current temperature at the specified location in the specified units, as a float.
    """
    return 22.  # A real function should probably actually get the temperature!

def get_current_wind_speed(location: str) -> float:
    """
    Get the current wind speed in km/h at a given location.

    Args:
        location: The location to get the temperature for, in the format "City, Country"
    Returns:
        The current wind speed at the given location in km/h, as a float.
    """
    return 6.  # A real function should probably actually get the wind speed!

tools = [get_current_temperature, get_current_wind_speed]

messages = [
    {"role": "system", "content": "You are a bot that responds to weather queries. You should reply with the unit used in the queried location."},
    {"role": "user", "content": "Hey, what's the temperature in Paris right now?"}
]

inputs = tokenizer.apply_chat_template(messages, tools=tools, add_generation_prompt=True, return_dict=True, return_tensors="pt")
inputs = {k: v.to(model.device) for k, v in inputs.items()}
out = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(out[0][len(inputs["input_ids"][0]):]))
Original: At this step the model generated a response of:
Revised: The model’s response at this step was:
<|python_tag|>{"name": "get_current_temperature", "parameters": {"location": "Paris, France", "unit": "celsius"}}<|eom_id|>
Original: This is different from what the tutorial page claims:
Revised: …which is a bit different from what’s documented on the tutorial page:
<tool_call>
{"arguments": {"location": "Paris, France", "unit": "celsius"}, "name": "get_current_temperature"}
</tool_call><|im_end|>
Original: Note the response in tutorial has a key “arguments”, while the one from actual Llama model has a key “parameters”.
Then I use the response text in the following steps:
Revised: The key difference is the tutorial mentions a key “arguments,” but the actual response from the Llama model has “parameters.”
Continuing with the example, I hit a snag at this step:
import json
out_str = tokenizer.decode(out[0][len(inputs["input_ids"][0]):], skip_special_tokens=True)
tool_call = json.loads(out_str)
messages.append({"role": "assistant", "tool_calls": [{"type": "function", "function": tool_call}]})
messages.append({"role": "tool", "name": "get_current_temperature", "content": "22.0"})
inputs = tokenizer.apply_chat_template(messages, tools=tools, add_generation_prompt=True, return_dict=True, return_tensors="pt")
inputs = {k: v.to(model.device) for k, v in inputs.items()}
out = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(out[0][len(inputs["input_ids"][0]):]))
Original: The tokenizer.apply_chat_template() will raise a TypeError: Object of type Undefined is not JSON serializable.
Revised: When the tokenizer.apply_chat_template() ran, it threw a TypeError: Object of type Undefined is not JSON serializable.
Original: I did some investigation and found that:
The tokenizer uses Jinja2 to generate the prompt template which is defined in the tokenizer_config.json file. The template rendering logic includes this:
Revised: Diving a bit deeper, I discovered that the issue stems from how the tokenizer uses Jinja2 for generating prompt templates defined in tokenizer_config.json. Here’s the juicy part:
{%- for message in messages %}\n
...
{%- elif 'tool_calls' in message %}\n
...
{%- set tool_call = message.tool_calls[0].function %}\n
...
{%- else %}\n
{{- '<|start_header_id|>assistant<|end_header_id|>\\n\\n' -}}\n
{{- '{\"name\": \"' + tool_call.name + '\", ' }}\n
{{- '\"parameters\": ' }}\n
{{- tool_call.arguments | tojson }}\n
{{- \"}\" }}\n
{%- endif %}\n
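(For reference, the same Jinja2 template string is exposed at runtime as tokenizer.chat_template, so you can peek at it without digging through the JSON file — a quick sketch:)
# The chat template defined in tokenizer_config.json is also available
# directly on the loaded tokenizer object.
print(tokenizer.chat_template)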
Original: So it will read tool_call.arguments from the message that contains tool_calls. However, the tool_call message generated by the model doesn’t have key name “arguments”. It has “parameters” instead. So Jinja2 will raise the error during json parsing.
Revised: The crux of the problem is that it looks for tool_call.arguments in the message containing tool_calls, but Llama model spits out “parameters” instead. Hence, Jinja2 isn’t happy during the JSON parsing.
Original: The reason that the model uses “parameters” key name can also be found in the same prompt template:
Revised: And the model’s use of “parameters” instead of “arguments” is actually prompted by the very same template:
{{- '<|start_header_id|>user<|end_header_id|>\\n\\n' -}}\n
{{- \"Given the following functions, please respond with a JSON for a function call \" }}\n
{{- \"with its proper arguments that best answers the given prompt.\\n\\n\" }}\n
{{- 'Respond in the format {\"name\": function name, \"parameters\": dictionary of argument name and its value}.' }}\n
{{- \"Do not use variables.\\n\\n\" }}\n
Original: It clearly asks the model to generate response in the format {"name": function name, "parameters": dictionary of argument name and its value}.
Revised: It explicitly formats the response with {"name": function name, "parameters": dictionary of argument name and its value}.
Original: The code on the tutorial page doesn’t trigger this problem because it doesn’t use the model response as tool_call, instead, it uses a mock message which contains the correct key name:
Revised: Interestingly, the tutorial doesn’t encounter this error because it uses a mock message, not a live model response, which conveniently includes the correct key:
tool_call = {"name": "get_current_temperature", "arguments": {"location": "Paris, France", "unit": "celsius"}}
messages.append({"role": "assistant", "tool_calls": [{"type": "function", "function": tool_call}]})
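As a stopgap before the template itself gets fixed, one user-side workaround (just a sketch, I haven't tested it thoroughly) is to rename the key in the parsed tool call so it matches what the template expects, continuing from the snippet above:
# Untested sketch: the template looks up tool_call.arguments, so rename the
# "parameters" key the model emitted before appending the tool call message.
tool_call = json.loads(out_str)
if "parameters" in tool_call and "arguments" not in tool_call:
    tool_call["arguments"] = tool_call.pop("parameters")
messages.append({"role": "assistant", "tool_calls": [{"type": "function", "function": tool_call}]})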
Original: Modify the template in tokenizer_config.json by replacing “parameters” with “arguments”, or vice versa.
The tokenizer_config.json files in the model Meta-Llama-3.1-70B-Instruct and Meta-Llama-3.1-405B-Instruct also have the same problem.
Revised: Modify the template in tokenizer_config.json by replacing “parameters” with “arguments”, or vice versa.
Also, the tokenizer_config.json files in the model Meta-Llama-3.1-70B-Instruct and Meta-Llama-3.1-405B-Instruct seem to share this glitch.
Hope this helps someone out there!
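By the way, if you'd rather not edit tokenizer_config.json on disk, the same fix can probably be applied at runtime by patching the template string the tokenizer has already loaded. A rough, untested sketch, assuming the loaded template contains the exact substring quoted above:
# Untested sketch: point the template at the "parameters" key the model
# actually emits, instead of "arguments", without touching the file on disk.
tokenizer.chat_template = tokenizer.chat_template.replace(
    "tool_call.arguments | tojson",
    "tool_call.parameters | tojson",
)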