ChatGPT Taught Me Authentic American English

2024-08-17 01:04


It all started when I found a small bug in Llama 3 on Hugging Face: following the official tutorial to have the model make a function call raises an error.

But that's not the point.

I spent the better part of a day debugging it and eventually traced it to a small problem in the prompt template generation logic in the model's configuration file. I wrote up a detailed report, including reproduction code, the root cause, and a fix, posted it in the discussion board of the Hugging Face Llama 3.1 model, and @-mentioned the people in charge.

But that's not the point either.

The point is, I asked ChatGPT to make the tone of my post more relaxed and playful. Nothing too serious. It's not like I work for them!

ChatGPT really delivered. Below is the full text of my post, with before-and-after comparisons. Let's learn some everyday, conversational English together!

Issue

Original: I was trying the tool use example from the tutorial page on the Meta-Llama-3.1-8B-Instruct model with code (mostly also from the tutorial page):

Revised: I was playing around with the tutorial example of tool use on the Meta-Llama-3.1-8B-Instruct model and noticed something off. Here’s what I ran (mostly lifted from the tutorial page):

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "meta-llama/Meta-Llama-3.1-8B-Instruct"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.bfloat16, device_map="auto")

def get_current_temperature(location: str, unit: str) -> float:
    """
    Get the current temperature at a location.
    
    Args:
        location: The location to get the temperature for, in the format "City, Country"
        unit: The unit to return the temperature in. (choices: ["celsius", "fahrenheit"])
    Returns:
        The current temperature at the specified location in the specified units, as a float.
    """
    return 22.  # A real function should probably actually get the temperature!

def get_current_wind_speed(location: str) -> float:
    """
    Get the current wind speed in km/h at a given location.
    
    Args:
        location: The location to get the wind speed for, in the format "City, Country"
    Returns:
        The current wind speed at the given location in km/h, as a float.
    """
    return 6.  # A real function should probably actually get the wind speed!

tools = [get_current_temperature, get_current_wind_speed]

messages = [
  {"role": "system", "content": "You are a bot that responds to weather queries. You should reply with the unit used in the queried location."},
  {"role": "user", "content": "Hey, what's the temperature in Paris right now?"}
]

inputs = tokenizer.apply_chat_template(messages, tools=tools, add_generation_prompt=True, return_dict=True, return_tensors="pt")
inputs = {k: v.to(model.device) for k, v in inputs.items()}
out = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(out[0][len(inputs["input_ids"][0]):]))

Original: At this step the model generated a response of:

Revised: The model’s response at this step was:

<|python_tag|>{"name": "get_current_temperature", "parameters": {"location": "Paris, France", "unit": "celsius"}}<|eom_id|>

Original: This is different from what the tutorial page claims:

Revised: …which is a bit different from what’s documented on the tutorial page:

<tool_call>
{"arguments": {"location": "Paris, France", "unit": "celsius"}, "name": "get_current_temperature"}
</tool_call><|im_end|>

Original: Note the response in tutorial has a key “arguments”, while the one from actual Llama model has a key “parameters”.

Then I use the response text in the following steps:

Revised: The key difference is the tutorial mentions a key “arguments,” but the actual response from the Llama model has “parameters.”

Continuing with the example, I hit a snag at this step:

import json

out_str = tokenizer.decode(out[0][len(inputs["input_ids"][0]):], skip_special_tokens=True)
tool_call = json.loads(out_str)
messages.append({"role": "assistant", "tool_calls": [{"type": "function", "function": tool_call}]})

messages.append({"role": "tool", "name": "get_current_temperature", "content": "22.0"})

inputs = tokenizer.apply_chat_template(messages, tools=tools, add_generation_prompt=True, return_dict=True, return_tensors="pt")
inputs = {k: v.to(model.device) for k, v in inputs.items()}
out = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(out[0][len(inputs["input_ids"][0]):]))

Original: The tokenizer.apply_chat_template() will raise a TypeError: Object of type Undefined is not JSON serializable.

Revised: When the tokenizer.apply_chat_template() ran, it threw a TypeError: Object of type Undefined is not JSON serializable.

Cause

Original: I did some investigation and found that:

The tokenizer uses Jinja2 to generate the prompt template which is defined in the tokenizer_config.json file. The template rendering logic includes this:

Revised: Diving a bit deeper, I discovered that the issue stems from how the tokenizer uses Jinja2 for generating prompt templates defined in tokenizer_config.json. Here’s the juicy part:


{%- for message in messages %}
    ...
    {%- elif 'tool_calls' in message %}
        ...
        {%- set tool_call = message.tool_calls[0].function %}
        ...
        {%- else %}
            {{- '<|start_header_id|>assistant<|end_header_id|>\n\n' -}}
            {{- '{"name": "' + tool_call.name + '", ' }}
            {{- '"parameters": ' }}
            {{- tool_call.arguments | tojson }}
            {{- "}" }}
        {%- endif %}
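
(A quick way to look at the full template yourself, assuming transformers keeps it as a single string on the tokenizer, is to print tokenizer.chat_template:)

# Dump the Jinja2 chat template that apply_chat_template() renders.
# Assumes `tokenizer` is the AutoTokenizer loaded in the snippet above.
print(tokenizer.chat_template)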

Original: So it will read tool_call.arguments from the message that contains tool_calls. However, the tool_call message generated by the model doesn’t have key name “arguments”. It has “parameters” instead. So Jinja2 will raise the error during json parsing.

Revised: The crux of the problem is that it looks for tool_call.arguments in the message containing tool_calls, but the Llama model spits out “parameters” instead. Hence, Jinja2 isn’t happy during the JSON parsing.
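
(To see the failure in isolation, here is a minimal sketch with plain Jinja2, assuming jinja2 >= 2.9 for the built-in tojson filter, that reproduces the same TypeError:)

from jinja2 import Template

# A dict shaped like the parsed model output: it has "parameters", not "arguments".
tool_call = {"name": "get_current_temperature",
             "parameters": {"location": "Paris, France", "unit": "celsius"}}

# Looking up the missing "arguments" key yields a Jinja2 Undefined object,
# and the tojson filter (json.dumps under the hood) cannot serialize it.
Template('{{ tool_call.arguments | tojson }}').render(tool_call=tool_call)
# -> TypeError: Object of type Undefined is not JSON serializable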

Original: The reason that the model uses “parameters” key name can also be found in the same prompt template:

Revised: And the model’s use of “parameters” instead of “arguments” is actually prompted by the very same template:


{{- '<|start_header_id|>user<|end_header_id|>\n\n' -}}
{{- "Given the following functions, please respond with a JSON for a function call " }}
{{- "with its proper arguments that best answers the given prompt.\n\n" }}
{{- 'Respond in the format {"name": function name, "parameters": dictionary of argument name and its value}.' }}
{{- "Do not use variables.\n\n" }}

Original: It clearly asks the model to generate response in the format {"name": function name, "parameters": dictionary of argument name and its value}.

Revised: It explicitly formats the response with {"name": function name, "parameters": dictionary of argument name and its value}.

Original: The code on the tutorial page doesn’t trigger this problem because it doesn’t use the model response as tool_call, instead, it uses a mock message which contains the correct key name:

Revised: Interestingly, the tutorial doesn’t encounter this error because it uses a mock message, not a live model response, which conveniently includes the correct key:

tool_call = {"name": "get_current_temperature", "arguments": {"location": "Paris, France", "unit": "celsius"}}
messages.append({"role": "assistant", "tool_calls": [{"type": "function", "function": tool_call}]})

Fix

Original: Modify the template in tokenizer_config.json by replacing “parameters” with “arguments”, or vice versa.

The tokenizer_config.json files in the model Meta-Llama-3.1-70B-Instruct and Meta-Llama-3.1-405B-Instruct also have the same problem.

Revised: Modify the template in tokenizer_config.json by replacing “parameters” with “arguments”, or vice versa.

Also, the tokenizer_config.json files for Meta-Llama-3.1-70B-Instruct and Meta-Llama-3.1-405B-Instruct seem to share this glitch.
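
(If you would rather not edit the files in the model repo, here is a sketch of a local workaround; it assumes the template is exposed as a single string on tokenizer.chat_template:)

# Option A: patch the loaded template in memory so it reads the key the model emits.
tokenizer.chat_template = tokenizer.chat_template.replace(
    "tool_call.arguments", "tool_call.parameters"
)

# Option B: rename the key in the parsed model output before appending it.
tool_call = json.loads(out_str)
if "parameters" in tool_call and "arguments" not in tool_call:
    tool_call["arguments"] = tool_call.pop("parameters")
messages.append({"role": "assistant", "tool_calls": [{"type": "function", "function": tool_call}]})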

Hope this helps someone out there!