Structured Output

Source: Structured output

In one sentence

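Structured output lets an agent return data in a specific, predictable format instead of a natural-language response, so applications can work directly with structured data such as JSON objects, Pydantic models, or dataclasses.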

When to turn to this page

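Use this page when you need an agent to return formatted data rather than natural-language text: for example, to extract structured information, to analyze data and return results in a specific format, or to guarantee that the output conforms to a given data structure.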

Core concepts

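Structured output: the agent returns data in a specific, predictable format instead of a natural-language response
ProviderStrategy: uses the model provider's native structured-output feature (e.g. OpenAI, Anthropic, xAI)
ToolStrategy: implements structured output through tool calling; works with any model that supports tool calling
Schema: the specification that defines the structured output format; Pydantic models, dataclasses, TypedDict, and JSON Schema are supported
response_format: the create_agent parameter that controls how the agent returns structured data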
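
To make the schema options above concrete, here is a minimal sketch of one contact record expressed three ways using only the standard library: a dataclass, a TypedDict, and a hand-written JSON Schema dict. The names `ContactDataclass`, `ContactTypedDict`, and `CONTACT_JSON_SCHEMA` are illustrative and not part of LangChain; a Pydantic version would subclass `BaseModel` instead.

```python
from dataclasses import dataclass, fields
from typing import TypedDict, get_type_hints


@dataclass
class ContactDataclass:
    """Contact details as a dataclass schema."""
    name: str
    email: str


class ContactTypedDict(TypedDict):
    """The same contact details as a TypedDict schema."""
    name: str
    email: str


# The same shape written by hand as a JSON Schema document.
CONTACT_JSON_SCHEMA = {
    "type": "object",
    "properties": {
        "name": {"type": "string"},
        "email": {"type": "string"},
    },
    "required": ["name", "email"],
}

# All three forms describe the same field names.
dataclass_fields = [f.name for f in fields(ContactDataclass)]
typeddict_fields = list(get_type_hints(ContactTypedDict))
json_schema_fields = list(CONTACT_JSON_SCHEMA["properties"])
```

Whichever form is chosen, the field names and types are what the agent is asked to fill in, so the choice mostly depends on what the rest of the application already uses.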

How to do it

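Create an agent that supports structured output with the create_agent function
Specify the output format through the response_format parameter
Choose a strategy that fits the model's capabilities:
- Pass a schema type directly: LangChain selects the best strategy automatically
- Specify ProviderStrategy or ToolStrategy explicitly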

def create_agent(
    ...
    response_format: Union[
        ToolStrategy[StructuredResponseT],
        ProviderStrategy[StructuredResponseT],
        type[StructuredResponseT],
        None,
    ]
)
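
The selection rule described above can be pictured with a small stand-alone sketch. It is a simplified mental model rather than LangChain's implementation: `pick_strategy` and the `supports_native_output` flag are hypothetical names introduced only for this illustration.

```python
from typing import Optional


def pick_strategy(supports_native_output: bool, explicit: Optional[str] = None) -> str:
    """Return the name of the strategy that would be used.

    Hypothetical helper: an explicit choice always wins; otherwise
    native provider support decides between the two strategies.
    """
    if explicit is not None:
        return explicit
    return "ProviderStrategy" if supports_native_output else "ToolStrategy"


# A bare schema type lets the rule decide; an explicit strategy overrides it.
auto_native = pick_strategy(supports_native_output=True)
auto_fallback = pick_strategy(supports_native_output=False)
forced_tool = pick_strategy(supports_native_output=True, explicit="ToolStrategy")
```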

Using ProviderStrategy

For models that support native structured output (e.g. OpenAI, Anthropic, xAI):

from pydantic import BaseModel, Field
from langchain.agents import create_agent

class ContactInfo(BaseModel):
    name: str = Field(description="The name of the person")
    email: str = Field(description="The email address of the person")
    phone: str = Field(description="The phone number of the person")

agent = create_agent(
    model="gpt-5.4",
    response_format=ContactInfo  # ProviderStrategy is selected automatically
)
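
Once the agent has run, the LangChain documentation describes the validated object as being returned under the `structured_response` key of the agent's final state. The sketch below shows how application code might consume it. The `result` dict is built by hand as a stand-in for the return value of `agent.invoke(...)` so the snippet runs without a model or API key, and `Contact` is a plain dataclass standing in for the `ContactInfo` model; the sample values are made up.

```python
from dataclasses import dataclass


@dataclass
class Contact:
    """Stand-in for the ContactInfo schema."""
    name: str
    email: str
    phone: str


# Hand-built stand-in for the state returned by agent.invoke(...).
result = {
    "messages": [],  # the conversation history would normally be here
    "structured_response": Contact(
        name="John Doe",
        email="john@example.com",
        phone="(555) 123-4567",
    ),
}

contact = result["structured_response"]
# Fields are ordinary typed attributes, so no text parsing is needed.
summary = f"{contact.name} <{contact.email}>"
```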

Using ToolStrategy

For models that do not support native structured output:

from pydantic import BaseModel, Field
from langchain.agents import create_agent
from langchain.agents.structured_output import ToolStrategy

class ProductReview(BaseModel):
    rating: int | None = Field(description="The rating of the product", ge=1, le=5)
    sentiment: str = Field(description="The sentiment of the review")
    key_points: list[str] = Field(description="The key points of the review")

agent = create_agent(
    model="gpt-5.4",
    tools=tools,  # assumes a list of tools named `tools` is defined elsewhere
    response_format=ToolStrategy(ProductReview)
)
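
Conceptually, the tool-calling route asks the model to call a tool whose arguments match the schema, and those arguments are then validated. The following is a rough stand-alone sketch of that validation step, not LangChain's code: `Review`, `parse_review_args`, and the sample JSON payload are assumptions made for illustration, and `rating` is simplified to a required integer.

```python
import json
from dataclasses import dataclass


@dataclass
class Review:
    """Simplified stand-in for the ProductReview schema."""
    rating: int
    sentiment: str
    key_points: list

    def __post_init__(self) -> None:
        # Mirror the ge=1, le=5 constraint on rating.
        if not 1 <= self.rating <= 5:
            raise ValueError("rating must be between 1 and 5")


def parse_review_args(raw_args: str) -> Review:
    """Turn the JSON arguments of a tool call into a validated Review."""
    return Review(**json.loads(raw_args))


review = parse_review_args(
    '{"rating": 5, "sentiment": "positive", "key_points": ["fast shipping"]}'
)
```

An out-of-range rating raises `ValueError` here, which is the kind of failure the error-handling options are meant to deal with.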

Customizing the tool message content

agent = create_agent(
    model="gpt-5.4",
    tools=[],
    response_format=ToolStrategy(
        schema=MeetingAction,  # assumes a MeetingAction schema is defined elsewhere
        tool_message_content="Action item captured and added to meeting notes!"
    )
)

Error handling

# Custom error message (ProductRating is assumed to be a schema defined elsewhere)
ToolStrategy(
    schema=ProductRating,
    handle_errors="Please provide a valid rating between 1-5 and include a comment."
)

# Handle a specific exception type
ToolStrategy(
    schema=ProductRating,
    handle_errors=ValueError  # retry only on ValueError
)

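# Custom error handler function
from langchain.agents.structured_output import (
    MultipleStructuredOutputsError,
    StructuredOutputValidationError,
)
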
def custom_error_handler(error: Exception) -> str:
    if isinstance(error, StructuredOutputValidationError):
        return "There was an issue with the format. Try again."
    elif isinstance(error, MultipleStructuredOutputsError):
        return "Multiple structured outputs were returned. Pick the most relevant one."
    else:
        return f"Error: {str(error)}"

ToolStrategy(
    schema=ProductRating,
    handle_errors=custom_error_handler
)
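
The three `handle_errors` forms shown above differ only in how an error is turned into feedback for the model. A toy resolver makes that difference explicit. It is a sketch of the behavior as described in these notes, not the library's logic: `resolve_feedback` and the placeholder default message are hypothetical.

```python
from typing import Callable, Union

HandleErrors = Union[str, type, Callable[[Exception], str]]


def resolve_feedback(handle_errors: HandleErrors, error: Exception) -> str:
    """Return the message sent back to the model, or re-raise the error."""
    if isinstance(handle_errors, str):
        # A string is used verbatim for every error.
        return handle_errors
    if isinstance(handle_errors, type) and issubclass(handle_errors, Exception):
        # An exception class: retry only for that class, otherwise propagate.
        if isinstance(error, handle_errors):
            return f"Error: {error}"  # placeholder default message
        raise error
    # Anything else is treated as a callable that builds the message.
    return handle_errors(error)


fixed = resolve_feedback("Please provide a valid rating.", ValueError("bad"))
typed = resolve_feedback(ValueError, ValueError("rating out of range"))
custom = resolve_feedback(lambda e: f"Try again: {e}", KeyError("comment"))
```

Passing an exception class is the strictest of the three: errors of any other type are not retried and surface to the caller.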

Command / API quick reference

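create_agent(response_format=...) - create an agent that supports structured output
ProviderStrategy(schema) - use the provider's native structured output
ToolStrategy(schema, tool_message_content, handle_errors) - implement structured output through tool calling
StructuredResponseT - the structured response type
StructuredOutputValidationError - error for structured output that fails validation
MultipleStructuredOutputsError - error for multiple structured outputs being returned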

Connection to the LangGraph / RAG handbooks

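Structured output complements LangGraph's state management: structured data can be stored directly in LangGraph state. In RAG applications, structured output can be used to format retrieved information so that it is easier to process downstream.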

Common beginner mistakes

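Forgetting to check whether the model supports native structured output, and ending up with an unsuitable strategy
Misconfiguring error handling, so the agent cannot recover correctly when it hits a format error
Overlooking the schema's strictness requirements, so the output does not match the expected format
Combining tools with structured output without confirming that the model supports using both at the same time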
