We can instruct models to respond with a particular output structure, as shown in this figure.

image.png

Key concepts

  1. Define Schema
  2. Returning Structured output

The with_structured_output() method automates the process of binding the schema to the model and parsing the output.

# Define schema
schema = {"foo": "bar"}
# Bind schema to model
model_with_structure = model.with_structured_output(schema)
# Invoke the model to produce structured output that matches the schema
structured_output = model_with_structure.invoke(user_input)

Define Schema

There are two methods of defining a schema:

  1. The simplest and most common format is a JSON-like structure.

    {
      "answer": "The answer to the user's question",
      "followup_question": "A followup question the user could ask"
    }
    
  2. Using the Pydantic library, which offers type hints and validation.

    from pydantic import BaseModel, Field
    class ResponseFormatter(BaseModel):
        """Always use this tool to structure your response to the user."""
        answer: str = Field(description="The answer to the user's question")
        followup_question: str = Field(description="A followup question the user could ask")
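To illustrate what "validation" means here, the following is a minimal sketch, assuming Pydantic is installed. The two payloads are hypothetical stand-ins for data a model might return; they are not taken from a real model call.

```python
from pydantic import BaseModel, Field, ValidationError


class ResponseFormatter(BaseModel):
    """Always use this tool to structure your response to the user."""
    answer: str = Field(description="The answer to the user's question")
    followup_question: str = Field(description="A followup question the user could ask")


# Hypothetical payloads (assumptions for this sketch).
good_payload = {
    "answer": "The mitochondrion.",
    "followup_question": "What is ATP used for?",
}
bad_payload = {"answer": "The mitochondrion."}  # followup_question is missing

# A complete payload becomes a typed object with attribute access.
formatted = ResponseFormatter(**good_payload)

# An incomplete payload raises ValidationError instead of passing through.
try:
    ResponseFormatter(**bad_payload)
    rejected = False
except ValidationError:
    rejected = True
```

A plain JSON-like dictionary gives no such check, which is one reason to prefer the Pydantic form when the output feeds other code.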
    
    

Returning Structured output

  1. Using tool calling

    1. Tool calling involves binding a tool to a model. When appropriate, the model can decide to call this tool and ensure that its response conforms to the tool’s schema.

    2. To get structured output, simply bind our schema to the model as a tool.

      from langchain_openai import ChatOpenAI
      model = ChatOpenAI(model="gpt-4o", temperature=0)
      # Bind the ResponseFormatter schema to the model as a tool
      model_with_tools = model.bind_tools([ResponseFormatter])
      # Invoke the model
      ai_msg = model_with_tools.invoke("What is the powerhouse of the cell?")
      
  2. JSON mode

      1. Some model providers support a feature called JSON mode.

      2. It accepts a JSON schema definition as input and forces the model to produce conforming JSON output.

        from langchain_openai import ChatOpenAI
        model = ChatOpenAI(model="gpt-4o", model_kwargs={ "response_format": { "type": "json_object" } })
        ai_msg = model.invoke("Return a JSON object with key 'random_ints' and a value of 10 random ints in [0-99]")
        ai_msg.content
        '\n{\n  "random_ints": [23, 47, 89, 15, 34, 76, 58, 3, 62, 91]\n}'
        
      3. One important point is the model still returns a string, which needs to be parsed into a JSON object.

      4. You can do this with the json library, or with a JSON output parser if you need more advanced functionality.
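The parsing step can be sketched with the standard json library. This is a minimal example that assumes the message content is the kind of string shown in the JSON mode output; `raw_content` is hard-coded here rather than fetched from a live model call.

```python
import json

# Assumed content of ai_msg.content (hard-coded for this sketch).
raw_content = '\n{\n  "random_ints": [23, 47, 89, 15, 34, 76, 58, 3, 62, 91]\n}'

# json.loads tolerates the surrounding whitespace and returns a plain dict.
parsed = json.loads(raw_content)
random_ints = parsed["random_ints"]

# It is still worth checking the fields you rely on before using them.
if not (isinstance(random_ints, list) and all(isinstance(n, int) for n in random_ints)):
    raise ValueError("unexpected structure in model output")
```

The result is an ordinary dictionary, so any further typing or validation is left to the caller.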

    Structured output method

    There are a few challenges when producing structured output with the above methods:

    1. When tool calling is used, the tool call arguments need to be parsed from a dictionary back to the original schema.
    2. In addition, the model needs to be instructed to always use the tool when we want to enforce structured output, which is a provider-specific setting.
    3. When JSON mode is used, the output needs to be parsed into a JSON object.
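The first challenge can be sketched as follows. This is an illustration under stated assumptions, not LangChain's own code: the `tool_call` dictionary is a hypothetical value shaped like one entry of `ai_msg.tool_calls`, and a standard-library dataclass stands in for the Pydantic `ResponseFormatter` so the snippet runs without extra dependencies.

```python
from dataclasses import dataclass


@dataclass
class ResponseFormatter:
    """Dataclass stand-in for the Pydantic schema (an assumption for this sketch)."""
    answer: str
    followup_question: str


# Hypothetical tool call, shaped like one entry of ai_msg.tool_calls.
tool_call = {
    "name": "ResponseFormatter",
    "args": {
        "answer": "The mitochondrion.",
        "followup_question": "What is the function of ATP in the cell?",
    },
    "id": "call_0",  # made-up id
}

# The arguments arrive as a plain dictionary...
args = tool_call["args"]

# ...and have to be converted back into the schema type by hand.
structured = ResponseFormatter(**args)
```

With the Pydantic class the conversion step looks the same, `ResponseFormatter(**args)`, and additionally validates the field types.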

image.png

The with_structured_output() helper addresses these challenges: it both binds the schema to the model as a tool and parses the output into the specified output schema.

# Bind the schema to the model
model_with_structure = model.with_structured_output(ResponseFormatter)
# Invoke the model
structured_output = model_with_structure.invoke("What is the powerhouse of the cell?")
# Get back the Pydantic object
structured_output

Output:
ResponseFormatter(answer="The powerhouse of the cell is the mitochondrion. Mitochondria are organelles that generate most of the cell's supply of adenosine triphosphate (ATP), which is used as a source of chemical energy.", followup_question='What is the function of ATP in the cell?')