Hello and welcome back to part 3 of this tutorial series. In this part, we’ll be getting started with LangGraph. Instead of having a lot of explanation before we start, we’ll see how stuff works as we go along. So without further ado, let’s just jump right in.
Let’s start by actually installing LangGraph, as it doesn’t get installed by default with LangChain. To install LangGraph, you can use the following command in your terminal:
```bash
pipenv install langgraph==0.0.30 langchainhub==0.1.15
```
Once you’ve installed LangGraph, let’s start by creating a new file called `simple_langgraph.py`:
```
FINX_LANGGRAPH
├── images
├── tools
├── .env
├── langchain_basics.py
├── Pipfile
├── Pipfile.lock
├── setup_environment.py
└── simple_langgraph.py      <- New file
```
Over the next three parts, we’ll be looking at different ways in which you can use LangGraph to chain LLMs and tools together. In this first part we’ll be looking at a simple, classic setup: LLM -> tool executor -> back to the LLM.
Open up `simple_langgraph.py` and let’s start by importing the necessary modules:
```python
import operator
from typing import Annotated, TypedDict, Union

from colorama import Fore, Style
from langchain import hub
from langchain.agents import create_openai_functions_agent
from langchain_core.agents import AgentAction, AgentActionMessageLog, AgentFinish
from langchain_core.messages import BaseMessage
from langchain_core.runnables.base import Runnable
from langchain_openai.chat_models import ChatOpenAI
from langgraph.graph import END, StateGraph
from langgraph.prebuilt.tool_executor import ToolExecutor

from setup_environment import set_environment_variables
from tools import generate_image, get_weather
```
That is a lot of stuff! Don’t worry, most of it is actually not as complex as it seems. Usually, I’ll go over all the imports before we get started, but as there are quite a few to go through, I’ll cover each import when we get to the part where it’s used instead. For now, just have them copied.
Next, we’ll set the environment variables and define a couple of constants:
```python
set_environment_variables("LangGraph Basics")

LLM = ChatOpenAI(model="gpt-3.5-turbo-0125", streaming=True)
TOOLS = [get_weather, generate_image]
PROMPT = hub.pull("hwchase17/openai-functions-agent")
```
We reuse our `set_environment_variables` function from the previous part to set the environment variables and name the LangSmith traces LangGraph Basics. We then define our LLM just like we did in part 1, again setting the streaming parameter to `True`, and a `TOOLS` list which is literally just a list containing the two tools that we wrote.
The LangChain Hub
For the prompt template, we pull it from the LangChain Hub this time, mostly because I want to show you that it exists! The LangChain Hub is like a mini-GitHub for storing LangChain `ChatPromptTemplates`, just like the simple ones we wrote in part 1. You can push new commits to your templates and pull them as we just did here, much as you would with GitHub repositories.
You can go to https://smith.langchain.com/ and scroll down to find the Hub button:
![](https://academy.finxter.com/wp-content/uploads/2024/04/image-7-1024x576.png)
Click it to visually browse the prompts available on the hub:
![](https://academy.finxter.com/wp-content/uploads/2024/04/image-1024x771.png)
You can use this as a convenient place to store your prompts. You can also set them to private if you don’t want to share them with the world and you can even fork other public prompts that you like to your own repositories. It’s a handy tool for development. For production or highly sensitive company data, you might want to store your prompts in a more secure location.
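As a quick illustration, pushing a prompt of your own looks roughly like the sketch below (the repo handle and prompt are hypothetical placeholders, and `hub.push` needs an API key with write access to your Hub account):

```python
from langchain import hub
from langchain_core.prompts import ChatPromptTemplate

# A minimal prompt we might want to store on the Hub (hypothetical example).
my_prompt = ChatPromptTemplate.from_messages(
    [
        ("system", "You are a helpful assistant."),
        ("human", "{input}"),
    ]
)

# Push it to your own repo on the Hub and pull it back anywhere you need it.
# "your-handle/my-assistant-prompt" is a placeholder for your own handle/repo.
hub.push("your-handle/my-assistant-prompt", my_prompt)
my_prompt_again = hub.pull("your-handle/my-assistant-prompt")
```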
If we look up the prompt we just pulled, we can see that it is a fairly simple prompt:
![](https://academy.finxter.com/wp-content/uploads/2024/04/image-1-1024x772.png)
It has an extremely basic system message of `"You are a helpful assistant"`, and we can see that it has placeholders for `chat_history`, human `input`, and an `agent_scratchpad`. The `chat_history` and `input` are fairly self-explanatory in that they hold the chat history so far and the human input, but what about this `agent_scratchpad`?
The `agent_scratchpad` is a place where the agent can take notes while going through its reasoning process of deciding what action should be taken next and which functions should be called. Think of it as a notepad where the LLM can jot down its thoughts, something like the following:
user: "Can you recommend me a zombie game from the year 2022?" > Entering new AgentExecutor chain... Thought: Oh, I love zombie games! There are so many great ones out there. Let me think about the best zombie game from 2022. Action: use_search_engine Action Input: "best zombie game 2022" Observation:[{list of search result objects for query "best zombie game 2022"}] There are three great zombie games from 2022 that I found: Zombie Cure Lab, Zombie Survivors, and SurrounDead. Let me think about which one to recommend. Action: use_search_engine Action Input: "Zombie Cure Lab" Observation:[{list of search result objects for query "Zombie Cure Lab"}] Zombie Cure Lab is a game where you manage a lab and try to cure the zombie virus. (Bunch more info here yadayada...) I recommend Zombie Cure Lab as the best zombie game from 2022. Final Answer: The best zombie game from 2022 is Zombie Cure Lab.
This is just a conceptual example here to describe the idea, but the agent takes reasoning steps and makes observations along the way, first deciding to call a search engine tool to better answer the user question, then deciding to call the search engine tool to get more information on one of the games in particular, and then finally deciding that it has enough information to answer the user question.
So the `agent_scratchpad` is used to store these intermediate observations on what action to take next, but also to decide when the agent is done, so that it doesn’t just keep looping indefinitely. We’ll get back to how we can see when the agent is done in a moment.
The State Object
Ok, we have an LLM, some tools, and a prompt template. The next thing we need is a state object to keep track of the state at each step along our graph. A LangGraph graph is kind of like a state machine: it takes this state object and passes it along to each node of the graph. Let’s look at a simplified example:
```
# Simplified example
StateObject():
    user_input = "please do a for me"
    chat_history = [list of previous chat messages for context...]
    am_i_done = False
    steps_taken = []
```
So say we have this state object above. We have received the user input question, and whatever chat history has come before, if we have decided to implement memory. We have a flag `am_i_done`, which is obviously set to `False` at the start, and we have a list of `steps_taken`, which is empty at the start. Now we hand this state object to node `A` in our graph ->
```
# Simplified example Node A
StateObject():
    user_input = "please do a for me"
    chat_history = [list of previous chat messages for context...]
    am_i_done = False
    steps_taken = ["action_a was taken"]
```
It does some action we will just call `action_a`, which has taken it a step closer to answering the user question, but it is not quite done yet, so the `am_i_done` flag is still set to `False`. Now node `A` passes this state object to node `B` in our graph ->
```
# Simplified example Node B
StateObject():
    user_input = "please do a for me"
    chat_history = [list of previous chat messages for context...]
    am_i_done = True
    steps_taken = ["action_a was taken", "action_b was taken"]
```
This node does some `action_b` stuff and now has the final answer it needs to give to the user. It sets the `am_i_done` flag to `True` because it is done. We can use this `am_i_done` flag to test whether the graph is completed yet (i.e. the user question or request has been fully answered).
So as the graph traverses over the nodes we define, each node will receive the state object, update it where needed, and then pass it along to the next node, or perhaps back to the previous one, or sideways to node D if a certain condition is met. So let’s define the real state object that we will be using:
```python
class AgentState(TypedDict):
    input: str
    chat_history: list[BaseMessage]
    agent_outcome: Union[AgentAction, AgentFinish, None]
    intermediate_steps: Annotated[list[tuple[AgentAction, str]], operator.add]
```
We use a `TypedDict` to define a specific dictionary structure, specifying the keys that this dictionary will have and the types of the values stored under each of those keys. The first entry is simply the user input, which is a `str` string value.
The second entry is the chat history, which is a `list` of `BaseMessage` objects. A `BaseMessage` is just any one of the lines of the object below, where you have a message and the originator of that message, like "system", "human", or "ai":
```python
# Example BaseMessages
("system", "You are a helpful AI bot. Your name is {name}."),
("human", "Hello, how are you doing?"),
("ai", "I'm doing well, thanks!"),
("human", "{user_input}"),
```
The third item in the state object will be `agent_outcome`. The agent will do its thing and then return either an `AgentAction` object or an `AgentFinish` object to us.
- AgentAction: An `AgentAction` object simply contains the name of the tool the agent wants to call and the input arguments for that tool call, maybe something like `get_weather` and `{"location": "New York"}`.
- AgentFinish: An `AgentFinish` object simply means that the agent considers its task finished and holds the final `return_values` inside.
Using this `agent_outcome` object we can see what the next step is or whether the agent is done.
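To make that concrete, here is a rough sketch of what those two objects look like if you construct them by hand (the field values are made up for illustration; in practice the agent builds these for you):

```python
from langchain_core.agents import AgentAction, AgentFinish

# An AgentAction: "call this tool with these arguments" (hypothetical values).
action = AgentAction(
    tool="get_weather",
    tool_input={"location": "New York"},
    log="Invoking get_weather with {'location': 'New York'}",
)

# An AgentFinish: "I'm done, here are my return values" (again hypothetical).
finish = AgentFinish(
    return_values={"output": "It is sunny in New York right now."},
    log="Final answer reached.",
)

print(action.tool, action.tool_input)
print(finish.return_values["output"])
```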
The fourth and last entry in the `AgentState` object is a bit easier to read starting from the inside. We have a `list` of `tuples` where each tuple contains an `AgentAction` object and a `str` string. The `AgentAction` here is the same object that we described in the step above, containing a tool to be called and its input arguments. The difference is that this step has already been taken, and the string that is the second item in the tuple is the tool output after it was called. So something like this:
```
## Fictional example object
[
    (
        AgentAction(tool="get_weather", input={"location": "New York"}),
        "{API response JSON object...}",
    ),
    (
        AgentAction(tool="generate_image", input={"image_description": "cat"}),
        "Path/to/image.png",
    ),
]
```
The `Annotated` type hint is used to attach metadata to a type hint. In this case, we attach the `operator.add` function to tell LangGraph that updates to this list should be added (appended) to the existing list rather than replace it, so the `AgentState` object’s `intermediate_steps` list keeps growing, like the example above.
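As a tiny standalone illustration of why `operator.add` is the right piece of metadata here (not part of our graph code): on lists it is plain concatenation, which is exactly how each node’s `intermediate_steps` update gets merged into the existing state.

```python
import operator

existing_steps = [("step 1", "tool output 1")]
new_steps = [("step 2", "tool output 2")]

# operator.add on two lists concatenates them,
# so new steps end up appended after the existing ones.
merged = operator.add(existing_steps, new_steps)
print(merged)  # [('step 1', 'tool output 1'), ('step 2', 'tool output 2')]
```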
The Agent
Now that we have our state object defined, we will define our agent, which will have access to both the `generate_image` and `get_weather` tools:
```python
runnable_agent: Runnable = create_openai_functions_agent(LLM, TOOLS, PROMPT)
```
We use the `create_openai_functions_agent` function we imported from LangChain to create an agent that has access to the LLM, the tools, and the prompt we defined so far. LangChain combines these into an OpenAI functions-compatible agent for us in the form of a `Runnable` object. We have seen this `Runnable` type before, in part 1, in the form of our chains. All `Runnable` objects have the `invoke`, `stream`, and `batch` methods, just like the chains we used in part 1.
Before we move on to the nodes and graph, let’s test the agent we have so far. We’ll manually create a quick input here (as we haven’t built our graph yet) and then call `invoke` on the agent:
```python
inputs = {
    "input": "give me the weather for New York please.",
    "chat_history": [],
    "intermediate_steps": [],
}
agent_outcome = runnable_agent.invoke(inputs)
print(agent_outcome)
```
Now go ahead and run this to test the agent so far and you should see something like this:
```
API Keys loaded and tracing set with project name: LangGraph Basics
tool='get_weather' tool_input={'location': 'New York'} log="\nInvoking: `get_weather` with `{'location': 'New York'}`\n\n\n" message_log=[AIMessage(content='', additional_kwargs={'function_call': {'arguments': '{"location":"New York"}', 'name': 'get_weather'}}, response_metadata={'finish_reason': 'function_call'})]
```
We can see the agent wants to call the `get_weather` tool with the input `{"location": "New York"}`, so it’s asking us to call this function with these input arguments. Of course, it stopped there as we haven’t linked up any other nodes yet, but we know that the agent is working so far.
Go ahead and remove the test `inputs` and `agent_outcome` code. Just for clarity, here is what you should have so far:
```python
import operator
from typing import Annotated, TypedDict, Union

from colorama import Fore, Style
from langchain import hub
from langchain.agents import create_openai_functions_agent
from langchain_core.agents import AgentAction, AgentActionMessageLog, AgentFinish
from langchain_core.messages import BaseMessage
from langchain_core.runnables.base import Runnable
from langchain_openai.chat_models import ChatOpenAI
from langgraph.graph import END, StateGraph
from langgraph.prebuilt.tool_executor import ToolExecutor

from setup_environment import set_environment_variables
from tools import generate_image, get_weather

set_environment_variables("LangGraph Basics")

LLM = ChatOpenAI(model="gpt-3.5-turbo-0125", streaming=True)
TOOLS = [get_weather, generate_image]
PROMPT = hub.pull("hwchase17/openai-functions-agent")


class AgentState(TypedDict):
    input: str
    chat_history: list[BaseMessage]
    agent_outcome: Union[AgentAction, AgentFinish, None]
    intermediate_steps: Annotated[list[tuple[AgentAction, str]], operator.add]


runnable_agent: Runnable = create_openai_functions_agent(LLM, TOOLS, PROMPT)
```
The Nodes
So now the first thing we need to do is to create some nodes here so we can string them together into a graph. Let’s start with the Agent Node:
```python
def agent_node(input: AgentState):
    agent_outcome: AgentActionMessageLog = runnable_agent.invoke(input)
    return {"agent_outcome": agent_outcome}
```
We define the node as a simple function that takes `input`, which will be the `AgentState` object for all nodes. It then calls the `invoke` method on the agent with that input and catches the return value in a variable named `agent_outcome`, which is of type `AgentActionMessageLog`. This `agent_outcome` will hold either the `AgentAction` object or the `AgentFinish` object that we talked about earlier, indicating what the next step is or whether the agent is done. Whatever is in `agent_outcome`, this function simply returns it in a dictionary.
Now that we have an agent node we need another node to execute the tools that the agent wants to call. Let’s define the Tool Executor Node:
```python
tool_executor = ToolExecutor(TOOLS)


def tool_executor_node(input: AgentState):
    agent_action = input["agent_outcome"]
    output = tool_executor.invoke(agent_action)
    print(f"Executed {agent_action} with output: {output}")
    return {"intermediate_steps": [(agent_action, output)]}
```
First, we create a new instance of the `ToolExecutor` class that we imported from LangGraph. This `ToolExecutor` is initialized by giving it our list of tools, which contains two tools in this case. The `ToolExecutor` provides a prebuilt interface that extracts the function and arguments the agent wants to call from the `AgentAction` object and then calls that function with those arguments, so we don’t have to do this manually.
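If you’re curious what that saves us, a rough by-hand equivalent might look something like this sketch (just an illustration of the idea, not the actual `ToolExecutor` implementation; `call_tool_manually` is a hypothetical helper):

```python
def call_tool_manually(agent_action: AgentAction, tools: list) -> str:
    # Look the requested tool up by the name stored on the AgentAction...
    tools_by_name = {tool.name: tool for tool in tools}
    tool = tools_by_name[agent_action.tool]
    # ...and invoke it with the arguments the agent provided.
    return tool.invoke(agent_action.tool_input)
```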
Then we define the `tool_executor_node` function, which again is just a simple function taking `input` (which will be the state object). We extract the `agent_action` from the input dictionary and then call the `invoke` method on the `tool_executor` object, which runs whatever tool the agent wants to call for us.
We have a print statement just for our own visual feedback here, and then we return the `intermediate_steps` list with the `agent_action` and the output of the tool call. Notice that this is the intermediate steps list we defined in the `AgentState` object and talked about earlier, so this tuple will be added to whatever steps were already there.
![](https://academy.finxter.com/wp-content/uploads/2024/04/image-2-1024x717.png)
Now that we have these two node functions, we need a way to decide whether to finish the graph because the Agent Node has arrived at the final answer, or continue on to the Executor Node because there is a tool call to execute. We can do this by defining a function that checks whether the agent is done:
```python
def continue_or_end_test(data: AgentState):
    if isinstance(data["agent_outcome"], AgentFinish):
        return "END"
    else:
        return "continue"
```
This function takes the `AgentState` object as input and simply indexes into the `agent_outcome`. We said earlier that the `agent_outcome` will either be an `AgentAction` object (if the agent is still working) or an `AgentFinish` object if the agent is done. So if the `agent_outcome` is an instance of `AgentFinish` we return `"END"` to signal that the graph is done; otherwise, we return `"continue"` to signal that the graph should continue.
Creating our Graph
Now that we have two nodes and a test to see if we need to continue (this is just a very simple first example to explain the concepts), we can define our graph. The main type of graph in LangGraph is the `StateGraph`, which passes a state object around as we discussed. Each node then returns some kind of update to that state, either setting specific attributes or adding to an existing attribute like the `intermediate_steps` list.
Setting up our graph is easy:
```python
workflow = StateGraph(AgentState)

workflow.add_node("agent", agent_node)
workflow.add_node("tool_executor", tool_executor_node)

workflow.set_entry_point("agent")
```
First, we instantiate a new `StateGraph`, passing in the `AgentState` definition we wrote. We then add our two nodes, giving each a string name first and passing in the function we wrote for it second. Lastly, we set the entry point to the `agent` node, which is the first node that will be called when we start the graph.
![](https://academy.finxter.com/wp-content/uploads/2024/04/image-3.png)
Now we have a graph with an entry point. The next step is to define the connections, called edges, between the nodes. This is also very easy:
```python
workflow.add_edge("tool_executor", "agent")
workflow.add_conditional_edges(
    "agent", continue_or_end_test, {"continue": "tool_executor", "END": END}
)
```
First, we add an edge from the `tool_executor` node back to the `agent` node. After we execute a tool call, we always want to feed the result back into the agent node.
Then we add a conditional edge from the `agent` node. We pass in our `continue_or_end_test` function, which determines where this edge leads. If the function returns `"continue"` we go to the `tool_executor` node, and if it returns `"END"` we go to the `END` node. The `END` node is a special prebuilt node that was part of our imports when we started this file.
Our simple graph in visual form now looks like this:
![](https://academy.finxter.com/wp-content/uploads/2024/04/image-4-1024x416.png)
Now that we have our graph defined, we need to take the final step, which is to `compile` the graph before we can use it:
```python
weather_app = workflow.compile()
```
Testing our Graph
Now let’s whip up a quick function to test our graph:
```python
def call_weather_app(query: str):
    inputs = {"input": query, "chat_history": []}
    output = weather_app.invoke(inputs)

    result = output.get("agent_outcome").return_values["output"]  # type: ignore
    steps = output.get("intermediate_steps")

    print(f"{Fore.BLUE}Result: {result}{Style.RESET_ALL}")
    print(f"{Fore.YELLOW}Steps: {steps}{Style.RESET_ALL}")
    return result
```
The function takes a string query. As input, we need to define the `input` key with the query and an empty `chat_history` list, as we don’t have any previous history for now. We then call `invoke` on the `weather_app` graph object and catch the output in a variable named `output`. The `agent_outcome` will hold an `AgentFinish`, which has the `return_values` attribute that holds the final answer, as we discussed.
The `# type: ignore` comment is just for the type checker here, as it doesn’t know that `agent_outcome` will always be an `AgentFinish` object at this point, and I don’t want to go too far into type hinting in this tutorial. If you don’t use type checking you won’t need the comment. We also extract the `intermediate_steps` list from the output into a variable named `steps`.
When we started the file we imported `Fore` and `Style` from the `colorama` library. This library had already been installed as a dependency of something else, so we didn’t have to install it ourselves. `Fore.BLUE` sets the text foreground color to blue and `Style.RESET_ALL` resets the color back to the default; we repeat the pattern on the next line with yellow for easy readability.
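If you haven’t used colorama before, here is a quick standalone sketch of the pattern (nothing project-specific in it):

```python
from colorama import Fore, Style

# Wrap the text you want colored between a Fore.<COLOR> and a Style.RESET_ALL,
# otherwise everything printed afterwards keeps the same color.
print(f"{Fore.BLUE}This line prints in blue.{Style.RESET_ALL}")
print(f"{Fore.YELLOW}This line prints in yellow.{Style.RESET_ALL}")
print("And this one is back to the default terminal color.")
```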
Now we can test our graph by calling the function with a query:
```python
call_weather_app("What is the weather in New York?")
```
Go ahead and run this and you should see the final answer in blue:
```
Result: The current weather in New York is sunny with a temperature of 35.1°F (1.7°C). The wind is coming from the north at 11.2 km/h. The humidity is at 52%, and the visibility is 16.0 km.
Steps: All the steps here in yellow...
```
Good! That worked. The steps are a bit hard to read, but that is what we have LangSmith for. Head over to https://smith.langchain.com/ and check out your trace under the project name LangGraph Basics. Take the one named `LangGraph`, as the `RunnableSequence` one is from the partial test we did before we built our graph:
![](https://academy.finxter.com/wp-content/uploads/2024/04/image-5-1024x772.png)
We can see that the graph started with our `agent`, then went to the `tool_executor`, back to the agent, and then ended. Click on any of the steps to see more detail. Nice and readable, right?
Something a bit cooler!
So let’s give our simple graph test here a bit of a bigger challenge! Comment out the old query and let’s ask something a bit harder:
```python
# call_weather_app("What is the weather in New York?")
call_weather_app(
    "Give me a visual image displaying the current weather in Seoul, South Korea."
)
```
Let’s run this and see what we get (it should auto-save an image in the project’s `images` folder):
```
Result: Here is the visual image displaying the current weather in Seoul, South Korea:
![Seoul, South Korea Weather](c:\Coding_Vault\FINX_LANGGRAPH\images\152cf0e0-c50e-483b-be63-50ef40ea3255.png)
```
![](https://academy.finxter.com/wp-content/uploads/2024/04/152cf0e0-c50e-483b-be63-50ef40ea3255.png)
That’s pretty good! It has the temperature and the rain. I can confirm that it is currently dark and rainy over here and this also corresponds to the weather data the API sent back. Pretty dang cool right!?
If we look at the LangSmith trace we’ll see exactly what we expect:
![](https://academy.finxter.com/wp-content/uploads/2024/04/image-6-1024x772.png)
The agent calls the weather function, the result comes back to the agent, which then calls the image function, and finally it ends by giving us the image. I’ll leave you to click on any of the steps if you want to see the inputs and outputs at each stage.
Of course, we could bake the request for a visual image into the prompt so the user doesn’t have to type it, and improve on this in many other ways, such as directly displaying the image to the end user. But that is not the point here; this is just a simple demonstration of how the edges and nodes come together to create a graph. A rough sketch of the prompt idea follows below.
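For instance, a minimal sketch of swapping the pulled Hub prompt for a custom one that always asks for a weather image might look like this (the prompt text and constant name are my own assumptions, not part of the project; it simply mirrors the structure of `hwchase17/openai-functions-agent`):

```python
from langchain_core.prompts import ChatPromptTemplate, MessagesPlaceholder

# Hypothetical custom prompt: same structure as the Hub prompt (system message,
# chat history, human input, agent scratchpad), but with the image request baked in.
IMAGE_WEATHER_PROMPT = ChatPromptTemplate.from_messages(
    [
        (
            "system",
            "You are a helpful assistant. When asked about the weather, always "
            "generate a visual image displaying the current weather as well.",
        ),
        MessagesPlaceholder(variable_name="chat_history"),
        ("human", "{input}"),
        MessagesPlaceholder(variable_name="agent_scratchpad"),
    ]
)

# You could then build the agent with this prompt instead of the Hub one:
# runnable_agent = create_openai_functions_agent(LLM, TOOLS, IMAGE_WEATHER_PROMPT)
```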
In the next part we’ll take this up a step. Where we basically have a single agent now, we’ll look at having a whole team of agents working together! I’ll see you in the next part!
P.S. I generated another one just for fun and it’s pretty good:
![](https://academy.finxter.com/wp-content/uploads/2024/04/8bd5c7bc-a13f-4ae0-b1eb-9ee853c1b2c7.png)