Welcome back to part 6! In this part, we’re going to look at the LangChain Expression Language (LCEL) for chaining together elements to create an LLM chain that we can run. You’ve seen this syntax briefly in part 1 but we haven’t gone much into it after that. In this part, you will learn how to build a complex LLM chain with several layers interwoven with each other. Despite this complexity, it will still be very readable and easy to understand thanks to the LangChain expression language.
Before we get started though, as we’re going to be building an RCI chain, let’s talk about what an RCI chain is and why it is useful. RCI stands for:
Recursive Criticism and Improvement
What this basically means is that we’re going to ask a question to the language model, ChatGPT in our case. Then we’re going to get a criticism on this answer, looking for any problems, errors, or areas where this answer can be improved. Then we call ChatGPT again and ask it to improve its answer based on the critique.
So who is going to be doing the critiquing? Well, it turns out that Large Language Models are surprisingly good at critiquing themselves! So we’re going to ask ChatGPT to critique its own answer and then improve its answer based on the critique. This is the basic idea behind RCI.
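The answer, critique, improve cycle described above can be sketched in a few lines of plain Python. This is only a rough illustration under stated assumptions: `rci`, `fake_llm`, the `rounds` parameter, and the prompt wording are all made up for this sketch, and `fake_llm` returns canned replies so the code runs without an API key.

```python
from typing import Callable

# Hypothetical stand-in for a real LLM call: takes a prompt, returns text.
LLM = Callable[[str], str]


def rci(question: str, llm: LLM, rounds: int = 1) -> str:
    """Get an answer, then critique and improve it `rounds` times."""
    answer = llm(f"Question: {question}")
    for _ in range(rounds):
        critique = llm(
            f"Question: {question}\nAnswer: {answer}\n"
            "Review the answer and find out what is wrong with it."
        )
        answer = llm(
            f"Question: {question}\nAnswer: {answer}\nCritique: {critique}\n"
            "Based on this information, give an improved answer."
        )
    return answer


def fake_llm(prompt: str) -> str:
    """Canned replies (an assumption) so the sketch runs offline."""
    if "Critique:" in prompt:
        return "improved answer"
    if "Review the answer" in prompt:
        return "critique of the answer"
    return "first answer"


print(rci("What is an elephant?", fake_llm))
```

The chain we build in this part runs exactly one critique and one improvement pass, which corresponds to `rounds=1` in this sketch.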
Why is this useful? As you can read in this paper (https://arxiv.org/pdf/2303.17491.pdf), which I think is the origin of the term RCI itself, it is very effective at solving computer tasks and trickier reasoning problems. The future of AI is clearly going to involve more and more AI doing things for us, especially computer tasks. So these researchers seem to be on to something with this RCI idea, and I’ve actually already seen this used in real-life applications as well, but more on that later.
Let’s build stuff!
So let’s get started with a practical example, before we get too stuck in theory, and create a new folder called ‘6_RCI_and_LCEL‘ and inside we’ll create a new file named ‘1_RCI_chain.py‘ like this:
Finx_LangChain
    1_Summarizing_long_texts
    2_Chat_with_large_documents
    3_Agents_and_tools
    4_Custom_tools
    5_Understanding_agents
    6_RCI_and_langchain_expression_language
        1_RCI_chain.py
    .env
The file structure for this part of the tutorial will be pretty simple, to give you a break after that last one!
Inside our ‘1_RCI_chain.py‘ file we’ll start by importing what we’ll need, as usual:
from dataclasses import dataclass
from typing import Optional

import langchain
from decouple import config
from langchain.chat_models import ChatOpenAI
from langchain.prompts.chat import (
    ChatPromptTemplate,
    HumanMessagePromptTemplate,
    SystemMessagePromptTemplate,
)
from langchain.schema.output_parser import StrOutputParser
We’re going to use Python’s built-in dataclasses and Optional type to define a simple data structure, we import langchain so we can discuss the debug feature later, config as always, and ChatOpenAI. We also import some prompt template convenience classes which will make it easier to create our prompts from the templates, and a StrOutputParser which will basically just return the string output of the LLM back to us from the JSON object ChatGPT sends back.
Now let’s set up our ChatGPT API:
chatgpt_api = ChatOpenAI(
    model="gpt-3.5-turbo",
    temperature=0,
    openai_api_key=config("OPENAI_API_KEY"),
)
RCI_log dataclass
Before we dive into the chain, let’s use a simple dataclass to define our own simple data structure. We will use this type of data structure to represent and store all the stages of an RCI call: the question, initial answer, critique, and final answer. We’ll call this dataclass RCI_log and define it below:
@dataclass
class RCI_log:
    question: str
    initial_answer: Optional[str] = None
    constructive_criticism: Optional[str] = None
    final_answer: Optional[str] = None

    def dict(self):
        return self.__dict__.copy()
We use Python’s @dataclass decorator to define a dataclass, which is basically just a class that is used to store data. We can define the fields of the dataclass in the class definition, and then we can create instances of this class and store data in them. We name our dataclass RCI_log and define the fields question, initial_answer, constructive_criticism, and final_answer.
The question is of type string and required, whereas the other three fields use the Optional type with [str] in the brackets. This means that these three fields should either be a string, or they should be ‘None‘, which makes them optional. We set the default value for these three fields to None, as this will allow us to create an RCI_log object passing in just the question, and then we can fill out the other fields later.
We can also define methods in the class, which we do here with the ‘dict‘ method. This method just returns a copy of the .__dict__ attribute. The .__dict__ attribute is a special attribute in Python that returns a dictionary containing the attributes and their values of an object. It is used to access the internal dictionary that holds the instance attributes of an object and basically exposes what’s actually stored in memory. We can use this to get a dictionary representation of our RCI_log object, which will be useful later.
One benefit of this dataclass over just using a normal dictionary is that your type-checker (if you have one turned on), will complain if we accidentally mistype a property or try to set one that doesn’t exist and we also get IntelliSense autocompletion because we have predefined the properties. This dataclass is a bit overkill for just this small tutorial but if you’re working with large projects passing loads of types of data around this can really help keep things organized.
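To make the dataclass behavior concrete, here is a small standalone sketch that repeats the RCI_log definition so it can run on its own. The sample question and answer are made up, and the comparison with the standard library’s `dataclasses.asdict` is an extra that the tutorial itself does not use.

```python
from dataclasses import asdict, dataclass
from typing import Optional


@dataclass
class RCI_log:
    question: str
    initial_answer: Optional[str] = None
    constructive_criticism: Optional[str] = None
    final_answer: Optional[str] = None

    def dict(self):
        return self.__dict__.copy()


# Only the question is required; the other fields default to None.
log = RCI_log("What is an elephant?")
print(log.dict())

# Fields can be filled in later, one stage at a time.
log.initial_answer = "A large mammal."

# dict() returns a copy, so editing the copy leaves the log untouched.
snapshot = log.dict()
snapshot["question"] = "changed"
print(log.question)

# For a flat dataclass like this one, asdict() gives the same dictionary.
print(asdict(log) == log.dict())
```

Because `dict()` returns a copy, whatever receives the dictionary cannot accidentally modify the log itself.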
Creating our RCI chain
Anyway, that was just a little detour, let’s get cracking on our RCI LLM chain! Let’s declare a function that will run our RCI chain, below and outside of the dataclass indentation:
def run_rci_chain(question: str) -> RCI_log:
    log: RCI_log = RCI_log(question)
Our function named run_rci_chain takes a question as a string and returns an RCI_log, which is the type we just defined above using our dataclass. We create an instance of this dataclass and store it in a variable named log, passing in the question as the first argument.
Now define an inner function inside this function:
def run_rci_chain(question: str) -> RCI_log:
    log: RCI_log = RCI_log(question)

    def combine_system_plus_human_chat_prompt(
        sys_template: str, human_template: str
    ) -> ChatPromptTemplate:
        return ChatPromptTemplate.from_messages(
            [
                SystemMessagePromptTemplate.from_template(sys_template),
                HumanMessagePromptTemplate.from_template(human_template),
            ]
        )
We define a function called combine_system_plus_human_chat_prompt which takes a system template and human template as strings and will return a ChatPromptTemplate object. What is a ChatPromptTemplate object? Go ahead and hover over the object name in your IDE and you’ll see that it is a prompt template for chat models, which you can think of as a list of (role, message) tuples forming a message history.
This function then returns a ChatPromptTemplate created with the .from_messages method, which takes a list of messages as an argument. We create this list of messages by using the .from_template method for both a system message and a human message, passing in whatever system and human templates were given to the function (we haven’t created these yet).
Conceptually, this gives us an object like this, with whatever system and human prompt templates we passed in:
[
    ("system", "You are a helpful AI bot.... blabla setup instructions"),
    ("human", "Whatever we want the LLM to do for us on this ChatGPT call."),
]
# Do not put in your file #
That’s all a ChatPromptTemplate object is, a combination of several prompt templates into a chat history type list of tuples, with a role like “system” or “human” assigned for each message. So now we will need to create one of these ChatPromptTemplate objects for every call we will make to ChatGPT.
Still inside the run_rci_chain function, but outside the inner function:
def run_rci_chain(question: str) -> RCI_log:
    log: RCI_log = RCI_log(question)

    def combine_system_plus_human_chat_prompt():
        .....

    initial_chat_prompt = combine_system_plus_human_chat_prompt(
        "You are a helpful assistant that provides people with correct and accurate answers.",
        "{question}",
    )
We create our initial_chat_prompt by using the function we just created to combine a system prompt of "You are a helpful assistant that provides people with correct and accurate answers." and then a human prompt of whatever the user’s question or input was by replacing the {question} placeholder. Again, you can think of this as a list of tuples with the first tuple holding the ("system", "instructions") system role and message and the second tuple holding the ("human", "question") human role and message.
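To see what filling in the {question} placeholder amounts to, here is a plain-Python sketch that uses ordinary string formatting on a list of (role, template) tuples. It is only a mental model and not how LangChain is implemented internally; `fill_messages` and `initial_templates` are hypothetical names made up for this sketch.

```python
from typing import Dict, List, Tuple

Message = Tuple[str, str]  # (role, text)


def fill_messages(templates: List[Message], values: Dict[str, object]) -> List[Message]:
    """Fill the {placeholders} in every template; unused keys in `values` are ignored."""
    return [(role, text.format(**values)) for role, text in templates]


initial_templates = [
    ("system", "You are a helpful assistant that provides people with correct and accurate answers."),
    ("human", "{question}"),
]

# Extra keys such as initial_answer are simply not used by these templates.
messages = fill_messages(
    initial_templates,
    {"question": "What is an elephant?", "initial_answer": None},
)
for role, text in messages:
    print(f"{role}: {text}")
```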
After we ask the initial question we will need to get a critique on the answer we got, so let’s set up that prompt as well:
    critique_chat_prompt = combine_system_plus_human_chat_prompt(
        "You are a helpful assistant that looks at a question and its given answer. You will find out what is wrong with the answer and give a critique.",
        "Question:\n{question}\nAnswer Given:\n{initial_answer}\nReview the answer and find out what is wrong with it.",
    )
We run our function again to create the second ChatPromptTemplate object but this time the system prompt with ChatGPT’s instructions is completely different. We ask it to find out what is wrong and critique the first answer. (If there’s nothing wrong with it, it will tell us). We then feed it the original question and the answer that was given.
Now, we just need another ChatPromptTemplate with a system and human message for the third and final call:
    improvement_chat_prompt = combine_system_plus_human_chat_prompt(
        "You are a helpful assistant that will look at a question, its answer and a critique on the answer. Based on this answer and the critique, you will write a new improved answer.",
        "Question:\n{question}\nAnswer Given:\n{initial_answer}\nConstructive Criticism:\n{constructive_criticism}\nBased on this information, give only the correct answer.\nFinal Answer:",
    )
This time we ask it for a new and improved answer, based on the question, the initial answer, and the constructive criticism, all of which we feed into the template using {placeholders}.
LangChain Expression Language
So what is this LangChain Expression Language that we’ll be using? It’s actually very simple, and you’ve already seen it in part 1 of the tutorial.
chain = prompt | model # Do not put in your file #
Expression language allows you to compose chains in LangChain by simply using the | pipe operator. So the above simply means that the prompt feeds into the model. It’s kind of like the pipe operator in Bash, where the output of the first item is ‘piped’ into the input of the second, creating a ‘chain’ for Large Language Models, hence the name ‘LangChain’.
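If you are curious how a | operator can chain objects together in Python at all, the toy sketch below shows the general idea using the `__or__` special method. This is not LangChain’s actual implementation; the `Step` class and the fake prompt, model, and parser are hypothetical stand-ins so the example runs without any API calls.

```python
from typing import Any, Callable


class Step:
    """Hypothetical toy wrapper; not LangChain's real classes."""

    def __init__(self, func: Callable[[Any], Any]):
        self.func = func

    def __or__(self, other: "Step") -> "Step":
        # `a | b` builds a new step that runs a first, then feeds its output to b.
        return Step(lambda value: other.func(self.func(value)))

    def invoke(self, value: Any) -> Any:
        return self.func(value)


# Fake building blocks standing in for a prompt, a model and an output parser.
prompt = Step(lambda inputs: f"Answer briefly: {inputs['question']}")
model = Step(lambda text: {"content": text.upper()})  # pretend LLM response
parser = Step(lambda response: response["content"])

chain = prompt | model | parser
print(chain.invoke({"question": "What is an elephant?"}))
# prints: ANSWER BRIEFLY: WHAT IS AN ELEPHANT?
```

Each | produces a new object that can itself be piped further, which is why chains of any length read from left to right.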
So let us try this out, and create a chain to run our first initial_chat_prompt through ChatGPT, still continuing inside the run_rci_chain function:
def run_rci_chain(question: str) -> RCI_log:
    .....
    .....
    .....
    initial_chain = initial_chat_prompt | chatgpt_api | StrOutputParser()
So we declare a new LangChain chain named initial_chain, and we use the | pipe operator to pipe the initial_chat_prompt containing our initial system message and the human message containing the user query into the chatgpt_api and then pipe the output of that into the StrOutputParser. Again, the StrOutputParser will simply return the LLM’s final answer to us.
So if we call this first chain, we will get our first answer, but there is one last thing we need to call a chain like this. We’ve actually also seen this in part 1. The initial chat prompt has the {placeholder} values in there, which need to be replaced by our values. We pass these in using a simple dictionary. So we could invoke our initial chain like this:
# Example, do not keep in your code #
answer = initial_chain.invoke({"question": "What is an elephant?"})
And this would totally work. (Note that if you insert this into your file and run it you will see no output, as the run_rci_chain function is not called anywhere yet; we’ll do that later.) However, remember that we created the RCI_log data type that just so happens to contain entries for all the variables our prompt templates will need! (Which is of course no coincidence.) We also gave it a convenient method to output all its data to a dictionary.
So the above can be replaced by:
answer = initial_chain.invoke(log.dict())
Which simply passes in the dictionary form of the data inside our RCI_log, and then calls the chain. Convenient! We now have:
    initial_chain = initial_chat_prompt | chatgpt_api | StrOutputParser()
    answer = initial_chain.invoke(log.dict())
This answer will contain our initial answer, which we need to store in our RCI_log object, so let’s change the code:
    initial_chain = initial_chat_prompt | chatgpt_api | StrOutputParser()
    log.initial_answer = initial_chain.invoke(log.dict())
We invoke the initial chain, passing in our log dictionary which only has the question in it, and in return, we get the initial answer, which we store in our log.initial_answer field. Note how simple and readable this is, thanks to the LangChain expression language. So now let’s add the second step.
    critique_chain = critique_chat_prompt | chatgpt_api | StrOutputParser()
    log.constructive_criticism = critique_chain.invoke(log.dict())
We create a critique chain, using the critique prompt we already set up, pipe it into ChatGPT, and then into the string output parser. We invoke this chain, passing in our log dictionary, which by now contains both the question and the initial answer, allowing the critique chat prompt’s {placeholders} to be filled in. We then store the output of this chain in our log.constructive_criticism field.
Now for the last one:
    improvement_chain = improvement_chat_prompt | chatgpt_api | StrOutputParser()
    log.final_answer = improvement_chain.invoke(log.dict())
We do exactly the same again, creating our final chain and passing in our log‘s dictionary which has all three values needed by now and then we store the final answer in our log dataclass.
Now let’s add a print statement for some nice readable output:
    print(
        f"""
        Question: {log.question}
        Answer Given: {log.initial_answer}
        Constructive Criticism: {log.constructive_criticism}
        Final Answer: {log.final_answer}
        """
    )
    return log
We print a multi-line string with all the data in our RCI_log object, and finally, we also return the log, as we promised on declaring this function that we would return an object of type RCI_log (-> RCI_log) and we should keep our promises!
So here’s the whole run_rci_chain function:
def run_rci_chain(question: str) -> RCI_log:
    log: RCI_log = RCI_log(question)

    def combine_system_plus_human_chat_prompt(
        sys_template: str, human_template: str
    ) -> ChatPromptTemplate:
        return ChatPromptTemplate.from_messages(
            [
                SystemMessagePromptTemplate.from_template(sys_template),
                HumanMessagePromptTemplate.from_template(human_template),
            ]
        )

    initial_chat_prompt = combine_system_plus_human_chat_prompt(
        "You are a helpful assistant that provides people with correct and accurate answers.",
        "{question}",
    )

    critique_chat_prompt = combine_system_plus_human_chat_prompt(
        "You are a helpful assistant that looks at a question and its given answer. You will find out what is wrong with the answer and give a critique.",
        "Question:\n{question}\nAnswer Given:\n{initial_answer}\nReview the answer and find out what is wrong with it.",
    )

    improvement_chat_prompt = combine_system_plus_human_chat_prompt(
        "You are a helpful assistant that will look at a question, its answer and a critique on the answer. Based on this answer and the critique, you will write a new improved answer.",
        "Question:\n{question}\nAnswer Given:\n{initial_answer}\nConstructive Criticism:\n{constructive_criticism}\nBased on this information, give only the correct answer.\nFinal Answer:",
    )

    initial_chain = initial_chat_prompt | chatgpt_api | StrOutputParser()
    log.initial_answer = initial_chain.invoke(log.dict())

    critique_chain = critique_chat_prompt | chatgpt_api | StrOutputParser()
    log.constructive_criticism = critique_chain.invoke(log.dict())

    improvement_chain = improvement_chat_prompt | chatgpt_api | StrOutputParser()
    log.final_answer = improvement_chain.invoke(log.dict())

    print(
        f"""
        Question: {log.question}
        Answer Given: {log.initial_answer}
        Constructive Criticism: {log.constructive_criticism}
        Final Answer: {log.final_answer}
        """
    )
    return log
Of course, in a real project, we probably should not store the templates inside the function itself, but store them in some kind of data object of their own, but I don’t want to pollute this tutorial with too many distractions. So let’s give our RCI chain a test!
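As one possible way to keep the templates out of the function, the sketch below stores them in a small frozen dataclass at module level. This is just one option among many: `PromptPair` and `RCI_PROMPTS` are hypothetical names invented for this sketch, and only two of the three prompts are shown to keep it short.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PromptPair:
    """Hypothetical container for one system/human template pair."""

    system: str
    human: str


# Module-level registry of templates (hypothetical name).
RCI_PROMPTS = {
    "initial": PromptPair(
        system="You are a helpful assistant that provides people with correct and accurate answers.",
        human="{question}",
    ),
    "critique": PromptPair(
        system="You are a helpful assistant that looks at a question and its given answer. You will find out what is wrong with the answer and give a critique.",
        human="Question:\n{question}\nAnswer Given:\n{initial_answer}\nReview the answer and find out what is wrong with it.",
    ),
}

# Quick check that the placeholders line up with the values we plan to pass in.
pair = RCI_PROMPTS["critique"]
print(pair.human.format(question="What is 2 + 2?", initial_answer="5"))
```

Inside run_rci_chain you could then pass `pair.system` and `pair.human` to combine_system_plus_human_chat_prompt instead of hard-coding the strings there.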
Testing our RCI chain
We’re going to be asking a trick question that will be hard even for most humans to answer, as it’s related to a particular niche and particular people. Add the following:
question = "who was the first man to win 9 consecutive races in formula 1?"
print(run_rci_chain(question))
And then run the file and I get:
Question: who was the first man to win 9 consecutive races in formula 1?
Answer Given: The first man to win 9 consecutive races in Formula 1 was Alberto Ascari. He achieved this remarkable feat between 1952 and 1953.
Constructive Criticism: The answer provided is incorrect. While Alberto Ascari was indeed a successful Formula 1 driver, he did not win 9 consecutive races. The correct answer to the question is Sebastian Vettel. He achieved this impressive feat between 2013 and 2014, winning 9 consecutive races.
Final Answer: The first man to win 9 consecutive races in Formula 1 was Sebastian Vettel. He achieved this feat between 2013 and 2014.
This is a tricky question. It is a specific topic and a very specific question, and Alberto Ascari did not win 9 in a row, because he technically did not compete in a race in between. But, Formula 1 racing trivia aside, the point is that even ChatGPT-3.5-turbo can be fooled. But, quite impressively enough, it is capable of finding its own mistake and correcting its own wrongs! This is the idea behind RCI.
Practical uses for RCI
So why is this useful? How is this used in real life? Surely asking trick questions is not the only use, right? It can be used for particularly difficult questions. For instance, the coding helper “GitHub Copilot”, which integrates into VS Code, uses this kind of approach. It will generate something for you if you ask it for a fix or help and then go over it again using another pass, just like our RCI chain, to catch mistakes it has made and provide an improved version of the answer straight away. It doesn’t always come up with a useful or perfect answer, but nor does it need to; it’s not a perfect tool. But the RCI mechanism in this case significantly improves the likelihood of the answer being useful or at least pointing the coder in the right direction.
In addition, as the article linked at the start of this tutorial mentions, it’s showing promise in executing computer tasks. LLMs tend to find it hard to compile the correct steps in the correct order right away instead of simply compiling a list of some of the steps in some order, which as you well know will not work for computer tasks. You need all the steps and you need them to be in the correct order. This makes it especially tricky to have an LLM reliably execute computer operations instead of a human operator. But this is where the future is inevitably going, and RCI and similar approaches are looking like a stepping stone in that direction.
LangChain’s debug setting
Before we wrap up this LangChain course and send you off into the wild to build your own LangChain stuff, I want to give you one more tool in your toolbag as a LangChain developer. I’ve deliberately left this out of the course so far as the output can be quite overwhelming and more confusing than helpful at first, but now that you have a good grasp of LangChain, let’s talk about the debug feature.
When you’re building complex chains and applications, sometimes it’ll not quite work as you expect and you want to truly see everything that is going on under the hood. This is where LangChain’s debug setting will come to the rescue. Scroll back up to the top of your file and directly below the imports add the following line:
langchain.debug = True
Now if you run your file again, you will see everything and I mean everything in your console! You will see every single call to ChatGPT and its JSON responses that were received including the token usage, the input for each chain and prompt that was generated, etc. If something is going wrong somewhere but you cannot exactly pinpoint where, this is useful to really figure out on a granular level where you should be looking for a bug in your code.
So that’s it for this LangChain tutorial series! I really hope you enjoyed it and learned a lot. As always, it was my pleasure and an honor, and I hope to see you again in the next one!
This tutorial is part of our original course on Python LangChain.
Original Course Link: Becoming a Langchain Prompt Engineer with Python – and Build Cool Stuff