
LangChain RAG: How to Implement in 7 Easy Steps

LangChain RAG: an easy deep dive for beginners

In this post on LangChain RAG, we will design a basic RAG (Retrieval Augmented Generation) pipeline using LangChain. It is super easy: it takes only 7 simple steps to develop a simple RAG with the LangChain framework.

RAG is a pipeline that helps reduce AI hallucination and customizes your app for a particular use case. You can read more in our post on RAG, which explains how RAG works and how it can be improved further. So, let’s move on!

Steps Involved

When we talk about RAG, its structure is based on three stages:

  1. Indexing
  2. Retrieval
  3. Generation

To break it down further, these are the steps we will follow:

  1. Document Loading
  2. Splitting the text
  3. Vectorization
  4. Storing in vector store
  5. Retrieval of relevant information
  6. Augmentation of retrieved information
  7. Response Generation

Now let’s go through each of these steps in detail.

Step 0: Preparations For the Code

Before we start building our LangChain RAG pipeline, we need to install the required packages and import the right modules, so let’s do that first.

To Install Packages:

pip install -U langchain langchain-community faiss-cpu langchain-openai tiktoken pypdf

Importing Every Module

As always, for your convenience, every module we will use below is also collected here in one place.

from langchain_openai import OpenAI # To interact with OpenAI
from langchain_community.document_loaders import PyPDFLoader # To load the PDF file
from langchain_community.vectorstores import FAISS # Vector store for our embeddings
from langchain_openai import OpenAIEmbeddings # To convert our text into embeddings
from langchain_text_splitters import RecursiveCharacterTextSplitter # To split text into chunks and keep token counts down
import os # To set the OpenAI API key
from langchain.chains import LLMChain # Used to chain the LLM with the retrieved information
from langchain.prompts import PromptTemplate # For the prompt template

Starting The 7 Steps: LangChain RAG Pipeline

A visualization of indexing: Steps 1–4

1. Document Loading: The First Step in LangChain RAG

We need a document from which we are going to retrieve the information. There are various document loaders available in LangChain that can be used to build a RAG, but here we are going to use PyPDFLoader to load and split our PDF at the same time.

The book that we are using is the philosophical novel Metamorphosis by Kafka. You can download it from here.

loader = PyPDFLoader("/content/Metamorphosis.pdf") # Replace it with your file path.
pages = loader.load_and_split()

The good thing about this loader is that it not only loads our document but also splits it into pages.
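If you want a quick sanity check (this is optional and just uses the pages variable from the code above), you can print how many pages were produced and peek at the first one:

print(len(pages)) # Number of pages the loader produced
print(pages[0].page_content[:200]) # First 200 characters of the first page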

2. Splitting the Text into Chunks

Our document is already split into pages, but we still need to split the text it contains into smaller chunks. This is important because every model has a limit on the number of tokens it can take, and we first have to send the text to an embedding model to convert it into embeddings. We are using OpenAI for this, and in our case the total limit was 15k tokens.

text_splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=0)
docs = text_splitter.split_documents(pages)

Our chunk size is 1000 characters, and to check how many chunks our text was split into, use len().

len(docs)

Our text was split into 198 chunks, while the loader had split the document into 70 pages, the same page count as our PDF.
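If you want to verify that each chunk stays well within the token limit, an optional check with tiktoken (which we installed earlier) can count tokens per chunk. The cl100k_base encoding and the enc and token_counts names below are just assumptions for this sketch; the encoding roughly matches OpenAI’s newer models.

import tiktoken # Optional check: count tokens per chunk

enc = tiktoken.get_encoding("cl100k_base") # Encoding roughly matching newer OpenAI models
token_counts = [len(enc.encode(d.page_content)) for d in docs]
print(max(token_counts)) # The largest chunk, measured in tokens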

3. Vectorization: Embeddings Creation in LangChain RAG

Let’s create the embeddings now. To do that, we are using the OpenAI API, so make sure you have set the OpenAI environment variable with your own API key, which you can get from here.

os.environ["OPENAI_API_KEY"] = "My API" # Replace "My API" with your API key.
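As a side note, if you prefer not to hardcode the key in your script or notebook, an optional alternative is to prompt for it at runtime with Python’s built-in getpass:

import getpass # Optional: enter the key at runtime instead of hardcoding it

os.environ["OPENAI_API_KEY"] = getpass.getpass("Enter your OpenAI API key: ")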

Now creating vector embeddings.

embeddings = OpenAIEmbeddings() # embeddings object creation
our_database = FAISS.from_documents(docs, embeddings)

4. Storing Embeddings in a Vector Store

We are using our local machine as the vector store for the embeddings we just created, since FAISS is the simplest option to use, but you can use others too. Soon we will post articles on other vector stores such as Pinecone and Chroma DB, so stay in touch to learn about them as well.

We have already created the FAISS index in the code above; to save it on the local machine as a pickle file, use the code below.

our_database.save_local("faiss_index") # Replace faiss_index with name of your choice

So now that you have created a vector store for LangChain RAG, let’s retrieve relevant information to fulfill the purpose of the R in our RAG!

The retrieval pipeline of our LangChain RAG: Step 5

5. The R of Our RAG: Retrieval in LangChain RAG

Retrieval is super easy since we have just created a vector store. We don’t need to load the vector store from disk, as it is already in memory, but if you had to, this is how you would do it.

our_database = FAISS.load_local("faiss_index", embeddings) # Loading back into the same variable, our_database

While loading, you might see an error warning that the file could be dangerous because it comes from an untrusted source. To resolve it, use the code below, in which we set allow_dangerous_deserialization to True.

our_database = FAISS.load_local("faiss_index", embeddings, allow_dangerous_deserialization=True)

Now that the store is loaded, this is how we retrieve from it. To see what has been retrieved, we also print the results for the query we use throughout our LangChain RAG pipeline: “Who was Gregor?”

our_query = "who was Gregor?" # Python is case sensitive, so take care with the capitalization of your query.
docs = our_database.similarity_search(our_query)
print(docs)

Let’s look at the output to see what was retrieved from the vector store.

Output:

[Document(page_content='several hours as he sat reading a number of different \nnewspapers. On the wall exactly opposite there was \nphotograph of Gregor when he was a lieutenant in the \na r m y , h i s s w o r d i n h i s h a n d a n d a c a r e f r e e s m i l e o n h i s', metadata={'source': '/content/Metamorphosis.pdf', 'page': 16}), ..........  })]

To keep it simple, I’ve shortened the output so you can see it clearly. You can download the whole document here to see the full retrieval output of our LangChain RAG pipeline.
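By default, similarity_search returns the four most similar chunks. If you want tighter control over how much context comes back, you can pass k, or wrap the vector store as a retriever. The snippet below is a small optional sketch of both; the top_docs name is just illustrative.

top_docs = our_database.similarity_search(our_query, k=2) # Return only the 2 most similar chunks

retriever = our_database.as_retriever(search_kwargs={"k": 2}) # The same idea, wrapped as a retriever
top_docs = retriever.invoke(our_query) # .invoke works on recent LangChain versions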

Generation in our LangChain RAG pipeline: Steps 6 & 7

6. The A in RAG: Augmentation in LangChain RAG Pipeline

To augment the information retrieved in our LangChain RAG pipeline, we will use the distinctive feature hinted at in LangChain’s name: chains.

To augment the retrieved context with our query inside a chain, we first need a prompt template that will produce a proper answer for our query. So let’s design the prompt template first and then plug it into the chain.

To use prompt templates and chains, we need to import them too.

from langchain.chains import LLMChain
from langchain.prompts import PromptTemplate

# Written here just as a reminder. Both modules are already imported if you have run every line of code from the beginning.

We will use the same query, “Who was Gregor?”, and the retrieved information as the context in our prompt template.

If you don’t know how a prompt template is designed, read our post on how to craft a LangChain prompt template and then come back here; if you already know, keep reading.

Prompt Template Crafting

Prompt = """This context is taken from a PDF of a novel. Given the context defined in my context, answer the user_query.

user_query: {query}

my context: {context}"""

While designing the prompt template, make sure the string is valid in your editor. If any errors are flagged, keep the prompt triple-quoted as above, or remove the unnecessary line breaks and put the entire prompt on a single line.

Since this is just a text string, we need to wrap it in a PromptTemplate structure to complete this part of our LangChain RAG pipeline.

Our_template = PromptTemplate(
    input_variables=["context", "query"], # The input variables tell which placeholders will be replaced
    template=Prompt)

Crafting a good prompt template is one of the most crucial steps in designing our LangChain RAG pipeline. To see how the filled-in prompt looks, use the .format() method.

Prompt.format(query=our_query, context=docs)

This is how our prompt appeared. To see the complete output, download this document, or look at the trimmed output below.

this context is taken from a pdf of novel given the context defined in my context answer the user_query \n user_query : Who was Gregor?, Context :[Document(page_content=\'several hours as he sat reading a number of different \\nnewspapers. On the wall exactly opposite there was \\nphotograph of Gregor when he was a lieutenant in the \\na r m y , h i s s w o r d i n h i s h a n d a n d a c a r e f r e e s m i l e o n h i s\...

After the prompt section, let’s move on to the chain section and see what our LangChain RAG pipeline has to offer.

Augmenting Context and Query with the LLM


llm = OpenAI(temperature=0.5) # Get output from OpenAI with a randomness of 0.5

chain1 = LLMChain(llm=llm, prompt=Our_template) # Chaining our prompt template with the LLM

In the code above, we created an OpenAI LLM with a temperature of 0.5. Temperature controls how much randomness we want in the output: the higher the number, the more random the output. We then chained the LLM with our prompt, given in the structure of a LangChain prompt template. That completes the augmentation step of our LangChain RAG pipeline.

7. Final step in our LangChain RAG pipeline: The Generation

After augmentation is done, the last step is to invoke the chain and complete the process; this generates the output, marking the end of our LangChain RAG pipeline.

chain1.invoke({"context": docs, "query": our_query}) # Invoking the chain with our query and the retrieved documents.

Since the chain we used here is the simplest one, the output we got back combines the retrieved documents and the query with the answer we were looking for. To get a more controlled output, you can use sequential chains, which we will cover in a future article, or simply pull out just the answer text, as shown after the output below.

The answer we received using our LangChain RAG pipeline is shown below. To view the entire output, click here.

Output:

Answer: Gregor was a lieutenant in the army with a carefree smile on his face. He later became a travelling representative and was able to support his family financially, but his family was still suffering from his transformation.
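If you only want the generated answer in your own runs, and not the echoed inputs, note that LLMChain returns a dictionary whose generated text is stored under the "text" key. The result variable below is just a name introduced for this small sketch.

result = chain1.invoke({"context": docs, "query": our_query})
print(result["text"]) # Print only the generated answer, not the echoed inputs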

Conclusion:

Hurray! You now know how to design a basic LangChain RAG pipeline. I hope this post has given you some practical knowledge and hands-on experience. We are continuously publishing posts on LangChain, so keep following the blog; you can read the other posts already published here.

Please share your feedback; it really matters as we strive to add value to your coding experience.

Stay tuned for more posts on Chroma DB, Pinecone, chains in LangChain, improving token limits in LangChain, and many other topics.

To improve your LangChain RAG pipeline, you can also read our other article on RAG, which explains the advanced techniques developed to take RAG to the next level with just some prompting.
