
Revolutionizing E-Commerce: Contextual Searching with RAG and GPT


In the vast digital landscape of today's e-commerce world, finding the right product can often feel like searching for a needle in a haystack. Shoppers are presented with an abundance of choices, and sifting through countless options can be overwhelming. That's where the power of contextual searching comes into play, revolutionizing the way we navigate online stores. Imagine a solution that can understand your needs and guide you effortlessly to the products you seek. Thanks to cutting-edge technology, such a solution now exists, and it's powered by the remarkable synergy of LangChain and GPT. Welcome to a new era of shopping, where the quest for the perfect item is simplified and streamlined through a user-friendly Streamlit application. In this blog, we'll delve into the world of contextual searching and unveil the innovative solution that's changing the game. We'll explore how this application harnesses the power of natural language processing to analyze your plain text questions, transforming them into tailored product recommendations. Join us on this journey as we explore the development and functionality of the app and how it can reshape the way we shop online.


LangChain is a framework for developing applications powered by language models. Its two significant advantages are:

  1. It makes it easy to connect a language model to other sources of data.

  2. It allows a language model to interact with its environment.

LangChain provides multiple tools to work with LLMs. The ones used in this blog are:

1. Indexes: Indexes refer to ways to structure documents so that LLMs can best interact with them. Once a document is loaded, its text is split into smaller chunks. For an input query, only the relevant chunks are retrieved, using similarity scores, and combined with the language model.

2. Models: There are two types of models. The first is the LLM, which we define for our question-answering use case; in our case it is GPT-4. The second is the text embedding model, which produces embeddings for the documents; these embeddings are later used to retrieve similar documents.

3. Prompts: A prompt is the input given to the model to generate a response. You typically provide a prompt to guide the model's response in a specific direction. The prompt can be a question, an incomplete sentence, or a statement that sets the context for the generated text. The model then uses the prompt as a starting point and generates a continuation based on the patterns and knowledge learned during training.

4. Chains: A chain combines a PromptTemplate, a Model, and Guardrails to take user input, format it, pass it to the model, get a response, and then validate and, if necessary, fix the model output.

5. Memory: In some applications (chatbots being a great example), it is important to remember previous interactions, both at a short-term and a long-term level. Memory does exactly that.

Retrieval-Augmented Generation

RAG Architecture

RAG, or Retrieval-Augmented Generation, is an AI framework that improves the output of large language models (LLMs) by grounding them in external knowledge. LLMs can give both accurate and baffling answers because they rely on statistical patterns rather than true understanding. RAG addresses this by connecting the model to external knowledge sources, so that its answers can draw on more reliable and up-to-date information.

Integrating RAG into LLM-based question-answering systems brings two key benefits. First, it gives the model access to current, trustworthy facts instead of relying only on what it saw during training. Second, it lets users see where the model's information comes from, so that responses can be checked for accuracy, which builds trust. RAG also reduces the chances of the LLM making mistakes and helps it stay on track with the context from external sources.

In practical terms, RAG acts as a knowledge anchor for LLMs. It makes them less likely to guess wildly or provide irrelevant information, because responses are grounded in retrieved facts. This improves the quality of the answers and makes the LLMs more reliable and consistent.
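The grounding step described above can be sketched in a few lines of Python. This is a minimal illustration, not the app's actual code: the function name `build_grounded_prompt` and the prompt wording are assumptions, and the two records are taken from the example output later in this post. The idea is that retrieved chunks are pasted into the prompt with their source tags, so the model can cite them.

```python
def build_grounded_prompt(question, retrieved):
    """Combine a question with retrieved chunks into one prompt.

    `retrieved` is a list of (source_id, text) pairs. The name and the
    prompt wording are hypothetical, chosen only for this sketch.
    """
    # Tag every chunk with its source so the model can cite it.
    context = "\n".join(f"[{source_id}] {text}" for source_id, text in retrieved)
    return (
        "Answer the question using only the context below. "
        "Cite the source tag of every fact you use.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}\nAnswer:"
    )


retrieved = [
    ("Item 100", "Art of Everyday Photography, 144 pages, $26.99"),
    ("Item 68", "Photography, 408 pages, $81.00"),
]
prompt = build_grounded_prompt("Which photography book is under 300 pages?", retrieved)
print(prompt)
```

The resulting string is what would be sent to the LLM; the retrieval that produces `retrieved` is a separate step.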

Developing the Streamlit App

Dataset: Github

We will create a Streamlit app to demonstrate the use case, using Colab as the execution environment. The user will be able to upload a dataset and ask relevant questions.

Approach: The development process follows a structured three-step approach:

1. Data Input and Preprocessing:

  • The Streamlit app provides a user-friendly interface to upload multiple datasets.

  • Once the files are uploaded, the app reads and preprocesses them so that the data is in a suitable format for subsequent analysis.

  • This crucial step prepares the data for further processing.

2. Model Integration:

  • This step focuses on creating a vector store and loading the relevant language model, which powers the analysis.

  • By leveraging the loaded model, the app provides recommendations based on the user's input and the dataset provided.

3. Output Postprocessing and Visualization:

  • The final step involves refining the generated output to ensure clarity and relevance.

  • The app incorporates a memory tool that allows users to ask follow-up questions based on the previous output, fostering a seamless interaction.

  • The processed output is displayed in the Streamlit interface. Users can also view their conversation history and download the entire conversation for future reference.

1. Read and preprocess the files

The "read_files" function reads the uploaded file using a method suited to its format, so that multiple file types are supported. It also segments the document into smaller chunks, using Item as the splitting criterion. This segmentation allows us to stay within the token length limit of GPT while still achieving the desired analysis outcomes.
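As a rough sketch of the segmentation idea, the snippet below splits raw text into one chunk per record. It assumes, hypothetically, that every record in the file begins with a line of the form "Item <number>"; the real dataset layout and the real "read_files" implementation may differ. The function name `split_by_item` and the sample text are made up for illustration.

```python
import re

# Matches the empty position right before a line that starts a record.
# Assumption: each record begins with a line such as "Item 12".
ITEM_START = re.compile(r"^(?=Item \d+)", flags=re.MULTILINE)


def split_by_item(text):
    """Return one stripped chunk per "Item <number>" record."""
    parts = ITEM_START.split(text)
    return [part.strip() for part in parts if part.strip()]


sample = """Item 1
Title: Book A
Pages: 120

Item 2
Title: Book B
Pages: 300
"""

chunks = split_by_item(sample)
for chunk in chunks:
    print(repr(chunk))
```

Splitting on record boundaries, rather than on a fixed character count, keeps each product's fields together in a single chunk.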

2. Build the Model with Azure OpenAI embeddings and GPT-4

Within the "model" function, we use FAISS to create a vector store for the document chunks and store it locally. This eliminates the need to recreate the vector store for every question, which improves efficiency. For the prompt, we use a simple statement indicating that the document contains chunks of product information and requires analysis. For question answering, we use RetrievalQAWithSourcesChain, which uses an index for document lookup and returns answers along with their sources. GPT-4 is the primary model, but alternative models such as GPT-3.5 can also be used in the same context.
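To make the retrieval step concrete, here is a small self-contained sketch of similarity search over chunks. It deliberately does not use FAISS or Azure OpenAI embeddings: a toy word-count "embedding" and cosine similarity stand in for them so that the example runs anywhere. The class `ToyVectorStore`, the helper names, and the sample chunks are all hypothetical.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy embedding: lowercase word counts (stand-in for a real model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[token] * b[token] for token in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class ToyVectorStore:
    """Embeds chunks once, then answers many queries (stand-in for FAISS)."""

    def __init__(self, chunks):
        self.chunks = list(chunks)
        self.vectors = [embed(chunk) for chunk in self.chunks]

    def search(self, query, k=2):
        """Return the k (chunk, score) pairs most similar to the query."""
        query_vector = embed(query)
        scored = [(chunk, cosine(query_vector, vector))
                  for chunk, vector in zip(self.chunks, self.vectors)]
        scored.sort(key=lambda pair: pair[1], reverse=True)
        return scored[:k]


chunks = [
    "Item 1: Photography basics, 144 pages",
    "Item 2: Italian cooking recipes, 250 pages",
    "Item 3: Landscape photography guide, 210 pages",
]
store = ToyVectorStore(chunks)
top = store.search("photography books", k=2)
for chunk, score in top:
    print(f"{score:.3f}  {chunk}")
```

Because the vectors are computed once in the constructor, the same store object can serve every question, which is the efficiency gain described above. Only the top-scoring chunks would then be passed to the LLM.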

3. Output Postprocessing

When a user poses a question, we retrieve the corresponding output from the chain and present it in the Streamlit interface. We use session state to display the conversation history, so users can review the entire interaction. An option to download the complete conversation is also available, so users can save it and return to it later.
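The history handling can be sketched without Streamlit as a plain list of turns plus a function that renders a downloadable transcript. In the app, such a list would typically live in `st.session_state`, and the rendered string could be handed to `st.download_button`; both are left out here so the sketch runs on its own. The helper names `add_turn` and `render_transcript` and the placeholder answers are assumptions for illustration.

```python
def add_turn(history, question, answer):
    """Append one question/answer pair to the conversation history."""
    history.append({"question": question, "answer": answer})


def render_transcript(history):
    """Format the whole conversation as plain text for download."""
    lines = []
    for number, turn in enumerate(history, start=1):
        lines.append(f"Q{number}: {turn['question']}")
        lines.append(f"A{number}: {turn['answer']}")
        lines.append("")  # blank line between turns
    return "\n".join(lines)


history = []  # in Streamlit this would be kept in session state
add_turn(history, "Recommend a photography book.", "(model answer 1)")
add_turn(history, "Is it under 100 USD?", "(model answer 2)")

transcript = render_transcript(history)
print(transcript)
```

Keeping the history as structured turns, rather than one growing string, makes it easy both to redraw the conversation on each rerun and to export it.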

To access the Streamlit interface from Colab, we can use localtunnel. Running the !npx localtunnel --port command returns a URL. When you open the URL, you will be prompted to provide an endpoint IP address. To obtain the IP address, use the command "!curl" within the notebook. Once you enter the IP address and submit it, you will be redirected to the Streamlit app. The following images depict the step-by-step process.


Access local tunnel


Documents uploaded: Amazon_Books.txt

Question 1: Recommend me any Photography books with the number of pages between 100-300 pages and less than 100 USD.

RAG output

GPT Answer:

Based on your criteria, you might want to consider the following Photography books:

  1. "Art of Everyday Photography: Move Toward Manual and Make Creative Photos" by Susan Tuttle with 144 pages and is listed for $26.99. This is recommended for those who are interested in everyday photography and want to enhance their creativity. [Item 100]

  2. "Photography" by Barbara London and John Upton, a comprehensive guide for aspiring photographers, has 408 pages, which is a little more than your specified range but it's a great book packed with lots of information. However, it is listed for $81.00, which is quite steep compared to the others. [Item 68]

In addition, the model presented two further suggestions that slightly exceed the specified page limit, offering them as worthwhile choices for readers open to more comprehensive collections, and it cited the source item for each recommendation.


In summary, the synergy of LangChain and GPT has ushered in a transformative era of contextual searching, simplifying the e-commerce experience. Through Retrieval-Augmented Generation (RAG), this streamlined Streamlit application combines real-world knowledge with natural language processing to provide users with precise and reliable product recommendations. It's a game-changer in the world of online shopping, promising efficiency and trust in every interaction.


