Creating Your Own PDF-Bot: A Guide to Building a Local RAG for PDFs with Langchain, Chroma, and Ollama

In this post, we'll walk through building PDF-Bot: a lightweight, local retrieval-augmented generation (RAG) system that lets you index and query PDF documents. By combining Langchain, Chroma, and Ollama, you can develop a tool that processes PDFs, extracts useful text chunks, stores them in a vector database, and generates detailed answers using an LLM. Whether you're a developer eager to expand your project toolkit or simply curious about modern document processing, this guide is for you.

Ryo Styles, Author
March 16, 2025
RAG Ollama Langchain

Overview of PDF-Bot

PDF-Bot is designed to work entirely on your local machine. It automatically ingests PDF files from a designated folder, splits the content into manageable pieces, indexes these chunks in a Chroma vector database, and finally uses an Ollama-powered language model to generate context-based answers. This system is ideal for those who want to explore retrieval-augmented generation without relying on external APIs.

Key features include:

  • PDF Ingestion: Automatically load and process PDF files from a designated folder.

  • Text Splitting: Utilize Langchain's RecursiveCharacterTextSplitter to break documents into chunks.

  • Vector Database: Persistently store document chunks using Chroma.

  • Retrieval-Augmented Generation: Perform similarity searches to retrieve relevant context and generate thorough answers.

  • Modular Architecture: Benefit from a clear separation of concerns across various modules.

Prerequisites

Before we start, ensure you have:

  • Python 3.8+ installed

  • Basic terminal/Python knowledge

  • Ollama installed and running locally

  • 4-8GB RAM free (for LLM operations)

Structure

The project follows a modular design, which makes it easy to maintain and extend. Here’s a quick look at the file organization:
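Reconstructed from the modules referenced throughout this guide, the layout looks roughly like this:

```text
pdf-bot/
├── config.py              # central settings
├── document_loader.py     # PDF ingestion
├── text_splitter.py       # chunking
├── ollama_embeddings.py   # embedding wrapper
├── database.py            # Chroma vector store
├── process_documents.py   # ingestion pipeline
├── llm_handler.py         # retrieval + answer generation
├── main.py                # entry point
├── requirements.txt
├── data/                  # drop your PDFs here
└── chroma_db/             # persisted vector DB (created on first run)
```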


Step 1: Project Setup

1.1 Create Project Structure
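One way to scaffold this from the terminal (the folder name is an assumption; use whatever you like):

```bash
mkdir -p pdf-bot/data && cd pdf-bot
touch config.py document_loader.py text_splitter.py ollama_embeddings.py \
      database.py process_documents.py llm_handler.py main.py requirements.txt
```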

1.2 Install Dependencies

Create requirements.txt:
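Exact pins will vary; a minimal set that covers everything imported in this guide is:

```text
langchain
langchain-community
langchain-text-splitters
chromadb
pypdf
```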

Next, install the dependencies using:
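```bash
pip install -r requirements.txt
```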

1.3 Downloading and Serving Ollama

Before you run PDF-Bot, you must download and serve Ollama, the language model backend powering our answer generation. Follow these steps to get it up and running:

  1. Download Ollama: Visit the Ollama official website to download the latest version for your operating system.

  2. Install Ollama: Follow the installation instructions provided on the website to set up Ollama on your machine.

  3. Run the Ollama Server: Once installed, start the Ollama server. By default, the server listens on http://localhost:11434. If you need a different URL or port, update the BASE_URL in config.py.

  4. Verify the Server: Open a browser or use a command-line tool (like curl) to ensure that Ollama is running at the specified URL.

For example, run:
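```bash
curl http://localhost:11434
```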

You should receive a short plain-text response (typically "Ollama is running") confirming the server is active.


Step 2: Core Components Explained

2.1 Configuration (config.py)
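Here is a minimal sketch of config.py. The embedding model name is an assumption (nomic-embed-text is a common choice for Ollama); swap in whatever models you pull in Step 3:

```python
# config.py - central settings for PDF-Bot (values are illustrative defaults)

BASE_URL = "http://localhost:11434"   # where the Ollama server listens
CHROMA_PATH = "./chroma_db"           # directory Chroma persists the index to
DATA_FOLDER = "./data"                # folder scanned for PDF files
CHAT_MODEL = "llama3"                 # local LLM used for answer generation
EMBEDDING_MODEL = "nomic-embed-text"  # model used to embed text chunks
```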

Key Settings:

  • BASE_URL: Your Ollama server URL

  • CHROMA_PATH: Directory where the Chroma database is persisted

  • DATA_FOLDER: Where your PDFs live

  • CHAT_MODEL: Local LLM via Ollama

  • EMBEDDING_MODEL: Text embedding model

2.2 Document Processing Pipeline

PDF Loading (document_loader.py):
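A sketch of the loader, assuming Langchain's PyPDFDirectoryLoader (which matches the "ingest every PDF in a folder" behavior described above):

```python
# document_loader.py - load all PDFs from the data folder
from langchain_community.document_loaders import PyPDFDirectoryLoader

from config import DATA_FOLDER


def load_documents():
    """Return one Langchain Document per page of every PDF in DATA_FOLDER."""
    loader = PyPDFDirectoryLoader(DATA_FOLDER)
    return loader.load()
```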

Text Splitting (text_splitter.py):
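A sketch using the splitter with the parameter values explained below:

```python
# text_splitter.py - break documents into overlapping chunks
from langchain_text_splitters import RecursiveCharacterTextSplitter


def split_documents(documents):
    """Split Documents into ~800-character chunks with 80 characters of overlap."""
    splitter = RecursiveCharacterTextSplitter(
        chunk_size=800,
        chunk_overlap=80,
    )
    return splitter.split_documents(documents)
```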

Chunk Size: This parameter (set to 800) determines the maximum number of characters in each text chunk. It ensures that large documents are broken down into manageable pieces for processing.

Chunk Overlap: This parameter (set to 80) specifies the number of characters that will be shared between consecutive chunks. The overlap helps maintain context at the boundaries, ensuring that important information isn’t lost if it spans across two chunks.

Ollama Embedding (ollama_embeddings.py):
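Presumably a thin wrapper around Langchain's OllamaEmbeddings; a sketch:

```python
# ollama_embeddings.py - embeddings served by the local Ollama instance
from langchain_community.embeddings import OllamaEmbeddings

from config import BASE_URL, EMBEDDING_MODEL


def get_embedding_function():
    """Return an embedding function that calls Ollama's embedding endpoint."""
    return OllamaEmbeddings(model=EMBEDDING_MODEL, base_url=BASE_URL)
```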

Vector Database (database.py):
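A sketch of the Chroma setup; the helper names (get_database, add_chunks) are assumptions used consistently across these sketches:

```python
# database.py - persistent Chroma vector store
from langchain_community.vectorstores import Chroma

from config import CHROMA_PATH
from ollama_embeddings import get_embedding_function


def get_database():
    """Open (or create) the Chroma collection persisted at CHROMA_PATH."""
    return Chroma(
        persist_directory=CHROMA_PATH,
        embedding_function=get_embedding_function(),
    )


def add_chunks(chunks):
    """Embed a list of document chunks and store them in the collection."""
    db = get_database()
    db.add_documents(chunks)
```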

Finally, processing the documents (process_documents.py):
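A sketch tying the pipeline together under the same assumed names:

```python
# process_documents.py - ingestion pipeline: load -> split -> index
from database import add_chunks
from document_loader import load_documents
from text_splitter import split_documents


def process_documents():
    """Load every PDF, split it into chunks, and index them in Chroma."""
    documents = load_documents()
    chunks = split_documents(documents)
    add_chunks(chunks)
    print(f"Indexed {len(chunks)} chunks from {len(documents)} pages.")
```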

2.3 Setting up the Main App

LLM Handler (llm_handler.py):
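This module runs the similarity search and prompts the model with the retrieved context. A sketch using Langchain's Ollama wrapper (the prompt wording and k=5 are assumptions):

```python
# llm_handler.py - retrieve relevant chunks and generate an answer
from langchain_community.llms import Ollama

from config import BASE_URL, CHAT_MODEL
from database import get_database

PROMPT_TEMPLATE = """Answer the question using only the context below.

Context:
{context}

Question: {question}
"""


def answer_question(question, k=5):
    """Fetch the k most similar chunks and answer from that context."""
    db = get_database()
    results = db.similarity_search(question, k=k)
    context = "\n\n---\n\n".join(doc.page_content for doc in results)
    llm = Ollama(model=CHAT_MODEL, base_url=BASE_URL)
    return llm.invoke(PROMPT_TEMPLATE.format(context=context, question=question))
```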

And, at last, the entry point of the program (main.py):
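And a sketch of main.py: index everything once, then answer questions in a loop:

```python
# main.py - entry point: index PDFs, then answer questions interactively
from llm_handler import answer_question
from process_documents import process_documents


def main():
    process_documents()
    while True:
        question = input("\nAsk a question (or 'quit' to exit): ").strip()
        if question.lower() == "quit":
            break
        print(answer_question(question))


if __name__ == "__main__":
    main()
```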


Step 3: Initialize Ollama Models

Run these commands in your terminal:
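Assuming the model names from the config.py sketch; substitute your own if you picked different models:

```bash
ollama pull llama3            # chat model
ollama pull nomic-embed-text  # embedding model
ollama serve                  # only needed if the server isn't already running
```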

Keep Ollama running in the background!


Step 4: Add Your First PDF

Place your PDFs in the ./data folder.

Example test document:
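Any short PDF whose contents you know works well; for example, a hypothetical one-page note like this, which you can later query with "What is the capital of Australia?":

```text
PDF-Bot test document.
The capital of Australia is Canberra.
```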


Step 5: Run the Bot!

5.1 First-Time Setup
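From the project root, with Ollama running, start the bot:

```bash
python main.py
```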

The system will:

  1. Process PDFs → Split text → Create vector database

  2. Perform similarity search

  3. Generate answer using Llama3

5.2 Sample Output
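With the sketches above and the test document from Step 4, a session looks roughly like this (illustrative, not captured output):

```text
Indexed 1 chunks from 1 pages.

Ask a question (or 'quit' to exit): What is the capital of Australia?
According to the provided context, the capital of Australia is Canberra.
```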


Troubleshooting Tips

Ollama Not Responding?
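Confirm the server is up and that it matches BASE_URL in config.py:

```bash
ollama serve                  # start it if nothing is listening
curl http://localhost:11434   # should reply that Ollama is running
```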

Missing Dependencies?
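Re-run the install from Step 1.2:

```bash
pip install -r requirements.txt
```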

Empty Chroma DB?

Delete the ./chroma_db folder and rerun the app.


Further Improvement

This is a deliberately simple project, but you can extend it further by:

  1. Adding a web interface with Gradio

  2. Supporting other document types (DOCX, TXT)

  3. Implementing batch processing for large PDF collections

Conclusion

You've just built a fully local RAG system for PDF analysis! This tutorial covered:

  • PDF ingestion and text processing

  • Vector database setup with Chroma

  • Local LLM integration via Ollama

If you are stuck anywhere, feel free to check my GitHub repo for the source code.