Creating Your Own PDF-Bot: A Guide to Building a Local RAG for PDFs with Langchain, Chroma, and Ollama
In this post, we’ll dive into the process of building a powerful PDF-Bot—a lightweight, local retrieval-augmented generation (RAG) system that lets you index and query PDF documents. By combining Langchain, Chroma, and Ollama, you can develop a tool that processes PDFs, extracts useful text chunks, stores them in a vector database, and generates detailed answers using an LLM. Whether you’re a developer eager to expand your project toolkit or simply curious about modern document processing, this guide is for you.

Overview of PDF-Bot
PDF-Bot is designed to work entirely on your local machine. It automatically ingests PDF files from a designated folder, splits the content into manageable pieces, indexes these chunks in a Chroma vector database, and finally uses an Ollama-powered language model to generate context-based answers. This system is ideal for those who want to explore retrieval-augmented generation without relying on external APIs.
Key features include:
PDF Ingestion: Automatically load and process PDF files from a designated folder.
Text Splitting: Utilize Langchain's RecursiveCharacterTextSplitter to break documents into chunks.
Vector Database: Persistently store document chunks using Chroma.
Retrieval-Augmented Generation: Perform similarity searches to retrieve relevant context and generate thorough answers.
Modular Architecture: Benefit from a clear separation of concerns across various modules.
Prerequisites
Before we start, ensure you have:
Python 3.8+ installed
Basic terminal/Python knowledge
Ollama installed and running locally
4-8GB RAM free (for LLM operations)
Structure
The project follows a modular design, which makes it easy to maintain and extend. Here’s a quick look at the file organization:
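Based on the modules walked through below, the layout looks roughly like this (the folder name and exact file names are illustrative; check the repo for the authoritative version):

```
pdf-bot/
├── config.py               # settings: Ollama URL, paths, model names
├── document_loader.py      # load PDFs from the data folder
├── text_splitter.py        # split documents into chunks
├── ollama_embeddings.py    # embedding function via Ollama
├── database.py             # Chroma vector store
├── process_documents.py    # ingestion pipeline
├── llm_handler.py          # retrieval + answer generation
├── main.py                 # entry point
├── requirements.txt
├── data/                   # your PDFs go here
└── chroma_db/              # persisted vector database (created on first run)
```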
Step 1: Project Setup
1.1 Create Project Structure
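If you prefer to scaffold this from the terminal, something like the following works (the pdf-bot folder name is just a placeholder):

```bash
mkdir -p pdf-bot/data && cd pdf-bot
touch config.py document_loader.py text_splitter.py ollama_embeddings.py \
      database.py process_documents.py llm_handler.py main.py requirements.txt
```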
1.2 Install Dependencies
Create requirements.txt:
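The exact versions are up to you; a plausible set covering Langchain, Chroma, and PDF parsing looks like this:

```text
langchain
langchain-community
langchain-text-splitters
chromadb
pypdf
```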
Next, install the dependencies using:
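```bash
pip install -r requirements.txt
```

Doing this inside a fresh virtual environment keeps the Langchain and Chroma versions isolated from your other projects.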
1.3 Downloading and Serving Ollama
Before you run PDF-Bot, you must download and serve Ollama, the language model backend powering our answer generation. Follow these steps to get it up and running:
Download Ollama: Visit the Ollama official website to download the latest version for your operating system.
Install Ollama: Follow the installation instructions provided on the website to set up Ollama on your machine.
Run the Ollama Server: Once installed, start the Ollama server. By default, the server listens on http://localhost:11434. If you need a different URL or port, update the BASE_URL in config.py.
Verify the Server: Open a browser or use a command-line tool (like curl) to ensure that Ollama is running at the specified URL.
For example, run:
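```bash
curl http://localhost:11434
```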
You should receive a response indicating that the server is active.
Step 2: Core Components Explained
2.1 Configuration (config.py)
Key Settings:
BASE_URL: Your Ollama server URL
CHROMA_PATH: Directory where the Chroma vector database is persisted
DATA_FOLDER: Where your PDFs live
CHAT_MODEL: Local LLM via Ollama
EMBEDDING_MODEL: Text embedding model
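Putting those settings together, a minimal config.py could look like the sketch below. The paths and the nomic-embed-text embedding model are assumptions; use whichever models you have pulled locally.

```python
# config.py -- central configuration for PDF-Bot
BASE_URL = "http://localhost:11434"   # Ollama server URL
CHROMA_PATH = "./chroma_db"           # directory where the Chroma DB is persisted
DATA_FOLDER = "./data"                # folder containing your PDFs
CHAT_MODEL = "llama3"                 # local chat LLM served by Ollama
EMBEDDING_MODEL = "nomic-embed-text"  # text embedding model
```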
2.2 Document Processing Pipeline
PDF Loading (document_loader.py):
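The loader's only job is to read every PDF in DATA_FOLDER and return Langchain Document objects (one per page). A minimal sketch using PyPDFDirectoryLoader, which is one reasonable loader choice:

```python
# document_loader.py -- load every PDF in the data folder as Langchain Documents
from langchain_community.document_loaders import PyPDFDirectoryLoader

from config import DATA_FOLDER


def load_documents():
    """Read all PDFs in DATA_FOLDER and return one Document per page."""
    loader = PyPDFDirectoryLoader(DATA_FOLDER)
    return loader.load()
```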
Text Splitting (text_splitter.py):
Chunk Size: This parameter (set to 800) determines the maximum number of characters in each text chunk. It ensures that large documents are broken down into manageable pieces for processing.
Chunk Overlap: This parameter (set to 80) specifies the number of characters that will be shared between consecutive chunks. The overlap helps maintain context at the boundaries, ensuring that important information isn’t lost if it spans across two chunks.
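With those two parameters, the splitter module is a thin wrapper around RecursiveCharacterTextSplitter. A sketch under those assumptions:

```python
# text_splitter.py -- split Documents into overlapping chunks
from langchain_text_splitters import RecursiveCharacterTextSplitter


def split_documents(documents):
    """Break Documents into ~800-character chunks with 80 characters of overlap."""
    splitter = RecursiveCharacterTextSplitter(
        chunk_size=800,
        chunk_overlap=80,
        length_function=len,
    )
    return splitter.split_documents(documents)
```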
Ollama Embedding (ollama_embeddings.py):
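This module only has to hand Langchain an embedding function backed by the local Ollama server. A sketch using the OllamaEmbeddings wrapper from langchain_community, with the model name and URL taken from config.py:

```python
# ollama_embeddings.py -- embedding function backed by the local Ollama server
from langchain_community.embeddings import OllamaEmbeddings

from config import BASE_URL, EMBEDDING_MODEL


def get_embedding_function():
    """Return an embedding function that calls Ollama for every chunk and query."""
    return OllamaEmbeddings(model=EMBEDDING_MODEL, base_url=BASE_URL)
```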
Vector Database (database.py):
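The database module wraps Chroma with the persistence directory from config.py. The helper names below are illustrative, not necessarily the ones used in the repo:

```python
# database.py -- persistent Chroma vector store for the document chunks
from langchain_community.vectorstores import Chroma

from config import CHROMA_PATH
from ollama_embeddings import get_embedding_function


def get_vector_store():
    """Open (or create) the Chroma collection persisted at CHROMA_PATH."""
    return Chroma(
        persist_directory=CHROMA_PATH,
        embedding_function=get_embedding_function(),
    )


def add_chunks(chunks):
    """Embed and store a list of Document chunks."""
    db = get_vector_store()
    db.add_documents(chunks)
    return db
```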
Finally, processing the documents (process_documents.py):
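This step simply chains the pieces above: load, split, index. A sketch:

```python
# process_documents.py -- load PDFs, split them, and index the chunks in Chroma
from document_loader import load_documents
from text_splitter import split_documents
from database import add_chunks


def process_documents():
    """Run the full ingestion pipeline: PDFs -> chunks -> vector database."""
    documents = load_documents()
    chunks = split_documents(documents)
    add_chunks(chunks)
    print(f"Indexed {len(chunks)} chunks from {len(documents)} pages.")


if __name__ == "__main__":
    process_documents()
```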
2.3 Setting Up the Main App
Answer Generation (llm_handler.py):
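This is where retrieval-augmented generation happens: run a similarity search against Chroma, stuff the top chunks into a prompt, and ask the local LLM. The prompt wording and k=5 below are assumptions:

```python
# llm_handler.py -- retrieve relevant chunks and generate an answer with Ollama
from langchain_community.llms import Ollama

from config import BASE_URL, CHAT_MODEL
from database import get_vector_store

PROMPT_TEMPLATE = """Answer the question using only the context below.

Context:
{context}

Question: {question}
"""


def answer_question(question: str, k: int = 5) -> str:
    """Similarity-search the vector store and answer from the retrieved context."""
    db = get_vector_store()
    results = db.similarity_search(question, k=k)
    context = "\n\n---\n\n".join(doc.page_content for doc in results)

    llm = Ollama(model=CHAT_MODEL, base_url=BASE_URL)
    prompt = PROMPT_TEMPLATE.format(context=context, question=question)
    return llm.invoke(prompt)
```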
And, at last, the entry point of the program (main.py):
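A minimal entry point ties ingestion and querying together behind a simple command-line loop:

```python
# main.py -- entry point: index the PDFs, then answer questions interactively
from process_documents import process_documents
from llm_handler import answer_question


def main():
    process_documents()  # build or update the vector database
    while True:
        question = input("\nAsk a question (or 'quit' to exit): ").strip()
        if question.lower() in {"quit", "exit"}:
            break
        print(answer_question(question))


if __name__ == "__main__":
    main()
```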
Step 3: Initialize Ollama Models
Run these commands in your terminal:
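These pull the chat model and an embedding model. llama3 matches the chat model used in this guide; nomic-embed-text is the embedding model assumed in the config sketch above, so substitute your own if you chose differently:

```bash
ollama pull llama3
ollama pull nomic-embed-text
```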
Keep Ollama running in the background!
Step 4: Add Your First PDF
Place your PDFs in the ./data folder. Any short PDF (a paper, a manual, or a report) works as a first test document.
Step 5: Run the Bot!
5.1 First-Time Setup
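From the project root, start the bot via the entry point from Step 2 (assuming main.py as sketched above):

```bash
python main.py
```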
The system will:
Process PDFs → Split text → Create vector database
Perform similarity search
Generate answer using Llama3
5.2 Sample Output
Troubleshooting Tips
Ollama Not Responding? Check that the server is still running (start it with ollama serve if needed) and that BASE_URL in config.py matches the address it listens on.
Missing Dependencies? Reinstall with pip install -r requirements.txt, ideally inside your project's virtual environment.
Empty Chroma DB? Delete the ./chroma_db folder and rerun the processing step.
Further Improvement
This is a fairly simple project, but there are several ways to extend it:
Add a web interface with Gradio
Support other document types (DOCX, TXT)
Implement batch processing for large PDF collections
Conclusion
You've just built a fully local RAG system for PDF analysis! This tutorial covered:
PDF ingestion and text processing
Vector database setup with Chroma
Local LLM integration via Ollama
If you get stuck anywhere, feel free to check my GitHub repo for the source code.