Multimodal RAG with Docling

Retrieval-Augmented Generation (RAG) is an architectural pattern that improves a language model's answers by retrieving relevant factual information from a knowledge base and adding that information to the model's query.
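The retrieve-then-augment pattern described above can be sketched in a few lines of Python. This is a toy illustration, not the lab's implementation: the knowledge base is three made-up passages, the function names (`retrieve`, `build_prompt`) are hypothetical, and simple word-overlap cosine similarity stands in for the embedding model and vector store that a real RAG system would typically use. No language model is called; the sketch stops at the augmented prompt.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine_similarity(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def retrieve(query, knowledge_base, top_k=1):
    """Return the top_k passages most similar to the query."""
    query_vec = Counter(tokenize(query))
    ranked = sorted(
        knowledge_base,
        key=lambda p: cosine_similarity(query_vec, Counter(tokenize(p))),
        reverse=True,
    )
    return ranked[:top_k]


def build_prompt(query, passages):
    """Augment the query with the retrieved passages as context."""
    context = "\n".join(f"- {p}" for p in passages)
    return (
        "Answer using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {query}"
    )


# Illustrative passages only; a real knowledge base would come from your documents.
KNOWLEDGE_BASE = [
    "Docling converts PDF documents into structured text.",
    "Jupyter notebooks mix code cells with prose.",
    "A virtual environment isolates Python packages.",
]

question = "What does Docling do with PDF documents?"
prompt = build_prompt(question, retrieve(question, KNOWLEDGE_BASE))
print(prompt)
```

The printed prompt is what would be sent to the language model in place of the bare question, so the model can ground its answer in the retrieved passage.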

In this lab we will combine the skills we learned in the two previous labs to build a Docling-enhanced RAG system.

Prerequisites

This lab is a Jupyter notebook. Please follow the instructions in the pre-work before running the lab.

You can run this lab in one of two ways: with Replicate, or with LM Studio, which runs entirely locally.

Lab

With Replicate

With LM Studio

With the virtual environment from the pre-work active, run the following from your command line to open the notebook in Jupyter:

jupyter notebook notebooks/RAG.ipynb

The notebook path above is relative to the docling-workshop folder created by the git clone in the pre-work.