Document Conversion with Docling
The primary purpose of Docling is document conversion. Docling converts documents in a variety of formats into formats that are more useful in AI applications, while preserving document structure.
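As a rough illustration of that conversion step, the sketch below wraps Docling's `DocumentConverter` in a small helper that returns Markdown. The helper name `convert_to_markdown` is hypothetical and not part of the lab notebook. The import sits inside the function so the snippet can be loaded even where Docling is not yet installed.

```python
def convert_to_markdown(source: str) -> str:
    """Convert a document (local path or URL) to Markdown text.

    Hypothetical helper for illustration; assumes the `docling`
    package from the pre-work environment is installed.
    """
    # Imported lazily so defining the helper does not require Docling.
    from docling.document_converter import DocumentConverter

    converter = DocumentConverter()
    result = converter.convert(source)
    # The converted document keeps its structure; Markdown is one export format.
    return result.document.export_to_markdown()
```

Calling `convert_to_markdown("path/to/file.pdf")` with a path of your own should return the document as a Markdown string. The first call may take a while because Docling typically needs to fetch its models.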
This lab walks through the different document conversion options Docling offers, as well as some enrichment features. We will also explore the converted documents to examine how Docling stores metadata to preserve document structure.
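One way to get a feel for the stored structure is to walk the converted document and tally its items by label. The sketch below assumes the document object exposes `iterate_items()`, which yields `(item, level)` pairs, as a Docling document does. The function names `summarize_structure` and `convert_and_summarize` are hypothetical, not taken from the notebook.

```python
from collections import Counter


def summarize_structure(doc):
    """Return (label counts, deepest nesting level) for a converted document.

    Assumes `doc.iterate_items()` yields (item, level) pairs and that each
    item has a `label` attribute (an enum member or a plain string).
    """
    counts = Counter()
    max_level = 0
    for item, level in doc.iterate_items():
        # Enum labels carry their text in `.value`; plain strings pass through.
        label = getattr(item.label, "value", item.label)
        counts[str(label)] += 1
        max_level = max(max_level, level)
    return dict(counts), max_level


def convert_and_summarize(source):
    """Convert `source` with Docling, then summarize its structure."""
    # Imported lazily so the summary helper works without Docling installed.
    from docling.document_converter import DocumentConverter

    result = DocumentConverter().convert(source)
    return summarize_structure(result.document)
```

The label counts give a quick overview of what was detected, such as headings, paragraphs, and tables. The nesting level reflects how deeply items sit in the document tree.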
Prerequisites
This lab is a Jupyter notebook. Please follow the instructions in pre-work to run the lab.
Lab
Google Colab (Replicate Edition)
If you are running in Google Colab and want a streamlined experience using Replicate for hosted model inference (no local GPU or Ollama required), use this alternate notebook:
To run the notebook locally in Jupyter, run the following from your command line with the pre-work virtual environment active:
jupyter notebook notebooks/Conversion.ipynb
The notebook path above is relative to the docling-workshop folder created by the git clone in the pre-work.