Docling

Project

Highlights

  • ⚡ Converts PDF documents to JSON or Markdown, quickly and reliably
  • 📑 Understands detailed page layout and reading order, and recovers table structures
  • 🔍 Includes OCR support for scanned PDFs
  • 🤖 Integrates easily with LLM app / RAG frameworks like 🦙 LlamaIndex and 🦜🔗 LangChain
  • 💻 Provides a simple and convenient CLI
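As a rough illustration of the conversion highlighted above, the following minimal sketch turns one PDF into Markdown text. It assumes a recent Docling release that exposes `DocumentConverter` in `docling.document_converter`; older releases may use different method names. The helper `markdown_path_for` and the file name `report.pdf` are hypothetical and exist only for this example.

```python
from pathlib import Path


def markdown_path_for(source: str) -> Path:
    """Hypothetical helper: pick the output path by swapping the suffix for .md."""
    return Path(source).with_suffix(".md")


def convert_pdf_to_markdown(source: str) -> str:
    """Convert a single PDF to Markdown text with Docling.

    Assumes a recent Docling release; the import is done lazily so this
    module can still be loaded when Docling is not installed.
    """
    from docling.document_converter import DocumentConverter

    converter = DocumentConverter()
    result = converter.convert(source)  # parse layout, reading order, tables
    return result.document.export_to_markdown()


# Example usage (requires Docling and a local file named report.pdf):
#   text = convert_pdf_to_markdown("report.pdf")
#   markdown_path_for("report.pdf").write_text(text, encoding="utf-8")
```

The same result object can typically be exported to a JSON-compatible dictionary instead of Markdown; check the documentation of the installed version for the exact export methods.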


Project Goals

  • Advance document AI by enabling advanced workflows that unlock knowledge extraction and exploration from documents.
  • Drive open-source innovation by fostering a collaborative ecosystem around document AI and document understanding.
  • Standardize data formats by aligning document-based datasets to a uniform format for common downstream consumption.