
From Layout to Logic: How Docling is Redefining Document AI  

Agata Ferretti
Michele Dolfi
Peter Staar

Original Interview from January 2025

When you need document processing that is seamless, efficient, and entirely controlled by you, free of restrictions or reliance on external APIs, Docling is the answer. Docling is an open-source Python package designed to prepare documents for GenAI models with precision and without compromising speed.
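To give a sense of what that looks like in practice, here is a minimal sketch of the typical conversion flow with Docling's Python API; the input file name is a placeholder, and local paths or URLs to common formats such as PDF and DOCX can be used instead.

```python
from docling.document_converter import DocumentConverter

# Everything runs locally; no cloud service or external API is involved.
converter = DocumentConverter()

# Placeholder input; a local path or a URL to a supported document works here.
result = converter.convert("report.pdf")

# Export the parsed document to Markdown for downstream GenAI pipelines.
print(result.document.export_to_markdown())
```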

As an affiliated project of the AI Alliance, Docling started as a project within IBM Research and has gained substantial traction: since its release, it has earned more than 25,000 stars on GitHub.

To get an idea of what Docling is and where to start, watch our video.


So far, Docling offers: 
Local Processing – No cloud dependencies, ensuring data privacy 
High-Quality Output – Small yet powerful models for fast and accurate results 
Flexible Licensing – Designed to empower enterprise and community adoption 

In the spirit of collaboration and innovation, we’re excited to see AI Alliance member Hugging Face partnering with IBM Research to enhance Docling’s capabilities. Hugging Face recently introduced two lightweight Vision Language Models, SmolVLM-256M and SmolVLM-500M. These highly efficient, multimodal models offer powerful vision and text understanding in a compact format, making them ideal for document processing, image captioning, and visual reasoning—key areas where Docling excels. 

SmolDocling, built on Hugging Face’s SmolVLM, is a groundbreaking open-source document processing model that redefines document conversion. Packing 256M parameters into an end-to-end solution, it introduces DocTags, a universal markup format that not only captures content but also preserves structural information and spatial positioning.  

With its remarkable efficiency, SmolDocling accurately interprets and reproduces complex document features in a single step. These features include code listings, tables, equations, charts, lists, and layout segmentation across diverse document types, including business reports, academic papers, patents, and structured forms.

Notably, it competes with Vision Language Models up to 27 times larger while significantly reducing computational demands. A demo of SmolDocling's capabilities is also available. In addition, the SmolDocling team is contributing new, publicly sourced datasets for charts, tables, equations, and code recognition to further advance open AI research. The model is already available, and the datasets will be released soon.
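For readers who want to try SmolDocling directly, the sketch below shows one way to generate DocTags for a rendered page image with the Hugging Face transformers library. The checkpoint name, prompt wording, and input file are assumptions based on typical usage of the published model, not details taken from this article.

```python
import torch
from PIL import Image
from transformers import AutoModelForVision2Seq, AutoProcessor

MODEL_ID = "ds4sd/SmolDocling-256M-preview"  # assumed checkpoint name

processor = AutoProcessor.from_pretrained(MODEL_ID)
model = AutoModelForVision2Seq.from_pretrained(MODEL_ID, torch_dtype=torch.bfloat16)

# A rendered page image (placeholder path).
page = Image.open("page.png")

# Ask the model to convert the page; the exact prompt text is an assumption.
messages = [{"role": "user",
             "content": [{"type": "image"},
                         {"type": "text", "text": "Convert this page to docling."}]}]
prompt = processor.apply_chat_template(messages, add_generation_prompt=True)
inputs = processor(text=prompt, images=[page], return_tensors="pt")

generated = model.generate(**inputs, max_new_tokens=1024)
doctags = processor.batch_decode(generated, skip_special_tokens=False)[0]
print(doctags)  # DocTags markup: content plus structure and location tokens
```

The resulting DocTags output can then be loaded back into Docling's document model for export to formats such as Markdown or JSON.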

By March 27, 2025, SmolDocling became the #1 trending model on Hugging Face—one of the few small Vision Language Models (VLMs) to achieve this milestone! This success, driven by the AI Alliance mission, underscores the power of multi-stakeholder collaboration in developing and deploying open foundation models. 

To continue this momentum, we invite the broader community of users and developers to contribute to Docling and SmolDocling by building new features, creating plug-in extensions, and enabling seamless integrations. Together, we can push the boundaries of open AI innovation! 

There are three key ways to get involved:

  1. Create New Datasets – Contribute datasets for training or evaluation. We encourage you to use Docling to curate and share open datasets that can enrich the AI Alliance Open Trusted Data catalog and benefit the open-source community (a starter sketch follows this list). We are particularly interested in high-quality, domain-specific, and reasoning datasets that support model tuning, application grounding, and enrichment. 
  2. Develop Advanced Applications – Use Docling in increasingly sophisticated application scenarios across domains such as legal, materials science, semiconductors, and finance. Help expand its functionality by building new features and integration patterns. Join our Application and Tools Working Group in Focus Area 3. 
  3. Enhance SmolDocling Models – Contribute by developing new SmolDocling models tailored to your needs. Stay tuned for updates on open-source fine-tuning efforts! 
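As a concrete starting point for the dataset contribution path above, here is a minimal sketch that batch-converts a folder of PDFs with Docling and writes one JSON line per document. The folder name, output file, and record fields are illustrative assumptions, not a prescribed format.

```python
import json
from pathlib import Path

from docling.document_converter import DocumentConverter

SOURCE_DIR = Path("raw_pdfs")        # assumed input folder of source documents
OUTPUT_PATH = Path("dataset.jsonl")  # one JSON record per converted document

converter = DocumentConverter()

with OUTPUT_PATH.open("w", encoding="utf-8") as out:
    for pdf_path in sorted(SOURCE_DIR.glob("*.pdf")):
        result = converter.convert(pdf_path)
        record = {
            "source": pdf_path.name,  # keep provenance alongside the text
            "markdown": result.document.export_to_markdown(),
        }
        out.write(json.dumps(record, ensure_ascii=False) + "\n")
```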


Are you working on an open AI-driven project that could use a bigger stage? The AI Alliance is committed to helping you scale through visibility, collaboration, and support. 

Join the movement. Build. Contribute. Collaborate. Innovate.  

