Blog & Articles

Perspectives, news, and technical reports from our community.

V0.1 of the OTDI dataset specification

Announcing the Open Trusted Data Initiative (OTDI) draft v0.1 dataset specification

Defining Open Source AI: The Road Ahead

News

Open source and open science in AI is a practical, proven approach to enabling access, innovation, trust, and value creation now. Let’s focus on that as we better define it.

Introducing the AI Alliance Open Innovation Principles

News

The AI Alliance has released a set of 14 principles covering six areas...

Gofannon AI Alliance Project

GoFannon: Stop Rewriting AI Tools for Every Framework

Write once, use anywhere—an open-source tool library for portable AI agents

From Layout to Logic: How Docling is Redefining Document AI  

Discover Docling, the powerful open-source AI document processing tool developed by IBM Research and supported by the AI Alliance, designed for fast, local, and privacy-first workflows. With no reliance on cloud APIs, Docling offers high-quality outputs and flexible licensing, making it ideal for enterprise and research use. Now enhanced by Hugging Face’s SmolVLM models, SmolDocling brings lightweight, multimodal AI to complex document layouts—handling code, charts, tables, and more with precision. Join the growing open-source community transforming document AI and contribute to the future of trusted, efficient, and collaborative AI innovation.

Transform Pipelines in Data Prep Kit 

Technical Report

This blog post explores how Kubeflow Pipelines (KFP) automates Data Prep Kit (DPK) transforms on Kubernetes, simplifying execution, scaling, and scheduling. It details the required Kubernetes infrastructure, reusable KFP components, and a pipeline generator for automating workflows. With KFP integrated, DPK streamlines the orchestration and management of complex data transformations.

Architecture of Data Prep Kit Framework 

Technical Report

The Data Prep Kit (DPK) framework enables scalable data transformation using Python, Ray, and Spark, while supporting various data sources such as local disk, S3, and Hugging Face datasets. It defines abstract base classes for transformations, allowing developers to implement custom data and folder transforms that operate seamlessly across different runtimes. DPK also introduces a data abstraction layer to streamline data access and facilitate checkpointing. To support large-scale processing, it provides three runtimes: Python for small datasets, Ray for distributed execution across clusters, and Spark for highly scalable processing using Resilient Distributed Datasets (RDDs). Additionally, DPK integrates with Kubeflow Pipelines (KFP) for automating transformations within Kubernetes environments. The framework includes transform utilities, testing support, and simplified APIs for invoking transforms efficiently. By abstracting complexity, DPK simplifies development, deployment, and execution of data processing pipelines in both local and distributed environments.
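As a rough illustration of the idea summarized above (an abstract base class that custom transforms implement, and a runtime that invokes them), here is a minimal sketch in plain Python. It is not DPK's actual API: the names `AbstractTransform`, `MinLengthFilter`, and `run_local` are hypothetical, and rows are represented as simple dictionaries purely for illustration.

```python
from abc import ABC, abstractmethod
from typing import Any

# Assumption: a "row" is a plain dict; this is an illustrative stand-in only.
Row = dict[str, Any]


class AbstractTransform(ABC):
    """Hypothetical base class: one batch of rows in, one batch of rows out."""

    def __init__(self, config: dict[str, Any]):
        self.config = config

    @abstractmethod
    def transform(self, rows: list[Row]) -> tuple[list[Row], dict[str, int]]:
        """Return the transformed rows plus a dict of counters (metadata)."""


class MinLengthFilter(AbstractTransform):
    """Example custom transform: drop rows whose text is shorter than min_chars."""

    def transform(self, rows: list[Row]) -> tuple[list[Row], dict[str, int]]:
        min_chars = self.config.get("min_chars", 1)
        kept = [r for r in rows if len(r.get("text", "")) >= min_chars]
        stats = {"rows_in": len(rows), "rows_dropped": len(rows) - len(kept)}
        return kept, stats


def run_local(
    transform: AbstractTransform, batches: list[list[Row]]
) -> tuple[list[Row], dict[str, int]]:
    """Toy single-process runtime: apply the transform batch by batch, summing counters."""
    output: list[Row] = []
    totals: dict[str, int] = {}
    for batch in batches:
        rows, stats = transform.transform(batch)
        output.extend(rows)
        for key, value in stats.items():
            totals[key] = totals.get(key, 0) + value
    return output, totals
```

Because the transform only depends on the base-class contract, the same `MinLengthFilter` could in principle be handed to a different runner (for example, one that distributes batches across workers) without changing the transform itself, which is the separation of concerns the report describes.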

Navigating The AI Risk Labyrinth

Transitioning from a successful AI proof-of-concept to a scalable product brings significant challenges, including accuracy, bias, data security, and regulatory compliance. Risk Atlas Nexus from IBM Research is an open-source initiative designed to help organizations structure, assess, and mitigate AI risks through a shared ontology, AI-assisted governance tools, and knowledge graphs linking industry standards like NIST and OWASP. As part of the AI Alliance Trust and Safety Evaluation initiative, this project fosters a collaborative ecosystem to make AI governance more accessible and actionable. Join us in shaping the future of AI governance!

Spotlight on Supratik Mukhopadhyay of LSU

Member spotlight

In this AI Alliance member spotlight, we meet Supratik Mukhopadhyay of LSU.

Trust and Safety Evaluations Initiative

Announcing the Trust and Safety Evaluations Initiative (TSEI)

News

The AI Alliance is proud to announce the Trust and Safety Evaluations Initiative (TSEI) at the Artificial Intelligence Action Summit in Paris.

Open Trusted Data Initiative OTDI

Open Trusted Data Initiative Launched at the AI Action Summit, Paris

News

The AI Alliance is proud to announce the Open Trusted Data Initiative (OTDI) at the Artificial Intelligence Action Summit in Paris.

The State of Open Source AI Trust and Safety - End of 2024 Edition

News

We conducted a survey with 100 AI Alliance members to learn about the state of open source AI trust and safety for 2024. This blog post highlights key findings on AI applications, model popularity, safety concerns, regulatory focus, and gaps in current safety practices, while also providing an overview of notable open-source projects, tools, and research in the field of AI trust and safety.