
Announcing the Open Trusted Data Initiative (OTDI) draft v0.1 dataset specification
Open source and open science in AI is a practical, proven approach to enabling access, innovation, trust, and value creation now. Let’s focus on that as we better define it.
The AI Alliance has released a set of 14 principles covering six areas...
Write once, use anywhere—an open-source tool library for portable AI agents
Discover Docling, the powerful open-source AI document processing tool developed by IBM Research and supported by the AI Alliance, designed for fast, local, and privacy-first workflows. With no reliance on cloud APIs, Docling offers high-quality outputs and flexible licensing, making it ideal for enterprise and research use. Now enhanced by Hugging Face’s SmolVLM models, SmolDocling brings lightweight, multimodal AI to complex document layouts—handling code, charts, tables, and more with precision. Join the growing open-source community transforming document AI and contribute to the future of trusted, efficient, and collaborative AI innovation.
The blog post explores how Kubeflow Pipelines (KFP) automate Data Prep Kit (DPK) transforms on Kubernetes, simplifying execution, scaling, and scheduling. It details the required Kubernetes infrastructure, reusable KFP components, and a pipeline generator for automating workflows. By integrating KFP, DPK streamlines orchestrating and managing complex data transformations.
The Data Prep Kit (DPK) framework enables scalable data transformation using Python, Ray, and Spark, while supporting various data sources such as local disk, S3, and Hugging Face datasets. It defines abstract base classes for transformations, allowing developers to implement custom data and folder transforms that operate seamlessly across different runtimes. DPK also introduces a data abstraction layer to streamline data access and facilitate checkpointing. To support large-scale processing, it provides three runtimes: Python for small datasets, Ray for distributed execution across clusters, and Spark for highly scalable processing using Resilient Distributed Datasets (RDDs). Additionally, DPK integrates with Kubeflow Pipelines (KFP) for automating transformations within Kubernetes environments. The framework includes transform utilities, testing support, and simplified APIs for invoking transforms efficiently. By abstracting complexity, DPK simplifies development, deployment, and execution of data processing pipelines in both local and distributed environments.
Transitioning from a successful AI proof-of-concept to a scalable product brings significant challenges, including accuracy, bias, data security, and regulatory compliance. Risk Atlas Nexus from IBM Research is an open-source initiative designed to help organizations structure, assess, and mitigate AI risks through a shared ontology, AI-assisted governance tools, and knowledge graphs linking industry standards like NIST and OWASP. As part of the AI Alliance Trust and Safety Evaluation initiative, this project fosters a collaborative ecosystem to make AI governance more accessible and actionable. Join us in shaping the future of AI governance!
In this AI Alliance member spotlight, we meet Supratic Mukhopadhyay of LSU.
The AI Alliance is proud to announce the Trust and Safety Evaluations Initiative (TSEI) at the Artificial Intelligence Action Summit in Paris.
The AI Alliance is proud to announce the Open Trusted Data Initiative (OTDI) at the Artificial Intelligence Action Summit in Paris.
We conducted a survey with 100 AI Alliance members to learn about the state of open source AI trust and safety for 2024. This blog post highlights key findings on AI applications, model popularity, safety concerns, regulatory focus, and gaps in current safety practices, while also providing an overview of notable open-source projects, tools, and research in the field of AI trust and safety.