
Getting started with AI trust and safety

Technical Report
Screen shot of the User Guide website.

Introducing The AI Alliance Trust and Safety User Guide, now available here: the-ai-alliance.github.io/trust-safety-user-guide/

This “living” document introduces current trends in research and development for ensuring that AI models and applications produce trustworthy results, in particular results that satisfy various safety criteria. Aimed at developers and leaders who are relatively new to this topic, the guide defines common terms, surveys several leading trust and safety education and technology projects, and offers recommendations for how to build trust and safety into your AI-based applications.

The leading trust and safety projects discussed include the Risk Management Framework from the National Institute of Standards and Technology (NIST), Trust and Safety at Meta, the Mozilla Foundation’s guidance on Trustworthy AI, the MLCommons Taxonomy of Hazards, and others.

We welcome your contributions! 

We intend to evolve this living document, in collaboration with the broader AI community, to reflect emerging trends in trust and safety and to provide more in-depth guidance and usable examples. The guide is published using GitHub Pages, so anyone can contribute improvements as pull requests against the guide’s source repository.
