Trust and Safety

Ensuring safe and trusted generative AI with benchmarks, tools, and methodologies

Our members work together to understand the landscape of AI trust and safety risks, as well as other uses of AI system evaluation. We identify use cases specific to important domains such as finance, healthcare, and education. Together with external collaborators, we build tools, refine methods, and create benchmarks for detecting and mitigating those risks and for performing other kinds of evaluation. We also help educate the public about responsible AI, and the developer community about responsible model and application development.

Our work

These are the projects we are currently working on.

Understanding AI Trust and Safety: A Living Guide

Trust & Safety

A major challenge for the successful use of AI is understanding potential trust and safety issues, along with strategies for mitigating them. Failure to consider these issues could harm an organization's operations and the experience of its users. Safety concerns are also driving current regulatory initiatives. Hence, applications built with AI must be designed and implemented with trust and safety in mind. This guide introduces trust and safety concerns and offers guidance for AI projects.

Ranking AI Safety Priorities by Domain

A challenge for software development teams adopting generative AI is making sense of the safety issues that their applications must address. The AI safety ecosystem is broad and growing quickly, making it difficult for these development teams to know where they should focus their efforts. What safety concerns are most important for them to work on first?

Trust and Safety Evaluations

Much like other software, generative AI (“GenAI”) models and the AI systems that use them need to be trusted by, and useful to, their users. The Trust and Safety Evaluations project fills gaps in the current evaluation landscape: a taxonomy of the different kinds of evaluation, tools for creating and running evaluations, and leaderboards that address particular categories of user needs.

Trusted Evals Request for Proposals

The AI Alliance Trusted Evals request for proposals seeks new perspectives in the AI evaluation domain. We are excited to work with people in academia, industry, and startups, and with anyone who wants to collaborate in the open and build an ecosystem around their work.