AAAI-2025 Workshop on “Open-Source AI for Mainstream Use”

Discussing the technical challenges of creating an open-source AI ecosystem

This workshop is co-located with AAAI 2025

The 39th Annual AAAI Conference on Artificial Intelligence

About the Workshop:

According to the 2024 AI Index Report, 65.7% of the 149 foundation models released in 2023 were open source, and there were 1.8 million AI-related projects on GitHub in 2023, a 59.3% rise in just one year. Typical reasons for adopting open models are faster access to innovation, cost effectiveness, transparency, and the ability to modify the model. In addition to foundation models, an open-source AI ecosystem must also include tools and techniques that support downstream activities (e.g., model adaptation, human alignment, and testing & evaluation). With the increasing number of AI regulations around the world that attempt to specify what is acceptable for societal use, how the open-source AI ecosystem manages the risk of building, deploying, and managing these systems matters immensely. Therefore, while open-source AI brings many economic and social benefits, creating an open-source AI ecosystem poses many technical challenges.

The goal of this interdisciplinary workshop is to explore the following five areas:

  • Unique aspects of open source that make it ideal for building responsible AI applications.
  • Technology challenges to make open-source AI the mainstream platform.
  • Demonstration of the real progress already made in the open-source AI community.
  • Technical guidance to support practical and meaningful regulations that promote open technology.
  • Building a vibrant open-source AI community and ecosystem.  

Because practical aspects matter greatly in these areas, we want to address both active research and practical implementations that shed light on the increasing role of open-source AI in society.

As is evident from the description above, the impact of open-source AI is undeniable. However, to make it the mainstream approach to developing responsible AI applications, we need to answer many questions, such as:

  • How good are the leading open-source models compared to proprietary models?
  • What real problems can open-source AI solve that a proprietary approach cannot?
  • How do we make open-source AI safe and secure?
  • Is it viable to have a completely open-source solution stack? 
  • How is open-source AI affected by the evolving regulations around the world? 


Examples of specific research and demonstration topics are given below:

Research topics at the intersection of AI and Open Source:

  • Openness:  Current and evolving frameworks for defining openness of AI models
  • Assurance during AI system development: Specifications (e.g. resources, performance, execution speed, etc.) and safety requirements (use cases, context, failure modes, etc.), metrics & benchmarks, model-level and system-level alignment, measurement, continuous evaluation and reporting.
  • Safety & Security: Post-deployment concerns such as unintended usage, model jailbreaking, model watermarking, guardrails, etc.
  • Transparency: Visibility into AI system components (weights, training procedure and results, etc.), particularly the unique challenges in the collection, use, and potential exposure of data.
  • Accountability: Given the prevalent use of AI in business applications, open-source models pose unique questions of liability ownership compared to proprietary models.
  • Privacy: Enumeration of privacy guarantees required of open-source implementations. 
  • Low-resource options: Creation of open-source AI components that do not need the enormous computing resources of closed-source options.
  • Frameworks/Platforms: Creation of decentralized open-source options to support end-to-end AI application development.
  • IP ownership and Licensing: Creation of appropriate legal constructs to address the needs of commercial usage of models trained on non-proprietary data.   

 

Examples of open-source demo topics:

  • Adaptation of an LLM with various techniques (e.g., RAG, LoRA)
  • Building Mixture-of-Experts from LLaMA with Continual Pre-training
  • End-to-end RAG implementation using an open-source stack
  • Incremental knowledge addition to LLMs (InstructLab)
  • Simplifying GenAI deployments with Open Platform for Enterprise AI (OPEA).
  • Open source tools for AI guardrails (e.g. PurpleLlama, LlamaGuard)
  • Hate, Abuse, Profanity detection and mitigation
  • Hallucination detection
  • Structured generation: improved performance at reduced cost
  • Memory-Efficient LLM Training 
  • Best practices on development, deployment and monitoring
  • Open stack Contrastive Language-Image Pre-Training (CLIP) embeddings
  • Quantization & Pruning
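To give a flavor of the retrieval step behind the RAG demo topics above, the following is a toy sketch using bag-of-words cosine similarity over a handful of illustrative documents. All document text, function names, and the scoring scheme are hypothetical; a real open-source stack would use dense embeddings and a vector store.

```python
import math
from collections import Counter

def bow(text):
    """Bag-of-words term counts for a text (illustrative tokenizer)."""
    return Counter(text.lower().split())

def cosine(a, b):
    """Cosine similarity between two term-count vectors."""
    dot = sum(a[t] * b[t] for t in set(a) & set(b))
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

# Hypothetical document store, standing in for a real corpus.
docs = [
    "LoRA adds low-rank adapters for efficient fine-tuning",
    "RAG grounds model answers in retrieved documents",
    "Quantization shrinks model weights to low-bit integers",
]

def retrieve(query, docs, k=1):
    """Return the k documents most similar to the query."""
    ranked = sorted(docs, key=lambda d: cosine(bow(query), bow(d)), reverse=True)
    return ranked[:k]

# The retrieved context is prepended to the question before generation.
question = "how does quantization shrink weights?"
context = retrieve(question, docs)[0]
prompt = f"Context: {context}\nQuestion: {question}"
```

In a full pipeline the `prompt` would then be passed to an open LLM; here it simply illustrates where retrieval fits before generation.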
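The "Quantization & Pruning" topic can likewise be illustrated with a minimal sketch of symmetric per-tensor int8 quantization; the function names and the example weights are illustrative, not from any particular library.

```python
def quantize_int8(weights):
    """Symmetric int8 quantization: map floats to [-128, 127] with one scale."""
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / 127 if max_abs else 1.0
    q = [max(-128, min(127, round(w / scale))) for w in weights]
    return q, scale

def dequantize(q, scale):
    """Recover approximate float weights from int8 values and the scale."""
    return [v * scale for v in q]

weights = [0.31, -1.27, 0.05, 0.92]   # illustrative weight values
q, scale = quantize_int8(weights)
restored = dequantize(q, scale)
```

The round trip loses at most half a quantization step per weight, which is the basic trade-off low-bit inference demos make to cut memory and compute.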