Open Foundation Models and Datasets
Enabling an ecosystem of open foundation models, including those with multilingual and multi-modal capabilities, and open datasets.
We are responsibly enhancing the ecosystem of open foundation models and datasets. We are embracing multilingual and multimodal models, as well as science models tackling broad societal issues like climate change and education.
To aid AI model builders and application developers, we’re collaborating to develop and promote open-source tools for model training, tuning, and inference. We are also launching programs to foster the open development of AI in safe and beneficial ways, and hosting events to explore AI use cases.
Without good datasets, model training and tuning would be impossible. We are promoting the development of open datasets with clear governance and provenance controls so they can be used without concern about legal and other risks.
Working groups
AI for Drug Discovery
This working group aims to create a world-class research community that harnesses the potential of AI foundation models, transforms the field of drug discovery, and accelerates scientific progress by driving interdisciplinary collaboration on AI-powered drug discovery projects in the open.
Foundation Models
Materials and Chemistry
We aim to curate datasets, tasks, and benchmarks for materials science; build out foundation models in chemistry for predicting properties and experimental outcomes or generating new candidates; and create a framework that fosters collaboration between human experts and AI agents. Together, these efforts will ultimately help solve urgent global challenges in the sustainability and safety of materials.
Current or recent projects
Open Trusted Data Initiative
Foundation Models and Datasets
Cataloging and managing trustworthy datasets.
Time Series Data and Model Initiative
Foundation Models and Datasets
Time series applications are an important target for AI. In addition to gathering high-quality and fully governed time series datasets as part of the Open Trusted Data Initiative, Alliance members are collaborating on new and improved time series models (as part of the Industry Open FMs Initiative) and benchmarks, both general-purpose and application-specific.
Please join us. We need time series and domain experts, especially subject matter experts and use case and product owners who would like to apply emerging time series foundation models to new applications. There is an acute shortage of good, open time series datasets, and especially of benchmarks and evaluation methods for various use cases. Contributions are especially welcome here.
Industry Open FMs Initiative
Foundation Models and Datasets
We have seen rapid progress in building and releasing highly capable, open foundation models for general language, coding, scientific discovery, and multi-modal scenarios.
A key development in model strategies is a focus on building smaller, more specialized models.
More details are coming soon, but we would love for you to join us. We need both model-building and domain experts, including those outside the target domains listed above.