Frontier · Sovereign AI

Project
Tapestry

A consortium approach to training frontier foundation models and sovereign derivatives.

Yann LeCun · Chief Science Advisor

The N+1 technical architecture: a shared base model at the core surrounded by sovereign nodes, with arrows showing weights exchanged and updated while no data leaves any node.

Announcement

AI Alliance launches Project Tapestry to build a collaborative foundation for open and sovereign AI.

The AI Alliance, a non-profit AI research and open-source technology coalition with more than 200 member organizations, introduces Project Tapestry — a new open-source platform for globally federated development of frontier AI models — preserving local control and long-term independence.

Today, open-weight AI models are everywhere. But open weights alone do not make pretraining participatory.

The infrastructure, data pipelines, and design decisions behind these models remain concentrated in a small number of companies and regions. Most of the world downloads the result. Almost no one shapes the process.

— Yann LeCun · Chief Science Advisor

Built on advances in distributed training which have demonstrated that globally federated model development can match synchronous baselines, the project enables institutions, industries, and nations to co-train a shared open foundation model while retaining control of their data and the ability to build sovereign derivatives aligned to their own priorities.

The thesis is testable and the science is ready. What remains is coordination.

The thesis

The time is now for frontier sovereign AI

Today, models with frontier performance are owned and controlled by a small number of organizations based in a few regions. Only a fraction of the world's data, compute infrastructure, and technical community are involved in frontier model development.

What if we could build a new global foundation model with this broader set of knowledge, resources, people and organizations working together — achieving performance no single organization could replicate, that every participating organization could build on?

Tapestry goals

Performance is the path to sovereignty

Tapestry brings together data, compute and talent from a global consortium to build the next frontier of foundation models — trained on a larger, more diverse corpus than ever before.

Frontier performance

A global foundation model system more performant than any other, while enabling sovereign derivatives owned and controlled by partners.

Sovereignty

A person, organization, or nation can own and control their AI, use it how they see fit, and retain the value they create with it — independence and agency.

Structural advantage

Consortium-managed access to data creates a structural advantage. Distributed ownership, sustainable funding, and mission-locked governance ensure trust.

How it works

Own your data, share in the model

Consortium model

One base model, many owners

Partners keep full ownership of their data and compute — and the value they create with it. Every contribution makes a base model that all owners build on stronger.

Core

Shared base model

↑ shared training · weight contributions

↓ improved base · right to sovereign derivatives

Partner noderetains own data & compute

Diagram A — A shared base model at the core; partner nodes contribute to shared training and receive the improved base plus the right to build sovereign derivatives. Each partner node retains its own data and compute.

Operating model

Decentralized contribution, centralized integration, trusted governance

The design principle behind Tapestry's operating model — contribution happens everywhere, technical integration happens at the core, and governance is trusted by everyone.

Decentralized contribution

Centralized technical integration

Globally trusted governance

Distributed training

Only the weights are shared

Data can be contributed privately to the core training corpus for base-model training only. Sensitive data is held back at the partner node and used locally to update weights — only the weights are shared.

Partner node

Stays local · never leavesLocal data (sensitive)

Computed on local dataLocal weight update

Only weights shared

Core

Receives weights onlyCore corpus / aggregation

Diagram B — At the partner node, sensitive local data never leaves; it is used locally to compute a weight update. Only the weights travel to the core corpus for aggregation. Data stays local — weights travel.

Sovereign derivatives

Your model, fully owned

Any partner can create a derivative model of the base that they fully own and control — using the same infrastructure.

Globally sourced training data

Globally Sourced Training Data

Training data sourced from partners around the world may become one of Tapestry's most important assets and the source of our differentiation. As a starting point we seek to catalog potential sources of training data from partners globally.

What can be contributed

Public or private datasets
Domain / industry data
Language & cultural resources
Government / public records where lawful
Evaluation datasets & environments
Preference & feedback data

What must be described

Provenance & ownership
License & use restrictions
Privacy / PII status
Quality & AI-readiness
Geography & jurisdiction
Permitted use (pre/post-training, evals, synthetic gen)

Benefits to contributors

Metadata-first contribution — partners describe contributions by type and metadata; data stays local and raw corpora are never shipped.

Propose a dataset →

Phase 0

Phase 0 · active now

Phase 0: First technical milestones

Specific early goals to validate hypotheses and de-risk scale-up.

Cultural alignment

Release and publish the first model re-aligned to 2+ cultures without core performance loss.

Distributed training

Demonstrate and open source the first multi-node weight update and aggregation framework; run experiments to understand scale-up requirements.

GitHub ↗

Training data catalog

Create a registry of metadata associated with candidate data sources available to the consortium.

Catalog ↗

Roadmap · 2026–2027

Long-term roadmap

Progression between phases is gated by technical proof points, partner commitments, compute availability, data readiness, and funding thresholds.

Phase 0

Jun 2026

Initial commitments

MOUs, initial sponsors, tech demos, data catalog prototype; formation steering committee.

● Active now

Phase 1

Sep 2026

Training platform v1

N-node training, local tuning, initial aggregation, first legal/data framework; initial governance & entity.

Phase 2

EOY 2026

First base model

Small model from scratch, continued pretraining pipeline, first sovereign derivatives.

Phase 3

1H 2027

Early deployment

Industry/government use cases, larger runs, developer & partner ecosystem.

Phase 4

Summer 2027+

Frontier-scale effort

Contingent on compute, data, and capital thresholds.

Join the founding coalition

Who should join

Tapestry is assembling its first contributors — the researchers, systems engineers, compute providers, governments, and institutions who will build the first sovereign federated training run.

01ML researchers working on distributed optimization, federated learning, or low-communication training
02Systems engineers with experience in large-scale GPU cluster orchestration
03Compute providers — cloud, sovereign cloud, or national HPC centers
04Government and policy leaders responsible for national AI strategy
05Universities and research labs with multilingual, domain-specific, or institutional datasets

Paris workshop official report

Next steps · get involved

Join us to build frontier sovereign AI

Path 01

Learn more / get involved

New to Tapestry? Tell us how you'd like to participate and we'll be in touch.

Get involved

Path 02

Contribute

Explore the code, follow the work, and open issues or pull requests on the Tapestry repository.

GitHub →

Path 03

Propose data

Have a data source for the consortium? Tell us about it through the form.

Propose data →

Get involved

Tell us how you'd like to participate and we'll be in touch.

ProjectTapestry

AI Alliance launches Project Tapestry to build a collaborative foundation for open and sovereign AI.

The time is now for frontier sovereign AI

Performance is the path to sovereignty

Frontier performance

Sovereignty

Structural advantage

Own your data, share in the model

One base model, many owners

Decentralized contribution, centralized integration, trusted governance

Only the weights are shared

Your model, fully owned

Globally Sourced Training Data

What can be contributed

What must be described

Benefits to contributors

Phase 0: First technical milestones

Cultural alignment

Distributed training

Training data catalog

Long-term roadmap

Initial commitments

Training platform v1

First base model

Early deployment

Frontier-scale effort

Who should join

Join us to build frontier sovereign AI

Learn more / get involved

Contribute

Propose data

Get involved

Project
Tapestry