
GEO-Bench-2: From Performance to Capability, Rethinking Evaluation in Geospatial AI
GEO-Bench-2, developed by IBM, ServiceNow, and the AI Alliance Climate & Sustainability Working Group, establishes a new global standard for evaluating Geospatial Foundation Models (GeoFMs). By combining 19 datasets across 8 subsets, a flexible evaluation protocol, and over 15,000 baseline experiments, it delivers a transparent, rigorous, and collaborative framework for advancing geospatial AI. GEO-Bench-2 is integrated with tools like TerraTorch and hosted on Hugging Face, bridging research and real-world impact. It helps scientists, industry, and policymakers measure progress, accelerate innovation, and apply trustworthy AI to climate, sustainability, and disaster resilience challenges.