Towards a Cloud-Native OGC
This is Part 1 of a two-part post. See Part 2 of this blog post, 'Towards a Cloud-Native Geospatial standards baseline', here.
About six months ago I started as the first ‘Visiting Fellow’ of the Open Geospatial Consortium. It’s been a true pleasure to explore various aspects of OGC more deeply, working with staff and members. The time has flown by, and so I wanted to share my progress and some thoughts on what comes next.
The open-ended scope of the fellowship was amazing, but I realized that I’d quickly have to focus if I was to actually make an impact while working a half day a week for six months. The theme that emerged, which I call ‘Cloud-Native OGC’, explores the fundamental components that enable geospatial standards on the cloud, at a level ‘below’ APIs.
This is an evolution of the ideas I presented four years ago in a blog series called ‘Cloud-Native Geospatial’, which opened with the question ‘What would the geospatial world look like if we built everything from the ground up on the cloud?’. I’ve spent much of my time since then focused on two core parts of that transformation: Cloud Optimized GeoTIFFs and SpatioTemporal Asset Catalogs. We’ve seen some incredible early adoption of both of those formats, but it’s been mostly centered on multi-spectral satellite imagery, which is only a small corner of the overall geospatial world. So my time as an OGC Visiting Fellow has been spent on a riff on that original question: ‘What would geospatial standards look like if they were built for the cloud?’ I was able to take the time to look at the entire geospatial landscape, not just imagery, and at the potential for OGC to play the key leadership role in making the Cloud-Native Geospatial vision a reality.
The Cloud-Native Geospatial Vision
In digging in more, I found that OGC’s existing standards work could easily evolve to align the industry on Cloud-Native Geospatial architectures. There is no organization better situated to make it a reality than OGC: it is already trusted by every government as the steward of geospatial standards and has the largest community of geospatial experts working together, across the commercial, non-profit, government, and academic sectors.
Before I go deep into details of the standards necessary to support this, it’s worth a full articulation of the future state enabled by this vision.
The OGC mission is to ‘make location information Findable, Accessible, Interoperable, and Reusable (FAIR)’. Cloud-Native Geospatial shares the exact same goal, but leverages the cloud to radically simplify the effort needed to make geospatial data FAIR. Instead of forcing data providers to stand up, maintain, and scale their own APIs, the requirement should be as simple as using the right cloud-native geospatial format and metadata, and uploading it to any cloud. All the APIs and scalability come from the cloud itself, enabling geospatial to ride the continuous waves of innovation in the broader IT world instead of continually playing ‘catch up’.
A core aim of cloud-native geospatial is to decrease the burden on data providers, and in turn enable far more geospatial data to be FAIR. The only cost that providers should need to pay is for the cloud storage, which is currently between US$1 and US$5 a month for 100 gigabytes of data. If that core data is hosted on the cloud, then general cloud-native technologies enable the cost equation to be flipped on its head, as the users of the data pay for any computation they do, and with ‘requester pays’ the users even pay for the egress costs.
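To make the storage arithmetic concrete, here is a rough sketch in Python. It assumes simple linear pricing with no tiers, request charges, or egress fees, which real cloud bills do include; the function names and the per-gigabyte constants are illustrative and are derived only from the US$1 to US$5 per 100 gigabytes range quoted above.

```python
# Per-gigabyte monthly prices implied by the quoted range of
# US$1 to US$5 per 100 GB. These are assumptions, not a provider's price list.
LOW_PRICE_PER_GB = 1.0 / 100
HIGH_PRICE_PER_GB = 5.0 / 100


def monthly_storage_cost(size_gb: float, price_per_gb: float) -> float:
    """Return the monthly storage cost in US dollars, assuming linear pricing."""
    if size_gb < 0 or price_per_gb < 0:
        raise ValueError("size and price must be non-negative")
    return size_gb * price_per_gb


def monthly_cost_range(size_gb: float) -> tuple[float, float]:
    """Return the (low, high) monthly cost estimate for a dataset of size_gb."""
    return (
        monthly_storage_cost(size_gb, LOW_PRICE_PER_GB),
        monthly_storage_cost(size_gb, HIGH_PRICE_PER_GB),
    )


if __name__ == "__main__":
    low, high = monthly_cost_range(1000)  # a one-terabyte archive
    print(f"1 TB: roughly US${low:.0f} to US${high:.0f} per month")
```

Under these assumptions a one-terabyte archive would cost the provider on the order of US$10 to US$50 a month to host, with compute and (under ‘requester pays’) egress falling to the users.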
Once the data is in the right cloud-native geospatial formats, it’s easy for anyone to stand up a traditional geospatial server, ideally one making the data available as OGC APIs. But the data itself becomes FAIR even if it’s not behind an advanced API, as the cloud plus key standards supply all that is needed to serve the data.
But things get really exciting when thinking about a whole new class of cloud-native geospatial tools that can layer on top of the core FAIR data, sitting alongside traditional geospatial services. Google Earth Engine has been operating in this future for years, enabling global-scale computations that run across tens of thousands of compute nodes simultaneously to deliver answers in seconds. The GEE team has done an incredible job of curating a vast amount of data, but GEE has traditionally been a walled garden where only data ingested into GEE could tap into its capabilities. In the Cloud-Native Geospatial vision, any data on the cloud could be used by GEE (and indeed they have started to embrace the CNG vision with COG registration).
More importantly, any new cloud-scale compute tool like GEE wouldn’t need to build up its own data catalog, as it could just access the same cloud-native geo formats that other tools use. A suite of cloud-native geospatial tools combined with cheap data hosting then opens up the potential for a much longer tail of geospatial data to be FAIR: smaller organizations that have valuable information but not the wherewithal to run servers will embrace cloud-native geospatial, because putting their data on the cloud will unlock many awesome tools and analyses. Access to all the world’s information in one place, combined with infinite-scale computation, should in turn usher in a whole new wave of innovative tools that move beyond traditional geospatial analysis to finding broader patterns. The line between geospatial and non-geospatial information will then blur once it is cloud native, greatly magnifying the potential impact of geospatial insight - but that’s worth a blog post of its own.
Getting to a critical mass of data that is actually usable by advanced tools on the cloud then opens up the possibility of real ‘geospatial search’. The main paradigm today is that you must know or find a particular geospatial server, and only then can you perform geospatial searches to find the information you need. There is no ‘Google for location information’ because there is no standard format to ‘crawl’ like there is HTML for the web. Simple metadata and data formats that live on the cloud provide the core ‘crawlability’, particularly when they have an HTML equivalent that is crawlable by traditional web search engines, as described by the Spatial Data on the Web Best Practices.
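As a rough illustration of what ‘crawlability’ means in practice, the sketch below walks a tiny static catalog by following links, much as a web crawler follows hyperlinks. The catalog contents and example.com URLs are hypothetical, and the documents are heavily simplified; only the idea of `links` entries with `rel` values such as `child`, `item`, and `parent` is borrowed from SpatioTemporal Asset Catalogs. The `fetch` argument stands in for an HTTP GET so the example runs without a network.

```python
from collections import deque

# Hypothetical static catalog: URL -> already-parsed JSON document.
CATALOG = {
    "https://example.com/catalog.json": {
        "id": "root",
        "links": [
            {"rel": "child", "href": "https://example.com/imagery/collection.json"},
        ],
    },
    "https://example.com/imagery/collection.json": {
        "id": "imagery",
        "links": [
            {"rel": "item", "href": "https://example.com/imagery/scene-1.json"},
            {"rel": "item", "href": "https://example.com/imagery/scene-2.json"},
        ],
    },
    "https://example.com/imagery/scene-1.json": {
        "id": "scene-1",
        "links": [
            {"rel": "parent", "href": "https://example.com/imagery/collection.json"},
        ],
    },
    "https://example.com/imagery/scene-2.json": {
        "id": "scene-2",
        "links": [],
    },
}


def crawl(start_url, fetch):
    """Breadth-first walk over 'child' and 'item' links; returns document ids."""
    seen = {start_url}
    queue = deque([start_url])
    found = []
    while queue:
        url = queue.popleft()
        doc = fetch(url)  # in a real crawler this would be an HTTP GET
        found.append(doc["id"])
        for link in doc.get("links", []):
            href = link.get("href")
            # Only follow downward links, and never visit a URL twice.
            if link.get("rel") in ("child", "item") and href not in seen:
                seen.add(href)
                queue.append(href)
    return found


if __name__ == "__main__":
    print(crawl("https://example.com/catalog.json", CATALOG.__getitem__))
```

Because the crawler needs nothing but plain files and links, the same walk would work against any cloud bucket that hosts such documents, with no geospatial server involved.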
The key thing cloud-native geospatial enables for search is access to the actual data - you can stream it directly into diverse tools that provide real value. Previous attempts at geospatial search engines would at best show a preview image, and often only a text description; frequently the actual data wasn’t even available for direct download, because the search covered only the metadata. With cloud-native geospatial, the search tool can stream full-resolution data directly in the browser, or link to more powerful tools that enable cloud-based analysis of the search results. The cloud-native geospatial vision focuses first on getting a critical mass of data to the cloud, but once there are sufficiently valuable masses of information it opens the possibility of a whole new class of technologies and companies focused on more innovative geospatial search tools.
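The streaming described here typically rests on ordinary HTTP range requests: a client asks the cloud object store for only the bytes it needs (for example, one tile of a Cloud Optimized GeoTIFF) rather than downloading the whole file. The sketch below only builds such a request and does not send it; the URL, offset, and length are made-up values, and the helper names are illustrative.

```python
import urllib.request


def byte_range_header(offset: int, length: int) -> dict:
    """Build an HTTP Range header covering `length` bytes starting at `offset`."""
    if offset < 0 or length <= 0:
        raise ValueError("offset must be >= 0 and length must be > 0")
    last_byte = offset + length - 1  # HTTP byte ranges are inclusive at both ends
    return {"Range": f"bytes={offset}-{last_byte}"}


def build_partial_request(url: str, offset: int, length: int) -> urllib.request.Request:
    """Prepare (but do not send) a GET request for one slice of a remote file."""
    return urllib.request.Request(url, headers=byte_range_header(offset, length))


if __name__ == "__main__":
    # Hypothetical file and byte positions, for illustration only.
    request = build_partial_request("https://example.com/scene-1.tif", 0, 16384)
    print(request.get_header("Range"))  # bytes=0-16383
```

A server that supports range requests answers with just that slice, which is what lets a browser-based search tool show full-resolution pixels without a dedicated tile server.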
Towards a Cloud-Native Geospatial Standards Baseline
So how do we actually make progress towards this vision? The core standards are much closer than one might expect. But it must be emphasized that achieving this vision will take far more work from the entire geospatial industry than just releasing some standards. We need a sustained effort to bring every piece of location information to the cloud in standard formats, to update every tool to be able to work with it, and to build together a whole new class of next-generation tools that show the power of having petabytes of information about the world in one place.
This requires a very solid foundation to build on, one that enables layers and layers of innovation on top. But this standards baseline must also be adaptable to the overall technology landscape, so that it can ride larger tech trends (like the shift from XML to JSON and to whatever will come next). The key is to build small, loosely coupled pieces with few moving parts that focus on the truly geospatial components. This approach is one of radical simplicity, getting the core atomic units right to enable unimagined innovation on top.
In Part 2, I’ll dive deep into a practical plan to get to a minimal viable standards baseline for cloud-native geospatial, leveraging all the great work the OGC and the broader geospatial community have done. My work under the fellowship over the last few months already suggests that we are potentially close to that baseline. If we work together to realize it and build out the interoperable ecosystem of tools around it, then the power of geospatial information will be available to all. Those of us who work in the field know that power, and if we can build simple cloud-native interoperability that makes it easy for anyone to access our data, then the impact on the world will be immeasurable.
Interested in this vision? Be sure to read Part 2 of this blog post, 'Towards a Cloud-Native Geospatial standards baseline'.