Cross-domain Glossary Management Pilot

Contributed by: 
Ankita Tripathi and Cameron Shorter


Glossaries are easy to set up for simple examples but extremely hard to scale, especially when a project wants to inherit terms from other organizations. So we are kicking off a pilot to address cross-domain management of glossaries. Contributors are coming from the tech writing community, OGC, the ISO/TC 211 Terminology Maintenance Group (TMG), and OSGeo projects. Hopefully, OGC members will stand up glossaries too. Interested readers are encouraged to sign up to the GeoLexicon mailing list.

Use cases

This project plans to address the following use cases:

  • As a general document reader, I want to find definitions for the terms and acronyms in the document I am reading. There may be multiple definitions for a term, determined by context, having multiple upstream source definitions.
  • As an advanced document reader or term maintainer, I want to understand the inheritance path back to upstream source definitions.
  • As a technical writer, I want to find the preferred spelling, capitalization, and word choice for a term.
  • As a document translator, I want glossary terms to be translated into my target languages, so I can consistently translate a source term to the same target term.
  • As a project, I want a glossary that includes terms specific to my project, as well as terms sourced from multiple external glossaries.
  • As a foundation, I want a glossary which sources terms from all the sub-communities.
  • As a glossary owner, I want to ensure my glossary is continuously updated to align with updates in my source glossaries.
  • As a software developer, I want terms and relationships between glossaries in a machine-readable form so that I can integrate glossary functionality into software.
  • As a data modeller, I want the terms described in my model to use existing term definitions (from APIs, standards, etc.), as defined within my domain, so that I can share my terms with others and seamlessly integrate datasets from multiple sources.
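
To make the inheritance and machine-readable use cases more concrete, here is a minimal sketch in Python of how a term might record its upstream source so that a reader or tool can trace the inheritance path. It is not a schema produced by the pilot. The class name, field names, glossary names, and definitions below are all hypothetical placeholders chosen for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Term:
    """One glossary entry. Every field name here is a hypothetical choice."""
    glossary: str                    # glossary that owns this entry
    label: str                       # preferred spelling and capitalization
    definition: str
    language: str = "en"
    source: Optional["Term"] = None  # upstream term this entry inherits from


def inheritance_path(term: Term) -> List[str]:
    """Walk from a term back to its most upstream source definition."""
    path = []
    current: Optional[Term] = term
    while current is not None:
        path.append(f"{current.glossary}:{current.label}")
        current = current.source
    return path


# Placeholder glossaries and definitions, for illustration only.
upstream = Term("standards-body", "feature", "placeholder upstream definition")
foundation = Term("foundation", "feature", upstream.definition, source=upstream)
project = Term("my-project", "Feature", "placeholder project definition",
               source=foundation)

print(" <- ".join(inheritance_path(project)))
```

A single `source` link keeps the sketch short. A term with several upstream definitions, as described in the first use case, would need a list of sources and a graph traversal instead of a simple chain.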

Schedule

The schedule aligns with The Good Docs Project's Season of Docs initiatives.

Deliverables

The following deliverables are likely to be achieved within the pilot’s timeframe:

  • Schemas for various term types
  • Processes for managing glossaries
  • How-to guides
  • Tools
  • Glossaries as web services for:
    • OSGeo
    • Various OSGeo projects
    • OGC
    • ISO/TC 211
    • Spatial organizations (possibly)
  • Translations

Why now? Why Geospatial?

As of August 2020, many things are lining up to enable us to collectively solve the tough challenges around the cross-domain management of glossaries.

Aligning activities include:

  • The Good Docs Project has been making progress tackling technical writing problems. We have recently built a guide, "How to apply/customize a writing style guide for software projects". The next step is to explain how to apply word lists and glossaries, and we have volunteers willing to push this forward.
  • The geospatial community is very advanced at trying to solve terminology management challenges:
    • Through the OSGeo Foundation, we have relationships with about 50 geospatial open source projects that all need glossaries. Through the OSGeoLive project, we have contact points with each of these projects as well as access to volunteer translators for OSGeo documentation. In the 2019 Season of Docs program, we connected with all these communities and updated their quickstarts. We can do it again for glossaries.
    • OGC has recently modernized and expanded its Definitions Server and has linkages with a number of other standards development organizations that have their own terminology projects.
    • We have experienced volunteers from the OGC and ISO/TC 211 standards bodies keen to bring their expertise to advance this challenge. These volunteers are already working on this problem.
    • From the ISO/TC 211 and OGC communities, we have access to open-source software for term management and access to the people who wrote it.
    • The ISO/TC 211 TMG is redeveloping its Multilingual Glossary of Terms (MLGT) as an ISO SMART project for machine-readable/interpretable terminology, covering everything from lifecycle management to the usage of such content.
    • The ISO/TC 211 MLGT SMART work is performed in partnership with Ribose, which supplies the Glossarist software and the Geolexica terminology web platform. Ribose has volunteered to support the OSGeo lexicon work and its workflow in both of these offerings.
    • Through the GeoLexicon working group, we have OSGeo volunteers who have been maintaining a glossary of terms. They will be able to apply these terms and add more.
  • The Good Docs Project is starting a sprint of work, aligned with Google's Season of Docs. We are shooting for a soft launch in December 2020 and a hard launch around February 2021. This helps frame a sense of purpose, timing, and scope which we can tap into.
  • There are other initiatives within The Good Docs Project which will complement this work and facilitate cross-pollination of ideas.

If this interests you and you wish to take part, join us by signing up to our GeoLexicon mailing list (you may need to check your spam folder for the confirmation email). The more members, the bigger the impact!