In this blog post, taxonomist, author, speaker, and blogger Heather Hedden (The Accidental Taxonomist) introduces a survey she is currently conducting. Read about her angle on the specific purposes of licensing third-party Knowledge Organization Systems (KOS), as well as potential sources and limitations, before taking the survey (link at the bottom of this page).
The build-or-buy question applies not only to software but also to knowledge organization systems (KOS). A KOS, such as a taxonomy, thesaurus, or ontology, should be custom designed for each implementation, taking into consideration the unique set of content and the users. Nevertheless, there are situations when licensing a KOS created by a third party may be desirable, such as for the starting point of a KOS that is then modified, for a single facet of a faceted taxonomy, or for tagging multi-source research content.
Using an existing KOS created by a third party without modification can cause several problems. Its scope may be slightly narrower, or it might not be as detailed, so needed concepts would be missing. Its scope may be slightly broader, or it may be more detailed than needed, making it cumbersome and not user-friendly, and indexing with it would be inconsistent. Its language style might not suit the new users, so users cannot find what they are looking for. Its terms, and even their alternative labels (synonyms), may not match the language of the content, so content may not get indexed properly. Finally, it might not even have the desired structure, for example, a thesaurus where a hierarchical taxonomy is needed.
A KOS can be licensed as a starting point and then modified sufficiently for its new use. Modifications include removing concepts that are out of scope or not needed, adding missing concepts and their relationships, creating additional alternative labels for existing or new concepts, and changing the wording of selected preferred labels to conform with the preferences of the users. If only a fraction of the concepts need changing, and it’s more a matter of adding new concepts, then licensing can be a good way to get a KOS up and running much more quickly than starting from scratch.
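The four kinds of modification described above can be sketched in code. This is only a minimal illustration, assuming a hypothetical in-memory representation in which each concept has a preferred label, a list of alternative labels, and an optional broader concept. The concept IDs, labels, and the function name adapt_kos are invented for the example and do not come from any real licensed KOS or tool.

```python
import copy

# Hypothetical licensed KOS: concept ID -> preferred label, alternative labels, broader concept
licensed_kos = {
    "c1": {"pref": "Finance", "alt": [], "broader": None},
    "c2": {"pref": "Corporate finance", "alt": [], "broader": "c1"},
    "c3": {"pref": "Astrophysics", "alt": [], "broader": None},
}


def adapt_kos(kos, remove_ids=(), additions=None, new_alt_labels=None, relabels=None):
    """Return a modified copy of a KOS; the licensed original is left untouched."""
    adapted = copy.deepcopy(kos)
    # 1. Remove concepts that are out of scope or not needed
    for cid in remove_ids:
        adapted.pop(cid, None)
    # Detach any remaining concept whose broader concept was removed
    for concept in adapted.values():
        if concept["broader"] in remove_ids:
            concept["broader"] = None
    # 2. Add missing concepts together with their relationships
    for cid, concept in (additions or {}).items():
        adapted[cid] = copy.deepcopy(concept)
    # 3. Create additional alternative labels (synonyms)
    for cid, labels in (new_alt_labels or {}).items():
        for label in labels:
            if label not in adapted[cid]["alt"]:
                adapted[cid]["alt"].append(label)
    # 4. Reword selected preferred labels to match the users' language
    for cid, label in (relabels or {}).items():
        adapted[cid]["pref"] = label
    return adapted


adapted_kos = adapt_kos(
    licensed_kos,
    remove_ids=("c3",),
    additions={"c4": {"pref": "Expense reporting", "alt": [], "broader": "c1"}},
    new_alt_labels={"c2": ["Business finance"]},
    relabels={"c1": "Finance and accounting"},
)
```

Working on a copy keeps the unaltered licensed version available, which can be useful when the license requires attribution to the original or when the publisher later issues an update.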
Licensing a KOS to serve for just one or two facets or metadata properties of a larger KOS set may also be a practical option. A faceted taxonomy enables the user to filter or limit search results by a combination of concepts selected from multiple facets/filters. For example, for images these could be: geographic place, location type, occasion, person type, time of year, activity, and object. It might be desirable to license a KOS for geographic place or person type and create the other vocabularies in-house. Other examples of a single-facet KOS that might be of interest for licensing include product types and industries.
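To make the idea of faceted filtering concrete, here is a small sketch. It assumes hypothetical image records that are each tagged with one concept per facet; the records, facet keys, and the function name filter_by_facets are made up for illustration. In practice, the values of one facet (such as geographic place) could come from a licensed KOS while the others are created in-house.

```python
# Hypothetical image records, each tagged with one concept per facet
images = [
    {"id": "img1", "place": "Austria", "occasion": "Wedding", "activity": "Dancing"},
    {"id": "img2", "place": "Austria", "occasion": "Conference", "activity": "Speaking"},
    {"id": "img3", "place": "Canada", "occasion": "Wedding", "activity": "Dancing"},
]


def filter_by_facets(records, selected):
    """Keep only the records that match every selected facet value."""
    return [
        record
        for record in records
        if all(record.get(facet) == value for facet, value in selected.items())
    ]


# The user combines concepts from two different facets
matches = filter_by_facets(images, {"place": "Austria", "occasion": "Wedding"})
match_ids = [record["id"] for record in matches]
```

Because each facet is an independent vocabulary, swapping in a licensed vocabulary for one facet does not affect how the other facets are built or maintained.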
Licensing a KOS as is, with little or no modification, is sometimes appropriate if the original purpose and the new purpose are the same and the type of user is the same. This would not be the case for internally created content, but if the content comes from multiple external sources, such as published articles, and the users are conducting external research, then a third-party KOS in the desired discipline or industry might be appropriate. Fields such as medicine, pharmaceuticals, engineering, and the sciences in general may be suitable for licensing a KOS.
The licensed KOS not only needs to be in the appropriate subject area but needs to have been initially created for a similar audience and purpose, which can be determined by contacting the original creator/publisher of the KOS. For example, a subject area of “finance” will have somewhat different concepts depending on whether it was created for academic/research use or for internal enterprise content management use.
The licensed KOS should be of the desired type: classification system, taxonomy, thesaurus, ontology, etc. This is not always obvious, since the distinctions between taxonomies, thesauri, and ontologies can be blurred, and the term “taxonomy” is sometimes used for many different kinds of KOS. So, it’s important to ask the KOS publisher specific questions, such as how many top terms there are, what kinds of relationships there are between concepts, and whether there are classes or categories assigned to concepts.
If modification is going to be done, which is often the case, the license needs to permit modification. An open source and free KOS may restrict modification and require attribution to the source of the unaltered KOS. An open source and free KOS will typically not allow commercial reuse either. A paid license typically permits modification and commercial reuse.
A KOS that is available for license typically comes in a standard interchange format, such as CSV, RDF/SKOS, or RDF/OWL, so it can be imported into taxonomy/thesaurus/ontology management software, such as PoolParty Thesaurus Server, where it can be further modified. An understanding of the formats is needed to select the most desirable one when multiple formats are supported.
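As a rough illustration of what working with one of these formats involves, the sketch below reads a small CSV export into memory using only the Python standard library. The column names (id, prefLabel, altLabels, broader), the pipe separator for alternative labels, and the sample rows are assumptions made for this example; real CSV exports vary by publisher, and RDF/SKOS or RDF/OWL files would need an RDF library instead.

```python
import csv
import io

# Hypothetical CSV export of a small KOS; the column layout is an assumption
csv_export = """id,prefLabel,altLabels,broader
c1,Engineering,,
c2,Civil engineering,Structural engineering|Construction engineering,c1
c3,Mechanical engineering,,c1
"""


def load_kos_from_csv(text):
    """Parse a CSV export into a dictionary keyed by concept ID."""
    concepts = {}
    for row in csv.DictReader(io.StringIO(text)):
        concepts[row["id"]] = {
            "pref": row["prefLabel"],
            # Alternative labels are assumed to be pipe-separated within one cell
            "alt": [label for label in row["altLabels"].split("|") if label],
            # An empty "broader" cell marks a top term
            "broader": row["broader"] or None,
        }
    return concepts


kos = load_kos_from_csv(csv_export)
top_terms = [concept["pref"] for concept in kos.values() if concept["broader"] is None]
```

A flat CSV like this can carry labels and a simple hierarchy, but richer relationship types are usually easier to preserve in RDF/SKOS or RDF/OWL, which is one reason the choice of format matters.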
Finding the right KOS is important. There is a relatively new international resource, developed and maintained by the University of Basel Library: the Basel Register of Thesauri, Ontologies & Classifications (BARTOC). Each KOS is classified and assigned metadata for subject, category, KOS type, file format, language, and license type, among other classifications. It’s quite comprehensive for open source/free vocabularies; its coverage of commercially licensed vocabularies is not yet as complete, but it’s growing.
Some major information publishers who have developed extensive thesauri or taxonomies to index their published content do offer the vocabularies for license, but they do not promote it, so this is little known, and they reserve the right not to license vocabularies to a party considered a competitor. Examples include the Gale Subject Thesaurus and the Associated Press’ News Taxonomy.
To what extent do organizations seek to license a KOS as part of their knowledge or content management strategy? To find out, we have created a short multiple-choice questionnaire. Please take a few minutes (3-4) to fill out the Taxonomy Licensing Interest Survey.
Interested in the results of this survey? Follow us on Twitter!