Surprise, surprise – last Thursday's debate on this proposition was a
pushover for the opposition. To defeat any argument of the form “XXX has no
place in YYY”, all you have to provide is one counter-example.
Just for starters:
- The UK Data Archive, powered by the HASSET thesaurus
- The FAO’s AGRIS database, searchable using AGROVOC, and
- EUROVOC, used for searching publications of the EU institutions and others
were among 11 such examples that Leonard Will managed to
cram on to one slide. He could have gone on to cite dozens more cases where a
thesaurus provides sophisticated and indispensable search capabilities.
The “expert witness” Philip Carlisle backed him up by
describing the nine vocabularies and related services that English Heritage built
and maintains for the heritage community. Contributions from the floor drew
attention to the power of a thesaurus to cross language boundaries, not to
mention image searching, where indexing with a controlled vocabulary still
outperforms all the other methods.
But simply overthrowing the proposition misses the point –
the role of the thesaurus in modern information retrieval has shrunk from what
it once was. The high development and maintenance costs of an extensive
controlled vocabulary deter most potential implementers. Most users simply do
not want to know about such a complicated-looking beast, and so the shy
thesaurus needs to perform discreetly but cost-effectively behind the scenes.
Given a discerning team of developers, curators, IT support staff and indexers,
this sophisticated tool can and should function interoperably alongside
statistical algorithms, NLP techniques, data mining, clustering, latent
semantic indexing, linked data, etc. Networking and collaboration, not rivalry,
are the future.
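
To make "behind the scenes" concrete, here is a minimal sketch of one way a thesaurus could quietly expand a user's query before it reaches a search engine. Everything in it is an assumption made for illustration: the toy vocabulary, the `Concept` class and the `expand_query` function are hypothetical and come from none of the systems mentioned in the debate. The user types any word they like; the thesaurus maps it to a preferred term and adds the non-preferred and narrower terms, and a statistical ranker could then take over from there.

```python
from dataclasses import dataclass, field


@dataclass
class Concept:
    """One thesaurus concept: a preferred term plus its relationships."""
    preferred: str
    non_preferred: set = field(default_factory=set)  # UF (entry terms)
    narrower: set = field(default_factory=set)       # NT (preferred terms)


# Hypothetical toy vocabulary, invented purely for this illustration.
THESAURUS = {
    "dwellings": Concept("dwellings", {"homes", "housing units"},
                         {"cottages", "flats"}),
    "cottages": Concept("cottages", {"cabins"}),
    "flats": Concept("flats", {"apartments"}),
}


def build_entry_index(thesaurus):
    """Map every label (preferred or not) to its preferred term."""
    index = {}
    for concept in thesaurus.values():
        index[concept.preferred] = concept.preferred
        for label in concept.non_preferred:
            index[label] = concept.preferred
    return index


def expand_query(term, thesaurus, include_narrower=True):
    """Return the sorted list of labels to search for, given a user's term.

    Terms unknown to the thesaurus are passed through unchanged, so the
    rest of the retrieval system can still treat them as free text.
    """
    index = build_entry_index(thesaurus)
    key = term.strip().lower()
    preferred = index.get(key)
    if preferred is None:
        return [key]

    concept = thesaurus[preferred]
    labels = {preferred} | concept.non_preferred
    if include_narrower:
        for narrower_term in concept.narrower:
            labels.add(narrower_term)
            child = thesaurus.get(narrower_term)
            if child is not None:
                labels |= child.non_preferred
    return sorted(labels)


if __name__ == "__main__":
    print(expand_query("Homes", THESAURUS))
    print(expand_query("castles", THESAURUS))
```

The user never sees the vocabulary: a search for "Homes" is silently broadened to cover dwellings, cottages, flats and their synonyms, while a word the thesaurus does not know simply falls through to whatever free-text or statistical matching the system already offers.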
As the professional body that has grown up around
classification, indexing, use of thesauri and other knowledge organization
systems, ISKO has a mandate to mark out that future. Follow-up activities could
usefully explore:
- the contexts in which the thesaurus is or is not a useful tool;
- how to choose between a thesaurus and another type of knowledge organization system;
- how to integrate a thesaurus with the other components of a modern information retrieval system;
- how to adapt a standard thesaurus to the needs of special contexts;
- features of the software needed for thesaurus management.
The knowledge organizer with a grasp of these topics is
ideally placed to develop the hybrid vocabulary structures (e.g. a thesaurus
layer hooked on to upper-level ontologies and coated with taxonomy
features) needed in today’s networked environments.