Facets of Knowledge Organization - A tribute to Professor Brian C. Vickery (1918-2009)

The second biennial ISKO UK national conference was held in honour of Brian C. Vickery, information scientist and knowledge organization pioneer. His work ran as a theme throughout the two-day event, drawing together a number of key topics - such as the way search and retrieval form the core of the information science field - while also providing an interesting historical perspective across 50 years of astonishing technological development and change.

Stephen Robertson, of Microsoft Research, gave the opening address, in which he highlighted key themes of Vickery's "On Retrieval System Theory", first published in 1961. Robertson raised one of the conference's key questions for debate - should there be a single core general theory of information retrieval, or should we remain a "mongrel" field, weaving together strands of theory from diverse fields?

Claudio Gnoli, University of Pavia, then talked about some aspects of Vickery's thinking that have not yet been widely published, especially concerning classification. Vickery thought much about ways of defining facets, taking a slightly different approach from that of S. R. Ranganathan, whom he had known personally. As a member of the Classification Research Group (CRG), Vickery favoured a "faceted" approach over the "operator" approach of Jason Farradane, Eric J. Coates, and Derek Austin. Vickery was interested in moving away from academic disciplines as the core of knowledge organization theory and considered looking for potentially useful facets by dividing phenomena from activities. Vickery was extremely pragmatic. The León Manifesto proposed key facets of reality - noumena; phenomena; perspectives; and carriers - although Gnoli suggested that Vickery would not have recognised a need for a reality above phenomena as we know them, on the grounds that we are organising knowledge as we understand it, and there is no need to invoke mysterious realms beyond that. Gnoli also suggested considering Activity Theory, which looks at behaviours from the point of view of motivations, goals, tools, and outcomes, as a possible framework for organizing knowledge about disciplines.

D. Grant Campbell, University of Western Ontario, proposed a novel use of Farradane's matrix of dimensions of association and discrimination to highlight ambiguities and differences in the text and speech of doctors and patients. Much work in healthcare text analytics has focused on semantics and ontologies, but one of the core problems is that professionals and patients speak almost incompatible "languages", and building "crosswalks" to translate from one mode of discourse to another is extremely difficult. Scoring texts against Farradane's matrix and looking for points where his operators break down makes areas of divergence in the texts obvious.

Marianne Lykke, Aalborg University, described a project to compare free text tagging with a tagging system that was underpinned by a controlled vocabulary (CV). Taggers using the CV-enhanced system tended to produce narrower and more specific tags, and more consistent results. It was more helpful to provide small and relevant sections of the CV than to provide a general tag cloud of all the tags that were being applied. The taggers liked the CV but wanted it to be in a simple and appropriate format. Dewey Decimal headings were not particularly user-friendly.

Rebecca Green, OCLC, explained how OCLC had automated an abridgement of the Dewey Decimal Classification (DDC), filtering out certain aspects - e.g. subtopics like North, as in Warwickshire, North Warwickshire, South Warwickshire. She pointed out that the automated process sometimes exposed anomalies and inconsistencies in the original schema and that sometimes editorial amendments, for example to indexing notes, had to be made manually. They are aiming to create one machine-readable DDC that can be exposed in different forms.

Gary Steele, Glasgow Caledonian University, pointed out that many studies into indexer inconsistency are unhelpful because they tend to judge indexers against an arbitrary and subjective definition of good and bad indexing. He used a system of counting indexing terms and matching them for similarity, then weighting the most popular indexing terms, deeming these to be the most generally useful.
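
The counting-and-weighting idea can be illustrated with a minimal sketch. This is not Steele's actual procedure, which the talk summary does not specify: the function names, the sample data, and the choice of Jaccard overlap as the similarity measure are all assumptions made for illustration.

```python
from collections import Counter
from itertools import combinations


def term_weights(indexings):
    """Weight each term by the share of indexers who assigned it.

    `indexings` is a list of term sets, one per indexer (hypothetical format).
    """
    counts = Counter(term for terms in indexings for term in set(terms))
    total = len(indexings)
    return {term: n / total for term, n in counts.items()}


def pairwise_overlap(indexings):
    """Mean Jaccard overlap between every pair of indexers' term sets.

    Jaccard is an assumed stand-in for whatever similarity measure was used.
    """
    pairs = list(combinations(indexings, 2))
    if not pairs:
        return 0.0
    scores = []
    for a, b in pairs:
        a, b = set(a), set(b)
        union = a | b
        # Two empty term sets are treated as fully consistent.
        scores.append(len(a & b) / len(union) if union else 1.0)
    return sum(scores) / len(scores)


# Hypothetical example: three indexers describing the same document.
indexings = [
    {"classification", "facets", "retrieval"},
    {"classification", "facets"},
    {"classification", "thesauri"},
]

weights = term_weights(indexings)
# Terms chosen by more indexers rank higher, without reference to
# any external definition of "good" indexing.
ranked = sorted(weights.items(), key=lambda item: (-item[1], item[0]))
```

Here the ranking is derived only from agreement among the indexers themselves, which is the point of the approach: popularity of a term stands in for its general usefulness.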

Simon Spero, University of North Carolina at Chapel Hill, spoke on the lack of a recognised formal semantics for pre-coordinated index headings. He pointed out that SKOS is currently inadequate for modelling subdivided headings, and that adequate support would be helpful in modelling subdivisions in a corresponding ontology. Formal semantics and support for different types of subdivision are required to provide automated support for complex classifications (e.g. some types of subdivision can only be applied to certain types of headings). The full semantics are not captured by treating subdivided headings purely as labels. He also pointed out the importance of getting your subject matter experts to communicate with your ontologists, otherwise you get nonsense.

(In the parallel track, Lucy Bell, University of Essex, talked about metadata for academic resource discovery; Kathryn La Barre, University of Illinois, compared contemporary and traditional facet theory; and Maja Zumer, University of Ljubljana, and Marcia Lei Zeng, Kent State University, discussed conceptual and data models.)

Vivien Petras, Humboldt University, explained how the advent of the semantic web has revived interest in KOSs (now ontologies) but that terminology has diverged to the point that tracing research through the literature is becoming increasingly difficult. Whereas search can be evaluated with precision and recall measures, there are no benchmarks or quality standards for evaluating browse and contextual exploration, so there is as yet no clear way of evaluating the usefulness of an ontological approach. Existing studies suggest that equivalence and narrower term relationships are generally good for improving search results, but related terms are less useful. The ontological approach faces problems of over-expansion and confusing the user. There are as yet no interfaces that can elegantly expose a wealth of complex ontological relationships.

Marianne Lykke, Aalborg University, returned to talk about a project to define "semantic components" within documents used by doctors, and to use these components as a way of refining their search results. The doctors did use the components to help them formulate queries, and this appeared to improve their search results.

Stella Dextre Clarke explained how the new ISO 25964 standard will enhance system interoperability. The standard has looked at what will be needed to support developments in the semantic web, with particular regard to standardization of mapping and handling of pre-coordinated terms.

The evening reception was sponsored by Synaptica.

Day two opened with a keynote address by Professor Amanda Spink. Professor Spink gave an overview of the development of human information behaviours, going right back to cognitive archaeology and neuro-evolution of early hominids, which tells us that humans began classifying and categorising at a very early stage of their evolution, as they began to communicate. When humans developed larger brains they became able to hold more information in active attention, enabling planning and more complex behaviours. Evidence of complex thought includes cave paintings (30,000 BC), ideographs (6000 BC) and calendars (4240 BC). Libraries date from at least 320 BC. More recent information behaviours can be studied by analysing the writings of individuals. Three interesting examples are Napoleon, Darwin and Casanova, as they wrote a lot about their own information behaviour. Napoleon's use of maps and methodical collecting contributed to his military success. Darwin created indexes for his papers. Casanova - spy, businessman, librarian and information organiser - profited from information that he gathered and created a unique library schema. Such studies are important to the field of information science, as we need to be able to understand the psychological aspects of information behaviour in order to educate people to be information literate and help them operate effectively in an increasingly information-driven world. We need to provide better theories, models, and vocabularies, and we need to help individuals learn and think more about their own information behaviours, as we have traditionally been too focused on information professionals and systems designers.

Elizabeth Orna talked about the use of imagery and information organisation in early monasteries of Asia and Europe. At the time of Gutenberg, print production spread very rapidly and there was much effort to cope with the effects of the new technology itself making knowledge visible. Renaissance humanists loved classification, but the vision of a single classification of all knowledge ended in the late Renaissance. However, just as early printing called for the collaboration of many crafts, so too do modern information products, and visualisation and design skills are increasingly needed. Political and cultural skills are also required to make information products and projects successful. The most successful projects tend to occur where everyone participates on an equal footing, with the designers, librarians, and end users involved right from the start.

Patrick Lambe, Straits Knowledge, discussed how knowledge organization can be used to support science. He talked about the need to connect the language of researchers to the language of administrators, as they oversee budgets and allocations of funding. Most of the interesting science is intersectional and cross-disciplinary, so compound concepts are really important. Most of the work of knowledge organizers has been in the middle of disciplines, but it is at the boundaries and borders that a bigger, more socially useful contribution can be made. We need to help scientists from different domains to speak the same language so that they can cross domains in research, and help them carry concepts through time. We used to deal with manageable types of information and well-structured, manageable collections, but this is no longer true. However, knowledge organization structures perform many useful functions that are still needed. If a KO structure has predictive power, can show where there are gaps, and can highlight differences in perspectives, it remains useful.

David Bawden, City University, provided an overview of Brian Vickery's ideas about the foundations of information science. He claimed that Vickery saw no distinction between theory and practice, and held that each should inform the other. Vickery saw the information profession as dynamic, so wanted students to learn to be flexible and adaptable rather than trained for a particular job.

Joseph T. Tennis, University of Washington, compared Ranganathan's facets with Vickery's classification, pointing out that while Ranganathan sought to create a canon of laws, Vickery aimed only to suggest, not prescribe.

Vanda Broughton, UCL, talked about the CRG, which originally comprised scientists; after two years the scientists were struggling and realised they needed librarians to help. The group published relatively little, while Vickery published a lot. However, there was a schism between classification and information retrieval that is only just beginning to be bridged.

James Howard and Silver Oliver, BBC, explained how they have used an ontology-based model for building the BBC's news and sports websites, with concept extraction software also using an ontology-based model to support taggers in indexing the content.

Elaine Ménard, McGill University, described a project to investigate bilingual tagging of images.

Fran Alexander, BBC, talked about a business and technology change project and the work that is being done to migrate the BBC Archive's classification schemes from legacy databases to the new system. Following migration, the taxonomies will be mapped and merged and a number of mapping experiments have been undertaken in preparation.

Deborah Lee, Courtauld Institute of Art and City University, described her work on music classification. Facets are very useful for music, but genres change very rapidly, especially in pop music, which is a challenge. She suggested that music library users may need to be treated as a special case, because they have different needs to other users.

Jean Debaecker, University of Lille, described his studies of behaviour of users on online music download sites, and how they perceived, used, and created metadata.

Wolfram Sperber, FIZ Karlsruhe, described the difficulties of knowledge management in mathematics.

There were also six poster presentations.

The conference was sponsored by Synaptica, Pool Party, Metataxis, MultiTes, Mondeca, Ashgate, CILIP, OCLC, and UKeiG, and hosted by the Department of Information Studies, UCL.