Wednesday 12 November 2008

Outputs available: "Semantic Analysis Technology" event, 3 November 2008

The outputs (presentation slides and mp3 recordings) from the ISKO UK event held on 3 November 2008, entitled "Semantic Analysis Technology: in Search of Categories, Concepts & Context", are now available on the event's website: http://www.iskouk.org/semantic_nov2008.htm.

This half-day event included presentations by Luca Scagliarini of Expert System, Jeremy Bentley of SmartLogic and Rob Lee of Rattle Research, and linked presentations by BBC information architects Helen Lippell, Karen Loasby and Silver Oliver, followed by a rather interesting panel discussion. More than a hundred people attended.

The talks represented different approaches to text processing and advanced techniques for automatic resource indexing that help to resolve ambiguities in content searching and linking:

Luca Scagliarini (Expert System) "Whales & cat fur: using a semantic net to improve precision & recall" [pdf][mp3]
Luca pointed out that information discovery currently suffers from both information overload and information underload, owing to a lack of meaning-based text processing. He reviewed current technologies and illustrated the problems with shallow automatic linguistic analysis, which lacks any 'understanding' of the meaning encoded in the relationships between verbs, prepositions and nouns. He then showed how 'deep semantic analysis', based on the analysis of these relationships, works in Expert System's new semantic intelligence software, Cogito. Cogito uses an innovative 'semantic network' to achieve improved machine 'understanding'. The semantic network contains 350,000 definitions and 2.8 million relationships for the English language vocabulary.

Jeremy Bentley (SmartLogic) "It’s just semantics" [pdf][mp3]
Jeremy provided an overview of issues in information organization: unstructured information, the doubling of the number of resources every 19 months, the problem of 'findability' and the issues with black box solutions. He illustrated the relevance of metadata and of taxonomies built specifically to reflect the way a business works. He explained how these could be exploited in managing the semantic layer of an information and content architecture, and how an ontology can be used for automatic analysis of context and semantics in queries and search engines.

Rob Lee (Rattle Research): "Connecting concepts: joining up the BBC" [slideshare][mp3]
Rob Lee talked about Muddy Boots, a BBC project dealing with linked data and the creation of dynamic semantic richness. The BBC's remit to link to external sources has provoked lots of thinking and doing in the area of dynamic linking. Rob illustrated how datasets in the public domain such as MusicBrainz or DBpedia (which structures content from Wikipedia so that it can be used in semantic web systems) can be used to contextualise and index BBC resources as well as to extend them with external links.

Helen Lippell, Karen Loasby, Silver Oliver: "Tales from the trenches of auto-categorisation: three case studies in the implementation of auto-categorisation systems" [pdf][mp3]
Helen, Karen and Silver presented three different implementations of auto-categorisation systems at the BBC. They demonstrated the advantages of, and issues with, each of these approaches. Helen's presentation, entitled "Teaching computers to read newspapers Aka Automatic classification at FT.com in the early noughties", described her experience of a joint project by the FT, Lexis-Nexis and Dialog. The goal was to connect thousands of resources through a single interface. The tool used was Verity Intelligent Classifier (VIC), and the classification process used a taxonomy with a set of rules that could be finely tuned. Karen spoke about "Content Management Culture in the BBC", a metadata-orientated project to produce BBC content that could be described in detail. The approach applied was a rule-based automatic classification system combined with the author's review and corrections. Silver talked about a "Statistical-based auto-categorisation" project designed to connect and cross-reference distributed BBC content and resources horizontally.

See outputs from other ISKO UK events.
