Sunday, 17 April 2011

Social metadata for libraries, archives and museums: Research findings

The results of a joint research project by OCLC and National Library of Australia are discussed in a presentation to the Libraries Australia 2010 forum, October 20, 2010. Ten percent of the respondents in the survey were UK-based.

The research generated three reports:

  • Website reviews, and use of third party sites (150 pp.)
  • Analysis of website manager survey results (50 pp.)
  • Recommendations for social metadata and bibliography (due November 2010).

The slide set includes a number of interesting examples where social metadata are used, from Flickr geotagging to The Mutiny on the Bounty, to 19th century moustaches. The slides then go on to present a summary of the results, such as how long respondents have been offering social media features, their reasons for doing so, social media and interactive features offered and policies and guidelines used, concluding with 18 recommendations.

There is no indication whether the papers accompanying the presentations will be published online.
Social metadata for libraries, archives and museums: Research findings from the RLG Partners Social Metadata Working Group, October 2010 (SlideShare presentation from Rose Holley).

The presentation is downloadable from the NLA web site (ppt, 6.7 MB), or viewable on SlideShare. The SlideShare version includes speaker's notes for each slide.

Thursday, 14 April 2011

Review of the Public Access to Information event

Public Access to Information? Challenges for Information Gatekeepers was a joint event between ISKO UK and Taxonomies in the Public Sector (TiPS). Michael Warner opened with a short description of TiPS, a discussion and networking group that seeks to influence government on information issues. The group was established four years ago and welcomes new members.

The Ins and Outs of Information Rights


The first speaker was Christopher Graham, the current UK Information Commissioner. He began by discussing the role of the Information Commissioner’s Office (ICO). It has to enforce such regulations as the Freedom of Information Act (FOIA), the Data Protection Act (DPA), the Environmental Information Regulations and the Privacy and Electronic Communications Regulations. The ICO provides advice, guidance and monitoring, and promotes best practice and compliance with the law. With a staff of 350, the ICO seeks to be the “authoritative arbiter of information rights” and a model of good regulation.
Practically everybody is a stakeholder for the ICO, from local authorities, to politicians, citizens, and consumers. The commissioner has the rights of a corporation - a huge responsibility - but can only be dismissed by the Queen with the assent of both Houses of Parliament, so it is a good job to have in a recession.
Freedom of Information and Data Protection legislation embody competing rights. The Freedom of Information Act was seen as a bit of a “bolt on” to Data Protection law, but it became clear that they are intertwined and both have to be considered together. Some people have called for the establishment of a “privacy commissioner” to make the case for privacy, but this would just defer the decision point, as someone else would have to take responsibility for deciding on the balance between private rights and public interest.
It is a very exciting time to be involved in information, with controversial issues such as the ethics of the creation of human DNA databases. Linked Open Data is also opening up exciting possibilities. Crime mapping is a classic case of balancing privacy and freedom of information. There is clearly a strong public interest in crime statistics, but publishing them could be detrimental to the rights of people living in high-crime areas. Too much anonymising, however, may destroy the usefulness of the data. Ironically, the Big Society could actually end up involving less accountability as information moves into private arenas that do not have the same responsibilities to be open.
Public attitudes towards information security are ambivalent. People like CCTV to protect them, but resent being spied on. They like their data to be secure but are less concerned about the amount of data that organisations collect and store.
Ten years after the introduction of the FOIA, there is still a mixed picture. Organisations should have publication schemes, offer rights of access and processes for handling requests, but there is a growing number of complaints to the ICO over FOIA requests. As the current government’s information and open data agenda becomes more high profile, the public are likely to become more interested.
The FOIA should help to reduce inefficiency as it opens up public sector spending to scrutiny. Breaches of the information act are costly and the ICO monitors organisations to make sure they reply to requests promptly. Answering requests is becoming more difficult for organisations with dwindling budgets. The ICO website is a rich resource of information and support and the ICO tweets as @iconews.

The Checks and Balances of a Transparent Public Sector World of Information


Carol Tullo of the National Archives (NA) discussed the benefits of exploiting and re-exploiting public sector data, while avoiding unacceptable risks. Law, copyright, archives and information science all form part of Carol Tullo’s work. The NA is an information gatekeeper, even if it doesn’t think of itself like that. How do we give people access to public sector information, rather than just allowing people to get their own information? The default position is about proactive release of information and it is a very different world to the one of 20 years ago.
Nobody knows what transparency and accountability mean outside the information world but we use the terms all the time. The NA is trying to explain the concepts. The principles of public data policy have come down to core issues including releasing data under open licensing and open standards in re-usable form. If you can embed metadata standards in a PDF, it is not locked up and the metadata is not easily removed, so authorship, provenance, etc. are preserved. There are various strands of information management that are not creating standards and tools to publish this data in a sharable reusable form. For example, staff structures and organisational charts of public institutions are of public interest, but are often not kept in reusable sharable formats. The NA has helped develop a tool to standardise organisational charts to help institutions publish these usefully.
The government recognises the value of the data and for ten years there has been an agenda to publish more. It is still slow and there are moves to place obligations on institutions to publish. The NA is encouraging institutions to be proactive about releasing their data. The Open Government Licence, launched last September, aims to encourage this. Some 180 local authorities are now releasing their data under it and the Ministry of Culture in South Korea and the government in British Columbia, Canada, have adopted it. It is the new default licence under the FOIA.
No-one has been sued for using opened up data yet. The hack days and releases to encourage use of the data haven’t caused problems. However, people are worried that they can’t trust the licence and that they will somehow get into trouble for using data they discover on government websites, so more encouragement than a simple link to the licence would help.
The issue of semi-private companies and contractors working for the public sector and how much of their data should be made open also needs thought. Knowing what data is available and how it is structured is also important. There are many inventories in taxonomies, asset registers, etc. that can help.
Structured data, open standards and open formats are vital. Using standards, schemas and APIs produces really rich metadata that allows sharing. Sheer weight of volume and limited resources can render huge amounts of information inaccessible, without any deliberate cover-up or conspiracy. Digital volumes of data are huge.
The NA helps form legislation to come up with something that is fit for purpose and useful for public sector information workers. Ministers want officials to come up with solutions to problems created by legislation so that people can easily get hold of information to solve their problems effortlessly.
Information managers need to see themselves as gamekeepers, rather than gatekeepers. They should make sure their information stock is healthy, but let it roam free for people to hunt down and use. This is the best way to support growth in an information economy.

What's Wrong with UK Information Law?


Charles Oppenheim gave a very entertaining presentation, opening with an anecdote about an early attempt by the Department of Trade and Industry to encourage publication and re-use of government data. The DTI published a list of names and phone numbers of civil servants to contact to ask for information. However, the first civil servant who was contacted immediately demanded to know where the caller had found his name, then declared that his name and number were official secrets and slammed the phone down. The situation has transformed since then.
There are many breaches of UK and EU information legislation. Personal data should not be transferred outside the European Economic Area unless there is adequate legal protection, such as rights to see and correct information. The USA fails the tests, so no-one can legally transfer data to the USA without some protections. Some safe harbours are declared for companies that are deemed compliant with DPA principles. However, the USA’s PATRIOT Act demands rights for the authorities to see and access all sorts of data for anti-terrorism and security purposes. The owner of the data may not be told that the data is being inspected. Lockheed Martin is handling 2011 census data, but the company is subject to the PATRIOT Act. Despite the statement at the start of the census, a large number of bodies have access to freshly collected census data.
The DPA doesn’t address cloud computing and may be out of date. Computers might be in all sorts of places such as the USA that don’t have adequate data and privacy rights protection. The cloud computing suppliers resist any attempt to abide by the rules. They don’t put compliance in the contract. If you fail to impose safeguards on your cloud computing supplier, you are in breach of the DPA.
People have the right to sue if there has been a breach of the act, but you can only sue for distress if the data has appeared in the media. So, if a company sends you threatening letters because of incorrect information, you have no redress for the distress this may cause. It is a criminal offence to unlawfully obtain personal data, and reckless loss is an offence, but reckless disregard of other data protection principles is not an offence. In any case, the expense of court cases means that most people could not afford to sue.
The FOIA was declared by Tony Blair to be the biggest mistake of his prime ministerial career. There are many problems with applying it and a lot of complaints end up at the ICO. Frivolous or arduous requests can be a problem for organisations. In Norway there is an exemption from the Norwegian freedom of information act if the person making the request is obviously drunk! One good effect is the obligation on public authorities to provide electronic datasets under FOIA in a form that can be conveniently reused.

Meaningful, Linked Local Data


Paul Davidson, Chairman, CTO Public Sector Information Domain Team, talked about Linked Data and data standards. He pointed out that data can become more open the more it is processed as it becomes less personal. He contrasted data created at the operational level with the statistical, analytical and political stages of processing: a record, such as a personal health care record, is combined with others to produce statistics, which are then analysed and then used as evidence for policies. He described what needs to be done to make data sharable: explaining the semantics (controlled vocabularies and descriptive terms so that the subject is understandable), quality assurances so that people know how reliable, accurate and up to date the data is, any relevant rights and consents, and the format it is published in. It is usually easier to provide such information for public data than for other types of data.
He stressed that public bodies should not require people to have to hunt around their websites and take the data in a single format, but should allow people to get to the raw data and take it away to use in their own applications. Standards are important, but it is better to publish the data in any format rather than keep it locked up while standards are selected.
However, meaningful data is much more than just bland lists. It is very hard to assess value without context. For example, the figure for the total expenditure on hotel bills isn’t helpful unless you know whether it was a few people in an expensive hotel or lots in cheap ones. Knowing spending on roads by council is not useful unless you know the lengths of roads in each council area, so you can make a per-mile comparison.
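The per-mile point can be illustrated with a minimal sketch. The council names and figures below are hypothetical, invented purely for illustration and not drawn from the talk or from any real dataset. The sketch simply divides total road spending by road length to give a figure that can be compared across councils.

```python
# Hypothetical figures for illustration only; not real council data.
road_spending = {
    "Council A": {"spend_gbp": 12_000_000, "road_miles": 3_000},
    "Council B": {"spend_gbp": 4_500_000, "road_miles": 500},
}


def spend_per_mile(spend_gbp: float, road_miles: float) -> float:
    """Return road spending per mile of road."""
    if road_miles <= 0:
        raise ValueError("road_miles must be positive")
    return spend_gbp / road_miles


# Normalise each council's total by the length of road it maintains.
per_mile = {
    name: spend_per_mile(d["spend_gbp"], d["road_miles"])
    for name, d in road_spending.items()
}

# List councils from highest to lowest spending per mile.
for name, value in sorted(per_mile.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name}: £{value:,.0f} per mile")
```

In this made-up case the council with the larger total budget spends less per mile, which is the kind of context a bare list of totals would hide.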
Ontologies, URIs, reference lists, and data registries all need to be managed to support Linked Data. Aggregators are also needed so that data sets can be brought together and queried easily. It is not clear whether the public sector should be providing such services or leaving such provision to the private sector to develop. Finally, end users who want the data are needed as well.

The afternoon ended with a lively panel session with all the speakers, followed by drinks and networking.

Short biographies of the speakers are available on the ISKO UK website.
This afternoon meeting is organized in co-operation with the UCL Department for Information Studies.