A model to bring museums, libraries and archives together

I am attending a workshop on the Conceptual Reference Model created by the International Council of Museums Committee on Documentation (CIDOC) this week.
The CIDOC Conceptual Reference Model (CRM) was created as a means of enabling information interchange and integration in the museum community and beyond. It “provides definitions and a formal structure for describing the implicit and explicit concepts and relationships used in cultural heritage documentation”.
It became an ISO standard in 2006 and a Special Interest Group continues to work to develop it and keep it in line with progress in conceptualisation for information integration.
The vision is to facilitate the harmonization of information across the cultural heritage sector, encompassing museums, libraries and archives, helping to create a global resource. The CRM is effectively an ontology describing concepts and relationships relevant to this kind of information. It is not in any sense a content standard, rather it takes what is available and looks at the underlying logic, analysing the structure in order to progress semantic interoperability.
I come to this as someone with a keen interest in interoperability, and I think that the Archives community should engage more actively in cross-sectoral initiatives that benefit resource discovery. I am interested to find out more about the practical application and adoption of the CRM. My concern is that in the attempt to cover all eventualities, it seems like quite a complex model. It seeks to ‘provide the level of detail and precision expected and required by museum professionals and researchers’. It covers detailed descriptions, contexts and relationships, which can often be very complex. The SIG is looking to harmonise the CRM with archival standards, which should take the cultural heritage sector a step further towards working together to share our resources.
I will be interested to learn more about the Model and I would like to consider how the CRM relates to what is going on in the wider environment, and particularly with reference to Linked Data and, more basically, the increasing recognition of web architecture as the core means to disseminate information. Initiatives to bring data together, to interconnect, should move us closer to integrated information systems, but we want to make sure that we have complementary approaches.
You can read more about the Conceptual Reference Model on the CIDOC CRM website.

Visit to Seven Stories

Yesterday I enjoyed a visit to Seven Stories, the centre for children’s books, and one of the contributors to our sustainable development project. One of the main reasons for my visit was to see the authority files they have created in CALM, for authors and illustrators. I also gave a quick demonstration of how to use the Hub’s new EAD Editor, which was very well received.

Once the business of the visit was over, Hannah (the archivist) showed me some of the treasures of the collection, which included some of Philip Pullman’s manuscripts (in very neat handwriting!); original artwork by Jan Ormerod for her book ‘Sunshine’; and the original illustrations for Noel Streatfeild’s ‘Ballet Shoes’. Included with these was, to my great excitement, the original copy of Pauline’s application for a stage licence, filled out (with book-appropriate information) by either Noel or her illustrator Ruth Gervis who, I discovered to my delight, was Noel Streatfeild’s sister.

I’m really pleased that Seven Stories are going to be adding their descriptions to the Hub in the near future, and I’d encourage you to have a look – I’m sure you’ll find plenty to interest you.

Hub contributors’ reflections on the current and future state of the Hub

The Archives Hub is what the contributors make it, and with over 170 institutions now contributing, we want to continue to ensure that we listen to them and develop in accordance with their needs. This week we brought together a number of Archives Hub contributors for a workshop session. The idea was to think about where the Hub is now and where it could go in the future.
We started off by giving a short overview of the new Hub strategy, and updating contributors on the latest service developments. We then spent the rest of the morning asking them to look at three questions: What are the benefits of being part of the Hub? What are the challenges and barriers to contributing? What sort of future developments would you like to see?
Probably the strongest benefit was exposure – as a national service with an international user-base the Hub helps to expose archival content, and we also engage in a great deal of promotional work across the country and abroad. Other benefits that were emphasised included the ability to search for archives without knowing which repository they are held at, and the pan-disciplinary approach that a service like the Hub facilitates. Many contributors also felt that the Hub provides them with credibility, a useful source of expertise and support, and sometimes ‘a sympathetic ear’, which can be invaluable for lone archivists struggling to make their archives available to researchers. The network effect was also raised – the value of having a focus for collaboration and exchange of ideas.
A major barrier to contributing is the backlog of data, which archivists are all familiar with, and the time required to deal with this, especially with the lack of funding opportunities for cataloguing and retro-conversion. The challenges of data exchange were cited, and the need to make this a great deal easier. For some, getting the effective backing of senior managers is an issue. For those institutions who host their own descriptions (Spokes), the problems surrounding the software, particularly in the earlier days of the distributed system, were highlighted, and also the requirement for technical support. One of the main barriers here may be the relationship with the institution’s own IT department. It was also felt that the use of Encoded Archival Description (EAD) may be off-putting to those who feel a little intimidated by the tags and attributes.
People would like to see easy export routines to contribute to the Hub from other systems, particularly from CALM, a more user-friendly interface for the search results, and maybe more flexibility with display, as well as the ability to display images and seamless integration of other types of files. ‘More like Google’ was one suggestion, and certainly exposure to Google was considered to be vital. It would be useful for researchers to be able to search a Spoke (institution) and then run the same search on the central Hub automatically, which would create closer links between Spokes and Hub. Routes through to other services would add to our profile and more interoperability with digital repositories would be well-received. Similarly, the ability to search across archival networks, and maybe other systems, would benefit users and enable more people to find archival material of relevance. Influencing the right people and lobbying were also listed as things the Hub could do on behalf of contributors.
After a very good lunch at Christie’s Bistro we returned to look at three particular developments that we all want to see, and each group took one issue and thought about what the drivers are that move it forward and what the restraining forces are that stop it from happening. We thought about usability, which is strongly driven by the need to be inclusive and to de-mystify archival descriptions for those not familiar with archives and in particular archival hierarchies. It is also driven by the need to (at least in some sense) compete with Google, the need to be up-to-date, and to think about exposing the data to mobile devices. However, the unrealistic expectations that people have act as a restraining force and, fundamentally, we need to be clear about who our users are and to understand their needs. The quality and consistency of the data and markup also come into play here, and the recognition that this sort of thing requires a great deal of expert software development.
The need for data export, the second issue that we looked at, is driven by the huge backlogs of data and the big impact that this should have on the Hub in terms of quantity of descriptions. It should be a selling point for vendors of systems, with the pressure of expectation from stakeholders for good export routines. It should save time, prove to be good value for money and be easily accommodated into the work flow of an archive office. However, complications arise with the variety of systems out there and the number of standards, and variance in application of standards. There may be issues about the quality of the data and people may be resistant to changing their work habits.
Our final issue, the increased access to digital content, is driven by increased expectations for accessing content, making the interface more visually attractive (with embedded images), the drive towards digitisation and possibly the funding opportunities that exist around this area. But there is the expense and time to consider, issues surrounding copyright, the issue of where the digital content is stored and issues around preservation and future-proofing.
The day ended with a useful discussion on measuring impact. We got some ideas from contributors that we will be looking at and sharing with you through our blog. But the challenges of understanding the whole research life-cycle and the way that primary sources fit into this are certainly a major barrier to measuring the impact that the Hub may have in the context of research outputs.

Archival Context: entities and multiple identities

I recently took part in a Webinar (Web seminar) on the new EAC-CPF standard. This is a standard for the encoding of information about record creators: corporate bodies, persons and families. This information can add a great deal to the context of archives, supporting a more complete understanding of the records and their provenance.

We were given a brief overview of the standard by Kathy Wisser, one of the Working Group members, and then the session was open to questions and discussion.

The standard is very new, and archivists are still working out how it fits into the landscape and how it relates to various other standards. It was interesting to note how many questions essentially involved the implementation of EAC-CPF: who creates the records? where are they kept? how are they searched? who decides what?
These questions are clearly very important, but the standard is just a standard for the encoding of ISAAR(CPF) information. It will not help us to figure out how to work together to create and use EAC-CPF records effectively.
In general, archivists use EAD to include a biographical history of the record creator, and may not necessarily create or link to a whole authority record for them. The idea is that providing separate descriptions for different entities is more logical and efficient. The principle of separation of entities is well put: “Because relations occur between the descriptive nodes [i.e. between archive collections, creators, functions, activities], they are most efficiently created and maintained outside of each node.” So that if you have a collection description and a creator description, the relationship between the two is essentially maintained separately to the actual descriptions. If only EAD itself was a little more data-centric (database friendly you might say), this would facilitate a relational approach.
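To make the principle of separation a little more concrete, here is a minimal sketch of what keeping relations outside the descriptive nodes might look like. It is only an illustration: the class names, the identifiers and the "createdBy" relation type are all hypothetical, and none of this is taken from EAD or EAC-CPF.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    """A descriptive node: a collection, a creator, a function and so on."""
    node_id: str
    kind: str
    title: str


@dataclass(frozen=True)
class Relation:
    """A link between two nodes, stored apart from both of them."""
    source_id: str
    relation_type: str
    target_id: str


# Hypothetical descriptions; neither one mentions the other.
nodes = {
    "coll-1": Node("coll-1", "collection", "Example papers"),
    "agent-1": Node("agent-1", "creator", "Example Person"),
}

# The relationship is maintained separately from the descriptions.
relations = [Relation("coll-1", "createdBy", "agent-1")]


def related(node_id, relation_type, relations, nodes):
    """Return the nodes that node_id points to via relation_type."""
    return [
        nodes[r.target_id]
        for r in relations
        if r.source_id == node_id and r.relation_type == relation_type
    ]
```

Because the link lives in its own list, the creator description can be corrected or replaced without touching the collection description, and the same creator can be linked to many collections.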
I am interested in how we will effectively link descriptions of the same person, because I cannot see us managing to create one single authoritative record for each creator. This is enabled via the ‘identities’: a record creator can have two or more identities with each represented by a distinct EAC-CPF instance. I think the variety of identity relationships that the standard provides for is important, although it inevitably adds a level of complexity. It is something we have implemented in our use of the tag to link to related descriptions. Whilst this kind of semantic markup is a good thing, there is a danger that the complexity will put people off.
I’m quite hung-up on the whole issue of identifiers at the moment. This may be because I’ve been looking at Linked Data and the importance of persistent URLs to identify entities (e.g. I have a URL, you have a URL, places have a URL, things have a URL and that way we can define all these things and then provide links between them). The Archives Hub is going to be providing persistent URLs for all our descriptions, using the unique identifier of the countrycode, repository code and local reference for the collection (e.g. http://www.archiveshub.ac.uk/search/record.html?id=gb100mss, where 100 is the repository code and MSS is the local reference).
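As a rough sketch of how such a URL could be put together from its three parts, something like the following would do. The lower-casing and simple concatenation are assumptions inferred from the single example above, the function names are hypothetical, and it does not attempt to deal with local references that contain spaces or punctuation.

```python
from urllib.parse import urlencode

# Base address taken from the example URL above.
BASE_URL = "http://www.archiveshub.ac.uk/search/record.html"


def hub_record_id(country_code, repository_code, local_reference):
    """Join country code, repository code and local reference into one id.

    Assumption: the parts are lower-cased and concatenated with no
    separator, as in the example id 'gb100mss'.
    """
    parts = (country_code, str(repository_code), local_reference)
    return "".join(part.strip().lower() for part in parts)


def hub_record_url(country_code, repository_code, local_reference):
    """Build the persistent URL for a description from its identifier."""
    record_id = hub_record_id(country_code, repository_code, local_reference)
    return BASE_URL + "?" + urlencode({"id": record_id})
```

Calling `hub_record_url("GB", 100, "MSS")` reproduces the example URL given above.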
I feel that it will be important for ISAAR(CPF) records to have persistent URLs, and these will come from the recordID and the agencyCode. Part of me thinks the agency responsible for the EAC-CPF instance should not be part of the identifier, because the record should exist apart from the institution that created it, but then realistically, we’re not going to get consensus on some kind of independent stand-alone ISAAR(CPF) record. One of the questions I’m currently asking myself is: If two different bodies have EAC-CPF records, does it matter what the identifiers/URLs are for those records, even if they are for the same person? Is the important thing to relate them as representing the same thing? I’m sure it’s very important to have a persistent URL for all EAC-CPF instances, because that is how they will be discoverable; that is their online identity. But the question of providing one unique identifier for one person, or one corporate body is not something I have quite made my mind up about.
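One way to picture the "relate them" option is sketched below: each agency keeps its own identifier, and a separate set of same-as assertions groups the records that describe one person. This is purely illustrative. The "agency/record" identifier format, the agency codes and the function names are all hypothetical, and nothing here is prescribed by the standard.

```python
def eac_identifier(agency_code, record_id):
    """Combine agency code and record id into one identifier.

    Assumption: a simple 'agency/record' form; the standard does not
    dictate how the two values are joined.
    """
    return f"{agency_code}/{record_id}"


def group_same_entity(same_as_pairs):
    """Group identifiers that have been asserted to describe the same entity.

    Each pair says 'these two records are about the same person or body'.
    Returns a list of sets, one set per entity.
    """
    clusters = []
    for a, b in same_as_pairs:
        # Find any existing clusters that already contain either identifier.
        hits = [c for c in clusters if a in c or b in c]
        merged = {a, b}.union(*hits)
        clusters = [c for c in clusters if c not in hits] + [merged]
    return clusters


# Three hypothetical agencies, each holding a record for the same person.
first = eac_identifier("GB-A", "p1")
second = eac_identifier("GB-B", "x9")
third = eac_identifier("GB-C", "77")
clusters = group_same_entity([(first, second), (second, third)])
```

In this arrangement no single record has to be the authoritative one; the three identifiers simply end up in one group.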
It will be interesting to see how the standard is assessed by archivists and more examples of implementation. The Archives Hub would be very interested to hear from anyone using it.

Sustainable content: visits to contributors

I recently visited two of the contributors to the Archives Hub sustainable content development project. The archivists at Queen Mary, University of London (QMUL) and the BT Archives were nice enough to let me drink their tea, and see how they used CALM.

Axiell, developers of the CALM software, have kindly let us have access to a trial version of CALM to help with this project, but it

UK Archives Discovery Network is born!

The National Archives Network of the UK (NAN) has been around for some time. It had a reasonably high profile around the turn of the century (that sounds weird!) when the cross-searching networks were being set up, but then in the following years its remit and purpose became less clear.
However, a great deal has been achieved over the past 10 years. The NAN projects and hubs have involved literally hundreds of archive repositories across the UK, ranging from public authorities through to the archives of small charities, and the result is that archives have had some resources to enable them to convert existing descriptions for contribution to the national projects, and that users have a number of very valuable cross-searching sites to use in order to facilitate discovery.
The vision was always to provide one gateway to search archives across the UK. Whilst this may still be a desirable vision, it may not be a realistic one, given the resources that it would involve and the issues of effective cross-searching of disparate descriptions. However, what we can do is to move towards opening up our data in ways that encourage cross-searching, sharing and working together to learn about how we can benefit users.
Over the past year, the NAN has been thinking about where it should be heading. At a recent meeting (August 2009), it was decided to change the name to the UK Archives Discovery Network, to reflect the UK-wide status of the network and to emphasise that we are about facilitating discovery for users.
The aims of the UKAD Network include working together in the best interests of archive users, surfacing descriptions, opening up data, sharing experiences and increasing links between repositories and networks. Whilst it may take some time for the Network to realise its remit, there are already benefits happening as a result of coming together, talking and sharing ideas and experiences.
I hope that the community continues down this path, because I think that it has become more important than ever to work together and really consider interoperability. Creating closed systems, however impressive they are in themselves, means continuing in a silo-based mentality, which is not truly responding to users’ expectations.
We have a social network site, which provides a fairly informal way of communicating:
There is also a JISC listserv: archives-discovery-network@jiscmail.ac.uk. We encourage archivists to use this to raise any issues associated with cross-searching, data standards, use of technology and archive networks.
We hope that archivists will be keen to use the UKAD network as a means to foster connections and collaborate on projects. Here’s to the next 10 years – goodness only knows where we will have reached by then!
Image: Flickr cc. Jan Leenders

Museums neglecting needs of researchers?

A recent RIN report ‘Discovering Physical Objects’ looks at how researchers find out about collections of objects relevant to their research. The report relates to museum objects rather than archives, but as ever, the Archives Hub feel that it’s always worth looking at library and museum studies, and seeing how they might apply to the world of archives.

Well, the results don’t seem to be very surprising. Researchers want online finding aids but are unaware of those that exist; they want contact with curatorial staff; and access to objects amongst museums is inconsistent.

I was interested to see that access to online finding aids NOW is more important than access to ‘perfect’ descriptions. The report states “technological developments that allow researchers and others to easily add to and amend the content of these records have the potential to help all museums and other collections to improve the quality of their records.” I assume the report is reflecting what researchers have actually said here, rather than making an assumption, although the wording doesn’t make this explicit.

On the whole, the report gives the impression that museums are really rather behind the archive community in providing online access to descriptions. I’m curious about the statement that ‘only a few have the needs of researchers in mind’ when they create their online finding aids – I’d like to know more about this and the evidence for it.

I’m surprised that curators apparently underestimate the value of online finding aids. It certainly seems that museum curators have not generally embraced technical possibilities and are not really into the spirit of collaboration and sharing.

The ways forward that the report recommends fit in quite nicely with the Hub’s ethos: to make museum descriptions open and interoperable so that people can create their own interfaces sourcing the data. We’ll keep an eye on the progress of Culture24 with interest.

Image from RIN report: Discovering Physical Objects (2009)

Let there be images!

I’m embroiled in our Enhancement Project at the moment, part of which is about enabling images to be displayed within the Archives Hub. Well, it’s actually more than that – it’s about using the tag and related tags to enable links to digital representations of archives and to enable images to be embedded at collection and item level. It’s something we’re really excited about, and we feel that it’s important to make this step in order to keep the Archives Hub moving onwards and upwards.

Due to the distributed nature of the Archives Hub, we aren’t able to use the element, but we’ve made the most of the tags on offer. We’re implementing options for embedded images; links to files; thumbnail links to full-size images; groups of images representing the same item.

We’ve made a conscious effort to implement this in a very standards-based way. I suppose you could say that the principle should be that if the EAD records are put into another system, everything should still work, and the markup does allow for this. I think that this approach is also important because we have a service where we are not creating the data – our contributors are – so we need to try to meet their various requirements whilst at the same time not knowing exactly what they will contribute. For example, we have to be aware that they might enter a large, high resolution image as a thumbnail and the system needs to be able to cope with this. I see it as a learning experience for both us and our contributors, and I think that it’s important to take that sort of perspective with the Hub.

I do hope that Hub contributors take advantage of this development. It will be great for them to be able to include images and link directly to content. We’ve made it very easy to add the necessary markup by providing the facility to do this within our new Data Creation and Editing Template, so there is no need to get down and dirty with the EAD markup unless they want to. We’ll be talking to our contributors about this at our workshops in March/April, which are already pretty much full, so that’s a good indication for us.

For more information, see our page on adding digital objects to Hub descriptions.

The Archives 2.0 Hub

No…we’re not thinking of changing the name…but I am thinking about a presentation that I’m giving on the Archives Hub in the context of ‘Archives 2.0’.

We’ve been doing a great deal of work recently that relates to the interoperability of the Hub. As part of an Enhancements Project taking place at Mimas, we are promoting data sharing, and an important part of this is work on import and export routines between services. Ideally, of course, it would be great to share data without any need for complex routines that effectively alter the structure of the data to make it suitable for different services, and remote searching of other data sources is something that we are also going to be looking at. But I guess that whilst we like to think of our service as interoperable, it’s currently still within certain limitations. It is problematic even sharing data held as EAD (Encoded Archival Description XML for archives) because EAD is really quite a permissive standard, allowing a great deal of flexibility and thus in some ways inhibiting easy data exchange. It is even more challenging to share data held in different databases. Many archives use the CALM system or the AdLib system, and we are working towards improving the export option from these systems, thus allowing archivists to have all of the advantages of an integrated management system, whilst at the same time enabling them to contribute to a cross-searching service such as the Hub.

I firmly believe that Archives 2.0, as an implementation of Web2.0 for archives, should primarily be viewed as an attitude rather than a suite of tools or services, characterised by openness, sharing, experimentation, collaboration, integration and flexibility that enables us to meet different user needs. Whilst widgets and whizzy features on websites are certainly a way to work towards this, I do think that more fundamentally we should be thinking about the data itself and how we can open this up.

Where next for the National Archives Network…?

Joy and I went to a meeting last week at The National Archives to discuss the issues surrounding the National Archives Network, and the possible future directions that the archive community might take. We came away with our heads full of ideas and issues to take forward – so a job well done I think.

The National Archives Network as a concept really began after the 1998 seminal report by the National Council on Archives, ‘Archives On-line: The Establishment of a United Kingdom Archival Network’ (PDF file). The vision was to create a single portal to enable people to search across UK archives. However, it is not really surprising that this never materialised given the resources and technical support necessary to make such a huge concept work. The landscape has changed since the report came out, and this solution seems to be less relevant nowadays. However, the concept of a network and the importance of collaboration and sharing data have continued to be very much on the agenda.

The meeting was initiated by Nick Kingsley and Amy Warner from TNA National Advisory Services. It included representatives from The Archives Hub, AIM25, SCAN, ANW, Genesis and Janus, as well as a number of other interested archivists from various organisations. The morning was dedicated to brief talks about the various strands of the network, and it quickly emerged that we had many things in common in terms of how we were working and the sorts of development ideas that we had, and therefore there would clearly be an advantage in sharing knowledge and experience and working together to enhance our services for the benefit of our users.

In the afternoon we formed into 3 groups to talk about name authority files, searching and sharing data and also hidden archives. A number of broad points came out of these break out groups and also the discussion that followed:

We need to ensure that our catalogues are searchable by Google (no surprises there) – it looks like some of us have tackled this more successfully than others, and obviously there are issues about databases that are not accessible to Google. It is important for contributors that services like the Hub and AIM25 are available via Google, and this provides an additional motivation for contributing to such union catalogues.

We really need to come together to think more carefully about name authority files – how these are created, who is responsible for them, how we can even start to think about reaching a situation where there is actually just one name authority file for each person!

It is important to progress on the basis of exposing our data so that it can be easily shared. This means working together on various options, including import/export options and Web Services that allow machine-to-machine access to the data. There are also issues here about the format of some of the catalogues. Some work has already taken place on exporting EAD data from DS CALM and AdLib, two major archive management systems. The Archives Hub and AIM25 have also been working together with the aim of enabling contributors to add the same description to both services.

We talked about other areas where sharing our experiences and understanding would be of great benefit, including Website design and how to present collection and multi-level finding aids online. We also recognised the importance of gathering together more information about our users – what they want, what they expect, what would be of benefit to them. In the end, this is one of the keys to producing a useful and rewarding service.

The meeting was very positive, and there are plans to take some of these issues forward through working groups as well as meeting again as a whole group, maybe sharing some of the specific projects that we have been involved with and collaborating on future initiatives.