The Website for the New Archives Hub

screenshot of archives hub homepage
Archives Hub homepage

The back end of a new system usually involves a huge amount of work, and this was very much the case for the Archives Hub, where we changed our whole workflow and approach to data processing (see The Building Blocks of the new Archives Hub). But it is the front end that people see and react to. The website is a reflection of the back end, it involves its own user experience challenges, and for most of our users it is where the change becomes real.

We worked closely with Knowledge Integration in the development of the system, and with Gooii in the design and implementation of the front end, and Sero ran some focus groups for us, testing out a series of wireframe designs on users. Our intention was to take full advantage of  the new data model and processing workflow in what we provided for our users. This post explains some of the priorities and design decisions that we made. Additional posts will cover some of the areas that we haven’t included here, such as the types of description (collections, themed collections, repositories) and our plan to introduce a proximity search and a browse.

Speed is of the Essence

Faster response times were absolutely essential and, to that end, enterprise search technology (in this case Elasticsearch) was the starting point. However, in addition to the underlying search technology, the design of the data model and indexing structure had a significant impact on system performance and response times, and this was key to the architecture that Knowledge Integration implemented. With the previous system there was only the concept of the ‘archive’ (EAD document) as a whole, which meant that the whole document structure was always delivered to the user, whatever part of it they were actually interested in, creating a large overhead for both processing and bandwidth. In the new system, each EAD record is broken down into many separate sections, each of which is indexed separately, so that the specific section in which there is a search match can be delivered immediately to the user.

To illustrate this with an example:

A researcher searches for content relating to ‘industrial revolution’ and this scores a hit on a single item five levels down in the archive hierarchy. With the previous system the whole archive in which the match occurred would be delivered to the user and then this specific section would be rendered from within the whole document, meaning that the result could not be shown until the whole archive had been loaded. If the results list included a number of very large archives, the response time increased accordingly.

In the new system, the matching single item ‘component’ is delivered to the user immediately, whether it is viewed in the result list or on the detail page, as the ability to deliver the result is decoupled from archive size. In addition, for the detail page, a summary of the structure of the archive is then built around the item, both to provide context and to allow easy navigation.
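As a rough illustration of the idea described above, the sketch below breaks a nested description into separate component records and returns only the components that match a search. It is not the Knowledge Integration implementation and does not use Elasticsearch; the `Component` shape, the `flatten` and `search` helpers and the sample archive are all hypothetical, made up for this example.

```python
from dataclasses import dataclass, field
from typing import Iterator, List, Optional


@dataclass
class Component:
    """One separately indexed unit of description (hypothetical shape)."""
    doc_id: str
    title: str
    text: str
    parent_id: Optional[str]
    ancestor_titles: List[str] = field(default_factory=list)


def flatten(node: dict, parent_id: Optional[str] = None,
            ancestors: Optional[List[str]] = None) -> Iterator[Component]:
    """Walk a nested description and yield one Component per unit."""
    ancestors = ancestors or []
    yield Component(node["id"], node["title"], node.get("text", ""),
                    parent_id, list(ancestors))
    for child in node.get("children", []):
        yield from flatten(child, node["id"], ancestors + [node["title"]])


def search(index: List[Component], phrase: str) -> List[Component]:
    """Return only the matching components, never the whole archive."""
    needle = phrase.lower()
    return [c for c in index if needle in f"{c.title} {c.text}".lower()]


# Invented sample data: a three-level description with one matching item.
archive = {
    "id": "c0", "title": "Example Mill Archive",
    "children": [
        {"id": "c1", "title": "Correspondence", "children": [
            {"id": "c2", "title": "Letter, 1842",
             "text": "Discusses the industrial revolution in the north."},
        ]},
    ],
}

index = list(flatten(archive))
hits = search(index, "industrial revolution")
for hit in hits:
    # The stored ancestor titles are enough to build a breadcrumb for context.
    print(" > ".join(hit.ancestor_titles + [hit.title]))
```

Because each component carries its parent identifier and ancestor titles, the hit can be shown at once and the surrounding structure can be assembled afterwards, which is the decoupling from archive size that the paragraph describes.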

Even with the improvements to response times, the tree representation (which does have to present a summary of the whole structure) takes a while to render for some very large multi-level descriptions, but the description itself always loads instantly. This means that the researcher can always see they have a result immediately and view it, and then the archival structure is delivered (after a short pause for very large archives), which gives the result context within the archive as a whole.

The system has been designed to allow for growth in both the number of contributors we can support and the number of end-users, and will also improve our ability both to syndicate the content to Archives Portal Europe and to deliver contributors’ own ‘micro sites’.

Look and Feel

Some of the feedback that we received suggested that the old website design was welcoming, but didn’t feel professional or academic enough – maybe trying to be a bit too cuddly. We still wanted to make the site friendly and engaging, and I think we achieved this, but we also wanted to make it more professional looking, showing the Hub as an academic research tool.  It was also important to show that the Archives Hub is a Jisc service, so the design Gooii created was based upon the Jisc pattern library that we were required to use in order to fit in with other Jisc sites.

We have tried to maintain a friendly and informal tone along with use of cleaner lines and blocks, and a more visually up-to-date feel. We have a set of consistent icons, on/off buttons and use of show/hide, particularly with the filter. This helps to keep an uncluttered appearance whilst giving the user many options for navigation and filtering.

In response to feedback, we want to provide more help with navigating through the service, for those that would like some guidance. The homepage includes some ‘start exploring’ suggestions for topics, to help get inexperienced researchers started, and we are currently looking at the whole ‘researching‘ section and how we can improve that to work for all types of users.

Navigating

We wanted the Hub to work well with a fairly broad search that casts the net quite widely. This type of search is often carried out by a user who is less experienced in using archives, or is new to the Hub, and it can produce a rather overwhelming number of results. We have tried to facilitate the onward journey of the user through judicious use of filtering options. In many ways we felt that filtering was more important than advanced search in the website design, as our research has shown that people tend to drill down from a more general starting point rather than carry out a very specific search right from the off.  The filter panel is up-front, although it can be hidden/shown as desired, and it allows for drilling down by repository, subject, creator, date, level and digital content.
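The drill-down behaviour can be pictured with a small sketch. This is only an illustration of faceted filtering in general, not the Hub's code: the record fields, the repository names and the `facet_counts` and `apply_filters` helpers are assumptions invented for the example.

```python
from collections import Counter
from typing import Dict, List


def facet_counts(records: List[Dict[str, str]], facet: str) -> Counter:
    """Count how many results fall under each value of one facet."""
    return Counter(r[facet] for r in records if facet in r)


def apply_filters(records: List[Dict[str, str]], **filters: str) -> List[Dict[str, str]]:
    """Keep only the results that match every selected filter value."""
    return [r for r in records
            if all(r.get(name) == value for name, value in filters.items())]


# Invented results for a broad search.
results = [
    {"title": "Design notebooks", "repository": "Repository A", "level": "collection"},
    {"title": "Poster designs", "repository": "Repository A", "level": "item"},
    {"title": "Design committee minutes", "repository": "Repository B", "level": "collection"},
]

print(facet_counts(results, "repository"))
narrowed = apply_filters(results, repository="Repository A", level="item")
print([r["title"] for r in narrowed])
```

The counts give the user a sense of where the results sit before they commit to a filter, and each filter they add narrows the same result set rather than starting a new search.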

Another way that we have tried to help the end user is by using typeahead to suggest search results. When Gooii suggested this, we gave it some thought, as we were concerned that the user might think the suggestions were the ‘best’ matches, but typeahead suggestions are quite a common device on the web, and we felt that they might give some people a way in, from where they could easily navigate through further descriptions.

Hub website example of type ahead results
A search for ‘design’ with suggested results

The suggestions may help users to understand the sort of collections that are described on the Hub. We know that some users are not really aware of what ‘archives’ means in the context of a service like the Archives Hub, so this may help orientate them.

Suggested results also help to explain what the categories of results are – themes and locations are suggested as well as collection descriptions.

We thought about the usability of the hit list. In the feedback we received there was no clear preference for what users want in a hit list, and so we decided to implement a brief view, which just provides title and date, for the maximum number of results, and also an expanded view, with location, name of creator, extent and language, so that the user can get a better idea of the materials being described just from scanning through the hit list.

An example of a hit list result in expanded mode
Expanded mode gives the user more information

With the above example, the title and date alone do not give much information, which is particularly common with descriptions of series or items, so the name of creator adds real value to the result.

Seeing the Wood Through the Trees

The hierarchical nature of archives is always a challenge; a challenge for cataloguing,  processing and presentation. In terms of presentation, we were quite excited by the prospect of trying something a bit different with the new Hub design. This is where the ‘mini map’ came about. It was a very early suggestion by K-Int to have something that could help to orientate the user when they suddenly found themselves within a large hierarchical description. Gooii took the idea and created a number of wireframes to illustrate it for our focus groups.

For instance, if a user searches on Google for ‘conrad slater jodrell bank’ then they get a link to the Hub entry:

screenshot of google search result for a Hub description
Result of a search on Google

The user may never have used archives, or the Archives Hub before. But if they click on this link, taking them directly to material that sits within a hierarchical description, we wanted them to get an immediate context.

screen shot of one entry in the Jodrell Bank Archive
Jodrell Bank Observatory Archives: Conrad Slater Files

The page shows the description itself, the breadcrumb to the top level, the place in the tree where these particular files are described and a mini map that gives an instant indication of where this entry is in the whole. It is  intended (1) to give a basic message for those who are not familiar with archive collections – ‘there is lots more stuff in this collection’ and (2) to provide the user with a clearly understandable  expanding tree for navigation through this collection.

One of the decisions we made, illustrated here, was to show where the material is held at every level, for every unit of description. The information is only actually included at the top level in the description itself, but we can easily cascade it down. This is a good illustration of where the approach to displaying archive descriptions needs to be appropriate for the Web – if a user comes straight into a series or item, you need to give context at that level and not just at the top level.
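Cascading a value down a hierarchy is simple to sketch. The example below assumes a nested dictionary with hypothetical `repository` and `children` keys and an invented sample description; it shows the general technique rather than how the Hub actually does it.

```python
from typing import Optional


def cascade_repository(node: dict, inherited: Optional[str] = None) -> dict:
    """Copy the holding repository down to every unit that lacks one.

    The tree is modified in place and also returned for convenience.
    """
    repository = node.get("repository") or inherited
    node["repository"] = repository
    for child in node.get("children", []):
        cascade_repository(child, repository)
    return node


# Invented description: the repository is recorded at the top level only.
description = {
    "title": "Example Observatory Archives",
    "repository": "Example University Library",
    "children": [
        {"title": "Staff files", "children": [
            {"title": "Correspondence file"},
        ]},
    ],
}

cascade_repository(description)
item = description["children"][0]["children"][0]
print(item["title"], "is held at", item["repository"])
```

With the value present on every unit, a page for a series or item reached directly from a web search can state where the material is held without loading the top-level record.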

The design also works well for searches within large hierarchical descriptions.

screenshot showing a 'search within' with highlighted results
Search for ‘bicycles’ within the Co-operative Union Photographic Collection

The user can immediately get a sense of whether the search has thrown up substantial results or not. In the example above you can see that there are some references to ‘bicycles’ but only early on in the description.  In the example below, the search for ‘frost on sunday’ shows that there are many references within the Ronnie Barker Collection.

screenshot showing search within with lots of highlighted results
Search within the Ronnie Barker Collection for ‘frost on sunday’

One of the challenges for any archive interface is to ensure that it works for experienced users and first-time users. We hope that the way we have implemented navigation and searching means that we have fulfilled this aim reasonably well.

Small is Beautiful

screenshot showing the Hub search on a mobile phone
The Archives Hub on an iPhone

The old site did not work well on mobile devices. It was created before mobile became massive, and it is quite hard to retrospectively fit a design to be responsive to different devices. Gooii started out with the intention of creating a responsive design, so that it renders well on different sized screens.  It requires quite a bit of compromise, because rendering complex multi-level hierarchies and very detailed catalogues on a very small screen is not at all easy. It may be best to change or remove some aspects of functionality in order to ensure the site makes sense. For example, the mobile display does not open the filter by default, as this would push the results down the page. But the user can open the filter and use the faceted search if they choose to do so.

We are particularly pleased that this has been achieved, as something like 30% of Hub use is on mobiles and tablets now, and the basic search and navigation need to be effective.

graph showing use of desk, mobile and tablet devices on the Hub
Devices used to view the Hub site over a three month period

In the above graph, the orange line is desktop, the green is mobile and the purple is tablet. (The dip around the end of December is due to problems setting up the Analytics reporting.)

Cutting Our Cloth

One of the lessons we have learnt over 15 years of working on the Archives Hub is that you can dream up all of the interface ideas that you like, but in the end what you can implement successfully comes down to the data. We had many suggestions from contributors and researchers about what we could implement, but oftentimes these ideas will not work in practice because of the variations in the descriptions.

We thought about implementing a search for larger, medium-sized or smaller collections, but you would need consistent ‘extent’ data, and we don’t have that because archivists don’t use any kind of controlled vocabulary for extent, so it is not something we can do.

When we were running focus groups, we talked about searching by level – collection, series, sub-series, file, item, etc. For some contributors a search by a specific level would be useful, but we could only implement three levels – collection (or ‘top level’), item (which includes ‘piece’) and then everything between these, because the ‘in-between’ levels don’t lend themselves to clear categorisation. The way levels work in archival description, and the way they are interpreted by repositories, means we had to take a practical view of what was achievable.
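The three-way grouping can be sketched as a simple mapping. The function name and the decision to treat ‘fonds’ as a top level and unknown values as ‘in-between’ are assumptions made for illustration; the source only states that there are three groups and that ‘item’ includes ‘piece’.

```python
from typing import Optional

# Assumption: 'fonds' is treated as a top level alongside 'collection'.
TOP_LEVELS = {"collection", "fonds"}
# From the text: 'item' includes 'piece'.
ITEM_LEVELS = {"item", "piece"}


def level_group(level: Optional[str]) -> str:
    """Map a level of description onto one of three searchable groups."""
    normalised = (level or "").strip().lower()
    if normalised in TOP_LEVELS:
        return "collection"
    if normalised in ITEM_LEVELS:
        return "item"
    # Series, sub-series, file and anything unrecognised all land here.
    return "in-between"


for value in ["Collection", "series", "sub-series", "file", "piece", "item"]:
    print(value, "->", level_group(value))
```

Everything that is neither clearly top level nor clearly an item falls into the middle group, which avoids having to decide what a repository meant by ‘series’ or ‘file’.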

We still aren’t completely sold on how we indicate digital content, but there are particular challenges with this. Digital content can be images that are embedded within the description, links to images, or links to any other digital content imaginable. So, you can’t just use an image icon, because that does not represent text or audio. We ended up simply using a tick to indicate that there is digital content of some sort. However, one large collection may have links to only one or two digital items, so in that case the tick may raise false expectations. But you can hardly say ‘includes digital content, but not very much, so don’t get too excited’. There is  room for more thought about our whole approach to digital content on the Hub, as we get more links to digital surrogates and descriptions of born-digital collections.

Statistics

The outward indication of a more successful site is that use goes up. The use of statistics to give an indication of value is fraught with problems. Does the number of clicks represent value? Might more clicks indicate a poorer user interface design? Or might they indicate that users find the site more engaging? Does a user looking at only one description really gain less value than a user looking at ten descriptions? Clearly statistics can only ever be seen as one measure of value, and they need to be used with caution. However, the reality is that an upward graph is always welcomed! Therefore we are pleased to see that overall use of the website is up around 32% compared to this period during the previous year.

graph of blog stats comparing data
Jan 2016 (the orange line) and Jan 2017 (the blue line), which shows typical daily use above 2,000 page views.

Feedback

We are pleased to say that the site has been very well received…

“The new site is wonderful. I am so impressed with its speed and functionality, as well as its clean, modern look.” (University Archivist)

“…there are so many other features that I could pick out, such as the ability to download XML and the direct link generator for components as well as collections, and the ‘start exploring’ feature.”  (University Archivist)

“Brand new Archives Hub looks great. Love how the ‘explorer themes’ connect physically separated collections” (Specialist Repository Head of Collections)

“A phenomenal achievement!” (Twitter follower)

With thanks to Rob Tice from Knowledge Integration for his input to this post.

Archives Portal Europe Country Managers’ Meeting, 30 Nov 2016

This is a report of the Archives Portal Europe Country Managers’ meeting in Slovakia, 30 November 2016, with some comments and views from the UK and Archives Hub perspective.

APE-CMmeeting-30Nov2016
APE Country Managers meeting, Bratislava, 30 Nov 2016

Context

The APE Foundation (APEF), which was created following the completion of the APEx project (an EC funded project to maintain and develop the portal running from 2012 to 2015), is now taking APE forward. It has a Governing Board and working groups for standards, technical issues and PR/comms. The APEF has a coordinator and three technical/systems staff as well as an outreach officer. Institutions are invited to become associate members, to help support the portal and its aims.

Things are going well for APEF, with a profit recorded for 2016, and growing associate membership. APEF continues to be busy with development of APE, and is endeavouring to encourage cooperation and collaboration as a means to seize opportunities to keep developing and to take advantage of EU funding opportunities.

Current Development

The APEF has the support of the Ministry of Culture in the Netherlands and has a close working relationship with the Netherlands national aggregation project, the ‘DTR’, which is key to the current APE development phase. The idea is to use the framework of APE for the DTR, benefitting both parties. Cooperation with DTR involves three main areas:

•    building an API to open up the functionality of APE to third parties (and to enable the DTR to harvest the APE data from The Netherlands)
•    improving the uploading and processing of EAC-CPF
•    enabling the uploading and processing of ‘additional finding aids’

The API has been developed so that specific requests can be sent to fetch selected data. It is possible to do this for EAD (descriptions) and EAC-CPF (names). The API provides raw data as well as processed results. There have been issues around things like the relevance ordering of results, which is a substantial area of work that is being addressed.

The API raises implications in terms of the data, as the Content Provider Agreement that APE institutions sign gives control of the data to the contributors. So, the API had to be implemented in a way that enables each contributor to give explicit permission for the data to be available as CC0 (fully open data). This means that if a third party uses the API to grab data, they only get data from a country that has given this permission. APEF has introduced an API key, which is a little controversial, as it could be argued that it is a barrier to complete openness, but it does enable the Foundation to monitor use, which is useful for impact, for checking correct use, and blocking those who misuse the API. This information is not made open, but it is stored for impact and security purposes.
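The combination of a permission check and a monitored key can be sketched in a few lines. This is not the APE API: the function name, the record fields, the placeholder country names and the key value are all invented for illustration, and the real service may work quite differently.

```python
from typing import Dict, List, Set


def fetch_open_records(records: List[Dict[str, str]],
                       cc0_countries: Set[str],
                       api_key: str,
                       issued_keys: Set[str],
                       usage_log: List[str]) -> List[Dict[str, str]]:
    """Return only data from countries that have opted in to CC0.

    Unknown keys are refused, and each accepted request is logged so
    that use can be monitored.
    """
    if api_key not in issued_keys:
        raise PermissionError("unknown API key")
    usage_log.append(api_key)
    return [r for r in records if r["country"] in cc0_countries]


# Invented data: only 'Country A' has given permission.
records = [
    {"id": "ead-1", "country": "Country A"},
    {"id": "ead-2", "country": "Country B"},
    {"id": "eac-3", "country": "Country A"},
]
issued_keys = {"demo-key"}
usage_log: List[str] = []

open_records = fetch_open_records(records, {"Country A"}, "demo-key",
                                  issued_keys, usage_log)
print([r["id"] for r in open_records])
```

The sketch makes the trade-off visible: the key check is a barrier of sorts, but it is also what makes the usage log, and therefore impact monitoring and blocking of misuse, possible.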

There was some discussion at the meeting around open data and use of CC0. In countries such as Switzerland it is not permitted to open up data through a CC0 licence, and in fact, it may be true to say that CC0 is not the appropriate licence for archival descriptions (the question of whether any copyright can exist in them is not clear) and a public domain licence is more appropriate. When working across European countries there are variations in approaches to open data. The situation is complicated because the application of CC0 for APE data is not explicit, so any licence that a country has attached to their data will effectively be exported with the data and you may get a kind of licence clash. But the feeling is that for practical purposes if the data is available through an API, developers will expect it to be fully open and use it with that in mind.

There has been work to look at ways to take EAC-CPF from a whole set of institutions more easily, which would be useful for the UK, where we have many EAC-CPF descriptions created by SNAC. Work to bring together more than one name description for the same person has not started, and is not scheduled for the current period of development, but the emphasis is likely to be on better connectivity between variations of a name rather than having one description per name.

Additional finding aids offer the opportunity to add different types of information to APE. You may, for example, have a register of artists or ships’ logs, or you may have started out with a set of cards with names A-Z, relating to your archive in some way. You could describe these in one EAD description, and link this to the main description. In the current implementation of EAD2002 in APE this would have to go into a table in Scope & Content, and in-line tagging is not allowed to identify parts of the data. This limits how you can search by name. EAD3, however, gives the option to add more information on events and names. You can divide a name up into parts, which allows for better searching. Therefore APE is developing a new means to fetch and process EAD3 for the additional finding aids alongside EAD2002 for ‘standard’ finding aids. In conjunction with this, the interface needs to be changed to present the new names within the search.

The work on additional finding aids may not be so relevant for the Archives Hub as a contributor to APE, as the Hub cannot look at taking on ‘other finding aids’, with all the potential variations that implies. However, institutions could potentially log into APE themselves and upload these different types of descriptions.

APE and Europeana

There was quite a bit to talk about concerning APE and Europeana. The APEF is a full partner of the Europeana Digital Services Infrastructure 2 (DSI2) project (currently running 2016/2017). The project involves work on the structure for Europeana, maintaining and running data and aggregation services, improving data quality, and optimising relations with data partners. The work APE is involved with includes improving the current workflow for harvest/ingest of data, and also evaluating what has already been ingested into Europeana.

Europeana seems to have ongoing problems dealing with multi-level EAD descriptions, compounded by the limitation that they only represent  digital materials. The approach is not a good fit for archives. Europeana have also introduced both a new publishing framework and different rights statements.

The new publishing framework is a four-tier approach where you can think of Europeana as a more basic tool for promoting your archives, or something that is a platform for reuse. It refers to the digital materials in terms of whether they are a certain number of pixels, e.g. 800 pixels wide for thumbnails (adding thumbnails means using Europeana as a ‘showcase’) and 1,200 pixels wide (high quality and reusable, using Europeana as a distribution and reuse platform). The idea of trying to get ‘quality’ images seems good, but in practice I wonder if it simply raises the barrier too much.

The new Rights statements require institutions to be very clear about the rights they want to apply to digital content. The likely conclusion of all this from the point of view of the Archives Hub is that we cannot grapple with adding to Europeana on behalf of all of our contributors, and therefore individual contributors will have to take this on board themselves. It will be possible for contributors to log into the APE dashboard (when it has been changed to reflect the Europeana new rights) and engage with this, selecting the finding aids, the preferred rights statements, and ensuring that thumbnail and reusable images meet the requirements. Once the descriptions are in APE they can then be supplied to Europeana. The resulting display in Europeana should be checked, to ensure that it is appropriate.

We discussed this approach, and concluded that maybe APE contributors could see Europeana as something that they might use to showcase their content, so, think of it on our terms, as archives, and how it might help us. There is no obligation to contribute, so it is a case of making the decision whether it is worth representing the best visual archives through Europeana or whether this approach takes more effort than the value that we get out of it.  After 10 years of working with Europeana, and not really getting proper representation of archives, the idea of finding a successful way of contributing archives is appealing, but it seems to me that the amount of effort required is going to be significant, and I’m not sure if the impact is enough to warrant it.

Europeana are working on a new way of automated and real time ingest from aggregators and content providers, but this may take another year or more to become fully operational.

Outreach and CM Reports

Towards the end of the day we had a presentation from the new PR/communications officer. Having someone to encourage, co-ordinate and develop ideas for dissemination should prove invaluable for APE. The Facebook page is full of APE activities and related news and events. You can tweet and use the hashtag #archivesportaleurope if you would like to make APE aware of anything.

We ended the day with reports from country managers, which, as always, threw up many issues, challenges, solutions, questions and answers. Plenty to set up APEF for another busy year!

Archives Portal Europe builds firm foundations

On 8th June 2016 I attended the first Country Managers’ meeting of the newly formed Foundation of the Archives Portal Europe (APEF) at the National Archives of the Netherlands (Nationaal Archief).

The Foundation has been formed on the basis of partnerships between European countries. The current Foundation partners are: Belgium, Denmark, Luxembourg, The Netherlands, Spain, Sweden, Switzerland, Estonia, France, Germany, Hungary, Italy, Latvia, Norway and Slovenia. All of these countries are members of the ‘Assembly of Associates’. Negotiations are proceeding with Bulgaria, Greece, Liechtenstein, Lithuania, Malta, Poland, Slovakia and the UK. Some countries are not yet in a position to become members, mainly due to financial and administrative issues, but the prospects currently look very positive, with a great willingness to take the Portal forwards and continue the valuable networking that has been built up over the past decade. Contributing to the Portal does not incur financial contribution; the Assembly of Associates is separate from this, and the idea is that countries (National Archives or bodies with an educational/research remit) sign up to the principles of APE and the APE Foundation – to collaborate and share experiences and ideas, and to make European archives as accessible as possible.

The Governing Board of the Foundation is working with potential partners to reach agreements on a combination of financial and in-kind contributions. It’s also working on long term strategy documents. It has established working groups for Standards and PR & Communications and it has set up cooperation with the Dutch DTR project (Digitale Taken Rijksarchieven / Digital Processes in State Archives) and with Europeana. The cooperation with the DTR project has been a major boost, as both projects are working towards similar goals, and therefore work effort can be shared, particularly development work.

Current tasks for the APEF:

  • Building an API to open up the functionality of the Archives Portal Europe to third parties and to implement the possibility for the content providers to switch this option on or off in the Archives Portal Europe’s back-end.
  • Improving the uploading and processing of EAC-CPF records in the Archives Portal Europe and improving the way in which records creators’ information can be searched and found via the Archives Portal Europe’s front-end and via the API.
  • Enabling the uploading/processing of “additional finding aids (indexes)” in the Archives Portal Europe and making this additional information available via the Archives Portal Europe’s front-end and the API.

This is in addition to the continuing work of getting more data into the Portal, supporting the country managers in working with repositories, and promoting the portal to researchers interested in using a Europe-wide search and discovery tool.

APEF will be a full partner in the Europeana DSI2 project, connecting the online collections of Europe’s cultural heritage institutions, which will start after the summer and will run for 16 months. Within this project APEF will focus on helping Europeana to develop the aggregation structure and provide quality data from the archives community to Europeana. A focus on quality will help to get archival data into Europeana in a way that works for all parties. There seems to be a focus from Europeana on the ‘treasures’ from the archives, and on images that ‘sell’ the archives more effectively. Whatever the rights and wrongs of this, it seems important to continue to work to expose archives through as many channels as we can, and for us in the UK, the advantages of contributing to the Archives Hub and thence seamlessly to APE and to Europeana, albeit selectively, are clear.

A substantial part of the meeting was dedicated to updates from countries, which gave us all a chance to find out what others are doing, from the building of a national archives portal in Slovakia to progress with OAI-PMH harvesting from various systems, such as ScopeArchiv, used in Switzerland and other countries. Many countries are also concerned with translations of various documents, such as the Content Provider Agreement, which is not something the UK has had to consider (although a Welsh translation would be a possibility).

We had a session looking at some of the more operational and functional tasks that need to be thought about in any complex system such as the APE system. We then had a general Q&A session. It was acknowledged that creating EAD from scratch is a barrier to contributing for many repositories. For the UK this is not really an issue, because we contribute Archives Hub descriptions. But of course it is an issue for the Hub: to find ways to help our contributors provide descriptions, especially if they are using a proprietary system. Our EAD Editor accounts for a large percentage of our data, and that creates the EAD without the requirement of understanding more than a few formatting tags.

The Archives Hub aims to set up harvesting of our contributors’ descriptions over the next year, thus ensuring that any descriptions contributed to us will automatically be uploaded to the Archives Portal Europe. (We currently have to upload on a per-contributor basis, which is not very efficient with over 300 contributors.) We will soon be turning our attention to the selective digital content that can be provided by APE to Europeana. That will require an agreement from each institution in terms of the Europeana open data licence. As the Hub operates on the principles of open data, to encourage maximum exposure of our descriptions and promote UK archives, that should not be a problem.

With thanks to Wim van Dongen, APEF country manager coordinator / technical coordinator, who provided the minutes of the Country Managers’ meeting, which are partially reproduced here.

Connecting through defining people and relationships

If, as a researcher, you search for ‘Jane Drew’, the celebrated architect and town planner, on the Archives Hub, amongst other things, you might discover a single item, “Letter from Jane B Drew to John and Myfanwy Piper”, a letter in the “Papers of John and Myfanwy Piper”.

You can see that it’s a letter in a collection at the Tate Gallery Archive. The description of the collection is an example of a good quality traditional archival catalogue, giving a fairly detailed listing of the content of this particular collection. But as a researcher you are really interested in just this one letter. You may ask yourself a number of questions, possibly starting with (1) Is this the Jane Drew I’m interested in? and then (2) What is the relationship between Jane Drew and John and Myfanwy Piper? You may well be able to find answers by accessing the letter itself, but at this stage you may just want to place this connection in the broader context of Jane Drew’s life and work. As a researcher, understanding how these people are connected may shed light on your research interests.

In this blog I want to think about this question of relationships. The fact is that archivists rarely provide structured information about relationships; if there is information, it is usually in the biographical history, which might outline key events and people in someone’s life, referring to their parents, work colleagues, friends, etc. The nature of the relationship is sometimes explicitly given, but often it is not. Our standards don’t really say much about relationships between the entities (people, organisations, places, etc) that we describe in our catalogues.

Going back to the Papers of John and Myfanwy Piper as an example, the biographical history includes the following:

[John] Piper began writing reviews from the late 1920s making a name for himself as a critic writing for periodicals like ‘The Listener’ and the ‘Architectural Review’. From 1935-1937 he assisted Myfanwy Evans, with the production of a quarterly review of contemporary European abstract painting called ‘Axis’. In 1937 Piper was commissioned by his friend John Betjeman to write the ‘Shell Guide to Oxfordshire’. Piper went on to write and provide photographs for a number of the guides as well as edit the series. In the same year John Piper married the writer Myfanwy Evans.

This is typical of a biographical history – useful historical information about the individual or organisation. Within this there is information we can potentially use to create explicit relationship information:

John Piper ‘worked with’ Myfanwy Evans
John Piper ‘was friends with’ John Betjeman
John Piper ‘worked for’ John Betjeman
John Piper ‘was married to’ Myfanwy Evans

There are a number of issues to consider here:

How can we unambiguously identify the people?
How do we choose the vocabulary we use to define the relationships?
Do we try to include dates?
Is it reasonable for us to interpret relationships as ‘friendships’ or ‘collaborations’ if this is not actually explicit?
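
To make these issues a little more concrete, here is a minimal sketch of how the four statements above might be held as structured data. It is only an illustration, not how the Archives Hub or any standard actually models relationships: the `Relationship` class, the small vocabulary, the `explicit` flag and the `people_connected_to` helper are all assumptions made up for this example, and plain names stand in for the unambiguous identifiers we would really want.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical controlled vocabulary; a real one would need community agreement.
RELATIONSHIP_TYPES = {
    "worked with",
    "was friends with",
    "worked for",
    "was married to",
}


@dataclass(frozen=True)
class Relationship:
    subject: str                      # plain names here; ideally an identifier
    predicate: str                    # must come from RELATIONSHIP_TYPES
    obj: str
    start_year: Optional[int] = None  # dates are optional (issue 3)
    end_year: Optional[int] = None
    explicit: bool = True             # False when the cataloguer has interpreted (issue 4)

    def __post_init__(self):
        # Reject relationship types outside the agreed vocabulary (issue 2).
        if self.predicate not in RELATIONSHIP_TYPES:
            raise ValueError(f"unknown relationship type: {self.predicate!r}")


# The four statements drawn from the Piper biographical history.
piper_relationships = [
    Relationship("John Piper", "worked with", "Myfanwy Evans", 1935, 1937),
    Relationship("John Piper", "was friends with", "John Betjeman"),
    # 'worked for' is our reading of 'was commissioned by', so flag it as interpreted.
    Relationship("John Piper", "worked for", "John Betjeman", 1937, explicit=False),
    Relationship("John Piper", "was married to", "Myfanwy Evans", 1937),
]


def people_connected_to(name, relationships):
    """Return everyone who shares a relationship with the named person."""
    connected = set()
    for rel in relationships:
        if rel.subject == name:
            connected.add(rel.obj)
        elif rel.obj == name:
            connected.add(rel.subject)
    return connected
```

Even a structure this simple forces the four questions into the open: what goes in the subject and object fields, which relationship types are allowed, whether dates are recorded, and whether an interpreted relationship is marked as such.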

We are looking at some of these issues through our AHRC project, Exploring British Design. They are all issues that archivists need to explore in a debate around relationship information, but the first issue to consider is simply whether we should be thinking more about including this kind of relationship information in our archival finding aids. Is it something that would be of real value to end users? This issue is coming more to the fore as we start to think about implementing ISAAR (CPF) and working with EAC-CPF, and also as Linked Open Data gains traction.

In a recent (and well worth reading) article in the Journal of Contemporary Archival Studies, on the potential impact of EAC-CPF, K. M. Wisser reports the findings of a survey about relationship information. The survey received 208 responses from archivists/archives in the US. Wisser wrote “The survey results indicate that the archival community has only just begun to consider relationships in the context of archival description and the role that explicit description of those relationships may play.”

As one respondent wrote:

“relationships are among the most important facets in a collection and deserve a high priority in description. One cannot understand the historical value of an event, person, or organization without knowing [the] relationship among and between them.”

One thing that really strikes me in Wisser’s findings is that archivists see relationships that are documented outside of the collection as almost as significant as those that are documented within the collection. Going back to our original topic of Jane Drew: who else did Jane Drew work with? Should we provide that information to our users, whether or not it is documented within the collection? Is our role to give as full an account as we can of Drew’s life and career? Is it to limit ourselves to what is within the collection?

Wisser’s survey asked respondents about the importance of relationship types. It is curious to me that archivists rated ‘collaborated with’ as a more important relationship than ‘studied with’; they rated a friendship as far more important when it was documented in the collection; and they rated ‘influenced by’ as generally not so important. I’m surprised that the respondents had such definite ideas about the relative importance of different types of relationships, especially when the majority appeared to agree with the importance of ‘objective cataloguing’.

In our Exploring British Design project, the work we did with researchers definitely confirmed to me the fairly self-evident observation that any relationship can be of major significance in research, even if it appears of minor significance within the archive, or indeed, within the literature in general. A brief collaboration may have been a crucial influence, a short friendship may have had hitherto unrealised impact, and anyway, the importance of the relationship depends upon the research you are doing. Researchers are not really aware of how challenging it is for us as information professionals to establish these kinds of relationships in ways that they can then access. But it is clear that this is the sort of connectivity they are after.

One of the challenges with documenting relationship types is that they can be hard to define. As Wisser notes:

“The concept of influence, however, proved the most problematic. Comments such as ‘influence is a squishy sort of relationship’ and ‘I think it would often be very difficult to prove that Entity A was influenced by Entity B’ indicate a notion of intangibility.”

The conclusion could be that we should leave well alone relationships that are hard to define. On the other hand, if we are in a position, as we research a collection, to highlight potential connections, that action could be of major value to a researcher, who may otherwise never know about a link that ends up being crucial to their particular research. The relationships that are easy to define are likely to have been defined already.

One thing that strikes me about the whole notion of introducing interpretation and opinion into cataloguing (a possible argument against defining relationships) is that the horse has pretty much bolted. I’ve looked at enough ‘objective’ descriptions to be aware that the names archivists choose to add as index terms are a choice; they inevitably have to be an opinion about the names significant enough to add as index terms. And subjects are a similar case – some collections are indexed thoroughly, some not at all.

Aside from indexing, each person would create a different scope and content entry, including and excluding different information, and whether you call that subjective or not, it is certainly always selective. You could also argue that the level of detailed hierarchical cataloguing might indicate the relative importance of the collection. On the Archives Hub there are some collections catalogued in huge detail, and it is inevitable that researchers will assume these collections are particularly important.

All of these choices have implications for discoverability.

In Wisser’s survey, a significant proportion of respondents felt that the importance of a relationship should be based upon the use of the collection.  But this, again, raises the question: When thinking about relationships, is the cataloguer reflecting the scope of the collection, or are they trying to give as full a picture as they can of the person or organisation? Are we within the world of the collection; or is the collection within the world?

The reason that I believe that we should think beyond the bounds of the collection content is that I think it promises much richer rewards for our users and encourages archives to be a major player within a broader landscape of information resources. I base my thinking on the premise that the researcher is primarily interested in their research topic, which is not likely to be an archive collection per se, but rather an event, a person, an organisation, a subject, and the way things are connected. I think archivists are still tending to think in terms of a document that describes a collection, rather than how to link the collection into the cultural heritage landscape, and even more broadly beyond that. I wonder if archivists don’t always think beyond the catalogues they currently create because the researchers they have contact with (who visit the archive) are already fairly confident they want to use that repository, or a particular archive within that repository. In other words, the researcher is already in their space. When I worked in a specialist archive, I thought about researchers discovering our archive as a whole (having an online presence) and then I thought about them using our collections (individual collections each with their own description); I didn’t think about how our collections could be seen as part of a whole information landscape.

The loudest – and most convincing – argument I hear against this kind of approach is that it takes time, and archivists are short on time. But I wonder if that means we have to think fundamentally differently. Let’s go back to Jane Drew, and think about the value of relationships for research into her life and work…

If one archive collection description highlights just a few relationships, this could take us a long way (although relationship types are a whole different thing…). If the individuals and organisations are unambiguously identified, this can help with the process of creating links out to other data sources, so that information can be linked together; then we have the chance to benefit from finding out about relationships that have been defined elsewhere. In other words, the connections one person has throughout their life can only be fully realised through the pooling of information resources, very much a joint effort. If the data is structured it can potentially be brought together.
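
As a rough sketch of what that pooling could look like, suppose two separate sources each record relationships against the same unambiguous identifier. Everything here is hypothetical: the `example:` identifiers, the second source and its contents, and the `pool_relationships` function are invented purely to show the principle, and only the letter from Jane Drew to the Pipers comes from the catalogue discussed above.

```python
from collections import defaultdict

# Source A: relationships derived from one archive description.
# The 'example:' identifiers are placeholders for real, shared identifiers.
archive_description = [
    ("example:jane-drew", "wrote a letter to", "example:john-piper"),
    ("example:jane-drew", "wrote a letter to", "example:myfanwy-piper"),
]

# Source B: an entirely hypothetical second data source.
other_source = [
    ("example:jane-drew", "worked with", "example:person-x"),
    ("example:jane-drew", "wrote a letter to", "example:john-piper"),
]


def pool_relationships(*sources):
    """Merge relationship statements from several sources, keyed on the
    subject's identifier. Statements repeated across sources collapse
    into one because they are stored in a set."""
    pooled = defaultdict(set)
    for source in sources:
        for subject, predicate, obj in source:
            pooled[subject].add((predicate, obj))
    return pooled


pooled = pool_relationships(archive_description, other_source)
for predicate, obj in sorted(pooled["example:jane-drew"]):
    print(predicate, "->", obj)
```

The merge only works because both sources use the same identifier for the same person. If one source had ‘Jane Drew’ and the other ‘Drew, Jane B.’ as plain text, the statements would sit in separate piles and the connection would be missed.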

Traditional archival cataloguing focuses on the collection, and what is documented within the collection. It tends to think in terms of a self-contained document. Pursuing relationships breaks the bounds of any one information source. That seems like a good thing, but it raises questions around approaches to cataloguing. One obvious way to tackle this is to start to think more about archival authority records. These should enable us to move beyond a collection-centric description and towards a more entity-based approach, because you describe an agent (entity) independently of any one archival collection. Another option is to think in a Linked Data way, where you are concentrating on entities and relationships.

There are so many questions raised by the whole area of entities and relationships. A few of my current conclusions are:

We should primarily be led by what benefits research. Researchers are far less likely to think in terms of individual archive collections, and far more likely to think in terms of research areas (topics). The Web gives us the opportunity to think in a broader context.

Maybe it is worth taking some of the time currently spent on providing a really detailed biographical history as an unstructured narrative, or a really detailed multi-level description, and spending it instead on providing (or providing the potential for) connections between our descriptions and the larger information environment. This could allow researchers to bring together much more comprehensive information, even if what we provide about individual collections is less detailed. Just adding something like a VIAF identifier to a name would be a great big leap forwards (http://viaf.org/viaf/51792789).

There is great value in being a small fish in a big pond, because most researchers are fishing for data in the big pond. As Wisser’s article says, “relationships are…seen to free collections from the isolation of individual repositories.” If we aim to be part of the big pond, we can continue to tend our smaller ponds as well!

To go back to the Piper Collection and Jane Drew… I used this as a random example, thinking of a researcher interested in one particular designer. But of course, the Tate Gallery Archive can’t be expected to define all the relationships within the description. It’s great that they have provided enough detail to find this one individual item – without that, we would not know about the connection with Jane Drew. I’m arguing for unambiguously identifying entities (people, organisations) because if we can potentially link this instance of ‘Jane Drew’ to other instances in other information sources, then it is very possible that we can find out more about this relationship; and if the relationship can’t be established through other sources, then maybe this archive provides unique evidence of a connection that could significantly benefit research.