A Selection of Archives to mark International Women’s Day

March 1, 2021 / Jane Stevenson

To mark International Women’s Day on 8th March, here is a selection of archives featuring women who have excelled and been highly influential in many different fields.

Daphne Oram (1925-2003), composer and musician

The Daphne Oram Archive, held at Goldsmiths, University of London, comprises papers, personal research, correspondence and photographs documenting the life and work of a pioneering British composer and electronic musician.

Throughout her career she lectured on electronic music and studio techniques. In 1971 she wrote An Individual Note of Music, Sound and Electronics which investigated philosophical aspects of electronic music. Besides being a musical innovator, she was the first woman to direct an electronic music studio, the first woman to set up a personal studio and the first woman to design and construct an electronic musical instrument.

Delia Derbyshire (1937-2001), musician and composer

The University of Manchester holds the Papers of Delia Derbyshire, composer. After being rejected by Decca Records, who said that they did not employ women in the recording studio, in 1962 Derbyshire became a trainee studio manager at the BBC. She was soon seconded to work at the BBC’s Radiophonic Workshop, which had been set up to provide theme and incidental music and sound for BBC radio and television programmes. The following year, she produced her electronic ‘realisation’ of Ron Grainer’s theme tune for the hugely popular BBC series Doctor Who – which is still one of the most famous and instantly recognisable television themes. In the late 1990s there was renewed interest in her work and many younger musicians making electronic dance and ambient music (such as Aphex Twin and The Chemical Brothers) cited Derbyshire as an important influence.

The Anita White Foundation International Women and Sport Archive

Dr Anita White and Professor Celia Brackenridge were both associated with the University of Chichester, and they were both centrally involved in the leadership and development of the international women and sport movement since 1990. The International Women and Sport Archive is comprised primarily of papers brought together by them and other leaders in the movement, accumulated in the course of their research, study and work in the fields of the sociology of sport and sport science, and their involvement as activists and leaders in the global women and sport movement.

The International Women and Sport Movement is said to have been born out of a decade in which increasing globalisation brought together women from across the world in the practice of sport. It does not refer to any one organisation, body or country, but it is generally agreed that a landmark event and major catalyst in the movement was the first international conference on women and sport which took place on 5-8 May 1994.

Kaye Webb (1914-1996), editor and publisher

The Papers of Kaye Webb, covering her career as journalist, magazine editor, editor at Puffin and later literary agent, are held at the Seven Stories Archive. The collection provides a comprehensive record of Webb’s career, reflecting the wide variety of work undertaken by her, and documented through notes, correspondence, press cuttings, audio-visual material, memorabilia and ephemera. Webb was editor of Puffin Books between 1961 and 1979, and in 1967 founded the Puffin Club, which she ran until 1981. As a journalist she worked on publications including Picture Post, Lilliput and the News Chronicle.

Elizabeth Garrett Anderson (1836-1917), physician and suffragist

The Letters of Elizabeth Garrett Anderson are part of the Women’s Library Archives. An English physician and suffragist, she was the first woman to qualify in Britain as a physician and surgeon. She was the co-founder of the first hospital staffed by women, the first dean of a British medical school, the first woman in Britain to be elected to a school board and, as mayor of Aldeburgh, the first female mayor in Britain. The letters cover Anderson’s struggle to secure an entry into the medical profession.

Barbara Castle (1910-2002), politician and campaigner

The Barbara Castle Cabinet Diaries at the University of Bradford cover 1965-1971 and 1974-1976. In the 1945 General Election Barbara Castle was elected M.P. for Blackburn, a seat that she retained for 34 years. Following the Labour victory in 1964, Prime Minister Harold Wilson put Castle in charge of the newly-created Ministry of Overseas Development. “I decided on 26 January that I ought to start keeping a regular record of what was happening”, she said. Castle maintained this political diary throughout her periods in office. In 1974 Castle was made Secretary of State for Social Services, and in this post she introduced payment of child benefit to mothers and worked on the State Earnings Related Pensions Scheme. In 1979 she became a Member of the European Parliament and in 1990 she entered the House of Lords as Baroness Castle of Blackburn.

Alison Settle (1891-1980), fashion journalist and editor

In a career spanning from the early 1920s to the early 1970s, Alison Settle worked as a fashion journalist, and Brighton Design Archive hold the Alison Settle Archive which includes professional papers dating from the mid-1930s. She was a tireless champion of the interests of women, as well as campaigning for good quality, affordable design through her relationships with designers and manufacturers. Settle sought to improve design standards in all areas of manufacture and production, and contributed to the work of both the Council for Art & Industry and the Council of Industrial Design. She remained one of the best known fashion journalists in the country.

Elsie Edith Bowerman (1889-1973), lawyer and suffragette

Diaries, photographs and correspondence of Elsie Edith Bowerman are held at the Women’s Library. Bowerman followed her mother into the suffrage movement. They were both active members of the militant Women’s Social & Political Union. They were on the maiden voyage of the Titanic – both survived. She worked for Scottish Women’s Hospitals during the First World War, and she also worked for Emmeline and Christabel Pankhurst during their campaign for ‘industrial peace’ in support of the war effort. In 1924 or 1925 she went on to set up the Women’s Guild of Empire with Flora Drummond, with the aim of promoting co-operation between employers and workers. She was admitted to the Bar in the early twenties and practised until 1938, when she joined the Women’s Voluntary Services. In 1947 Bowerman went to the United States to help set up the United Nations Commission on the Status of Women.

Tessa Boffin (1960-1993), writer, photographer and performance artist

The Tessa Boffin Archive at the University for the Creative Arts includes lesbian, gay, bisexual, transsexual and other photography projects, including portrayals of AIDS, cross dressing and safe sex, as well as notes on television and radio productions of the 1980s and their portrayal of feminism and AIDS. Boffin was one of the leading lesbian artists in Great Britain during the AIDS Crisis, but her risqué performances were controversial, and frequently drew criticism, including from inside the LGBTQ community.

Gladys Aylward (1902-1970), missionary

Gladys May Aylward was an evangelical Christian missionary to China. She travelled to China in 1932 and in 1936 she became a Chinese citizen. In 1940, against the background of civil war between Nationalist government troops and the Communists, Japanese invasion, and the threat of bandits, she led a group of orphans on a perilous journey to Sian. Her story was told in the book The Small Woman, by Alan Burgess published in 1957, and made into the film The Inn of the Sixth Happiness starring Ingrid Bergman, in 1958. The Papers of Gladys Aylward, held at SOAS, provide a vivid portrait of Aylward, including her life in China, and the impact of World War Two.

Creating a COVID-19 archive at the Royal College of Nursing

November 3, 2020 / Jane Ronson

Archives Hub feature for November 2020

Now more than ever as we continue to battle the COVID-19 pandemic, the world is reliant on its digital infrastructure; the need to provide and access accurate and up-to-date information is of paramount importance. This raises some interesting questions, challenges and opportunities for archive services who can play their part in the collective response to the crisis by capturing and recording events, activities and decisions. Archives and recordkeeping professionals have always supported the notions of accountability and transparency through their work, something which is being demonstrated in real time during the development of the pandemic.

As the UK’s largest trade union and professional association for nurses, the Royal College of Nursing (RCN) has been supporting and representing nurses and healthcare workers throughout the pandemic. It is vital that records of how this has been done are available to the organisation in perpetuity as evidence of advice given and decisions taken. The RCN has a responsibility to its members to be able to demonstrate that the organisation has been working in their best interests and the interests of their patients. In turn, the RCN archive has a responsibility to ensure that records with evidential and research value are captured, preserved and accessible to the right audiences at the right time.

***One of our first attempts at archiving the RCN COVID-19 webpages using our digital archive.***

As a result, like many of our archivist and recordkeeping colleagues across the world, we have created a COVID-19 archive. Since the beginning of the year the RCN archive team have been actively collecting records relating to COVID-19 from across the organisation to build up a picture of how the pandemic has unfolded through the eyes of RCN members and staff. Unsurprisingly, this covers a wide range of record types and digital formats: web crawls of special COVID-19 webpages containing up-to-date guidance and advice, targeted staff emails, member surveys on working conditions and PPE, General Secretary’s video messages, special committee situation reports, newly created online nursing resources, publications – the list could go on. Within this set of records is a complex combination of access requirements and restrictions which, through balancing business confidentiality with public interest, we will manage alongside the records themselves.

We are in the fortunate position of having a remotely accessible network and a digital archive, which has meant that we have been able to collect these records as they have been created and start uploading them to our digital archive straight away. While some of the records we’re collecting as part of the COVID-19 archive project would have been transferred to us anyway, there are several new record series on our 2020 collecting plan as a result of the pandemic. For example, our first venture in web archiving was a test crawl of the RCN COVID-19 webpages; these are now collected regularly and form an integral part of the COVID-19 archive. Having seen and been inspired by the experiences of other archives already running successful daily web crawls to capture public advice and the public response, we decided to capture our pages daily as well – this ensured that we were keeping up to speed with each piece of new advice and guidance shared on the webpages. As the rate of updates to the pages has slowed, we have since reduced the frequency to weekly, although we continue to monitor them, ready to capture more frequently if needed. This was the pilot web archiving project we didn’t know we were doing until it happened, and it has in turn sparked interest in a larger web archiving project to capture the whole RCN website, which is well underway.

***A video message from Donna Kinnair, General Secretary, on the staff intranet. An example of the range of formats collected for the COVID-19 archive.***

Alongside the collecting of material, we have been considering how the records of the COVID-19 archive will fit into our existing catalogue structure. While it would be easy to create a new Fonds for COVID-19, we realised that this view was being skewed by our thoughts about future access to the material, and the ease with which colleagues or researchers would be able to view all the material neatly packaged together. Instead we plan to preserve the context of the records by arranging them by creator (in our case, mostly the department of origin) to fit within our existing catalogue structure. There will be occasions when it is important to view all COVID-19 records together to get a complete picture of the reaction and response to the pandemic, so using the ‘linked collection’ feature in our digital archive we plan to create a virtual COVID-19 collection containing records from across different record series to allow this level of access. Beyond this we are considering which records from our COVID-19 archive will be shared on our public digital archive website to ensure the transparency and accountability that creating the COVID-19 archive in the first place helps to achieve.

We have certainly learnt a lot this year and the team has upskilled, becoming more proficient and confident in processing a wide range of digital formats, from collection through to access. Our sector has also stepped up by providing online webinars and training events to share our experiences of this extraordinary time. In May we participated in a panel discussion facilitated by Preservica, our digital archive supplier, who generously donated 250GB of storage space for us to store the COVID-19 archive. At the event we shared our plans and projects for collecting COVID-19 records with the archive community alongside colleagues from a wide range of institutions. These included Network Rail, who have been collecting records such as emergency train timetables introduced in response to the falling customer demand, and all the documentation that went into making this happen, and University at Buffalo in the US, who are encouraging students and staff to share their experiences of the pandemic by submitting video diaries and photographs to the archive. Learning about and reflecting on the wide range of collecting projects happening around the world is as informative as it is inspiring.

***An example of a publication for the COVID-19 archive. This is the cover of the April 2020 Bulletin RCN members magazine.***

It is amazing to think that in the (probably not too distant) future the COVID-19 records we have collected will be catalogued, available to view online through our digital archive and used to inform research into, and evaluations of, the response of the UK’s largest independent nursing organisation and our role in how Britain handled the pandemic.

Katherine Chorley, Digital Asst Archivist
Royal College of Nursing Archives

Browse all Royal College of Nursing Archives collections on the Archives Hub.

Previous RCN Archives feature: Cathlin du Sautoy and Hermione Blackwood: personal papers at the Royal College of Nursing Archives

All images copyright Royal College of Nursing Archives. Reproduced with the kind permission of the copyright holders.

Name Authorities in Archives

April 1, 2020 / Jane Stevenson

There have been some threads on archives-nra recently about adding names to archival catalogues, so I thought it might be good to blog about it, reflecting the Archives Hub’s experience and knowledge on this topic after 20 years of working with aggregated data, and three Linked Data projects. We are also about to embark upon a ‘Names Project’ with the intention, in the first phase, of laying the groundwork for creating something that is interoperable and sustainable. The idea is to develop the Archives Hub so that we can include name records, with the ability to ingest and process them automatically and at scale, which is a big challenge.

What rules or guides should you follow for creating a name?

To start with, one of the interesting things about this topic is that the development of persistent unique identifiers (PIDs) should actually make consistency with name form and pattern less of an issue. (I say that advisedly, as someone who has always promoted consistency, following rules, and using care with constructing names). Of course, this only works if PIDs are assigned to names. To take an example – here is a list of names for one person, the Victorian social reformer Beatrice Webb:

Webb Martha Beatrice 1858-1943 Social Reformer
Webb Beatrice 1858-1943 social reformer
Webb, Martha Beatrice. ( 1858-1943) nee Potter Social Reformer
Webb Martha Beatrice 1858-1943 nee Potter, social reformer and historian
WEBB, Beatrice, 1858-1943
Webb, Martha Beatrice
Webb, M.B., 1858-1943
Webb, Martha, b. 1858, wife of Sydney Webb
Potter, Martha Beatrice, 1858-1943
Martha Beatrice Potter, 1858-1943

If all instances of this name were accompanied by recognised and agreed identifiers, then job done, we know they all represent the same person, whatever the form of the name.

It is important to state that ‘knowing who this is’ applies to both humans and machines. Humans will probably gather that these all represent the same person; the question is whether they can all be matched programmatically.
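As a rough illustration of why identifiers make the form of the name matter less, here is a minimal Python sketch. The catalogue entries are made up for the example, the pairing of each name string with an identifier is an assumption, and the URI is the VIAF link for Beatrice Webb given later in this post. With an identifier attached, matching becomes a simple grouping step; without one, the name is left for some other kind of matching.

```python
from collections import defaultdict

# Hypothetical catalogue entries: (name string as entered, identifier or None).
# The URI is the VIAF link for Beatrice Webb quoted later in this post.
WEBB = "http://viaf.org/viaf/86607236"

entries = [
    ("Webb Martha Beatrice 1858-1943 Social Reformer", WEBB),
    ("WEBB, Beatrice, 1858-1943", WEBB),
    ("Potter, Martha Beatrice, 1858-1943", WEBB),
    ("Webb, Martha Beatrice", None),  # no identifier recorded
]


def group_by_identifier(entries):
    """Group name strings by identifier; names without one stay unmatched."""
    matched = defaultdict(list)
    unmatched = []
    for name, pid in entries:
        if pid:
            matched[pid].append(name)
        else:
            unmatched.append(name)
    return dict(matched), unmatched


matched, unmatched = group_by_identifier(entries)
for pid, names in matched.items():
    print(pid, names)
print("No identifier:", unmatched)
```

The three very different strings end up in one group purely because they share an identifier; the fourth still needs the kind of name analysis discussed in the rest of this post.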

Still, we’ve a long way to go before universal PID harmony, and we’ve also got the problem of which identifiers to use. So, we’re back to rules for the construction of a name, which, of course, have many advantages besides disambiguation.

The archives community in the UK is likely to turn to ISAAR(CPF) and the NCA Rules. Sometimes the question is asked about which one to use, but in truth they are complementary and so a choice is not needed.

ISAAR(CPF) is about a full name authority, as it is generally understood within the archive community – essentially a biographical record, documenting the nature, context and activities of the entity, preferably providing relationships to other people and organisations. The idea is to provide context about the records’ creation and use, which helps the user to understand and interpret the archive collection, something that archivists see as an essential activity.

The term ‘name authority’ can simply apply to the name itself as well as to a full record about an entity. This is typically the case in the Library world, but even in NCA Rules (which I’ll come on to), the term is defined as “the recognised, authorised or prescribed form of a name”. This difference in definition can sometimes cause confusion.

ISAAR(CPF) states that it “is intended to be used in conjunction with existing national standards or as the basis for the development of national standards”, and “rules and conventions for standardizing access points may be developed nationally”. ISAAR(CPF) is about a whole lot more than the name; it is about describing the entity and the relationships it has with other entities. For the authorised form of the name, you are prompted to use national conventions or other guidance. The standard also allows for other forms of the name, but essentially the way the name is constructed and the dividers used are not prescribed. ISAAR(CPF) does also allow for an authority record identifier – which comes back to the PIDs mentioned above, but it does not prescribe the identifier used.

So, that leaves NCA Rules, which are about the construction of names. I’m just going to focus on personal names for this blog post.

As far as I’m concerned, there is loads of good and useful stuff in these Rules. Going through the rules really brings home just how complicated names can be. Everything from medieval surnames to Greek names to names with no identifiable surname, pre-titles and epithets is addressed. I’ve particularly found the rules on royal names and papal names useful myself. If you want to know how to deal with William of Malmesbury or the Duchess of Marlborough it’s great.

The NCA Rules were created in 1997, which is an age away in terms of the modern digital and online age, and yet so much of what they say is still useful, because in the end we haven’t changed our names. However, in the digital age we continue to change how the names that we construct are stored, transferred, displayed and used. I think that this means that parts of the NCA Rules are no longer so helpful.

Hyphenated and compound surnames

This is a particular problem as far as I’m concerned. If you want to enter the name of William Henry Fox Talbot, the Rules propose using Talbot | William Henry | Fox. You can cross-reference to ‘Fox Talbot’. However, in modern databases and formats like XML, you are in danger of ending up with:

Surname: Talbot
Forenames: William Henry Fox

In terms of archival catalogues, this may not be so bad. If there is a search by name, it usually searches across the whole name, so ‘Fox Talbot’ as a search is likely to bring back the record. The display of the name may be Talbot, William Henry Fox, but a researcher is likely to understand who that is from the context of the description. Humans are generally good at interpretation through context.

However, for those of us pushing forward with the principles of joined-up data and moving towards the ideal of Linked Data (even if we don’t fully get there), this structure is a problem. In the Archives Hub we could end up with:

<persname>
<surname>Talbot</surname>
<forename>William Henry Fox</forename>
<dates>1800-1877</dates>
</persname>

Clearly, this is not correct, and it becomes harder to connect it with other instances of the name. As stated above, if we all used agreed PIDs it would not be such a problem, e.g.

<persname authfilenumber="https://viaf.org/viaf/54325833" source="viaf">
<surname>Talbot</surname>
<forename>William Henry Fox</forename>
<dates>1800-1877</dates>
</persname>
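To show what this buys you in processing terms, here is a small Python sketch using the standard library XML parser. It is only an illustration, not how the Archives Hub processes names: the `match_key` function and its fallback key format are my own assumptions. If an identifier is present it is used directly; otherwise the only option is a key built from the name parts, which is exactly where the ‘Fox Talbot’ structure causes trouble.

```python
import xml.etree.ElementTree as ET

SAMPLE = """<persname authfilenumber="https://viaf.org/viaf/54325833" source="viaf">
<surname>Talbot</surname>
<forename>William Henry Fox</forename>
<dates>1800-1877</dates>
</persname>"""


def match_key(persname_xml):
    """Return the identifier if present, else a fallback key built from the name parts."""
    el = ET.fromstring(persname_xml)
    pid = el.get("authfilenumber")
    if pid:
        return ("pid", pid)
    # Hypothetical fallback: lower-cased surname|forename|dates.
    parts = [
        (el.findtext(tag) or "").strip().lower()
        for tag in ("surname", "forename", "dates")
    ]
    return ("name", "|".join(parts))


print(match_key(SAMPLE))
```

Two records with the same identifier get the same key however the surname and forenames were split; two records without one only match if the name was structured in the same way in both.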

But applying these PIDs (even if we do manage to agree what they are) to all our catalogues retrospectively... well, that’s a bit of a job. And it would require the kind of analysis of names that much prefers semantically well-structured names, so kind of a catch-22.

That leaves the recommended route of stating that the ‘entry element’ for the surname is, well, the surname. Hyphenated surnames are the same. NCA Rules plumps for “Lewis” as the entry element for Cecil Day-Lewis. I would argue for it being “Day-Lewis”.

I think there is a similar issue with prefixes such as ‘Du’ and ‘Van’. Putting Daphne Du Maurier under ‘Maurier’ is not right…

Being part of the wider world

One reason ‘Maurier, Daphne Du’ is not right is clear when you look at http://viaf.org/viaf/24600806. This is the Virtual International Authority File entry for Daphne Du Maurier. Only the Lebanese National Library has gone for ‘Maurier, Daphne Du’. Of course, the name has still been matched with the others, so no harm done in a sense, at least on VIAF. But it doesn’t really help matters to be out of sync with everyone else where names are concerned.

VIAF is the Virtual International Authority File, and it is a good place to start when thinking about persistent unique identifiers and the benefits of data join-up. It is not perfect, but what it does is to push things towards interlinked data and to enable the kind of connectivity that Linked Data is after. Other authority files are available (which is part of the problem), but VIAF is widely used, it sources from many countries, and it is fairly comprehensive.

Going back to Beatrice Webb. She is also in VIAF: http://viaf.org/viaf/86607236. Just as with Daphne Du Maurier, you can see how all the variations in the name have been brought together. But there is more value to VIAF than this. It also brings in other data. As well as related names, related works and publishers it includes links to Wikipedia in a whole range of languages. It also records the ISNI (International Standard Name Identifier) and WorldCat Identifier.

All of these links provide the potential for any data at those destinations to be brought together. If you go to the English page on Wikipedia for Beatrice Webb, that includes content sourced from Wikidata: https://www.wikidata.org/wiki/Q242666 – another instance of sharing information across different services.

Going back to ISAAR(CPF), the standard states that repositories “can more easily share or link contextual information about this source if it has been maintained in a standardized manner. Such standardization is of particular international benefit when the sharing or linking of contextual information is likely to cross national boundaries”. But as the name entry itself is down to national standards, the question is whether the NCA Rules do encourage standardisation. I would say that they do, on the whole, but with caveats, including those mentioned above. You may have others. I know that some archivists are not happy with the treatment of women’s names, such as the advice: “A woman who marries and adopts her husband’s surname is to be entered under that name.” I tend to think that it is important to include the maiden name, and I think we should consider both linking up data (links from information created before the person was married, for example) and also how end users will search – rules are no good if they act against people actually finding the information that they want.

Epithet and structure

Epithets are used by archivists, but most domains do not use them. The library world does not add epithets. We like them for adding context, and they are often very useful. However, they do add to the level of variation considerably. For our Names Project we will probably exclude the epithet from name matching (if we can – they are not always easy to isolate). With time and tenacity, you could utilise them to help with matching, but one of the challenges with algorithmic solutions is that you have to draw the line according to your resources. Epithets are really useful, but we can never hope to standardise them. What we really need to do is to identify them as part of the structure. In EAC-CPF (the XML standard for name entities that is based upon ISAAR(CPF)), information typically included in epithets is separated out semantically:

<nameEntry>
<part localType="surname">Emberton</part>
<part localType="forename">Joseph</part>
</nameEntry>

And then also included in this entry:

<occupation>
<term>Architect</term>
</occupation>

This is perfect. You can display the information together, or separately; you can search them together, or separately. Records in Context (RiC) has sought to provide a conceptual model that brings together ICA archival standards. It is not a set of cataloguing guidelines, but it does use the language of structured data, and to some degree Linked Data – names are ‘entities’ and they have properties and relations with other entities. It encourages the idea of separating out things like occupation (often part of our epithet) from the agent (person) so that you can, for example, link one occupation to many people. This is more feasible if we create the entries in a structured way so that you can separate these two pieces of information (a person pursues an occupation / an occupation is pursued by a person), but often we don’t do this, and our cataloguing systems don’t help us to do this.
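To make the ‘together, or separately’ point concrete, here is a minimal Python sketch. It assumes a simplified fragment: the `record` wrapper element is hypothetical, and a real EAC-CPF file has namespaces and more structure than this. The function names and the display format are likewise just illustrations.

```python
import xml.etree.ElementTree as ET

# Simplified fragment; the <record> wrapper is hypothetical, not EAC-CPF.
FRAGMENT = """<record>
  <nameEntry>
    <part localType="surname">Emberton</part>
    <part localType="forename">Joseph</part>
  </nameEntry>
  <occupation>
    <term>Architect</term>
  </occupation>
</record>"""


def read_entity(xml_text):
    """Return the name parts (keyed by localType) and the occupation terms."""
    root = ET.fromstring(xml_text)
    parts = {p.get("localType"): (p.text or "").strip() for p in root.iter("part")}
    occupations = [(t.text or "").strip() for t in root.findall("./occupation/term")]
    return parts, occupations


def display_name(parts, occupations, with_epithet=True):
    """Build a display string, with or without the occupation as an epithet."""
    name = f"{parts['surname']}, {parts['forename']}"
    if with_epithet and occupations:
        name += ", " + " and ".join(o.lower() for o in occupations)
    return name


parts, occupations = read_entity(FRAGMENT)
print(display_name(parts, occupations))         # name with epithet, for display
print(display_name(parts, occupations, False))  # name only, e.g. for matching
```

Because the occupation is held separately, it can be left out of name matching and still shown to the researcher, without anyone having to pick the epithet apart from a single string.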

Dividers

In many ways these are part of a name for the purposes of processing, but dividers are not really covered in most standards. Standards tend to cop out by using pipes: Charles I | 1600-1649 | King of Great Britain and Ireland. This is an agnostic stance – the dividers are up to you. NCA Rules: “In the Rules there are no mandatory conventions for punctuation and abbreviation. These will continue to conform to the house style of each repository.” For house style read “everyone do what they prefer”.

This has meant that we have a great and interesting variety of dividers. The divider diversity can be overcome programmatically, but it is still another complication in terms of consistency. Have a look at the names at the bottom of this record: https://archiveshub.jisc.ac.uk/data/gb982-sww. The problems are *not* because Aberystwyth have entered them incorrectly – the data is fine; they are because this record was created in the long-past days (2010) of the Archives Hub’s first online data creation tool, which was fairly basic. We then attempted some global normalisation work on these older descriptions…and there’s the rub. If you write something to say ‘turn X into Y’ that usually works fine. But the more complicated it gets, the harder it is to satisfy all the data, so to speak. It is more like ‘turn P, Q, R, S, T, U, V, W, X into Y, but if you come across A, B or C turn them into Z’. With the above example the excess of dividers is because there is punctuation within the XML record itself, but we also apply punctuation during display, as most records don’t include it. We are now working on more effective ways to standardise (which is a slow process because we don’t have many staff, whilst we also have loads of things clamouring for attention). We could have a recognised and agreed use of dividers, e.g.:

Churchill, Sir Winston Leonard Spencer, 1874-1965, Knight, statesman and historian

Churchill, Sir Winston Leonard Spencer (1874-1965), Knight, statesman and historian

Dates in brackets seems to be the most common approach in archives, although maybe less so in other domains. Both these names can be easily matched – the dividers are not a problem. But it can get harder with various combinations of pre-titles, titles, epithets, born, died, floruit, question marks, nee, square brackets, commas, stops, semi-colons and colons. However, the ideal is that the parts of the name are created separately, and then displayed as preferred, which I come back to below.
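For simple cases like the two Churchill entries, the divider differences can be ironed out with very little code. The following Python sketch is one possible approach, not the normalisation the Archives Hub actually runs, and the choice of which characters count as dividers is an assumption. It reduces each name to a key in which dividers are ignored.

```python
import re


def normalise(name):
    """Reduce a name string to a divider-independent matching key."""
    key = name.lower()
    # Treat common dividers (pipes, commas, brackets, stops, colons) as spaces.
    key = re.sub(r"[|,;:()\[\].]", " ", key)
    # Collapse the runs of whitespace left behind.
    key = re.sub(r"\s+", " ", key).strip()
    return key


a = "Churchill, Sir Winston Leonard Spencer, 1874-1965, Knight, statesman and historian"
b = "Churchill, Sir Winston Leonard Spencer (1874-1965), Knight, statesman and historian"

print(normalise(a) == normalise(b))
```

This only deals with punctuation. It does nothing about ‘b.’, ‘fl.’, ‘nee’ or a missing epithet, which is where the rules start to multiply.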

So, what should we do?

My best advice to anyone creating names is to follow ISAAR(CPF) and to use NCA Rules, but also think about a name in a broader context and be aware of international standards and identifiers – it is great if you can include recognised identifiers, such as VIAF, ISNI or ORCID. We want to share data with the outside world (outside of the archival domain) so we don’t want to be too focused on archival standards and ignore web standards and common ways of doing things. We have to work within the systems we have, so sometimes you cannot structure a name as you would like, but aim for consistency, semantic structure as far as you are able, and practices that are not out of step with everyone else’s practices. This means that we can more easily join our data up and create a space that researchers can navigate in infinite ways and for infinite purposes.

Survey: Digital Skills in the Archives Sector

January 25, 2019 / Ben

Do you work in the archives sector? What are your digital strengths? What digital support and training do you still need?

The National Archives and Jisc have created a survey to hear the latest thoughts of those working within (or who have a professional connection to) the sector on all aspects of their digital services and capabilities:

https://www.snapsurveys.com/wh/s.asp?k=154765868779

The results of this survey will directly inform The National Archives’ programme of work in support of the sector over the next 3-5 years. In collaboration with Jisc, The National Archives want to tackle various challenges at the intersection of archival practice and digital technologies, as well as to celebrate digital excellence in archives whenever possible.

The survey takes around 20-30 minutes to complete but not all questions are compulsory and many respondents will only need to answer certain applicable sections.

If you have any questions, please contact asd@nationalarchives.gov.uk. Thanks in advance for your help.

The Website for the New Archives Hub

February 16, 2017 / Jane Stevenson

screenshot of archives hub homepage — Archives Hub homepage

The back end of a new system usually involves a huge amount of work and this was very much the case for the Archives Hub, where we changed our whole workflow and approach to data processing (see The Building Blocks of the new Archives Hub), but it is the front end that people see and react to; the website is a reflection of the back end, as well as involving its own user experience challenges, and it reflects the reality of change to most of our users.

We worked closely with Knowledge Integration in the development of the system, and with Gooii in the design and implementation of the front end, and Sero ran some focus groups for us, testing out a series of wireframe designs on users. Our intention was to take full advantage of the new data model and processing workflow in what we provided for our users. This post explains some of the priorities and design decisions that we made. Additional posts will cover some of the areas that we haven’t included here, such as the types of description (collections, themed collections, repositories) and our plan to introduce a proximity search and a browse.

Speed is of the Essence

Faster response times were absolutely essential and, to that end, an approach based on an enterprise search solution (in this case Elasticsearch) was the starting point. However, in addition to the underlying search technology, the design of the data model and indexing structure had a significant impact on system performance and response times, and this was key to the architecture that Knowledge Integration implemented. With the previous system there was only the concept of the ‘archive’ (EAD document) as a whole, which meant that the whole document structure was always delivered to the user whatever part of it they were actually interested in, creating a large overhead for both processing and bandwidth. In the new system, each EAD record is broken down into many separate sections which are each indexed separately, so that the specific section in which there is a search match can be delivered immediately to the user.

To illustrate this with an example:-

A researcher searches for content relating to ‘industrial revolution’ and this scores a hit on a single item 5 levels down in the archive hierarchy. With the previous system the whole archive in which the match occurs would be delivered to the user and then this specific section would be rendered from within the whole document, meaning that the result could not be shown until the whole archive had been loaded. If the results list included a number of very large archives the response time increased accordingly.

In the new system, the matching single item ‘component’ is delivered to the user immediately, when viewed in either the result list or on the detail page, as the ability to deliver the result is decoupled from archive size. In addition, for the detail page, a summary of the structure of the archive is then built around the item to provide both the context and allow easy navigation.
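The difference between the two approaches can be sketched roughly as follows. This is not the Knowledge Integration implementation: it is a simplified illustration that assumes an EAD-like document without namespaces, using unnumbered `<c>` components, and a naive title match standing in for the search engine. The sample description is invented.

```python
import xml.etree.ElementTree as ET

# An invented, heavily simplified EAD-like description.
SAMPLE_EAD = """
<ead>
  <archdesc level="collection">
    <did><unittitle>Example Mill Archive</unittitle></did>
    <dsc>
      <c level="series">
        <did><unittitle>Correspondence</unittitle></did>
        <c level="item">
          <did><unittitle>Letter on the industrial revolution</unittitle></did>
        </c>
      </c>
    </dsc>
  </archdesc>
</ead>
"""


def flatten(node, ancestors=()):
    """Yield one small, separately indexable record per unit of description."""
    title = node.findtext("did/unittitle", default="").strip()
    yield {
        "title": title,
        "level": node.get("level"),
        "ancestors": list(ancestors),  # enough context for a breadcrumb
    }
    # Components sit directly under a <c>, or under <dsc> at the top level.
    for child in node.findall("c") + node.findall("dsc/c"):
        yield from flatten(child, ancestors + (title,))


def search(records, phrase):
    """Return only the matching records, never the whole archive."""
    phrase = phrase.lower()
    return [r for r in records if phrase in r["title"].lower()]


root = ET.fromstring(SAMPLE_EAD)
records = list(flatten(root.find("archdesc")))
hits = search(records, "industrial revolution")
```

Each hit is a small record carrying its own ancestor titles, so the cost of returning it does not grow with the size of the archive it belongs to.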

Even with the improvements to response times, the tree representation (which does have to present a summary of the whole structure) takes a while to render for some very large multi-level descriptions, but the description itself always loads instantly. This means that the researcher can always see they have a result immediately and view it, and then the archival structure is delivered (after a short pause for very large archives), which gives the result context within the archive as a whole.

The system has been designed to allow for growth in both the number of contributors we can support and the number of end-users, and will also improve our ability to syndicate the content to Archives Portal Europe and to deliver contributors’ own ‘micro sites’.

Look and Feel

Some of the feedback that we received suggested that the old website design was welcoming, but didn’t feel professional or academic enough – maybe trying to be a bit too cuddly. We still wanted to make the site friendly and engaging, and I think we achieved this, but we also wanted to make it more professional looking, showing the Hub as an academic research tool. It was also important to show that the Archives Hub is a Jisc service, so the design Gooii created was based upon the Jisc pattern library that we were required to use in order to fit in with other Jisc sites.

We have tried to maintain a friendly and informal tone along with use of cleaner lines and blocks, and a more visually up-to-date feel. We have a set of consistent icons, on/off buttons and use of show/hide, particularly with the filter. This helps to keep an uncluttered appearance whilst giving the user many options for navigation and filtering.

In response to feedback, we want to provide more help with navigating through the service, for those that would like some guidance. The homepage includes some ‘start exploring’ suggestions for topics, to help get inexperienced researchers started, and we are currently looking at the whole ‘researching’ section and how we can improve that to work for all types of users.

Navigating

We wanted the Hub to work well with a fairly broad search that casts the net quite widely. This type of search is often carried out by a user who is less experienced in using archives, or is new to the Hub, and it can produce a rather overwhelming number of results. We have tried to facilitate the onward journey of the user through judicious use of filtering options. In many ways we felt that filtering was more important than advanced search in the website design, as our research has shown that people tend to drill down from a more general starting point rather than carry out a very specific search right from the off. The filter panel is up-front, although it can be hidden/shown as desired, and it allows for drilling down by repository, subject, creator, date, level and digital content.

Another way that we have tried to help the end user is by using typeahead to suggest search results. When Gooii suggested this, we gave it some thought, as we were concerned that the user might think the suggestions were the ‘best’ matches, but typeahead suggestions are quite a common device on the web, and we felt that they might give some people a way in, from where they could easily navigate through further descriptions.

Hub website example of type ahead results — A search for ‘design’ with suggested results

The suggestions may help users to understand the sort of collections that are described on the Hub. We know that some users are not really aware of what ‘archives’ means in the context of a service like the Archives Hub, so this may help orientate them.

Suggested results also help to explain what the categories of results are – themes and locations are suggested as well as collection descriptions.

We thought about the usability of the hit list. In the feedback we received there was no clear preference for what users want in a hit list, and so we decided to implement a brief view, which just provides title and date, for maximum number of results, and also an expanded view, with location, name of creator, extent and language, so that the user can get a better idea of the materials being described just from scanning through the hit list.

An example of a hit list result in expanded mode — Expanded mode gives the user more information

With the above example, the title and date alone do not give much information, which is particularly common with descriptions of series or items, so the name of creator adds real value to the result.

Seeing the Wood Through the Trees

The hierarchical nature of archives is always a challenge; a challenge for cataloguing, processing and presentation. In terms of presentation, we were quite excited by the prospect of trying something a bit different with the new Hub design. This is where the ‘mini map’ came about. It was a very early suggestion by K-Int to have something that could help to orientate the user when they suddenly found themselves within a large hierarchical description. Gooii took the idea and created a number of wireframes to illustrate it for our focus groups.

For instance, if a user searches on Google for ‘conrad slater jodrell bank’ then they get a link to the Hub entry:

screenshot of google search result for a Hub description — Result of a search on Google

The user may never have used archives, or the Archives Hub before. But if they click on this link, taking them directly to material that sits within a hierarchical description, we wanted them to get an immediate context.

screen shot of one entry in the Jodrell Bank Archive — Jodrell Bank Observatory Archives: Conrad Slater Files

The page shows the description itself, the breadcrumb to the top level, the place in the tree where these particular files are described and a mini map that gives an instant indication of where this entry is in the whole. It is intended (1) to give a basic message for those who are not familiar with archive collections – ‘there is lots more stuff in this collection’ and (2) to provide the user with a clearly understandable expanding tree for navigation through this collection.

One of the decisions we made, illustrated here, was to show where the material is held at every level, for every unit of description. The information is only actually included at the top level in the description itself, but we can easily cascade it down. This is a good illustration of where the approach to displaying archive descriptions needs to be appropriate for the Web – if a user comes straight into a series or item, you need to give context at that level and not just at the top level.
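Cascading a value down a hierarchy can be sketched in a few lines. The structure below (nested dictionaries with `title`, `repository` and `children` keys) is an assumption made for the example rather than the Hub's actual data model, and the repository name is a placeholder.

```python
def cascade(unit, inherited=None):
    """Return a copy of the tree in which every unit states where it is held.

    A unit keeps its own 'repository' if it has one; otherwise it takes
    the value from the nearest ancestor that does.
    """
    repository = unit.get("repository") or inherited
    return {
        "title": unit["title"],
        "repository": repository,
        "children": [
            cascade(child, repository) for child in unit.get("children", [])
        ],
    }


# Placeholder description: only the top level says where the material is held.
description = {
    "title": "Example Observatory Archives",
    "repository": "Example University Library",
    "children": [
        {
            "title": "Staff files",
            "children": [{"title": "Files of an example staff member"}],
        }
    ],
}

resolved = cascade(description)
file_level = resolved["children"][0]["children"][0]
```

After the cascade, a user landing directly on the file-level entry can be shown the holding repository without the display having to walk back up to the top level.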

The design also works well for searches within large hierarchical descriptions.

screenshot showing a 'search within' with highlighted results — Search for ‘bicycles’ within the Co-operative Union Photographic Collection

The user can immediately get a sense of whether the search has thrown up substantial results or not. In the example above you can see that there are some references to ‘bicycles’ but only early on in the description. In the example below, the search for ‘frost on sunday’ shows that there are many references within the Ronnie Barker Collection.

screenshot showing search within with lots of highlighted results — Search within the Ronnie Barker Collection for ‘frost on sunday’

One of the challenges for any archive interface is to ensure that it works for experienced users and first-time users. We hope that the way we have implemented navigation and searching means that we have fulfilled this aim reasonably well.

Small is Beautiful

screenshot showing the Hub search on a mobile phone — The Archives Hub on an iPhone

The old site did not work well on mobile devices. It was created before mobile became massive, and it is quite hard to retrospectively fit a design to be responsive to different devices. Gooii started out with the intention of creating a responsive design, so that it renders well on different sized screens. It requires quite a bit of compromise, because rendering complex multi-level hierarchies and very detailed catalogues on a very small screen is not at all easy. It may be best to change or remove some aspects of functionality in order to ensure the site makes sense. For example, the mobile display does not open the filter by default, as this would push the results down the page. But the user can open the filter and use the faceted search if they choose to do so.

We are particularly pleased that this has been achieved, as something like 30% of Hub use is on mobiles and tablets now, and the basic search and navigation needs to be effective.

graph showing use of desk, mobile and tablet devices on the Hub — Devices used to view the Hub site over a three month period

In the above graph, the orange line is desktop, the green is mobile and the purple is tablet. (The dip around the end of December is due to problems setting up the Analytics reporting.)

Cutting Our Cloth

One of the lessons we have learnt over 15 years of working on the Archives Hub is that you can dream up all of the interface ideas that you like, but in the end what you can implement successfully comes down to the data. We had many suggestions from contributors and researchers about what we could implement, but oftentimes these ideas will not work in practice because of the variations in the descriptions.

We thought about implementing a search for larger, medium sized or smaller collections, but you would need consistent ‘extent’ data, and we don’t have that because archivists don’t use any kind of controlled vocabulary for extent, so it is not something we can do.

When we were running focus groups, we talked about searching by level – collection, series, sub-series, file, item, etc. For some contributors a search by a specific level would be useful, but we could only implement three levels – collection (or ‘top level’), item (which includes ‘piece’) and then everything between these, because the ‘in-between’ levels don’t lend themselves to clear categorisation. The way levels work in archival description, and the way they are interpreted by repositories, means we had to take a practical view of what was achievable.
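The three-way grouping described above could be expressed along these lines. This is a sketch only: it assumes the level is available as a text label and that the top level is labelled ‘collection’, whereas in practice the top level may need to be identified from its position in the hierarchy rather than its label.

```python
TOP_LABELS = {"collection"}      # collection, or 'top level'
ITEM_LABELS = {"item", "piece"}  # 'item' includes 'piece'


def level_bucket(level: str) -> str:
    """Map a level label onto one of the three searchable groups."""
    label = level.strip().lower()
    if label in TOP_LABELS:
        return "collection"
    if label in ITEM_LABELS:
        return "item"
    # Series, sub-series, file and anything else fall between the two.
    return "in-between"


labels = ["Collection", "series", "sub-series", "file", "Piece", "item"]
buckets = [level_bucket(label) for label in labels]
```

Anything that is not clearly the top or the bottom of the hierarchy lands in the middle group, which avoids having to decide how one repository's ‘file’ relates to another's ‘sub-series’.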

We still aren’t completely sold on how we indicate digital content, but there are particular challenges with this. Digital content can be images that are embedded within the description, links to images, or links to any other digital content imaginable. So, you can’t just use an image icon, because that does not represent text or audio. We ended up simply using a tick to indicate that there is digital content of some sort. However, one large collection may have links to only one or two digital items, so in that case the tick may raise false expectations. But you can hardly say ‘includes digital content, but not very much, so don’t get too excited’. There is room for more thought about our whole approach to digital content on the Hub, as we get more links to digital surrogates and descriptions of born-digital collections.

Statistics

The outward indication of a more successful site is that use goes up. The use of statistics to give an indication of value is fraught with problems. Does the number of clicks represent value? Might more clicks indicate a poorer user interface design? Or might they indicate that users find the site more engaging? Does a user looking at only one description really gain less value than a user looking at ten descriptions? Clearly statistics can only ever be seen as one measure of value, and they need to be used with caution. However, the reality is that an upward graph is always welcomed! Therefore we are pleased to see that overall use of the website is up around 32% compared to this period during the previous year.

Jan 2016 (the orange line) and Jan 2017 (the blue line), which shows typical daily use above 2,000 page views.

Feedback

We are pleased to say that the site has been very well received…

“The new site is wonderful. I am so impressed with its speed and functionality, as well as its clean, modern look.” (University Archivist)

“…there are so many other features that I could pick out, such as the ability to download XML and the direct link generator for components as well as collections, and the ‘start exploring’ feature.” (University Archivist)

“Brand new Archives Hub looks great. Love how the ‘explorer themes’ connect physically separated collections” (Specialist Repository Head of Collections)

“A phenomenal achievement!” (Twitter follower)

With thanks to Rob Tice from Knowledge Integration for his input to this post.

Archives Portal Europe Country Managers’ Meeting, 30 Nov 2016

November 30, 2016 / Jane Stevenson / 1 Comment

This is a report of a meeting of the Archives Portal Europe Country Managers in Slovakia, 30 November 2016, with some comments and views from the UK and Archives Hub perspective.

APE-CMmeeting-30Nov2016 — APE Country Managers meeting, Bratislava, 30 Nov 2016

Context

The APE Foundation (APEF), which was created following the completion of the APEx project (an EC funded project to maintain and develop the portal running from 2012 to 2015), is now taking APE forward. It has a Governing Board and working groups for standards, technical issues and PR/comms. The APEF has a coordinator and three technical/systems staff as well as an outreach officer. Institutions are invited to become associate members, to help support the portal and its aims.

Things are going well for APEF, with a profit recorded for 2016, and growing associate membership. APEF continues to be busy with development of APE, and is endeavouring to encourage cooperation and collaboration as a means to seize opportunities to keep developing and to take advantage of EU funding opportunities.

Current Development

The APEF has the support of the Ministry of Culture in the Netherlands and has a close working relationship with the Netherlands national aggregation project, the ‘DTR’, which is key to the current APE development phase. The idea is to use the framework of APE for the DTR, benefitting both parties. Cooperation with DTR involves three main areas:

•   building an API to open up the functionality of APE to third parties (and to enable the DTR to harvest the APE data from The Netherlands)
•   improving the uploading and processing of EAC-CPF
•   enabling the uploading and processing of ‘additional finding aids’

The API has been developed so that specific requests can be sent to fetch selected data. It is possible to do this for EAD (descriptions) and EAC-CPF (names). The API provides raw data as well as processed results. There have been issues around things like the relevance ordering of results, which is a substantial area of work that is being addressed.

The API raises implications in terms of the data, as the Content Provider Agreement that APE institutions sign gives control of the data to the contributors. So, the API had to be implemented in a way that enables each contributor to give explicit permission for the data to be available as CC0 (fully open data). This means that if a third party uses the API to grab data, they only get data from a country that has given this permission. APEF has introduced an API key, which is a little controversial, as it could be argued that it is a barrier to complete openness, but it does enable the Foundation to monitor use, which is useful for impact, for checking correct use, and blocking those who misuse the API. This information is not made open, but it is stored for impact and security purposes.
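The permission and key checks described above might look something like this in outline. Everything here is hypothetical: the key, the country labels, the record structure and the function name are invented for illustration and do not describe the actual APE API.

```python
from collections import Counter

# Hypothetical configuration, invented for this sketch.
CC0_COUNTRIES = {"Country A", "Country C"}  # countries that have opted in
VALID_KEYS = {"demo-key"}                   # API keys that have been issued
usage = Counter()                           # requests per key, kept privately

RECORDS = [
    {"id": "a1", "country": "Country A", "title": "Example finding aid A"},
    {"id": "b2", "country": "Country B", "title": "Example finding aid B"},
    {"id": "c3", "country": "Country C", "title": "Example finding aid C"},
]


def fetch(api_key, records=RECORDS):
    """Return only records from countries that have given CC0 permission."""
    if api_key not in VALID_KEYS:
        # Unknown or blocked keys get nothing.
        raise PermissionError("unknown API key")
    usage[api_key] += 1  # recorded for impact and security monitoring
    return [r for r in records if r["country"] in CC0_COUNTRIES]


open_records = fetch("demo-key")
```

The key does not restrict what a legitimate user can see; it simply ties each request to a known user so that use can be counted and misuse blocked.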

There was some discussion at the meeting around open data and use of CC0. In countries such as Switzerland it is not permitted to open up data through a CC0 licence, and in fact, it may be true to say that CC0 is not the appropriate licence for archival descriptions (the question of whether any copyright can exist in them is not clear) and a public domain licence is more appropriate. When working across European countries there are variations in approaches to open data. The situation is complicated because the application of CC0 for APE data is not explicit, so any licence that a country has attached to their data will effectively be exported with the data and you may get a kind of licence clash. But the feeling is that for practical purposes if the data is available through an API, developers will expect it to be fully open and use it with that in mind.

There has been work to look at ways to take EAC-CPF from a whole set of institutions more easily, which would be useful for the UK, where we have many EAC-CPF descriptions created by SNAC. Work to bring together more than one name description for the same person has not started, and is not scheduled for the current period of development, but the emphasis is likely to be on better connectivity between variations of a name rather than having one description per name.

Additional finding aids offer the opportunity to add different types of information to APE. You may, for example, have a register of artists or ships’ logs, or you may have started out with a set of cards with names A-Z, relating to your archive in some way. You could describe these in one EAD description, and link this to the main description. In the current implementation of EAD2002 in APE this would have to go into a table in Scope & Content and in-line tagging is not allowed to identify parts of the data. This leads to limitations with how to search by name. But then EAD3 gives the option to add more information on events and names. You can divide a name up into parts, which allows for better searching. Therefore APE is developing a new means to fetch and process EAD3 for the additional finding aids alongside EAD2002 for ‘standard’ finding aids. In conjunction with this, the interface needs to be changed to present the new names within the search.

The work on additional finding aids may not be so relevant for the Archives Hub as a contributor to APE, as the Hub cannot look at taking on ‘other finding aids’, with all the potential variations that implies. However, institutions could potentially log into APE themselves and upload these different types of descriptions.

APE and Europeana

There was quite a bit to talk about concerning APE and Europeana. The APEF is a full partner of the Europeana Digital Services Infrastructure 2 (DSI2) project (currently running 2016/2017). The project involves work on the structure for Europeana, maintaining and running data and aggregation services, improving data quality, and optimising relations with data partners. The work APE is involved with includes improving the current workflow for harvest/ingest of data, and also evaluating what has already been ingested into Europeana.

Europeana seems to have ongoing problems dealing with multi-level EAD descriptions, compounded by the limitation that they only represent digital materials. The approach is not a good fit for archives. Europeana have also introduced both a new publishing framework and different rights statements.

The new publishing framework is a 4 tier approach where you can think of Europeana as a more basic tool for promoting your archives, or something that is a platform for reuse. It refers to the digital materials in terms of whether they are a certain number of pixels, e.g. 800 pixels wide for thumbnails (adding thumbnails means using Europeana as a ‘showcase’) and 1,200 pixels wide (high quality and reusable, using Europeana as a distribution and reuse platform). The idea of trying to get ‘quality’ images seems good, but in practice I wonder if it simply raises the barrier too much.

The new Rights statements require institutions to be very clear about the rights they want to apply to digital content. The likely conclusion of all this from the point of view of the Archives Hub is that we cannot grapple with adding to Europeana on behalf of all of our contributors, and therefore individual contributors will have to take this on board themselves. It will be possible for contributors to log into the APE dashboard (when it has been changed to reflect the Europeana new rights) and engage with this, selecting the finding aids, the preferred rights statements, and ensuring that thumbnail and reusable images meet the requirements. Once the descriptions are in APE they can then be supplied to Europeana. The resulting display in Europeana should be checked, to ensure that it is appropriate.

We discussed this approach, and concluded that maybe APE contributors could see Europeana as something that they might use to showcase their content, so, think of it on our terms, as archives, and how it might help us. There is no obligation to contribute, so it is a case of making the decision whether it is worth representing the best visual archives through Europeana or whether this approach takes more effort than the value that we get out of it. After 10 years of working with Europeana, and not really getting proper representation of archives, the idea of finding a successful way of contributing archives is appealing, but it seems to me that the amount of effort required is going to be significant, and I’m not sure if the impact is enough to warrant it.

Europeana are working on a new way of automated and real time ingest from aggregators and content providers, but this may take another year or more to become fully operational.

Outreach and CM Reports

Towards the end of the day we had a presentation from the new PR/communications officer. Having someone to encourage, co-ordinate and develop ideas for dissemination should prove invaluable for APE. The Facebook page is full of APE activities and related news and events. You can tweet and use the hashtag #archivesportaleurope if you would like to make APE aware of anything.

We ended the day with reports from country managers, which, as always, threw up many issues, challenges, solutions, questions and answers. Plenty to set up APEF for another busy year!

Archives Portal Europe builds firm foundations

July 7, 2016 / Jane Stevenson / 1 Comment

On 8th June 2016 I attended the first Country Managers’ meeting of the newly formed Foundation of the Archives Portal Europe (APEF) at the National Archives of the Netherlands (Nationaal Archief).

The Foundation has been formed on the basis of partnerships between European countries. The current Foundation partners are: Belgium, Denmark, Luxembourg, The Netherlands, Spain, Sweden, Switzerland, Estonia, France, Germany, Hungary, Italy, Latvia, Norway and Slovenia. All of these countries are members of the ‘Assembly of Associates’. Negotiations are proceeding with Bulgaria, Greece, Liechtenstein, Lithuania, Malta, Poland, Slovakia and the UK. Some countries are not yet in a position to become members, mainly due to financial and administrative issues, but the prospects currently look very positive, with a great willingness to take the Portal forwards and continue the valuable networking that has been built up over the past decade. Contributing to the Portal does not incur financial contribution; the Assembly of Associates is separate from this, and the idea is that countries (National Archives or bodies with an educational/research remit) sign up to the principles of APE and the APE Foundation – to collaborate and share experiences and ideas, and to make European archives as accessible as possible.

The Governing Board of the Foundation is working with potential partners to reach agreements on a combination of financial and in-kind contributions. It’s also working on long term strategy documents. It has established working groups for Standards and PR & Communications and it has set up cooperation with the Dutch DTR project (Digitale Taken Rijksarchieven / Digital Processes in State Archives) and with Europeana. The cooperation with the DTR project has been a major boost, as both projects are working towards similar goals, and therefore work effort can be shared, particularly development work.

Current tasks for the APEF:

•   Building an API to open up the functionality of the Archives Portal Europe to third parties and to implement the possibility for the content providers to switch this option on or off in the Archives Portal Europe’s back-end.
•   Improving the uploading and processing of EAC-CPF records in the Archives Portal Europe and improving the way in which records creators’ information can be searched and found via the Archives Portal Europe’s front-end and via the API.
•   Enabling the uploading/processing of “additional finding aids (indexes)” in the Archives Portal Europe and making this additional information available via the Archives Portal Europe’s front-end and the API.

The above is in addition to the continuing work of getting more data into the Portal, supporting the country managers in working with repositories, and promoting the portal to researchers interested in using a Europe-wide search and discovery tool.

APEF will be a full partner in the Europeana DSI2 project, connecting the online collections of Europe’s cultural heritage institutions, which will start after the summer and will run for 16 months. Within this project APEF will focus on helping Europeana to develop the aggregation structure and provide quality data from the archives community to Europeana. A focus on quality will help to get archival data into Europeana in a way that works for all parties. There seems to be a focus from Europeana on the ‘treasures’ from the archives, and on images that ‘sell’ the archives more effectively. Whatever the rights and wrongs of this, it seems important to continue to work to expose archives through as many channels as we can, and for us in the UK, the advantages of contributing to the Archives Hub and thence seamlessly to APE and to Europeana, albeit selectively, are clear.

A substantial part of the meeting was dedicated to updates from countries, which gave us all a chance to find out what others are doing, from the building of a national archives portal in Slovakia to progress with OAI-PMH harvesting from various systems, such as ScopeArchiv, used in Switzerland and other countries. Many countries are also concerned with translations of various documents, such as the Content Provider Agreement, which is not something the UK has had to consider (although a Welsh translation would be a possibility).

We had a session looking at some of the more operational and functional tasks that need to be thought about in any complex system such as the APE system. We then had a general Q&A session. It was acknowledged that creating EAD from scratch is a barrier to contributing for many repositories. For the UK this is not really an issue, because we contribute Archives Hub descriptions. But of course it is an issue for the Hub: to find ways to help our contributors provide descriptions, especially if they are using a proprietary system. Our EAD Editor accounts for a large percentage of our data, and that creates the EAD without the requirement of understanding more than a few formatting tags.

The Archives Hub aims to set up harvesting of our contributors’ descriptions over the next year, thus ensuring that any descriptions contributed to us will automatically be uploaded to the Archives Portal Europe. (We currently have to upload on a per-contributor basis, which is not very efficient with over 300 contributors). We will soon be turning our attention to the selective digital content that can be provided by APE to Europeana. That will require an agreement from each institution in terms of the Europeana open data licence. As the Hub operates on the principles of open data, to encourage maximum exposure of our descriptions and promote UK archives, that should not be a problem.

With thanks to Wim van Dongen, APEF country manager coordinator / technical coordinator, who provided the minutes of the Country Managers’ meeting, which are partially reproduced here.

Connecting through defining people and relationships

June 11, 2015 / Jane Stevenson / 1 Comment

If, as a researcher, you search for ‘Jane Drew’, the celebrated architect and town planner, on the Archives Hub, amongst other things, you might discover a single item, “Letter from Jane B Drew to John and Myfanwy Piper”, a letter in the “Papers of John and Myfanwy Piper”.

You can see that it’s a letter in a collection at the Tate Gallery Archive. The description of the collection is an example of a good quality traditional archival catalogue, giving a fairly detailed listing of the content of this particular collection. But as a researcher you are really interested in just this one letter. You may ask yourself a number of questions, possibly starting with (1) Is this the Jane Drew I’m interested in? and then (2) What is the relationship between Jane Drew and John and Myfanwy Piper? You may well be able to find answers by accessing the letter itself, but at this stage you may just want to place this connection in the broader context of Jane Drew’s life and work. As a researcher, understanding how these people are connected may shed light on your research interests.

In this blog I want to think about this question of relationships. The fact is that archivists rarely provide structured information about relationships; if there is information, it is usually in the biographical history, which might outline key events and people in someone’s life, referring to their parents, work colleagues, friends, etc. The nature of the relationship is sometimes explicitly given, but often it is not. Our standards don’t really say much about relationships between the entities (people, organisations, places, etc) that we describe in our catalogues.

Going back to the Papers of John and Myfanwy Piper as an example, the biographical history includes the following:

[John] Piper began writing reviews from the late 1920s making a name for himself as a critic writing for periodicals like ‘The Listener’ and the ‘Architectural Review’. From 1935-1937 he assisted Myfanwy Evans, with the production of a quarterly review of contemporary European abstract painting called ‘Axis’. In 1937 Piper was commissioned by his friend John Betjeman to write the ‘Shell Guide to Oxfordshire’. Piper went on to write and provide photographs for a number of the guides as well as edit the series. In the same year John Piper married the writer Myfanwy Evans.

This is typical of a biographical history – useful historical information about the individual or organisation. Within this there is information we can potentially use to create explicit relationship information:

John Piper ‘worked with’ Myfanwy Evans
John Piper ‘was friends with’ John Betjeman
John Piper ‘worked for’ John Betjeman
John Piper ‘was married to’ Myfanwy Evans
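
As a rough illustration, the four statements above could be held as structured records rather than left inside the narrative. This is only a sketch: the `Relationship` class, its field names and the `involving` helper are my own assumptions and do not come from any archival standard. The dates are the ones given in the biographical history; the friendship has no date because the text does not give one.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Relationship:
    """One explicit statement linking two named people (hypothetical model)."""
    subject: str
    relation: str
    obj: str
    source: str                 # where the statement was found
    date: Optional[str] = None  # left empty when the source gives no date


SOURCE = "Papers of John and Myfanwy Piper, biographical history"

relationships = [
    Relationship("John Piper", "worked with", "Myfanwy Evans", SOURCE, "1935-1937"),
    Relationship("John Piper", "was friends with", "John Betjeman", SOURCE),
    Relationship("John Piper", "worked for", "John Betjeman", SOURCE, "1937"),
    Relationship("John Piper", "was married to", "Myfanwy Evans", SOURCE, "1937"),
]


def involving(name: str, statements: List[Relationship]) -> List[Relationship]:
    """Return every statement in which the named person appears on either side."""
    return [r for r in statements if name in (r.subject, r.obj)]


if __name__ == "__main__":
    for r in involving("John Betjeman", relationships):
        print(f"{r.subject} '{r.relation}' {r.obj} ({r.date or 'undated'})")
```

Even this small structure exposes the issues listed below: the names are plain strings and so still ambiguous, the relation labels are an uncontrolled vocabulary, and ‘was friends with’ is an interpretation of the phrase ‘his friend’.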

There are a number of issues to consider here:

How can we unambiguously identify the people?
How do we choose the vocabulary we use to define the relationships?
Do we try to include dates?
Is it reasonable for us to interpret relationships as ‘friendships’ or ‘collaborations’ if this is not actually explicit?

We are looking at some of these issues through our AHRC project, Exploring British Design. They are all issues that archivists need to explore in a debate around relationship information, but the first issue to consider is simply whether we should be thinking more about including this kind of relationship information in our archival finding aids. Is it something that would be of real value to end users? This issue is coming more to the fore as we start to think about implementing ISAAR (CPF) and working with EAC-CPF, and also as Linked Open Data gains traction.

In a recent (and well worth reading) article in the Journal of Contemporary Archival Studies, on the potential impact of EAC-CPF, K. M. Wisser reports the findings of a survey about relationship information. The survey received 208 responses from archivists/archives in the US. Wisser wrote: “The survey results indicate that the archival community has only just begun to consider relationships in the context of archival description and the role that explicit description of those relationships may play.”

As one respondent wrote:

“relationships are among the most important facets in a collection and deserve a high priority in description. One cannot understand the historical value of an event, person, or organization without knowing [the] relationship among and between them.”

One thing that really strikes me in Wisser’s findings is that archivists see relationships that are documented outside of the collection as almost as significant as those that are documented within the collection. Going back to our original topic of Jane Drew: who else did Jane Drew work with? Should we provide that information to our users, whether or not it is documented within the collection? Is our role to give as full an account as we can of Drew’s life and career? Is it to limit ourselves to what is within the collection?

Wisser’s survey asked respondents about the importance of relationship types. It is curious to me that archivists rated ‘collaborated with’ as a more important relationship than ‘studied with’; they rated a friendship as far more important when it was documented in the collection; and they rated ‘influenced by’ as generally not so important. I’m surprised that the respondents had such definite ideas about the relative importance of different types of relationships, especially when the majority appeared to agree with the importance of ‘objective cataloguing’.

In our Exploring British Design project, the work we did with researchers definitely confirmed to me the fairly self-evident observation that any relationship can be of major significance in research, even if it appears of minor significance within the archive, or indeed, within the literature in general. A brief collaboration may have been a crucial influence, a short friendship may have had hitherto unrealised impact, and anyway, the importance of the relationship depends upon the research you are doing. Researchers are not really aware of how challenging it is for us as information professionals to establish these kinds of relationships in ways that they can then access. But it is clear that this is the sort of connectivity they are after.

One of the challenges with documenting relationship types is that they can be hard to define. As Wisser notes:

“The concept of influence, however, proved the most problematic. Comments such as ‘influence is a squishy sort of relationship’ and ‘I think it would often be very difficult to prove that Entity A was influenced by Entity B’ indicate a notion of intangibility.”

The conclusion could be that we should leave well alone relationships that are hard to define. On the other hand, if we are in a position, as we research a collection, to highlight potential connections, that action could be of major value to a researcher, who may otherwise never know about a link that ends up being crucial to their particular research. The relationships that are easy to define are likely to have been defined already.

One thing that strikes me about the whole notion of introducing interpretation and opinion into cataloguing (a possible argument against defining relationships) is that the horse has pretty much bolted. I’ve looked at enough ‘objective’ descriptions to be aware that the names archivists choose to add as index terms are a choice; they inevitably reflect an opinion about which names are significant enough to add as index terms. And subjects are a similar case – some collections are indexed thoroughly, some not at all.

Aside from indexing, each person would create a different scope and content entry, including and excluding different information, and whether you call that subjective or not, it is certainly always selective. You could also argue that the level of detailed hierarchical cataloguing might indicate the relative importance of the collection. On the Archives Hub there are some collections catalogued in huge detail, and it is inevitable that researchers will assume these collections are particularly important.

All of these choices have implications for discoverability.

In Wisser’s survey, a significant proportion of respondents felt that the importance of a relationship should be based upon the use of the collection. But this, again, raises the question: When thinking about relationships, is the cataloguer reflecting the scope of the collection, or are they trying to give as full a picture as they can of the person or organisation? Are we within the world of the collection; or is the collection within the world?

The reason that I believe that we should think beyond the bounds of the collection content is that I think it promises much richer rewards for our users and encourages archives to be a major player within a broader landscape of information resources. I base my thinking on the premise that the researcher is primarily interested in their research topic, which is not likely to be an archive collection per se, but rather an event, a person, an organisation, a subject, and the way things are connected. I think archivists are still tending to think in terms of a document that describes a collection, rather than how to link the collection into the cultural heritage landscape, and even more broadly beyond that. I wonder if archivists don’t always think beyond the catalogues they currently create because the researchers they have contact with (who visit the archive) are already fairly confident they want to use that repository, or a particular archive within that repository. In other words, the researcher is already in their space. When I worked in a specialist archive, I thought about researchers discovering our archive as a whole (having an online presence) and then I thought about them using our collections (individual collections each with their own description); I didn’t think about how our collections could be seen as part of a whole information landscape.

The loudest – and most convincing – argument I hear against this kind of approach is that it takes time, and archivists are short on time. But I wonder if that means we have to think fundamentally differently. Let’s go back to Jane Drew, and think about the value of relationships for research into her life and work…

If one archive collection description highlights just a few relationships, this could take us a long way (although relationship types are a whole different thing…). If the individuals and organisations are unambiguously identified, this can help with the process of creating links out to other data sources, so that information can be linked together; then we have the chance to benefit from finding out about relationships that have been defined elsewhere. In other words, the connections one person has throughout their life can only be fully realised through the pooling of information resources, very much a joint effort. If the data is structured it can potentially be brought together.
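
To make the pooling idea a little more concrete, here is a minimal sketch of how statements from separate sources might be brought together once the people in them carry a shared identifier. Everything in it is an assumption for illustration: the `example:` identifiers, the record layout, the `pool` function and the whole of the second source (including ‘Person A’) are hypothetical placeholders, not real data. Only the letter to John and Myfanwy Piper comes from the catalogue discussed above.

```python
from collections import defaultdict

# Hypothetical identifier; in practice this might be a VIAF URI or similar.
DREW = "example:jane-drew"

# One statement drawn from the Tate description discussed above.
tate_records = [
    {"id": DREW, "label": "Jane B Drew", "relation": "wrote a letter to",
     "other": "John and Myfanwy Piper",
     "source": "Papers of John and Myfanwy Piper"},
]

# An entirely invented second source, used only to show the mechanics.
other_records = [
    {"id": DREW, "label": "Drew, Jane", "relation": "collaborated with",
     "other": "Person A (placeholder)", "source": "Hypothetical source B"},
    {"id": "example:another-jane-drew", "label": "Jane Drew",
     "relation": "wrote to", "other": "Person B (placeholder)",
     "source": "Hypothetical source B"},
]


def pool(*sources):
    """Group records from any number of sources by identifier, not by name."""
    pooled = defaultdict(list)
    for records in sources:
        for record in records:
            pooled[record["id"]].append(record)
    return dict(pooled)


if __name__ == "__main__":
    for record in pool(tate_records, other_records)[DREW]:
        print(f"{record['label']}: {record['relation']} {record['other']} "
              f"[{record['source']}]")
```

The point of keying on the identifier is that ‘Jane B Drew’ and ‘Drew, Jane’ are merged despite the different forms of name, while a different person who happens to be labelled ‘Jane Drew’ is kept apart.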

Traditional archival cataloguing focuses on the collection, and what is documented within the collection. It tends to think in terms of a self-contained document. Pursuing relationships breaks the bounds of any one information source. That seems like a good thing, but it raises questions around approaches to cataloguing. One obvious way to tackle this is to start to think more about archival authority records. These should enable us to move beyond a collection-centric description and towards a more entity-based approach, because you describe an agent (entity) independently of any one archival collection. Another option is to think in a Linked Data way, where you are concentrating on entities and relationships.

There are so many questions raised by the whole area of entities and relationships. A few of my current conclusions are:

We should primarily be led by what benefits research. Researchers are far less likely to think in terms of individual archive collections, and far more likely to think in terms of research areas (topics). The Web gives us the opportunity to think in a broader context.

Maybe it is worth considering taking some of the time used to provide a really detailed biographical history as an unstructured narrative, or the time to provide a really detailed multi-level description, and taking more time to provide (or provide the potential for) connections between our descriptions and the larger information environment. This could allow researchers to bring together much more comprehensive information, even if what we provide about individual collections is less detailed. Just adding something like a VIAF identifier to a name would be a great big leap forwards (http://viaf.org/viaf/51792789).

There is great value in being a small fish in a big pond, because most researchers are fishing for data in the big pond. As Wisser’s article says, “relationships are…seen to free collections from the isolation of individual repositories.” If we aim to be part of the big pond, we can continue to tend our smaller ponds as well!
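
Returning to the suggestion above about adding a VIAF identifier to a name, a minimal sketch of what that might involve is given here. The dictionary layout, the function names and the URI check are my own assumptions, not part of any Archives Hub system or of the VIAF service itself. The URI is the one quoted above, and the name is left as a placeholder.

```python
import re

# Matches URIs of the form quoted above, e.g. http://viaf.org/viaf/51792789
VIAF_PATTERN = re.compile(r"^https?://viaf\.org/viaf/(\d+)/?$")


def viaf_number(uri):
    """Return the numeric part of a VIAF URI, or None if it is not one."""
    match = VIAF_PATTERN.match(uri)
    return match.group(1) if match else None


def add_identifier(index_term, uri):
    """Return a copy of an index term with a checked VIAF URI attached."""
    if viaf_number(uri) is None:
        raise ValueError(f"Not a recognisable VIAF URI: {uri}")
    return {**index_term, "identifier": uri}


if __name__ == "__main__":
    # Placeholder name: use whatever form of name the catalogue already has.
    term = {"name": "Name as it appears in the index"}
    print(add_identifier(term, "http://viaf.org/viaf/51792789"))
```

The name itself is untouched; the identifier simply travels with it, which is what makes later linking possible.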

To go back to the Piper Collection and Jane Drew… I used this as a random example, thinking of a researcher interested in one particular designer. But of course, the Tate Gallery Archive can’t be expected to define all the relationships within the description. It’s great that they have provided enough detail to find this one individual item – without that, we would not know about the connection with Jane Drew. I’m arguing for unambiguously identifying entities (people, organisations) because if we can potentially link this instance of ‘Jane Drew’ to other instances in other information sources, then it is very possible that we can find out more about this relationship; and if the relationship can’t be established through other sources, then maybe this archive provides unique evidence of a connection that could significantly benefit research.