Launch of Towards a National Collection discovery projects

£14.5m awarded to transform online exploration of UK’s culture and heritage collections through harnessing innovative AI

The Arts and Humanities Research Council (AHRC) has awarded £14.5m to the research and development of emerging technologies, including machine learning and citizen-led archiving, in order to connect the UK’s cultural artefacts and historical archives in new and transformative ways.

Image by Colin McDowall, courtesy of Towards a National Collection. (Young woman winding bobbins on wheel in the loom shop, 1898 Blanket factory, Witney, Oxfordshire © Historic England Archive CC73_00946 | Indian laundry couple with the man ironing clothes. Attributed to a painter from Tanjore (Thanjavur), ca. 1840. Gouache drawing. 32247i © Wellcome Collection | Sir Hans Sloane (1660–1753) Stephen Slaughter (1697–1765) (attributed to) © The Trustees of the Natural History Museum, London | A starboard bow view of the three-masted barque Glenbervie (1866) with crowds of people, on the rocks at Lowland Point. G14146. © National Maritime Museum, Greenwich, London, Gibson’s of Scilly Shipwreck Collection | Artwork by Peter Morphew illustrating the repositories of the University of Glasgow Archives and Special Collections.)

The Archives Hub is pleased to announce that we will be a project partner in one of five major projects being launched today. The projects form the largest investment of Towards a National Collection, a five-year research programme. Today’s launch reveals the first insights into how thousands of disparate collections could be explored by public audiences and academic researchers in the future.

The five ‘Discovery Projects’ will harness the potential of new technology to dissolve barriers between collections – opening up public access and facilitating research across a range of sources and stories held in different physical locations. One of the central aims is to empower and diversify audiences by involving them in the research and creating new ways for them to access and interact with collections. In addition to innovative online access, the projects will generate artist commissions, community fellowships, computer simulations, and travelling exhibitions. The projects are:

● The Congruence Engine: Digital Tools for New Collections-Based Industrial Histories

● Our Heritage, Our Stories: Linking and searching community-generated digital content to develop the people’s national collection

● Transforming Collections: Reimagining Art, Nation and Heritage

● The Sloane Lab: Looking back to build future shared collections

● Unpath’d Waters: Marine and Maritime Collections in the UK

The investigation is the largest of its kind to be undertaken to date, anywhere in the world. It extends across the UK, involving 15 universities and 63 heritage collections and institutions of different scales, with over 120 individual researchers and collaborators.

Together, the Discovery Projects represent a vital step in the UK’s ambition to maintain leadership in cross-disciplinary research, both between different humanities disciplines and between the humanities and other fields. Towards a National Collection will set a global standard for other countries building their own collections, enhancing collaboration between the UK’s renowned heritage and national collections worldwide.

Archives Hub and the Transforming Collections: Reimagining Art, Nation and Heritage project

Donald Locke 1972-4, Trophies of Empire © Estate of Donald Locke Courtesy of Tate | Claudette Johnson, Figure in Blue, 2018. © Claudette Johnson. Image Credit: Arts Council Collection, Southbank Centre | Iniva_Rivington Place: Photograph by Carlos Jimenez, 2018 | Rachel Jones, lick your teeth, they so clutch, 2021. Arts Council Collection, Southbank Centre, London © the artist.
Donald Locke 1972-4, Trophies of Empire © Estate of Donald Locke Courtesy of Tate | Claudette Johnson, Figure in Blue, 2018. © Claudette Johnson. Image Credit: Arts Council Collection, Southbank Centre | Iniva_Rivington Place: Photograph by Carlos Jimenez, 2018 | Rachel Jones, lick your teeth, they so clutch, 2021. Arts Council Collection, Southbank Centre, London © the artist. Image courtesy of the artist and Thaddaeus Ropac, London.

The Archives Hub at Jisc will be working with fellow project partners:

susan pui san lok, 2021
susan pui san lok, 2021: Courtesy the artist
  • Tate
  • Arts Council Collection
  • Art Fund
  • Art UK
  • Birmingham Museums Trust
  • British Council Collection
  • Contemporary Art Society
  • Glasgow Museums
  • Iniva (Institute of International Visual Art)
  • Manchester Art Gallery
  • Middlesbrough Institute of Modern Art
  • National Museums Liverpool
  • Van Abbemuseum (NL)
  • Wellcome Collection

The Principal investigator for Transforming Collections: Reimagining Art, Nation and Heritage project is Professor susan pui san lok, University of the Arts London.

More than twenty years after Stuart Hall posed the question, ‘Whose heritage?’, Hall’s call for the critical transformation and reimagining of heritage and nation remains as urgent as ever. This project is driven by the provocation that a national collection cannot be imagined without addressing structural inequalities in the arts, engaging debates around contested heritage, and revealing contentious histories imbued in objects.

An arrangement of different castes including snake charmer, brick-layer, basket-maker, potter and wives. Gouache drawing. 28438i © Wellcome Collection.

Transforming Collections aims to enable cross-search of collections, surface patterns of bias, uncover hidden connections, and open up new interpretative frames and ‘potential histories’ (Azoulay, 2019) of art, nation and heritage. It will combine critical art historical and museological research with participatory machine learning design, and embed creative activations of interactive machine learning in the form of artist commissions.

Untitled 1986 1987.21, Manchester Art Gallery © Keith Piper.

Among the aims of this project are to surface suppressed histories, amplify marginalized voices, and re-evaluate artists and artworks ignored or side-lined by dominant narratives; and to begin to imagine a distributed yet connected evolving ‘national collection’ that builds on and enriches existing knowledge, with multiple and multivocal narratives.

The role of the Archives Hub will centre around:

  • Disseminating project aims, developments and outcomes to our contributors, through our communication channels and our cataloguing workshops, to encourage a wide range of archives to engage with these issues.
Glasgow Women’s Library, Museum of the Year finalist, 2018. Art Map 2019. © Marc Atkins / Art Fund 2018
  • Working with the Creative Computing Institute, at the University of the Arts London, to integrate the Machine Learning (ML) processing into the Archives Hub data processing workflows, so that it can benefit for over 350 institutions, including public art institutions.
Mick Grierson, Exploring the Daphne Oram Collection using 3D visualisation and machine learning (screenshot). 2012. Mick Grierson, Parag MitalLondon © the artist.
  • Providing expertise from over 20 years of running an archival aggregator and working with a whole range of UK archive repositories, particularly around sustainability and the challenges of working with archival metadata.

Creating a COVID-19 archive at the Royal College of Nursing

Archives Hub feature for November 2020

Now more than ever as we continue to battle the COVID-19 pandemic, the world is reliant on its digital infrastructure; the need to provide and access accurate and up-to-date information is of paramount importance. This raises some interesting questions, challenges and opportunities for archive services who can play their part in the collective response to the crisis by capturing and recording events, activities and decisions. Archives and recordkeeping professionals have always supported the notions of accountability and transparency through their work, something which is being demonstrated in real time during the development of the pandemic.

As the UK’s largest trade union and professional association for nurses, the Royal College of Nursing (RCN) has been supporting and representing nurses and healthcare workers throughout the pandemic. It is vital that records of how this has been done are available to the organisation in perpetuity as evidence of advice given and decisions taken. The RCN has a responsibility to its members to be able to demonstrate that the organisation has been working in their best interests and the interests of their patients. In turn, the RCN archive has a responsibility to ensure that records with evidential and research value are captured, preserved and accessible to right audiences at the right time.

One of our first attempts at archiving the RCN COVID-19 webpages using our digital archive.
One of our first attempts at archiving the RCN COVID-19 webpages using our digital archive.

As a result, like many of our archivist and recordkeeping colleagues across the world, we have created a COVID-19 archive. Since the beginning of the year the RCN archive team have been actively collecting records relating to COVID-19 from across the organisation to build up a picture of how the pandemic has unfolded through the eyes of RCN members and staff. Unsurprisingly, this covers a wide range of record types and digital formats: web crawls of special COVID-19 webpages containing up-to-date guidance and advice, targeted staff emails, member surveys on working conditions and PPE, General Secretary’s video messages, special committee situation reports, newly created online nursing resources, publications – the list could go on. Within this set of records is a complex combination of access requirements and restrictions which, through balancing business confidentiality with public interest, we will manage alongside the records themselves.

We are in the fortunate position of having a remotely accessible network and a digital archive, which has meant that we have been able to collect these records as they have been created and start uploading them to our digital archive straight away. While some of the records we’re collecting as part of the COVID-19 archive project would have been transferred to us anyway, there are several new record series on our 2020 collecting plan as a result of the pandemic. For example, our first venture in web archiving was a test crawl of the RCN COVID-19 webpages; these are now collected regularly and form an integral part of the COVID-19 archive. Having seen and been inspired by the experiences of other archives already running successful daily web crawls to capture public advice and the public response, we decided to capture our pages daily as well – this ensured that we were keeping up to speed with each piece of new advice and guidance shared on the webpages. As the rate of updates to the pages has slowed, we have since reduced the frequency to weekly, although we continue to monitor them, ready to capture more frequently if needed. This was the pilot web archiving project we didn’t know we were doing until it happened, and it has in turn has sparked interest in a larger web archiving project to capture the whole RCN website, which is well underway.

A video message from Donna Kinnar, General Secretary, on the staff intranet. An example of the range of formats collected for the COVID-19 archive.
A video message from Donna Kinnar, General Secretary, on the staff intranet. An example of the range of formats collected for the COVID-19 archive.

Alongside the collecting of material, we have been considering how the records of the COVID-19 archive will fit into our existing catalogue structure. While it would be easy to create a new Fonds for COVID-19, we realised that this view was being skewed by our thoughts about future access to the material, and the ease at which colleagues or researchers would be able to view all the material neatly packaged together. Instead we plan to preserve the context of the records by arranging them by creator, in our case this is mostly the department of origin, to fit within our existing catalogue structure. There will be occasions when it is important to view all COVID-19 records together to get a complete picture of the reaction and response to the pandemic, so using the ‘linked collection’ feature in our digital archive we plan to create a virtual COVID-19 collection containing records from across different record series to allow this level of access. Beyond this we are considering which records from our COVID-19 archive will be shared on our public digital archive website to ensure the transparency and accountability that creating the COVID-19 archive in the first place helps to achieve.

We have certainly learnt a lot this year and the team has upskilled, becoming more proficient and confident in processing a wide range of digital formats, from collection through to access. Our sector has also stepped up by providing online webinars and training events to share our experiences of this extraordinary time. In May we participated in a panel discussion facilitated by Preservica, our digital archive supplier, who generously donated 250GB of storage space for us to store the COVID-19 archive. At the event we shared our plans and projects for collecting COVID-19 records with the archive community alongside colleagues from a wide range of institutions. These included Network Rail, who have been collecting records such as emergency train timetables introduced in response to the falling customer demand, and all the documentation that went into making this happen, and University at Buffalo in the US, who are encouraging students and staff to share their experiences of the pandemic by submitting video diaries and photographs to the archive. Learning about and reflecting on the wide range of collecting projects happening around the world is as informative as it is inspiring.

An example of a publication for the COVID-19 archive. This is the cover of the April 2020 Bulletin RCN members magazine.
An example of a publication for the COVID-19 archive. This is the cover of the April 2020 Bulletin RCN members magazine.

It is amazing to think that in the (probably not too distant) future the COVID-19 records we have collected will be catalogued, available to view online through our digital archive and be being used to inform research into, and evaluations of, the response of the UK’s largest independent nursing organisation and our role in how Britain handled the pandemic.

Katherine Chorley, Digital Asst Archivist
Royal College of Nursing Archives

Related

Browse all Royal College of Nursing Archives collections on the Archives Hub.

Previous RCN Archives feature: Cathlin du Sautoy and Hermione Blackwood: personal papers at the Royal College of Nursing Archives

All images copyright Royal College of Nursing Archives. Reproduced with the kind permission of the copyright holders.

Archives Hub Survey Results: What do people want from an archives aggregation service?

The 2018 Archives Hub online survey was answered by 83 respondents. The majority were in the UK, but a significant number were in other parts of Europe, the USA or further afield, including Australia, New Zealand and Africa. Nearly 50% were from higher or further education, and most were using it for undergraduate, postgraduate and academic research. Other users were spread across different sectors or retired, and using it for various reasons, including teaching, family history and leisure or archives administration.

We do find that a substantial number of people are kind enough to answer the survey, although they have not used the service yet. On this survey 60% were not regular users, so that is quite a large number, and maybe indicates how many first-time users we get on the service. Of those users, half expected to use it regularly, so it is likely they are students or other people with a sustained research interest. The other 40% use the Hub at varying levels of regularity. Overall, the findings indicate that we cannot assume any pattern of use, and this is corroborated by previous surveys.

Ease of use was generally good, with 43% finding it easy or very easy, but a few people felt it was difficult to use. This is likely to be the verdict of inexperienced users, and it may be that they are not familiar with archives, but it behoves us to keep thinking about users who need more support and help. We aim to make the Hub suitable for all levels of users, but it is true to say that we have a focus on academic use, so we would not want to simplify it to the point where functionality is lost.

I found one comment particularly elucidating: “You do need to understand how physical archives work to negotiate the resource, but in terms of teaching this actually makes it really useful as a way to teach students to use a physical archive.”  I think this is very true: archives are catalogued in a certain way, that may not be immediately obvious to someone new to them. The hierarchy gives important context but can make navigation more complicated. The fact that some large collections have a short summary description and other smaller archives have a detailed item-level description adds to the confusion.

One negative comment that we got maybe illustrates the problem with relevance ranking: “It is terribly unhelpful! It gives irrelevant stuff upfront, and searches for one’s terms separately, not together.” You always feel bad about someone having such a bad experience, but it is impossible to know if you could easily help the individual by just suggesting a slightly different search approach, or whether they are really looking for archival material at all. This particular user was a retired person undertaking family history, and they couldn’t access a specific letter they wanted to find. Relevance ranking is always tricky – it is not always obvious why you get the results that you do, but on the whole we’ve had positive comments about relevance ranking, and it is not easy to see how it could be markedly improved.  The Hub automatically uses AND for phrase searches, which is fairly standard practice. If you search for ‘gold silver’ you will probably get the terms close to each other but not as a phrase, but if you search for ‘cotton mills’ you will get the phrase ranked higher than e.g. ‘mill-made cotton’ or ‘cotton spinning mill’.  One of the problems is that the phrase may not be in the title, although the title is ranked higher than other fields overall. So, you may see in your hit list ‘Publication proposals’ or ‘Synopses’ and only see ‘cotton mills’ if you go into the description. On the face of it, you may think that the result is not relevant.

screenshot of survey showing what people value
What do you most value about the Archives Hub?

All of our surveys have clearly indicated that a comprehensive service providing detailed descriptions of materials is what people want most of all. It seems to be more important than providing digital content, which may indicate an acknowledgement from many researchers that most archives are not, and will not be, digitised. We also have some evidence from focus groups and talking to our contributors that many researchers really value working with physical materials, and do not necessarily see digital surrogates as a substitute for this. Having said that, providing links to digital materials still ranks very highly in our surveys. In the 2018 survey we asked whether researchers prefer to search physical and digital archives separately or together, in order to try to get more of a sense of how important digital content is. Respondents put a higher value on searching both together, although overall the results were not compelling one way or the other. But it does seem clear that a service providing access to purely digital content is not what researchers want. One respondent cited Europeana as being helpful because it provided the digital content, but it is unclear whether they would therefore prefer a service like Europeana that does not provide access to anything unless it is digital.

Searching by name, subject and place are clearly seen as important functions. Many of our contributors do index their descriptions, but overall indexing is inconsistent, and some repositories don’t do it at all. This means that a name or subject search inevitably filters out some important and relevant material. But in the end, this will happen with all searches. Results depend upon the search strategy used, and with archives, which are so idiosyncratic, there is no way to ensure that a researcher finds everything relating to their subject.  We are currently working on introducing name records (using EAC-CPF). But this is an incredibly difficult area of work. The most challenging aspect of providing name records is disambiguation. In the archives world, we have not traditionally had a consistent way of referring to individuals. In many of the descriptions that we have, life dates are not provided, even when available, and the archive community has a standard (NCA Rules) that it not always helpful for an online environment or for automated processing. It actually encourages cataloguers to split up a compound or hyphenated surname in a way that can make it impossible to then match the name. For example, what you would ideally want is an entry such as ‘Sackville-West, Victoria Mary (1892-1962) Writer‘, but according to the NCA Rules, you should enter something like ‘West Victoria Mary Sackville- 1892-1962 poet, novelist and biographer‘. The epithet is always likely to vary, which doesn’t help matters, but entering the name itself in this non-standard way is particularly frustrating in terms of name matching.  On the Hub we are encouraging the use of VIAF identifiers, which, if used widely, would massively facilitate name matching. But at the moment use is so small that this is really only a drop in the ocean. In addition, we have to think about whether we enable contributors to create new name records, whether we create them out of archive descriptions, and how we then match the names to names already on the Hub, whether we ingest names from other sources and try to deal with the inevitable variations and inconsistencies.  Archivists often refer to their own store of names as ‘authorities’ but in truth there is often nothing authoritative about them; they are done following in-house conventions. These challenges will not prevent us from going forwards with this work, but they are major hurdles, and one thing is clear: we will not end up with a perfect situation. Researchers will look for a name such as ‘Arthur Wellesley’ or ‘Duke of Wellington’ and will probably get several results. Our aim is to reduce the number of results as much as we can, but reducing all variations to a single result is not going to happen for many individuals, and probably for some organisations. Try searching SNAC (http://snaccooperative.org/), a name-based resource, for Wellington, Arthur Wellesley, to get an idea of the variations that you can get in the user interface, even after a substantial amount of work to try to disambiguate and bring names together.

The 2018 survey asked about the importance of providing information on how to access a collection, and 75% saw this as very important. This clearly indicates that we cannot assume that people are familiar with the archival landscape. Some time ago we introduced a link on all top-level entries ‘how to access these materials’. We have just changed that to ‘advice on accessing these materials’, as we felt that the former suggested that the materials are readily accessible (i.e. digital), and we have also introduced the link on all description pages, down to item-level. In the last year, the link has been clicked on 11,592 times, and the average time spent on the resulting information page is 1 minute, so this is clearly very important help for users. People are also indicating that general advice on how to discover and use archives is a high priority (59% saw this as of high value). So, we are keen to do more to help people navigate and understand the Archives Hub and the use of archives. We are just in the process of re-organising our ‘Researching‘ section of the website, to help make it easier to use and more focussed.

There were a number of suggestions for improvements to the Hub. One that stood out was the need to enable researchers to find archives from one repository. At the moment, our repository filter only provides the top 20 repositories, but we plan to extend this. It is partly a case of working out how best to do it, when the list of results could be over 300. We are considering a ‘more’ link to enable users to scroll down the list. Many other comments about improvements related back to being more comprehensive.

One respondent noted that ‘there was no option for inexperienced users’. It is clear that a number of users do find it hard to understand. However, to a degree this has to reflect the way archives are presented and catalogued, and it is unclear whether some users of the Hub are aware of what sort of materials are being presented to them and what their expectations are. We do have a Guide to Using Archives specifically for beginners, and this has been used 5,795 times in the last year, with consistently high use since it was introduced. It may be that we should give this higher visibility within the description pages.

Screenshot of Hub page on using archives
Guide to Using Archives

What we will do immediately as a result of the survey is to link this into our page on accessing materials, which is linked from all descriptions, so that people can find it more easily. We did used to have a ‘what am I looking at?’ kind of link on each page, and we could re-introduce this, maybe putting the link on our ‘Archive Collection’ and ‘Archive Unit’ icons.

 

 

 

It is particularly important to us that the survey indicated people that use the Hub do go on to visit a repository. We would not expect all use to translate into a visit, but the 2018 survey indicated 25% have visited a repository and 48% are likely to in the future. A couple of respondents said that they used it as a teaching tool or a tool to help others, who have then gone on to visit archives. People referred to a whole range of repositories they have or will visit, from local authority through to university and specialist archives.

screenshot of survey results
I have found material using the Archives Hub that I would not otherwise have discovered

59% had found materials using the Hub that they felt they would not have found otherwise. This makes the importance of aggregation very clear, and probably reflects our good ranking on Google and other search engines, which brings people into the Archive Hub who otherwise may not have found it, and may not have found the archives otherwise.

 

 

We’re supporting EXPLORE YOUR ARCHIVE

Logo, Explore Your Archives campaign
Explore Your Archive, http://www.exploreyourarchive.org, developed by The Archives and Records Association (UK and Ireland) and The National Archives, is the biggest ever public awareness campaign by the archives sector of the UK and Ireland.

From 16 November there will be hundreds of events and activities taking place in all kinds of archives. Those who work in archives will also be sharing some of their wonderful stories and amazing treasures. The public are being encouraged not just to visit an archive or explore archival collections online, but to understand more of the vital role which archives play in education, business, transparency and identity.

How the Hub fits in

The Archives Hub is a gateway to archives held at over 220 institutions and organisations across the UK.

Explore…

Using our map to discover archives close to you:
http://archiveshub.ac.uk/contributorsmap/.

Search….

Using the Hub search at http://archiveshub.ac.uk/search.html to uncover other collections.

Discover…

Image: Ballerina advert.
© TSB savings advert, c. 1950. Lloyds Banking Group Archives.

A rich variety of content: The breadth of content on the Hub highlights how archives are integral to historical and cultural awareness. Our contributors include Universities, business archives, charities, local government, libraries, museums and cathedrals.

Here are just a few of the collections you can find:

From the Ancient…

Canterbury Cathedral: Records of the Dean and Chapter of Canterbury Cathedral, c800 to present. http://archiveshub.ac.uk/data/gb054-cca/dcc

The collection of records of Canterbury Cathedral includes material dating from the early Middle Ages right up to the present day. The material relates to the Cathedral’s estates and reflects the activities of the Dean and Chapter and its staff.

… to the Contemporary

Archive of the National Theatre of Scotland, 2006 to present.
http://archiveshub.ac.uk/data/gb247-stants

Launched in February 2006 and billing itself as a ‘theatre without walls’, the National Theatre of Scotland has no building of its own and operates within the existing infrastructure of Scottish theatre. Material is held at Glasgow University Library and includes programmes, press-cuttings, reviews and scripts.

From the Large…

Royal Greenwich Observatory: Records and Papers, 1675-1998.
http://archiveshub.ac.uk/data/gb012-ms.rgo

With around one kilometre of material, the records consist of all the surviving historical paper records of the Royal Observatory. Collections include: papers of the Astronomers Royal and telescope construction projects, management and observations, including the William Herschel Telescope and Radcliffe Observatory.

… to the Small

Gaelic Manuscripts, c. 1732-c. 1869. http://archiveshub.ac.uk/data/gb752-gm

One reel of microfilm comprising images of 23 original Gaelic manuscripts, relating to Ireland and to the activities of Irishmen at home and abroad, held at Queen’s University Belfast. It consists largely of fragments of both religious and secular verse, topographical poems and other tracts and tales dating mainly from the 18th and 19th centuries.

From the Young…

Children’s Society, 18th century – 21st century.
http://archiveshub.ac.uk/data/gb2180-tcs

The Children’s Society Archive comprises the records created and managed by The Children’s Society (titled The Waifs and Strays Society from 1881 to 1946). The majority of the collections date from the organisation’s founding in 1881. This includes a large quantity of visual material in the form of photographs and publicity material, as well as some audio-visual material.

… to the Older generation

Scrapbooks of Barking and Dagenham Branch of Age Concern, 2002-2008.
http://archiveshub.ac.uk/data/gb0350-bd58

This collection comprises six scrapbooks, containing newspaper cuttings on the Barking and Dagenham Branch of Age Concern, relating to events, as well as issues affecting elderly people in the borough.

From Northern Scotland…

Thomas S Muir, Architectural notes on churches on Scottish islands, 1850-1872. http://archiveshub.ac.uk/data/gb227-msbr783.m9

Thomas S Muir (1802-1888) worked for most of his life as a book-keeper in Edinburgh. All his spare time was devoted to his passion for early Scottish churches, visiting all the locations where ruins were to be found, including even the most inaccessible islands. The volume, ‘Ecclesiological notes on some of the islands of Scotland’, comprises detailed architectural descriptions, with line drawings, of features of churches and other ecclesiastical remains.

… to the Southerly Channel Islands

Image: Jersey Archive.
Image: Jersey Archive.

Archive of the States of Jersey, 1603 – 2010.
http://archiveshub.ac.uk/data/gb1539-c

The States of Jersey collection includes the minutes, correspondence, reports and acts of the States of Jersey. Also, the minutes of the different Committee’s of the States including Agriculture, Education, Defence, Housing, Social Security, Finance, Harbours and Airports, Health and Social Services, Tourism, Home Affairs, Planning and Environment, Economic Development and Policy and Resources.

From the Frozen Antarctic…

British Australian New Zealand Antarctic Research Expedition, 1929-1934. http://archiveshub.ac.uk/data/gb015-banzare

The collection comprises of press cuttings relating to the British Australian New Zealand Antarctic Research Expedition, 1929-1931.

…to the Heat of Africa

Africa 95, c. 1957-1996. http://archiveshub.ac.uk/data/gb102-africa95

Africa 95 was founded in 1992 to initiate and organise a nationwide season of the arts of Africa to be held in the UK in the last quarter of 1995. Printed material, photographs, and slides of the work of artists from Algeria, Egypt, Ethiopia, Ghana, Ivory Coast, Kenya, Morocco, Nigeria, Senegal, South Africa, Sudan, Uganda,Tanzania, Tunisia, Zambia, Zimbabwe, and the USA.

From the Fire brigade…

Fire Brigades Union, 1919-1997. http://archiveshub.ac.uk/data/gb152-mss.346

The Fire Brigades Union (FBU) was founded in 1918 as the Firemen’s Trade Union. The union began its life as a body very much based around the London area but soon expanded to include provincial brigades. The collection includes: Executive Council minutes, annual accounts, subject files (including Sizewell Public Inquiry, 1980s) and the national strike, 1977.

…to the Water board

Records relating to Derwent Valley Water Board, 1899-1974.
http://archiveshub.ac.uk/data/gb159-dvw

The collection comprises a full series of indexed bound minute books (1899-1974) containing annual statements of accounts, and other specific reports. Also, maps and plans relate to specific elements of intended works such as the building of Ladybower Reservoir in Derbyshire.

From the Arts…

D.H. Lawrence (1885-1930) Collection, 1865-1999.
http://archiveshub.ac.uk/data/gb159-la

The Lawrence Collection contains extensive materials by and about D.H. Lawrence, ranging in date from his childhood and including original manuscripts and his correspondence.

… to Science

Clifford Hiley Mortimer Collection, 1937-1980.
http://archiveshub.ac.uk/data/gb986-morc

This collection contains river and lake data in rivers in Britain, and correspondence regarding flows, inflows, chemical analyses and chemical stratification. It also includes mud samples!

From War…

Image: Poppy, World War One
© Image is in the public domain: papaver in High Wood, [tinelot@pobox.com Tinelot Wittermans]
Daniel Dougal First World War Diaries, 1914-1918.
http://archiveshub.ac.uk/data/gb133-ddd

Diaries of Daniel Dougal, which detail his service as an army doctor on the Western Front during the First World War. Dougal rose to become Deputy Assistant Director of Medical Services, 34th Division of the British Army, and his diaries provide important information on the operation of Army medical services.

… to Peace

Campaign for Nuclear Disarmament (CND), 1958-2008.
http://archiveshub.ac.uk/data/gb097-campaignfornucleardisarmament

The Campaign for Nuclear Disarmament (CND) is a non party-political British organisation advocating the abolition of nuclear weapons worldwide. Includes papers relating to the CND’s constitution, minutes of National Council, National Executive Committee annual conference papers and papers relating to Aldermaston marches and other demonstrations.

These are selected descriptions: there’s much more to discover by exploring the Hub! And we’re adding more descriptions every week. If you’d like to add your descriptions to the Hub, now’s a great time! See Be part of something bigger for information on how we can help you expose your collections to a worldwide audience.

Also of interest:

Work in an archive and want to be involved in the Explore Your Archive campaign?

It’s not too late to take part, visit: www.nationalarchives.gov.uk/yourtoolkit.

More on Collections

Image of Guardian staff
Guardian billing room staff, 1921. From the Guardian News and Media Archive. Copyright: Guardian.

Browse our Features pages to learn about the breadth of material described on the Hub: http://archiveshub.ac.uk/features/

Supporting Historians: responding to changing research practices

image of camera lensThis post picks out some highlights from a report from Ithaka S+R, “Supporting the Changing Research Practices of Historians” by Roger C Schonfeld and Jennifer Rutner (December 2012). It concentrates on findings that are of particular relevance for archivists and for discovery. The report is recommended reading. It is a US study, but clearly there are strong similarities with other countries.

The report finds that underlying research methods are still broadly as they were but practices have changed considerably: “Based on interviews with dozens of historians, librarians, archivists, and other support services providers, this project has found that the underlying research methods of many historians remain fairly recognizable even with the introduction of new tools and technologies, but the day to day research practices of all historians have changed fundamentally.”

It goes on to summarise the improvements that archives might make to meet changing needs, none of which are unexpected: “For archives, we recommend ongoing improvements to access through improved finding aids, digitization, and discovery tool integration, as well as expanded opportunities for archivists to help historians interpret collections, to build connections among users, and to instruct PhD students in the use of archives.”

It is very encouraging to see the positive comments about researchers’ interactions with archivists: “Having a meeting with the archivist and librarian is really fantastic, because they help you understand what is in the archive, and what you might be able to use.” It is clear from the study that archivists have a vital role to play as key collaborators and colleagues of historians, and their value is clear: “Archivists are often able
 to hone and direct an inquiry, bringing to light items and collections that the researcher may have been unaware of.”

The study does highlight the changing nature of interactions with archival material, as a result of the use of digital cameras in particular, which enables the analytical work to take place elsewhere. It is generally felt to be a convenient and time-saving option, enabling long-term interaction with resources outside of the reading room. This development is actually described as “the single most significant shift in research practices among historians.” It raises questions about whether the role of the archivist changes when the analytical work is displaced from the archive, as archivists may have less opportunity for intellectual engagement with researchers.  The study does highlight a possible issue with digital copies, namely the separation of metadata from content, where the researcher has hundreds of images and needs to organise them constructively, and it also found that scholars are struggling to work with digitised non-textual content effectively.

The ability to find time for research trips was a primary challenge for many researchers. “Interviewees repeatedly emphasized that the amount of time they are able to spend in the archives shapes the nature of the interaction with the sources significantly.” Because most struggle to find time for research trips,  digitised sources are hugely beneficial.

The study found that digitised finding aids help researchers to “travel more strategically”. It suggests that high-quality finding aids may become more important as researchers move more towards photographic visits to archives, rather than serendipitous visits. This connection is something I have not thought about before, and I would be very interested to hear what archivists think about this idea.

Of major relevance for a service like the Archives Hub is the conclusion about finding aids:

“The use of online finding aids greatly facilitates, and sometimes displaces, these visits. If a “good” finding aid is readily available online, this might make a scouting visit unnecessary, depending on the importance of the archive to the research project. In some cases, researchers were able to rule out a visit to an archive based on the online finding aids, and re-purpose funds and effort to tracking down other sources for the project.”

This study is a clear endorsement for our belief (which, I should say, is also backed up by our own researcher surveys) that finding aids play a role not only in identifying and prioritising sources, but also in providing enough information in themselves to make a visit unnecessary. As well as this, they may have a kind of positive negative effect: the researcher knows that materials can be ruled out.  The study strongly emphasised the need for “searchable databases” and “centralized searching” and participants talked about the problem with locating each collection independently, especially across the diverse types of archive repository: “The process of identifying archives – in some cases small, local archives or international archives – can present an amazing challenge to researchers.” Clearly comprehensive cross-searching search tools are a huge boon to researchers.

In terms of discovery, Google is clearly a major tool and there was a feeling that it was the most comprehensive discovery tool, as well as being convenient and easy to use. It is often used at the start of a searching process.: “Generally, historians discover finding aids through Google searches and archive websites.” There is a clear demand for more descriptions online: “The general consensus among interviewees was that more online finding aids would greatly benefit their research, and that archives should continue to make efforts to make these accessible online. Continued and expanded efforts to develop finding aids more efficiently and to make them available digitally would seem to support the needs of historians for improved access.”

In terms of PhD students (and maybe others who are inexperienced researchers), the study found issues with the use of archives and other sources:

“Interviews with PhD candidates indicated that there is often little support for them in learning about new research methods or practices, either in their department or elsewhere at their institution, of which they are aware. While the subject matter treated by historians continues to diversify dramatically, new methodologies develop, and research practices change rapidly, it is clearly critically important that students have a grounding in the methods and practices of the field.” The Archives Hub has recently produced a brief Guide to Using Archives for the Inexperienced, and discussions on the archives email list showed just how much this is an important topic for archivists and how there was a general consensus that  PhD students need more training on research methodologies.

Summing up, the report makes six recommendations specifically for Archives:

1. More online finding aids
2. More digitisation
3. Discovery tools that promote cross-searching, crossing institutional boundaries and encompassing small and local record offices
4. Adequate resources for ensuring the expertise of the archivist continues to be available, enabling archivists to be active interpreters of the collections
5. Adapting to and facilitating the use of digital cameras and scanners in reading rooms
6. Training PhD students in the use of archives

There is a great deal more of interest and relevance in the report around searching, Google Scholar, the use of the academic library, organising and managing research, citation management and digital research methods. It is very well worth reading.

 

An evaluation of the use of archives and the Archives Hub

This blog is based upon a report written by colleagues at Mimas* presenting the results of the evaluation of our innovative Linked Data interface, ‘Linking Lives‘. The evaluation consisted of a survey and a focus group, with 10 participants including PhD students and MA students studying history, politics and social sciences. We asked participants a number of questions about the Archives Hub service, in order to provide context for their thoughts on the Linking Lives interface.

This blog post concentrates on their responses relating to the use of archives, methods of searching and interpretation of results. You can read more about their responses to the Linking Lives interface on our Linking Lives blog.

Use of Archives and Primary Source Materials

We felt that it was important to establish how important archives are to the participants in our survey and focus group. We found that “without exception, all of the respondents expressed a need for primary resources” (Evaluation report). One respondent said:

“I would not consider myself to be doing proper history if I wasn’t either reinterpreting primary sources others had written about, or looking at primary sources nobody has written about. It is generally expected for history to be based on primary sources, I think.” (Survey response)

One of the most important factors to the respondents was originality in research. Other responses included acknowledgement of how archives give structure to research, bringing out different angles and perspectives and also highlighting areas that have been neglected. Archives give substance to research and they enable researchers to distinguish their own work:

“Primary sources are very valuable for my research because they allow me to put together my own interpretation, rather than relying on published findings elsewhere.” (Survey response)

Understanding of Archives

It is often the case that people have different perceptions of what archives are, and with the Linking Lives evaluation work this was confirmed. Commonly there is a difference between social scientists and historians; the former concentrating on datasets (e.g. data from the Office of National Statistics) and the latter on materials created during a person’s life or the activities of an organisation and deemed worthy of permanently preserving. The evaluation report states:

“The participants that had a similar understanding of what an archive was to the Archive Hub’s definition had a more positive experience than those who didn’t share that definition.”

This is a valuable observation for the work of the Hub in a general sense, as well as the Linking Lives interface, because it demonstrates how initial perceptions and expectations can influence attitudes towards the service. In addition, the evaluation work highlighted another common fallacy: that an archive is essentially a library. Some of the participants in the survey expected the Archives Hub to provide them with information about published sources, such as research papers.

These findings highlight one of the issues when trying to evaluate the likely value of an innovative service: researchers do not think in the same language or with the same perspectives as information professionals. I wonder if we have a tendency to present services and interfaces modelled from our own standpoint rather than from the standpoint of the researcher.

Search Techniques and Habits

“Searches were often not particularly expansive, and participants searched for specific details which were unique to their line of enquiry” (Evaluation report). Examples include titles of women’s magazines, personal names or places. If the search returned nothing, participants might then broaden it out.

Participants said they would repeatedly return to archives or websites they were familiar with, often linked to quite niche research topics. This highlights how a positive experience with a service when it is first used may have a powerful effect over the longer term.

The survey found that online research was a priority:

“Due to conflicting pressures on time and economic resources, online searching was prevalent amongst the sample. Often research starts online and the majority is done online. Visits to see archives in person, although still seen as necessary, are carefully evaluated.”  (Evaluation report)

The main resources participants used were Google and Google Scholar (the most ubiquitous search engines used) as well as The National Archives, Google Books and ESDS. Specialist archives were referred to relating to specific search areas (e.g. The People’s History Museum, the Wellcome Library, the Mass Observation Archive).

Thoughts and Comments About the Archives Hub

All participants found the Hub easy to navigate and most found locating resources intuitive. As part of the survey we asked the participants to find certain resources, and almost all of them provided the right answers with seemingly no difficulty.

“It is clear. The descent of folders and references at the top are good for referencing/orientating oneself. The descriptions are good – they obviously can’t contain everything that could be useful to everyone and still be a summary. It is similar to other archive searches so it is clear.” (Survey response, PhD history student)

The social scientists that took part in the evaluation were less positive about the Archives Hub than the historians. Clearly many social science students are looking for datasets, and these are generally not represented on the Hub. There was a feeling that contemporary sources are not well represented, and these are often more important to researchers in fields like politics and sociology. But overall comments were very positive:

“…if anyone ever asked about how to search archives online I’d definitely point them to the Archives Hub”.

“Useful. It will save me making specific searches at universities.”

Archives Hub Content

It was interesting to see the sorts of searches participants made. A search for ‘spatial ideas’ by one participant did not yield useful results. This would not surprise many archivists – collections are generally not catalogued to draw out such concepts (neither Unesco nor UKAT have a subject heading for this; LCSH has ‘spatial analysis’). However, there may well be collections that cover a subject like this, if the researcher is prepared to dig deep enough and think about different approaches to searching. Another participant commented that “you can’t just look for the big themes”. This is the type of search that might benefit from us drawing together archive collections around themes, but this is always a very flawed approach. This is one reason that we have Features, which showcase archives around subjects but do not try to provide a ‘comprehensive’ view onto a subject.

This kind of feedback from researchers helps us to think about how to more effectively present the Archives Hub. Expectations are such an important part of researchers’ experiences. It is not possible to completely mitigate against expectations that do not match reality, but we could, for example, have a page on ‘The Archives Hub for Social Scientists’ that would at least provide those who looked at it with a better sense of what the Hub may or may not provide for them (whether anyone would read it is another matter!).

This survey, along with previous surveys we have carried out, emphasises the importance of a comprehensive service and a clear scope (“it wasn’t clear to me what subjects or organisations are covered”). However, with the nature of archives, it is very difficult to give this kind of information with any accuracy, as the collections represented are diverse and sometimes unexpected. in the end you cannot entirely draw a clear line around the scope of the Archives Hub, just like you cannot draw a clear line around the subjects represented in any one archive. The Hub also changes continuously, with new descriptions added every week. Cataloguing is not a perfect art; it can draw out key people, places, subjects and events, but it cannot hope to reflect everything about a collection, and the knowledge a researcher brings with them may help to draw out information from a collection that was not explicitly provided in the description. If a researcher is prepared to spend a bit of time searching, there is always the chance that they may stumble across sources that are new to them and potentially important:

“…another student who was mainly focused on the use of the Kremlin Archives did point out that [the Archives Hub] brought up the Walls and Glasier papers, which were new to [them]”.

Even if you provide a list of subjects, what does that really mean? Archives will not cover a subject comprehensively; they were not written with that in mind; they were created for other purposes – that is their strength in many ways – it is what makes them a rich and exciting resource, but it does not make it easy to accurately describe them for researchers. Just one series of correspondence may refer to thousands of subjects, some in passing, some more substantially, but archivists generally don’t have time to go through an entire series and draw out every concept.

If the Archives Hub included a description for every archive held at an HE institution across the UK, or for every specialist repository, what would that signify? It would be comprehensive in one sense, but in a sense that may not mean much to researchers. It would be interesting to ask researchers what they see as ‘comprehensive resources’ as it is hard to see how these could really exist, particularly when talking about unpublished sources.

Relevance of Search Results

The difficulties some participants had with the relevance of results comes back to the problem of how to catalogue resources that often cover a myriad of subjects, maybe superficially, maybe in detail; maybe from a very biased perspective. If a researcher looks for ‘social housing manchester’ then the results they get will be accurate in a sense – the machine will do its job and find collections with these terms, and there will be weighting of different fields (eg. the title will be highly weighted), but they still may not get the results they expect, because collections may not explicitly be about social housing in Manchester. The researcher needs to do a bit more work to think about what might be in the collection and whether it might be relevant. However, cataloguers are at fault to some extent. We do get descriptions sent to the Hub where the subjects listed seem inadequate or they do not seem to reflect the scope and content that has been provided. Sometimes a subject is listed but there is no sense of why it is included in the rest of the description. Sometimes a person is included in the index terms but they are not described in the content. This does not help researchers to make sense of what they see.

I do think that there are lessons here for archivists, or those who catalogue archives. I don’t think that enough thought is gives to the needs of the researcher. The inconsistent use of subject terms, for example, and the need for a description of the archive to draw out key concepts a little more clearly. Some archivists don’t see the need to add index terms, and think in terms of technologies like Google being able to search by keyword, therefore that is enough. But it isn’t enough. Researchers need more than this. They need to know what the collection is substantially about, they need to search across other collections about similar subjects. Controlled vocabulary enables this kind of exploratory searching. There is a big difference between searching for ‘nuclear disarmament’ as a keyword, which means it might exist anywhere within the description, and searching for it as a subject – a significant topic within an archive.

 

*Linking Lives Evaluation: Final Report (October 2012) by Lisa Charnock, Frank Manista, Janine Rigby and Joy Palmer

Archives and the Researchers of Tomorrow

“In 2009, the British Library and JISC commissioned the three-year Researchers of Tomorrow study, focusing on the information-seeking and research behaviour of doctoral students in ‘Generation Y’, born between 1982 and 1994 and not ‘digital natives’. Over 17,000 doctoral students from more than 70 higher education institutions participated in the three annual surveys, which were complemented by a longitudinal student cohort study.” (Taken from http://www.jisc.ac.uk/publications/reports/2012/researchers-of-tomorrow#exec_sum).

This post picks up on some aspects of the study, particularly those that are  relevant to archives and archivists. I am assuming that archivists come into the category of libraries as being ‘library professionals’, at least to an extent, though our profession is not explicitly mentioned. I would recommend reading the report in full as it offers some useful insights into the research behaviour of an important group of researchers.

What is heartening about this study is that the findings confirm that generation Ydoctoral students are

Image from: http://www.freedigitalphotos.net

sophisticated information-seekers and users of complex information sources“.  The study does conclude that information seeking behaviour is becoming less reliant on the support of libraries and library staff, which may have implications for the role of libraries and archive professionals, but “library staff assistance with finding/retrieving difficult-to-access resources” was seen as one of the most valuable research support resources available to students, although it was a relatively small proportion of students that used this service. There was a preference for this kind of on-demand 1-2-1 support rather than formal training sessions. One of the students said ” the librarians are quite possibly the most enthusiastic and helpful people ever, and I certainly recommend finding a librarian who knows their stuff, because I have had tremendous amounts of help with my research so far, just simply by asking my librarian the right question.

The survey concentrated on the most recent information-seeking activity of students, and found that most were not seeking primary sources, but rather secondary sources (largely journal articles and books).

This apparent and striking dependence on published research resources implies that, as the basis for their own analytical and original research, relatively few doctoral students in social sciences and arts and humanities are using ‘primary’ materials such as newspapers, archival material and social data.

This finding was true across all subject disciplines and all ages. The study found that about 80% of arts and humanities students were looking for any bibliographic references on their topic or specific publications, while only 7% were looking for non-published archival material. It seems that this reliance on published information is formed early on in their research, as students lay the groundwork for their PhD. Most students said they used academic libraries more in their first year of study – whether visiting or using the online services, so maybe this is the time to engage with students and encourage them to use more diverse sources in the longer-term.

A point that piqued my interest was that the arts and humanities students visiting other collections in order to use archival sources would do so even if  “many of the resources they required had been digitised“, but this point was not explained further, which was frustrating. Is it because they are likely to use a mixture of digital and non-digital sources? Or maybe if they locate digital content it stimulates their interest to find out more about using primary sources?

Around 30% used Google or Google Scholar as their main information gathering tool,  although arts and humanities students sourced their information from a wider spread of online and offline sources, including library catalogues.

One thing that concerned me, and that I was not aware of, was the assertion that “current citation-based assessment and authenticity criteria in doctoral and academic research discourage the citing of non-published or original material“. I would be interested to know why this is the case, as surely it should be actively encouraged rather than discouraged? How does this fit with the need to demonstrate originality in research?

Students rely heavily on help from their supervisors early on in their research, and supervisors have a great influence on their information and resource use. I wonder if they actively encourage the use of primary sources as much as we would like?  I can’t help thinking that a supervisor enthusiastically extolling the importance and potential impact of using archives would be the best way to encourage use.

There continues to be a feeling amongst students, according to this study, that “using social media and online forums in research lacks legitimacy” and that these tools are more appropriate within a social context. The use of twitter, blogs and social bookmarking was low (2009 survey: 13% of arts and humanities students had used and valued Twitter) and use was more commonly passive than active. There was a feeling that new tools and applications would not transform the way that the students work, but should complement and enhance established research practices and behaviour. However, it should be noted that use of ‘Web 2.0’ tools increased over the 3 years of the study, so it may be that a similar study carried out in 5 years time would show significantly different behaviour.

Students want training in research geared towards their own subject area, preferably face-to-face. Online tutorials and packages were not well used. The implication is that training should be at a very local level and done in a fairly informal way. Generic research skills are less appealing. Research skills training days are valued, but if they are poor and badly taught, the student feels their time is wasted and may be put off trying again. Students were quite critical of the quality and utility of the training offered by their university mainly because (i) it was not pitched at the right level (ii) it was too generic or (iii) it was not available on demand. Library-led training sessions got a more positive response, but students were far less likely to take up training opportunities after the first year of their PhD.  Training in the use of primary sources was not specifically covered in the report, though it must be supposed this would be (should be!) included in library-led training.

The study indicated that students dislike reading (as opposed to scanning) on screen. This suggests that it is important to provide the right information online, information that is easy to scan through, but worth providing a PDF for printout, especially for detailed descriptions.

One quote stood out for me, as it seems to sum up the potential danger of modern ways of working in terms of approaches to more in-depth analysis:

“The problem with the internet is that it’s so easy to drift between websites and to absorb information in short easy bites that at times you forget to turn off the computer, rest your eyes from screen glare and do some proper in-depth reading. The fragments and thoughts on the internet are compelling (addictive, even), and incredibly useful for breadth, but browsing (as its name suggests) isn’t really so good for depth, and at this level depth is what’s required.” (Arts and humanities)

We do often hear about the internet, or computers, tending to reduce levels of concentration. I think that this point is subtly different though – it’s more about the type of concentration required for in-depth work, something that could be seen as essential for effective primary source research.

Conclusions

We probably all agree that we can always do more to to promote the importance of archives to all potential users, including doctoral students. Certainly, we need to make it easier for them to discovery sources through the usual routes that they use, so for one thing ensuring we have a profile via Google and Google Scholar. Too many archives still resist this requirement, as if it is somehow demeaning or too populist, or maybe because they are too caught up in developing their own websites rather than thinking about search engine optimisation, or maybe it is just because archivists are not sure how to achieve good search engine rankings?

Are we actively promoting a low-barrier, welcome and open approach? I recall many archive institutions that routinely state that their archives are ‘open to bone fide researchers only’. Language like that seems to me to be somewhat off putting. Who are the ‘non-bone fide’ researchers that need to be kept out? This sort of language does not seem conducive to the principle of making archives available to all.

The applications we develop need to be relatively easy to absorb into existing research work practices, which will only change slowly over time. We should not get too caught up in social networks and Web 2.0 as if these are ‘where it’s at’ for this generation of students. Maybe the approaches to research are generally more traditional than we think.

The report itself concludes that the lack of use of primary sources is worrying and requires further investigation:

There is a strong case for more in-depth research among doctoral students to determine whether the data signals a real shift away from doctoral research based on primary sources compared to, say, a decade ago. If this proves to be the case there may be significant implications for doctoral research quality related to what Park described as “widely articulated tensions between product (producing a thesis of adequate quality) and process (developing the researcher), and between timely completion and high quality research“.

 

 

 

 

 

Linking Cultural Heritage Data

Last week I attended a meeting at the British Museum to talk with some museum folk about ways forward with Linked Data. It was a follow up to a meeting I organised on Archives and Linked Data,  and it was held under the auspices of the CIDOC Documentation Standards Working Group. The group consisted of me, Richard Light (Museum consultant), Rory McIllroy (ULCC), Jeremy Ottevanger (Imperial War Museum), Jonathan Whitson Cloud (British Museum), Julia Stribblehill (British Museum), and briefly able to join us was Pete Johnston (worked on the Locah project and now on the Linking Lives Linked Data project).

It proved to be a very pleasant day, with lots of really useful discussion. We spent some time simply talking about the differences – and similarities – in our perspectives and in our data. One of our aims was to start to create more links and dialogue between our sectors, and in this regard I think that the day was undoubtedly successful.

To start with our conversation ranged around various issues that our domains deal with. For example, we talked a bit about definitions, and how important they are within a museum context. For example, if you think about a collection of coins, defining types is key, and agreeing what those types are and what they should be called could be a very significant job in itself.  We were thinking about this in the context of providing authoritative identifiers for these types, so that different data sources can use the same terms.  Effectively identifying entities such as names and places are vital for museums, libraries and archives, of course, and then within the archive community we could also provide authoritative identifiers for things like levels of description. Workign together to provide authoritative and persistent URIs for these kinds of things could be really useful for our communities.

We talked about the value of promoting ‘storytelling’ and the limitations that may inhibit a more event-based approach. DBPedia (Wikipedia as Linked Data) may be at the centre of the Linked Data Cloud, but it may not be so useful in this context because it cannot chart data over time. For example, it can give you the population of Berlin, but it cannot give you the changing population over time. We agreed that it is important to have an emphasis on this kind of timeline approach.

We spent a little while looking at the British Museum’s departmental database, which includes some archives, but treats them more as objects (although the series they form a part of is provided, this contextual information is not at the fore – there is not a series description as such). The proposal is to find a way to join this system up with the central archive, maybe through the use of Linked Data.

We touched upon the whole issue of what a ‘collection’ is within the museum context, which is often more about single objects, and reflected on the challenge of how to define a collection, because even something like a cup and saucer could be seen as a collection…or it is one object?…or is a full tea set a collection?

For archivists, quite detailed biographical information is often part of the description of a collection. We do this in order to place the collection within a context. These biographical histories often add significant value to our descriptions, and sometimes the information in them may be taken from the archive collection, so new information may be revealed. Museums don’t tend to provide this kind of detail, and are more likely to reference other sources for the researcher to use to find out about individuals or organisations. In fact, referencing external sources is something archives are doing more frequently, and Linked Data will encourage this kind of approach, and may save us time in duplicating effort creating new biographical entries for the same person. (There is also the move towards creating separate name authorities, but this also brings with it big challenges around sharing data and using the same authorities).

We moved on to talk about Linked Data more specifically, and thought a bit about whether the emphasis should be on discovery or the quality and utility of what you get when you are presented with the results. We generally felt that discovery was key because Linked Data is primarily about linking things together in order to make new discoveries and take new directions in research.

book showing design patterns
Wellcome Library, London

One of the main aims of the day was to discuss the idea of the use of design patterns to help the cultural heritage community create and use Linked Data. It would facilitate the process of querying different graphs and getting reasonably predictable information back, if we could do things in common where possible. Richard has written up some thoughts about design patterns from a museum perspective and there is a very useful Linked Data Patterns book by Leigh Dodds and Richard Davis. We felt this could form a template for the sort of thing that we want to do. We were well aware that this work would really benefit from a cross-domain approach involving museums, archives, libraries and galleries, and this is what we hope to achieve.

We spoke briefly about the value of something like OpenCalais and wondered whether a cultural heritage version of this kind of extraction tool would be useful. If it was more tailored for our own sectors, it may be more useful in creating authorities, so that we can refer to things in a common way, as a persistent URL would be provided for the people, subjects, concepts, that we need to describe. We considered the scenario that people may go back to writing free text and then intelligent tools will extract concepts for them.

We concluded that it would be worth setting up a Wiki to encourage the community to get involved in exploring the idea of Linked Data Patterns. We thought it would be a good idea to ask people to tell us what they want to know – we need the real life questions and then we can think about how our data can join up to answer those questions. Just a short set of typical real-life questions would enable us to look at ways to link up data that fit a need, because a key question is whether existing practices are a good fit for what researchers really want to know.

Online Survey Results (2011)

We would like to share some of the results of our annual online survey, which we run each year, over a 3-4 week period. We aim for about 100 responses (though obviously more would be very welcome!), and for this survey we got 92 responses. We create a pop-up invitation to fill out the survey – something we do not like to do, but we do feel that it attracts more responses than a simple link.

Context

We have a number of questions that are replicated in surveys run for Zetoc and Copac, two bibliographic JISC-funded Mimas services, and this provides a means to help us (and our funders) look at all three services together and compare patterns of use and types of user.

This year we added four questions specifically designed to help us with understanding users of the Hub and to help us plan our priorities.

We aim to keep the number of questions down to about 12 at the most, and ensure that the survey will take no longer than 10 minutes to complete. But we also want to provide the opportunity for people to spend longer and give more feedback if they wish, so we combine tick lists and radio boxes with free text comments boxes.

We take the opportunity to ask whether participants would be willing to provide more feedback for us, and if they are potentially willing, they provide their email address. This gives us the opportunity to ask them to provide more feedback, maybe by being part of a focus group.

Results of the Survey

Profile

  • The vast majority of respondents (80%) are based in the UK for their study and/or work.
  • Most respondents are in the higher education sector (60%). A substantial number are in the Government sector and also the heritage/museum sector.
  • 20% of those using the Hub are students – maybe less than we would hope, but a significant number.
  • 10% are academics – again, less than we would hope, but it may be that academics are less willing to fill in a survey.
  • 50% are archivists or other information professionals. This is a high number, but it is important to note that it includes use of the Hub on behalf of researchers, to answer their enquiries, so it could be said to represent indirect use by researchers.
  • The majority of respondents use the service once or twice a month, although usage patterns were spread over all options, from daily to less than once a month, and it is difficult to draw conclusions from this, as just one visit to the Hub website may prove invaluable for research.

graph showing value of the HubUse and Recommendation

  • A significant percentage – 26% – find the Hub ‘neither easy nor difficult’ to use, and 3% of the respondents found it difficult to use, indicating that we still need to work on improving usability (although note that a number of comments were positive about ease of use) .
  • 73% agree their work would take longer without the Hub, which is a very positive result and shows how important it is to be able to cross-search archives in this way.
  • A huge majority – 93% – would recommend the Hub to others, which is very important for us. We aim to achieve 90% positive in this response, as we believe that recommendations are a very important means for the Hub to become more widely known.

Subject Areas

We spent a significant amount of time creating a list of subjects that would give us a good indication of disciplines in which people might use the Hub. The results were:

    • History 47
    • Library & Archive Studies 33
    • English Literature 17
    • Creative & Performing Arts 16
    • Education & Research Methods 10
    • Predominantly Interdisciplinary 9
    • Geography & Environment 5
    • Political Studies & International Affairs 5
    • Modern Languages and Linguistics 4
    • Physical Sciences 4
    • Special Collections 4
    • Architecture & Planning 3
    • Biological & Natural Sciences 3
    • Communication & Media Studies 3
    • Medicine 3
    • Theology & Philosophy 3
    • Archaeology 2
    • Engineering 2
    • Psychology & Sociology 2
    • Agriculture 1
    • Law 1
    • Mathematics 1
    • Business & Management Studies 0
  • History is, not surprisingly, the most common discipline, but literature, the arts, education and also interdisciplinary work all feature highly.
  • There is a reasonable amount of use from the subjects that might be deemed to have less call for archives, showing that we should continue to promote the Hub in these areas and that archives are used in disciplines where they do not have a high profile. It would be very valuable to explore this further.

graph showing use of archival websites

  • The Hub is often used along with other archival websites, particularly The National Archives and individual record office websites, but a significant number do not use the websites listed, so we cannot assume prior knowledge of archives.
  • It would be interesting to know more about patterns of use. Do researchers try different websites, and in what order to they visit them? Do they have a sense of what the different sites offer?
  • There is still low use of the European aggregators, Europeana and APENet, although at present UK archives are not well represented on these services and arguably they do not have a high profile amongst researchers (the Hub is not yet represented on these aggregators).

Subsequent activities

  • It is interesting to note that 32% visit a record office as a result of using the Hub, but 68% do not. It would be useful to explore this further, to understand whether the use of the Hub is in itself enough for some researchers. We do know that for some people, the description holds valuable information in and of itself, but we don’t know whether the need to visit a record office, maybe some distance away, prevents use of the archives when they might be of value to the researcher.

What is of most value?

  • We asked about what is important to researchers, looking at key areas for us. The results show that comprehensive coverage still tops the polls, but detailed descriptions also continue to be very important to researchers, somewhat in opposition tograph showing what is most valuable to researchers the idea of the ‘quick and dirty’ approach. More sophisticated questioning might draw out how useful basic descriptions are compared with no description and what sort of level of detail is acceptable.
  • Links to digital content and information on related material are important, but not as important as adding more descriptions and providing a level of detail that enables researchers to effectively assess archives.
  • Searching across other cultural heritage resources at the same time is maybe surprisingly less of a priority than content and links. It is often assumed that researchers want as much diverse information as possible in a ‘one-stop shop’ approach, but maybe the issues with things like the usability of the search,  navigation, number of results and relevance ranking of results illustrate one of the main issues – creating a site that holds descriptions and links to very varied content and still ensuring it is very easily understandable and researchers know what they are getting.
  • The regional search was not a high priority but a significant medium priority, and it might be argued that not all researchers would be interested in this, but some would find it particularly useful, and many archivists would certainly find it helpful in their work
  • We provided a free text box for participants to say what they most valued. The ability to search across descriptions, which is the most basic value proposition of the Hub, came out top, and breadth of coverage was also popular, and could be said to be part of the same selling point.
  • It was interesting to see that some respondents cited the EAD Editor as the main strength for them, showing how important it is to provide ways for archivists to create descriptions (it may be thought that other means are at their disposal, but often this is not the case).
  • Six people referred to the importance of the Hub for providing an online presence, indicating that for some record offices, the Hub is still the only way that collections are surfaced on the Web.

What would most improve the Hub?

  • We had a diversity of responses to the question about what would most improve the Hub, maybe indicating that there are no very obvious weaknesses, which is a good thing. But this does make it difficult for us to take anything constructive from the answers, because we cannot tell whether there is a real need for a change to be made. However, there were a few answers that focused on the interface design, and some of these issues should be addressed by our new ‘utility bar’ which is a means to more clearly separate the description from the other functions that users can then perform, and should be implemented in the next six months.

Conclusions

The survey did not throw up anything unexpected, so it has not materially affected our plans for development of the Hub. But it is essentially an endorsement of what we are doing, which is very positive for us. It emphasised the importance of comprehensive coverage, which is something we are prioritising, and the value of detailed descriptions, which we facilitate through the EAD Editor and our training opportunities and online documentation. Please contact us if you would like to know more.

The long tail of archives

For many of us, the importance of measuring use and impact are coming more to the fore. Funders are often keen for indications of the ‘value’ of archives and typically look for charts and graphs that can provide some kind of summary of users’ interaction with archives. For the Hub, in the most direct sense this is about use of the descriptions of archives, although, of course, we are just as interested in whether researchers go on to consult archives directly.

The pattern of use of archives and the implications of this are complex. The long tail has become a phrase that is banded around quite a bit, and to my mind it is one of those concepts that is quite useful. It was popularised by Chris Anderson, more in relation to the commercial world, relating to selling a smaller number of items in large quantities and a large number of items in relatively small quantities, and you can read more about it in Wikipedia: Long Tail.

If we think about books, we might assume that a smaller number of popular titles are widely used and use gradually declines until you reach a long tail of low use.  We might think that the pattern, very broadly speaking, is a bit like this:

I attended a talk at the UKSG Conference recently, where Terry Bucknell from the University of Liverpool was talking about the purchase of e-books for the University. He had some very whizzy and really quite absorbing statistics that analysed the use of packages of e-books. It seems that it is hard to predict use and that whilst a new package of e-books is the most widely used for that particular year, the older packages are still significantly used, and indeed, some books that are barely used one year may be get significant use in subsequent years. The patterns of use suggested that patron-driven acquisition, or selection of titles after one year of use, were not as good value as e-book packages, although you cannot accurately measure the return on investment after only one year.

Archives are kind of like this only a whole lot more tricky to deal with.

For archives, my feeling is that the graph is more like this:

No prizes for guessing which are the vastly more used collections*. We have highly used collections for popular research activities, archives of high-profile people and archives around significant events, and it is often these that are digitised in order to protect the originals.  But it is true to say that a large proportion of archives are in the ‘long tail’ of use.

I think this can be a problem for us. Use statistics can dominate perceptions of value and influence funding, often very profoundly. Yet I think that this is completely the wrong way to look at it. Direct use does not correlate to value, not within archives.

I think there are a number of factors at work here:

  • The use of archives is intimately bound up with how they are catalogued. If you have a collection of letters, and just describe it thus, maybe with the main author (or archival ‘creator’), and covering dates, then researchers will not know that there are letters by a number of very interesting people, about a whole range of subjects of great interest for all sorts of topics. Often, archivists don’t have the time to create rich metadata (I remember the frustrations of this lack of time). Having worked in the British Architectural Library, I remember that we had great stuff for social history, history of empire, in particular the Raj in India, urban planning, environment, even the history of kitchen design or local food and diet habits. We also had a wonderful collection of photographs, and I recall the Photographs Curator showing me some really early and beautiful photographs of Central Park in New York. Its these kind of surprises that are the stuff of archives, but we don’t often have time to bring these out in the cataoguing process.
  • The use of a particular archive collection may be low, and yet the value gained from the insights may be very substantial. Knowledge gained as a result of research in the archives may feed into one author’s book or article, and from there it may disseminate widely. So, one use of one archive may have high value over time. If you fed this kind of benefit in as indirect use, the pattern would look very different.
  • The ‘value’ of archives may change over time. Going back to my experience at the British Architectural Library, I remember being told how the drawings of Sir Edwin Lutyens were not considered particularly valuable back in the 1950s – he wasn’t very fashionable after his death. Yet now he is recognised as a truly great architect, and his archives and drawings are highly prized.
  • The use of archives may change over time. Just because an archive has not been used for some time – maybe only a couple of researchers have accessed it in a number of years – it doesn’t mean that it won’t become much more heavily used. I think that research, just like many things, is subject to fashions to some extent, and how we choose to look back at our past changes over time. This is one of the challenges for archivists in terms of acquisitions. What is required is a long-term perspective but organisations all too often operate within short-term perspectives.
  • Some archives may never be highly used, maybe due to various difficulties interpreting them. I suppose Latin manuscripts come to mind, but also other manuscripts that are very hard to read and those pesky letters that are cross-written. Also, some things are specialised and require professional or some kind of expert knowledge in order to understand them. This does not make them less valuable. It’s easy to think of examples of great and vital works of our history that are not easy for most people to read or interpret, but that are hugely important.
  • Some archives are very fragile, and therefore use has to be limited. Digitising may be one option, but this is costly, and there are a lot of fragile archives out there.

I’m sure I could think of some more – any thoughts on this are very welcome!

So, I think that it’s important for archivists to demonstrate that whilst there may be a long tail to archives, the value of many of those archives that are not highly used can be very substantial. I realise that this is not an easy task, but we do have one invention in our favour: The Web. Not to mention the standards that we have built up over time to help us to describe our content. The long tail graph does demonstrate to us that the ‘long tail of use’ can be just as much, or more, than the ‘high column of use’. The use of the Web is vital in making this into a reality, because researchers all over the world can discover archives that were previously extremely hard to surface.  That does still leave the problems of not being able to catalogue in depth in order to help surface content…the experiments with crowd-sourcing and user generated content may prove to be one answer. I’d like to see a study of this – have the experiments with asking researchers to help us catalogue our content proved successful if we take a broad overview? I’ve seen some feedback on individual projects, such as OldWeather:

“Old Weather (http://www.oldweather.org) is now more than 50% complete, with more than 400,000 pages transcribed and 80 ships’ logs finished. This is all thanks to the incredible effort that you have all put in. The science and history teams are constantly amazed at the work you’re all doing.” (a recent email sent out to the contributors, or ‘ship captains’).

If anyone has any thoughts or stories about demonstrating value, we’d love to hear your views.

* family history sources