Finding and accessing archives for voluntary action history

Guest Blog by Georgina Brewis

It would not be an exaggeration to say that the history of voluntary, civic and cultural organisations has never been more popular as an academic subject in Britain. Leading historians like Brian Harrison have called attention to the importance of voluntarism as a theme in post-war British history, while there has been a wave of PhD theses dealing with topics such as the voluntary hospitals, the role of disability charities in politics, the professionalization of the voluntary sector and the formation of humanitarian networks across empire. In 2011 no fewer than three edited collections presenting the latest research on voluntary action history were published, and several further volumes appeared in 2012 or are in press. Such new research has been strengthened and sustained by the Voluntary Action History Society and particularly its active New Researchers group. Importantly, not all these studies are by historians, pointing to the importance of archival resources for students of political science, sociology, health studies and other disciplines. There is growing recognition that we cannot write British social history or social policy without looking at the considerable contributions of charities, voluntary groups, philanthropists, campaigners and volunteers.

So how do academic researchers track down the archives of the voluntary and community organisations they want to use? Any would-be researcher of charity needs to understand that those bodies with catalogued and accessible institutional archives – whether kept in-house or deposited elsewhere – represent only a very small minority of voluntary organisations. Unsurprisingly these tend to be the larger, better funded and longer-established groups such as the British Red Cross or the Children’s Society.  The voluntary sector in Britain is often likened to a pyramid: a very small number of organisations at the top with paid staff, regular income and office space resting on a much larger base of groups run entirely by volunteers, subsisting on small grants and donations. Voluntary sector archives may reflect this pattern, but there is no guarantee that even the largest charity will have made provision for preservation and conservation of its records (aside from the limited financial data required by the Charity Commission) let alone for cataloguing or access.

Researchers and students are advised to start with the National Register of Archives. Another useful database is DANGO, which identifies the locations of the papers of several thousand non-governmental organisations and was put together by a team at Birmingham University, although the end of project funding means its entries and website are no longer being updated. Searching the Archives Hub will find records of voluntary groups where these are deposited at an institution included in its database; Hull History Centre, SOAS, Birmingham University Library and the Women’s Library have all built up specialisms in this area. Perhaps there would be a way of encouraging charities with in-house collections to make their catalogues available via Archives Hub?

Archives Hub has helped me search for materials relating to small or short-lived student-run charities that may be contained within a students’ union archive or an individual’s private papers. Although an organisation’s institutional archive may be lost or never have existed, its history can be reconstructed through accessing annual reports, correspondence and other papers held in many different repositories – as I have managed to do for the group International Student Service. It would be helpful for future researchers if it were possible to log this information somewhere.

It remains the case that many researchers will have to seek access to records by contacting an organisation or group founder directly, with variable results. This is likely to become more common given the growth in the number of pressure groups, charities and other voluntary bodies since the 1960s. In my experience there is a range of practice, from organisations which ignore or refuse requests for access with varying degrees of politeness to those that welcome you with open arms and let you sit unsupervised with the charity’s papers, free to copy, remove, deface or pour coffee all over the institutional record. Once you’ve had success accessing the records of one organisation, it may be easier to open communications with others in a related sector. Learning how to negotiate what we might call ‘informal archives’ will be a key challenge for future researchers of voluntary action. There is a need for better advice for academics, particularly students and new researchers, on the multiple ethical considerations and practical concerns that come with using informal archives. How do you track down such records? How do you reference sources? What do you do if you’re concerned about the physical state of records or what might happen to them when the group’s founder dies? How do you reconcile your obligations as a historian with the fact that a particular organisation has trusted you to look at their materials?

It is also worth remembering that records relating to charitable activities can turn up in unexpected places, for example in the archives of private companies. The records of a charitable Trust or Foundation may well contain better sources about a particular charity than the organisation itself has preserved, although again there may be problems of access. There are good signs that this is changing, not least through the positive examples of two funders involved with the new Campaign for Voluntary Sector Archives: the Barrow Cadbury Trust and the Diana, Princess of Wales Memorial Fund.

This new Campaign for Voluntary Sector Archives, which was launched at the House of Lords in October 2012, seeks to raise awareness of the importance of voluntary sector archives as strategic assets for governance, corporate identity, accountability and research. It maintains that caring for archives and records is actually an important aspect of the sector’s wider public benefit responsibility. Most significantly, the Campaign brings together academic researchers, custodians, creators of records and others in the voluntary sector to share expertise and resources. Together, we should be able to begin to address some of the issues and questions I’ve outlined above. Yet there is a long way to go before all voluntary organisations are convinced not only of the value of records to the current mission, but also of the value of making these accessible to researchers from a variety of disciplines. For more information contact info@voluntarysectorarchives.org.uk

The New Scholarly Record

I was lucky enough to attend the 2012 EmTACL conference in Trondheim, and this blog is based around the excellent keynote presentation by Herbert van de Sompel, which really made me think about temporal issues with the Web and how this can limit our understanding of the scholarly record.

Herbert believes that the current infrastructure for scholarly communication is not up to the job. We now have many non-traditional assets, which do not always have fixity and often have a wide range of dependencies: assets such as datasets, blogs, software, videos and slides, which may form part of a scholarly resource. Everything is much more dynamic than it used to be. ‘Research objects’ often include assets that are interdependent, so they need to be available all together for the object to be complete. But this is complicated by the fact that many of them are ‘in motion’ and updated over time.

This idea of dynamic resources that are in flux, constantly being updated, is very relevant for archivists, partly because we need to understand how archives are not static and fixed in time, and partly because we need to be aware of the challenges of archiving ever more complex and interconnected resources. It is useful to understand the research environment and the way technology influences outputs and influences what is possible for future research.

There are examples of innovative services that are responding to the opportunities of dynamic resources. One that Herbert mentioned was PLOS, which publishes open scholarly articles. It puts publications into Wikipedia as well as keeping the ‘static’ copy, so that the articles have a kind of second life where they continue to evolve as well as being kept as they were at the time of submission. For example, ‘Circular Permutation in Proteins’.

The idea of executable papers is starting to become established – papers that are not just to read but to interact with. These contain access to the primary data, with capabilities to re-execute algorithms and even to allow researchers to upload and use their own data. This creates complex interdependencies and a challenge for archiving: if something is not fixed in time, what does that mean for retaining access to it over time?

This all raises the issue of what the scholarly record actually is. Where does it start? Where does it end? We are no longer talking about a bunch of static files but a dynamic interconnected resource. In fact, there is an increasing sense that the article itself is not necessarily the key output, but rather it is the advertising for the actual scholarship.

Herbert concluded from this that it becomes very important to be able to view different points in time in the evolution of the scholarly record, and this should be done in a way that works with the Web. The Web is the platform, the infrastructure for the scholarly record. Scholarly communication then becomes native to the Web. At the heart of this is the need to use HTTP URIs.

However, where are we at the moment? The current archival infrastructure for scholarly outputs deals with things with fixity and boundaries. It cannot deal with things in flux and with inter-dependencies. The Web exists in ‘now’ time; it does not have a built-in notion of time. It assumes that you want the current version of something – you cannot use a URI to get to a prior version.

Figure: Slide from Herbert van de Sompel’s presentation showing the publication context on the Web.

We don’t really object to this limitation, something evidenced by the fact that we generally accept links that take us to 404 pages, as if it is just an inevitable inconvenience. Maybe many people just don’t think that there is any real interest in or requirement for ‘obsolete’ resources, and what is current is what is important on the Web.

Of course, there is the Internet Archive and other similar initiatives in Web archiving, but they are not integrated into the Web. You have to go somewhere completely different in order to search for older copies of resources.

If the research paper remains the same, but resources that are an integral part of it change over time, then we need to change archiving to reflect this. We need to think about how to reference assets over time and how to recreate older versions. Otherwise, we access the current version, but we are not getting the context that was there at the time of creation; we are getting something different.

Can we recreate a version of a scholarly record? Can we go back to a certain point in time so we can see linked assets from a paper as they were at the time of publication? At the moment we are likely to get many 404s when we try to access links associated with a publication. Herbert showed one survey on the decay of URLs in Medline, which is about 10% per year, especially with links to things like related databases.
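To get a feel for what a decay rate of about 10% per year means over the life of a paper, here is a minimal sketch in Python. It assumes, purely for illustration, that the rate is constant and compounds from year to year, which the survey itself may not claim; the function name `surviving_fraction` is invented for this example.

```python
def surviving_fraction(annual_decay: float, years: int) -> float:
    """Fraction of links still working after `years`.

    Assumes a constant annual decay rate that compounds each year
    (an assumption made for this sketch, not a finding of the survey).
    """
    if not 0.0 <= annual_decay <= 1.0:
        raise ValueError("annual_decay must be between 0 and 1")
    if years < 0:
        raise ValueError("years must not be negative")
    return (1.0 - annual_decay) ** years


if __name__ == "__main__":
    for years in (1, 5, 10):
        share = surviving_fraction(0.10, years)
        print(f"after {years} years: {share:.0%} of links still work")
```

Under that assumption, roughly 59% of the links in a paper would still work after five years and about 35% after ten.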

One solution to this is to be able to follow a URI in time – to be able to click on a URI and say ‘I want to see this as it was 2 years ago’. Herbert went on to talk about something he has created called Memento. Memento aims to better integrate the current and past Web. It allows you to select a day or time in the browser and effectively take the URI back in time. Currently, the team are looking at enabling people to browse past pages of Wikipedia. Memento has a fairly good success rate with going back to retrieve old versions, although it will not work for all resources. I tried it with the Archives Hub and found it easy to take the website back to how it looked right in the very early days.

Figure: Screenshot of the Archives Hub homepage, using Memento to take the Archives Hub back in time.
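In protocol terms, Memento works over ordinary HTTP: a client asks a ‘TimeGate’ for a URI and states the moment it is interested in with an Accept-Datetime request header (the protocol is specified in RFC 7089). The Python sketch below only builds such a request rather than sending it. The TimeGate address is an assumption used for illustration, as are the function name and the example URI.

```python
from datetime import datetime, timezone
from email.utils import format_datetime

# Assumed TimeGate endpoint, used only for illustration; any TimeGate would do.
TIMEGATE = "http://timetravel.mementoweb.org/timegate/"


def build_memento_request(uri: str, when: datetime) -> tuple[str, dict[str, str]]:
    """Return the URL and headers for asking a TimeGate for `uri` as it was at `when`."""
    if when.tzinfo is None:
        raise ValueError("`when` must be timezone-aware")
    # Accept-Datetime carries an HTTP date expressed in GMT.
    http_date = format_datetime(when.astimezone(timezone.utc), usegmt=True)
    return TIMEGATE + uri, {"Accept-Datetime": http_date}


if __name__ == "__main__":
    url, headers = build_memento_request(
        "http://example.org/",  # hypothetical resource
        datetime(2002, 1, 1, tzinfo=timezone.utc),
    )
    print(url)
    print(headers)
```

If the archive holds a copy, sending this request should lead to a redirect to the archived version closest to the requested date, with a Memento-Datetime response header saying when that copy was captured.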

One issue is that the archived copies are not always created near the time of publication. But for those that are, they are created simply as part of the normal activity of the Web, by services like the Internet Archive or the British Library, so there is no extra work involved.

Herbert outlined some of the issues with using DOIs (digital object identifiers), which rely on a resolver so that a resource’s identifier can remain the same over time. This is useful if, for example, a publisher is bought out – the identifier is still the same, as the resolver redirects to the right location. However, a DOI resolver exists in the perpetual now. It is not possible to travel back in time using HTTP URIs. This is maybe one illustration of the way some of the processes that we have implemented over the Web do not really fulfil our current needs, as things change and resources become more complex and dynamic.

With Memento, the same HTTP URI can function as the reference to temporally evolving resources. The importance of this type of functionality is becoming more recognised. There is a new experimental URI scheme, DURI, or Dated URI. The idea is that a URI, such as http://www.ntnu.no, can be dated: 1997-06-17:http://www.ntnu.no (this is an example and is not actionable now). Herbert also raised another possibility: developing Websites that can deal with the TEL (telephone) protocol. The idea would be that the browser asks you whether the Website can use the TEL protocol, and if it can, you get this option offered to you. You can then use this to reference a resource and use Memento to go back in time.
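To make the dated-URI idea concrete, the following Python sketch splits the notation used in the example above into its date and its URI. It follows only the `date:URI` form shown in this post, not any published specification of the DURI scheme, and the function name `parse_dated_uri` is made up for illustration.

```python
from datetime import date


def parse_dated_uri(dated_uri: str) -> tuple[date, str]:
    """Split 'YYYY-MM-DD:<uri>' into (date, uri).

    Only handles the notation used in this post; raises ValueError otherwise.
    """
    # The date part contains no colon, so the first colon ends it.
    date_part, separator, uri = dated_uri.partition(":")
    if not separator or not uri:
        raise ValueError(f"not a dated URI: {dated_uri!r}")
    return date.fromisoformat(date_part), uri


if __name__ == "__main__":
    when, uri = parse_dated_uri("1997-06-17:http://www.ntnu.no")
    print(when.isoformat(), uri)
```

The date and the plain URI recovered this way are exactly the two things a Memento request needs: which resource, and as of when.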

Herbert concluded that ‘archiving’ should not be just a one-off event, but needs to happen continually. In fact, it could happen whenever there is an interaction. Also, when new materials are taken into a repository, you could scan for links and put them into an archive, so the links don’t die. If you archive the links at the time of publication or when materials are submitted to a repository, then you protect against losing the context of the resource.
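The link-scanning step could look something like the Python sketch below, which pulls the outbound Web links out of an HTML document at the point of deposit. It is an illustration only: the class and function names are invented, it looks only at `<a href>` links, and the step of actually submitting each link to a Web archive is left out because it depends on the archive being used.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collects absolute http(s) links from <a href="..."> tags."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        for name, value in attrs:
            if name == "href" and value and value.startswith(("http://", "https://")):
                self.links.append(value)


def links_to_archive(html: str) -> list[str]:
    """Return the distinct outbound links in `html`, in order of first appearance."""
    collector = LinkCollector()
    collector.feed(html)
    collector.close()
    return list(dict.fromkeys(collector.links))


if __name__ == "__main__":
    sample = '<p>See <a href="http://example.org/data">the data</a> and <a href="#top">top</a>.</p>'
    print(links_to_archive(sample))
```

A repository could run something like this on each submission and pass the resulting list to whichever archiving service it uses.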

Herbert introduced us to SiteStory, which offers transactional archiving of a web server. Usually a web archive sends out a robot, which gathers and dumps the data. With SiteStory the web server takes an active part. Every time a user requests a page it is also pushed back into the archive, so you get a fine-grained history of the resource. Something like this could be done by publishers/service providers, with the idea that they hold onto the hits, the impact, the audience. It certainly does seem to be a growing area of interest.
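The general idea of transactional archiving can be sketched in a few lines of Python as a wrapper around a web application. This is not SiteStory’s own code or design: the class name is invented, the ‘archive’ is just an in-memory dictionary, and storing a new version only when the content has changed is a choice made for this sketch.

```python
from datetime import datetime, timezone


class TransactionalArchive:
    """WSGI middleware that keeps a copy of each response the server sends."""

    def __init__(self, app):
        self.app = app
        self.history = {}  # path -> list of (timestamp, body) versions

    def __call__(self, environ, start_response):
        result = self.app(environ, start_response)
        try:
            body = b"".join(result)
        finally:
            # WSGI asks callers to close the response iterable if it can be closed.
            close = getattr(result, "close", None)
            if close is not None:
                close()
        path = environ.get("PATH_INFO", "/")
        versions = self.history.setdefault(path, [])
        # Record a new version only when the content differs from the last one.
        if not versions or versions[-1][1] != body:
            versions.append((datetime.now(timezone.utc), body))
        return [body]
```

Because the copy is taken as the page is served, every version that a user actually saw ends up in the history, which a periodic crawl cannot promise.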

Herbert’s slides are available on Slideshare.