Creating a COVID-19 archive at the Royal College of Nursing

Archives Hub feature for November 2020

Now more than ever as we continue to battle the COVID-19 pandemic, the world is reliant on its digital infrastructure; the need to provide and access accurate and up-to-date information is of paramount importance. This raises some interesting questions, challenges and opportunities for archive services who can play their part in the collective response to the crisis by capturing and recording events, activities and decisions. Archives and recordkeeping professionals have always supported the notions of accountability and transparency through their work, something which is being demonstrated in real time during the development of the pandemic.

As the UK’s largest trade union and professional association for nurses, the Royal College of Nursing (RCN) has been supporting and representing nurses and healthcare workers throughout the pandemic. It is vital that records of how this has been done are available to the organisation in perpetuity as evidence of advice given and decisions taken. The RCN has a responsibility to its members to be able to demonstrate that the organisation has been working in their best interests and the interests of their patients. In turn, the RCN archive has a responsibility to ensure that records with evidential and research value are captured, preserved and accessible to the right audiences at the right time.

One of our first attempts at archiving the RCN COVID-19 webpages using our digital archive.

As a result, like many of our archivist and recordkeeping colleagues across the world, we have created a COVID-19 archive. Since the beginning of the year the RCN archive team have been actively collecting records relating to COVID-19 from across the organisation to build up a picture of how the pandemic has unfolded through the eyes of RCN members and staff. Unsurprisingly, this covers a wide range of record types and digital formats: web crawls of special COVID-19 webpages containing up-to-date guidance and advice, targeted staff emails, member surveys on working conditions and PPE, General Secretary’s video messages, special committee situation reports, newly created online nursing resources, publications – the list could go on. Within this set of records is a complex combination of access requirements and restrictions which, through balancing business confidentiality with public interest, we will manage alongside the records themselves.

We are in the fortunate position of having a remotely accessible network and a digital archive, which has meant that we have been able to collect these records as they have been created and start uploading them to our digital archive straight away. While some of the records we’re collecting as part of the COVID-19 archive project would have been transferred to us anyway, there are several new record series on our 2020 collecting plan as a result of the pandemic. For example, our first venture in web archiving was a test crawl of the RCN COVID-19 webpages; these are now collected regularly and form an integral part of the COVID-19 archive. Having seen and been inspired by the experiences of other archives already running successful daily web crawls to capture public advice and the public response, we decided to capture our pages daily as well – this ensured that we were keeping up to speed with each piece of new advice and guidance shared on the webpages. As the rate of updates to the pages has slowed, we have since reduced the frequency to weekly, although we continue to monitor them, ready to capture more frequently if needed. This was the pilot web archiving project we didn’t know we were doing until it happened, and it has in turn sparked interest in a larger web archiving project to capture the whole RCN website, which is well underway.

A video message from Donna Kinnair, General Secretary, on the staff intranet. An example of the range of formats collected for the COVID-19 archive.

Alongside the collecting of material, we have been considering how the records of the COVID-19 archive will fit into our existing catalogue structure. While it would be easy to create a new Fonds for COVID-19, we realised that this view was being skewed by our thoughts about future access to the material, and the ease with which colleagues or researchers would be able to view all the material neatly packaged together. Instead we plan to preserve the context of the records by arranging them by creator (in our case, mostly the department of origin) so that they fit within our existing catalogue structure. There will be occasions when it is important to view all COVID-19 records together to get a complete picture of the reaction and response to the pandemic, so using the ‘linked collection’ feature in our digital archive we plan to create a virtual COVID-19 collection containing records from across different record series to allow this level of access. Beyond this we are considering which records from our COVID-19 archive will be shared on our public digital archive website to ensure the transparency and accountability that creating the COVID-19 archive in the first place helps to achieve.

We have certainly learnt a lot this year and the team has upskilled, becoming more proficient and confident in processing a wide range of digital formats, from collection through to access. Our sector has also stepped up by providing online webinars and training events to share our experiences of this extraordinary time. In May we participated in a panel discussion facilitated by Preservica, our digital archive supplier, who generously donated 250GB of storage space for us to store the COVID-19 archive. At the event we shared our plans and projects for collecting COVID-19 records with the archive community alongside colleagues from a wide range of institutions. These included Network Rail, who have been collecting records such as emergency train timetables introduced in response to falling customer demand, and all the documentation that went into making this happen, and the University at Buffalo in the US, who are encouraging students and staff to share their experiences of the pandemic by submitting video diaries and photographs to the archive. Learning about and reflecting on the wide range of collecting projects happening around the world is as informative as it is inspiring.

An example of a publication for the COVID-19 archive. This is the cover of the April 2020 Bulletin RCN members magazine.

It is amazing to think that in the (probably not too distant) future the COVID-19 records we have collected will be catalogued, available to view online through our digital archive and used to inform research into, and evaluations of, the response of the UK’s largest independent nursing organisation and our role in how Britain handled the pandemic.

Katherine Chorley, Digital Asst Archivist
Royal College of Nursing Archives


Browse all Royal College of Nursing Archives collections on the Archives Hub.

Previous RCN Archives feature: Cathlin du Sautoy and Hermione Blackwood: personal papers at the Royal College of Nursing Archives

All images copyright Royal College of Nursing Archives. Reproduced with the kind permission of the copyright holders.

Digital Curation: think use, not preservation

For the keynote presentation at the DCC/RIN Research Data Management Forum on ‘The Economics of Applying and Sustaining Digital Curation’, Chris Rusbridge gave us some reflections from the Blue Ribbon Task Force on Sustainable Digital Preservation and Access (BRTF). This was a two-year project, finishing earlier this year, and the final report is available online.

Chris kicked off by asking us to think about how we currently support access to digital information. Avenues include Government grants, advertisements (e.g. through Google), subscriptions (to journals), pay per service (e.g. Amazon Web Services), and donations.

One of the key themes that he raised and returned to was the alignment, or lack of alignment, between those who pay, those who provide and those who benefit from digital data: they are not necessarily the same, and the more different they are the harder it may be to create a sustainable model. Who owns, who benefits, who selects, who preserves, who pays? This has interesting parallels with archive repositories, where an institution may pay for the acquisition, appraisal, storage, cataloguing and access for these resources, but the beneficiaries are far broader than just members of the institution. Some institutions may require payment for access, but others will provide access free of charge. They may see this as a means to enhance their reputation and status as a learned society.

Around 15 years ago we started to think about digital preservation as a technical problem and then the OAIS reference model was produced. The technical capabilities that we now have are well up to the task, although Chris warned that the most elegant technical solution is no good if it is not sustainable; digital preservation has to be a sustainable economic activity. Today the focus is on the economic and organisational problems. It is not just about money; it requires building upon a value proposition, providing incentives to act and defining roles and responsibilities.

Digital preservation represents a derived demand. No one ‘wants’ preservation per se; what they want is access to a resource. It is not easy to sell a derived demand – often it needs to be sold on some other basis. This idea of selling the importance of providing use (over time) rather than trying to sell the idea of preservation was emphasised throughout the Forum.

Digital preservation is also ‘path dependent’, meaning that the actions and decisions you take change over time; they are different at different points of the life-cycle. Today’s actions can remove other options for all time.

Cultural issues and mindset may be a factor here, and I was interested in the potential problem Chris proposed of the ‘free-rider’ culture when it comes to making research datasets available. It may be that some (many?) researchers don’t want to pay for things, undervalue services and maybe underestimate costs. Researchers may also resent conformity and what they see as bureaucracy. All in all, it may be difficult to make a case that researchers should in some way pay. This may be compounded by a sense that money invested in preservation is money taken out of research. Chris suggested that the incentives for preservation are less apparent to the individual researcher, but are more clearly defined when the data is aggregated.

Typically, long-term preservation activities have been funded by short-term resource allocation, although maybe this is gradually changing; a more thorny issue is that of recognising and valuing the benefits of digital preservation, to provide incentives that attract funding. More work needs to be done on articulating the benefits in order to cultivate a sense of the value. However, other speakers at the Forum wondered whether we should actually take the value as a given – maybe we shouldn’t keep asking the question about benefits, but simply acknowledge that it is the right thing to make research and other digital outputs available long-term? We may be creating problems for ourselves if we emphasise the need to demonstrate value too much, and then struggle to quantify the value. However, this was just one argument, and overall I think that there was a belief that we do need to understand and articulate the benefits of providing long-term access.

There is often a lack of clear responsibility around digital preservation – maybe this is one of those areas where it’s always thought to be someone else’s responsibility? So, appropriate organisation and governance are essential for efficient ongoing preservation, especially when considering the tendency for data to be transferred – these ‘handoffs’ need to be secure.

The three imperatives that the BRTF report comes up with are: to articulate a compelling value proposition; to provide clear incentives to preserve in the public interest; to define roles and responsibilities.

Commenting briefly on the post-BRTF developments, Chris mentioned the EU digital agenda and the LIBER pan-European survey on sustainability preparedness.

There are some mandates emerging:  the NERC and ESRC, for example.  Some publishers do require authors to make available data that substantiates an article, but at present this is not rigorous enough. We need to focus more on the data behind the research and how important it is.

Chris contrasted domain data repositories and institutional data repositories. Domain data repositories: leverage scale and expertise; are valuable for ‘high curation’ data; can carry out a ‘community proxy’ role such as tool development; aggregate demand; are potentially vulnerable to policy change (e.g. AHDS). A mixed funding model is desirable for domain data repositories (e.g. ICPSR). Institutional data repositories: have a reputational business case (risk management, records management aspects, showcasing); should be aligned with institutional goals; can link to institutional research services (e.g. universal backup); can work well for ‘low curation’ cases (relatively small, static datasets); can aggregate demand across a set of disciplines.

One issue that came up in the discussion was that we must remember that in fact digital preservation is relatively cheap, especially when compared to the preservation of hard-copy archives, held in acid-free boxes on rows and rows of shelving in secure, controlled search rooms.  So, if the cost is actually not prohibitive, and the technical know-how is there, then it seems imperative to address the organisational issues and to really hammer home the true value of preserving our digital data.

Democracy 2.0 in the US

Democracy 2.0: A Case Study in Open Government from across the pond.

I have just listened to a presentation by David Ferriero – 10th Archivist of the US at the National Archives and Records Administration (NARA). He was talking about democracy, about being open and participatory. He contrasted the very early days of American independence, where there was a high level of secrecy in Government, to the current climate, where those who make decisions are not isolated from the citizens, and citizens’ voices can be heard. He referred to this as ‘Democracy 2.0.’ Barack Obama set out his open government directive right from the off, promoting the principles of more transparency, participation and collaboration. Ferriero talked about seeking to inform, educate and maybe even entertain citizens.

The backbone of open government must be good record keeping. Records document individual rights and entitlements, record actions of government and who is responsible and accountable. They give us the history of the national experience. Only 2-3 percent of records created in conducting the public’s business are considered to be of permanent value and therefore kept in the US archives (still, obviously, a mind-bogglingly huge amount of stuff).

Ferriero emphasised the need to ensure that Federal records of historical value are in good order. But too many records are still at risk of damage or loss. A recent review of record keeping in Federal Agencies showed that 4 out of 5 agencies are at high or moderate risk of improper destruction of records. Cost effective IT solutions are required to address this, and NARA is looking to lead in this area. An electronic records archive (ERA) is being built in partnership with the private sector to hold all the Federal Government’s electronic records, and Ferriero sees this as the priority and the most important challenge for the National Archives. He felt that new kinds of records, that is, records created as a result of social media, create new challenges, and an ERA needs to be able to take care of these types of records.

Change in processes and change in culture is required to meet the new online landscape. The whole commerce of information has changed permanently and we need to be good stewards of the new dynamic. There needs to be better engagement with employees and with the public. NARA are looking to improve their online capabilities to improve the delivery of records. They are developing their catalogue into a social catalogue that allows users to contribute and using Web 2.0 tools to allow greater communication between staff. They are also going beyond their own website to reach users where they are, using YouTube, Twitter, blogs, etc. They intend to develop a comprehensive social media strategy (which will be well worth reading if it does emerge).

The US Government are publishing high value datasets online, and Ferriero said that they are eager to see the response to this, in terms of the innovative use of data. They are searching for ways to step up digitisation – looking at what to prioritise and how to accomplish the most with least cost. They want to provide open government leadership to Federal Agencies, for example, mediating in disputes relating to FoI. There are around 2,000 different security classification guides in the government, which makes record processing very complex. There is a big backlog of documents waiting to be declassified, some pertaining to World War Two, the Korean War and the Vietnam War, so they will be of great interest to researchers.

Ferriero also talked about the challenge of making the distinction between business records and personal records. He felt that the personal has to be there, within the archive, to help future researchers recreate the full picture of events.

There is still a problem with Government Agencies all doing their own thing. The Chief Information Officers of all agencies have a Council (the CIO Council). The records managers have the Records Management Council. But it is a case of never the twain shall meet at the moment. Even within Agencies the two often have nothing to do with each other… there are now plans to address this!

This was a presentation that ticked many of the boxes of concern – the importance of addressing electronic records, new media, bringing people together to create efficiencies and engaging the citizens. But then, of course,  it’s easy to do that in words….