Is Linked Data an Appropriate Technology for Implementing an Archive’s Catalogue?

Here at the Archives Hub we’ve not been so focussed on Linked Data (LD) in recent years, as we’ve mainly been working on developing and embedding our new system and workflows. However, we have remained interested in what’s going on and are still looking at making Linked Data available in a sustainable way. A number of years back we did a substantial amount of work on the LOCAH project, from which we provided a subset of archival linked data at data.archiveshub.ac.uk. Our next step this time round is likely to be embedding schema.org markup within the Hub descriptions. We’ve been closely involved in the W3C Schema Architypes Group activities, with Archives Hub URIs forming the basis of the group’s proposals for extending the “Schema.org schema for the improved representation of digital and physical archives and their contents”.

We are also aiming to reconnect more closely with the LODLAM community generally, and to this end I attended a TNA ‘Big Ideas’ session ‘Is Linked Data an appropriate technology for implementing an archive’s catalogue?’ given by Jean-Luc Cochard of the Swiss Federal Archives. I took a few notes which I thought it might be useful to share here.

Why did they look at Linked Data?

Their interest was initially inspired by the Stanford LD 2011 workshop and the 2014 Open data.swiss initiative. In 2014 they built their first ‘aLOD’ prototype – http://alod.ch/

The Swiss have many archive silos. They transformed the content of some of these systems to LD and were then able to merge the results. They created basic LD views, for example http://data.ge.alod.ch/id/archivalresource/adl-j-125, with Jean-Luc noting that the LD data is less structured than the data in the main archival systems.
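
The merging step can be pictured with a small sketch. It is not the aLOD pipeline: the URIs, property names and the two ‘silos’ below are invented for illustration, and plain Python tuples stand in for RDF triples. The point is only that once records from separate systems are expressed as triples that share identifiers, merging them is a set union.

```python
# Hypothetical triples (subject, predicate, object) exported from two silos.
# All identifiers below are made up for illustration.
silo_a = {
    ("ex:record/1", "ex:title", "Council minutes"),
    ("ex:record/1", "ex:creator", "ex:agent/town-council"),
}
silo_b = {
    ("ex:record/9", "ex:title", "Building plans"),
    ("ex:record/9", "ex:creator", "ex:agent/town-council"),
    ("ex:agent/town-council", "ex:name", "Town Council"),
}

# Merging triple sets is a set union; no shared table schema is needed.
merged = silo_a | silo_b


def records_by_creator(triples, agent):
    """Return the sorted subjects whose ex:creator is the given agent."""
    return sorted(s for (s, p, o) in triples if p == "ex:creator" and o == agent)


# Records from both silos are found through the identifier they share.
linked = records_by_creator(merged, "ex:agent/town-council")
print(linked)
```

In this toy case the query over the merged set finds records from both silos because they point at the same agent identifier. In practice, agreeing on such shared identifiers is likely to be the harder part of the work.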

They also developed a new interface, http://alod.ch/search/, with which they are trying out innovative ways of presenting the data, such as a histogram of dates. It’s currently just a prototype interface running off SPARQL, with only 16,000 entries so far.

They are also currently implementing a new archival information system (AIS) and are considering LD technology for it, but may go with a more conventional database approach. The new system has to work with the overall technical architecture.

Linked data maturity?

Jean-Luc noted that they expect born-digital material to expand by a factor of ten over the next three years, though 90% of the archive is currently analogue. The system needs to cope with between 50M and 1.5B triples. They have implemented Stardog triple stores, versions 5.0.5 and 5.2. The larger configuration is a machine with 1 TB of RAM, 56 CPUs and 8 TB of disk.

As part of performance testing they have tried loading the system with up to 10 billion triples and running various insert, delete and query functions. The larger configuration allowed 50M triples to be inserted in 5 minutes, while 100M-plus triples took 20 minutes. The update function was found to be quite stable. They then combined querying with triple insertions at the same time, which highlighted some issues with slow insertions on the smaller machine. They also tried full-text indexing on the larger machine. Results were very variable, with some very slow response times for insertions, which turned out to be caused by a bug in the system.
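
As a rough back-of-the-envelope check, the insert timings quoted above can be converted to triples per second. The figures are those reported in the talk as I noted them; the function name is just for illustration. Since ‘100M plus’ is a lower bound on the number of triples, the second rate is a lower bound too.

```python
def triples_per_second(triples, minutes):
    """Average insert rate for a bulk load of `triples` taking `minutes`."""
    return triples / (minutes * 60)


rate_50m = triples_per_second(50_000_000, 5)      # 50M triples in 5 minutes
rate_100m = triples_per_second(100_000_000, 20)   # at least 100M triples in 20 minutes

print(f"{rate_50m:,.0f} triples/s")
print(f"{rate_100m:,.0f} triples/s")
```

That works out at roughly 167,000 triples per second for the 50M load and at least 83,000 per second for the larger one, so, if the larger load was close to 100M, the average insert rate roughly halved as the load size doubled.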

Is Linked Data adequate for the task?

A key weakness of their current archival system is that you can only assign records to one provenance/person. Also, their current system can’t connect records to other databases, so they have the usual silo problem. Linked data can solve some of these problems. As part of the project they looked at various specs and standards:

  • BIBFRAME v2.0 (2016).
  • Europeana EDM (released 2014).
  • EGAD activities: RiC-CM, leading to RiC-O, based on OWL (Records in Contexts).
  • A local initiative, the Matterhorn RDF Model, which uses existing technologies (RDA, BPMN, DC, PREMIS). There is a first draft available.

They also looked at relevant EU R&D projects: ‘PRELIDA’, on the preservation of LD, and ‘Diachron’, on managing the evolution and preservation of LD.
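
To return to the single-provenance limitation mentioned above, the difference can be sketched in a few lines. This is an illustration only: the field names, URIs and the ex:provenance property are hypothetical and are not taken from the Swiss Federal Archives’ systems or from any of the models listed. A fixed record layout with one provenance field holds one value, whereas a set of triples can hold any number of statements about the same record.

```python
# A conventional record with a single provenance slot (hypothetical layout).
flat_record = {"id": "record/42", "provenance": "agent/ministry-a"}

# The same record as triples: one statement per provenance (invented URIs).
triples = {
    ("ex:record/42", "ex:provenance", "ex:agent/ministry-a"),
    ("ex:record/42", "ex:provenance", "ex:agent/ministry-b"),
}

# Adding a further provenance is just another statement, not a schema change.
triples.add(("ex:record/42", "ex:provenance", "ex:agent/person-x"))

provenances = sorted(
    o for (s, p, o) in triples
    if s == "ex:record/42" and p == "ex:provenance"
)
print(flat_record["provenance"])  # the flat layout can only ever name one
print(provenances)                # the triple set names all three
```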

Jean-Luc noted that the versatility of LD is appealing for several reasons –

  • It can be used at both the data and metadata levels.
  • It brings together multiple data models.
  • It allows data model evolution.
  • They believe it is adequate for publishing an archive catalogue on the web.
  • It can be used in a closed environment.

Jean-Luc mentioned a dilemma they have between RDF-based triple stores and graph databases. Graph databases tend to be proprietary solutions, but have some advantages. In particular, they tend to offer ACID transactions, which are intended to guarantee validity even in the event of errors, power failures, etc., whereas the team are not sure how reliably triple stores support ACID.

Their next step is expert discussion of a common approach, with a common RDF model. Further investigation is needed regarding triple store weaknesses.

Cathlin du Sautoy and Hermione Blackwood: personal papers at the Royal College of Nursing Archives

Archives Hub feature for May 2018

The archive of the Royal College of Nursing is a fascinating mix of business and personal. We collect the organisational records of the College, which go back to its foundation in 1916. These include meeting minutes, premises records, RCN publications and marketing ephemera, and tell the story of the College as a professional organisation (and later a trade union) for nurses. The other half of the archive consists of a large number of personal papers collections, each relating to an individual nurse and containing a vast array of items, from lecture notes and badges to First World War scrapbooks and photographs. Some of our oldest material predates the founding of the College by 50 years. The RCN’s personal papers collection is a wonderful source for learning about the lives and professional challenges of nurses across the UK.

Du Sautoy and Blackwood with nursing colleagues and Victor and Yvette at Blerancourt, 1921. Copyright the Royal College of Nursing, 2018.

The personal papers collection of Cathlin du Sautoy is a perfect example of the variety shown by our collections, not least because Cathlin du Sautoy was herself a very interesting woman.

Cathlin du Sautoy was born in 1875 to John and Annie du Sautoy. Her father was a civil engineer and the family lived in Yorkshire. After three years’ study of Domestic Science at Cardiff College she was appointed as lecturing sister at Tredegar House, the training school for nurses for the London Hospital. Her teaching subjects were Sick Room Cookery, Physiology, Hygiene and the Chemistry of Food. She then entered training at Guy’s Hospital for three years and was the Gold Medallist of her year. A career in nursing and nurse teaching followed, at such institutions as the Queen Victoria Jubilee Institute, the British Red Cross Society and the Ulster Medical Board. She was deeply involved with nursing in France during and after the First World War, organising Red Cross units in the UK and in France, and helping to set up an English-style District Nurse programme in Reims after the end of the war.

Yvette, Blackwood, Du Sautoy and Victor, Dec 1926. Copyright the Royal College of Nursing, 2018.

During the First World War, when she was in her late 30s, she met Lady Hermione Blackwood, who was a VAD in France. They would become lifelong companions, settling in the Vale of Health in Hampstead with their two adopted French children, Victor and Yvette, after the war. The couple acted as air-raid wardens during the Second World War and were active in the local area and hospital. Cathlin du Sautoy died in 1968, eight years after the death of Hermione Blackwood.

Du Sautoy and Blackwood with nursing colleagues and Victor at Vouziers, Feb 1920. Copyright the Royal College of Nursing, 2018.

Her papers clearly show an extremely capable nurse and family-oriented woman. The two sides of her life evidently fitted neatly together, with many photographs of Cathlin and Hermione in full nursing uniform, holding baby Victor (known as ‘Hiddy’) in France. There are letters in the collection about Cathlin’s career alongside letters from Hermione about the children’s clothes and their holiday plans. There are Cathlin’s nursing badges and medals and a copy of Hermione’s Queen’s Nursing Institute magazine. The collection is a beautiful mix of the personal and the professional and shows how, in nursing, the two often go hand-in-hand. The couple met whilst nursing and, whilst Lady Hermione Blackwood did not nurse after the war, Cathlin du Sautoy was actively involved in the management of the Royal College of Nursing and the running of the local hospital. She obviously had a deep interest in helping others, as her and Hermione’s stints as air raid wardens during the Second World War (when du Sautoy was in her 60s) show.

Du Sautoy in her ARP warden’s uniform, 1944. Copyright the Royal College of Nursing, 2018.

You can see more images of Cathlin and Hermione at the exhibition currently on display at RCN Scotland’s headquarters in Edinburgh – the couple are an important part of the exhibition, which celebrates diversity in nursing and is based largely on the RCN’s personal papers collections.

Sophie Volker, Archivist
Royal College of Nursing Archives

Related

Papers of Cathlin Du Sautoy, 1904-2007

Explore all Royal College of Nursing collections on the Archives Hub.

Royal College of Nursing exhibition: Hidden in plain sight: celebrating nursing diversity

All images copyright Royal College of Nursing Archives and reproduced with the kind permission of the copyright holders.