Exploring New Worlds in the Archives Hub

This blog post forms part of History Day 2020, a day of online interactive events for students, researchers and history enthusiasts to explore library, museum, archive and history collections across the UK and beyond.

Use the Archives Hub, a free resource, to find unique sources for your research, both physical and digital. Search across descriptions of archives, held at over 350 institutions across the UK.

History Day 2020 coincides with the Being Human festival, the UK’s national festival of the humanities. Their theme this year is ‘New Worlds’, so taking this as our inspiration, we’re highlighting a range of archive collections – across Travel, Exploration, Space Exploration and Science Fiction.

Travel

Austen Henry Layard’s passport (1) (LAY/1/4/8). Image copyright: University of Newcastle.

Unearthing Family Treasures: The Layard and Blenkinsopp Coulson Archives
In 1839 a young lawyer left behind his London office for a post in the Ceylon (now Sri Lanka) Civil Service, thus beginning a series of travels, adventures and discoveries which would result in him achieving world renown for uncovering and shining a light on the ancient civilizations of Mesopotamia, in particular Assyrian culture. That young man was Austen Henry Layard. Read the feature, by University of Newcastle Special Collections.

Papers of Elizabeth Thomson, 1847-1918, teacher, missionary, traveller and suffragette, c1914
Throughout the 1890s and 1900s Thomson travelled the world with her sister, Agnes, working as teachers and missionaries. The countries they visited include India, Japan, the USA, Germany and Italy. In the summer of 1899 Thomson reports that she visited Faizabad in India to learn Urdu but could not stand the heat and left for Almora in 1902. In 1907 she sailed to Bombay to complete missionary work, before teaching English in Sangor for the winter. In 1909 she travelled back to the UK, via Vienna, Prague, Dresden and Berlin, to settle in Edinburgh. Material held by University of Glasgow Archive Services – see the full collection description.

Steel engraving, 1875. © Image is in the public domain.

Sentimental Journey: a focus on travel in the archives
The hundreds of collections relating to travel featured in the Archives Hub shed light on multiple aspects of travel, from royalty to the working classes, and encompassing touring, business, exploration and research, the work of missionaries and nomadic cultures. Read the feature.

An abstract of a voyage from England to the Mediteranian: the diary of an anonymous English naval victualler, 1694-1696
Contains the log of an anonymous English naval victualler on a voyage from Gravesend in England to Cadiz in the Mediterranean between 31 December 1694 and 29 October 1696. Material is in English, Spanish, Latin and Hebrew. Written in a single neat late seventeenth-century English hand with the text on each page set within faint ruled lines. There are many tables, diagrams, and quite finely-drawn illustrations of places en route, especially in Spain, and interesting objects, such as keys and seals. Material held by University of Leeds Special Collections – see the full collection description.

Bodiwan Papers, 1634-1923
The papers of Michael D. Jones and his family, which include numerous letters to Michael D. Jones from the Welsh settlers in Patagonia or relating to them, prior to the sailing of the Mimosa and after. Amongst them is a letter from Charles de Gaulle, the eminent Breton and Celticist, expressing his interest in the scheme to found a Welsh colony in Patagonia. Also, amongst the correspondents are L. Patagonia Humphreys, Rev. D. Lloyd Jones, Rhuthun and Mihangel ap Iwan and Llwyd ap Iwan. The papers reflect the hardship suffered by the new settlers as well as the investment made by Michael D. Jones in the venture. There are bills and receipts relating to the Mimosa, share certificates, statistics regarding population for 1879. Also, a bank pass book of the Welsh Colonising and General Trading Company Ltd, 1870-1883, and a register of the Welsh applicants to Patagonia, 1875-1876. The collection is held by Archifdy Prifysgol Bangor / Bangor University Archives – see the full collection description.

The London to Istanbul European Highway
Part of The National Motor Museum Trust Motoring Archive’s Bradley Collection, including striking illustrations by Margaret Bradley. Read the feature.

The handsome blue car, by Margaret Bradley. ‘With apologies…this being a rough sketch…made somewhere in the middle of no mild channel’. Sketch by Margaret Bradley, copyright the National Motor Museum Trust.

Exploration

Cambridge Svalbard Exploration Collection, 1933-1992
The collection documents many decades of scientific work undertaken by (mostly) Cambridge researchers from 1938 until the early 1990s. These were mostly led by Walter Brian Harland (1917-2003), who also became the collator of the materials collected in Spitsbergen. The documentary archive complements the physical collection of geological specimens collected during those expeditions. Svalbard is located in the north-western corner of the Barents Shelf, 650km north of Norway. The shelf takes its name from the Dutch captain Barents, who is credited with the modern discovery of the islands in 1596 and after whom the Barents Sea is also named. Collection held by Sedgwick Museum of Earth Sciences, University of Cambridge – see the full collection description.

Online Resource: Old Maps Online – provided by the Great Britain Historical GIS Project, Old Maps Online is a search portal that combines the historical map collections of several organisations around the world. Users can search across collections through a single interface and easily locate multiple maps of a geographical area. The interface is free and access is open to all users. A wide range of different types of map are available, including: land maps; sea charts; boundary and estate maps; military and political maps; and town plans. Historical maps of many countries are available – including South and Central America from the 16th to the 20th centuries; Britain and particularly London, up to 1860; North America in the 18th and 19th centuries; pre-1900 Dutch Maps; the North West of England; and Moscow. More details.

Challenger Expedition Photographs, 1870s-1885; 1981-1983
HMS Challenger set out to collect specimens from different depths of water across the globe. The voyage took place between 1872 and 1876. It is thought that this was the first expedition to routinely use photography to document the journey. There was a darkroom on board so photographs could be developed on the ship. Material held by National Museums Scotland – see the full collection description.

Shackleton’s Endurance Expedition Centenary
27th October 1915: the Antarctic expedition ship Endurance was abandoned on the orders of Sir Ernest Shackleton and the expedition became a fight for survival. Read the feature by the Scott Polar Research Institute, University of Cambridge.

Space Exploration

John Herschel’s photograph of his father’s 40-foot telescope.
Herschel’s 40-foot telescope, circular glass plate photograph. The telescope’s wooden scaffolding is seen here on 9 September 1839, at Observatory House in Slough, England. It was photographed by the astronomer John Herschel (1792-1871) before its demolition. The telescope was designed by John’s father, the German-born British astronomer William Herschel (1738-1822). The tube was 40 feet (12 metres) long. The first observations with this telescope were carried out 50 years earlier on 28 August 1789, when two new moons of Saturn (Enceladus and Mimas) were discovered. 50 years later, by 1839, John Herschel and W H Fox Talbot had invented the process we now know as photography. This is one of the earliest surviving glass plate photographs. Image copyright: Royal Astronomical Society Archives

Russian Space Exploration, 1903
Drawings, documents, photographs, ephemeral objects and memorabilia relating to early Russian space exploration. Objects include domestic items such as cigarette cases, ashtrays, ornamental cigarette dispensers, desk thermometers, ornamental lamps and tea glass holders. Included in the collection are photo albums and a press cutting album made by a school child, as well as stamp collections. The collection boasts rare drawings by Konstantin Tsiolkovsky in which he envisaged the exit from a spacecraft into the vacuum of space, as well as a drawing of a reactive engine (rocket engine), one of the first designs of its kind, from c.1930. The collection is held by De Montfort University Archives and Special Collections – see the full collection description.

Jodrell Bank Observatory Archive, c.1924-1993
The Jodrell Bank Observatory is one of the world’s largest radio-telescope facilities. Originally known as the Jodrell Bank Experimental Station, it was renamed the Nuffield Radio Astronomy Laboratories in 1966, and changed to its current name in 1999. The first radar transmitter and receiver was installed by Bernard Lovell, then working as a physicist at the University of Manchester, at Jodrell Bank, Cheshire, in December 1945 (the University campus had proved unsuitable because of the high level of electrical interference). At this period Lovell was researching cosmic rays under the direction of Patrick Blackett, professor of physics at the University of Manchester. Lovell’s work involved studying radio echoes from large cosmic ray showers in the Earth’s atmosphere, using old military radars. As a result of this, Lovell went on to make important discoveries in meteoric astronomy. The collection is held by University of Manchester Library – see the full collection description.

The Herschel archive at the Royal Astronomical Society
The Royal Astronomical Society is the custodian of a significant collection of the astronomy-related papers of William, Caroline and John Herschel. Read the feature.

Caroline Herschel.
Caroline Lucretia Herschel (1750-1848), German-born British astronomer, in 1847, pointing at the orbit of a comet on a map of the solar system. The map shows all the planets out to Saturn. Uranus had been discovered in 1781 by William Herschel, but was at first thought to be a comet. Neptune was discovered in 1846. The map also shows the asteroids Ceres (discovered in 1801), Pallas (1802), Juno (1804) and Vesta (1807). Caroline was the sister of William Herschel, and worked with him in England. She discovered eight new comets between 1786 and 1797. After her brother’s death in 1822, Caroline returned to Hanover, where she died at the age of 98. This artwork shows Herschel in Hanover in 1847, the year before she died. Image copyright: Royal Astronomical Society Archives

Science Fiction

Papers of Douglas Noël Adams, 1952-2001 (Circa.)
Douglas Noël Adams was born in Cambridge in 1952. He was awarded an exhibition to read English at St John’s College, Cambridge, obtaining his BA in 1974. While at Cambridge, Adams occupied himself chiefly in writing, performing in, and producing comedy sketches and revues, establishing connections that were to be integral to his future work. His career took off with ‘The Hitchhiker’s Guide to the Galaxy’, a six-part comic science-fiction radio series commissioned by the BBC in 1977 and broadcast in 1978. Novelisation and a second series were followed by further books in what became billed as ‘the increasingly inaccurately named Hitchhiker’s Trilogy’. The ‘Hitchhiker’s Guide’ series has taken many forms, including audio recordings; stage adaptations; a television series; a computer game; publication of the original radio scripts; radio adaptations of the remaining novels, and a film. Adams’s other creative work included writing and script-editing for BBC Television’s ‘Doctor Who’. Material held by St John’s College Library Special Collections, University of Cambridge – see the full collection description.

Papers of Brian Aldiss, 1966-1995
Brian Aldiss was born in 1925 in Dereham, Norfolk. After war service in the Royal Corps of Signals he entered the bookselling trade, working at Sanders & Co. in Oxford. His first work as a writer was The Brightfount Diaries, a fictionalised diary of a bookseller first published as a column in The Bookseller during 1954 and 1955 and published as one volume by Faber & Faber in 1955. The following year he became a full-time writer, and in 1957 his first science fiction book, the short story collection Space, Time and Nathaniel, was published. His first science fiction novel, Non-Stop, was published in 1958. Since then Aldiss has been a prolific writer, best known for his science fiction novels, novellas and short stories, including the award-winning Helliconia trilogy. He has also been a historian and critic of the genre, and has edited many science fiction collections. In addition, his ‘mainstream’ writing has included the novels The Male Response, Forgotten Life and the semi-autobiographical Horatio Stubbs sequence. He was elected a Fellow of the Royal Society of Literature in 1989. In 1990 he published his autobiography, Bury my heart at W.H. Smith’s. The collection is held by the University of Reading Special Collections Services – see the full collection description.

Other ‘New Worlds’

Pan-African Congress 1945 and 1995 Archive
The Pan-African Congress was a series of meetings, held throughout the world. In 1945 Manchester hosted the 5th Pan-African Congress. The Pan-African Congress was successful in bringing attention to decolonization in Africa and in the West Indies. The Congress gained a reputation as a peacemaker and made significant advances for the Pan-African cause. Its demands included an end to colonial rule and to racial discrimination; it stood against imperialism and called for human rights and equality of economic opportunity. The manifesto given by the Pan-African Congress included the political and economic demands of the Congress for a new world context of international cooperation. Material is held by the Ahmed Iqbal Ullah Race Relations Resource Centre – see the full collection description.

Records of the British Union for the Abolition of Vivisection, 1865-1996
The British Union for the Abolition of Vivisection (BUAV) was founded in 1898 by Miss Frances Power Cobbe (1822-1904). Concern for the welfare of animals was not a new phenomenon; the first wave of anti-vivisection feeling in England commenced around the middle of the nineteenth century. The Second World War appeared to foster greater ideas of cooperation within the animal welfare movement. The Conference of anti-vivisection Societies first met on 20 November 1942. Five societies were represented at the invitation of BUAV ‘for the purpose of discussing and making plans for a joint intensive campaign, after the war, to claim the total abolition of vivisection as a necessary step towards securing for animals their rightful place in the new world order, which it is generally believed will follow the peace’. The immediate post-war period began to see a rise in public demonstrations as a medium to spread the anti-vivisection message; in particular, these were held outside vivisection laboratories. The collection is held by Hull University Archives, Hull History Centre – see the full collection description.

The Percy Johnson-Marshall Collection, 1931-1993
Percy Edwin Alan Johnson-Marshall (1915-1993) was one of the most energetic of a generation of town-planners who began their careers in the 1930s and, after the Second World War, dedicated their lives to the creation of a new world of social equity through the radical transformation of the human environment. Material held by Edinburgh University Library Special Collections – see the full collection description.

Find out more

Names (7): Into the Unknown

On the Archives Hub we have plenty of name entries without dates. Here is an example of the name string ‘Elizabeth Roberts’ (picked entirely randomly) from several different contributors:

Richard and Elizabeth Roberts
Roberts, Elizabeth fl. 1931
Elizabeth Grace Roberts
Roberts, Elizabeth Grace
Elizabeth Roberts
Roberts, Elizabeth
ROBERTS, Elizabeth Grace
ROBERTS, Mrs Elizabeth Grace

The challenge we have is how to work with names like this. Let me modify this list into an imaginary but nonetheless realistic list of names that we might have on the Hub, just to provide a useful example (apologies to any Elizabeth Roberts’ out there):

Elizabeth Roberts 1790-1865
Elizabeth Roberts, 1901-1962
Elizabeth Roberts b 1932
Elizabeth Roberts fl. 1958
Elizabeth Roberts, artist
Elizabeth Roberts
Elizabeth Roberts
Elizabeth Roberts

How should we treat these names in the Archives Hub display? If we can make decisions about that, it may influence how we process the names.

These names can be separated into two types: (1) name strings that identify a person and (2) name strings that don’t identify a person. This is a fundamental difference. It effectively creates two different things. One is an identifier for a person; the other is simply a string that we can say is a name, but nothing more.

If we put two descriptions together because they are both a match to Elizabeth Roberts, 1790-1865, then we are stating that we think this is the same person, so the researcher can easily see collections and other information about them. 

If we put two descriptions together that are both related to Elizabeth Roberts we are not doing the same thing.  We are simply matching two strings. 

Which of these names is an identifier? That depends upon levels of confidence, and that is why being able to set and modify levels of confidence is crucial.

Elizabeth Roberts 1790-1865 – this is enough to identify a person. In theory, there could be two people with the same life dates, but the chances are very low. So, we would bring together two entries and represent them on one name page.

Elizabeth Roberts b 1932 – Is a birth or death date enough? It allows for some measure of certainty with identity, and we would probably deem this to be enough to identify a person and match to another Elizabeth Roberts born in 1932, but it is not certain. If this Elizabeth Roberts was the creator, and she has several mentions of ‘art’, ‘artist’ and ‘painting’ in her biography, it is more likely that she is the same as Elizabeth Roberts, artist, and it might be useful to create a link, but would it be enough for a match?

Elizabeth Roberts fl 1931 – whilst a floruit date helps place the person in a time period, it is not enough to confidently identify a person.  

Elizabeth Roberts, artist – an occupation or other epithet is not usually enough to identify someone. If there is a biographical history, there is more information about the person, but this is not enough to be sure.

If we had an entry such as Elizabeth Roberts, Baroness Wood of Foxley (completely imaginary and just for the purposes of example), then the epithet is more helpful. We might decide that this identifies a person enough for a match with any other instances of Elizabeth Roberts with baroness wood and foxley in the name string.

If we had MacAlister, Sir Donald, 1st Baronet, physician and medical administrator then ‘1st baronet’ alongside the name should give enough confidence for a match with another entry for 1st Baronet.
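The distinction between identifying and non-identifying name strings could be sketched roughly as follows. This is only an illustration: the regular expressions, the `classify` function and the three labels are assumptions made for the example, not how the Archives Hub actually processes names, and real name strings are far more varied.

```python
import re

# Hypothetical patterns for the example list only.
LIFE_DATES = re.compile(r"\b\d{4}\s*-\s*\d{4}\b")       # e.g. 1790-1865
SINGLE_DATE = re.compile(r"\b[bd]\.?\s*\d{4}\b")        # e.g. b 1932, d. 1865


def classify(name_string: str) -> str:
    """Return a rough confidence label for a name string."""
    if LIFE_DATES.search(name_string):
        return "identifier"   # birth and death dates: treated as a person
    if SINGLE_DATE.search(name_string):
        return "probable"     # one life date: likely enough, but not certain
    return "string"           # floruit, epithet or bare name: just a string


for entry in ["Elizabeth Roberts 1790-1865", "Elizabeth Roberts b 1932",
              "Elizabeth Roberts fl. 1958", "Elizabeth Roberts, artist"]:
    print(entry, "->", classify(entry))
```

Keeping the labels separate from the patterns means the thresholds can be changed later, which fits the point that levels of confidence need to be adjustable.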

Display behaviour

So, how might we reflect this in the display? It can be useful to think about the display and researcher requirements and expectations and work back from there to how we actually process the data.

Firstly we might group two entries if they have the same date.

But this does not offer much benefit to the end user. They still see eight entries for this name string. So, we might bring together the entries that match exactly on the name string.

But there are still two entries that are essentially just name strings – the fl. and the ‘artist’ entry are essentially the same as those without any additional information in that they are name strings and they do not identify a person, so it makes sense to group all of these entries.

screenshot of shortest list of names with matching

We now have a short set of entries. We can’t merge any more of them.
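The grouping steps described above might look something like this in code. It is a sketch under simple assumptions: the `display_groups` function and its regular expression are invented for the example, entries with life dates (or a single birth or death date) keep their own line, and everything else is folded into one bare-name entry.

```python
import re
from collections import defaultdict

# Hypothetical pattern: a date range, or a single birth/death date.
DATE = re.compile(r"\d{4}\s*-\s*\d{4}|\b[bd]\.?\s*\d{4}")

SAMPLE = [
    "Elizabeth Roberts 1790-1865", "Elizabeth Roberts, 1901-1962",
    "Elizabeth Roberts b 1932", "Elizabeth Roberts fl. 1958",
    "Elizabeth Roberts, artist", "Elizabeth Roberts",
    "Elizabeth Roberts", "Elizabeth Roberts",
]


def display_groups(entries):
    """Group entries into the lines a user would see in the display."""
    groups = defaultdict(list)
    for entry in entries:
        found = DATE.search(entry)
        if found:
            # Identifiable: key on the name plus the normalised date.
            name = entry[:found.start()].strip(" ,")
            key = (name, re.sub(r"[\s.]", "", found.group(0)))
        else:
            # Not identifiable: drop the floruit or epithet, keep the name.
            name = re.split(r",|\bfl\b", entry)[0].strip()
            key = (name, None)
        groups[key].append(entry)
    return dict(groups)


for key, members in display_groups(SAMPLE).items():
    print(key, len(members))
```

With the sample list this leaves four display lines, one of which gathers the five entries that are only name strings.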

However, this does leave us with a problem. The end user is likely to assume that these all represent different people. That ‘Elizabeth Roberts’ is a different person from ‘Elizabeth Roberts 1901-1962’. The tricky thing is that she might be….and she might not be. It is likely that a user wanting Elizabeth Roberts with dates 1790-1865 would see the above list and click on the matching entry, not realising that the last three entries could also refer to the same person.  We don’t want to exclude these from the researcher’s thinking without hinting that they may represent the same person.

We might give the list a heading that hints at the reality, such as ‘We have found the following matches:’. Maybe ‘matches’ would have a tool tip to say that the entries without dates could match the entries with dates. It is quite hard to even find a way to say this succinctly and clearly.

The identifiable names would link to name pages. We might provide information on the name pages to again emphasise that other Elizabeth Roberts entries could be of interest. We haven’t yet decided what would be best in terms of behaviour for the non-identifiable names – they might simply link to a description search – it does not make much sense to have a full name page for an unidentified person where all you have is one link to one archive description. We can’t provide links to any other resources for a non-identifiable name; unless we simply provide e.g. a Wikipedia lookup on the name. But again, we face the issue of misleading the end user; implying a ‘same as’ link when we do not have enough grounds to do that.

Names as creators

We may decide to treat creator names differently. Archival creator does have a significant meaning – it emphasises that this is a collection about that person or organisation (though even the nature of the about-ness is difficult to convey). But many users do not necessarily appreciate what an archival creator is, and many descriptions don’t provide biographical histories, so could this end up creating confusion? Also, in the end a creator name is far more likely to include life dates, so then they would have a full name page anyway. What would be the benefit of treating a creator name with no life dates and no biographical history differently from an index term and giving it a name page? You would just be linking to one archive, albeit ‘their’ archive.

What about if a name string record, say the Elizabeth Roberts fl 1931, has been ingested as an EAC record, i.e. a name record that was created by one of our contributors? It is likely that name records will include a full date of birth, or at least a birth or death date, but this is not certain. Whilst we are not currently set up to take in EAC-CPF name records, we do plan to do this in the future. If the name is provided through an EAC record and they are a creator, they may have a detailed biography, and may have other useful information, such as a chronology, so a name page would be worthwhile.  

This short analysis shows some of the problems with providing a name-based interface. We will undoubtedly encounter more thorny issues. The challenge, as is so often the case, is just as much about how to convey meaning to end users when they are not necessarily familiar with archival perspectives, as it is about how to process the data.

And we haven’t even got to thinking about Eliza Roberts or Lizzy Roberts…..

Birkbeck’s Archive

Archives Hub feature for October 2020

Birkbeck was founded as the London Mechanics’ Institute on the evening of the 11th November 1823, when approximately 2,000 people listened to Dr George Birkbeck speak on the importance of education for working Londoners at the Crown and Anchor Tavern on the Strand.  Supporters there that evening included Jeremy Bentham, the philosopher and originator of Utilitarianism, Sir John Hobhouse, a Radical MP who held several important government posts across his career, and Henry Brougham, a liberal MP, anti-slavery campaigner and educational reformer.

George Birkbeck, founder of Birkbeck painted by Samuel Lane circa 1825, Birkbeck Image Collection.

Birkbeck has been transforming lives by helping people access higher education for nearly 200 years. This year, 2020, we celebrate the 100th anniversary of our membership of the University of London. When Birkbeck joined the University of London, it was on the condition that it should continue to provide evening teaching, and this remains our central mission.

The Library at Breams Building, Chancery Lane, Birkbeck Image Collection.

As we move toward our 200th anniversary in 2023, part of the Birkbeck archive was rediscovered in an offsite storage facility. This has proved to be a rich source, providing not only insights into our institutional history but also stories of both staff and students, allowing us glimpses into their lives. We now find ourselves in the position of having two sections of the archive, each telling our story from different perspectives.

One section of the archive is held in the main Birkbeck building and is comprised of records pertaining to the history of Birkbeck from an organisational context, including minutes of various committees, published student journals and newsletters, annual reports, calendars, early student registers and staff information. 

Birkbeck College, Courses of Study front cover. Birkbeck Image Collection.

The second section is held offsite and is made up of a range of material including war correspondence, departmental papers and estates documents, all of which demonstrate Birkbeck’s unique aim and how that aim has held strong through changing political, economic and cultural times.

To date one Birkbeck academic, Professor Joanna Bourke, has explored this material, along with two of her PhD students. They have found it to be an excellent source for their research. One of the themes that runs through the archive is trends in education, such as educational policies and practices. This includes charting the life cycle of different academic disciplines as well as documenting different approaches to teaching and the broader aspects of student life.

Art class at the Birkbeck Literary and Scientific Institution, Breams Buildings, circa 1915, Birkbeck, University of London. Birkbeck Image Collection.

Like many university archives, we have records of notable Birkbeckians who worked or studied with Birkbeck. We can now develop more of a picture of the lives of people such as JD Bernal (Crystallography), Eric Hobsbawm (History), Nikolaus Pevsner (History of Art) and Helen Gwynne-Vaughan (Botany). We can also learn more about those who were less well-known who studied here and made an impact, like the playwright Arthur Wing Pinero and the socialist and women’s rights activist Annie Besant. The library is creating an online timeline to highlight the life and work of various Birkbeck academics as part of the celebrations in the lead up to our 200th anniversary.

Helen Gwynne Vaughan in her Botany Laboratory with students circa 1923, Birkbeck, University of London. Birkbeck Image Collection.

In terms of offering different perspectives, this part of the archive also holds accounts of the wider Birkbeck community beyond the academic staff and students: members of staff working in catering and hospitality roles, administrative staff and laboratory technicians. This provides an opportunity to explore social history through those lived experiences, documented in various formats, such as letters and photographs.

It’s an exciting time at Birkbeck as we continue to uphold the ethos and pursue the central mission of providing access to education for all. Birkbeck is still London’s only specialist provider of part-time evening higher education as well as being a world-class research institution. The archive will continue to tell the story of Birkbeck as an institution as well as all those who work, study and research here. You can follow Birkbeck’s journey to its 200th anniversary.

Main Birkbeck Building, Birkbeck Image Collection.

Emma Illingworth
Subject Librarian for Science (Biological, Earth & Planetary, Psychological)
Library Services, Birkbeck, University of London

Related

Browse all Birkbeck Library Archives and Special Collections, University of London descriptions available to date on the Archives Hub.

All images copyright Birkbeck Library Archives and Special Collections, University of London. Reproduced with the kind permission of the copyright holders.

Names (6): Deduplication at scale

Having written several blogs setting out ideas and thoughts about challenges with names, this post sets out some of our plans going forwards in order to create name records for a national aggregator; something that can work at scale and in a sustainable way. The technical work is largely being undertaken by Knowledge Integration, our system suppliers, though working closely with the Archives Hub team.

Consider one repository – one Hub contributor. They have multiple archives described on the Archives Hub, and maybe hundreds or thousands of agents (people and organisations) included in those descriptions. All of this information will be put into a ‘management index’. This will be done for all contributors. So, the management index will include all the content, from all levels, including all the names. A huge bucket of data to start us off.

A names authority source such as VIAF or any other names data that we would like to work with will not be treated any differently to Archives Hub data at this stage. In essence matching names is matching names, whatever the data source. So, matching Archives Hub names internally is the same as matching Archives Hub names to VIAF, or to Library Hub, for example. However, this ‘names authority’ data will not go into our big bucket of Archives Hub data, because, unless we create a match with a name on the Hub, the authority data is not relevant to us. Putting the whole of VIAF into our bucket of data would create something truly huge. It is only if we think that this external data source has a name that matches a person or organisation on the Hub that it becomes important. So data from external sources are stored in separate reference indexes (buckets) for the purposes of matching.

Tokenisation

Knowledge Integration are employing a method known as tokenisation, which allows us to group the data from the indexes into levels. (It is quite technical and I’m not qualified to go into it in detail, so I only refer briefly to the basic principles here. Wikipedia has quite a good description of tokenisation.) With this process, we can establish levels that we believe will suit our purposes in terms of confidence. Level 1 might be for what we think is a guaranteed match, such as where an identifier matches. So, for example, Wikidata might have the VIAF identifier included, so that the VIAF and Wikidata name can be matched. In some cases, the Archives Hub data includes VIAF IDs, so then the Hub data can be matched to VIAF. We also hope to work with and create matches to Library Hub data, as they also have VIAF IDs.

Image showing versions of a name all with the same ID.
If all versions of a name have the same ID then they can be matched.
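A Level 1 match on a shared identifier can be illustrated with a few lines of code. The record layout (dictionaries with `source`, `name` and `viaf_id` keys), the function name and the placeholder identifier are all assumptions for the example; they do not describe the real indexes.

```python
from collections import defaultdict


def match_by_identifier(records):
    """Group records that share a 'viaf_id'; others stay unmatched."""
    by_id = defaultdict(list)
    for record in records:
        viaf_id = record.get("viaf_id")
        if viaf_id:                      # skip records with no identifier
            by_id[viaf_id].append(record)
    # Only identifiers seen on more than one record count as a match.
    return {vid: recs for vid, recs in by_id.items() if len(recs) > 1}


records = [
    {"source": "hub", "name": "Roberts, Elizabeth Grace", "viaf_id": "EXAMPLE-1"},
    {"source": "wikidata", "name": "Elizabeth Grace Roberts", "viaf_id": "EXAMPLE-1"},
    {"source": "hub", "name": "Elizabeth Roberts", "viaf_id": None},
]
matches = match_by_identifier(records)
print({vid: [r["source"] for r in recs] for vid, recs in matches.items()})
```

Because the match rests on the identifier alone, the differing forms of the name do not matter at this level.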

Level 2 might be a more configurable threshold based around the name. We might say that a match on name and date of birth, for example, is very likely an indication of a ‘same as’ relationship. We might say that ‘James T Kirk’ is the same person as ‘James Kirk’ if we have the same date of birth. This is where trial and error is inevitable, in order to test out degrees of confidence. Level 3 might bring in supporting information, such as biographical history or information about occupation or associated places. It is not useful by itself, but in conjunction with the name, it can add a degree of certainty.

Screenshot of part of a biographical history
Biographical information may be used to help match names
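The Level 2 example (name plus date of birth) could be sketched as follows. This is a deliberately simple illustration under stated assumptions: the rule of comparing only the first and last name tokens, the field names, and the birth years are all hypothetical, and a real threshold would need the trial and error described above.

```python
def name_key(name):
    """Reduce a name to its (first, last) tokens, lower-cased."""
    tokens = name.lower().replace(".", " ").split()
    return (tokens[0], tokens[-1]) if tokens else None


def level2_match(a, b):
    """Treat two records as 'same as' when first/last name and birth year agree."""
    key_a, key_b = name_key(a["name"]), name_key(b["name"])
    if key_a is None or key_b is None:
        return False
    if a["born"] is None or b["born"] is None:
        return False  # no date of birth, so no Level 2 evidence
    return key_a == key_b and a["born"] == b["born"]


# Made-up example values; the birth years are placeholders, not real data.
kirk_a = {"name": "James T Kirk", "born": 2233}
kirk_b = {"name": "James Kirk", "born": 2233}
kirk_c = {"name": "James Kirk", "born": 1950}
```

With these values, level2_match(kirk_a, kirk_b) is true because the middle initial is ignored, while kirk_c fails on the date of birth.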

We are also thinking about a Level 4 for approaches that are Archives Hub specific. For example, if the same name is provided by the same repository, could we say it is more likely to be the same person?

This tokenisation process is all about creating a configurable process for deduplication. Tokens are created only for the purposes of matching. Once we have our levels decided, we can create a deduplication index and run the matching algorithm to see what we get.

Approaches to indexing

For deduplication indexing, the first thing to do is to convert to lower case and remove all of the non-alpha characters. (NB: For non-Latin scripts, there are challenges that we may not be able to tackle in this phase of the project.)
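A minimal sketch of that first normalisation step, in Python. One assumption to flag: non-alpha characters are replaced with a space rather than deleted outright, so that word boundaries survive (for example around commas and hyphens). It only keeps unaccented a to z, which is exactly the non-Latin limitation noted above.

```python
import re


def normalise(text):
    """Lower-case the text and strip everything except a-z and spaces."""
    lowered = text.lower()
    # Replace each non-alpha character with a space so words stay separate.
    cleaned = re.sub(r"[^a-z\s]", " ", lowered)
    # Collapse the runs of whitespace left behind by the replacement.
    return " ".join(cleaned.split())
```

For example, normalise("Webb, Martha Beatrice, 1858-1943") gives "webb martha beatrice". Dates are dropped by this step, so any date matching would need to work from the original record rather than the normalised token.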

The tokens within the record will be indexed in multiple ways within the deduplication index to facilitate matching. This includes indexing all words in order that they appear, and also individual word matches.

Then, particularly when considering using text such as biographies to help identify matches, we can use bigrams and trigrams. These essentially divide text into two- and three-word chunks. A search can then identify how many groups of two and three words have matched. Generally, this is a useful method of ascertaining whether documents are about the same thing. It may help us with identifying name matches based upon supporting information. This is very much an exploratory approach, and we don’t know if it will help substantially with this project, but certainly it will be worth trying out this approach, and also considering using it for future data analysis projects.
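As a rough illustration of word bigrams and trigrams, the sketch below splits two short pieces of text into overlapping word chunks and counts how many chunks they share. The function names and the two sample snippets are made up for the example, and the text is assumed to be already normalised.

```python
def word_ngrams(text, n):
    """Return the set of overlapping n-word chunks in the text."""
    words = text.lower().split()
    return {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}


def shared_ngrams(text_a, text_b, n):
    """Count the n-word chunks that appear in both texts."""
    return len(word_ngrams(text_a, n) & word_ngrams(text_b, n))


# Made-up biographical snippets, already lower-cased and stripped of punctuation.
bio_a = "social reformer and co founder of the london school of economics"
bio_b = "economist and founder of the london school of economics"

shared_bigrams = shared_ngrams(bio_a, bio_b, 2)   # two-word chunks in common
shared_trigrams = shared_ngrams(bio_a, bio_b, 3)  # three-word chunks in common
```

Here the two snippets share six bigrams and five trigrams, nearly all from the phrase "founder of the london school of economics". How many shared chunks should count as meaningful is exactly the kind of threshold that would need experimenting with.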

Character trigrams break down individual words into groups of three characters and may be useful for the actual names. This should be useful for a more fuzzy matching approach, and it may help to deal with typos. It can also help with things like plurals, which is relevant for working with the supporting information.
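A small sketch of character trigrams used for fuzzy comparison. The similarity measure here (shared trigrams divided by all distinct trigrams, often called Jaccard similarity) is my own choice for the illustration; the project may well score matches differently.

```python
def char_trigrams(word):
    """Return the set of three-character chunks in a word."""
    word = word.lower()
    return {word[i:i + 3] for i in range(len(word) - 2)}


def trigram_similarity(a, b):
    """Score two words from 0.0 to 1.0 by the overlap of their trigrams."""
    tri_a, tri_b = char_trigrams(a), char_trigrams(b)
    if not tri_a or not tri_b:
        return 0.0  # words shorter than three characters have no trigrams
    return len(tri_a & tri_b) / len(tri_a | tri_b)


typo_score = trigram_similarity("Merryweather", "Merriweather")
plural_score = trigram_similarity("archive", "archives")
unrelated_score = trigram_similarity("Merryweather", "Fleming")
```

A one-letter spelling difference or a plural still leaves most trigrams in common, so those pairs score well above an unrelated pair, which shares none.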

We are also going to explore hypocorisms. This means trying out matches for names such as Jim, Jimmy and James or Ned, Ed, Ted and Edward. A hypocorism is often defined as a pet name or term of endearment, but for us it is more about forename variations. Obviously Jim Jones is not necessarily the same person as James Jones, but there is a possibility of it, so it is useful to make that kind of match on name synonyms.

Hypocorisms refers to pet names or terms of endearment
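One simple way to try out hypocorism matching is a lookup table that maps forename variants to a common form. The table below is tiny and hand-made purely for illustration, using the examples above; a real service would need a much larger list, and the choice of "canonical" form is an assumption.

```python
# A hand-made lookup for illustration only, based on the examples in the text.
HYPOCORISMS = {
    "jim": "james",
    "jimmy": "james",
    "ned": "edward",
    "ed": "edward",
    "ted": "edward",
}


def canonical_forename(forename):
    """Map a forename variant to its assumed canonical form."""
    key = forename.lower()
    return HYPOCORISMS.get(key, key)


def possible_same_forename(a, b):
    """True if two forenames could be variants of the same name."""
    return canonical_forename(a) == canonical_forename(b)
```

possible_same_forename("Jim", "James") is true, as is possible_same_forename("Ned", "Ted"). As the Jim Jones example shows, this can only ever signal a possibility, so it would feed into a confidence level rather than decide a match by itself.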

From this indexing approach we can try things out and see what works. There is little doubt that it will require an iterative and flexible approach. We can’t afford to set up a whole process that proves ineffective so that we have to start again. We need an approach that is basically sound and allows for infinite adjustments. This is particularly vital because this is about creating a framework that will be successful on an on-going basis, for a national-scale service. That is an entirely different challenge to creating a successful outcome for a finite project where you are not expecting to implement the process on an on-going basis. Apart from anything else, a project with a defined timescale and outcome gives you more leeway to have a bit of human intervention and tweak things manually to get a good result.

Group records

Using the tokenisers and matching methods we can try processing the data for matches. When records are matched with a degree of certainty, a group record is created in the deduplication index. It is allocated a group id and contains the ids of all of the linked records. This is used as the basis for the ‘master record’ creation.
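The group record idea could be sketched like this: given pairs of record ids that the matching step has linked, collect them into groups and give each group an id. This sketch assumes matches are transitive (if A matches B and B matches C, all three are grouped), which is a simplification, and the id formats are hypothetical.

```python
def build_group_records(record_ids, matched_pairs):
    """Build group records (group id plus linked record ids) from matched pairs."""
    parent = {rid: rid for rid in record_ids}

    def find(rid):
        # Follow parent links to the representative record of the group.
        while parent[rid] != rid:
            parent[rid] = parent[parent[rid]]
            rid = parent[rid]
        return rid

    for a, b in matched_pairs:
        parent[find(a)] = find(b)  # merge the two groups

    members = {}
    for rid in record_ids:
        members.setdefault(find(rid), []).append(rid)

    # Only records linked to at least one other record form a group record.
    linked_sets = [ids for ids in members.values() if len(ids) > 1]
    return [
        {"group_id": f"group-{number}", "record_ids": sorted(ids)}
        for number, ids in enumerate(linked_sets, start=1)
    ]


groups = build_group_records(
    ["n1", "n2", "n3", "n4", "n5", "n6"],
    [("n1", "n2"), ("n2", "n3"), ("n3", "n4")],
)
```

In this example n1 to n4 end up in a single group record, while n5 and n6 have no matches and so get no group record at all.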

Primary or master records

I have previously blogged some thoughts about the ‘master record’ idea. Our current proposal is that every Archives Hub name is a primary record, unless it is matched. So, if we start out with six variations of Martha Beatrice Webb, 1858-1943, then at that point they are all primary records and they would all display. If we match four of them, to a confidence threshold that we are happy with, then we have three primary records. One of the primary records covers four archives. We may be able to still link the other two instances of this name to the aggregated record, but we can assign a lower confidence threshold to this.

Diagram showing instances of the name Beatrice Webb and how they might match.
Deduplication for ‘Beatrice Webb’

In the above example (which is made up, but reflects some of the variations for this particular name) four of the instances of the name have been matched, and so that creates a new primary record, with child records. Two of the instances have not been matched. We might link them in some way, hence the dotted line, or they might end up as entirely separate primary records. The instance of Beatrix Potter, nee Webb, has not been matched (these two individuals are often confused, especially as they have the same death date). If we set levels of confidence wrongly, this name could easily be matched to ‘Beatrice Webb’.
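The arithmetic of the made-up Beatrice Webb example (six instances, four matched, three primary records) can be sketched as below. The record ids and dictionary shapes are hypothetical; the point is only that each matched group becomes one primary record with child records, and every unmatched name stays a primary record in its own right.

```python
def primary_records(record_ids, groups):
    """Return one primary record per matched group plus one per unmatched name."""
    grouped = {rid for group in groups for rid in group["record_ids"]}
    primaries = [
        {"id": group["group_id"], "children": list(group["record_ids"])}
        for group in groups
    ]
    # Unmatched names remain primary records with no children.
    primaries += [
        {"id": rid, "children": []} for rid in record_ids if rid not in grouped
    ]
    return primaries


# Six hypothetical instances of the name, four of which have been matched.
instances = ["n1", "n2", "n3", "n4", "n5", "n6"]
matched = [{"group_id": "group-1", "record_ids": ["n1", "n2", "n3", "n4"]}]

primaries = primary_records(instances, matched)
displayed = [record["id"] for record in primaries]
```

Here displayed contains three ids: the aggregated record and the two unmatched instances. The four child records are still held in the data but would no longer display separately.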

The reasoning behind this approach is that we aggregate where we can, but we have a model that works comfortably with the impossibility of matching all names. Ideally we provide end users with one name record for one person – a record that links to archive collections and other related resources. But we have to balance this against levels of confidence, and we have to be careful about creating false matches. Where we do create a match, the records that were previously primary records become ‘child records’ and they no longer display in the end user interface. This means we reduce the likelihood of the end user searching for ‘william churchill’ and getting 25 results. We aim for one result, linking to all relevant archives, but we may end up with two or three results for names that have many variations, which is still a vast improvement.

If we have several primary records for the same person (due to name variations) then it may be that new data we receive will help us create a match. This cannot be a static process; it has to be an effective ongoing workflow.

Comic strips and seaside holidays: unexpected stories from the Save the Children Archive

Archives Hub feature for September 2020

The Save the Children (SCF) archive, held at the Cadbury Research Library, University of Birmingham, charts the development of the charity from its creation in 1919. The collection includes a wealth of material relating to the charity’s founder, Eglantyne Jebb, and these papers provide a fascinating insight into how SCF operated during the 1920s. They also highlight the personal stories of individuals associated with SCF.

Concertina comic strips

Illustrated concertina comic strip (ref: SCF/EJ/9/2).

One fascinating item is a wonderful illustrated concertina comic strip created by Corinne de Candole, documenting her first week working at the SCF office in April 1925. She dedicated the strip to ‘Miss Jebb who showed me how the New World is being built at the Office of the Save the Children Fund’. The strip depicts Corinne’s interview with a Mrs Beach, as well as the making of blue cloaks and flags and ‘planning for the new world’.

Travelling to Geneva (ref: SCF/EJ/9/2).

Another two comic strips reveal how Corinne travelled to Geneva for the summer school in 1925, and she also wrote two poems about this experience: ‘The Disobedient Lady who never got to the SCF Summer School’ and ‘The Obedient Lady who went to the SCF Summer School’. Through these documents we can sense the pride that Corinne felt in working for SCF and her thoughts on how it was helping to change the world.

Thank you letters

The overseas country papers in the Eglantyne Jebb series highlight the personal stories of those affected by the crisis in Europe after the First World War. The Horak family, from Hungary, wrote a letter of appreciation to SCF, offering thanks and remembering their benefactors.

The Horak family letter with typed translation, 1922 (ref: SCF/EJ/1/17/1).

‘From the bottom of our hearts sending our Christmas Greetings and very best wishes [and] we are always thinking gratefully of those who helped to get homes for us poor war invalids and widows with our families. May you be as happy as you have made us […] The little cottage means also a new life to us, making us forget our sufferings and losses. We beg the Almighty to pour his blessing over you and your family and give long life and happiness to those who provided us with a home. This will be our prayer on this holy Christmas eve.’

The letter is accompanied by a photograph of the Horak family (ref: SCF/EJ/1/17/1).

In a letter to Miss Vulliamy, who was leading SCF-funded projects in Poland, Vera Staack describes how she and her mother had to flee Russia because of the Bolsheviks: ‘But why are they frightened, why do I read such terror in their eyes? I shall explain you the reason. The red banner flashes, and on it the black words which make everybody tremble. “Death to the bourgeois.”…..The fathers or mothers are taken from their children, children are torn from their parents sides. And so everybody tries to hide quickly.’

‘The picture of the past rises involuntary before me. Christmas Eve! It was our last Christmas Eve in our native land-in far off Moscow. An enormous Christmas-tree made dazzlingly brilliant by quantities of electric lamps and brilliant ornaments and many, many presents…..And all this has been taken from me by the Bolsheviks. Dear Miss Vulliamy, and I shall have no more Christmas-trees or Christmas Eves, and mother is always very cross now, cries often, and wishes to speak to no one. She was quite different before.’

Letter from Vera Staack, 1921 (ref: SCF/EJ/1/22/7).

‘And now good-bye, my dear, dear English friend. I hug you very hard and remain your very respectful and unhappy little Domby friend’

She ends ‘P.S. Why are men so wicked, dear Miss Vulliamy.’

A seaside holiday

Another example can be found in a report entitled ‘A seaside holiday’, written by M. Brown, where we learn of the impact that a trip to the beach had on a group of young children: ‘“Who pushes the sea?” “Is water never still?” “Does sand bite?” […] even the Ukrainian student was among the unbelievers who doubted whether the sea was salt, and made a wild dash to stoop down and taste it to make quite sure that he was not being deceived.’

The children then share their stories of the horrors that they have been through: ‘that was a long time ago…my mama died in the truck on the way from Russia. She died of hunger my mama did not live long after my daddy was killed by the Bolshevists. I wouldn’t believe it at first when the doctor came round and bent down and listened to her heart and said that mam was dead.’

‘A seaside holiday’ report, 1922 (ref: SCF/EJ/1/22/8).

‘All the children have their own sad story, and all have lived through strange and dreadful times, and in all their young faces can be read the tragedy of the homeless and the outcast. It is to build up their energy for the life struggle before them that Miss Vulliamy inaugurated the Children’s Holiday Home at Danzig in 1922.’

These archives offer a glimpse into the traumatic events which children and families faced in the aftermath of the First World War, the attempts by SCF to help and the appreciation that this generated.

Matthew Goodwin
Save the Children Project Archivist
Cadbury Research Library, University of Birmingham

Related

Browse all Cadbury Research Library, University of Birmingham descriptions available to date on the Archives Hub.

All images copyright Cadbury Research Library, University of Birmingham. Reproduced with the kind permission of the copyright holders.

Names (5): The Problem of anonymity

cartoon of person asking 'who am I?'

It is easy to focus on names that represent fairly well known people. But one of the challenges for archives is to work with little known people – names that represent someone who is referenced in a catalogue – maybe they are indexed because they are a correspondent for example – they appear in one of a series of letters – but there is no more information about them other than their name. They may be referenced in other sources, but we have little to go on in order to discover that, and often they won’t be represented – it may be that this is the only written source that includes them.

In a names service, we can add a name – let’s say ‘Louisa Jane Justamond’ – a name from https://archiveshub.jisc.ac.uk/data/gb12-ms.add.8556 (‘The Garland continued’, a collection of poems addressed to her). We only have that one instance of that name. It is not in VIAF, it is not in Wikidata. There is an instance listed in ‘A genealogical and heraldic dictionary of the landed gentry of Great Britain’ (a precursor to Burke’s Peerage). But unless we decide to use that as an external source, write a name matching algorithm and decide, on levels of confidence, that it is indeed a match, that is not going to help us. We are left with a name attached to one archive collection and nothing else.

We can create a name record for Justamond, but if we display it on the Archives Hub it will simply show her name and a link back to the related description.  It will be extremely minimal.

However, what we don’t know is whether new collections will be added to the Archives Hub, or new information added to Wikidata or another source that we use, such that this person becomes more identifiable.   We simply don’t know what the value of a name might be.  In the future, having a record of this person could prove to be immensely useful in making a connection.

Archives have what you might call a long tail of names. It is something that characterises our holdings. It is something that sets us apart from libraries and museums, at least to a degree.  Most names represented in library holdings (or names they represent in their catalogues and other finding aids) represent identifiable people.

Graph showing the long tail of names
The long tail of names

In archives, we have collections that represent ordinary people, not published, not celebrated, not notorious, with no documented place in history. We also have collections that include people where it is hard to know whether an individual is more widely known, because the archive collection does not entirely identify them.

Either way, it leaves us with a question about how to deal with a name that has nothing else attached to it other than ‘this name is in this letter’.

Building an index of all names means that we have a store of data that can be used for further exploration. It could sit behind the scenes, but it can be used to try out tools, data manipulation and matching.  In other words, the data is a separate thing from what you decide to display.

Having a name (maybe not knowing exactly who the name represents) and knowing that the name is in three different archives has value.  We can say ‘in the absence of any other information, we assume these names represent the same person’, or we can simply present the information and not make any conclusions (although that begs the question of how you present it without encouraging assumptions).  It is then up to researchers to explore further.  We might find new data sources that help to clarify names. We might get new descriptions that help to do this.

Many archival descriptions include subjects and, to a lesser extent, places. If you have Stephen Merryweather in one, with an index term of botany, and S. Merryweather in another, with the same index term, then you could say it is more likely to be a match. There is a question of how you might then present that information. The use of algorithms raises the issue of how to convey levels of confidence. It feels as if we need to have a more sophisticated – and recognised – means of presenting levels of confidence.
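The Merryweather example could be sketched as below: a name that is compatible on surname and forename initial is a possible match, and a shared index term makes it more likely. The record shape and the three confidence labels are assumptions made for this illustration, and names are assumed to be non-empty and in forename-surname order.

```python
def split_name(name):
    """Return the (forename, surname) tokens of a name, lower-cased."""
    tokens = name.lower().replace(".", " ").split()
    return tokens[0], tokens[-1]


def initial_compatible(a, b):
    """True if surnames match and forenames agree, at least on the initial."""
    fore_a, sur_a = split_name(a)
    fore_b, sur_b = split_name(b)
    if sur_a != sur_b:
        return False
    one_is_initial = len(fore_a) == 1 or len(fore_b) == 1
    return fore_a == fore_b or (one_is_initial and fore_a[0] == fore_b[0])


def match_confidence(rec_a, rec_b):
    """Label a pair of name records using the name plus shared subject terms."""
    if not initial_compatible(rec_a["name"], rec_b["name"]):
        return "no match"
    shared = set(rec_a["subjects"]) & set(rec_b["subjects"])
    return "more likely" if shared else "possible"


full = {"name": "Stephen Merryweather", "subjects": ["botany"]}
initial = {"name": "S. Merryweather", "subjects": ["botany"]}
other = {"name": "S. Merryweather", "subjects": ["shipping"]}
```

match_confidence(full, initial) gives "more likely", while match_confidence(full, other) gives only "possible". Labels like these are one crude way of conveying confidence, and the question of how to present them to end users remains open.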

This whole issue of confidence levels is more of a focus for archives, because of the anonymity I’ve talked about.

Diagram showing Relationships of data involved in creating name records
Relationships of data involved in creating name records

The ‘Name’ records shown above are the names within archival descriptions (EAD records on the Hub). These names can be pulled out from ‘origination’ (creator) and from ‘persname’ (usually in the controlaccess index section, but potentially elsewhere in the description). These names may represent ‘unknown’ people; the EAD may not even indicate whether they are personal or corporate or family names. They may not include dates, they may just be ‘Mary Fleming’ or ‘Mary Fleming fl 1717’. They may also be ‘unknown’, ‘[unknown]’, or even ‘unknown unknown’ (keeping the surname, forename structure!). They may be ‘Name of author (various)’ or ‘Various health authority bodies’ or ‘Possibly Miss M. Lindsay’. All these are examples from our data. They illustrate the conflict between human readable data – where ‘unknown’ is useful – and machine processable data – where semantics are important, and a name is ideally just a name.
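To show what pulling names out of ‘origination’ and ‘persname’ might look like, here is a minimal sketch using Python’s standard XML library. The EAD fragment is invented and heavily simplified (no namespaces, no attributes), so treat it as an assumption-laden illustration rather than a picture of real Hub records.

```python
import xml.etree.ElementTree as ET

# An invented, simplified EAD fragment for illustration only.
EAD_SAMPLE = """
<ead>
  <archdesc>
    <did>
      <origination><persname>Fleming, Mary, fl 1717</persname></origination>
    </did>
    <controlaccess>
      <persname>Mary Fleming</persname>
      <persname>[unknown]</persname>
    </controlaccess>
  </archdesc>
</ead>
"""


def extract_names(ead_xml):
    """Collect name strings from persname and from plain-text origination."""
    root = ET.fromstring(ead_xml.strip())
    names = []
    for element in root.iter():
        is_persname = element.tag == "persname"
        # An origination with no child elements holds the name as plain text.
        is_plain_origination = element.tag == "origination" and len(element) == 0
        if is_persname or is_plain_origination:
            text = " ".join("".join(element.itertext()).split())
            if text:
                names.append(text)
    return names


names = extract_names(EAD_SAMPLE)
```

Note that ‘[unknown]’ comes through as a name like any other, which is the human readable versus machine processable conflict in miniature.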

If we create ‘Name’ entries for all of these then we have a store of data to work with, something I’ve mentioned before in my Names Project blogs.  We can then find out how many ‘Mary Fleming’ entries there are, or  how many ‘M Fleming’ entries. How we then choose to display that information to end users is a separate question.  But with the advances in machine learning, it is becoming an increasingly pertinent question.
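Counting entries in such a store is straightforward once the names are in one place. The sketch below uses a made-up list of names and a small, hypothetical set of placeholder strings taken from the examples above; it only lower-cases and trims the names, so ‘Mary Fleming fl 1717’ is still counted separately.

```python
from collections import Counter

# Strings that stand in for a missing name; this list is an assumption.
PLACEHOLDERS = {"unknown", "[unknown]", "unknown unknown"}


def count_names(names):
    """Count name entries, pooling obvious placeholders under one key."""
    counts = Counter()
    for name in names:
        key = name.strip().lower()
        if key in PLACEHOLDERS:
            key = "(placeholder)"
        counts[key] += 1
    return counts


# A made-up sample of entries in the style of the examples in the text.
sample = [
    "Mary Fleming",
    "Mary Fleming fl 1717",
    "M Fleming",
    "mary fleming",
    "unknown",
    "[unknown]",
]
counts = count_names(sample)
```

In this sample counts["mary fleming"] is 2, counts["m fleming"] is 1 and the two placeholders are pooled together. Whether ‘Mary Fleming fl 1717’ should be folded into the same count is a matching decision rather than a counting one.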

We have an opportunity with archival metadata, with the way that archives represent ‘ordinary life’. But it is a challenge. Catalogues are still not really set up to identify entities (in a way that works for machine processing). We create what we refer to as ‘name authorities’ but we do not usually consider the importance of matching names outside of individual organisations. The Archives Hub has an opportunity to work on behalf of UK archives to try to draw out people and, in a sense, identify them, or at least, enable them to be more contextualised. But it will require a good deal of experimentation and expertise in working with disparate data. However, if we create a pool of names and provide an API, that would enable others to work with the data, and try different approaches. This is a big challenge, and it needs a concerted and collaborative approach.

 

picture of anonymous crowd

Fish are jumpin’ in the Archives

Archives Hub feature for August 2020

“Summertime and the livin’ is easy...” ¹. Well, it’s a rather wet summer in the UK but all the better for exploring collections on the theme of fish!

Plotosus lineatus (Catfish). Copyright: Alain Feulvarch (https://commons.wikimedia.org/wiki/File:Catfish_Plotosus_lineatus.jpg). Creative Commons 2.0 license: https://creativecommons.org/licenses/by/2.0/deed.en

We’ve trawled the Archives Hub (sorry, couldn’t resist!) to bring you a selection of the wonderful, and sometimes surprising, collections relating to fish, ranging across research, expeditions, fisheries, the fishing industry and river authorities – not forgetting a fish and chip shop, a theatre and several appropriately named individuals.

Research and Expeditions

Fishes Collected by Darwin, 1842. 300 pages of notes on the fish collected by Darwin on the Beagle, compiled by Leonard Jenyns (1800-1893), a clergyman and naturalist; Jenyns changed his name to Leonard Blomefield in 1871. Held by the Museum of Zoology Archives, University of Cambridge https://archiveshub.jisc.ac.uk/data/gb433-jenynsdarwin.

C Tate Regan collection, 1912-1913. Charles Tate Regan (born in 1878) was keeper of zoology at the British Museum. He worked on the scientific results of the Scottish National Antarctic Expedition, 1902-1904 (leader William Speirs Bruce) and the British Antarctic Expedition, 1910-1913 (leader Robert Falcon Scott). He died in 1948. Published work includes ‘Antarctic fishes of the Scottish National Antarctic Expedition’ in the Reports of the scientific results of the voyage of the steam yacht Scotia and ‘Fishes’ and ‘Larval and post larval fishes’ published in the zoology reports of the British Antarctic Expedition, 1910-1913. Held by the Scott Polar Research Institute Archives, University of Cambridge https://archiveshub.jisc.ac.uk/data/gb15-charlestateregan.

Cuthbertson drawing of an Atlantic lizardfish. Copyright the National Museums Scotland Library (adapted from the full image included in the William Speirs Bruce Archive feature, August 2017).

Winifred E. Frost collection, 1930s-1960s. Frost was an authority on the natural history of fish in the Lake District. Research includes work on euphausiids with Professor James Johnstone at Liverpool University, and she worked for the fisheries branch at Dublin investigating trout in the River Liffey. She was appointed to the Freshwater Biological Association in 1938 and was awarded a D.Sc. by Liverpool University for her published papers. With Margaret E. Brown (Varley) she wrote The Trout, published in 1967, which took 21 years to prepare. She was a member of the Council of the Salmon and Trout Association, and president of the Windermere and District Angling Association, and also travelled to international scientific meetings and undertook investigation of eels in Africa. Held by the Freshwater Biological Association Archives https://archiveshub.jisc.ac.uk/data/gb986-frow.

Notes towards a dictionary of fish names, by Paul Barbier (C20th). Barbier was Professor of French Language and Literature at the University of Leeds, 1903-1938. The collection comprises 8 boxes of notes prepared in the course of research for an unpublished dictionary of names of fishes. Held by University of Leeds Special Collections https://archiveshub.jisc.ac.uk/data/gb206-ms125.

Solenostomus paradoxus - Harlequin Ghost Pipefish. © Steve Childs (https://commons.wikimedia.org/wiki/File:Solenostomus_paradoxus_-_Harlequin_Ghost_Pipefish.jpg). Creative Commons 2.0 license https://creativecommons.org/licenses/by-sa/2.0/deed.en.

Rosemary Lowe-McConnell Collection, 1934-1947. Lowe-McConnell was a pioneer in tropical fish ecology. She was born in Liverpool, and graduated from the university. She worked at the Freshwater Biological Association studying the migration of silver eels. In 1993 Michael N. Bruton interviewed Lowe-McConnell on the personal reasons behind her choice of work, her personal influences, and her experiences of being a woman in a male-dominated world. Initially she wanted to be an explorer/naturalist, to which the reply was ‘never mind dear, perhaps you can teach’. When she applied to the colonial services in 1945 to be an entomologist, they would not employ a woman in that role, but the tropical fisheries department was new and not considered as important. Although she was forced to resign in 1954, when the marriage bar was in place, she was more interested in pursuing her findings than concerned with job status, and she believed that being offered the directorship at the Joint Fisheries Research organisation in central Africa (which she rejected) showed that she was accepted despite being female. Held by Freshwater Biological Association Archives https://archiveshub.jisc.ac.uk/data/gb986-lowr.

Journal of John Walsh’s Visit to France in 1772. John Walsh (1726-1795) was elected to the Royal Society in 1770, and became known for his work on the electric ray, Torpedo marmorata. In 1769 Edward Bancroft proved that the electric eel emitted electric shocks, and Walsh set out to confirm that the ray had a similar power. In this he was encouraged by Benjamin Franklin, whose American colleagues were undertaking similar investigations. With his nephew Arthur Fowkes he spent the summer of 1772 at La Rochelle, where the ray was often captured. The fish could survive many hours out of water, and Walsh was able to conduct experiments ashore and successfully proved that the ray’s shocks were caused by electricity. His findings were published in the Royal Society’s Philosophical Transactions, vol. 63 (1773), pp. 461-77, and the Royal Society awarded him the Copley medal for his achievement. Held by University of Manchester Library https://archiveshub.jisc.ac.uk/data/gb133-engms724.

Fisheries and the Fishing industry

Records of Aberdeen Fish Curers and Merchants Association, 1888-1947. The association was established in May 1888, as Aberdeen Fish Trade Association, and was incorporated with its present title in 1944. It began in response to the introduction of sales by auction in the late nineteenth century, its first achievement being an agreement amongst fish sellers to provide discounts for cash sales to accredited buyers. Membership was open to wholesale fish merchants and fish curers carrying on a business in Aberdeen, and in 1980 stood at more than 200. Held by University of Aberdeen Special Collections https://archiveshub.jisc.ac.uk/data/gb231-ms3054.

Records of the Berwick Salmon Fisheries Co Ltd, salmon fishers, Berwick upon Tweed, England, 1562-1964 (predominant 1860-1964). The Old Shipping Co, shipping traders and salmon fishers, Berwick-upon-Tweed, Northumberland, England, was established at some point prior to 1766 by a group of local men, mainly coopers, who held shares in a small sailing fleet engaged in the London, coastal and foreign trade. As commodities included salmon, the company leased fishing rights on the river Tweed. The shipping vessels were sold off in 1869 as business had become unprofitable and the company’s name changed to Berwick Salmon Fisheries Co Ltd in 1872. Held by University of Glasgow Archive Services  https://archiveshub.jisc.ac.uk/data/gb248-ugd245.

Volume containing two copies of a printed register relating to Netherlands herring fisheries, 1749: entitled Naamlyst der boekhouders, schepen, en stuurluiden van de haring-shepen, in’t Yaar 1749, van Enchisen en de Ryp, ter haring-shepen uitgevaren (Jan von Guissen, Enkhuisen, 1749), giving details of the ships, owners and captains of the fleets of Enkhuisen and De Rijp. Added in manuscript are details of the total catch for 1749, and the catch for individual ships on various voyages. Held by Senate House Library Archives, University of London https://archiveshub.jisc.ac.uk/data/gb96-ms115.

Women Fish Sellers - from Hamilton, Robert (1866) British Fishes, Part II, Naturalist's Library, vol. 37, London: Chatto and Windus. Image in the public domain (photograph from the Freshwater and Marine Image Bank at the University of Washington).

Grimsby Steam and Diesel Fishing Vessels’ Engineers’ and Firemen’s Union, 1897-1987. The Grimsby Steam Fishing Vessels’ Engineers’ and Firemen’s Union was founded in 1896. It changed its name to the Grimsby Steam and Diesel Fishing Vessels’ Engineers’ and Firemen’s Union in 1961. In 1976 it transferred engagements to the Transport and General Workers’ Union, becoming 10/3c Branch. Held by Modern Records Centre, University of Warwick https://archiveshub.jisc.ac.uk/data/gb152-gsf.

The business records of Shippam’s Ltd, 1853-1995. The Shippam’s business first started in 1786, when Charles Shippam established a grocery store in Westgate, Chichester. In 1886 they began food manufacturing and in 1894 launched a wide range of potted meat and fish pastes, for which Shippam’s was to become internationally famous. Held by West Sussex Record Office https://archiveshub.jisc.ac.uk/data/gb182-shippam’s.

Fish and Chips

Fish and chips on the seafront at Hunstanton, Norfolk UK (in this instance the fish is deep fried plaice). © Andrew Dunn, http://www.andrewdunnphoto.com/. Creative Commons 2.0 license https://creativecommons.org/licenses/by-sa/2.0/deed.en.

Records of Pesci Bros Fish and Chip Shop, 1920-1994. The Pesci family, originally from Bardi in Italy, came to Barking from Wales in 1934, and went on to open a fish and chip shop at 15 Broadway. Only a few years later the shop was compulsorily purchased by Barking Borough Council so that the site could be used for the building of the new Town Hall. After a long search for new premises, the family finally re-opened at 26 Ripple Road in 1939. The business flourished for nearly 60 years. Held by Barking and Dagenham Archive and Local Studies Centre https://archiveshub.jisc.ac.uk/data/gb350-bd76.

River authorities

Records of the Centre for Environment, Fisheries and Aquaculture Science, Benarth Road, Conwy, 1916-1994. In December 1999 the Conwy Laboratory closed after approximately ninety years of pioneering research and development into fish and shellfish aquaculture. The laboratory’s foundation came about following the building of mussel purification tanks by Conwy Corporation in 1913, in an attempt to improve the quality of Conwy mussels, which had been at the centre of several serious infections. The collection is of scientific importance in documenting experiments of international significance. Additionally, it reflects the traditional activities of the mussel fishermen themselves. Held by Gwasanaeth Archifau Conwy / Conwy Archive Service https://archiveshub.jisc.ac.uk/data/gb2008-cd3.

Environment Agency Collection, 1786-2010. The collection consists of reports, surveys, data records, maps, administrative records and other material relating to the work of the Environment Agency (and of its predecessor organisations the various River Boards, River Authorities, Water Authorities and the National Rivers Authority). A few documents date back to the 19th century and earlier; the majority span the 1930s to the 1990s. Most of the collection relates to the Agency’s monitoring and management of the area’s river and lake catchments, with an emphasis on fisheries, biodiversity, constructions such as fish passes, weirs and fish traps, fish diseases, water quality and pollution. Included are papers relating to the Agency’s corporate, strategy and public affairs, as well as information on regional and national byelaws, net limitation orders and historic fishery rights. Held by Freshwater Biological Association Archives https://archiveshub.jisc.ac.uk/data/gb986-enva.

A Different Kettle of Fish

Records relating to Ada Fish, First World War munitions worker at Pembrey, 1918-1919. Held by West Glamorgan Archive Service https://archiveshub.jisc.ac.uk/data/gb216-d/dz969.

Fisher Theatre, Bungay, 1790-1886. The Fisher theatre at Bungay, Suffolk, opened in February 1828. Built by David Fisher I, the theatre was one of a dozen serving the circuit of Fisher’s company, The Norfolk and Suffolk Company of Comedians, and seasons of performances were produced on a two-year cycle. The theatre was sold by the Fishers in 1844 and was used subsequently as a corn hall, furniture store, steam laundry, cinema, and textile warehouse. In 2000 the building was acquired by the Bungay Arts Trust. After extensive renovations the building was re-opened in 2006 as a community theatre and arts centre which is also licensed for wedding and civil ceremonies. Held by the University of East Anglia Archives https://archiveshub.jisc.ac.uk/data/gb1187-ftb.

Papers of Robert Salmon Hutton, 1897-1970. Hutton was born in 1876 in London. His family owned a silversmiths in Sheffield. Hutton pursued his research interests in electro-metallurgy with Professor Arthur Schuster at Manchester and Henri Moissan in Paris. From 1900-1908 he was a lecturer in electro-chemistry at the University of Manchester, where he carried out pioneering work on electric furnace technology, seeing its value for commercial metallurgy. In 1903 he perfected a method for the mass production of fused silica. Hutton had a great interest in research and development, and he was aware of failings in this area by British metallurgical industries. A great believer in the value of technical libraries, he was a founder of the Association of Scientific Libraries Information Bureau (ASLIB) in 1924. Held by University of Manchester Library https://archiveshub.jisc.ac.uk/data/gb133-hut.

Engraving of Anthias Anthias, at that time called Anthias Sacer. The author ran out of resources while issuing this book and therefore every engraving had its own sponsor. This one has been sponsored by Sigmund Zois Freiherr von Edelstein. Author: Bloch, Marcus Elieser, 1723-1799. Item/Page/Plate: Pl. 315, opp. p. 86. Image in the Public Domain (https://creativecommons.org/publicdomain/mark/1.0/deed.en; PD-US), courtesy of The New York Public Library, www.nypl.org.

Herring, Thomas (1693-1757). Papers of Thomas Herring, Archbishop of Canterbury 1747-57. 4 volumes, held by Lambeth Palace Library https://archiveshub.jisc.ac.uk/data/gb109-herring.

Papers of George Gordon Hake, 1891-1904. Hake was born in 1847. He spent thirteen years from 1891 working in South Africa, initially with the British South Africa Company and later with the Tanganyika Telegraph Service during 1889 and 1903 in the Mashonaland area. He died in 1903 and was buried at Port Herald. Hake was closely connected to the Rossetti family in their later years, acting as a ‘minder’ to Dante Gabriel Rossetti during one of their family holidays. Christina Rossetti was also godmother to his daughter Ursula. Held by School of Oriental and African Studies (SOAS) Archives, University of London https://archiveshub.jisc.ac.uk/data/gb102-ppms40.

Henry Guppy (1861-1948) was librarian of the John Rylands Library from 1900-1948. Held by University of Manchester Library https://archiveshub.jisc.ac.uk/data/gb133-tft/tft/1/459.

Declaration of Trust of Leasehold Property in Breams Buildings, Chancery Lane, London, 1888. Lease for the Breams Building, which was the main Birkbeck site from 1888-1952. The lease is in the form of a soft cover book, written over several vellum pages, with wax seals on the last page. Held by Birkbeck Library Archives and Special Collections, University of London https://archiveshub.jisc.ac.uk/data/gb1832-bbk/bbk/6/1.

John Whiting Archive, 1917-1963. Whiting, a playwright and actor, was born in 1917 in Salisbury, UK. He received his education at Taunton School and later trained as an actor at the Royal Academy of Dramatic Art. After his time in the army Whiting had some success as an actor and then went on to write numerous plays, short stories and plays for radio. Whiting also took up theatre criticism during the last few years of his life for ‘London Magazine’; some of his work can be found in ‘The Art of the Dramatist’ (1970). Held by V&A Theatre and Performance Collections https://archiveshub.jisc.ac.uk/data/gb71-thm/222.

Roe Manuscripts, 10th-17th century. Sir Thomas Roe was born in 1580 or 1581, and matriculated at Magdalen College, Oxford, in 1593, but took no degree. In 1605 he was knighted, and in 1614 began his official journeys to the East which made him famous. From that year to 1618 he was Ambassador to Jahangir, the Mogul emperor of Hindustan, and from 1621 to 1628 to the Turkish Court. In 1640 Roe was elected a burgess of the University in Parliament, and died in 1644. The manuscript collection comprises: 27 Greek, one Hebrew, one Arabic, and one Latin. Held by the Bodleian Library, University of Oxford https://archiveshub.jisc.ac.uk/data/gb161-mss.roe1-17,18a-b,19-29.

A "tornado" of schooling barracudas at Sanganeb Reef, Sudan. Copyright: Robin Hughes (https://commons.wikimedia.org/wiki/File:Barracuda_Tornado.jpg). Creative Commons 2.0 license: https://creativecommons.org/licenses/by-sa/2.0/deed.en.

Rocket assisted take off by a Barracuda, 1945 – on HMS Trumpeter. 2 photos, held by Gwasanaeth Archifau Conwy / Conwy Archive Service https://archiveshub.jisc.ac.uk/data/gb2008-cp1727/cp1727/4/1/40.

Previous features relating to Fish

Silt, sluices and smelt fishing – The Eau Brink Cut and the Bedford Level Corporation Archive

William Speirs Bruce Archive in the National Museums Scotland Library

1. George Gershwin – Summertime lyrics: https://www.stlyrics.com/songs/g/georgegershwin8836/summertime299720.html

Names (4): Ethics and identity

As archivists, we deal with ethical issues a good deal.  But the ability to link disparate and diverse data sources opens up new challenges in this area, and I wanted to explore this a bit.

If you do a general search for ethics and data, top of the list comes health. An interesting example of data join-up is the move to link health data to census data, which could potentially highlight where health needs are not being met:

“Health services are required to demonstrate that they are meeting the needs of ethnic minority populations. This is difficult, because routine data on health rarely include reliable data on ethnicity. But data on ethnicity are included in census returns, and if health and census data for the same individuals can be linked, the problem might be solved.” (Ethnicity and the ethics of data linkage)

However, individuals who stated their ethnicity in census returns were not told that this might subsequently be linked with their health data. Should explicit informed consent be given? Given the potential benefits, is this a reasonable ask? It is certainly getting into hazardous terrain to ignore the principle of informed consent. In their book ‘Rethinking Informed Consent in Bioethics’, Manson and O’Neill argue that informed consent cannot be fully specific or fully explicit. They argue for a distinctive approach where rights can be waived or set aside in controlled and specific ways.

This leads to a wider question: is fully explicit and specific informed consent actually achievable within the joined-up online world? This is a world where data travels across connections, is blended, re-mixed, re-purposed. A world where APIs allow data to be accessed and utilised for all sorts of purposes, and ‘open data’ has become a rallying cry.  Is there a need to engage the public more fully in order to gain public confidence in what open data really means, and in order to debate what ‘informed consent’ is, and where it is really required?

I am working on a project to create name records, and I am looking at bringing data sources together. Of course, this is hardly new. Wikipedia is the best-known hub for biographical data. Anyone can add anything to a Wikipedia page (within some limits, and with some policing and editing by Wikipedia, but in essence it is an open database).  Wikidata, which underlies Wikipedia, is about bringing sources together in an automated way.  Projects within cultural heritage are also working on linked data approaches to create rich sources of information on people. SNAC has taken archival data from many different archive repositories and brought it together. A page for one person, such as Martin Luther King, provides a whole host of associations and links. These sources are not all individually checked and verified, because this kind of work has to be done algorithmically. However, there is a great deal of provenance information, so that all sources used are clear.

image of page from the face of white australia website
The Face of White Australia

There are some amazing projects working to reveal hidden histories. Tim Sherratt has done some brilliant work with Australian records, including projects such as Invisible Australians, which aims to reveal hidden lives using biographical information found in the records. He has helped to create some wonderful sites that reveal histories that have been marginalised.  Tim talks about ‘hacking heritage’ and says: ‘By manipulating the contexts of cultural heritage collections we can start to see their limits and biases. By hacking heritage we can move beyond search interfaces and image galleries to develop an understanding of what’s missing.’ (Hacking heritage, blog post)  He emphasises that access to indigenous cultural collections should be subject to community consultation and control.  But what does community consultation and control really mean?

I have always been keen to work with the names in archival descriptions – archival creators and all the other people who are associated with a collection. They are listed in the catalogue (leastways the names that we can work with are listed – many names obviously aren’t included, but that’s another story), so they are already publicly declared. It is not a case of whether the name should be made public at all, or, at least, that decision has been made already by the cataloguer.   But our plan is to take the names and bring them to the fore – to give them their own existence within our service.  We are taking them out of the context of a single archive collection and putting them into a broader one. In so doing, we want to give the archive collections themselves more social context, we want to give more effective access to distributed historical records, and we also want to enable researchers to travel through connections to create their own narratives.

This may help to reveal things about our history and highlight the roles that people have played. It may bring to the fore people who have been marginalised.  Of course, it does not address the problem of biases and subjective approaches to accessions and cataloguing. But a joined-up approach may help us to see those biases and gaps; to understand more about the silent spaces.

Creating persistent identifiers and linking data reveals knowledge. It is tempting to see that in simple terms as a good thing.  But what about privacy and ethics?  Even if someone is no longer living, there are still privacy issues, and many people represented in archives are alive.

Do individuals want to be persistently identified? What about if they change their identity? Do they want a pseudonym associated with their real name? They might have very good reasons for keeping their identity private. Persistent identification encourages openness and transparency, which can have real benefits, but it is not always benign.  It is like any information – it can be used for good and bad purposes, and who is to say what is good and what is not? Obviously we have GDPR and the Data Protection Act, and these have a good deal to say about obligations, the value of historical research and the right to be forgotten. This is something we’ll need to take into account. But linked data principles are not so much about working with personal data as working with data that may not seem personal, but that can help to reveal things when linked with other sources of data.

GDPR supports the principle of transparency and the importance of people’s awareness and control over what happens to their personal data. Even if we are not creating and storing personal data, it seems important to engage with data protection and what this means. The challenge of how to think about data when it is part of an ever shifting and growing  global data environment seems to me to be a huge one.

Certainly the horse has bolted to some degree with regards to joining up data. The Web lowered barriers considerably, and now we increasingly have structured data, so it is somewhat like one gigantic database. Finding things out about individuals is entirely feasible with or without something like a Names service created by the Archives Hub. We are not creating any new content, but creating this interface means we are consciously bringing data together, and obviously we want to be responsible, and respect people’s right to privacy. Clearly it is entirely impractical to try to get permission from all those living people who might be included. So, in the end, we are taking a degree of risk with privacy.  Of course, we will un-publish on request, and engage with any feedback and concerns. But at present we are taking the view that the advantages and benefits outweigh the risks.

 

Image of exhibition photograph of black rights march

“Imagine being a sibling in a family that continually removes you from photos; tries its best to erase you…As you go through [the scrapbook] you see events where you know you were there, but you are still missing.”  Lae’l Hughes-Watkins (University of Maryland) gave an impassioned and inspiring talk at DCDC 2019 about her experiences.  She argued that archivists need to interrogate the reality that has been presented, and accept that our ideas of neutrality are misplaced. She wants a history that actively represents her – her history and culture, and experiences as a black woman in the USA. She related moving stories of people with amazing stories (and amazing archives) who distrust cultural institutions because they don’t feel included or represented.

This may seem a long way away from our small project to create name records, but in reality our project could be seen as one very small part of a move towards what Lae’l is talking about.  Bringing descriptions together from across the UK may help us to play a small role in this – aiming to move towards documenting the full breadth of human experience. The archives that we cover may retain the biases and gaps for some time to come (probably for ever, given that documentary evidence tends to represent the powerful and the elite much more strongly), but by aggregating and creating connections with other sources, we help to paint a bigger picture.  By creating name records we help to contextualise people, making it much easier to bring other lives and events into the picture. It is a move towards recognising the limitation of what is actually in the archive, and reaching out to take advantage of what is on the Web.  In doing this through explicitly identifying people we do leave ourselves more open to the dangers of not respecting privacy or anonymity. When we plug fully into the Web, we become a part of its infinite possibilities, which is always going to be a revealing, exciting, uncontrollable and risky business. By allowing others to use this data in different ways, we open it up to diverse perspectives and uses.

Here’s a riddle: how can you work in an Archive Centre when you can’t work in an Archive Centre?

Archives Hub feature for July 2020

It’s a dilemma in this strange and worrying time. The collections are there, you know this. You know they are safe. For the time being, for you to remain safe, for all of us to remain safe, you can’t go near them. But this is your job, and much more than that – a passion. We know that archives are stories, solidified memories of individuals, groups, institutions. Many have been around a lot longer than us, and will be there after we’re gone. But at this point of their long, interesting history, we are their gatekeepers, their tenders. Donors from all walks of life have entrusted us with their stories, letting go of the physical, holding only to the ephemeral, and yet now…now we too are distanced from the physical. So, again, how do we work in an Archive Centre when we can’t work in an Archive Centre?

Blythe Duff is a Scottish actress born in East Kilbride on 25 November 1962.  She has worked continuously since her debut as part of the Scottish Youth Festival in 1984. Though she has gone on to ply her trade mainly in theatre, she is perhaps best known for her role as Detective Sergeant Jackie Reid in the long-running Glasgow-based crime series Taggart. In 2011 she was awarded an honorary doctorate from Glasgow Caledonian University for services to the performing arts and in 2012 was made a cultural fellow of GCU.

It was in this guise that in 2018 she generously donated her decades-worth of accumulated Taggart artefacts to GCU Archive Centre. It is a rich, fascinating and rewarding resource for fans of the show both die-hard and casual, for aspiring scriptwriters, those with an interest in television production, and indeed for anyone with even a passing interest in Glasgow through the lens of British popular culture.

I’ve been thinking about this collection in these fast and slow days, weeks, and months of lockdown, as I adjust to this new, remote set-up. Once the working day is done, the laptop shut for the evening, I find myself, like so many, at a loose end. With so much temporarily closed, the question has become not so much what do I do, as what do I watch?

Blythe Duff and John Michie standing side by side between shelves of archive boxes and materials. Each is looking into camera and holding several scripts.
Blythe Duff and fellow Taggart star John Michie in GCU Archive Centre at the launch of her papers on 24th October 2018.

With this in mind, the Blythe Duff Taggart papers are a fascinating insight into the televisual process of the late 20th century. As a scriptwriting graduate, I am particularly enthralled by the variety of artefacts on offer. There are 138 individual scripts contained in the collection, spanning from Blythe’s debut on the show in 1990 all the way to 2010. Researchers will find a mixture of rehearsal scripts and shooting scripts, a fantastic insight into the malleable nature of the production process. Particularly poignant are the two versions of 1994’s two-parter ‘Legends’. Mark McManus, the titular Taggart, tragically died before production had finished. The two versions, one featuring Detective Chief Inspector Jim Taggart, and the other re-written without him, offer a glimpse into what could have been, as well as the embryonic steps of the show that Taggart was to become.

It is the little details in the collection that draw me back to it – the scribbled notes on the pages, the inside jokes of the cast. Though the collection is currently uncatalogued, researchers will find Blythe’s personalised chair cover and a monogrammed Taggart jacket, along with a photo of Blythe in character in full police uniform. There are books as well: 25 Years of Taggart and Taggart’s Glasgow.  Other artefacts include Taggart wrap party flyers and postcards of different actors from the show – one signed by cast members. There’s even a Taggart Mystery Jigsaw Puzzle game!

Selection of photographs, artefacts, all from television show Taggart, artfully laid on black backdrop.
Selected Taggart treasures from the Blythe Duff papers.

Since becoming available to researchers, it is one of the collections at GCU Archive Centre that has proved most popular with a wide range of visitors. Almost as soon as it was publicised with a visit to the Archive Centre by Blythe and fellow cast member John Michie, we’ve had members of the public – some of whom had never been in an archive before – pop their head into the reading room and ask if they could read an episode. We’ve had a family of fanatics all the way from Australia, a couple from England where the husband surprised his super-fan wife for a special birthday, and many more besides.

It’s also a particularly relevant resource for the University’s learning and teaching as GCU has offered a Masters course in Television Fiction Writing since 2010, the first of its kind in the UK. One of the course leaders, Chris Dolan, was previously a writer for Taggart. Students of the course have examined the scripts, seeing how they’re structured, potentially being inspired in their own work.

Close up photo of cover page of script for episode of Taggart. ‘Blythe’ handwritten in top corner.
Cover page of one of Blythe’s scripts.

The frustration of not being able to go into the Archive Centre each day, not being able to see collections, or chat to team members with ease, is very real. Nonetheless, we have all adjusted to working from home. Team meetings still occur through the magic of MS Teams, projects are still ongoing, new challenges arise and are met. And in the thick of the unprecedented time we are in, if I think back to my initial question, I realise it is possible to work in an Archive Centre even if you can’t work there. For it is the collective knowledge we have, and our willingness to ensure collections are protected and available to as many as possible, that is the lifeblood of archival work. Archives are indeed stories, and at this juncture we’ve reached a twist worthy of Taggart himself. But the path we’re on, though long and difficult, will lead us all back to where we want to be. It’s too tragic a time to call it a happy ending, but we’ve certainly had enough of cliff-hangers and will take a bittersweet conclusion.

David Ward
Archive Assistant
Glasgow Caledonian University Archive Centre – Sir Alex Ferguson Library

Related

Browse all Glasgow Caledonian University Archives and Special Collections descriptions available to date on the Archives Hub

All images copyright Glasgow Caledonian University. Reproduced with the kind permission of the copyright holders.

Names (3): One name record to bind them

It has been great to get comments and feedback around names, and I wanted to expand upon something that a few people have commented on: the ideal of one ‘authority record’ for one person or organisation.

model showing relationships of catalogues and name records
Model showing potential relationships between catalogues and name records

 

The above diagram is a proposal for the relationships we might have – note that it is a working model, and may well change over time. You can see the catalogues (the descriptions of archives) include people, some with biographical histories, and these people are either creators of archive collections or referenced in them.  Each of these people then gets a name record (bottom left box), so we might have e.g. three name records for the same name (and the same name may potentially be the same person…or may not). We will work with the store of records that we have with the aim of creating matches, and ending up with a generic or main name record (green box, top left).

The ‘main record’ or ‘master record’ or whatever we might call it, for each individual person or organisation, is not an ‘archival record’. It is not intended simply to be a reflection of what is in our own data. It is intended to be a page dedicated to that person or organisation.  Our current feeling is that this should not be seen as domain specific; in fact, we want to get away from the idea that data is domain specific.  It is about an entity (a person or organisation), and what we know of that entity.

Keeping in mind the green box, and looking at the person page for Robin Day from Exploring British Design, a previous AHRC project we ran with Brighton Design Archive, you get a sense of the type of thing we mean.

Page for Robin Day, from the Exploring British Design website
Exploring British Design: Robin Day

This page presents as a general information page about a designer. It is not branded as a page about archives. It takes information in from different sources. Is it an ‘authority’ record?  I’m really not sure; I wouldn’t call it that. The point is really that it enables researchers to put Robin Day into the context of other people, organisations, places and events, or at least it demonstrates how that can be done. It creates a network, and it intends to show the value of including archives in a network, rather than standing apart, in their ‘own world’.

Screenshot of an entity relationship diagram for Robin Day
Visualised relationships

 

The network can easily be visualised. There are tools out there to do this. The challenge is to create the data to feed into these visualisers. Again, this visualisation is not about archival name authority records, it is not domain specific.

In the Robin Day page, we have a section for related archives and museum resources.

screenshot showing archives related to Robin Day
Related archive and museum resources

 

This lists archives Robin Day is the ‘creator of’ or archives he is ‘associated with’.  It links to the Archives Hub, but also to other sources. One of the options for end users is to go and find out more about the archival sources, but it is not prioritised above other options.

So, this is essentially the idea – a page for a person, a page for an organisation. An information resource that focuses on creating a network of connections.  We think this is a good approach, but creating something along these lines that is automated, sustainable and effective within an ongoing national service is much harder.

Why not just use this one record, link to the archive catalogues, and dispense with the individual name records that we have created? There are three reasons to consider providing access to the individual name records: biographical history, uncertainty around matching, and ingesting name authority records.

I have already written about biographical and administrative history in a separate post.

In this phase of the Names Project the individual records for Beatrice Webb (as a name example) will be created either from the creator name or from the index terms that we have in the Archives Hub catalogues.

The main problem is the wide variation in name entries.

Webb, Beatrice
Webb, Beatrice, née Potter
Webb (Martha) Beatrice, 1858-1943
Webb, Martha Beatrice, 1858-1943
Webb;[Martha] Beatrice [nee Potter] 1858-1943

These are all entries in the Archives Hub.  We can match them all up, but can we say they are all the same?   Names without dates should not be matched with certainty, but quite often they will be the same person. (Beatrix Potter also often ends up being linked with Beatrice Webb, née Potter).
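
As an illustration only, the matching problem above can be sketched in a few lines of Python. This is not how the Archives Hub processes names; the function names, the parsing rules and the three outcome labels ("same", "possible", "different") are all assumptions made for this example. The sketch treats the first token as the surname, ignores a "née" maiden name, and only reports "same" when both entries carry identical life dates.

```python
import re
import unicodedata
from dataclasses import dataclass
from typing import Optional

# Life dates such as "1858-1943" (hyphen or en dash).
DATES = re.compile(r"\b(\d{4})\s*[-–]\s*(\d{4})\b")
# "nee Potter" records a maiden name, not a forename (accents are stripped first).
MAIDEN = re.compile(r"\bnee\s+\w+", re.IGNORECASE)


@dataclass(frozen=True)
class ParsedName:
    surname: str
    forenames: frozenset
    dates: Optional[str]  # None when the entry has no life dates


def strip_accents(text: str) -> str:
    """Turn 'née' into 'nee' so one pattern covers both spellings."""
    decomposed = unicodedata.normalize("NFKD", text)
    return "".join(c for c in decomposed if not unicodedata.combining(c))


def parse_entry(entry: str) -> ParsedName:
    """Hypothetical parser: assumes the surname is the first token."""
    text = strip_accents(entry)
    found = DATES.search(text)
    dates = f"{found.group(1)}-{found.group(2)}" if found else None
    text = DATES.sub(" ", text)
    text = MAIDEN.sub(" ", text)
    # Brackets and separators vary between catalogues, so treat them as spaces.
    text = re.sub(r"[()\[\];,.]", " ", text)
    tokens = text.lower().split()
    if not tokens:
        raise ValueError(f"no name found in entry: {entry!r}")
    return ParsedName(tokens[0], frozenset(tokens[1:]), dates)


def compare(a: str, b: str) -> str:
    """Return 'same', 'possible' or 'different' for two name entries."""
    pa, pb = parse_entry(a), parse_entry(b)
    if pa.surname != pb.surname or not (pa.forenames & pb.forenames):
        return "different"
    if pa.dates and pb.dates:
        return "same" if pa.dates == pb.dates else "different"
    # At least one entry lacks dates: a match is likely but not certain.
    return "possible"
```

With these rules, the two dated Webb entries compare as "same", "Webb, Beatrice" against a dated entry is only "possible", and "Potter, Beatrix" is kept apart because the maiden name is never used as a matching key. Real data would need far more careful handling than this.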

The decision we need to make is whether to provide links to these individual name records that we will have, or only use them as a source of data.  It seems valuable to enable end users to see these names as a group, but it is another thing to risk integrating information from them all into one name record.  There is no perfect answer to this, but it does seem important to clearly indicate the level of uncertainty.  So many names that we have don’t have life dates, or have variations in structure.  What we are looking to achieve is a clear provenance, giving end users the best understanding of what they are seeing.

What about name records that have been created by our contributors?  The name records we create ourselves from catalogue descriptions will generally be no more than the name, dates, and biographical history.  But, going forwards, we will want to work with much more detailed name records.

For Exploring British Design we created rich name records with an entity-relationship structure (essentially using the EAC-CPF structure and working in RDF),  to demonstrate the power of connecting entities.  For this purpose, we partially hand-crafted the name records, as well as carrying out some very complex processing to create various connections.

screenshot of part of the timeline for Robin Day
Part of the timeline for Robin Day

The example above shows events from the Robin Day timeline, with linked connections to related organisations.  If we ingest EAC-CPF records we might get timelines like this.

Name records may also include relationships. The Borthwick Institute has good examples of name records with plenty of rich relationship information, e.g. Charles Lindley Wood, Viscount Halifax.

screenshot of part of the Viscount Wood record showing relationships to other people
An excerpt from a Borthwick entry for Charles Lindley Wood

If we took this record into the Archives Hub it might seem to make sense for it to become the main person record for Wood.  But that would involve a process of making choices, preferencing one name record over another.  Possible, but tricky to do in an automated way. Another record office might also have a splendid example of a name entry for this person, with some different data. Furthermore, this record has links to the Borthwick catalogue. We would potentially have to remove these links.

It would be very challenging to create one record from several source EAC-CPF records for the same person –  to blend timelines, or sort out relationships listed in different records, bearing in mind that it needs to be done in an automated way, keeping version control and dealing with revisions and new data coming in that might add to the name record.  How could we compare and blend two lists of relationships? Or two chronologies? We’d probably end up having to keep them all, and then potentially have similar but different relationships and chronologies, giving a slightly confused user experience.
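
To make the difficulty concrete, here is a minimal sketch of one naive way to blend relationship lists while keeping provenance. The function name, the data shape (pairs of relation and target) and the placeholder source names are assumptions for illustration, not the EAC-CPF structure or anything we have built.

```python
from collections import defaultdict


def blend_relationships(sources):
    """Merge relationship lists from several name records.

    `sources` maps a source label (hypothetical, e.g. a record office)
    to a list of (relation, target) pairs. The result maps each distinct
    (relation, target) pair to the sorted list of sources asserting it.
    """
    merged = defaultdict(set)
    for source, relationships in sources.items():
        for relation, target in relationships:
            # Only case and surrounding whitespace are normalised; wording
            # differences such as "child of" vs "son of" stay separate.
            key = (relation.strip().lower(), target.strip().lower())
            merged[key].add(source)
    return {key: sorted(names) for key, names in merged.items()}


# Placeholder data, not taken from any real name record.
blended = blend_relationships({
    "Record office A": [("child of", "Person X"), ("member of", "Society Y")],
    "Record office B": [("Child of", "person x"), ("son of", "Person X")],
})
```

Here the identical "child of" relationship is merged and credited to both sources, but "son of" survives as a separate entry even though it means much the same thing. That is exactly the "similar but different" outcome described above.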

If we do ingest records like the one above, we will have to figure out how these  more detailed records will relate to what we have already created.  If, as planned, we have one generic name record for a person, it makes the job easier, as we won’t be looking to make any one EAC-CPF record into the main name record, we will simply link to it from the main record. Bear in mind, our main record is intended to be a domain-neutral entry – linking to other sources beyond archives.  EAC-CPF records might do this to some extent, but they are unlikely to link to the Jisc Library Hub, and probably won’t link to Wikidata, or other external sources.   They are far more likely to provide internal links to the archive catalogue they relate to.

Arguably, it might be easier to forget about creating name records ourselves (from the catalogue entries) and just work with name records that have been created by our contributors (which are likely to be well-structured and include life dates). But if we do that, the pot of names will grow slowly, as only a small proportion of repositories create name records. We can’t realistically give the end user a few thousand name records covering maybe 1-2% of our names – they might search for ‘Winston Churchill’ as a name, and find that we don’t have him!  It would not remove the problem of name matching, and it would make the whole idea of reaching out beyond the archive domain, by linking into other resources using our names as the hook, rather ineffectual.

Therefore, we propose to keep the separate name records in our system. We also propose to create a ‘generic record’, which is what would be prominent in the Archives Hub display. We would then have the potential to link the records together, to blend them, to try some text mining and analysis techniques. It gives us options.  It would not be sensible to make those decisions now. It is better to lay the groundwork that enables us to be flexible.   This approach allows us to link to an individual name record where we don’t feel able to confirm a ‘same as’ relationship. It presents the option to the end user – here is a name – we think this is the same person, so we’ve provided a link.
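
One possible shape for this, sketched in Python purely as an illustration: a generic record that holds links to the individual name records, each link carrying its own level of certainty. The class names, field names and the two link types are assumptions for this example and do not describe the actual Archives Hub data model.

```python
from dataclasses import dataclass, field
from enum import Enum


class Match(Enum):
    SAME_AS = "same as"                  # match we feel able to confirm
    POSSIBLY_SAME = "possibly same as"   # offered to the user as a link only


@dataclass(frozen=True)
class SourceNameRecord:
    record_id: str    # hypothetical identifier
    name_entry: str   # the name exactly as catalogued
    repository: str   # provenance: where the entry came from


@dataclass
class GenericNameRecord:
    label: str
    links: list = field(default_factory=list)  # (SourceNameRecord, Match) pairs

    def link(self, record: SourceNameRecord, match: Match) -> None:
        """Attach a source record without merging its content."""
        self.links.append((record, match))

    def confirmed(self) -> list:
        return [r for r, m in self.links if m is Match.SAME_AS]

    def uncertain(self) -> list:
        return [r for r, m in self.links if m is Match.POSSIBLY_SAME]


# Placeholder repositories and identifiers, for illustration only.
webb = GenericNameRecord("Webb, Martha Beatrice, 1858-1943")
webb.link(SourceNameRecord("n1", "Webb (Martha) Beatrice, 1858-1943", "Repository A"), Match.SAME_AS)
webb.link(SourceNameRecord("n2", "Webb, Beatrice", "Repository B"), Match.POSSIBLY_SAME)
```

Because the source records are linked rather than blended, a display could show confirmed matches and "we think this is the same person" links separately, and any later decision to merge content remains open.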

The end user experience needs to make sense and not mislead or provide false information. Links to brief name records could seem confusing, but, as I have said, trying to bring together in one record all the information from several name records, with  their biographies, relationships, aliases, events, related resources, is likely to be a nightmare.  In the end, it will take a good deal more testing and working with researchers to work out what is best.