EAD and Next Generation Discovery

This post is in response to a recent article in Code4Lib, ‘Thresholds for Discovery: EAD Tag Analysis in ArchiveGrid, and Implications for Discovery Systems‘ by M. Bron, M. Proffitt and B. Washburn. All quotes are from that article, which looked at the instances of tags within ArchiveGrid, the US based archival aggregation run by OCLC. This post compares some of their findings to the UK based Archives Hub.

Date

In the ArchiveGrid analysis, use of the <unitdate> field is around 72% within the high-level (usually collection level) description. The Archives Hub does significantly better here, with an almost universal inclusion of dates at this level of description. Therefore, a date search is not likely to exclude any potentially relevant descriptions. This is important, as researchers are likely to want to restrict their searches by date. Our new system also allows sorting retrieved results by date. The only issue we have is where the dates are non-standard and cause the ordering to break down in some way. But we do have both displayed dates and normalised dates, to enable better machine processing of the data.
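To illustrate why holding a normalised date alongside the displayed date helps with sorting, here is a minimal Python sketch. It assumes the EAD 2002 convention of a normal attribute on <unitdate> holding a year or a year range separated by a slash. The sample records and the function names are invented for illustration and are not taken from the Archives Hub system.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample data: three collection-level <did> fragments.
SAMPLE = """
<records>
  <did><unittitle>Papers A</unittitle>
       <unitdate normal="1920/1935">1920s-mid 1930s</unitdate></did>
  <did><unittitle>Papers B</unittitle>
       <unitdate normal="1875">c. 1875</unitdate></did>
  <did><unittitle>Papers C</unittitle>
       <unitdate>early twentieth century</unitdate></did>
</records>
"""


def start_year(did):
    """Return the first year in the normal attribute, or None if it is absent or unparsable."""
    unitdate = did.find("unitdate")
    normal = unitdate.get("normal") if unitdate is not None else None
    if not normal:
        return None
    first = normal.split("/")[0][:4]
    return int(first) if first.isdigit() else None


def sort_by_date(dids):
    """Sort records by start year; records without a usable normalised date go last."""
    return sorted(dids, key=lambda d: (start_year(d) is None, start_year(d) or 0))


records = ET.fromstring(SAMPLE).findall("did")
ordered = [d.findtext("unittitle") for d in sort_by_date(records)]
print(ordered)
```

The displayed date is left untouched for the researcher to read, and only the normalised value drives the ordering. A description with a non-standard date and no normalised form simply drops to the end rather than breaking the sort.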

Collection Title

“for sorting and browsing…utility depends on the content of the element.”

Titles are always provided, but they are very varied. Setting aside lower-level descriptions, which are particularly problematic, titles may be more or less informative. We may introduce sorting by title, but the utility of this will be limited. It is unlikely that titles will ever be controlled to the extent that they have a level of consistency, but it would be fascinating to analyse titles within the context of the ways people search on the Web, and see if we can gauge the value of different approaches to creating titles. In other words, what is the best type of title in terms of attracting researchers’ attention, search engine optimisation, display within search engine results, etc?

Lower-level descriptions tend to have titles such as ‘Accounts’, ‘Diary’ or something more difficult to understand out of context such as ‘Pigs and boars’ or ‘The Moon Dragon’. It is clearly vital to maintain the relationship of these lower-level descriptions to their parent level entries, otherwise they often become largely meaningless. But this should be perfectly possible when working on the Web.

It is important to ensure that a researcher finding a lower-level description through a general search engine gets a meaningful result.
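One way to keep a lower-level title meaningful out of context is to build a display title from the titles of its parent entries. The Python sketch below is only an illustration under assumptions: it uses numbered component tags (<c01>, <c02>) nested in <dsc>, the sample hierarchy is invented, and contextual_titles is a hypothetical helper, not how the Archives Hub generates its headings.

```python
import xml.etree.ElementTree as ET

# Hypothetical hierarchy: a collection, a series and a file-level entry.
SAMPLE = """
<archdesc level="fonds">
  <did><unittitle>Example Estate Papers</unittitle></did>
  <dsc>
    <c01 level="series">
      <did><unittitle>Farm records</unittitle></did>
      <c02 level="file">
        <did><unittitle>Pigs and boars</unittitle></did>
      </c02>
    </c01>
  </dsc>
</archdesc>
"""


def contextual_titles(root, separator=" > "):
    """Return one display title per described unit, prefixed by its ancestors' titles."""
    titles = []

    def walk(element, trail):
        title = element.findtext("did/unittitle")
        if title is not None:
            trail = trail + [title.strip()]
            titles.append(separator.join(trail))
        for child in element:
            if child.tag != "did":  # <did> holds the unit's own description
                walk(child, trail)

    walk(root, [])
    return titles


titles = contextual_titles(ET.fromstring(SAMPLE))
for line in titles:
    print(line)
```

A page title or search engine snippet built this way carries the collection name and the lower-level heading together.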

Archives Hub search result from a Google search
A search result within Google

The above result is from a search for ‘garrick theatre archives joanna lumley’ – the sort of search a researcher might carry out. Whilst the link is directly to a lower-level entry for a play at the Garrick Theatre, the heading is for the archive collection. This entry is still not ideal, as the lower-level heading should be present as well. But it gives a reasonable sense of what the researcher will get if they click on this link. It includes the <unitid> from the parent entry and the URL for the lower-level entry, with the first part of the <scopecontent> for the entry. It also includes the Archives Hub tag line, which could be considered superfluous to a search for Garrick Theatre archives! However, it does help to embed the idea of a service in the mind of the researcher – something they can use for their research.

Extent

“It would be useful to be able to sort by size of collection, however, this would require some level of confidence that the <extent> tag is both widely used and that the content of the tag would lends itself to sorting.”

This was an idea we had when working on our Linked Data output. We wanted to think about visualizations that would help researchers get a sense of the collections that are out there, where they are, how relevant they are, and so on. In theory the ‘extent’ could help with a weighting system, where we could think about a map-based visualization showing concentrations of archives about a person or subject. We could also potentially order results by size – from the largest archive to the smallest archive that matches a researcher’s search term. However, archivists do not have any kind of controlled vocabulary for ‘extent’. So, within the Archives Hub this field can contain anything from numbers of boxes and folders to length in linear metres, volume in cubic metres and items in terms of numbers of photographs, pamphlets and other formats. ISAD(G) doesn’t really help with this; the examples it gives simply serve to show how varied the description of extent can be.
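The difficulty can be seen in a small Python sketch. The extent statements below are invented examples of the kinds just described, and the regular expression is a deliberately naive assumption about how such statements might be read. It is not a recommended parser.

```python
import re

# Hypothetical extent statements of the kinds found in archival descriptions.
EXTENTS = [
    "12 boxes",
    "3.5 linear metres",
    "0.2 cubic metres",
    "450 photographs",
    "1 box and 2 folders",
    "Substantial",
]

# A number followed by a one-word unit, or by a two-word unit ending in "metres".
PATTERN = re.compile(r"(\d+(?:\.\d+)?)\s+([a-z]+(?:\s+metres)?)", re.IGNORECASE)


def parse_extent(text):
    """Return every (quantity, unit) pair found in a free-text extent statement."""
    return [(float(quantity), unit.lower()) for quantity, unit in PATTERN.findall(text)]


# Collect the distinct units that turn up across the sample.
units = sorted({unit for text in EXTENTS for _, unit in parse_extent(text)})

for text in EXTENTS:
    print(f"{text!r:25} -> {parse_extent(text)}")
print(units)
```

Even where a number can be extracted, the units are not comparable with each other: there is no reliable way to rank 12 boxes against 450 photographs or 3.5 linear metres, and some statements yield no number at all. Sorting by size would need either an agreed unit or a conversion table that the data does not currently support.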

Genre

“Other examples of desired functionality include providing a means in the interface to limit a search to include only items that are in a certain genre (for example, photographs)”.

This is something that could potentially be useful to researchers, but archivists don’t tend to provide the necessary data. We would need descriptions to include the genre, using controlled vocabulary. If we had this we could potentially enable researchers to select types of materials they are interested in, or simply include a flag to show, e.g. where a collection includes photographs.

The problem with introducing a genre search is that you run the risk of excluding key descriptions, because the search will only include results where the description includes that data in the appropriate location. If the word ‘photograph’ appears only in the general description, then a specific genre search won’t find it. This means a large collection of photographs may be excluded from a search for photographs.
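The risk can be shown with two invented descriptions in a short Python sketch. It assumes that genre terms are recorded in <genreform> within <controlaccess>; the records, identifiers and function names are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical records: "a" mentions photographs only in free text,
# "b" has a structured genre term.
SAMPLE = """
<records>
  <archdesc id="a">
    <did><unittitle>Studio collection</unittitle></did>
    <scopecontent><p>Around 5,000 photographs of theatre productions.</p></scopecontent>
  </archdesc>
  <archdesc id="b">
    <did><unittitle>Family papers</unittitle></did>
    <scopecontent><p>Letters and diaries.</p></scopecontent>
    <controlaccess><genreform>Photographs</genreform></controlaccess>
  </archdesc>
</records>
"""


def genre_search(records, term):
    """Match only descriptions that carry the term in a <genreform> access point."""
    term = term.lower()
    return [
        record.get("id")
        for record in records
        if any(term in (genre.text or "").lower() for genre in record.iter("genreform"))
    ]


def free_text_search(records, term):
    """Match descriptions that contain the term anywhere in their text."""
    term = term.lower()
    return [
        record.get("id")
        for record in records
        if term in " ".join(record.itertext()).lower()
    ]


records = ET.fromstring(SAMPLE).findall("archdesc")
print(genre_search(records, "photograph"))
print(free_text_search(records, "photograph"))
```

The genre search returns only record "b", even though record "a" is the large photographic collection. The free-text search finds both, but cannot tell a collection of photographs from one that merely mentions them.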

Subject

In the Bron/Proffitt/Washburn article <controlaccess> is present around 72% of the time. I was surprised that they did not choose to analyse tags within <controlaccess> as I think these ‘access points’ can play a very important role in archival description. They use the presence of <controlaccess> as an indication of the presence of subjects, and make the point that “given differences in library and archival practices, we would expect control of form and genre terms to be relatively high, and control of names and subjects to be relatively low.”

On the Archives Hub, use of subjects is relatively high (as well as personal and corporate names) and use of form and genre is very low. However, it is true to say that we have strongly encouraged adding subject terms, and archivists don’t generally see this as integral to cataloguing (although some certainly do!), so we like to think that we are partly responsible for such a high use of subject terms.

Subject terms are needed because they (1) help to pull out significant subjects, often from collections that are very diverse, (2) enable identification of words such as ‘church’ and ‘carpenter’ (i.e. they are subjects, not surnames), (3) allow researchers to continue searching across the Archives Hub by subject (subjects are all linked to the browse list) and therefore pull collections together by theme, and (4) enable advanced searching (which is substantially used on the Hub).

Names (personal and corporate)

In Bron/Proffitt/Washburn the <origination> tag is present 87% of the time. The analysis did not include the use of <persname> and <corpname> within <origination> to identify the type of originator. In the Archives Hub the originator is a required field, and is present 99%+ of the time. However, we made what I think is a mistake in not providing for the addition of personal or corporate name identification within <origination> via our EAD Editor (for creating descriptions) or by simply recommending it as best practice. This means that most of our originators cannot be distinguished as people or corporate bodies. In addition, we have a number where several names are within one <origination> tag and where terms such as ‘and others’, ‘unknown’ or ‘various’ are used. This type of practice is disadvantageous to machine processing. We are looking to rectify it now, but addressing something like this in retrospect is never easy to do. The ideal is that all names within origination are separately entered and identified as people or organisations.
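The kind of retrospective check described here could be sketched as follows in Python. This is an illustration under assumptions, not the Archives Hub's actual process: the sample <origination> entries are invented, the placeholder terms are the ones mentioned above, and treating a semicolon as a sign of several names is a crude heuristic.

```python
import re
import xml.etree.ElementTree as ET

# EAD elements that identify the type of name within <origination>.
NAME_TAGS = {"persname", "corpname", "famname"}
PLACEHOLDERS = re.compile(r"\b(and others|unknown|various)\b", re.IGNORECASE)

# Hypothetical originator entries.
SAMPLES = [
    "<origination>Smith, John, 1850-1920</origination>",
    "<origination><persname>Smith, John, 1850-1920</persname></origination>",
    "<origination>Smith family; Jones and Co; and others</origination>",
]


def origination_problems(origination):
    """List the ways an <origination> entry is awkward for machine processing."""
    problems = []
    text = " ".join(origination.itertext()).strip()
    typed = [child for child in origination if child.tag in NAME_TAGS]
    if not typed:
        problems.append("name not identified as person, family or corporate body")
    if PLACEHOLDERS.search(text):
        problems.append("placeholder term")
    if ";" in text and len(typed) <= 1:
        # Assumption: a semicolon suggests several names entered together.
        problems.append("possibly several names in one tag")
    return problems


reports = [origination_problems(ET.fromstring(sample)) for sample in SAMPLES]
for sample, report in zip(SAMPLES, reports):
    print(sample, "->", report or "ok")
```

A report like this only flags candidates. Deciding whether an untyped name is a person or an organisation, and splitting combined entries, would still need human judgement or a lookup against name authorities.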

We do also have names within <controlaccess>, and this brings the same advantages as for <subject>, ensuring the names are properly structured and can be used for searching and for bringing together archives relating to any one individual or organisation.

Repository

“Use of this element falls into the promising complete category (99.46%: see Table 7). However, a variety of practice is in play, with the name of the repository being embellished with <subarea> and <address> tags nested within <repository>.”

On the Archives Hub repository is mandatory, but as yet we do not have a checking system whereby a description is rejected if it does not contain this field. We are working towards something like this, using scripts to check for key information to help ensure validity and consistency at least to a minimum standard. On one occasion we did take in a substantial number of descriptions from a repository that omitted the name of repository, which is not very useful for an aggregation service! However, one thing about <repository> is that it is easy to add because it is always the same entry. Or at least it should be… we did recently discover that a number of repositories had entered their name in various ways over the years and this is something we needed to correct.
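A checking script of the sort we have in mind might look something like the Python sketch below. The list of required fields is an assumption made for the example, as are the sample description and the function names. It is not the Hub's actual validation.

```python
import xml.etree.ElementTree as ET

# Assumed minimum set of fields, given as paths relative to <archdesc>.
REQUIRED = ["did/repository", "did/unittitle", "did/unitdate", "did/origination"]

# Hypothetical description with no <repository>.
SAMPLE = """
<archdesc level="fonds">
  <did>
    <unittitle>Example Papers</unittitle>
    <unitdate normal="1900/1950">1900-1950</unitdate>
    <origination>Example, Ann, 1880-1960</origination>
  </did>
</archdesc>
"""


def missing_fields(archdesc):
    """Return the required paths that are absent or contain no text."""
    missing = []
    for path in REQUIRED:
        element = archdesc.find(path)
        if element is None or not "".join(element.itertext()).strip():
            missing.append(path)
    return missing


def normalise_name(name):
    """Collapse whitespace and case so that variant spellings of a repository name can be grouped."""
    return " ".join(name.split()).casefold()


missing = missing_fields(ET.fromstring(SAMPLE))
print(missing)
print(normalise_name("  Example  University Library "))
```

A description with a non-empty result from missing_fields could be held back for correction rather than loaded. The normalise_name helper only catches differences of spacing and capitalisation, so variants such as abbreviated names would still need a manual mapping.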

Scope and content, biographical history and abstract

It is notable that in the US <abstract> is widely used, whereas we don’t use it at all. It is intended as a very brief summary, whereas <scopecontent> can be of any length.

“For search, its worth noting that the semantics of these elements are different, and may result in unexpected and false “relevance””

One of the advantages of including <controlaccess> terms is that they mitigate this kind of false relevance, as restricted field searching makes it possible to search for ‘mason’ as a person or for ‘mason’ as a subject.

The Bron/Proffitt/Washburn analysis shows <bioghist> used 70% of the time. This is lower than the Archives Hub, where it is rare for this field not to be included. Archivists seem to have a natural inclination to provide a reasonably detailed biographical history, especially for a large collection focussed on one individual or organisation.

Digital Archival Objects

It is a shame that the analysis did not include instances of <dao>, but it is likely to be fairly low (in line with previous analysis by Wisser and Dean, which puts it lower than 10%). The Archives Hub currently includes around 1,200 instances of images or links to digital content. But what would be interesting is to see how this is growing over time and whether the trajectory indicates that in 5 years or so we will be able to provide researchers with routes into much of the Archives Hub content. However, it is worth bearing in mind that many archives are not digitised and are not likely to be digitised, so it is important for us not to raise expectations that links to digital content will become a matter of course.

The Future of Discovery

“In order to make EAD-encoded finding aids more well suited for use in discovery systems, the population of key elements will need to be moved closer to high or (ideally) complete.”

This is undoubtedly true, but I wonder whether the priority over and above completeness is consistency and controlled vocabulary where appropriate. There is an argument in favour of a shorter description that may exclude certain information about a collection, but is well structured and easier to machine process. (Of course, having both completeness and consistency is the ideal!)

The article highlights geo-location as something that is emerging within discovery services. The Archives Hub is planning on promoting this as an option once we move to the revised EAD schema (which will allow for this to be included), but it is a question of whether archivists choose to include geographical co-ordinates in their catalogues. We may need to find ways to make this as easy as possible and to show the potential benefits of doing so.

In terms of the future, we need a different perspective on what EAD can and should be:

“In the early days of EAD the focus was largely on moving finding aids from typescript to SGML and XML. Even with much attention given over to the development of institutional and consortial best practice guidelines and requirements, much work was done by brute force and often with little attention given to (or funds allocated for) making the data fit to the purpose of discovery.”

However, I would argue that one of the problems is that archivists sometimes still think in terms of typescript finding aids; of a printed finding aid that is available within the search room, and then made available online… as if they are essentially the same thing and we can use the same approach with both. I think more needs to be done to promote, explain and discuss ‘next generation finding aids’. By working with Linked Data, I have gained a very different perspective on what is possible, challenging the traditional approach to hierarchical finding aids.

Maybe we need some ‘next generation discovery’ workshops and discussions – but in order to really broaden our horizons we will need to take heed of what is going on outside of our own domain. We can no longer consider archival practice in isolation from discovery in the most general sense because the complexity and scale of online discovery requires us to learn from others with expertise and understanding of digital technologies.

We’re supporting EXPLORE YOUR ARCHIVE

Logo, Explore Your Archives campaign
Explore Your Archive, http://www.exploreyourarchive.org, developed by The Archives and Records Association (UK and Ireland) and The National Archives, is the biggest ever public awareness campaign by the archives sector of the UK and Ireland.

From 16 November there will be hundreds of events and activities taking place in all kinds of archives. Those who work in archives will also be sharing some of their wonderful stories and amazing treasures. The public are being encouraged not just to visit an archive or explore archival collections online, but to understand more of the vital role which archives play in education, business, transparency and identity.

How the Hub fits in

The Archives Hub is a gateway to archives held at over 220 institutions and organisations across the UK.

Explore…

Using our map to discover archives close to you:
http://archiveshub.ac.uk/contributorsmap/.

Search…

Using the Hub search at http://archiveshub.ac.uk/search.html to uncover other collections.

Discover…

Image: Ballerina advert.
© TSB savings advert, c. 1950. Lloyds Banking Group Archives.

A rich variety of content: The breadth of content on the Hub highlights how archives are integral to historical and cultural awareness. Our contributors include Universities, business archives, charities, local government, libraries, museums and cathedrals.

Here are just a few of the collections you can find:

From the Ancient…

Canterbury Cathedral: Records of the Dean and Chapter of Canterbury Cathedral, c800 to present. http://archiveshub.ac.uk/data/gb054-cca/dcc

The collection of records of Canterbury Cathedral includes material dating from the early Middle Ages right up to the present day. The material relates to the Cathedral’s estates and reflects the activities of the Dean and Chapter and its staff.

… to the Contemporary

Archive of the National Theatre of Scotland, 2006 to present.
http://archiveshub.ac.uk/data/gb247-stants

Launched in February 2006 and billing itself as a ‘theatre without walls’, the National Theatre of Scotland has no building of its own and operates within the existing infrastructure of Scottish theatre. Material is held at Glasgow University Library and includes programmes, press-cuttings, reviews and scripts.

From the Large…

Royal Greenwich Observatory: Records and Papers, 1675-1998.
http://archiveshub.ac.uk/data/gb012-ms.rgo

With around one kilometre of material, the records consist of all the surviving historical paper records of the Royal Observatory. Collections include: papers of the Astronomers Royal and telescope construction projects, management and observations, including the William Herschel Telescope and Radcliffe Observatory.

… to the Small

Gaelic Manuscripts, c. 1732-c. 1869. http://archiveshub.ac.uk/data/gb752-gm

One reel of microfilm comprising images of 23 original Gaelic manuscripts, relating to Ireland and to the activities of Irishmen at home and abroad, held at Queen’s University Belfast. It consists largely of fragments of both religious and secular verse, topographical poems and other tracts and tales dating mainly from the 18th and 19th centuries.

From the Young…

Children’s Society, 18th century – 21st century.
http://archiveshub.ac.uk/data/gb2180-tcs

The Children’s Society Archive comprises the records created and managed by The Children’s Society (titled The Waifs and Strays Society from 1881 to 1946). The majority of the collections date from the organisation’s founding in 1881. This includes a large quantity of visual material in the form of photographs and publicity material, as well as some audio-visual material.

… to the Older generation

Scrapbooks of Barking and Dagenham Branch of Age Concern, 2002-2008.
http://archiveshub.ac.uk/data/gb0350-bd58

This collection comprises six scrapbooks, containing newspaper cuttings on the Barking and Dagenham Branch of Age Concern, relating to events, as well as issues affecting elderly people in the borough.

From Northern Scotland…

Thomas S Muir, Architectural notes on churches on Scottish islands, 1850-1872. http://archiveshub.ac.uk/data/gb227-msbr783.m9

Thomas S Muir (1802-1888) worked for most of his life as a book-keeper in Edinburgh. All his spare time was devoted to his passion for early Scottish churches, visiting all the locations where ruins were to be found, including even the most inaccessible islands. The volume, ‘Ecclesiological notes on some of the islands of Scotland’, comprises detailed architectural descriptions, with line drawings, of features of churches and other ecclesiastical remains.

… to the Southerly Channel Islands

Image: Jersey Archive.

Archive of the States of Jersey, 1603 – 2010.
http://archiveshub.ac.uk/data/gb1539-c

The States of Jersey collection includes the minutes, correspondence, reports and acts of the States of Jersey. It also includes the minutes of the different Committees of the States, including Agriculture, Education, Defence, Housing, Social Security, Finance, Harbours and Airports, Health and Social Services, Tourism, Home Affairs, Planning and Environment, Economic Development and Policy and Resources.

From the Frozen Antarctic…

British Australian New Zealand Antarctic Research Expedition, 1929-1934. http://archiveshub.ac.uk/data/gb015-banzare

The collection comprises press cuttings relating to the British Australian New Zealand Antarctic Research Expedition, 1929-1931.

…to the Heat of Africa

Africa 95, c. 1957-1996. http://archiveshub.ac.uk/data/gb102-africa95

Africa 95 was founded in 1992 to initiate and organise a nationwide season of the arts of Africa to be held in the UK in the last quarter of 1995. The collection includes printed material, photographs, and slides of the work of artists from Algeria, Egypt, Ethiopia, Ghana, Ivory Coast, Kenya, Morocco, Nigeria, Senegal, South Africa, Sudan, Uganda, Tanzania, Tunisia, Zambia, Zimbabwe, and the USA.

From the Fire brigade…

Fire Brigades Union, 1919-1997. http://archiveshub.ac.uk/data/gb152-mss.346

The Fire Brigades Union (FBU) was founded in 1918 as the Firemen’s Trade Union. The union began its life as a body very much based around the London area but soon expanded to include provincial brigades. The collection includes: Executive Council minutes, annual accounts, subject files (including Sizewell Public Inquiry, 1980s) and the national strike, 1977.

…to the Water board

Records relating to Derwent Valley Water Board, 1899-1974.
http://archiveshub.ac.uk/data/gb159-dvw

The collection comprises a full series of indexed bound minute books (1899-1974) containing annual statements of accounts, and other specific reports. Also, maps and plans relate to specific elements of intended works such as the building of Ladybower Reservoir in Derbyshire.

From the Arts…

D.H. Lawrence (1885-1930) Collection, 1865-1999.
http://archiveshub.ac.uk/data/gb159-la

The Lawrence Collection contains extensive materials by and about D.H. Lawrence, ranging in date from his childhood and including original manuscripts and his correspondence.

… to Science

Clifford Hiley Mortimer Collection, 1937-1980.
http://archiveshub.ac.uk/data/gb986-morc

This collection contains data on rivers and lakes in Britain, and correspondence regarding flows, inflows, chemical analyses and chemical stratification. It also includes mud samples!

From War…

Image: Poppy, World War One
Image is in the public domain: papaver in High Wood, Tinelot Wittermans (tinelot@pobox.com).
Daniel Dougal First World War Diaries, 1914-1918.
http://archiveshub.ac.uk/data/gb133-ddd

Diaries of Daniel Dougal, which detail his service as an army doctor on the Western Front during the First World War. Dougal rose to become Deputy Assistant Director of Medical Services, 34th Division of the British Army, and his diaries provide important information on the operation of Army medical services.

… to Peace

Campaign for Nuclear Disarmament (CND), 1958-2008.
http://archiveshub.ac.uk/data/gb097-campaignfornucleardisarmament

The Campaign for Nuclear Disarmament (CND) is a non party-political British organisation advocating the abolition of nuclear weapons worldwide. Includes papers relating to the CND’s constitution, minutes of National Council, National Executive Committee annual conference papers and papers relating to Aldermaston marches and other demonstrations.

These are selected descriptions: there’s much more to discover by exploring the Hub! And we’re adding more descriptions every week. If you’d like to add your descriptions to the Hub, now’s a great time! See ‘Be part of something bigger’ for information on how we can help you expose your collections to a worldwide audience.

Also of interest:

Work in an archive and want to be involved in the Explore Your Archive campaign?

It’s not too late to take part. Visit: www.nationalarchives.gov.uk/yourtoolkit.

More on Collections

Image of Guardian staff
Guardian billing room staff, 1921. From the Guardian News and Media Archive. Copyright: Guardian.

Browse our Features pages to learn about the breadth of material described on the Hub: http://archiveshub.ac.uk/features/

Digital Humanities: Patterns, Pictures and Paradigms

The recent Digital Humanities @ University of Manchester conference presented research and pondered issues surrounding digital humanities. I attended the morning of the conference, interested to understand more about the discipline and how archivists might interact with digital humanists, and consider ways of opening up their materials that might facilitate this new kind of approach.

Visualisation within digital humanities  was presented in a keynote by Dr Massimo Riva, from Brown University. He talked about the importance of methodologies based on computation, whether the sources are analogue or digital, and how these techniques are becoming increasingly essential for humanities.  He asked whether a picture is worth one million words,  and presented some thought-provoking quotes relating to visualisation, such as a quote by John Berger: “The relation between what we see and what we know is never settled.” (John Berger, Ways of Seeing, 1972).

Riva talked about how visual projection is increasingly tied up with who we are and what we do. But is digital humanities translational or transformative? Are these tools useful for the pursuit of traditional scholarly goals, or do they herald a new paradigm?  Does digital humanities imply that scholars are making things as they research, not just generating texts?  Riva asked how we can combine close reading of individual artifacts and ‘distant reading’ of patterns across millions of artifacts. He posited that visualisation helps with issues of scale; making sense of huge amounts of data. It also helps cross boundaries of language and communication.

Riva talked about the fascinating Cave Writing at Brown University, a new kind of cognitive experience. It is a four-wall, immersive virtual reality device, a room of words. This led into his thoughts about data as a type of artifact and the nature of the archive.

“On the cusp of the twenty–first century…we speak of an ex–static archive, of an archive not assembled behind stone walls but suspended in a liquid element behind a luminous screen; the archive becomes a virtual repository of knowledge without visible limits, an archive in which the material now becomes immaterial.” This change “has altered in still unimaginable ways our relationship to the archive”. (Voss & Werner, 1999)

The Garibaldi panorama is 276 feet long and tells the story of Garibaldi, the Italian general and politician. It is fragile and cannot be directly consulted by scholars. So, the whole panorama was photographed in 91 digital images in 2007. The digital experience is clearly different to the physical experience. But the resulting digital panorama can be interacted with in many ways, and it is widely available via the website along with various tools to help researchers interpret the panorama. It is interesting to think about how much this is in itself a curated experience, and how much it is an experience that the user curates themselves. Maybe it is both. If it is curated, then it is not really the archivists who are curators, but those who have created the experience: those with the ability to create such technical digital environments. It is also possible for students to create their own resources, and then for those resources to become part of the experience, such as an interactive timeline based on the panorama. So, students can enhance the metadata as a form of digital scholarship.

Riva showed an example of a collaborative environment where students can take parts of the panorama that interest them and explore them, finding links and connections and studying parts of the panorama along with relevant texts. It is fascinating as an archivist to see examples like this where the original archive remains the basis of the scholarly endeavour. The artifact is at a distance to the actual experience, but the researcher can analyse it to a very detailed level. It raises the whole debate around the importance of studying the original archive. As tools and environments become more and more sophisticated, it is possible to argue that the added value of a digital experience is very substantial, and for many researchers, preferable to handling the original.

Riva talked about the learning curve with the software. Scholars struggled to understand its full potential and what they could do with it, and needed to invest time in this. But an important positive was that students could feed back to the programmers, in order to help them improve the environment.

We had short presentations on a diverse range of projects, all of which showed how digital humanities is helping to reveal history to us in many ways. Dr Guyda Armstrong made the point that library catalogues are more than they might seem – they are a part of cultural history. This is reflected in a bid for funding for a Digging into Data project, metaSCOPE, looking at bibliographical metadata as datamassive cultural history.  The questions the project hopes to answer are many: how are different cultures expressed in the data? How do library collections data reflect the epistemic values, national and disciplinary cultures and artifacts of production and dissemination expressed in their creation?  This project could help with mapping the history of publishing in space and time, as well as showing the history of one book over time.

We saw many examples of how visual work and digital humanities approaches can bring history to life and help with new understanding of many areas of research. I was interested to hear how the mapping of the Caribbean during the 18th century opened up the coastline to the slave traders, but the interior, which was not mapped in any detail, remained in many ways a free area, where the slave traders did not have control. The mapping had a direct influence on many people’s lives in very fundamental ways.

Another point that really stood out to me was the danger of numbers averaging out the human experience – a challenge with the digital humanities approach, as, at the same time, numbers can give great insights into history. Maybe this is a very good reason why those who create tools and those who use them benefit from a shared understanding.

“All archaeological excavation is destruction”, so what actually lives on is the record you create, says Dr Stuart Campbell. Traditional monographs synthesize all the data. They represent what is created through the process of excavation. It is a very conventional approach. But things are changing and digital archiving creates new ways of working in the virtual world of archaeological data. Dr Campbell made the point that interpretation is often privileged over the data itself in traditional methods, but new approaches open up the data, allowing more narratives to be created. The process of data creation becomes apparent, and the approach scales up to allow querying that breaks out beyond the boundaries of archaeological sites. For example, he talked about looking at patterns on ancient pottery and plotting where the pottery comes from. New sophisticated tools allow different dimensions to be brought into the research. Links can now be created that bring various social dimensions to archaeological discoveries, but what these connections really represent is less well understood or theorised.

Seemingly a contrast to many of the projects, a project to recreate the Gaskell house in Manchester is more about the physical experience. People will be able to take books down from the shelves, sit down and read them. But actually there is a digital approach here too, as the intention is to add value to the experience by enabling visitors to leaf through digital copies of Gaskell’s works and find out more about the process of writing and publishing by showing different versions of the same stories, handwritten, with annotations, and published. It is enhancing the physical experience with a tactile experience through digital means.

To end the morning we had a cautionary tale about the vulnerability of Websites. A very impressive site, allowing users to browse in detail through an Arabic manuscript, is to be taken down, presumably because of changes in personnel or priorities at the hosting institution. The sustainability of the digital approach is in itself a huge topic, whether it be the data or the dissemination approaches.

The Archives Hub, Swedish business, Welsh steel and British banks

The Hub’s Jane Stevenson and Bethan Ruddock, with Stacy Capner, Nicholas Webb, and the delegation from the Swedish Business Archives Association

On 24th October, the Archives Hub was delighted to host a meeting of our colleagues from Sweden here in Manchester. The visitors were archivists with a particular interest in business and industry, and so we were very happy that Nicholas Webb from Barclays’ Group Archive and Stacy Capner, Business Archives Development Officer for Wales, both agreed to come along and speak.

Jane Stevenson opened with a presentation on the UK archival landscape, a topic that sounded easy in theory but in practice is somewhat broad in scope! However, we tried to give our colleagues an overview of the professional bodies, standards, training and career opportunities, and the concerns and challenges that make up the UK archives scene.

Per-Ola Karlsson, Head of Archives at the Swedish Center for Business History, gave a talk on his work with the Centre for Business Archives. It was a shame that more colleagues from the business sector couldn’t join us because it was fascinating to hear about this approach to managing business archives. Per-Ola informed us that the Centre is the world’s largest private archive. The basic model is to hold business archives centrally; the Centre will take in any business archive, and its holdings include those of some of the leading businesses in Sweden, such as Ericsson, H&M and Unilever.

Per-Ola gave us some context to the formation of the Centre. Originally the assumption was that companies should take responsibility for their own archives, but this changed during the 1960s, when companies were ceasing to exist and the archives were under threat. It was interesting to hear that the Government waded in on the debate, pressing for a solution (but reluctant to stump up any funds!). Eventually regional business archives were established, and now the National Centre operates as a centre of expertise in business archives. Sweden has the most private business archives of any of the Nordic countries, and the contrast between the Swedish approach and Norwegian approach is marked, with Norway selecting companies’ archives, and Sweden encouraging all companies to deposit.

The pricing for the use of the Centre is by shelf metres. The depositor retains ownership and control, which in itself is a risk when the staff at the Centre invest so much time and effort in curating the collections. But they see their role as advocates and persuaders – they need to convince businesses that it makes good business sense to have an archive.  It means that requests for access can be vetted by the company, but many archives are fully open for researchers. Per-Ola talked about his role – in many ways serving the companies first, because the essence of the work is to attract archives; this is what will make the centre successful as a research centre.

It seemed to be a really positive thing to have this kind of model in so far as it promotes the importance of business archives and ensures there is a centre for advocating the vital importance of these archives for future research. The UK does great work through the Business Archives Council, but we wonder what business archivists would think of this kind of model for the UK? A central store for business archives, and a central pool of expertise. It means that in Sweden, archivists working within a business are much less common.

Stacy took us through the landscape of Wales, as told through its archives of industry. Coal, steel, iron, lager production, nuclear power – they are all quite localised, and tied in with local history in Wales. In the 1960s, with the decline of heavy industry, many archives ended up in local record offices, but collection was not systematic. There are no private business archives in Wales that are professionally managed.

Stacy pointed out that business archives are often more likely to be left uncatalogued – they are hard to deal with and understand, and more ‘attractive’ archives may take priority. Yet projects such as ‘Wales: Powering the World‘ show how business archives can be successfully used. One of the project’s outputs was a project by two Swansea University students encouraging others to use the archives (and especially business archives) to find research material.

We moved on to look at the archive at Barclays. Nick Webb gave us a thought-provoking talk that highlighted the role of an archive in a company that is struggling to regain its reputation. He gave very persuasive arguments around the vital role of an archive in providing transparency and, if not an objective view of history, at least a view that can be supported by documentary evidence. For instance, the archive shows Barclays’ true relationship to the slave trade, which is not as has often been portrayed. Whatever else the bank might be accused of, they had quite a strong Quaker history and campaigned against slavery. His lovely turn of phrase about archives being ‘a force against corporate amnesia’ really summed this up well. It was interesting to note how much the archive is used by employees – it really seemed that it has an important role to play and that this is properly recognised within the bank, especially since the team often put a monetary value on what they do! Nick has a great anecdote about a student who came into the archive to plough through archives about Barclays’ work in Libya. He declared that the archive was the best source on pre-Gaddafi Libyan history that he had come across. A great example of the surprises that are hidden within collections.

We ended with Bethan Ruddock and Jane Stevenson talking a bit about ‘the online archivist’ and expanding on some of the challenges archivists face in the digital age.

Altogether we had a great day. It was a great opportunity to hear about how another country approaches the challenges of business archives, and for us it was also a means to get a better understanding of the landscape of business archives within the UK.

 

Focus on: Lionel Robbins’ papers at LSE

Archives Hub Feature for October/November 2013

Photograph of Lionel Robbins (1929)
Lionel Robbins, 1929, LSE/UNREGISTERED/25/1/3

The economist and the wider world: the papers of Lionel Robbins (1898 – 1984) is a project which aims to provide access to the papers of Lionel Robbins at the London School of Economics and Political Science and promote them through a programme of cataloguing, digitisation and publicity.  The project has been generously supported by the LSE Annual Fund.  The cataloguing of the collection is now complete and the catalogue is accessible via the LSE Library archives catalogue.

Lionel Robbins was closely connected with LSE for over 60 years initially as a student, then as a professor and Chair of Economics, and also through his work for the Library Appeal and on the Court of Governors. The title of the project is ‘The economist and the wider world’ and Lionel Robbins’ papers contain all the economic-related material you might expect in the personal archive of such an important figure to the practice and theory of economics.  However ‘the wider world’ of the collection title hints at the wealth of other subjects that are also covered in this collection.

Poem by Robbins
‘The return from the war’, poem by Lionel Robbins, 1918, 1922, ROBBINS/2/4

Robbins’ passion for the arts is well represented throughout his correspondence with friends and family, as well as through his work as a Trustee of the National Gallery and the Royal Opera House. The collection also contains his own artistic endeavours as a young man in the form of poems and short stories. Some of these, such as the poems written on his return from the First World War, are particularly moving. There are some well-known names among the correspondents, such as Henry Moore and Kenneth Clark, and some infamous, such as Anthony Blunt.

Robbins' diary extract (1944)
Extract from Bretton Woods diary, 1944, ROBBINS/6/1/2

There are detailed diaries covering the period during and following the Second World War when Lionel Robbins was part of the Economic Section of the War Cabinet sent to the U.S.A. for the post-war economic negotiations. These diaries, including one from the Bretton Woods conference, give a personal account of some defining moments in post-war economic and political history. The diaries from the Hot Springs conference of 1943 and Bretton Woods in 1944 have been digitised. Complementing his professional reports on his war-time work in the U.S.A. are the letters he sent home to his wife Iris. He would write to her at least once a week, often once every few days, as well as writing to his children.

The period at LSE known as the Troubles, in the late 1960s, is well documented in the collection. This was a period of student unrest and protest at LSE following controversy over the appointment of a new Director.  As a member of the Court of Governors Lionel Robbins held copies of the minutes and papers of meetings that determined how the organisation would respond to student protests.  He also collected examples of the student protest publications and press reports on the situation.  The LSE Library Appeal which resulted in the successful purchase and renovation of the current LSE library premises was headed by Lionel Robbins.  The collection contains minutes and papers relating to this appeal alongside correspondence and examples of the successful marketing campaigns and strategies.

Robbins' speeches on HE 1963-1977
Speeches by Lionel Robbins on higher education, 1963 – 1977 ROBBINS/8/1/3

Throughout his life Lionel continued to write and publish books and articles on economics and the collection contains the finished products as well as drafts, proofs and correspondence with publishers.  His work as Chairman of the Financial Times is also documented.   Lecture notes, student references, correspondence with students and former students and economics department circulars provide a detailed account of his work teaching at LSE, which he continued on a part-time basis until 1981 – 1982.

Members of the Committee on Higher Education, photograph (1962)
Members of the Committee on Higher Education visiting Stanford University, 1962, ROBBINS/13/5

In 1960 Lionel Robbins was invited to head a Committee on Higher Education to review current full-time higher education provision in the UK and advise the Government on long-term development. The resulting report became known as the Robbins Report, which essentially aimed to show that higher education could benefit all and that access to it should be expanded to everyone. This month marks the 50th anniversary of the final submission of the Robbins Report. The official papers for the Report are held at the National Archives; however, the Lionel Robbins Papers contain correspondence about the Report, as well as subsequent speeches and articles written by Robbins on higher education. To celebrate the 50th anniversary of the Robbins Report, LSE has organised a public event, ‘Shaping Higher Education Fifty Years After Robbins: what views to the future?’, on Tuesday 22nd October.

Robbins' artillery notebook (1916)
Artillery notebook kept by Lionel Robbins, 1916, ROBBINS/2/3

The Lionel Robbins catalogue on the Archives Hub now makes available the variety of subjects covered in the Lionel Robbins papers, and opens the collection up to new researchers.

Kathryn Hannan, Project Archivist
‘The economist and the wider world: the papers of Lionel Robbins (1898 – 1984)’

Useful links

Lionel Robbins Papers on the LSE archives catalogue – http://archives.lse.ac.uk/Record.aspx?src=CalmView.Catalog&id=ROBBINS

From 22nd October 2013 an exhibition The economist and the wider world Lionel Robbins (1898 – 1984), will be available on the LSE Digital Library http://digital.library.lse.ac.uk/exhibitions/lionel-robbins-the-economist-and-the-wider-world

Lionel Robbins project blog posts http://lib-1.lse.ac.uk/archivesblog/?cat=270 on LSE archive blog Out of the Box

 


Long Live the Art School!

Archives Hub Feature for August/September 2013

In 2013 the Surrey History Centre celebrated the history of tertiary art education in Surrey, from the late nineteenth century to the 1970s, with an exhibition and series of events.

Guildford School of Art, undated [1970s]

Industry, Science and Art

Opening of the Epsom Technical Institute by Lord Rosebery

Our archives show that in the Technical Institutes and Art Schools, industry, science and art were combined from the start, in the 19th century. Practical skills were taught alongside theory, to train students for work in industry.

The Epsom Technical Institute 1896 Prospectus states it deals in Technical Instruction of ‘Science, Art, Technical, Manual, and Commercial Classes, and Lectures’ and is run partly by the Science and Art Department in South Kensington. The inclusion of commercial classes highlights how the teaching was meant to be used in work.

1925-1926 Epsom Prospectus

The combination of Science and Art can be seen clearly in the Drawing and Carpentry Classes where to attend the Carpentry Class ‘it is distinctly understood that pupils must attend the Drawing Class or they will not be accepted into this [Carpentry] Class’

From the 19th century to the 1930s, the records that we have in the archives show that Art and Technical Institute classes were firmly focused on industry and on how the courses could be used vocationally. As the years progress there is more of a mix of vocational and theoretical teaching: the more industrial classes (such as Building Construction) are phased out and replaced with classes that we associate with Art Schools today, including Graphic Design, Photography, and Fine Art.

Women in the Arts

Throughout the records of the Art Schools there is reference to the specific subject of ‘Women’s Crafts’, for example in the Epsom School timetable of 1938. There are also subjects such as ‘Cookery’, ‘Shorthand’, ‘Typewriting’ and ‘Dressmaking’ that, while not explicitly gender specific, attracted more female than male students.

Epsom and Ewell school of art time table 1938-39

Courses included in the Epsom School of Art and Technical Institute 1896 and 1897 prospectuses were: Shorthand, Drawing, Carpentry, Home Nursing, Cookery and French.

In the Epsom 1932 prospectuses, ‘the Cookery and Dressmaking classes are recommended to those interested in Domestic Subjects’, while ‘for boys and young men there are carefully arranged classes that should prove of great value. Their attention is also drawn to the instruction given in Interior Decoration, Architectural Design, Geometry and Perspective in the Art School’.

War Time Education

As with education across the country, art schools suffered during both world wars.

Guildford school of art Field and Farm (School of Printing)

No records survive in our Art School archives for the period 1900-1920, but the fact that the 1920-1921 Epsom prospectus seems to contain more classes seen as ‘feminine’ suggests that Art Schools suffered a loss of male students after the First World War.

Art Schools have always been associated with Technical Institutes, and industrial work; practical work and work associated with the war effort were a priority.

 

Art Schools and Activism

The Guildford School of Art students staged a protest during 1968 over the quality of art teaching, and the lack of control the students had over this. The protest took place against a background of protests at other art schools in the UK.

Guildford Student Protest 1968

A young Jack Straw was also involved.

In his autobiography, Last Man Standing: Memoirs of a Political Survivor (Chapter 3, ‘Respected but Not Respectable’; Macmillan, 2012), he mentions the following about his time at the NUS (p. 74):

My first six months at the NUS were uncomfortable. I was an intruder. I had stood up against the successful candidate, Trevor Fisk, and was now his deputy. I was given marginal responsibilities, like art colleges, in the hope I’d get bored and go away, but suddenly the art schools erupted. There were long occupations at colleges like Hornsey and Guildford colleges of art. I had something useful to do, and also developed firm friendships with some of those involved, like Kim Howells, later MP for Pontypridd and a fellow Foreign Office minister, and Kate Hoey, later MP for Vauxhall and minister for sport.

More information and images on these themes will be available at the exhibition

The catalogues relating to Surrey Art School education can be found here on Archives Hub

Epsom and Ewell Technical Institute and School of Art: http://archiveshub.ac.uk/data/gb3094-epew

Guildford School of Art Archive: http://archiveshub.ac.uk/data/gb3094-gcol

Farnham School of Art Archive: http://archiveshub.ac.uk/data/gb3094-fcol

Further material can be seen on our History Pin site http://www.historypin.com/channels/view/21466076#|photos/list/ and on our online image page http://community.ucreative.ac.uk/article/37669/Online-images-and-Exhibitions

Rebekah Taylor, University for the Creative Arts

 


Archives Hub and VIAF Name Matching

We have recently been reprocessing the Archives Hub data, transforming it into RDF based Linked Data, and as part of this we have been working on names matching. For Linked Data, creating links to external data sources is key – it is what defines Linked Data and gives the opportunities, potentially, for researchers to explore topics across data sources.

This names matching work has big implications for archives. I have already talked extensively in the Hub Blog about the importance of structured data, which is more effectively machine processable. For archival descriptions, we have a huge opportunity to link to all sorts of useful data sources, and one of the key means to link our data is through personal names. To do this effectively, we need names to be structured, and this is one of the reasons why the Hub practice of structuring names by separating out surname, forename, dates, titles and descriptive information (epithets) is so useful. We do this structuring even though EAD (the recognised XML standard for archives) doesn’t actually allow for it. We took the decision that the advantages would outweigh the disadvantages of a non-standard approach (and we can export the data without this additional markup, so really there is no disadvantage).

We have been working on the matching, using the freely available Open Refine data processing tool with the VIAF reconciliation service developed by Roderick Page. Freely available tools like this are so important for projects like ours, and we’re really grateful that we were able to take advantage of this service.

The matching has generally been very successful. Out of 5,076 names, just over 2,000 were linked from the Hub entry to the VIAF entry. That is roughly 40%, which is a pretty good percentage.

This post provides some perspectives on the nature of the data and the results of the matching work.

Full names and epithets

With a name like ‘Bell, Sir Charles, 1774-1842, knight surgeon’, (you can see his entry in our current Linked Data views at http://data.archiveshub.ac.uk/id/person/ncarules/bellsircharles1774-1842knightsurgeon) there is plenty of information – surname, forename, dates and an epithet to help uniquely identify the individual. However, with this name, a match was not found, despite an entry on VIAF: http://viaf.org/viaf/2619993 (which is why you may not yet see the VIAF link on our Linked Data view). Normally, this type of name would yield a match. The reason it didn’t is that the epithet came through in the data we used for matching.

Screenshot of names matching using Open Refine

This highlights an issue with the use of epithets within names. It is encouraged in the NCA Rules, and it does help to uniquely identify an individual, but it introduces an additional element in the string that makes it harder to match the data.

Where our process did not manage to get the family name, forename and dates to match with VIAF, we used the ‘label‘ information that we have in our Linked Data. This label information includes the epithet. For example: Nosek, Václav, 1892-1955, Czechoslovak politician. This doesn’t tend to find a match, because of the epithet. With examples like this we can manually check, and in this case there is a VIAF match (http://viaf.org/viaf/23683886). But manual checking is problematic where you have thousands of names.

In 95% of cases we did manage to omit the epithet. But sometimes the epithet was included, either because we used the label, as stated, or because the markup on the Archives Hub is not always consistent: sometimes the structured names I referred to above are not present in Hub data because the data has come from other systems. (We might have found a way to remove these stray epithets, but it would have taken a good deal more time and effort to achieve.)

Bringing together information on an individual

The reference to Sir Charles Bell came from a collection of “Papers of Sir Charles Bell” (http://archiveshub.ac.uk/data/gb96-ms386). In this description his occupation is “surgeon”. In the VIAF description (http://viaf.org/viaf/2619993) he is described as “Scottish painter, draftsman, and engraver”. Ostensibly this doesn’t look like the same person, but looking down the VIAF description, you can see titles such as “The nervous system of the human body” and other works that are clearly written by a scientist. The linking of our description with the VIAF description brings together Sir Charles Bell scientist and Sir Charles Bell painter, a good illustration of how linking provides a better perspective, as the different data sources effectively become joined up.

Pulling sparse sources together

For Francis Campbell Ross Douglas VIAF only has the surname and forename (http://viaf.org/viaf/211588539/), although if you look at the source records you also find “Douglas Of Barloch” to help with identification. This is an example where the Hub record has much more information (http://archiveshub.ac.uk/data/gb097-douglasofbarloch), and therefore creating the link is particularly useful. It shows how archives can help contribute to our knowledge of individuals within the Linked Data space, as they often have little known information, gleaned from the archives themselves.

Hyphenated names

From the Hub description http://archiveshub.ac.uk/data/gb1538-s97 comes the name William Blair-Bell. The name with encoding (slightly simplified) is:

<persname>
<surname>Bell</surname>,
<forename>William Blair-</forename>
(<dates>1871-1936</dates>)
<epithet>British gynaecologist and obstetrician</epithet>
</persname>

This is an example of the application of the NCA Rules, which insist on the last entry element as the main element, so it means the element ‘Bell’ is marked up as the surname. In fact, the matching still works because, with all the elements there, the reconciliation service can still find the right person (http://viaf.org/viaf/14336292/). However, it still concerns me that within the archive sector we have a rule that separates out the surname in this way, as it makes the name non-standard compared to other data sources. It is interesting to note that the name is generally given as Blair-Bell, but the Library of Congress enters the name as Bell, W. Blair (William Blair), 1871-1936 (http://id.loc.gov/authorities/names/no92003069.html), so there is an inconsistency in how different services deal with hyphenated and compound surnames. It could be argued that once we have a match, the different formats matter less, as they are simply alternatives that can be used to identify the individual.
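To illustrate how structured markup helps, here is a minimal Python sketch (not the code we actually ran) that reads a persname like the one above and builds a string for matching from the surname, forename and dates only, leaving the epithet out. The function name and the choice of a comma-separated output are assumptions made for the example.

```python
import xml.etree.ElementTree as ET

# The (slightly simplified) Hub markup quoted in this post.
PERSNAME = """<persname>
<surname>Bell</surname>,
<forename>William Blair-</forename>
(<dates>1871-1936</dates>)
<epithet>British gynaecologist and obstetrician</epithet>
</persname>"""


def match_string(persname_xml: str) -> str:
    """Build 'surname, forename, dates' from a structured persname.

    Hypothetical helper: the epithet is deliberately ignored because
    it makes the string harder to reconcile.
    """
    root = ET.fromstring(persname_xml)
    parts = []
    for tag in ("surname", "forename", "dates"):
        element = root.find(tag)
        if element is not None and element.text and element.text.strip():
            parts.append(element.text.strip())
    return ", ".join(parts)


print(match_string(PERSNAME))
```

Note that the output still follows the NCA Rules ordering, so the surname comes out as ‘Bell’ rather than ‘Blair-Bell’; the sketch does nothing to resolve that.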

Hub names without structured markup

As stated, in the Hub names are marked up by surname, forename, dates, epithet, titles. However, there are still some entries that are not marked up like this, usually because they were created in proprietary software and exported. An example is Carlyon Bellairs (referenced in http://archiveshub.ac.uk/data/gb097-assoc17). The name is marked up as:

<persname>Bellairs, Carlyon, 1871-1955, RN Commander, politician</persname>

You can see the XML mark up at http://archiveshub.ac.uk/data/gb097-assoc17.xml?hub. We have been working on a script to markup the component parts of these names in the Hub, and we have been able to implement it successfully for several institutions. But it is not easy to do this with non-standard names (i.e. not in the surname, forename, dates, epithet format). We do have some instances of names such as the British Prime Minister, James Callaghan, or the author Rudyard Kipling, that are not yet marked up in this way. These individuals should be easy to match, but without the structure within the index term, it is harder for us to ensure that we can get just the name and dates from an unstructured name to match with VIAF.

It is also impossible to implement structured markup on a name where there is a compound surname entered according to NCA Rules – we simply cannot mark these names up correctly because we have no way of knowing whether part of the forename is actually part of the surname. For example, if we have the name “George, David Lloyd” we can’t write a script that can transform this into “Lloyd George, David” because most of the time a name like this will be two forenames and one surname.
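As a rough illustration of the kind of script involved (a sketch under simple assumptions, not our production script), the Python below splits an unstructured index term into its parts by looking for a ‘YYYY-YYYY’ life-dates element. Everything before the dates is treated as surname then forename, and everything after as the epithet. The function name and the dates pattern are assumptions for the example, and names without life dates are simply returned as None for manual checking.

```python
import re
from typing import Optional

# Assumed pattern: only full 'birth-death' year ranges are recognised.
LIFE_DATES = re.compile(r"^\d{4}-\d{4}$")


def split_name(raw: str) -> Optional[dict]:
    """Split 'surname, forename, dates, epithet' into labelled parts.

    Returns None when no life dates are found, or when there is no
    forename before them, since the structure cannot then be trusted.
    """
    parts = [p.strip() for p in raw.split(",")]
    date_idx = next(
        (i for i, p in enumerate(parts) if LIFE_DATES.match(p)), None
    )
    if date_idx is None or date_idx < 2:
        return None
    return {
        "surname": parts[0],
        "forename": " ".join(parts[1:date_idx]),
        "dates": parts[date_idx],
        "epithet": ", ".join(parts[date_idx + 1:]),
    }


print(split_name("Bellairs, Carlyon, 1871-1955, RN Commander, politician"))
print(split_name("George, David Lloyd"))
```

The second call shows the limitation described above: the script has no way of knowing that ‘Lloyd’ belongs with the surname, so it can only decline to structure the name.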

The importance of life dates and the use of ‘Is Like’

If we don’t have life dates, it makes matching with certainty almost impossible. Of course, cataloguers can’t always find life dates for a person, but it is worth stressing that the need for life dates has become even more important in recent years, now we have the potential to process data in so many ways. An example is at http://archiveshub.ac.uk/data/gb532-bel – Joyce Margaret Bellamy, a Senior Research Officer at the University of Hull. As we don’t have a birth date, we did not get a match with her VIAF entry at http://viaf.org/viaf/94773174. If we have this kind of entry, without life dates, we could potentially decide to use a different status from an exact match (which usually uses the owl:sameAs property), and for example, we could use the ‘isLike‘ property from the Umbel vocabulary instead. This would be useful where we believe the two names to be referring to the same person, but this type of matching has to be done manually (although potentially we could run something where a name match without a date match was always an ‘isLike’). In the process of checking the 2,000 matches for our data we did enter a number of matches manually, and the whole process of checking took around 5 hours. Not too bad for 2,000 names, and with some time also given to thinking about the results (and making notes for this post!). But if we were to work on the entire Archives Hub data, we couldn’t undertake to do this kind of manual work unless we just had a few thousand ‘not sure’ names that we might be prepared to work through.
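The rule suggested in brackets above could be sketched along these lines in Python. This is only an illustration of the idea: the function name is hypothetical, the Umbel property URI is given as we understand it and should be checked against the vocabulary, and treating conflicting dates as ‘no link’ is an assumption of the sketch.

```python
from typing import Optional

OWL_SAME_AS = "http://www.w3.org/2002/07/owl#sameAs"
# Assumed URI for the Umbel 'isLike' property.
UMBEL_IS_LIKE = "http://umbel.org/umbel#isLike"


def link_property(
    names_match: bool,
    hub_dates: Optional[str],
    viaf_dates: Optional[str],
) -> Optional[str]:
    """Choose which property (if any) to use for a candidate match."""
    if not names_match:
        return None
    if hub_dates and viaf_dates:
        # Both sides have dates: only an exact agreement is a sameAs.
        return OWL_SAME_AS if hub_dates == viaf_dates else None
    # Name matches but dates are missing on one side: weaker assertion.
    return UMBEL_IS_LIKE


print(link_property(True, "1871-1936", "1871-1936"))
print(link_property(True, None, "1811-1888"))
```

A name match without a date match would still need a human to look at it before anything stronger than ‘isLike’ was asserted.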

Matches without life dates

We do get matches to VIAF where we don’t have dates. We got a match for ‘Hilda Chamberlain’ with VIAF entry http://viaf.org/viaf/286538995/. This seems to be correct, as she is the daughter of Joseph Chamberlain, so we kept the match. But we had to check it manually. Another example is Hercules Ross – http://viaf.org/viaf/21209582/ – matched to the name in description http://archiveshub.ac.uk/data/gb254-ms17. But in this case we don’t really have enough evidence to identify the individual, even though the surname and forename match. The source of the name on VIAF is “Guild, J. Proceedings before the sheriff depute of Forfarshire … against Hercules Ross and David Scott, Esquires, 1809”, but the title deeds described in the Archives Hub cover the sixteenth to the nineteenth century!

With a name like Gustav Wilhelm Wolff (http://archiveshub.ac.uk/data/gb738-ms174), again we only have the name and not the life dates. The match given is for someone born in 1811 (http://viaf.org/viaf/8221966/), and the papers relate to Victorian Jews in Britain. This makes the match likely, but we can’t be sure without dates, so we could potentially enter an ‘is like’, to imply that they are the same person, but that we cannot be certain.

Floruit!

We had a number of individuals without known life dates where the cataloguer used a ‘floruit’, e.g. Sharman W. fl 1884 (Secretary of National Association for the Repeal of the Blasphemy Laws). This sort of entry, whilst it may be the total of the information the archivist has, is difficult to use to identify someone in order to match them. However, the majority of individuals with this kind of entry are not likely to be on VIAF simply because a floruit normally indicates someone for whom life dates cannot be found. It would be interesting to consider a tool that matches floruit dates to possible life dates (e.g. fl 1900-1910 would match to life dates of 1880-1945) but I’m not sure how much it would add to the accuracy of a match.
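Such a tool does not exist in our workflow, but a minimal Python sketch of the idea might look like the following. It treats a set of life dates as compatible with a floruit if the person was old enough to be active when the floruit starts and still alive when it ends. The minimum age of 16 is purely an assumption for the example, as are the function names and the ‘fl YYYY’ or ‘fl YYYY-YYYY’ pattern.

```python
import re
from typing import Optional, Tuple

# Assumed floruit formats: 'fl 1884', 'fl. 1884', 'fl 1900-1910'.
FLORUIT = re.compile(r"\bfl\.?\s*(\d{4})(?:\s*-\s*(\d{4}))?")


def parse_floruit(text: str) -> Optional[Tuple[int, int]]:
    """Return (start, end) years of a floruit found in text, if any."""
    found = FLORUIT.search(text)
    if found is None:
        return None
    start = int(found.group(1))
    end = int(found.group(2)) if found.group(2) else start
    return start, end


def floruit_compatible(
    floruit: Tuple[int, int], birth: int, death: int, min_age: int = 16
) -> bool:
    """True if life dates could belong to someone active in the floruit."""
    start, end = floruit
    return birth + min_age <= start and end <= death


print(parse_floruit("Sharman W. fl 1884"))
print(floruit_compatible((1900, 1910), birth=1880, death=1945))
```

A check like this can only rule candidates out; a compatible result is still far from proof that two records describe the same person.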

Alternative names

The reconciliation service often works where VIAF provides names that are not ‘the same’ as our name. So, for example, the Hub data may have the name ‘Orton, John Kingsley, 1933-1967’. This was linked to Joe Orton (http://viaf.org/viaf/22163951), and within the VIAF data you can see that Joe Orton is also known as John Kingsley Orton.

Fame does not always give identity

Sometimes very famous people prove problematic, and an example is someone like Queen Victoria, because the name doesn’t include a surname and people tend to enter it in various ways. There were a few examples of this type of thing in our data, although most royal names matched with no problem. It always helps if it is easier to structure a name, but kings, queens, popes, etc. are non-standard.

Some Hub names are quite fulsome, such as “Edward Albert Christian George Andrew Patrick David, 1894-1972, Duke of Windsor, formerly Edward VIII, King of Great Britain and Ireland”. This should link to VIAF http://viaf.org/viaf/47553571 (Windsor, Edward, Duke of, 1894-1972), but the match was not given due to the lack of similarity.

Accented characters may cause problems

We didn’t get a match on Jeremy Bentham, despite having the full structured name, but this may be because the VIAF match has an accent: http://viaf.org/viaf/59078842/. We could possibly have stripped out accents in our data, but in this case the accent was in the VIAF data.  I only found one example where this was a problem, but clearly many names do contain accented characters.
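One way to guard against this, sketched below in Python, would be to strip accents from both our string and the VIAF string before comparing them. This is not something we did for this batch; the function name is made up for the example, and the approach (Unicode decomposition followed by dropping combining marks) only handles characters that decompose into a base letter plus an accent.

```python
import unicodedata


def strip_accents(text: str) -> str:
    """Remove combining accents so 'Václav' compares equal to 'Vaclav'."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


hub_name = "Nosek, Vaclav, 1892-1955"
viaf_name = "Nosek, Václav, 1892-1955"
print(strip_accents(hub_name) == strip_accents(viaf_name))
```

The original, accented form should of course be kept for display; the stripped form is only for comparison.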

Matches sometimes surprise…

A particularly nice match came up for “Mary-Teresa Craigie Pearl 1867-1906 novelist, dramatist and journalist as John Oliver Hobbes nee Richards”. A complex string, but the algorithm matched the basic elements that we provided (Craigie Pearl, Mary-Teresa, 1867-1906) to the name ‘John Oliver Hobbes’ on VIAF.

Mismatches

Leonard Wright, a Lieutenant (http://archiveshub.ac.uk/data/gb99-kclmawrightlw), matched to Clara Colby (http://viaf.org/viaf/63445035/), also known as Mrs Leonard Wright Colby. Here is an example of an incorrect match due to the same name, but in VIAF the person is a ‘Mrs’ (due to the old-fashioned practice of using the husband’s name). The reason for the match seems to be that the name on the Hub includes a floruit (Leonard Wright, fl 1916) which matches the death date of Mrs Leonard Wright (Leonard Wright, Mrs, d 1916).

On the Hub we have an example of an archive that includes “a letter from Charlotte Bronte to Elizabeth Firth”, and the name is simply given as Elizabeth Firth in the index. The match to VIAF was for Mrs J.F.B Firth (http://viaf.org/viaf/71217693/). In this case the match is wrong, as we can see from the Hub description that Elizabeth Firth is actually “Mrs. James Clarke Franks”, and the dates within the additional information don’t seem to match.

There were very few examples of this type of mismatch, but it shows why well structured data, with life dates, helps to minimize any incorrect matches.

Incorrect Suggestions

In the names that did not find definite matches (i.e. outside of the 2,000 matches), there were a few examples of suggested names that did not bear much resemblance to the text provided. One example of this was for “Bell, Vanessa, 1879-1961”. The suggestions for ‘sameAs’ names to link to this individual were Stephen, Julia Prinsep British model, 1846-1895; Woolf, Virginia, 1882-1941; Stephen, Leslie, 1832-1904. In fact, VIAF does have Vanessa Bell (http://viaf.org/viaf/7399364), and the link appears to be that the names are related within VIAF (i.e. VIAF establishes that there is an association between these people). However, these were only suggestions; they were not given as matches.

Conclusions

If there was no match given, but we can see that the name and dates have gone to VIAF, then we would assume there simply is no match and VIAF does not have anyone with our surname, forename and dates. But if we can see that an epithet has also been included in the data we provided, then VIAF may well have a match that was missed, because an epithet can prevent the matching from succeeding. Our intention would be to continue to improve our filtering to try to remove all epithets, but if the names are not properly structured this can be difficult.
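To illustrate the kind of filtering described here, the following is a minimal sketch in Python, not the code the project actually used. It assumes index names arrive as comma-separated strings in the order surname, forename, dates, epithet; the function names `parse_index_name` and `query_string` and the sample epithet 'painter' are hypothetical. Where the string is not structured in this way, the sketch deliberately strips nothing, which reflects the difficulty noted above.

```python
import re

# Assumed date forms: "1879-1961", "fl 1916", "d 1916", "b. 1867".
DATES = re.compile(r"^(?:(?:fl|b|d)\.?\s*)?\d{4}(?:\s*-\s*\d{4})?$")


def parse_index_name(raw):
    """Split 'Surname, Forename, dates, epithet' into its parts."""
    parts = [p.strip() for p in raw.split(",") if p.strip()]
    result = {"surname": None, "forename": None, "dates": None, "epithet": None}
    if len(parts) < 2:
        # Unstructured string: there is nothing we can safely strip out.
        result["surname"] = raw.strip()
        return result
    result["surname"], result["forename"] = parts[0], parts[1]
    leftovers = []
    for part in parts[2:]:
        if result["dates"] is None and DATES.match(part):
            result["dates"] = part
        else:
            # Anything that is not a date is treated as an epithet.
            leftovers.append(part)
    if leftovers:
        result["epithet"] = ", ".join(leftovers)
    return result


def query_string(parsed):
    """Build the text to send for matching, leaving the epithet out."""
    fields = [parsed["surname"], parsed["forename"], parsed["dates"]]
    return ", ".join(f for f in fields if f)


if __name__ == "__main__":
    parsed = parse_index_name("Bell, Vanessa, 1879-1961, painter")
    print(query_string(parsed))
```

Keeping the epithet in a separate field rather than discarding it means it can still be displayed, or checked by a person, when a match looks doubtful.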

When actually checking data like this, one thing that really comes to the fore is the risk of a ‘sameAs’ where the individual is not the same, and this is a particular risk where you are dealing with a notorious character – maybe a criminal. A number of war criminals are referred to in the Hub data, and it would be very unwise to link these to the wrong person. This is why it is best to only provide matches where the life dates match, although of course it is not impossible for two people to have the same name and the same life dates.

In conclusion I would say that wherever our names have life dates, and these can be successfully carried over to the matching process, the likelihood of a correct match is 99%, but there is always a risk of a mismatch. Clearly the main problem would lie with two people sharing a name and life dates, and the chances of this happening will increase if we only have a birth or death date.
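The date rule suggested by these conclusions can be sketched as follows. This is only an illustration under stated assumptions, not the matching algorithm that was actually run: the functions `parse_dates` and `classify`, the three outcome labels, and the date formats handled are all hypothetical. The sketch accepts a match only when both birth and death years agree, downgrades a single agreeing date to 'possible', and never treats a floruit as a life date, which would have avoided the Leonard Wright mismatch described earlier.

```python
import re
from typing import NamedTuple, Optional


class Dates(NamedTuple):
    birth: Optional[int]
    death: Optional[int]
    floruit: Optional[int]


def parse_dates(text):
    """Parse '1867-1906', 'b 1867', 'd 1916' or 'fl 1916' (assumed formats)."""
    text = text.strip()
    m = re.fullmatch(r"(\d{4})\s*-\s*(\d{4})", text)
    if m:
        return Dates(int(m.group(1)), int(m.group(2)), None)
    m = re.fullmatch(r"(fl|b|d)\.?\s*(\d{4})", text)
    if m:
        kind, year = m.group(1), int(m.group(2))
        return Dates(
            year if kind == "b" else None,
            year if kind == "d" else None,
            year if kind == "fl" else None,
        )
    return Dates(None, None, None)


def classify(ours, theirs):
    """Return 'match', 'possible' or 'no match' for two date strings."""
    a, b = parse_dates(ours), parse_dates(theirs)
    pairs = [(a.birth, b.birth), (a.death, b.death)]
    # Only compare dates that both sides actually supply; floruits are ignored.
    known = [(x, y) for x, y in pairs if x is not None and y is not None]
    if not known or any(x != y for x, y in known):
        return "no match"
    return "match" if len(known) == 2 else "possible"


if __name__ == "__main__":
    print(classify("1867-1906", "1867-1906"))
    print(classify("fl 1916", "d 1916"))
    print(classify("d 1916", "d 1916"))
```

A 'possible' result would still need checking by a person, since a shared name and a single shared date is exactly the case where the risk of a wrong ‘sameAs’ is highest.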

Jisc Linking Lives project at Mimas: Jane Stevenson, Adrian Stevenson, Lee Baylis

Sentimental Journey: a focus on travel in the archives

Archives Hub Feature for August 2013

Steel engraving of Capri from 1875 named Picturesque Europe
© Image is in the public domain

The season of summer often brings hopes and plans for holidays and this month we’re looking at the wider theme of travel.

The hundreds of collections relating to travel featured in the Archives Hub shed light on multiple aspects of travel, from royalty to the working classes, and encompassing touring, business, exploration and research, the work of missionaries and nomadic cultures.

“The world is a book and those who do not travel read only one page” – St. Augustine.

Travel diaries

There are a number of travel diaries recording impressions of, and experiences in, the UK, Europe and beyond from a bygone era. ‘Grand tours’, leisurely and often luxurious, were the domain of the more privileged classes, where sometimes business and pleasure were combined. In more recent times, the pursuit of knowledge, education and ideas has motivated similar educational journeys.

Collections:

Thomas Moody, journal of a tour through Switzerland and Italy, 1822.
http://archiveshub.ac.uk/data/gb227-msd919.m7e22

Beatrice Webb, A summer holiday in Scotland, 1884.
http://archiveshub.ac.uk/data/gb227-msda865.w4

Harriet Susan Miller: Continental Tour Journal, c. 1856.
https://archiveshub.jisc.ac.uk/data/gb12-ms.add.6230

Watercolour paintings and photographs of Canada by an unidentified artist, 1884.
The paintings and photographs are held within a large album, providing a record of a journey by unidentified travellers to Canada from Liverpool in 1884. http://archiveshub.ac.uk/data/gb159-ms57

Extracts from the journal of William George Meredith during a trip to Spain and the East in the years 1830-1831.
Accompanied by Benjamin Disraeli, together with associated correspondence.
https://archiveshub.jisc.ac.uk/data/gb206-brothertoncollectionms19cmeredith(1)

Diary of travels through Italy and France, compiled by Sir William Trumbull, 1664-1665.
http://archiveshub.ac.uk/data/gb206-brothertoncollectionmstrvd1

Nassau William Senior Papers, 1830-1864.
Copies of journals kept by Nassau William Senior recording his visits to France, Germany, Austria, Italy, Ireland, Greece, Algeria and Egypt between 1850 and 1862. http://archiveshub.ac.uk/data/gb222-bmssnws

Papers of Sir Leonard David Gammans and Lady Ann Muriel Gammans, née Paul, 1916-1971.
Diaries, notebooks, etc. of Leonard David Gammans, 1916-1956; diaries, etc. of Ann Muriel Gammans, 1918-1970; tourist brochures and other printed material concerning South Africa, [1965-1971]. http://archiveshub.ac.uk/data/gb161-mss.brit.emp.s.506

J.R.T. Pollard Papers, 1930-1999.
The collection consists of diaries and papers of J.R.T. Pollard. The diaries include details of the author’s extensive travel, particularly in Europe, and observations regarding his years of army service in Africa (1941-1945). http://archiveshub.ac.uk/data/gb222-bmssjpol

Manuscript Itinerary of Henry III of England.
Not quite a diary, but of special note, is the late 19th Century Manuscript itinerary showing the geographical whereabouts of Henry III, where known, for all dates from 1216 to 1272. http://archiveshub.ac.uk/data/gb133-engms123

Business and work-related travel

Collections:

Records of the United Commercial Travellers’ Association (Nottingham Branch), 1908-1975.
The collection comprises accounts from 1932-1967, Committee minutes from 1908-1967 and registers from 1920-1975.
http://archiveshub.ac.uk/data/gb159-ct

Papers of James Craig Henderson, fl. 1941-1950, commercial traveller.
Commercial traveller in the Middle East.
https://archiveshub.jisc.ac.uk/data/gb248-ugd305

Papers of John Hunter, fl 1865-1912, carpenter’s mate, Royal Navy.
http://archiveshub.ac.uk/data/gb248-ugc076

John William Ramsay, 13th Earl of Dalhousie: Naval Notebook, HMS Galatea, 1869-1871.
http://archiveshub.ac.uk/data/gb12-ms.add.9279

Papers of John Wylie, merchant, Glasgow, Scotland, 1809-1840.
http://archiveshub.ac.uk/data/gb248-ugd028

Household book of James Sharp, Archbishop of St Andrews, 1663-1666.
Household account book of James Sharp, archbishop of St Andrews, kept by his secretary George Martin of Claremont, including details of journeys to Edinburgh and London.
http://archiveshub.ac.uk/data/gb227-msbx5395.s4m2

Exploration and research

Photograph of Icebergs, Greenland Sea by Frank Illingworth.
Photograph of Greenland Sea by Frank Illingworth. Copyright © Scott Polar Research Institute, University of Cambridge.

Contrasting with travel for pure pleasure was travel for the purpose of exploration, discovery and research.

Collections:

William Gibb: Journals of Voyages in the Carnatic and the Yangtze River, 1838-1844.
http://archiveshub.ac.uk/data/gb12-ms.add.9377

Johan Hjort collection, 1912.
The collection comprises correspondence from Hjort to polar explorer William Speirs Bruce (leader of the Scottish National Antarctic Expedition, 1902-1904).
http://archiveshub.ac.uk/data/gb15-johanhjort

Michael William Leonard Tutton: Natural History Diary, 1930-1932.
Natural history diary kept while Tutton was a King’s Scholar at Eton, which was awarded the Natural History Prize, 1930-1931. The diary contains notes on occurrences of insects, especially butterflies and moths, and occasionally birds and mammals.
http://archiveshub.ac.uk/data/gb12-ms.add.8769

Henry Seebohm: Ornithological Notebook.
Unfinished notes of visits to Glossop, Worksop, Ashopton and other places in Derbyshire; to the Farne Islands and Coquet Islands, Northumberland; to Flamborough Head, Yorkshire; and to Asia Minor (Constantinople and Smyrna) in 1872. The notebook also includes some watercolour sketches.
http://archiveshub.ac.uk/data/gb12-ms.add.8794

Missionaries

Collections:

Memoirs of Elizabeth Thomson, 1847-1918.
Teacher, missionary, traveller and suffragette, c1914.
http://archiveshub.ac.uk/data/gb248-ugc053

Diary of the Rev. David Cargill, 1 May 1842 – 29 Mar 1843.
Diary kept on his second missionary journey to Tonga.
http://archiveshub.ac.uk/data/gb231-ms0911

Papers of George Murray Davidson Short, 1890-1978.
Arts graduate and missionary, Glasgow, Scotland 1927.
http://archiveshub.ac.uk/data/gb248-ugc049

Alexander Gillon Macalpine.
Malawi missionary papers and linguistic studies, 1893-1964.
http://archiveshub.ac.uk/data/gb237-coll-48

Records of the Calabar Mission, 1849-1969.
http://archiveshub.ac.uk/data/gb237-coll-212

St Joseph’s Missionary Society (Mill Hill Missionaries), 1865- .
http://archiveshub.ac.uk/data/gb2254-stjosephsmissionarysociety

Romanies and Gypsies

Romany Vardo of the English Gypsies
© Image is in the public domain

Collections:

The Gypsy Collections, c.1860-1998.
The collection consists of two separately-catalogued but interlinked parts, the Gypsy Lore Society Archive (GLS) and the Scott Macfie Gypsy Collection (SMGC).
https://archiveshub.jisc.ac.uk/data/gb141-gls%26gb141smgc

Manuscripts relating to gypsies and other travellers collected by Sir Angus Fraser, 1752-1976.
http://archiveshub.ac.uk/data/gb206-brothertoncollectionmsrom-fraser2

Georg Althaus Photographs (including Hanns Weltzel Papers and Photographs).
1907 – 1960s.
http://archiveshub.ac.uk/data/gb141-glsadd.ga

Letters of Jeanie Robertson, 1954-1956.
The Scottish traditional folk singer Jeanie Robertson is regarded as a seminal figure in the music culture of Scotland’s travelling people. The collection includes letters from Robertson to the poet Hamish Henderson (1919-2002).
http://archiveshub.ac.uk/data/gb237-coll-725

Miscellaneous and related information

The Records of the Travellers’ Aid Society, 1885-1939.
The Travellers’ Aid Society was initiated in 1885 by the Young Women’s Christian Association to aid female passengers arriving at ports and railway stations, where they were met by accredited station workers who reported to the Travellers’ Aid Society Committee.
http://archiveshub.ac.uk/data/gb106-4/tas

Cold Comfort, The Franklin expeditions (previous feature).
http://archiveshub.ac.uk/features/jul04.shtml

Charles Darwin and the Beagle Collections in the University of Cambridge: a Voyage Round the World (previous feature).
http://archiveshub.ac.uk/features/darwin.shtml

Romanies and Gypsiologists (previous feature).
http://archiveshub.ac.uk/features/jun06.shtml

200 years of railways (previous feature).
http://archiveshub.ac.uk/features/railways.shtml

Sea-Fever: Britain’s maritime heritage (previous feature).
http://archiveshub.ac.uk/features/apr05.shtml

Also of interest

Perthshire Cant: Secret language of Scottish travellers, BBC History:
http://www.bbc.co.uk/history/0/22874080

20 Gorgeous Posters From a Time When Travel Was Glamorous blog post:
http://gizmodo.com/20-gorgeous-posters-from-a-time-when-travel-was-glamoro-758243140


Facing the Music: are researchers and information professionals dancing to different tunes?

Still of presentation at ELAG 2013
What are the chief weapons we need to use to improve the user experience?

At ELAG 2013 I gave a presentation with a colleague from the University of Amsterdam, Lukas Koster. We wanted to do something entertaining, but with a worthwhile message that we both feel strongly about. We believe that more needs to be done to integrate resources and provide them to researchers in a way that suits end-user needs. We urged our colleagues to ‘mind the gap’ between the perspective of the information professional – their jargon and their complicated systems, which often fail to link resources adequately – and that of the researcher, who wants an integrated approach and language that is not a barrier to use, and who expects the power of the Web to be used within a library context, just as they might when looking for music online.

Still of a presentation where a librarian is explaining the library system to a researcher
A researcher tries to make sense of the library systems

Our presentation included two sketches: one in a music shop, where a punter (the ‘seeker’) expects the shop owner (the ‘pusher’) to know who else bought this music and what they thought of it; and one in a library, where the seeker wants an overview of everything available, and wants to look at research data and other resources without struggling with different catalogue systems and terminology.

In our presentation we referred to the ‘seeker’ wanting a discipline-focussed approach (not format based), and access regardless of location. I highlighted one of the problems with searching by showing examples of search terms used on the Archives Hub where the researchers were confused by the results. The terms researchers use don’t always fit with our approach of using controlled vocabularies. We talked about the importance of connections between information. Our profession is making headway here, but there is a long way to go before researchers can really pull things together across different systems.

I spoke about the danger of making assumptions about our users and showed some examples of the Archives Hub survey results. Researchers don’t always come to our websites knowing what they are or what they want; they don’t necessarily have the same understanding of ‘archives’ as we do. Lukas expanded more on our musical theme. We can learn from some of the initiatives in this area – such as the ability people have to explore the musical world in so many different ways through things like MusicBrainz. Lukas also showed examples of researcher interfaces, looking to pull things together for the end user. Isn’t the idea of giving the researcher the ability to manage all of their research in this way something libraries should be spearheading?

Image of a woman at a desk surrounded by books
A librarian contemplates the end of the index card…

We concluded that the vision of integrated, interconnected data is not easy to achieve. As information professionals we may have to move out of our comfort zones, but we don’t have any choice unless we want to be sidelined. This means that we need to change our mindsets (we talked about a ‘librarian lobe’!). We also need to think about whether it is us that needs to learn information literacy, because we need to learn to think more like the end user!

Still of a scene in which the librarian cuts up a book for the researcher
The librarian has a frustrating time with a researcher who only wants one chapter!

See the slides on Slideshare.

The presentation is on YouTube, but be warned there are scenes of book cutting that may be upsetting to some!

 

ICT and the Student Experience

A HEFCE study from 2010 states that “96% of students use the internet as a source of information” (1). This makes me wonder about the 4% that don’t; it’s not an insignificant number. The same study found that “69% of students use the internet daily as part of their studies”, so 31% don’t use it on a daily basis (which I take to mean ‘very frequently’).

There have been many reports on the subject of technology and its impact on learning, teaching and education. This HEFCE/NUS study is useful because it concentrates on surveying students rather than teachers or information professionals. One of the key findings is that it is important to think about the “effective use of technology” and “not just technology for technology’s sake”. Many students still find conventional methods of teaching superior (a finding that has come up in other studies), and students prefer a choice in how they learn. However, the potential for use of ICT is clear, and the need to engage with it is clear, so it is worrying that students believe that a significant number of staff “lack even the most rudimentary IT skills”. It is hardly surprising that the experiences of students vary considerably when they are partly dependent upon the skills and understanding of their teachers, and whether teachers use technology appropriately and effectively.

At the recent ELAG conference I gave a joint presentation with Lukas Koster, a colleague from the University of Amsterdam, in which we talked about (and acted out via two short sketches) the gap between researchers’ needs and what information professionals provide. Something as seemingly obvious as the understanding and use of the term ‘archives’ is a good case in point.

Examples of interface terminology from archives sites
Random selection of interface terminology from archives sites.

Should we ensure that students understand the different definitions of archives? The distinction between archives that are collections with a common provenance and archives that are artificial collections? The different characters of archives that are datasets, generally used by social scientists? The “abuse” of the term archives for pretty much anything that is stored in any kind of longer-term way? Should users understand archival arrangement and how to drill down into collections? Should they understand ‘fonds’, ‘manuscripts’, ‘levels’, ‘parent collection’? Or is it that we should think more about how to translate these things into everyday language and simple design, and how to work things like archival hierarchy into easy-to-use interfaces? I think we should take the opportunities that technology provides to find ways to present information in such a way that we facilitate the user experience. But if students are reporting a lack of basic ICT skills amongst teachers, you have to wonder whether this is a problem within the archive and library sector as well. Do information professionals have appropriate ICT skills fit for ensuring that we can tailor our services to meet the needs of the technically savvy user?

Should we be teaching information literacy to students? One of the problems with this idea is that they tend to think they are already pretty literate in terms of use of the internet. In the HEFCE report, a survey of 213 FE students found that 88% felt they were effective online researchers and the majority said they were self-taught. They would not be likely to attend training on how to use the internet. And there is a question over whether they need to be taught how to use it in the ‘right’ way, or whether information professionals should, in fact, work with the reality of how it is being used (even if it is deemed to be ‘wrong’ in some way).  Students are clear that they do want training “around how to effectively research and reference reliable online resources”, and maybe this is what we should be concentrating on (although it might be worth considering what ‘effective use of the internet’ and ‘effective research using the internet’ actually mean). Maybe this distinction highlights the problem with how to measure effective use of the internet, and how to define online or discovery skills.

A British Library survey from 2010 found that “only a small proportion [of students] …are using technology such as virtual-research environments, social bookmarking, data and text mining, wikis, blogs and RSS-feed alerts in their work.”  This is despite the fact that many respondents in the survey said they found such tools valuable. This study also showed that students turn to their peers or supervisors rather than library staff for help.

Part of the problem may be that the vast majority of users use the internet for leisure purposes as well as work or study, so the boundaries can become blurred, and they may feel that they are adept users without distinguishing between different types of use. They feel that they are ‘fine with the technology’, although I wonder if that could be because they spend hours playing World of Warcraft, or use Facebook or Twitter every day, or regularly download music and watch YouTube. Does that mean they will use technology in an effective way as part of their studies? The trouble is that if someone believes that they are adept at searching, they may not go that extra mile to reflect on what they are doing and how effective it really is. Do we need to adjust our ways of thinking to make our resources more user-friendly to people coming from this kind of ‘I know what I’m doing’ mindset, or do we have to disabuse them of this idea and re-train them (or exhort them to read help pages for example…which seems like a fruitless mission)? Certainly students have shown some concern over “surface learning” (skim reading, learning only the minimum, and not getting a broader understanding of issues), so there is some recognition of an issue here, and the tendency to take a superficial approach might be reinforced if we shy away from providing more sophisticated tools and interfaces.

The British Library report on the Information Behaviour of the Researcher of the Future reinforces the idea that there is a gulf between students’ assumptions regarding their ICT skills versus the reality, which reveals a real lack of understanding. It also found a significant lack of training in discovery and use of tools for postgraduate students. Studies like this can help us think about how to design our websites, and provide tools and services to help researchers using archives. We have the challenges of how to make archives more accessible and easy to discover as well as thinking about how to help students use and interpret them effectively: “The college students of the open source, open content era will distinguish themselves from their peers and competitors, not by the information they know, but by how well they convert that knowledge to wisdom, slowly and deeply internalized.” (Sheila Stearns, “Literacy in the University of 2025: Still A Great Thing”, from The Future of Higher Education, ed. by Gary Olson & John W Presley, (Boulder: Paradigm Publishers, 2009) pp. 98-99).

What are the Solutions?

We should make user testing more integral to the development of our interfaces. It requires resource, but for the Archives Hub we found that even carrying out 10 one-hour interviews with students and academics helped us to understand where we were making assumptions and how we could make small modifications that would improve our site. And our annual online survey continues to provide really useful feedback which we use to adjust our interface design, navigation and terminology. We can understand more about our users, and sometimes our assumptions about them are challenged.

graph showing where people came from who visited the Hub
Archives Hub survey 2013: Why did you come to the Hub today?

User groups for commercial software providers can petition to ensure that out-of-the-box solutions also meet users’ needs and take account of the latest research and understanding of users’ experiences, expectations and preferences in terms of what we provide for them. This may be a harder call, because vendors are not necessarily flexible and agile; they may not be willing to make radical changes unless they see a strong business case (i.e. income may be the strongest factor).

We can build a picture of our users via our statistics. We can look at how users came into the site, the landing pages, where they went from there, which pages are most/least popular, how long they spent on individual pages, etc. This can offer real insights into user behaviour. I think a few training sessions on using Google Analytics on archive sites could come in handy!
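As a rough illustration of the kind of picture these statistics can give, here is a small Python sketch. It does not use Google Analytics or any real analytics API; it assumes a hypothetical export of page views as (session id, timestamp, path) tuples, and the function names `landing_pages` and `page_popularity` and the sample paths are invented for the example.

```python
from collections import Counter


def landing_pages(hits):
    """Count the first page viewed in each session.

    hits is an iterable of (session_id, timestamp, path) tuples.
    """
    first = {}
    for session, timestamp, path in hits:
        # Keep the earliest hit seen so far for each session.
        if session not in first or timestamp < first[session][0]:
            first[session] = (timestamp, path)
    return Counter(path for _, path in first.values())


def page_popularity(hits):
    """Count every view of every page, regardless of session."""
    return Counter(path for _, _, path in hits)


if __name__ == "__main__":
    # Hypothetical sample data: two sessions, five page views.
    sample = [
        ("s1", 1, "/data/gb159-ms57"),
        ("s1", 2, "/search"),
        ("s2", 1, "/"),
        ("s2", 2, "/search"),
        ("s2", 3, "/data/gb159-ms57"),
    ]
    print(landing_pages(sample).most_common())
    print(page_popularity(sample).most_common())
```

Comparing the two counts is one simple way to see whether people are arriving directly at collection descriptions, for instance from a search engine, or starting from the home page.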

We can carry out testing to find out how well sites rank on search engines, and assess the sort of experience users get when they come into a specialist site from a general search engine. What is the text a Google search shows when it finds one of your collections? What do people get to when they click on that link? Is it clear where they are and what they can do when they get to your site?

 * * *

This is the only generation where the teachers and information professionals have grown up in a pre-digital world, and the students (unless they are mature students) are digital natives. Of course, we can’t just sit back and wait a generation for the teachers and information professionals to become more digitally minded! But it is interesting to wonder whether in 25 years’ time there will be much more consensus in approaches to and uses of ICT, or whether the same issues will be around.

Nigel Shadbolt has described the Web as “one of the most disruptive innovations we have ever witnessed” and at present we really seem to be struggling to find out how best to use it (and not use it), how and when to train people to use it and how and when to integrate it into teaching, learning and research in an effective way.

It seems to me that there are so many narratives and assessments at present – studies and reports that seem to run the gamut from positive to negative. Is technology isolating or socialising? Are social networks making learning more superficial or enabling richer discussion and analysis? Is open access democratising or income-reducing? Is the high cost of technology encouraging elitism in education? Does the fact that information is so easily accessible mean that researchers are less bothered about working to find new sources of information? With all these types of debates there is probably no clear answer, but let us hope we are moving forward in understanding and in our appreciation of what the Web can do to both enhance and transform learning, teaching and research.