The long tail of archives

For many of us, the importance of measuring use and impact is coming more to the fore. Funders are often keen for indications of the ‘value’ of archives and typically look for charts and graphs that can provide some kind of summary of users’ interaction with archives. For the Hub, in the most direct sense this is about use of the descriptions of archives, although, of course, we are just as interested in whether researchers go on to consult archives directly.

The pattern of use of archives and the implications of this are complex. The long tail has become a phrase that is bandied around quite a bit, and to my mind it is one of those concepts that is quite useful. It was popularised by Chris Anderson, more in relation to the commercial world, relating to selling a smaller number of items in large quantities and a large number of items in relatively small quantities, and you can read more about it in Wikipedia: Long Tail.

If we think about books, we might assume that a smaller number of popular titles are widely used and use gradually declines until you reach a long tail of low use.  We might think that the pattern, very broadly speaking, is a bit like this:

I attended a talk at the UKSG Conference recently, where Terry Bucknell from the University of Liverpool was talking about the purchase of e-books for the University. He had some very whizzy and really quite absorbing statistics that analysed the use of packages of e-books. It seems that it is hard to predict use and that whilst a new package of e-books is the most widely used for that particular year, the older packages are still significantly used, and indeed, some books that are barely used one year may get significant use in subsequent years. The patterns of use suggested that patron-driven acquisition, or selection of titles after one year of use, were not as good value as e-book packages, although you cannot accurately measure the return on investment after only one year.

Archives are kind of like this only a whole lot more tricky to deal with.

For archives, my feeling is that the graph is more like this:

No prizes for guessing which are the vastly more used collections*. We have highly used collections for popular research activities, archives of high-profile people and archives around significant events, and it is often these that are digitised in order to protect the originals.  But it is true to say that a large proportion of archives are in the ‘long tail’ of use.

I think this can be a problem for us. Use statistics can dominate perceptions of value and influence funding, often very profoundly. Yet I think that this is completely the wrong way to look at it. Direct use does not correlate to value, not within archives.

I think there are a number of factors at work here:

  • The use of archives is intimately bound up with how they are catalogued. If you have a collection of letters, and just describe it thus, maybe with the main author (or archival ‘creator’), and covering dates, then researchers will not know that there are letters by a number of very interesting people, about a whole range of subjects of great interest for all sorts of topics. Often, archivists don’t have the time to create rich metadata (I remember the frustrations of this lack of time). Having worked in the British Architectural Library, I remember that we had great stuff for social history, history of empire, in particular the Raj in India, urban planning, environment, even the history of kitchen design or local food and diet habits. We also had a wonderful collection of photographs, and I recall the Photographs Curator showing me some really early and beautiful photographs of Central Park in New York. It’s these kinds of surprises that are the stuff of archives, but we don’t often have time to bring these out in the cataloguing process.
  • The use of a particular archive collection may be low, and yet the value gained from the insights may be very substantial. Knowledge gained as a result of research in the archives may feed into one author’s book or article, and from there it may disseminate widely. So, one use of one archive may have high value over time. If you fed this kind of benefit in as indirect use, the pattern would look very different.
  • The ‘value’ of archives may change over time. Going back to my experience at the British Architectural Library, I remember being told how the drawings of Sir Edwin Lutyens were not considered particularly valuable back in the 1950s – he wasn’t very fashionable after his death. Yet now he is recognised as a truly great architect, and his archives and drawings are highly prized.
  • The use of archives may change over time. Just because an archive has not been used for some time – maybe only a couple of researchers have accessed it in a number of years – it doesn’t mean that it won’t become much more heavily used. I think that research, just like many things, is subject to fashions to some extent, and how we choose to look back at our past changes over time. This is one of the challenges for archivists in terms of acquisitions. What is required is a long-term perspective but organisations all too often operate within short-term perspectives.
  • Some archives may never be highly used, maybe due to various difficulties interpreting them. I suppose Latin manuscripts come to mind, but also other manuscripts that are very hard to read and those pesky letters that are cross-written. Also, some things are specialised and require professional or some kind of expert knowledge in order to understand them. This does not make them less valuable. It’s easy to think of examples of great and vital works of our history that are not easy for most people to read or interpret, but that are hugely important.
  • Some archives are very fragile, and therefore use has to be limited. Digitising may be one option, but this is costly, and there are a lot of fragile archives out there.

I’m sure I could think of some more – any thoughts on this are very welcome!

So, I think that it’s important for archivists to demonstrate that whilst there may be a long tail to archives, the value of many of those archives that are not highly used can be very substantial. I realise that this is not an easy task, but we do have one invention in our favour: The Web. Not to mention the standards that we have built up over time to help us to describe our content. The long tail graph does demonstrate to us that the ‘long tail of use’ can be just as much, or more, than the ‘high column of use’. The use of the Web is vital in making this into a reality, because researchers all over the world can discover archives that were previously extremely hard to surface.  That does still leave the problems of not being able to catalogue in depth in order to help surface content…the experiments with crowd-sourcing and user generated content may prove to be one answer. I’d like to see a study of this – have the experiments with asking researchers to help us catalogue our content proved successful if we take a broad overview? I’ve seen some feedback on individual projects, such as OldWeather:

“Old Weather (http://www.oldweather.org) is now more than 50% complete, with more than 400,000 pages transcribed and 80 ships’ logs finished. This is all thanks to the incredible effort that you have all put in. The science and history teams are constantly amazed at the work you’re all doing.” (a recent email sent out to the contributors, or ‘ship captains’).
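
To make the earlier point about the ‘long tail of use’ against the ‘high column of use’ a little more concrete, here is a minimal Python sketch. Everything in it is an assumption made for illustration: the 1,000 collections, the figure of 5,000 uses for the most popular one, the cut-off of 20 ‘popular’ collections, and the idea that use falls away in proportion to rank. These are not Hub statistics. With these invented numbers, the tail adds up to slightly more use than the head.

```python
# Hypothetical illustration: 1,000 collections whose yearly use falls away
# so that the collection at rank r gets (top use / r) uses.
# All numbers are invented for this sketch; they are not Hub statistics.

def ranked_uses(n_collections, top_use):
    """Return a list of yearly uses, most-used collection first."""
    return [top_use / rank for rank in range(1, n_collections + 1)]

def head_and_tail(uses, head_size):
    """Split total use into the 'high column' (head) and the 'long tail'."""
    head = sum(uses[:head_size])
    tail = sum(uses[head_size:])
    return head, tail

uses = ranked_uses(n_collections=1000, top_use=5000)
head, tail = head_and_tail(uses, head_size=20)
tail_share = tail / (head + tail)

print(f"head: {head:.0f} uses, tail: {tail:.0f} uses, tail share: {tail_share:.0%}")
```

Changing the cut-off or the shape of the curve changes the balance, of course, which is rather the point: a count of direct use on its own says little about where the value lies.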

If anyone has any thoughts or stories about demonstrating value, we’d love to hear your views.

* family history sources

A bit about Resource Discovery

The UK Archives Discovery Network (UKAD) recently advertised our forthcoming Forum on the archives-nra listserv. This prompted one response to ask whether ‘resource discovery’ is what we now call cataloguing and getting the catalogues online. The respondent went on to ask why we feel it necessary to change the terminology of what we do, and labelled the term resource discovery as ‘gobledegook’. My first reaction to this was one of surprise, as I see it as a pretty plain talking way of describing the location and retrieval of information, but then I thought that it’s always worth considering how people react and what leads them to take a different perspective.

It made me think that even within a fairly small community, which archivists are, we can exist in very different worlds and have very different experiences and understanding. To me, ‘resource discovery’ is a given; it is not in any way an obscure term or a novel concept. But I now work in a very different environment from when I was an archivist looking after physical collections, and maybe that gives me a particular perspective. Being manager of the Archives Hub, I have found that a significant amount of time has to be dedicated to learning new things and absorbing new terminology. There seem to be learning curves all over the place, some little and some big. Learning curves around understanding how our Hub software (Cheshire) processes descriptions, Encoded Archival Description, deciding whether to move to the EAD schema, understanding namespaces, search engine optimisation, sitemaps, application programming interfaces, character encoding, stylesheets, log reports, ways to measure impact, machine-to-machine interfaces, scripts for automated data processing, linked data and the semantic web, etc. A great deal of this is about the use of technology, and figuring out how much you need to know about technology in order to use it to maximum effect. It is often a challenge, and our current Linked Data project, Locah, is very much a case in point (see the Locah blog). Of course, it is true that terminology can sometimes get in the way of understanding, and indeed, defining and having a common understanding of terms is often itself a challenge.

My expectation is that there will always be new standards, concepts and innovations to wrestle with, try to understand, integrate or exclude, accept or reject, on pretty much a daily basis. When I was the archivist at the RIBA (Royal Institute of British Architects), back in the 1990s, my world centered much more around solid realities: around storerooms, temperature and humidity, acquisitions, appraisal, cataloguing, searchrooms and the never ending need for more space and more resources. I certainly had to learn new things, but I also had to spend far more time than I do now on routine or familiar tasks; very important, worthwhile tasks, but still largely familiar and centered around the institution that I worked for and the concepts and terminology commonly used by archivists. If someone had asked me what resource discovery meant back then, I’m not sure how I would have responded. I think I would have said that it was to do with cataloguing, and I would have recognised the importance of consistency in cataloguing. I might have mentioned our Website, but only in as far as it provided access through to our database. The issues around cross-searching were still very new and ideas around usability and accessibility were yet to develop.

Now, I think about resource discovery a great deal, because I see it as part of my job to think of how to best represent the contributors who put time and effort into creating descriptions for the Hub. To use another increasingly pervasive term, I want to make the data that we have ‘work harder’. For me, catalogues that are available within repositories are just the beginning of the process. That’s fine if you have researchers who know that they are interested in your particular collections. But we need to think much more broadly about our potential global market: all the people out there who don’t know they are interested in archives – some, even, who don’t really know what archives are. To reach them, we have to think beyond individual repositories and we have to see things from the perspective of the researcher. How can we integrate our descriptions into the ‘global information environment’ in a much more effective way? A most basic step here, for example, is to think about search engine optimisation. Exposing archival descriptions through Google, and other search engines, has to be one very effective way to bring in new researchers. But it is not a straightforward exercise – books are written about SEO and experts charge for their services in helping optimise data for the Web. For the Archives Hub, we were lucky enough to be part of an exercise looking at SEO and how to improve it for our site. We are still (pretty much as I write) working on exposing our actual descriptions more effectively.
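
One small, practical piece of exposing descriptions to search engines is a sitemap: a plain XML file listing the addresses you would like crawled. The sketch below is only an illustration of the idea, not a description of how the Hub does it. The two addresses and the `build_sitemap` function are invented for the example; the namespace is the one defined by the sitemaps.org protocol.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"

def build_sitemap(urls):
    """Return sitemap XML (as a string) with one <url><loc> entry per address."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for address in urls:
        url = ET.SubElement(urlset, "url")
        ET.SubElement(url, "loc").text = address
    return ET.tostring(urlset, encoding="unicode")

# Hypothetical description addresses, invented for this sketch.
description_urls = [
    "https://example.org/descriptions/collection-one",
    "https://example.org/descriptions/collection-two",
]

sitemap_xml = build_sitemap(description_urls)
print(sitemap_xml)
```

A real sitemap would usually be written to a file and referenced from the site’s robots.txt, and it is only one part of SEO; it tells a crawler where the descriptions are, not how well they will rank.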

Linked Data provides another whole world of unfamiliar terminology to get your head round. Entities, triples, URI patterns, data models, concepts and real world things, sparql queries, vocabularies – the learning curve has indeed been steep. Working on outputting our data as RDF (a modelling framework for Linked Data) has made me think again about our approach to cataloguing and cataloguing standards. At the Hub, we’re always on about standards and interoperability, and it’s when you come to something like Linked Data, where there are exciting possibilities for all sorts of data connections, well beyond just the archive community, that you start to wish that archivists catalogued far more consistently. If only we had consistent ‘extent’ data, for example, we could look at developing a lovely map-based visualisation showing where there are archives based on specific subjects all around the country and have a sense of where there are more collections and where there are fewer collections. If only we had consistent entries for people’s names, we could do the same sort of thing here, but even with thesauri, we often have more than one name entry for the same person. I sometimes think that cataloguing is more of an art than a science, partly because it is nigh on impossible to know what the future will bring, and therefore knowing how to catalogue to make the most of as yet unknown technologies is tricky to say the least. But also, even within the environment we now have, archivists do not always fully appreciate the global and digital environment which requires new ways of thinking about description. Which brings me back to the idea of whether resource discovery is another term for cataloguing and getting catalogues online. No, it is not. It is about the user perspective, about how researchers locate resources and how we can improve that experience.
It has increasingly become identified with the Web as a way to define the fundamental elements of the Web: objects that are available and can be accessed through the Internet, in fact, any concept that has an identity expressed as a URI. Yes, cataloguing is key to archives discovery, cataloguing to recognised standards is vital, and getting catalogues online in your own particular system is great…but there is so much more to the whole subject of enabling researchers to find, understand and use archives and integrating archives into the global world of resources available via the Web.
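
For anyone who finds ‘triples’ and ‘URIs’ as opaque as I did at first, a tiny sketch may help. A triple is just a three-part statement (subject, predicate, object), and the benefit of giving a person one consistent URI is that every collection pointing at that URI can be found together. The sketch uses plain Python tuples rather than an RDF library. The example.org URIs and the two collections are invented for illustration and are not the Locah URI patterns or data model; the predicates are taken from the Dublin Core terms and FOAF vocabularies.

```python
# Each statement is a (subject, predicate, object) triple.
# The example.org URIs are hypothetical, invented for this sketch.
BASE = "http://example.org/id/"
DCT = "http://purl.org/dc/terms/"      # Dublin Core terms vocabulary
FOAF = "http://xmlns.com/foaf/0.1/"    # FOAF vocabulary

person = BASE + "person/lloyd-george-david-1863-1945"

triples = [
    (person, FOAF + "name", "David Lloyd George"),
    (BASE + "collection/a", DCT + "title", "Letters (hypothetical collection A)"),
    (BASE + "collection/a", DCT + "creator", person),
    (BASE + "collection/b", DCT + "title", "Papers (hypothetical collection B)"),
    (BASE + "collection/b", DCT + "creator", person),
]

def subjects_with(predicate, obj, statements):
    """Return every subject that has the given predicate and object."""
    return [s for (s, p, o) in statements if p == predicate and o == obj]

# Because both collections point at the same person URI, one lookup finds both.
linked = subjects_with(DCT + "creator", person, triples)
print(linked)
```

If the two collections had each used a differently spelt name string instead of a shared URI, this simple lookup would find only one of them, which is the consistency problem in miniature.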

A model to bring museums, libraries and archives together

I am attending a workshop on the Conceptual Reference Model created by the International Council of Museums Committee on Documentation (CIDOC) this week.
The CIDOC Conceptual Reference Model (CRM) was created as a means of enabling information interchange and integration in the museum community and beyond. It “provides definitions and a formal structure for describing the implicit and explicit concepts and relationships used in cultural heritage documentation”.
It became an ISO standard in 2006 and a Special Interest Group continues to work to develop it and keep it in line with progress in conceptualisation for information integration.
The vision is to facilitate the harmonization of information across the cultural heritage sector, encompassing museums, libraries and archives, helping to create a global resource. The CRM is effectively an ontology describing concepts and relationships relevant to this kind of information. It is not in any sense a content standard, rather it takes what is available and looks at the underlying logic, analysing the structure in order to progress semantic interoperability.
I come to this as someone with a keen interest in interoperability, and I think that the Archives community should engage more actively in cross-sectoral initiatives that benefit resource discovery. I am interested to find out more about the practical application and adoption of the CRM. My concern is that in the attempt to cover all eventualities, it seems like quite a complex model. It seeks to ‘provide the level of detail and precision expected and required by museum professionals and researchers’. It covers detailed descriptions, contexts and relationships, which can often be very complex. The SIG is looking to harmonise the CRM with archival standards, which should take the cultural heritage sector a step further towards working together to share our resources.
I will be interested to learn more about the Model and I would like to consider how the CRM relates to what is going on in the wider environment, and particularly with reference to Linked Data and, more basically, the increasing recognition of web architecture as the core means to disseminate information. Initiatives to bring data together, to interconnect, should move us closer to integrated information systems, but we want to make sure that we have complementary approaches.
You can read more about the Conceptual Reference Model on the CIDOC CRM website.

English language — subjectless constructions*

This is (probably) a final blog post referring to the recent survey by the UK Archives Discovery Network (UKAD) Working Group on Indexing and Name Authorities. Here we look in particular at subject indexing.

We received 82 responses to the question asking whether descriptions are indexed by subject. Most (42) do so, and follow recognised rules (UKAT, Unesco, LCSH, etc.). A significant proportion (29) index using in-house rules and some do not index by subject (18). Comments on this question indicated that in-house rules often supplement recognised standards, sometimes providing specialised terms where standards are too general (although I wonder whether these respondents have looked at Library of Congress headings, which are sometimes really quite satisfyingly specific, from the behaviour of the great blue heron to the history of music criticism in 20th century Bavaria).

Reasons given for subject indexing include:
  • it is good practice
  • it is essential for resource discovery
  • users find it easier than full-text searching
  • it gives people an indication of the subject strengths of collections
  • it imposes consistency
  • it is essential for browsing (for users who prefer to navigate in this way)
  • it brings together references to specific events
  • it brings out subjects not made explicit in keyword searching
  • it enables people to find out about things and about concepts
  • it may provide a means to find out about a collection where it is not yet fully described
  • it maximises the utility of the catalogues
  • it helps users identify the most relevant sources
  • it can indicate useful material that may not otherwise be found
  • it enables themes to be drawn out that may be missed by free-text searching
  • it can aid teachers
  • it helps with answering enquiries
  • it facilitates access across the library and archive
  • it meets the needs of academic researchers
The lack of staff resources was a significant reason given where subject indexing was not undertaken. Several respondents did not consider it to be necessary. Reasons given for this were:
  • the scope of the archive is tightly defined so subject indexing is less important
  • the benefits are not clear
  • the lack of a thesaurus that is specific enough to meet needs
  • a management decision that it is ‘faddy’
  • the collections are too extensive
  • the cataloguing backlog is the priority
Name indexing is considered more important than subject indexing only by a small margin, and some respondents did emphasise that they index by name but not by subject. Comments here included the observation that subject indexing is more problematic because it is more subjective, that subjects may more easily be pulled out via automated means and that it depends upon the particular archive (collection). As with name and place indexing, subject indexing happens at all levels of description, and not predominantly at collection-level. Comments suggest that subjects are only added at lower-levels if appropriate (and not appropriate to collection-level).
For subjects, the survey asked how many terms are on average applied to each record. According to the options we gave, the vast majority use between one and six. However, some respondents commented that it varies widely, and one said that they might use a few thousand for a directory, which seems a little generous (possibly there is a misunderstanding here?)
Sources used for subjects included the usual thesauri, with UKAT coming out strongest, followed by Unesco and Library of Congress. A few respondents also referred to the Getty Art and Architecture Thesaurus. However, as with other indexes, in-house lists and a combination approach also proved common. It was pointed out in one comment that in-house lists should not be seen as lesser sources; one respondent has sold their thesaurus to other local archives. There were two comments about UKAT not being maintained, and hopes that the UKAD Network might take this on. And, indeed, when asked about the choice of sources used for subject indexing, UKAT again came up as a good thesaurus in need of maintenance.
Reasons given for the diverse choice of sources used included:
  • being led by what is within the software used for cataloguing
  • the need to work cross-domain
  • the need to be interoperable
  • the need to apply very specific subject terms
  • the need to follow what the library does
  • the importance of an international perspective
  • the lack of forethought on how users might use indexes
  • the lack of a specialist thesaurus in the subject area the repository represents (e.g. religious orders)
  • following the recommendations of the Archives Hub and A2A
Image courtesy of Flickr Creative Commons licence, Luca Pedrotti’s photostream

* the title of this blog post is a Library of Congress approved subject heading

Place names: we would be lost without them

According to the recent Indexing and Authority Records Survey (which I have been blogging about recently), archivists have a number of reasons why they think it is important to undertake place indexing:

  • to facilitate access
  • it is essential to resource discovery
  • users frequently request information about places
  • it is very important for local historians
  • it is good practice
  • to tackle inconsistencies in spelling and place name changes
  • to distinguish between places that have the same name
  • as a source of statistics (e.g. how many collections relate to individual countries)
  • it is an important part of the University’s diversity plan – many students are from other countries – shows that the collections are international
  • the records are arranged by place
  • it is a way to bring together disparate material in diverse collections
  • it helps identify and track boundary changes over time
  • it is used by national network sites (e.g. the Archives Hub)
The main reason not to index by place was given as a lack of staff resources, but some did also feel that it is not necessary. Other reasons were:
  • the search engine can pull out the place name
  • would need to index at item level for place entries to be useful and this is not practical to do
  • cataloguing and name indexing are the priorities
  • collections cover a small geographical area
  • collections are more thematic and name indexing works better than place indexing
  • not appropriate for the material (e.g. cartoons)
  • it has never been done
  • names are standardised to facilitate keyword searching
For those that do index by place, just as with names, the spread between collection-level, series-level, file and item-level indexing was pretty even, and the percentage of collections indexed by place varied enormously. The sources used for place names were varied, although most do seem to use the recognised gazetteers and guides. Others referred to the Library of Congress, local people and the documents themselves.
Many do use the NCA Rules, but there were some comments about the drawbacks of these: they do not recognise the three Yorkshire Ridings, and they were created by a previous generation of archivists and are outdated.
We did ask whether any repositories use a co-ordinates based system, and only 3 responses were in the affirmative, though a couple stated that they were going to look into this.
Finally, when asked about reasons for the choice of rules or sources for place names, there were some varied responses:
  • being part of a set-up with other contributors
  • familiarity
  • ease
  • internationally accepted [standard], widely known and used
  • indexing was done before standards were introduced
  • it appears that no real thought has been given to this
  • standards were not precise enough when the decision was made
Place name indexing: is it necessary? One respondent said: ‘To put it bluntly we would be lost without it.’
Image: Flickr Creative Commons JMC Photos

‘I’m Spartacus!’ (or giving a name authority)


This is the second blog post about the recent UKAD survey on indexing and name authorities (as stated previously a report on the survey will be made available shortly).


It seems to me that there is some confusion over what authority records actually are. When we came up with our survey it was clear that defining these terms is not always that straightforward and we often make assumptions that are not necessarily shared. We created a glossary for the survey, and defined a name authority record as:

“An entry for a person or corporate body that includes additional elements about the entity, providing contextual information as well as a name index entry.”

However, it is clear that some respondents were thinking of name index entries rather than more complete authority records. According to our survey, which received 93 responses, 34 maintain authority records that follow recognised rules or sources (although comments indicate that the number of these records may be very limited), 14 follow local practice and 29 do not maintain authority records. Bear in mind that responses were not per institution, so the figures can only tell us so much. But what they do indicate is: (i) there is some confusion about what authority records are (ii) some repositories maintain authority records that follow their own in-house practice rather than recognised standards (iii) it is important for archivists that the software cataloguing systems they use support the creation of authority records.

Many repositories use the original records to create authority records, which is one reason why archivists are in the best position to provide this kind of detailed and useful information to researchers. The original records can give a real insight into individuals, particularly lesser-known individuals. Many archivists base their name authority records on ISAAR(CPF), which gives a level of consistency, but many do not, maybe reflecting the fact that ISAAR is a recent standard (first edition 1996), and cataloguing is not a recent phenomenon.

If the authorised form of the name follows recognised rules, this provides for effective resource discovery. But in reality we know that there are often many versions of an individual out there. Here are the entries on the Archives Hub for David Lloyd George:

  • George David Lloyd
  • George David Lloyd 1863-1945 1st Earl Lloyd George Of Dwyfor Statesman
  • George David Lloyd 1863-1945 1st Earl Lloyd George Of Dwyfor Statesman And Prime Minister
  • George David Lloyd 1863-1945 Emph Altrender Epithet Prime Minister
  • George David Lloyd 1863-1945 First Earl Lloyd-george Of Dwyfor Prime Minister
  • Lloyd George David
  • Lloyd George David 1863-1945
  • Lloyd George David 1863-1945 1st Earl Lloyd George Of Dwyfor Statesman
  • Lloyd George David 1863-1945 1st Earl Lloyd-george Of Dwyfor Statesman
This illustrates quite nicely the problems of including an epithet, and even more clearly the problems of NCA Rules insisting on using the last element of a surname, even if it is a compound or hyphenated surname. I will never understand that one…sigh.
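
To give a feel for why consistent forms matter, here is a rough Python sketch of how the nine entries above could be pulled together by machine. This is purely an illustration and not how the Archives Hub works: the function names (match_key, group_entries) are invented for this example, and it assumes that anything following the life dates is an epithet that can be ignored for matching.

```python
import re
from collections import defaultdict

# The nine index entries quoted above, copied as they appear.
ENTRIES = [
    "George David Lloyd",
    "George David Lloyd 1863-1945 1st Earl Lloyd George Of Dwyfor Statesman",
    "George David Lloyd 1863-1945 1st Earl Lloyd George Of Dwyfor Statesman And Prime Minister",
    "George David Lloyd 1863-1945 Emph Altrender Epithet Prime Minister",
    "George David Lloyd 1863-1945 First Earl Lloyd-george Of Dwyfor Prime Minister",
    "Lloyd George David",
    "Lloyd George David 1863-1945",
    "Lloyd George David 1863-1945 1st Earl Lloyd George Of Dwyfor Statesman",
    "Lloyd George David 1863-1945 1st Earl Lloyd-george Of Dwyfor Statesman",
]

# Life dates such as "1863-1945"; assumption: the text after them is epithet.
LIFE_DATES = re.compile(r"\b\d{4}-\d{4}\b")


def match_key(entry):
    """Return an order-insensitive key built from the name words before any life dates."""
    found = LIFE_DATES.search(entry)
    name_part = entry[:found.start()] if found else entry
    return frozenset(name_part.lower().split())


def group_entries(entries):
    """Group index entries that share the same set of name words."""
    groups = defaultdict(list)
    for entry in entries:
        groups[match_key(entry)].append(entry)
    return dict(groups)


groups = group_entries(ENTRIES)
for key, variants in groups.items():
    print(sorted(key), len(variants))
```

With these nine entries everything lands in a single group, because the word order and the epithet are both ignored. The obvious weakness is that two different people who share the same name words would be merged as well, which is exactly the kind of problem that a proper authority record, with dates and contextual information, is there to solve.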


I love one of the responses to the question of which sources are used for authority records: ‘books, the internet, people’. In a way this reflects the diversity of sources used, which include encyclopaedias, directories, books, journals and registers as well as donor knowledge. This shows how important the expertise of archivists is in using various sources to bring together valuable information about individuals, families and corporate bodies. Authority records maximise the benefits of the information archivists gather together for their work, bringing it to researchers and giving them new ways into collections.

Archivists have to work with the software that they have, and sometimes this imposes certain limitations. One respondent mentioned the need to avoid using the ampersand, for example. Many repositories use CALM, and this is compliant with ISAAR(CPF), which should provide a great boost to archivists wanting to create authority records.

I do think that archivists should really be starting to think more carefully about the benefits of name authority records, and we need to have a more co-ordinated and collective approach to this. As one respondent put it, ‘We don’t create these at present, and I wonder whether we ever ought to? Surely this is most sensible as a global resource that we can contribute to and share.’ For my part, I would be very keen for the Archives Hub to facilitate this, and I hope that this is something we can look to in the future.

Image: Flickr Creative Commons Steeljam photostream


What’s in a Name?

I have just been taking a look through the results of a recent survey by the UK Archives Discovery Network (UKAD) Working Group. The Working Group are getting together this week and will be looking at making the results public.
The main thing that struck me was the variety of responses. If we thought that this survey might clarify the situation, I’m beginning to wonder if all that it clarifies is that the situation is not clear!
I’m just going to concentrate on name indexing here, and leave place and subject for another blog post.
It seems that only a small proportion of archivists (as reflected in this survey) do not think that indexing is important. Of the 80 responses, 49 indexed to recognised rules and 23 indexed in line with local practice; 13 did not index and 23 went for ‘other’, which tended to mean they were in the process of creating an index, moving to an index following recognised standards or had legacy data with some indexes.
The survey revealed many reasons to create a name index as a means to access archives:
  • for enhanced resource discovery
  • many users want to search by name (respondents indicated it is a very popular search option)
  • it brings together collections that reference the same people
  • it is a way researchers look for connections
  • it aids interdisciplinary research
  • to identify people involved in particular works and their roles
  • it helps researchers to narrow down larger numbers of hits to just relevant collections
  • it promotes interoperability
  • it addresses problems with variants of the name, name changes, or different people with the same name (aids reliability)
  • it is at the heart of family history research
  • it is useful for answering enquiries
  • it is useful for selecting material, e.g. for exhibitions
When asked why name indexing is not carried out, there were a number of reasons:
  • free text retrieval makes name indexing redundant
  • lack of funding
  • lack of training
  • lack of staff resource
  • the current system does not support indexing
  • it has never been done
  • uncertainty about how to index effectively
  • uncertainty about benefits
Out of 100 responses, 46 felt that name indexing is very important, 33 felt it is reasonably important and 11 felt it is a low priority. The main reason given for name indexing being a low priority was the pressing need to deal with cataloguing backlogs and actually get some kind of description out there. It also seems that archivists do not always feel that they have the evidence to suggest that indexing is of benefit to researchers (or enough benefit to warrant the time involved).
The level at which collections are indexed was often given as ‘whatever is appropriate’ and clearly varied widely. I had expected it to be much higher for collection-level descriptions, but this was not the case.
We asked which sources are used for names, and again the answers were varied. Many people clearly do use the original records, with the National Register of Archives and Dictionary of National Biography coming in close behind. There was mention of Wikipedia, and even Google. In terms of rules, a majority do use the NCA Rules, and more use in-house rules than use AACR2. Several respondents said they use ISAAR(CPF), which is curious, as this standard is for name authority records and states that the main name entry should follow recognised rules (e.g. NCA Rules). I wonder if people were thinking of name authority records rather than basic index entries.
More on the survey to follow. And the UKAD Network will be publishing the results via the listserv, archives-discovery-network@jiscmail.ac.uk. Make sure you sign up to this if you are interested in these kinds of activities: https://www.jiscmail.ac.uk/cgi-bin/webadmin?A0=ARCHIVES-DISCOVERY-NETWORK