Blowing the dust off Special Collections

Guest Blog Post by John Hodgson

Mimas works on exciting and innovative projects all the time and we wanted Hub blog readers to find out more about the SCARLET project, where Mimas staff, academics from the University of Manchester and the archive team at John Rylands University Library are exploring how Augmented Reality can bring resources held in special collections to life by surrounding original materials with digital online content.

The Project

Special Collections using Augmented Reality to Enhance Learning and Teaching (SCARLET)

SCARLET addresses one of the principal obstacles to the use of Special Collections in teaching and learning – the fact that students must consult rare books, manuscripts and archives within the controlled conditions of library study rooms. The material is isolated from the secondary, supporting materials and the growing mass of related digital assets. This is an alien experience for students familiar with an information-rich, connected wireless world, and is a barrier to their use of Special Collections.

The SCARLET project will provide a model that other Special Collections libraries can follow, making these resources accessible for research, teaching and learning. If you are interested in creating similar ‘apps’ and using the toolkit created by the team then please get in touch.

SCARLET Blog: http://teamscarlet.wordpress.com/

SCARLET Twitter: twitter.com/team_scarlet

The Blog Post

Blowing the dust off Special Collections

The academic year is now in full swing and JRUL Special Collections staff are busy delivering ‘close-up’ sessions and seminars for undergraduate and postgraduate students.

A close-up session typically involves a curator and an academic selecting up to a dozen items to show to a group of students. The items are generally set out on tables and everyone gathers round for a discussion. It is a real thrill for students to see Special Collections materials up close, and in some circumstances to handle the items themselves. The material might be papyri from Greco-Roman Egypt, medieval manuscripts, early printed books, eighteenth-century diaries and letters, or modern literary archives: the range of our Special Collections is vast.

Dante Seminar

Dr Guyda Armstrong shows her students a selection of early printed editions of Dante.

From our point of view, it’s really rewarding and enlightening to work alongside enthusiastic teachers such as Guyda Armstrong, Roberta Mazza and Jerome de Groot. The ideal scenario is a close partnership between the academic and the curator. Curators know the collections well, and we can discuss with students the materiality of texts, technical aspects of books and manuscripts, the context in which texts and images were originally produced, and the afterlife of objects – the often circuitous routes by which they have ended up in the Rylands Library. Academics bring to the table their incredible subject knowledge and their pedagogical expertise. Sparks can fly, especially when students challenge what they are being told!

This week I have been involved in close-up sessions for Roberta Mazza’s ‘Egypt in the Graeco-Roman World’ third-year Classics course, and Guyda Armstrong’s ‘Beyond the Text’ course on Dante, again for third-year undergraduates. Both sessions were really enjoyable, because the students engaged deeply with the material and asked lots of questions. But the sessions also reinforced my belief that Augmented Reality will allow us to do so much more. AR will make the sessions more interactive, moving towards an enquiry-based learning model, where we set students real questions to solve, through a combination of close study of the original material, and downloading metadata, images and secondary reading, to help them interrogate and interpret the material. Already Dr Guyda Armstrong’s students have had a sneak preview of the Dante app, and I’m looking forward to taking part in the first trials of the app in a real teaching session at Deansgate in a few weeks’ time.

For many years Special Collections have been seen by some as fusty and dusty. AR allows us to bring them into the age of the app.

The long tail of archives

For many of us, the importance of measuring use and impact is coming more to the fore. Funders are often keen for indications of the ‘value’ of archives and typically look for charts and graphs that can provide some kind of summary of users’ interaction with archives. For the Hub, in the most direct sense this is about use of the descriptions of archives, although, of course, we are just as interested in whether researchers go on to consult archives directly.

The pattern of use of archives and the implications of this are complex. The long tail has become a phrase that is bandied around quite a bit, and to my mind it is one of those concepts that is quite useful. It was popularised by Chris Anderson, more in relation to the commercial world, relating to selling a smaller number of items in large quantities and a large number of items in relatively small quantities, and you can read more about it in Wikipedia: Long Tail.

If we think about books, we might assume that a smaller number of popular titles are widely used and use gradually declines until you reach a long tail of low use.  We might think that the pattern, very broadly speaking, is a bit like this:

I attended a talk at the UKSG Conference recently, where Terry Bucknell from the University of Liverpool was talking about the purchase of e-books for the University. He had some very whizzy and really quite absorbing statistics that analysed the use of packages of e-books. It seems that it is hard to predict use and that whilst a new package of e-books is the most widely used for that particular year, the older packages are still significantly used, and indeed, some books that are barely used one year may get significant use in subsequent years. The patterns of use suggested that patron-driven acquisition, or selection of titles after one year of use, were not as good value as e-book packages, although you cannot accurately measure the return on investment after only one year.

Archives are kind of like this only a whole lot more tricky to deal with.

For archives, my feeling is that the graph is more like this:

No prizes for guessing which are the vastly more used collections*. We have highly used collections for popular research activities, archives of high-profile people and archives around significant events, and it is often these that are digitised in order to protect the originals.  But it is true to say that a large proportion of archives are in the ‘long tail’ of use.

I think this can be a problem for us. Use statistics can dominate perceptions of value and influence funding, often very profoundly. Yet I think that this is completely the wrong way to look at it. Direct use does not correlate to value, not within archives.

I think there are a number of factors at work here:

  • The use of archives is intimately bound up with how they are catalogued. If you have a collection of letters, and just describe it thus, maybe with the main author (or archival ‘creator’), and covering dates, then researchers will not know that there are letters by a number of very interesting people, about a whole range of subjects of great interest for all sorts of topics. Often, archivists don’t have the time to create rich metadata (I remember the frustrations of this lack of time). Having worked in the British Architectural Library, I remember that we had great stuff for social history, history of empire, in particular the Raj in India, urban planning, environment, even the history of kitchen design or local food and diet habits. We also had a wonderful collection of photographs, and I recall the Photographs Curator showing me some really early and beautiful photographs of Central Park in New York. It’s these kinds of surprises that are the stuff of archives, but we don’t often have time to bring these out in the cataloguing process.
  • The use of a particular archive collection may be low, and yet the value gained from the insights may be very substantial. Knowledge gained as a result of research in the archives may feed into one author’s book or article, and from there it may disseminate widely. So, one use of one archive may have high value over time. If you fed this kind of benefit in as indirect use, the pattern would look very different.
  • The ‘value’ of archives may change over time. Going back to my experience at the British Architectural Library, I remember being told how the drawings of Sir Edwin Lutyens were not considered particularly valuable back in the 1950s – he wasn’t very fashionable after his death. Yet now he is recognised as a truly great architect, and his archives and drawings are highly prized.
  • The use of archives may change over time. Just because an archive has not been used for some time – maybe only a couple of researchers have accessed it in a number of years – it doesn’t mean that it won’t become much more heavily used. I think that research, just like many things, is subject to fashions to some extent, and how we choose to look back at our past changes over time. This is one of the challenges for archivists in terms of acquisitions. What is required is a long-term perspective but organisations all too often operate within short-term perspectives.
  • Some archives may never be highly used, maybe due to various difficulties interpreting them. I suppose Latin manuscripts come to mind, but also other manuscripts that are very hard to read and those pesky letters that are cross-written. Also, some things are specialised and require professional or some kind of expert knowledge in order to understand them. This does not make them less valuable. It’s easy to think of examples of great and vital works of our history that are not easy for most people to read or interpret, but that are hugely important.
  • Some archives are very fragile, and therefore use has to be limited. Digitising may be one option, but this is costly, and there are a lot of fragile archives out there.

I’m sure I could think of some more – any thoughts on this are very welcome!

So, I think that it’s important for archivists to demonstrate that whilst there may be a long tail to archives, the value of many of those archives that are not highly used can be very substantial. I realise that this is not an easy task, but we do have one invention in our favour: The Web. Not to mention the standards that we have built up over time to help us to describe our content. The long tail graph does demonstrate to us that the ‘long tail of use’ can be just as much as, or more than, the ‘high column of use’. The use of the Web is vital in making this into a reality, because researchers all over the world can discover archives that were previously extremely hard to surface.  That does still leave the problems of not being able to catalogue in depth in order to help surface content…the experiments with crowd-sourcing and user generated content may prove to be one answer. I’d like to see a study of this – have the experiments with asking researchers to help us catalogue our content proved successful if we take a broad overview? I’ve seen some feedback on individual projects, such as OldWeather:

“Old Weather (http://www.oldweather.org) is now more than 50% complete, with more than 400,000 pages transcribed and 80 ships’ logs finished. This is all thanks to the incredible effort that you have all put in. The science and history teams are constantly amazed at the work you’re all doing.” (a recent email sent out to the contributors, or ‘ship captains’).

If anyone has any thoughts or stories about demonstrating value, we’d love to hear your views.

* family history sources

Optimistic outcome for optimising the Hub


Paddy, Steve and I (Jane) have spent the last 4 months working on an interesting JISC project to optimise Archives Hub pages for search engines, as part of the Strategic Content Alliance Initiative.

Search Engine Optimisation (SEO) is a process that aims to increase the visibility of a Website in important search engines like Google. SEO works by modifying the content, the layout, and the architecture of web pages, in addition to using community building techniques to enhance the popularity of a website.

As part of this project, an SEO expert is tracking and recording our current web traffic. We are implementing recommended changes and looking for changes to the website traffic after the changes are made.

Recommendations we have implemented
1. A Search Engine Sitemap

This is something that was developed by Google and is used by other search engines. An XML sitemap is a recommended way of organising a Website and identifying the URLs for the purpose of indexing the site by search engine bots, allowing them to find content and data faster and more efficiently. It is a means for us to tell the search engine what the important pages are, and we can also put a date into the sitemap as an indication of how often the page is updated. The sitemap should help the pages get indexed faster.

The sitemap was relatively easy to create, although it probably needs a bit more work from us in terms of grading pages for priority.
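As a rough illustration of what such a sitemap looks like, here is a minimal Python sketch that builds one using the standard sitemaps.org format. It is not how the Hub's own sitemap was produced: the function name `build_sitemap`, the dates, the priority values and the second URL are all assumptions made up for the example (only the railways URL appears elsewhere in this post).

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(pages):
    """Build a sitemap XML string.

    pages is a list of (url, last_modified, priority) tuples, where
    last_modified is a 'YYYY-MM-DD' string and priority is between 0.0 and 1.0.
    """
    urlset = ET.Element("urlset", {"xmlns": SITEMAP_NS})
    for loc, lastmod, priority in pages:
        url = ET.SubElement(urlset, "url")
        ET.SubElement(url, "loc").text = loc
        ET.SubElement(url, "lastmod").text = lastmod
        # Priority is how we would 'grade' pages relative to each other.
        ET.SubElement(url, "priority").text = f"{priority:.1f}"
    return ET.tostring(urlset, encoding="unicode")


# Hypothetical page list: dates, priorities and the second URL are invented.
pages = [
    ("http://www.archiveshub.ac.uk/railways.shtml", "2009-06-15", 0.8),
    ("http://www.archiveshub.ac.uk/example-feature.shtml", "2009-05-01", 0.5),
]
sitemap_xml = build_sitemap(pages)
```

The resulting string would be saved as a file such as sitemap.xml at the root of the site and submitted to the search engines. Grading pages for priority then comes down to choosing the third value in each tuple.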

2. Metadata

We have been working on the page metadata. In particular we have minimised duplicate title and description tags, ensured all pages have title tags and thought a bit more about the content of the title and description tags – does the title properly represent the page? Is the description an effective summary of the content with important keywords? It is important to think about this from the perspective of the robots – what are the words that will be most useful for them, in terms of search engine searches?

For example, where we had a metadata title ‘Archives Hub: For Archivists’, we had a heading for the same page ‘Contributing to the Archives Hub’. Ideally these should be the same and we should decide which terms are most important – should ‘archivists’ be in the main heading? Should ‘contributing’ be in the title tag? We have also started to reverse our page titles so that the subject of the page is entered first of all, so not ‘Archives Hub: Contributors’ but ‘Contributors to the Archives Hub’.
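Checks like these can be partly automated. The sketch below is one possible approach in Python, not the tool we actually used: it looks for pages with a missing title tag, pages that share a title, and pages where the title tag and main heading differ. The page paths and HTML snippets are invented for illustration, and the names `TitleAndHeadingParser` and `audit_pages` are hypothetical.

```python
from collections import defaultdict
from html.parser import HTMLParser


class TitleAndHeadingParser(HTMLParser):
    """Collects the text of the <title> tag and the <h1> heading of a page."""

    def __init__(self):
        super().__init__()
        self._current = None
        self.title = ""
        self.heading = ""

    def handle_starttag(self, tag, attrs):
        if tag in ("title", "h1"):
            self._current = tag

    def handle_endtag(self, tag):
        if tag == self._current:
            self._current = None

    def handle_data(self, data):
        if self._current == "title":
            self.title += data
        elif self._current == "h1":
            self.heading += data


def audit_pages(pages):
    """pages maps a page path to its HTML; returns a list of (path, problem)."""
    paths_by_title = defaultdict(list)
    problems = []
    for path, html in pages.items():
        parser = TitleAndHeadingParser()
        parser.feed(html)
        parser.close()
        title = parser.title.strip()
        heading = parser.heading.strip()
        if not title:
            problems.append((path, "missing title tag"))
        else:
            paths_by_title[title].append(path)
        if title and heading and title != heading:
            problems.append((path, "title and heading differ"))
    for title, paths in paths_by_title.items():
        if len(paths) > 1:
            problems.extend((path, "duplicate title") for path in paths)
    return problems


# Invented example pages; only the first title/heading pair comes from this post.
pages = {
    "/arch/index.html": "<html><head><title>Archives Hub: For Archivists</title></head>"
                        "<body><h1>Contributing to the Archives Hub</h1></body></html>",
    "/railways.shtml": "<html><head><title>Railway history</title></head>"
                       "<body><h1>Railway history</h1></body></html>",
    "/textiles.shtml": "<html><head><title>Railway history</title></head>"
                       "<body><h1>History of textiles</h1></body></html>",
    "/fairs.shtml": "<html><head></head><body><h1>Fairs</h1></body></html>",
}
problems = audit_pages(pages)
```

A report like this only flags candidates. Deciding which wording should win, and which keywords matter most, is still an editorial judgement.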

3. Headings

As stated above, we are getting the metadata title and page title to correspond, and we are also thinking about the importance of the page headers for search engines. In the past we have had monthly features with titles like ‘Wabsters and Shewsters’. Whilst this might work as an intriguing title for a user, it will not help a user searching for Scottish textile history.

4. URLs

It is worth ensuring that at least one of the important keywords is in the URL for a page. So, a page on railway history should have a URL like http://www.archiveshub.ac.uk/railways.shtml and the title ‘Railway history: 200 years of the steam locomotive’.
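A quick way to check this across a set of pages might look like the following Python sketch. It does a simple case-insensitive substring match against the path of the URL. The function name `keywords_in_url` and the keyword list are assumptions chosen for the example.

```python
from urllib.parse import urlparse


def keywords_in_url(url, keywords):
    """Return the keywords that appear in the path part of the URL."""
    path = urlparse(url).path.lower()
    return [keyword for keyword in keywords if keyword.lower() in path]


# Keyword list is illustrative, taken from the example page title.
found = keywords_in_url(
    "http://www.archiveshub.ac.uk/railways.shtml",
    ["railway", "history", "locomotive"],
)
```

An empty list for a page would suggest that its URL is worth renaming when the opportunity arises.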

5. Work on those keywords

We have worked on including keywords throughout the text, and especially in the first few lines. The inclusion of suggested websites and suggested reading provides a legitimate excuse to repeat keywords, both in their titles and in the annotations.

Other recommendations

There were other recommendations that we intend to implement over time, but did not have the resources to implement immediately – and some of them will more rationally fit into a redesign of our website (which is happening over the next 6 months).

1. Minimise use of tables

2. Change directory names to something more meaningful, e.g. ‘institution’ instead of ‘inst’, or ‘archivist’ instead of ‘arch’

3. Encourage external sites to link to the Hub site. This is an ongoing activity, but it should be easier with our new Website, and with our new approach to monthly features. We will also be able to link to Hub descriptions from sites like the National Register of Archives because we will have persistent URLs for all descriptions.

Web ranking reports

We have been working with Alan K’necht, an SEO expert, and Thierry Arsenault from the Canadian Heritage Information Network (CHIN). Alan has provided us with weekly Web ranking reports. These reports are based upon some agreed search terms that we are using. We created three pages for three subject areas where the Archives Hub has strong collection representation: fairs and circus history, history of textiles and British railway history. For all of these subjects we already had a monthly feature that we had created, so we could use the pages that already existed and just work on them to make them more optimised for search engines.

Conclusions so far

So, has it worked? Take ‘fairground history’ as an example. On April 13th, this was at 30 in the Google rankings and at 14 in Google UK rankings. By May 11th it was at 11 in Google and 7 in Google UK. By June 6th it had moved to 6 in the rankings, and a quick search on Google UK now (17th June) puts it at number 3.

Railway history is maybe a more challenging topic, as we are competing with a huge amount of information. ‘Railway history UK’ was not ranked at the start of the project, but by 15th June it was at 15 in the rankings for Google, and at 11 for Google UK. A search on Google of just pages from the UK currently brings the page up to number 6 in the list.

Of course, the challenge with Google is to get the URL in the first page of results, and it is always a moveable feast, so if the page ranks highly one week, it may not do so the next. However, the work that we have done has clearly made an improvement to our rankings, and if we apply the lessons learnt to our other feature pages, we should be able to attract more people to the Archives Hub Website.

The principle of the JISC study was that ‘implementing a few simple and inexpensive Search Engine Optimisation (SEO) techniques can increase an organisation

International Archives Day 9th June 2009

Did you know that today is International Archives Day?

This is the 2nd International Archives Day ever held and 9th June was chosen because the International Council on Archives (ICA) was founded on 9th June 1948. Last year was the First International Archives Day, coinciding with the 60th Anniversary of ICA.

For more information about this and the history of ICA, go to the Unesco Archives website.


Over the last year the Archives Hub has had over 120,000 visits from over 184 countries. The map above gives an indication of international use.

One of our contributors, Glasgow University Archive Services, is celebrating International Archives Day by launching an online resource highlighting the international scope and reputation of Glasgow University and its archive collections.

The exhibition, searchable by region, will demonstrate the involvement of Scottish businesses in the development of the world economy and the influence that the University of Glasgow and its staff and students have had on the development of education around the world and on the history of many countries.

To go to the resource please see the following link: http://www.gla.ac.uk/services/archives/collections/internationalarchiveday/

If you are interested in international archives you could try the following websites and blogs:

Websites:
ArchiveGrid: A subscription site where you can find historical documents, personal papers, and family histories held in archives around the world.

European Archive: A freely available digital library of archives, with an emphasis on audio-visual materials.

MICHAEL UK: MICHAEL aims to provide simple and quick access to the digital collections of museums, libraries and archives from different European countries.

Unesco Archives Portal: a gateway to international archive collection websites

OCLC WorldCat (Manuscript materials): nearly 1.5 million catalogue records describing archival and manuscript collections and individual manuscripts in public, college and university, and special libraries located throughout North America and around the world.

Blogs:
Archiefforum.be: An online community which aims to support students and young archivists in their studies and profession by peer help and advice. (Flemish language)

ArchivesBlogs: a US blog which is a syndicated collection of blogs by and for archivists.

@rchivista: Spanish language blog written by Paco Fern

Thoughts on context and bias

The importance of context is always emphasised when thinking about how to present archives to researchers. At a recent seminar series I attended in the beautiful town of Lewes, East Sussex (pictured), Mike Savage of the University of Manchester talked about a well-known social survey by Elizabeth Bott, carried out in the 1950s, where 32 couples were interviewed about their relationships. Much of the contextual material was left out of the resultant book, so it was effectively stripped away from the findings. But closer analysis of the survey shows that the selection of the couples themselves was significant – the notes (unpublished) reveal why people volunteered for the study. There was quite a long process of application and most people who ended up taking part had interest in the research as a social activity. This is an important piece of the whole picture and would have had an effect on the findings. The research process itself is an important part of the whole picture.

Social scientists need to find methods to extract key findings from diverse archive sources, often covering long periods. Mike referred to the need to avoid the ‘juicy quotes syndrome’ and talked in detail about sampling methods, all of which have their pros and cons. He referred, for example, to ‘trend analysis’, which strips out the contextual detail (e.g. economic indicators, studies of changing attitudes). Processes and methods get forgotten about.

Archived qualitative data does not allow this abstraction from context and hence cannot deploy representative or aggregate findings. In this sense, qualitative data may have something to teach the social scientist in terms of the importance of context.

Archivists need to think carefully about the whole picture: what they are presenting to users and what they are leaving out. The whole question of subjectivity is a complex one. The social scientist must build the biases of inquiry into their analysis of qualitative data, and this distinguishes it from quantitative data. There is a need to develop clear analytical strategies to allow rigorous yet partial examination of such data – it is important not to give a false sense of the completeness of the data.

At the seminar, there was a great deal of discussion about methodology, the bias of the archive and the life of the archive itself. A particularly interesting talk from Carolyn Hamilton of the University of Cape Town referred to ways of using archival sources to study pre-colonial South Africa. The colonial archive is itself an expression of the power and dominance of the ruling elite – so what can it meaningfully say about the indigenous population? It is profoundly contaminated as evidence, and yet by the very act of proclaiming their dominance, the rulers shed light on those they claim the right to rule. In fact, the colonial archive brims with material germane to the pre-colonial past, but it is important to think about how to approach it and analyse it. Historians tend to study the archive ‘against the grain’ in order to mine it against its basic bias.

A similar situation of bias, although in a very different context, occurs with a community ‘archive’ website such as MyBrightonAndHove: www.mybrightonandhove.org.uk. Jack Latimer of QueenSpark Books talked about how this Website has become a very successful community website where people post images, stories and comments about their local community and history. It is very active, with around 1,300 visits per day and around 10-20 comments put up per day. But of course, this is also a skewed history – maybe a history that is born out of nostalgia, and obviously a self-selecting group of people.

John Hay, of the University of Wolverhampton, gave us a very engaging presentation about archives relating to deaf people and deaf culture. One thing that struck me was his wish to have an archive that represents the achievements of deaf people within society – here we come to another sort of bias. This does, of course, sound like a very worthwhile idea, especially, as John explained, when you consider how the deaf have been treated in the past, pretty much as second class citizens and victims of an affliction. But it does raise the question of whether an archive should have a goal of celebration or creating a certain image. Should it actually seek to gather any and all materials and artefacts that reflect the history of deaf people in the UK? Or is it perfectly valid to want to create something that is intended to be positive and affirming?

Archives may be a result of discourses and may in turn mould discourses, which in turn may give shape to practices that shape the archive. This, as Ann Cvetkovich of the University of Texas postulated, could be thought of as the public life of the archive. If we accept that the archive has a public life, then maybe it requires, methodologically, its own biography. The Archive acquires a provenance and is part of the history of the institution housing it. The Archive itself could be seen as a biographical subject.


Use of archives by social scientists

I have just attended two seminars as part of a project on Archiving and Reusing Qualitative Data: Theory, Methods and Ethics Across Disciplines. They provided a great deal of food for thought, as seminars like this so often do. These seminars were particularly valuable because they drew together academics, particularly social scientists, and archivists. Many of the participants were oral historians, and the challenges of oral history ran through many of the talks.

When archivists think about archival theory and description, they are generally thinking about archives as materials ‘created by an individual or organisation in the course of their life or work and considered worthy of permanent preservation’ (my quotes, to indicate that this is a classic definition of archives). But if we think about archives as any records considered worthy of preservation and with value for future researchers, then we can expand the definition to include records that social scientists refer to as archives. For them, archives are often data sets, created by researchers in the course of their research and then, possibly, reused.

Social scientists do not necessarily think in terms of business records or personal letters, or archives as a reflection of personal or organisational activity. They think in terms of longitudinal studies and oral histories; quantitative and qualitative data. These are archives that generally are created for the purposes of research, and so the perspective is rather different to those created in the course of individual or organisational activity. We have the UK Data Archive which has ‘the largest collection of digital data in the social sciences and humanities in the UK’, and this houses the History Data Service which ‘promotes the use of digital resources, which result from or support historical research, learning and teaching’, but I don’t think that there is a general sense amongst archivists that these are part of the archive community, in the sense that trainee archivists don’t really think about working for a data archive, and archival theory doesn’t appear to really encompass this type of archive. Certainly social scientists clearly see archives as both data archives (data sets) and traditional archives (archives as reflections of past activity), and the fact that the two were not explicitly distinguished during the seminars was striking in itself.

It may be that data archives require different ways of thinking to ‘historical archives’, in terms of how they are organised and managed, but now that archives are increasingly digital, and as all archives are a valuable source for research, surely there is sense in the two communities moving closer together?

Historians’ use of archives

I have recently read an interesting article by Wendy Duff, Barbara Craig and Joan Cherry in Archivaria (58), published by the Canadian Society of Archivists in 2004 (sorry, I have a tendency to get to some articles a bit late!). It looked at historians’ use of archives (using 173 responses to a questionnaire). Whilst the study was carried out in a Canadian context, many of the observations and conclusions have wider resonance. Here I just draw out some of the points made in the study that are relevant in some way to the Archives Hub.

Historians were chosen for this survey because ‘their work has an impact far beyond their own academic communities, saturating text books used in public education and influencing new generations of undergraduate and graduate students’. I think this point is worth making more often because sometimes the academic users of archives are not recognised as a substantial group, but if we measure that level of use in terms of their overall influence, their impact would be seen as far greater. A study in 2003 (Helen R. Tibbo, American Archivist 66, no.1) surveyed 700 historians and found that 43% used the Internet to locate material. The survey suggests that historians may be characterised as users who consult a number of archival repositories rather than maybe just visiting one or two. The study also suggested that ‘university archives play a vital role in historical research’ and went on to say:

‘Perhaps what university archives lose in breadth [compared to government and local archives] they make up in availability, or their collections may be particularly valuable to the study of social history’.

One of the points that caught my attention was the observation that ‘historians tend to depend upon an informal network for finding material for their research.’ This network may include archivists as well as colleagues, but certainly if it is the case that this observation carries over to the UK (which I believe it does) then it does highlight one of the difficulties of making academics and historians aware of the resources that are available to them, such as the Archives Hub.

One of the questions asked about barriers to archival research. This threw up the lack of a finding aid as the second largest barrier (47%), the lack of detail of a finding aid (31%) and problems with finding aids being out of date (19%). In terms of formats, 92% liked the original format the most, so no surprises there, and only 2% liked digital reproductions the most. Indeed, there is a continuing tendency to print out documents for use. However, the conclusion to the study certainly emphasises that historians would benefit from not only having speedy access to good, detailed finding aids from their computers, which appears to be pretty much the main priority of the respondents, but also having links to digitised historical documents. It does point out that the historian’s preference for completeness ‘suggests that the digitization of selections of materials might meet some of their needs, but only if such selections are provided with explicit descriptions of what has been selected and why.’

This sort of study, examining use and user needs, raises the question in my mind of whether we should give people what they say they want or what we think they want…or maybe even what we think they will start to want at some point. Let’s face it, how many of us would have actually asked for many of the features that we now get from sites like Amazon? We may not have explicitly wanted them, but once they are there many of us certainly do use them and find them valuable. So maybe it’s a careful balance of understanding and anticipation when it comes to meeting the needs of the user.