Archives and the Researchers of Tomorrow

“In 2009, the British Library and JISC commissioned the three-year Researchers of Tomorrow study, focusing on the information-seeking and research behaviour of doctoral students in ‘Generation Y’, born between 1982 and 1994 and not ‘digital natives’. Over 17,000 doctoral students from more than 70 higher education institutions participated in the three annual surveys, which were complemented by a longitudinal student cohort study.” (Taken from http://www.jisc.ac.uk/publications/reports/2012/researchers-of-tomorrow#exec_sum).

This post picks up on some aspects of the study, particularly those that are relevant to archives and archivists. I am assuming that archivists fall, at least to an extent, into the category of ‘library professionals’, though our profession is not explicitly mentioned. I would recommend reading the report in full as it offers some useful insights into the research behaviour of an important group of researchers.

What is heartening about this study is that the findings confirm that Generation Y doctoral students are “sophisticated information-seekers and users of complex information sources”. The study does conclude that information-seeking behaviour is becoming less reliant on the support of libraries and library staff, which may have implications for the role of libraries and archive professionals, but “library staff assistance with finding/retrieving difficult-to-access resources” was seen as one of the most valuable research support resources available to students, although only a relatively small proportion of students used this service. There was a preference for this kind of on-demand one-to-one support rather than formal training sessions. One of the students said: “the librarians are quite possibly the most enthusiastic and helpful people ever, and I certainly recommend finding a librarian who knows their stuff, because I have had tremendous amounts of help with my research so far, just simply by asking my librarian the right question.”

[Image from: http://www.freedigitalphotos.net]

The survey concentrated on the most recent information-seeking activity of students, and found that most were not seeking primary sources, but rather secondary sources (largely journal articles and books).

This apparent and striking dependence on published research resources implies that, as the basis for their own analytical and original research, relatively few doctoral students in social sciences and arts and humanities are using ‘primary’ materials such as newspapers, archival material and social data.

This finding was true across all subject disciplines and all ages. The study found that about 80% of arts and humanities students were looking for any bibliographic references on their topic or specific publications, while only 7% were looking for non-published archival material. It seems that this reliance on published information is formed early on in their research, as students lay the groundwork for their PhD. Most students said they used academic libraries more in their first year of study – whether visiting or using the online services, so maybe this is the time to engage with students and encourage them to use more diverse sources in the longer-term.

A point that piqued my interest was that the arts and humanities students visiting other collections in order to use archival sources would do so even if “many of the resources they required had been digitised”, but this point was not explained further, which was frustrating. Is it because they are likely to use a mixture of digital and non-digital sources? Or maybe if they locate digital content it stimulates their interest to find out more about using primary sources?

Around 30% used Google or Google Scholar as their main information gathering tool,  although arts and humanities students sourced their information from a wider spread of online and offline sources, including library catalogues.

One thing that concerned me, and that I was not aware of, was the assertion that “current citation-based assessment and authenticity criteria in doctoral and academic research discourage the citing of non-published or original material”. I would be interested to know why this is the case, as surely it should be actively encouraged rather than discouraged? How does this fit with the need to demonstrate originality in research?

Students rely heavily on help from their supervisors early on in their research, and supervisors have a great influence on their information and resource use. I wonder if they actively encourage the use of primary sources as much as we would like?  I can’t help thinking that a supervisor enthusiastically extolling the importance and potential impact of using archives would be the best way to encourage use.

There continues to be a feeling amongst students, according to this study, that “using social media and online forums in research lacks legitimacy” and that these tools are more appropriate within a social context. The use of Twitter, blogs and social bookmarking was low (2009 survey: 13% of arts and humanities students had used and valued Twitter) and use was more commonly passive than active. There was a feeling that new tools and applications would not transform the way that the students work, but should complement and enhance established research practices and behaviour. However, it should be noted that use of ‘Web 2.0’ tools increased over the three years of the study, so it may be that a similar study carried out in five years’ time would show significantly different behaviour.

Students want training in research geared towards their own subject area, preferably face-to-face. Online tutorials and packages were not well used. The implication is that training should be at a very local level and done in a fairly informal way. Generic research skills are less appealing. Research skills training days are valued, but if they are poor and badly taught, the student feels their time is wasted and may be put off trying again. Students were quite critical of the quality and utility of the training offered by their university mainly because (i) it was not pitched at the right level (ii) it was too generic or (iii) it was not available on demand. Library-led training sessions got a more positive response, but students were far less likely to take up training opportunities after the first year of their PhD.  Training in the use of primary sources was not specifically covered in the report, though it must be supposed this would be (should be!) included in library-led training.

The study indicated that students dislike reading (as opposed to scanning) on screen. This suggests that it is important to provide online information that is easy to scan, and that it is worth providing a PDF to print out, especially for detailed descriptions.

One quote stood out for me, as it seems to sum up the potential danger of modern ways of working in terms of approaches to more in-depth analysis:

“The problem with the internet is that it’s so easy to drift between websites and to absorb information in short easy bites that at times you forget to turn off the computer, rest your eyes from screen glare and do some proper in-depth reading. The fragments and thoughts on the internet are compelling (addictive, even), and incredibly useful for breadth, but browsing (as its name suggests) isn’t really so good for depth, and at this level depth is what’s required.” (Arts and humanities)

We do often hear about the internet, or computers, tending to reduce levels of concentration. I think that this point is subtly different though – it’s more about the type of concentration required for in-depth work, something that could be seen as essential for effective primary source research.

Conclusions

We probably all agree that we can always do more to promote the importance of archives to all potential users, including doctoral students. Certainly, we need to make it easier for them to discover sources through the routes that they usually use, which for one thing means ensuring we have a profile via Google and Google Scholar. Too many archives still resist this requirement, as if it is somehow demeaning or too populist, or maybe because they are too caught up in developing their own websites rather than thinking about search engine optimisation. Or maybe it is just because archivists are not sure how to achieve good search engine rankings?

Are we actively promoting a low-barrier, welcoming and open approach? I recall many archive institutions that routinely state that their archives are ‘open to bona fide researchers only’. Language like that seems to me to be somewhat off-putting. Who are the ‘non-bona fide’ researchers that need to be kept out? This sort of language does not seem conducive to the principle of making archives available to all.

The applications we develop need to be relatively easy to absorb into existing research work practices, which will only change slowly over time. We should not get too caught up in social networks and Web 2.0 as if these are ‘where it’s at’ for this generation of students. Maybe the approaches to research are generally more traditional than we think.

The report itself concludes that the lack of use of primary sources is worrying and requires further investigation:

There is a strong case for more in-depth research among doctoral students to determine whether the data signals a real shift away from doctoral research based on primary sources compared to, say, a decade ago. If this proves to be the case there may be significant implications for doctoral research quality related to what Park described as “widely articulated tensions between product (producing a thesis of adequate quality) and process (developing the researcher), and between timely completion and high quality research”.


Online Survey Results (2011)

We would like to share some of the results of our annual online survey, which we run over a 3-4 week period. We aim for about 100 responses (though obviously more would be very welcome!), and for this survey we got 92 responses. We create a pop-up invitation to fill out the survey – something we do not like to do, but we do feel that it attracts more responses than a simple link.

Context

We have a number of questions that are replicated in surveys run for Zetoc and Copac, two bibliographic JISC-funded Mimas services, and this provides a means to help us (and our funders) look at all three services together and compare patterns of use and types of user.

This year we added four questions specifically designed to help us with understanding users of the Hub and to help us plan our priorities.

We aim to keep the number of questions down to about 12 at the most, and ensure that the survey will take no longer than 10 minutes to complete. But we also want to provide the opportunity for people to spend longer and give more feedback if they wish, so we combine tick lists and radio boxes with free text comments boxes.

We take the opportunity to ask whether participants would be willing to provide more feedback for us, and if they are potentially willing, they provide their email address. This gives us the opportunity to ask them to provide more feedback, maybe by being part of a focus group.

Results of the Survey

Profile

  • The vast majority of respondents (80%) are based in the UK for their study and/or work.
  • Most respondents are in the higher education sector (60%). A substantial number are in the Government sector and also the heritage/museum sector.
  • 20% of those using the Hub are students – maybe fewer than we would hope, but a significant number.
  • 10% are academics – again, fewer than we would hope, but it may be that academics are less willing to fill in a survey.
  • 50% are archivists or other information professionals. This is a high number, but it is important to note that it includes use of the Hub on behalf of researchers, to answer their enquiries, so it could be said to represent indirect use by researchers.
  • The majority of respondents use the service once or twice a month, although usage patterns were spread over all options, from daily to less than once a month, and it is difficult to draw conclusions from this, as just one visit to the Hub website may prove invaluable for research.

Use and Recommendation

[Graph: value of the Hub]

  • A significant percentage – 26% – find the Hub ‘neither easy nor difficult’ to use, and 3% of the respondents found it difficult to use, indicating that we still need to work on improving usability (although note that a number of comments were positive about ease of use).
  • 73% agree their work would take longer without the Hub, which is a very positive result and shows how important it is to be able to cross-search archives in this way.
  • A huge majority – 93% – would recommend the Hub to others, which is very important for us. We aim to achieve 90% positive in this response, as we believe that recommendations are a very important means for the Hub to become more widely known.

Subject Areas

We spent a significant amount of time creating a list of subjects that would give us a good indication of disciplines in which people might use the Hub. The results were:

    • History 47
    • Library & Archive Studies 33
    • English Literature 17
    • Creative & Performing Arts 16
    • Education & Research Methods 10
    • Predominantly Interdisciplinary 9
    • Geography & Environment 5
    • Political Studies & International Affairs 5
    • Modern Languages and Linguistics 4
    • Physical Sciences 4
    • Special Collections 4
    • Architecture & Planning 3
    • Biological & Natural Sciences 3
    • Communication & Media Studies 3
    • Medicine 3
    • Theology & Philosophy 3
    • Archaeology 2
    • Engineering 2
    • Psychology & Sociology 2
    • Agriculture 1
    • Law 1
    • Mathematics 1
    • Business & Management Studies 0
  • History is, not surprisingly, the most common discipline, but literature, the arts, education and also interdisciplinary work all feature highly.
  • There is a reasonable amount of use from the subjects that might be deemed to have less call for archives, showing that we should continue to promote the Hub in these areas and that archives are used in disciplines where they do not have a high profile. It would be very valuable to explore this further.

[Graph: use of archival websites]

  • The Hub is often used along with other archival websites, particularly The National Archives and individual record office websites, but a significant number do not use the websites listed, so we cannot assume prior knowledge of archives.
  • It would be interesting to know more about patterns of use. Do researchers try different websites, and in what order do they visit them? Do they have a sense of what the different sites offer?
  • There is still low use of the European aggregators, Europeana and APENet, although at present UK archives are not well represented on these services and arguably they do not have a high profile amongst researchers (the Hub is not yet represented on these aggregators).

Subsequent activities

  • It is interesting to note that 32% visit a record office as a result of using the Hub, but 68% do not. It would be useful to explore this further, to understand whether the use of the Hub is in itself enough for some researchers. We do know that for some people, the description holds valuable information in and of itself, but we don’t know whether the need to visit a record office, maybe some distance away, prevents use of the archives when they might be of value to the researcher.

What is of most value?

[Graph: what is most valuable to researchers]

  • We asked about what is important to researchers, looking at key areas for us. The results show that comprehensive coverage still tops the polls, but detailed descriptions also continue to be very important to researchers, somewhat in opposition to the idea of the ‘quick and dirty’ approach. More sophisticated questioning might draw out how useful basic descriptions are compared with no description and what sort of level of detail is acceptable.
  • Links to digital content and information on related material are important, but not as important as adding more descriptions and providing a level of detail that enables researchers to effectively assess archives.
  • Searching across other cultural heritage resources at the same time is maybe surprisingly less of a priority than content and links. It is often assumed that researchers want as much diverse information as possible in a ‘one-stop shop’ approach, but maybe the issues with things like the usability of the search, navigation, number of results and relevance ranking of results illustrate one of the main challenges: creating a site that holds descriptions and links to very varied content while still ensuring that it is easily understandable and that researchers know what they are getting.
  • The regional search was not a high priority but a significant medium priority, and it might be argued that not all researchers would be interested in this, but some would find it particularly useful, and many archivists would certainly find it helpful in their work.
  • We provided a free text box for participants to say what they most valued. The ability to search across descriptions, which is the most basic value proposition of the Hub, came out top, and breadth of coverage was also popular, and could be said to be part of the same selling point.
  • It was interesting to see that some respondents cited the EAD Editor as the main strength for them, showing how important it is to provide ways for archivists to create descriptions (it may be thought that other means are at their disposal, but often this is not the case).
  • Six people referred to the importance of the Hub for providing an online presence, indicating that for some record offices, the Hub is still the only way that collections are surfaced on the Web.

What would most improve the Hub?

  • We had a diversity of responses to the question about what would most improve the Hub, maybe indicating that there are no very obvious weaknesses, which is a good thing. But this does make it difficult for us to take anything constructive from the answers, because we cannot tell whether there is a real need for a change to be made. However, there were a few answers that focused on the interface design, and some of these issues should be addressed by our new ‘utility bar’ which is a means to more clearly separate the description from the other functions that users can then perform, and should be implemented in the next six months.

Conclusions

The survey did not throw up anything unexpected, so it has not materially affected our plans for development of the Hub. But it is essentially an endorsement of what we are doing, which is very positive for us. It emphasised the importance of comprehensive coverage, which is something we are prioritising, and the value of detailed descriptions, which we facilitate through the EAD Editor and our training opportunities and online documentation. Please contact us if you would like to know more.

A Web of Possibilities

“Will you browse around my website”, said the spider to the fly,
“’Tis the most attractive website that you ever did spy”

[Image of spider from Wellcome Images]

All of us want to provide attractive websites for our users. Of course, we’d like to think it’s not really the spider/fly kind of relationship! But we want to entice and draw people in and often we will see our own website as our key web presence; a place for people to come to find out about who we are, what we have and what we do and to look at our wares, so to speak.

The recently released ‘Discovery’ vision is to provide UK researchers with “easy, flexible and ongoing access to content and services through a collaborative, aggregated and integrated resource discovery and delivery framework which is comprehensive, open and sustainable.”  Does this have any implications for the institutional or small-scale website, usually designed to provide access to the archives (or descriptions of archives) held at one particular location?

Over the years that I’ve been working in archives, announcements about new websites for searching the archives of a specific institution, or the outputs of a specific project have been commonplace.  A website is one of the obvious outputs from time-bound projects, where the aim is often to catalogue, digitise or exhibit certain groups of archives held in particular repositories. These websites are often great sources of in-depth information about archives. Institutional websites are particularly useful when a researcher really wants to gain a detailed understanding of what a particular repository holds.

However, such sites can present a view that is based more around the provider of the information rather than the receiver. It could be argued that a researcher is less likely to want to use the archives because they are held at a particular location, apart from for reasons of convenience, and more likely to want archives around their subject area, and it is likely that the archives which are relevant to them will be held in a whole range of archives, museums and libraries (and elsewhere). By only looking at the archives held at a particular location, even if that location is a specialist repository that represents the researcher’s key subject area, the researcher may not think about what they might be missing.

Project-based websites may group together archives in ways that benefit researchers more obviously, because they are often aggregating around a specific subject area, for example by making available the descriptions and links to digital archives around a research topic. Value may be added through rich metadata, community engagement and functionality aimed at a particular audience. Sometimes the downside here is the sustainability angle: projects necessarily have a limited life-span, and archives do not. Archives are ever-changing and growing, and descriptions need to be updated all the time.

So, what is the answer? Is this too much of a silo-type approach, creating a large number of websites, each dedicated to a small selection of archives?

Broader aggregation seems like one obvious answer. It allows for descriptions of archives (or other resources) to be brought together so that researchers have the benefit of searching across collections, bringing together archives by subject, place, person or event, regardless of where they are held (although there is going to be some kind of limit here, even if it is at the national level).

You might say that the Archives Hub is likely to be in favour of aggregation! But it’s definitely not all pros and no cons. Aggregations may offer a powerful search functionality for intellectually bringing together archives based on a researcher’s interests, but in some ways there is a greater risk around what is omitted. When searching a website that represents one repository, a researcher is more likely to understand that other archives may exist that are relevant to them. Aggregations tend to promote themselves as comprehensive – if not explicitly then implicitly – which creates an expectation that cannot ever fully be met. They can also raise issues around measuring impact and around licensing. There is also the risk of a proliferation of aggregation services, further confusing the resource discovery landscape.

Is the ideal of broad inter-disciplinary cross-searching going to be impeded if we compete to create different aggregations? Yes, maybe it will be to some extent, but I think that it is an inevitability, and it is valid for different gateways to service different audiences’ needs. It is important to acknowledge that researchers in different disciplines and at different levels have their own needs, their own specific requirements, and we cannot fulfil all of these needs by only presenting data in one way.

One thing I think is critical here is for all archive repositories to think about the benefits of employing recognised and widely-used standards, so that they can effectively interoperate and so that the data remains relevant and sustainable over time. This is the key to ensuring that data is agile, and can meet different needs by being used in different systems and contexts.

I do wonder if maybe there is a point at which aggregations become unwieldy, politically complicated and technically challenging. That point seems to be when they start to search across countries. I am still unsure about whether Europeana can overcome this kind of problem, although I can see why many people are so keen on making it work. But at present, it is extremely patchy, and, for example, getting no results for texts held in Britain relating to Shakespeare is not really a good result. But then, maybe the point is that Europeana is there for those that want to use it, and it is doing ground-breaking work in its focus on European culture; the Archives Hub exists for those interested in UK Archives and a more cross-disciplinary approach; Genesis exists for those interested in women’s studies; for those interested in the Co-operative movement, there is the National Co-operative Archive site; for those researching film, the British Film Institute website and archive is of enormous value.

So, is the important principle here that diversity is good because people are diverse and have diverse needs? Probably so. But at the same time, we need to remember that to get this landscape, we need to encourage data sharing and  avoid duplication of effort. Once you have created descriptions of your archive collections you should be able to put them onto your own website, contribute them to a project website, and provide them to an aggregator.

Ideally, we would be looking at one single store of descriptions, because as soon as you contribute to different systems, if they also store the data, you have version control issues. The ability to remotely search different data sources would seem to be the right solution here. However, there are substantial challenges. The Archives Hub has been designed to work in a distributed way, so that institutions can host their own data. The distributed searching does present challenges, but it certainly works pretty well. The problem is that running a server, operating system and software can actually be a challenge for institutions that do not have the requisite IT skills dedicated to the archives department.  Institutions that hold their own data have it in a great variety of formats. So, what we really need is the ability for the Archives Hub to seamlessly search CALM, AdLib, MODES, ICA AtoM, Access, Excel, Word, etc. and bring back meaningful results. Hmmm….
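To make the idea a little more concrete, here is a very rough sketch in Python of what searching across differently structured sources might involve: a small adapter per source maps its records to one common shape, and a single search then runs across all of them. This is only an illustration under stated assumptions. It is not how the Archives Hub actually works, and the field names ("Repository", "unit_title" and so on), the function names and the sample records are all invented for the example rather than taken from any real export format.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List


@dataclass
class Description:
    """A minimal common record; real archival descriptions carry far more."""
    repository: str
    title: str
    text: str


# Each adapter maps one source's native rows to the common Description shape.
# The field names are hypothetical, not those of any real system's export.
def from_spreadsheet(rows: Iterable[dict]) -> List[Description]:
    return [Description(r["Repository"], r["Title"], r["Notes"]) for r in rows]


def from_catalogue_db(rows: Iterable[dict]) -> List[Description]:
    return [Description(r["repo"], r["unit_title"], r["scope_content"]) for r in rows]


def cross_search(term: str, sources: Dict[str, List[Description]]) -> List[Description]:
    """Return every description whose title or text mentions the term."""
    needle = term.lower()
    hits = []
    for records in sources.values():
        for d in records:
            if needle in d.title.lower() or needle in d.text.lower():
                hits.append(d)
    return hits


# Invented sample data standing in for two differently structured sources.
sources = {
    "spreadsheet": from_spreadsheet([
        {"Repository": "Repository A", "Title": "Theatre programmes",
         "Notes": "Shakespeare productions"},
    ]),
    "catalogue": from_catalogue_db([
        {"repo": "Repository B", "unit_title": "Co-operative society minutes",
         "scope_content": "Committee records"},
        {"repo": "Repository B", "unit_title": "Letters on Shakespeare",
         "scope_content": "Correspondence"},
    ]),
}

results = cross_search("shakespeare", sources)
for d in results:
    print(d.repository, "|", d.title)
```

The hard part in practice is exactly what the sketch glosses over: writing and maintaining an adapter for every format, and deciding what the common record should contain.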

The business case for opening up data seems clear. Projects like Open Bibliographic Data have helped progress the thinking in this arena and raised issues and solutions around barriers such as licensing. But it seems clear that we need to understand more about the benefits of aggregation, and the different approaches to aggregation, and we need to get more buy-in for this kind of approach. Does aggregation allow users to do things that they could not do otherwise? Does it save them time? Does it promote innovation? Does it skew the landscape? Does it create problems for institutions because of the problems with branding and measuring impact? Furthermore, how can we actually measure these kinds of potential benefits and issues?

Websites that offer access to archives (or descriptions of archives) based on where they are located and based on the body that administers them have an important role to play. But it seems to me that it is vital that these archives are also represented on a more national, and even international stage. We need to bring our collections to where the users are. We need to ensure that Google and other search engines find our descriptions. We need to put archives at the heart of research, alongside other resources.

I remember once talking about the Archives Hub to an archivist who ran a specialist repository. She said that she didn’t think it was worth contributing to the Hub because they already had their own catalogue. That is, researchers could find what they wanted via the institute’s own catalogue on their own system, available in their reading room. She didn’t seem to be aware that this could only happen if they knew that the archive was there, and that this view rested on the idea that researchers would be happy to repeat that kind of search on a number of other systems. Archives are often about a whole wealth of different subjects – we all know how often there are unexpected and exciting finds. A specialist repository for any one discipline will have archives that reach way beyond that discipline into all sorts of fascinating areas.

It seems undeniable that data is going to become more open and that we should promote flexible access through a number of discovery routes, but this throws up challenges around version control, measuring impact, brand and identity. We always have to be cognisant of funding, and widely disseminated data does not always help us with a funding case because we lose control of the statistics around use and any kind of correlation between visits to our website and bums on seats. Maybe one of the challenges is therefore around persuading top-level managers and funders to look at this whole area with a new perspective?

Reinventing the wheel: the new Hub website

[Image: promotional postcard]

On 1st April 2010 the Archives Hub website changed. It was not just about a new look and feel, but a whole new site. The Hub team spent several months planning the new architecture, navigation and content. Most of the content was rewritten and this gave us a great opportunity to think about a coherent approach where we could be consistent in our tone and terminology and really think about what each page should say. We wanted the site to be intuitive and for each page to be useful and attractive, and not give an overwhelming amount of information.

We decided to introduce plenty of images, to lift the site visually, and we wanted to keep plenty of whitespace, to make it easy on the eye. In addition, the website designers, True North, helped us to think about our identity and the importance of presenting the Archives Hub in a way that conveys confidence, self-belief, professionalism and warmth.

The Archives Hub has getting on for 200 contributors now, which is quite an achievement, and we are very appreciative of the effort that our contributors put into creating descriptions for the Hub. We want to continue to develop the site with a focus on archivists as well as on researchers, as we see both groups of users as vital to us, and in fact they often overlap. We hope that our ‘Archivists’ section is helpful and informative for contributors and other information professionals interested in what we do and in issues around online data and interoperability.

Our Features section takes over from the old ‘Collections of the Month’ idea, bringing the same message about the breadth and depth of Hub content and enabling us to showcase contributors and wonderful collections.

Our ‘Researchers’ section is going to be expanded, although we are keen to keep it focussed and easy to scan and digest. We are looking at ways that we can continue to support researchers in using the Hub to the greatest advantage. Of course, the main way is to provide an effective search interface and to continue to expand the content.  And this brings us on to the search – as well as a whole new information site, we have upgraded our software. We are now using ‘Cheshire 3’, which enables us to provide functionality that we could not provide before. We will be talking more about that in subsequent blogs. The new software is running on all-new hardware, so in fact we really have fundamentally changed the whole Archives Hub, but we hope that we have retained what is good about the site and about our service.