Facing the Music: are researchers and information professionals dancing to different tunes?

Still of presentation at ELAG 2013
What are the chief weapons we need to use to improve the user experience?

At ELAG 2013 I gave a presentation with a colleague from The University of Amsterdam, Lukas Koster. We wanted to do something entertaining, but with a worthwhile message that we both feel strongly about. We believe that more needs to be done to integrate resources and provide them to researchers in a way that suits end-user needs. We gave a presentation where we urged our colleagues to ‘mind the gap’ between the perspective of the information professional – their jargon and their complicated systems, which often fail to link resources adequately – and the researcher, who wants an integrated approach, language that is not a barrier to use and expects the power of the Web to be used within a library context, just as they might when looking for music online.

Still of a presentation where a librarian is explaining the library system to a researcher
A researcher tries to make sense of the library systems

Our presentation included two sketches: one in a music shop, where a punter (the ‘seeker’) expects the shop owner (the ‘pusher’) to know who else bought this music and what they thought of it; and one in a library, where the seeker wants an overview of everything available, and they want to look at research data and other resources without struggling with different catalogue systems and terminology.

In our presentation we referred to the ‘seeker’ wanting a discipline-focussed approach (not format based), and access regardless of location. I highlighted one of the problems with searching by showing examples of search terms used on the Archives Hub where the researchers were confused by the results. The terms researchers use don’t always fit into our approach, using controlled vocabularies.  We talked about the importance of connections between information. Our profession is making headway here, but there is a long way to go before researchers can really pull things together across different systems.

I spoke about the danger of making assumptions about our users and showed some examples of the Archives Hub survey results. Researchers don’t always come to our websites knowing what they are or what they want; they don’t necessarily have the same understanding of ‘archives’ as we do. Lukas expanded more on our musical theme. We can learn from some of the initiatives in this area – such as the ability people have to explore the musical world in so many different ways through things like MusicBrainz. Lukas also showed examples of researcher interfaces, looking to pull things together for the end user. Isn’t the idea of giving the researcher the ability to manage all of their research in this way something libraries should be spearheading?

Image of a woman at a desk surrounded by books
A librarian contemplates the end of the index card…

We concluded that the vision of integrated, interconnected data is not easy. As information professionals we may have to move out of our comfort zones. But we don’t have any choice unless we want to be sidelined. This means that we need to change our mindsets (we talked about a ‘librarian lobe’!) and we need to actually think about whether it is us that needs to learn information literacy because we need to learn to think more like the end user!

Still of a scene in which the librarian cuts up a book for the researcher
The librarian has a frustrating time with a researcher who only wants one chapter!

See the slides on Slideshare.

The presentation is on YouTube, but be warned there are scenes of book cutting that may be upsetting to some!

 

ICT and the Student Experience

A HEFCE study from 2010 states that “96% of students use the internet as a source of information” (1). This makes me wonder about the 4% that don’t; it’s not an insignificant number. The same study found that “69% of students use the internet daily as part of their studies”, so 31% don’t use it on a daily basis (which I take to mean ‘very frequently’).

There have been many reports on the subject of technology and its impact on learning, teaching and education. This HEFCE/NUS study is useful because it concentrates on surveying students rather than teachers or information professionals. One of the key findings is that it is important to think about the “effective use of technology” and “not just technology for technology’s sake”. Many students still find conventional methods of teaching superior (a finding that has come up in other studies), and students prefer a choice in how they learn. However, the potential for use of ICT is clear, and the need to engage with it is clear, so it is worrying that students believe that a significant number of staff “lack even the most rudimentary IT skills”. It is hardly surprising that the experiences of students vary considerably when they are partly dependent upon the skills and understanding of their teachers, and whether teachers use technology appropriately and effectively.

Examples of interface terminology from archives sites
Random selection of interface terminology from archives sites.

At the recent ELAG conference I gave a joint presentation with Lukas Koster, a colleague from the University of Amsterdam, in which we talked about (and acted out via two short sketches) the gap between researchers’ needs and what information professionals provide. Thinking simply about something as seemingly obvious as the understanding and use of the term ‘archives’ is a good case in point. Should we ensure that students understand the different definitions of archives? The distinction between archives that are collections with a common provenance and archives that are artificial collections? The different characters of archives that are datasets, generally used by social scientists? The “abuse” of the term archives for pretty much anything that is stored in any kind of longer-term way? Should users understand archival arrangement and how to drill down into collections? Should they understand ‘fonds’, ‘manuscripts’, ‘levels’, ‘parent collection’? Or is it that we should think more about how to translate these things into everyday language and simple design, and how to work things like archival hierarchy into easy-to-use interfaces? I think we should take the opportunities that technology provides to find ways to present information in such a way that we facilitate the user experience. But if students are reporting a lack of basic ICT skills amongst teachers, you have to wonder whether this is a problem within the archive and library sector as well. Do information professionals have appropriate ICT skills fit for ensuring that we can tailor our services to meet the needs of the technically savvy user?

Should we be teaching information literacy to students? One of the problems with this idea is that they tend to think they are already pretty literate in terms of use of the internet. In the HEFCE report, a survey of 213 FE students found that 88% felt they were effective online researchers and the majority said they were self-taught. They would not be likely to attend training on how to use the internet. And there is a question over whether they need to be taught how to use it in the ‘right’ way, or whether information professionals should, in fact, work with the reality of how it is being used (even if it is deemed to be ‘wrong’ in some way).  Students are clear that they do want training “around how to effectively research and reference reliable online resources”, and maybe this is what we should be concentrating on (although it might be worth considering what ‘effective use of the internet’ and ‘effective research using the internet’ actually mean). Maybe this distinction highlights the problem with how to measure effective use of the internet, and how to define online or discovery skills.

A British Library survey from 2010 found that “only a small proportion [of students] …are using technology such as virtual-research environments, social bookmarking, data and text mining, wikis, blogs and RSS-feed alerts in their work.”  This is despite the fact that many respondents in the survey said they found such tools valuable. This study also showed that students turn to their peers or supervisors rather than library staff for help.

Part of the problem may be that the vast majority of users use the internet for leisure purposes as well as work or study, so the boundaries can become blurred, and they may feel that they are adept users without distinguishing between different types of use. They feel that they are ‘fine with the technology’, although I wonder if that could be because they spend hours playing World of Warcraft, or use Facebook or Twitter every day, or regularly download music and watch YouTube. Does that mean they will use technology in an effective way as part of their studies? The trouble is that if someone believes that they are adept at searching, they may not go that extra mile to reflect on what they are doing and how effective it really is. Do we need to adjust our ways of thinking to make our resources more user-friendly to people coming from this kind of ‘I know what I’m doing’ mindset, or do we have to disabuse them of this idea and re-train them (or exhort them to read help pages for example…which seems like a fruitless mission)? Certainly students have shown some concern over “surface learning” (skim reading, learning only the minimum, and not getting a broader understanding of issues), so there is some recognition of an issue here, and the tendency to take a superficial approach might be reinforced if we shy away from providing more sophisticated tools and interfaces.

The British Library report on the Information Behaviour of the Researcher of the Future reinforces the idea that there is a gulf between students’ assumptions regarding their ICT skills versus the reality, which reveals a real lack of understanding. It also found a significant lack of training in discovery and use of tools for postgraduate students. Studies like this can help us think about how to design our websites, and provide tools and services to help researchers using archives. We have the challenges of how to make archives more accessible and easy to discover as well as thinking about how to help students use and interpret them effectively: “The college students of the open source, open content era will distinguish themselves from their peers and competitors, not by the information they know, but by how well they convert that knowledge to wisdom, slowly and deeply internalized.” (Sheila Stearns, “Literacy in the University of 2025: Still A Great Thing”, from The Future of Higher Education, ed. by Gary Olson & John W Presley, (Boulder: Paradigm Publishers, 2009) pp. 98-99).

What are the Solutions?

We should make user testing more integral to the development of our interfaces. It requires resource, but for the Archives Hub we found that even carrying out 10 one-hour interviews with students and academics helped us to understand where we were making assumptions and how we could make small modifications that would improve our site. And our annual online survey continues to provide really useful feedback which we use to adjust our interface design, navigation and terminology. We can understand more about our users, and sometimes our assumptions about them are challenged.

graph showing where people came from who visited the Hub
Archives Hub survey 2013: Why did you come to the Hub today?

User groups for commercial software providers can petition to ensure that out-of-the-box solutions also meet users’ needs and take account of the latest research and understanding of users’ experiences, expectations and preferences in terms of what we provide for them. This may be a harder call, because vendors are not necessarily flexible and agile; they may not be willing to make radical changes unless they see a strong business case (i.e. income may be the strongest factor).

We can build a picture of our users via our statistics. We can look at how users came into the site, the landing pages, where they went from there, which pages are most/least popular, how long they spent on individual pages, etc. This can offer real insights into user behaviour. I think a few training sessions on using Google Analytics on archive sites could come in handy!
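As a rough illustration of the kind of questions listed above (landing pages, popular pages, time spent on a page), here is a minimal Python sketch. It does not use Google Analytics or any real export format: the record layout (session id, page path, seconds on page), the sample paths and the `summarise` function are all assumptions made up for the example, and the records are assumed to be in chronological order.

```python
from collections import Counter, defaultdict

# Hypothetical page-view records: (session id, page path, seconds on page).
# In practice these would come from an analytics export; this layout is an
# assumption for illustration, and records are assumed to be in time order.
PAGE_VIEWS = [
    ("s1", "/search", 40),
    ("s1", "/collection/123", 120),
    ("s2", "/collection/123", 60),
    ("s2", "/help", 15),
    ("s3", "/search", 20),
]


def summarise(page_views):
    """Return landing-page counts, page-view counts and mean seconds per page."""
    landing = Counter()
    views = Counter()
    total_seconds = defaultdict(int)
    seen_sessions = set()
    for session, page, seconds in page_views:
        if session not in seen_sessions:
            # The first page recorded for a session is its landing page.
            seen_sessions.add(session)
            landing[page] += 1
        views[page] += 1
        total_seconds[page] += seconds
    mean_seconds = {page: total_seconds[page] / views[page] for page in views}
    return landing, views, mean_seconds


landing, views, mean_seconds = summarise(PAGE_VIEWS)
```

Even a summary this simple shows whether people arrive at a search page or land directly on a collection description, which is the sort of evidence that can challenge assumptions about how a site is used.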

We can carry out testing to find out how well sites rank on search engines, and assess the sort of experience users get when they come into a specialist site from a general search engine. What is the text a Google search shows when it finds one of your collections? What do people get to when they click on that link? Is it clear where they are and what they can do when they get to your site?

 * * *

This is the only generation where the teachers and information professionals have grown up in a pre-digital world, and the students (unless they are mature students) are digital natives. Of course, we can’t just sit back and wait a generation for the teachers and information professionals to become more digitally minded! But it is interesting to wonder whether in 25 years’ time there will be much more consensus in approaches to and uses of ICT, or whether the same issues will be around.

Nigel Shadbolt has described the Web as “one of the most disruptive innovations we have ever witnessed” and at present we really seem to be struggling to find out how best to use it (and not use it), how and when to train people to use it and how and when to integrate it into teaching, learning and research in an effective way.

It seems to me that there are so many narratives and assessments at present – studies and reports that seem to run the gamut of positive to negative. Is technology isolating or socialising? Are social networks making learning more superficial or enabling richer discussion and analysis? Is open access democratising or income-reducing? Is the high cost of technology encouraging elitism in education? Does the fact that information is so easily accessible mean that researchers are less bothered about working to find new sources of information?  With all these types of debates there is probably no clear answer, but let us hope we are moving forward in understanding and in our appreciation of what the Web can do to both enhance and transform learning, teaching and research.

In With the New: open, flexible, user-centered

The 2013 Eduserv Symposium was held in the impressive (and very much ‘keep in with the old’) surroundings of One Great George Street in Westminster, the home of the Institution of Civil Engineers.

‘In with the New’ covered new skills sets, new modes of engagement and new ways of working.  With such a wide topic area, the conference took quite a broad-brush approach. Andy Powell of Eduserv introduced the day and talked about dealing with change, change that may be imposed upon us from the outside, as well as being driven internally.

image from Digital Govt Service
David Cotterill from the Government Digital Service gave the opening keynote, which is what I want to focus on here. He said his talk was about ‘my exciting life as a civil servant’… the audience weren’t convinced about this at the outset, but maybe for those interested in open data, there was some shift of opinion by the end!

He talked about the old consensus, which was built around long-term contracts for IT in government; contracts that were consistently awarded to a limited number of suppliers and not to smaller and more innovative suppliers.  IT was not defined as a core function, so out-sourcing was considered appropriate. But in the 21st century things have changed. There is recognition that IT covers very diverse areas. For Government (and for many other organisations), it covers digital public services, mission IT systems (i.e. more niche or specialised systems for government departments), desktop, infrastructure, connectivity, etc. (the more general IT), and, within government, there are also ‘shared services’ (such as for financial systems). David talked about the need to structure mission IT systems and digital public services so that they can run on different desktops or infrastructures and not be tied down (as often used to be the case).

David went on to argue that the Government really has taken up the open agenda, and showed some quotes: “The latest step is the publications of this report on open standards. And once again the government has got it right.” (Wall Street Journal).  He argued that in order to have flexibility to progress, to upgrade, to move forwards, you need open and standards based systems. You also need to look at specific needs in specific areas and not think of IT as some kind of monolithic thing.

It was surprising to hear him say that “this is a great time to be a supplier”, but he said that many of the current deals within government come to an end over the next few years, so there is opportunity for new suppliers and creating a more diverse set-up.

gov.uk screenshot
What is 21st century government about? David said it’s about things like www.gov.uk/, built using a platform approach (rather than a CMS) which allows the Government Digital Service (GDS) to build products onto it that meet user needs; products that enable the government to engage with citizens. David gave a sense of how this approach is working across UK government, with multi-disciplinary teams including developers, designers, product and service managers, policy, communications, etc.

His core message was to start with the user need. Of course, this is something that we can all agree with, although whether it always happens in reality is debatable, even if it is the intention. We need to shape things in terms of user requirements  right from the start, and not bring it in once all the policy, requirements and  development work is done. We should think about capturing requirements and developing alpha and then beta versions before going live. This may mean that what is initially developed is chucked out after the alpha stage, because it doesn’t meet needs, and then there is a need to start again. I think one of the problems with this approach is that funders do not necessarily facilitate it. How easy would it be to get funding for a project where the iterative process may go on for quite some time, and there is a risk of starting again several times in order to get it right? A further difficulty with this from a funding point of view is that it is much harder to specify what you are going to end up with, because you necessarily need to keep an open mind; you’ll end up (hopefully) with what users want, but it might be different to what was envisaged and you’ll only know after the testing and refining process.

It makes me think about archival software systems, for example. Surely you should put the user needs at the heart of the development of your system? Ideally you would start out by gathering user requirements for a system, maybe looking at other research done in this area. You’d end up with a specification, listing priorities for your system. Most archives can’t then build it themselves, so they would go out and look at what meets these needs. But would it be possible to test a system out with users, to see if it really does fulfill their needs, and if it doesn’t, go back and try something else? The problem here is that if you are buying a system, it’s hard to apply an iterative approach. However, it may be possible to move to a more user-centered approach. You should have clear evidence that the system does meet key user needs, and, in the absence of an ability to chop and change, you should ensure that the system does not tie you down and that it provides the flexibility to build and modify, so that changing priorities can be met.

image of keyboard 'save' key
It’s good to see Government leading the way. David showed previews of some services that are being developed, working towards a more transparent approach to things like transactional services, and he highlighted a government manual about building services that people want to use. There is now a ‘Standards Hub’, to promote open standards and also to encourage wider participation in solving data challenges. It is amazing to see Government code on GitHub. Somehow that really brought home to me how different things are now to 10-15 years ago. David, as well as other speakers at the conference, believes that open standards encourage a more efficient approach, so it becomes a cost-saving venture as well as encouraging public engagement and transparency.

Interoperability, data sharing and standards

I recently spoke at the CILIP MmIT group conference, where I inflicted EAD on a group of unsuspecting librarians. Not just EAD, but MARC and MODS XML and even some Linked Data. They may have said it was a bit like going back to library school, but no-one ran away.

I was talking to them about data sharing and interoperability, and asked them to look at resources described using different schema, to think about appropriateness: how well does the data format allow you to describe the resource? How machine-readable is it? How human-readable is it? How human/machine readable does it need to be? Is the format robust? Transformable? Sustainable? Interoperable?
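To make ‘machine-readable’ and ‘transformable’ a little more concrete, here is a minimal Python sketch that reads a stripped-down description and turns it into a plain dictionary. The fragment borrows a few EAD element names (`archdesc`, `did`, `unitid`, `unittitle`, `unitdate`) but it is not a complete or schema-valid EAD document, the collection and reference code are invented, and `flatten_description` is a hypothetical helper written only for this example.

```python
import xml.etree.ElementTree as ET

# A stripped-down fragment using a few EAD element names. It is illustrative
# only: not a complete or schema-valid EAD document, and the collection
# described (including its reference code) is made up.
EAD_FRAGMENT = """
<archdesc level="fonds">
  <did>
    <unitid>GB 000 EXAMPLE</unitid>
    <unittitle>Papers of an Example Society</unittitle>
    <unitdate>1920-1950</unitdate>
  </did>
</archdesc>
"""


def flatten_description(xml_text):
    """Pull the core identifying fields into a plain dictionary."""
    root = ET.fromstring(xml_text)
    did = root.find("did")
    return {
        "level": root.get("level"),
        "reference": did.findtext("unitid"),
        "title": did.findtext("unittitle"),
        "dates": did.findtext("unitdate"),
    }


record = flatten_description(EAD_FRAGMENT)
```

The point is not the code itself but what it depends on: the transformation only works because the title, dates and reference sit in predictable, named elements. A description where all of that is buried in one free-text paragraph may be perfectly human-readable and still be very hard to share or convert.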

These are all things you need to consider when you’re deciding which format to put your data in – except, of course, we often don’t think about these things much at all. These decisions might have been effectively made for you by the community. If all of your peer institutions use a certain data format, then you’re more likely to use it too. And if you want to share your data with the community, using the same format as they do is important.

But this means that you’re relying on other people to make these decisions about the best format for your data. Those people might know the sector and the issues involved in general, but they might not know your specific circumstances or users. Their decision might have been made a long time ago, before advances in theory and technology (MARC was first developed in the 1960s, and EAD in the 1990s). The choice of format might have been based on available tools, rather than underlying principles.

The same goes for cataloguing standards. Is sticking strictly to ISAD(G) really the best way to describe your collections to meet the needs of a global audience? (This is a topic that’s up for discussion at the Descriptive Standards Roundtable at the 2013 ARA Conference.)

Of course, standards only work as standards if there’s sufficient community take-up, and a consensus on how to apply them.

XKCD on standards http://xkcd.com/927/

But progress isn’t made by blindly following rules, and ‘there’s already a standard for that’ is no reason not to think about whether there could be a better standard for it.

Standards should be developed from needs. What do people need to know? What do they need to be able to do with the data? What do we need to be able to tell them? And, if we’re looking to the future, what might they want to be able to do in the future? What do we need to do to the data now, to allow for future wants?

We can only work with what’s available, and it is important to have shared standards and points of reference. But if you don’t take time to consider these points when you’re choosing a standard, you’re not really choosing at all. You’re just perpetuating the status quo.

So take the time to think about what you’re doing with your data. Know why you’re using a particular standard, even if it’s because it’s the best of a bad bunch, or closest to what you want to do. Think about what it can and can’t do. Talk to others who are using it. Look for chances to comment on proposed revisions. The future of standards is the future of your data, and your data is valuable. Don’t let it decay.

Supporting Historians: responding to changing research practices

image of camera lens
This post picks out some highlights from a report from Ithaka S+R, “Supporting the Changing Research Practices of Historians” by Roger C Schonfeld and Jennifer Rutner (December 2012). It concentrates on findings that are of particular relevance for archivists and for discovery. The report is recommended reading. It is a US study, but clearly there are strong similarities with other countries.

The report finds that underlying research methods are still broadly as they were but practices have changed considerably: “Based on interviews with dozens of historians, librarians, archivists, and other support services providers, this project has found that the underlying research methods of many historians remain fairly recognizable even with the introduction of new tools and technologies, but the day to day research practices of all historians have changed fundamentally.”

It goes on to summarise the improvements that archives might make to meet changing needs, none of which are unexpected: “For archives, we recommend ongoing improvements to access through improved finding aids, digitization, and discovery tool integration, as well as expanded opportunities for archivists to help historians interpret collections, to build connections among users, and to instruct PhD students in the use of archives.”

It is very encouraging to see the positive comments about researchers’ interactions with archivists: “Having a meeting with the archivist and librarian is really fantastic, because they help you understand what is in the archive, and what you might be able to use.” It is clear from the study that archivists have a vital role to play as key collaborators and colleagues of historians, and their value is clear: “Archivists are often able to hone and direct an inquiry, bringing to light items and collections that the researcher may have been unaware of.”

The study does highlight the changing nature of interactions with archival material, as a result of the use of digital cameras in particular, which enables the analytical work to take place elsewhere. It is generally felt to be a convenient and time-saving option, enabling long-term interaction with resources outside of the reading room. This development is actually described as “the single most significant shift in research practices among historians.” It raises questions about whether the role of the archivist changes when the analytical work is displaced from the archive, as archivists may have less opportunity for intellectual engagement with researchers.  The study does highlight a possible issue with digital copies, namely the separation of metadata from content, where the researcher has hundreds of images and needs to organise them constructively, and it also found that scholars are struggling to work with digitised non-textual content effectively.

The ability to find time for research trips was a primary challenge for many researchers. “Interviewees repeatedly emphasized that the amount of time they are able to spend in the archives shapes the nature of the interaction with the sources significantly.” Because most struggle to find time for research trips,  digitised sources are hugely beneficial.

The study found that digitised finding aids help researchers to “travel more strategically”. It suggests that high-quality finding aids may become more important as researchers move more towards photographic visits to archives, rather than serendipitous visits. This connection is something I have not thought about before, and I would be very interested to hear what archivists think about this idea.

Of major relevance for a service like the Archives Hub is the conclusion about finding aids:

“The use of online finding aids greatly facilitates, and sometimes displaces, these visits. If a “good” finding aid is readily available online, this might make a scouting visit unnecessary, depending on the importance of the archive to the research project. In some cases, researchers were able to rule out a visit to an archive based on the online finding aids, and re-purpose funds and effort to tracking down other sources for the project.”

This study is a clear endorsement for our belief (which, I should say, is also backed up by our own researcher surveys) that finding aids play a role not only in identifying and prioritising sources, but also in providing enough information in themselves to make a visit unnecessary. As well as this, they may have a kind of positive negative effect: the researcher knows that materials can be ruled out.  The study strongly emphasised the need for “searchable databases” and “centralized searching” and participants talked about the problem with locating each collection independently, especially across the diverse types of archive repository: “The process of identifying archives – in some cases small, local archives or international archives – can present an amazing challenge to researchers.” Clearly comprehensive cross-searching search tools are a huge boon to researchers.

In terms of discovery, Google is clearly a major tool and there was a feeling that it was the most comprehensive discovery tool, as well as being convenient and easy to use. It is often used at the start of a searching process: “Generally, historians discover finding aids through Google searches and archive websites.” There is a clear demand for more descriptions online: “The general consensus among interviewees was that more online finding aids would greatly benefit their research, and that archives should continue to make efforts to make these accessible online. Continued and expanded efforts to develop finding aids more efficiently and to make them available digitally would seem to support the needs of historians for improved access.”

In terms of PhD students (and maybe others who are inexperienced researchers), the study found issues with the use of archives and other sources:

“Interviews with PhD candidates indicated that there is often little support for them in learning about new research methods or practices, either in their department or elsewhere at their institution, of which they are aware. While the subject matter treated by historians continues to diversify dramatically, new methodologies develop, and research practices change rapidly, it is clearly critically important that students have a grounding in the methods and practices of the field.” The Archives Hub has recently produced a brief Guide to Using Archives for the Inexperienced, and discussions on the archives email list showed just how much this is an important topic for archivists and how there was a general consensus that  PhD students need more training on research methodologies.

Summing up, the report makes six recommendations specifically for Archives:

1. More online finding aids
2. More digitisation
3. Discovery tools that promote cross-searching, crossing institutional boundaries and encompassing small and local record offices
4. Adequate resources for ensuring the expertise of the archivist continues to be available, enabling archivists to be active interpreters of the collections
5. Adapting to and facilitating the use of digital cameras and scanners in reading rooms
6. Training PhD students in the use of archives

There is a great deal more of interest and relevance in the report around searching, Google Scholar, the use of the academic library, organising and managing research, citation management and digital research methods. It is very well worth reading.

 

The Shape of Knowledge

row of books
In the 1870s a young man from a small town in New York decided to organise the world’s knowledge. Well, at least the world’s knowledge in book form. The now ubiquitous Dewey Decimal system divides knowledge decimally, as Dewey loved the decimal system. So, there are ten top-level classes with ten first-level sub-divisions (and so on). It’s a curious arrangement. Eight of the nine major divisions for religion are given over to Christianity. Dewey relegates Buddhism right down the ranks of its hierarchy, as a ‘religion of Indian origin’. It gives an entire category over to ‘Paranormal Phenomena’, and 999 is, rather satisfyingly, ‘extraterrestrial worlds’ (under 990, ‘General history of other areas’). When computing came along, there was no room for it left in the 600’s – Technology and Applied Sciences – so it went under the 000’s, which was originally for ‘generalities’.
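The decimal nesting described above can be shown with a few lines of Python. This is only a sketch of the arithmetic for whole three-digit numbers (it ignores the digits after the decimal point), and `dewey_hierarchy` is a name made up for the example.

```python
def dewey_hierarchy(number):
    """Split a three-digit Dewey number into class, division and section."""
    if not 0 <= number <= 999:
        raise ValueError("expected a whole number from 0 to 999")
    main_class = (number // 100) * 100  # top-level class, e.g. 999 -> 900
    division = (number // 10) * 10      # sub-division, e.g. 999 -> 990
    # Zero-pad so that the 000s keep their leading zeros.
    return f"{main_class:03d}", f"{division:03d}", f"{number:03d}"


print(dewey_hierarchy(999))  # ('900', '990', '999')
```

Every number has exactly one parent at each level, which is the either-or constraint at the heart of the system: a book gets one place in the hierarchy and no other.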

“And there’s the weakness and the greatness of Dewey’s system. The…system lets patrons stroll through the collected works of What We Know – our collective memory palace – but the price for ordering knowledge in the physical world is having to make either-or decisions…The library’s geography of knowledge can have one shape but no other.” (Everything is Miscellaneous, David Weinberger)

The world of Dewey classification doesn’t reflect the way we see the world now because the shape of knowledge is fluid and ever-changing, and even then there were many who disputed his arrangement. But it seems that for now we’re stuck with the basics of the Dewey system because the implications of changing it would be massive – libraries the world over have been physically ordered based on Dewey, and long decimal numbers have been painstakingly written on the spines of millions of books.

The Dewey system came to be as a result of the need to store one book in one place – knowledge has to be ordered when it is on shelves. Archives avoid this particular trap because they are not set out on shelves for people to browse, so they do not need a set physical order. The danger of archives being stereotyped as dusty boxes on shelves in dark rooms at least provided the advantage that they did not need to be ordered for browsing; the intellectual arrangement of archives has always been via the finding aids, so the physical collections did not need to undergo the either-or of arrangement in the way that libraries did.

Dewey relies upon giving a book a subject (although there can be cross-referencing to it of course). A book is not always easy to categorise under a subject; but an archive collection may be nigh on impossible to shoe-horn into one subject heading. If it’s hard enough to decide where to put a book about something like globalisation, trade and technology, for example, then it is almost an impossible task with archives because one collection is typically about a whole range of subjects, often ostensibly unrelated. And, of course, often archives are not consciously ‘about’ a subject, in so far as the subject is not central to the reason they were created. For example, a series of correspondence held in a Manchester archive might not be created to consciously describe or explain aspects of social housing developments in Manchester, but it might provide valuable evidence nonetheless; a letter might be written by someone moving into a new housing development, giving a great insight into how people felt about the large post-war housing estates, and what sort of changes it made to their lives. But the collection wouldn’t be ‘put under housing’ because it doesn’t need to be. It would really be impossible to physically put it together with other materials about the same subject because the correspondence might cover all manner of subjects – in a sense random subjects – if the writer is essentially communicating news and stuff that affects their life.

So, what are the implications for archives cataloguing? How does ‘the geography of knowledge’ impact on archives? We haven’t got something like Dewey, and we don’t have the problem of arranging physical things on shelves for people to browse. But do we still have a sense of ‘the right way’ to organise knowledge?

Well, we may not physically arrange archive collections on shelves, but we do approach each collection by the principles that we deem to be important – provenance and original order. Maybe we’re lucky that we have the principle of original order because it gives us a sensible, rational means to order a collection of sometimes very disparate materials (or you might say the idea is that the collection is already ordered for us). If we dispensed with original order, then we could come up with all sorts of other ways to order things but it is hard to see them making much sense. Weinberger’s book ‘Everything is Miscellaneous’ holds to the principle that in the digital age information wants to be free from all physical constraints, but I contend that original order provides a physical order that gives researchers an option – a way into the content should they choose to take it. I think ‘everything can be miscellaneous’ is more to the point. There are good reasons for imposing a physical order on an archive; but that shouldn’t mean that researchers are constrained as a result.

I think that what we need to be thinking about is enabling researchers to organise knowledge themselves – in a way that is relevant and useful for their own purposes. This potential for organisation is directly related to how we catalogue. Many people will search by subject, but when I look at the descriptions on the Archives Hub, I find many don’t have subject headings added to them. Subject headings offer significant advantages; they allow for the idea of different ways into a collection of information. They are like different pathways for researchers to take in order to get to the collection and connect it up with other collections.

When I search for ‘cooperative movements’ as a phrase on the Hub I get 40 hits. When I search for it as a subject I get 15 hits. If the system were working perfectly, I would deduce from this that there are 15 instances where ‘cooperative movements’ is a significant subject, and 25 more where it is relevant in some way – maybe it is referred to in passing, but the archive is not substantially concerned with this topic. However, it doesn’t really work like this because it is impossible to achieve that level of consistency in cataloguing. Different people catalogue differently. Some cataloguers put in more subjects, and some fewer; some maybe take more time to think about appropriate subjects, others just add a few very quickly; some don’t put any in at all, maybe believing that a free text search is enough. The end result of this is that searching becomes even more of a chance thing than it maybe needs to be. The irony for me, managing an aggregator, is that life would probably be a great deal easier if everyone catalogued in a superficial way…as long as it was consistent. As it is, you enter a subject term and you may still miss an archive of major importance. Enter a keyword (searching all the text) and you may not enter the same word(s) the cataloguer has used. There is, without doubt, an inevitable mismatch between what the cataloguer does and what the researcher needs in many cases.
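The gap between a phrase search and a subject search can be sketched with a few toy records. This is only an illustration: the records, field names and functions below are invented for the example and are not how the Hub's search is implemented.

```python
# Toy catalogue records. Titles, descriptions and subjects are invented.
records = [
    {"title": "Papers of a retail society",
     "description": "Minutes relating to cooperative movements in Lancashire.",
     "subjects": ["Cooperative movements"]},
    {"title": "Family correspondence",
     "description": "Letters mentioning cooperative movements in passing.",
     "subjects": []},
    {"title": "Trade union records",
     "description": "Branch minutes and accounts.",
     "subjects": ["Cooperative movements", "Trade unions"]},
    {"title": "Wholesale society ledgers",
     "description": "Ledgers of a co-operative society.",
     "subjects": []},
]


def phrase_search(records, phrase):
    """Return records where the phrase occurs anywhere in the record text."""
    needle = phrase.lower()
    hits = []
    for record in records:
        text = " ".join([record["title"], record["description"], *record["subjects"]])
        if needle in text.lower():
            hits.append(record)
    return hits


def subject_search(records, term):
    """Return records that carry the term as a subject heading."""
    needle = term.lower()
    return [r for r in records if any(needle == s.lower() for s in r["subjects"])]


phrase_hits = phrase_search(records, "cooperative movements")
subject_hits = subject_search(records, "cooperative movements")

# Records found by phrase but not indexed under the subject.
passing_mentions = [r for r in phrase_hits if r not in subject_hits]

# Records found by neither search.
missed = [r for r in records if r not in phrase_hits]
```

In this toy set the ledgers are clearly relevant, yet neither search finds them, because the cataloguer wrote ‘co-operative society’ and added no subject heading. That is the kind of inconsistency described above.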

It is a similar situation with the title of the material, which has become a vital way into collections now that so many people use general search engines. The title is what they see in a list of Google results. It needs to do its very best to reflect the content of the archive. “Miscellany of eighteenth century poems by various authors” is pretty good: when you have something that is quite varied, it pulls it together by what it is and when it was created. “Verse miscellany” is not so good, as it gives the researcher less to go on. “Poems” is pretty vague. A researcher on the Hub can look for ‘poems’ and then narrow the search down by other means, but on Google these titles are not so useful. We try to keep the dates of creation with the title, as the two together provide a good deal more information. But a title can so often give a sense of the miscellaneous in archives; and it can be quite difficult to get round this with some of the more varied collections, which can sometimes be somewhat esoteric. Other titles just offer a personal or organisation name, which is fine when the researcher is in the reading room – they assume the name means that this is an archive about this person/organisation. Out of context a name is just a name and could mean absolutely anything.

Of course, we have to take a pragmatic approach, and there has been plenty written about this. Cataloguing will never ever be perfect: researchers will always have to seek in order to find. But we can probably do more to make things better, and we can try to understand more about the ways that people both look for something they want to find and search for what is out there (not knowing what they want to find).

I believe that it is worth putting a little extra thought into the words that are chosen when cataloguing, thinking about how each end-user will want to organise their own geography of knowledge. A bit of thought about the key significant subjects is a good approach. This will help people coming from different perspectives, and with different search strategies, to discover archive collections.

We are still a long way from connecting things up in a way that researchers would like to see. The vision of Linked Data is to do just this. It offers a way to make connections across data sets. It opens up the idea of organising knowledge so that it’s never just one thing but a completely fluid landscape. It’s not Melvil Dewey, looking at the world and giving us his version of how it should be organised; rather it is offering the chance to organise the world in an infinite number of ways. If others out there have resources on ‘The Fabian Society’ or ‘Beatrice Webb’ or ‘the co-operative movement’ they can state that their concepts are the same as mine, and therefore my archive can be linked to these other resources. This opens up data, enabling people to traverse data sets and bring resources together for their own ends. For creating Linked Data, structured concepts, like subject headings, are a great help, because they facilitate making these connections. Of course, there’s a bit more involved in Linked Data (including creating persistent URIs and actually matching up the same concepts), but the potential to link knowledge together in this large-scale way is immense.
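As a rough sketch of what ‘stating that their concepts are the same as mine’ might look like, the example below holds a handful of statements as plain (subject, predicate, object) tuples and follows ‘same as’ links between two data sets. The example.org and example.com URIs are made up for illustration; owl:sameAs and the Dublin Core ‘subject’ property are real vocabulary terms, but using them in exactly this way is a simplification and not a description of any particular Linked Data service.

```python
SAME_AS = "http://www.w3.org/2002/07/owl#sameAs"
ABOUT = "http://purl.org/dc/terms/subject"

# Hypothetical URIs for the same person in two different data sets.
hub_webb = "http://example.org/hub/concept/beatrice-webb"
other_webb = "http://example.com/library/person/webb-beatrice"

triples = [
    # An archive described in one data set.
    ("http://example.org/hub/archive/fabian-papers", ABOUT, hub_webb),
    # A book described in another data set.
    ("http://example.com/library/book/my-apprenticeship", ABOUT, other_webb),
    # The other data set states that its concept is the same as ours.
    (other_webb, SAME_AS, hub_webb),
]


def equivalents(uri, triples):
    """Return the URI plus every URI stated, directly or indirectly, to be the same."""
    same = {uri}
    changed = True
    while changed:
        changed = False
        for subject, predicate, obj in triples:
            if predicate != SAME_AS:
                continue
            if subject in same and obj not in same:
                same.add(obj)
                changed = True
            elif obj in same and subject not in same:
                same.add(subject)
                changed = True
    return same


def resources_about(uri, triples):
    """Return everything described as being about the concept or its equivalents."""
    concepts = equivalents(uri, triples)
    return sorted(s for s, p, o in triples if p == ABOUT and o in concepts)


linked = resources_about(hub_webb, triples)
```

Starting from the archive's own concept for Beatrice Webb, `linked` contains both the archive and the book from the other data set, because the single ‘same as’ statement joins the two.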

Another means to encourage this fluidity is to allow end-users to add tags to content, so that we generate a mass of ways into the data. We have to consider this option seriously for archival data, because it offers such significant advantages in terms of making things more discoverable. It is moving away from the idea that there is one way of doing things. It allows for things to be organised in an infinite variety of ways. Plenty of projects are now doing this, such as the Zooniverse science projects https://www.zooniverse.org/, the Your Paintings project and the British Library georeferencing project for maps, but I’m not sure that we are really embracing it on a day-to-day level within archive catalogues.


An archive can act like a Lego set. As archivists we present the set as it was originally built, and we aim to keep this because it is evidence of its use. But we want, somehow, to label the whole, and to label parts of the whole, in such a way that researchers can take bits of them and use them to build other constructs; the difference now from 50 years ago is that we are more aware that we should not try to second-guess the constructs that people want to make, but we should catalogue to allow for infinite patterns.

With a little help from the Interface

It is tempting to forge ahead with ambitious plans for Web interfaces that grab the attention, that look impressive and do new and whizzy things. But I largely agree with Lloyd Rutledge that we want “less emphasis on grand new interfaces” (Lloyd Rutledge, The Semantic Web – ISWC 2010, Selected Papers). I think it is important to experiment with exciting, innovative interfaces, but the priority needs to be creating interfaces that are effective for users, and that usually means a level of familiarity and supporting the idea that “users of the Web feel it acts the way they always knew it should (even though they actually couldn’t imagine it beforehand).” Maybe the key is to make new things feel familiar, so that we aren’t asking users to learn a whole new literacy, but a new literacy will gradually emerge and evolve.

For the Archives Hub, we face similar challenges to many websites that promote and provide access to archives, although our challenges are compounded by being an aggregator and not being in control of the content of the descriptions. We are seeking to gradually modify and improve our interfaces, in the hope that we help to make the users’ discovery experiences more effective, and encourage people to engage with archives.

One of our aims is to introduce options for users that allow them to navigate around in a fairly flexible manner, meeting different levels of experience and need, but without cluttering the screen or making the navigation look complicated and off-putting. Interviews with researchers have indicated how people have a tendency to ‘click and see’, learning as they go, but expecting useful results fairly quickly, so we want to work with this principle, to use hyperlinks effectively, on the understanding that the terminology used and the general layout of the page will have an effect on user expectations.

A Separation of Parts

One of the issues when presenting an archival description is how to separate out the ‘further actions’ or ‘find out more’ from the basic content. The challenge here is compounded by the fact that researchers often believe the description is the actual content, and not just metadata, or alternatively they assume that they can always access a digital resource.

We have tried to simplify the display by introducing a Utility Bar. It is intended to bring together the further options available to the end user. The idea is to make the presentation neater, show the additional options more clearly, and also keep the main description clear and self-contained.


 

The user can click to find out how to access the materials, to find out where the repository is located in the UK or contact the repository by email. We are planning to make the email contact link more direct, opening an email and populating it with the email address of the repository in order to cut down on the number of stages the user has to go through (currently we link to the Archon directory of Archive services). We can also modify other aspects of the Utility Bar over time, adding functionality as required, so it is a way to make the display more extensible.

We have included links to social networking sites, although in truth we have no real evidence that these are required or used. This really was a case of ‘suck it and see’ and it will be interesting to investigate whether this functionality really is of value. We certainly have a lively following on Twitter, and indications are that our Twitter presence is valued, so we do believe that social networking sites play an important part in what we do.

We have also included the ability to view different formats. This will not be of value to most researchers, but it is intended to be part of our mission to open up the data and give a sense of transparency – anyone can see the encoding behind the description and see that it is freely available. Some of our contributors may find it useful, as well as developers interested in the XML behind the scenes.

The Biggest Challenge: how to present an archive description

Until recently we presented users with an initial hit list of results, which enabled them to see the title of a description and choose between a ‘summary’ presentation and a ‘full’ presentation. However, feedback indicates that users don’t know what we mean by this. Firstly, they haven’t yet seen the description, so there is nothing on which to base the choice of  link to click, and secondly, what is the definition of ‘summary’ and ‘full’ anyway? Our intention was to give the user the choice of a fairly brief, one page summary description, with the key descriptive data about the archive collection, or the full, complete description, which may run to many pages. A further consideration was that we could only provide highlighting of terms on a single page, so if we only had the full description, highlighting would not be possible.

There are a number of issues here. (a) Descriptions may be exactly the same for summary and full because sometimes they are short, only including key fields, and they do not provide multi-level content; the full description will only provide more information if the cataloguer has filled in additional fields, or created a multi-level display. (b) ‘Summary’ usually means a cut-down version of something, taking key elements, but we do not do this; we simply select what we believe to be the key fields. For example, Scope and Content may actually be very long and detailed, but it would always be part of the ‘summary’ description. (c) Fields that are excluded from the summary view may be particularly important in some cases – for example, the collection may be closed for a period of time, and this would really be key information for a researcher.

With the new Utility Bar we changed ‘summary’ and ‘full’ to become ‘brief’ and ‘detailed’. We felt that this more accurately reflects what these options represent. At present we have continued with the same principle of displaying selected fields in the ‘brief’ description, but after much discussion we have (almost) decided to change this approach. The brief description will become simply the collection-level description in its entirety; the detailed description will be the multi-level description. This gives the advantage of a certain level of consistency, but there are still potential pitfalls. Two of the key issues are (a) that ‘brief’ may actually be quite long (a collection description can still be very long) and (b) that many descriptions are not multi-level, so there would be no difference between the two descriptions. Therefore, we will look at creating a scenario where the user only gets the ‘Detailed Description’ link when the description is multi-level. If we can do this we may change the terminology; but in the end there is no really user-friendly way to succinctly describe a collection-level as opposed to a multi-level description, simply because many people are not aware of what archival hierarchy really means.
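The rule we are considering, offering the ‘Detailed Description’ link only when there is something more to show, could be sketched along these lines. The `Description` class, its `components` field and the link labels are hypothetical stand-ins for illustration, not the Hub's actual data model or code.

```python
from dataclasses import dataclass, field


@dataclass
class Description:
    """A simplified archival description (hypothetical model)."""
    title: str
    # Lower-level entries (series, files, items); empty for collection-level only.
    components: list = field(default_factory=list)


def is_multi_level(description):
    """A description counts as multi-level if it has any lower-level entries."""
    return len(description.components) > 0


def view_links(description):
    """Return the view links to offer for this description."""
    links = ["Brief Description"]
    if is_multi_level(description):
        links.append("Detailed Description")
    return links


collection_only = Description(title="Verse miscellany")
multi_level = Description(
    title="Papers of a housing association",
    components=["Correspondence", "Minutes", "Photographs"],
)
```

With this rule, `view_links(collection_only)` offers only the brief view, while `view_links(multi_level)` offers both, so a user is never sent to a ‘detailed’ page that is identical to the brief one.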

As well as introducing the Utility Bar we changed the hit list of results to link the title of the description to the brief view. We simply show the title and the date(s) of the archive, as we feel that these are the key pieces of information that the researcher needs in order to select relevant collections to view.

 

Centralised Innovation

For some of the more complex changes we want to make, we first need to centralise the Archives Hub, so that the descriptions are all held by us. For some time we thought that this seemed like a retrograde step: to move from a federated system to a centralised system. But a federated system adds a whole layer of complexity. Not only do you not control the data you are presenting; some of it you cannot get at directly at all, whether to view it, to examine any issues with it, or to improve its consistency (of the markup in particular). In addition, there is a dependency between the centralised system and the local systems that form the federated model. Centralising the data will actually allow us to make it more openly available as well, and to continue to innovate more easily.

Multiple Gateways: Multiple Interfaces

We will continue to work to improve the Archives Hub interface and navigation, but we are well aware that increasingly people use alternative interfaces, or search techniques. As Lorcan Dempsey states: “options have multiplied and the breadth of interest of the local gateway is diminished: it provides access only to a part of what I am potentially interested in.” We need to be thinking more broadly: “The challenge is not now only to improve local systems, it is to make library resources discoverable in other venues and systems, in the places where their users are having their discovery experiences.” (Lorcan Dempsey’s Weblog). This is partly why we believe that we need to concentrate on presenting the descriptions themselves more effectively – users increasingly come directly to descriptions from search engines like Google, rather than coming to the Archives Hub homepage and entering a search from there. We need to think about any page within our site as a landing page, and about how best to help users from there to discover more about what we have to offer them.

Season’s greetings and Christmas closure

“Sunshine Annual 1938. The brightest of the year.”
The Sunshine Annual was a children’s annual produced by the Co-op movement.
Image copyright © National Co-operative Archive.

The Archives Hub team wish everyone a very Merry Christmas, and a Happy New Year!

The Archives Hub office will close on 21st December and will reopen on 2nd January.

The Archives Hub service will be available over Christmas and New Year, but there will be no helpdesk support. Any queries sent over this period will be dealt with when we return.

The Hub out and about – presenting, training, and pubbing

The Hub team like to get out and about to present, teach, and chat about archives and information. It can get a bit lonely being a purely online service, with our users and contributors at the other end of an email or phone call, so we try to ensure that we take advantage of chances to meet them face-to-face.

The last week of November was a busy week for this! On the Wednesday Jane and I (Bethan) gave a presentation to the MA Library & Information students at MMU.

We’ve given similar presentations to Archive students and early-career professionals in the past, but this is the first time we’ve given one to Library students. I’m pleased to say it worked well – the students were engaged and knowledgeable about archives, and about how issues in libraries and archives cross over.

It’s always very encouraging and stimulating to meet an enthusiastic group (I’d also met them the week before to talk about professional organisations), and both Jane and I really enjoyed giving the session. We had some nice feedback from the students, too, with one person saying:

The workshop was informative as well as entertaining. Complex issues were broken down so they were easier to understand. In a short amount of time a lot of areas were covered and due to the lively presentation style we all remained engaged and interested throughout.

And another said that they wished they had more next week!

I think it’s very important for us to be involved in talking to students, trainees, and early-career professionals. It’s good for them to hear from people who are actually working with the data that they’ll be creating. If nothing else, if we educate them about the need for good, interoperable data now, we’ll get better data from them later on! It’s also great to be able to tell them about the different sorts of jobs and opportunities there are for them, and hopefully give them some ideas about ‘alternative’ careers.

The next day saw me, Jane and Lisa heading down to London, for the inaugural ‘Hub in the Pub’ on the Thursday evening, before a training session on the Friday. We joined forces with a large contingent of museum folk who were ‘Drinking about Museums’, and had a very enjoyable and useful couple of hours chatting about general information, data, and cultural heritage issues. We hope to have more ‘Hub in the Pub’ events in future, so watch our mailing list and Twitter feed for details.

We made sure that the evening didn’t get too merry, so we were on top form for our contributors’ training day the next day. These training days are designed to help current and potential contributors use our EAD Editor, and are also a great chance to get to know our contributors and chat to them about any issues they might have. We have a few places left on our next training day in Glasgow in January – do sign up if you’d like to come along, or contact us if you’d like to know more.

If you can’t get along to a training session, we have online audio tutorials and a workbook designed to give you a step-by-step guide to using the Editor – and we’re always happy to answer any questions.

An evaluation of the use of archives and the Archives Hub

This blog post is based upon a report written by colleagues at Mimas* presenting the results of the evaluation of our innovative Linked Data interface, ‘Linking Lives’. The evaluation consisted of a survey and a focus group, with 10 participants including PhD students and MA students studying history, politics and social sciences. We asked participants a number of questions about the Archives Hub service, in order to provide context for their thoughts on the Linking Lives interface.

This blog post concentrates on their responses relating to the use of archives, methods of searching and interpretation of results. You can read more about their responses to the Linking Lives interface on our Linking Lives blog.

Use of Archives and Primary Source Materials

We felt that it was important to establish how important archives are to the participants in our survey and focus group. We found that “without exception, all of the respondents expressed a need for primary resources” (Evaluation report). One respondent said:

“I would not consider myself to be doing proper history if I wasn’t either reinterpreting primary sources others had written about, or looking at primary sources nobody has written about. It is generally expected for history to be based on primary sources, I think.” (Survey response)

One of the most important factors to the respondents was originality in research. Other responses included acknowledgement of how archives give structure to research, bringing out different angles and perspectives and also highlighting areas that have been neglected. Archives give substance to research and they enable researchers to distinguish their own work:

“Primary sources are very valuable for my research because they allow me to put together my own interpretation, rather than relying on published findings elsewhere.” (Survey response)

Understanding of Archives

It is often the case that people have different perceptions of what archives are, and the Linking Lives evaluation work confirmed this. Commonly there is a difference between social scientists and historians; the former concentrating on datasets (e.g. data from the Office for National Statistics) and the latter on materials created during a person’s life or the activities of an organisation and deemed worthy of permanent preservation. The evaluation report states:

“The participants that had a similar understanding of what an archive was to the Archive Hub’s definition had a more positive experience than those who didn’t share that definition.”

This is a valuable observation for the work of the Hub in a general sense, as well as the Linking Lives interface, because it demonstrates how initial perceptions and expectations can influence attitudes towards the service. In addition, the evaluation work highlighted another common fallacy: that an archive is essentially a library. Some of the participants in the survey expected the Archives Hub to provide them with information about published sources, such as research papers.

These findings highlight one of the issues when trying to evaluate the likely value of an innovative service: researchers do not think in the same language or with the same perspectives as information professionals. I wonder if we have a tendency to present services and interfaces modelled from our own standpoint rather than from the standpoint of the researcher.

Search Techniques and Habits

“Searches were often not particularly expansive, and participants searched for specific details which were unique to their line of enquiry” (Evaluation report). Examples include titles of women’s magazines, personal names or places. If the search returned nothing, participants might then broaden it out.

Participants said they would repeatedly return to archives or websites they were familiar with, often linked to quite niche research topics. This highlights how a positive experience with a service when it is first used may have a powerful effect over the longer term.

The survey found that online research was a priority:

“Due to conflicting pressures on time and economic resources, online searching was prevalent amongst the sample. Often research starts online and the majority is done online. Visits to see archives in person, although still seen as necessary, are carefully evaluated.”  (Evaluation report)

The main resources participants used were Google and Google Scholar (the most widely used search engines) as well as The National Archives, Google Books and ESDS. Specialist archives were referred to relating to specific search areas (e.g. The People’s History Museum, the Wellcome Library, the Mass Observation Archive).

Thoughts and Comments About the Archives Hub

All participants found the Hub easy to navigate and most found locating resources intuitive. As part of the survey we asked the participants to find certain resources, and almost all of them provided the right answers with seemingly no difficulty.

“It is clear. The descent of folders and references at the top are good for referencing/orientating oneself. The descriptions are good – they obviously can’t contain everything that could be useful to everyone and still be a summary. It is similar to other archive searches so it is clear.” (Survey response, PhD history student)

The social scientists that took part in the evaluation were less positive about the Archives Hub than the historians. Clearly many social science students are looking for datasets, and these are generally not represented on the Hub. There was a feeling that contemporary sources are not well represented, and these are often more important to researchers in fields like politics and sociology. But overall comments were very positive:

“…if anyone ever asked about how to search archives online I’d definitely point them to the Archives Hub”.

“Useful. It will save me making specific searches at universities.”

Archives Hub Content

It was interesting to see the sorts of searches participants made. A search for ‘spatial ideas’ by one participant did not yield useful results. This would not surprise many archivists – collections are generally not catalogued to draw out such concepts (neither UNESCO nor UKAT has a subject heading for this; LCSH has ‘spatial analysis’). However, there may well be collections that cover a subject like this, if the researcher is prepared to dig deep enough and think about different approaches to searching. Another participant commented that “you can’t just look for the big themes”. This is the type of search that might benefit from us drawing together archive collections around themes, but this is always a very flawed approach. This is one reason that we have Features, which showcase archives around subjects but do not try to provide a ‘comprehensive’ view onto a subject.

This kind of feedback from researchers helps us to think about how to present the Archives Hub more effectively. Expectations are such an important part of researchers’ experiences. It is not possible to guard completely against expectations that do not match reality, but we could, for example, have a page on ‘The Archives Hub for Social Scientists’ that would at least provide those who looked at it with a better sense of what the Hub may or may not provide for them (whether anyone would read it is another matter!).

This survey, along with previous surveys we have carried out, emphasises the importance of a comprehensive service and a clear scope (“it wasn’t clear to me what subjects or organisations are covered”). However, with the nature of archives, it is very difficult to give this kind of information with any accuracy, as the collections represented are diverse and sometimes unexpected. In the end you cannot entirely draw a clear line around the scope of the Archives Hub, just as you cannot draw a clear line around the subjects represented in any one archive. The Hub also changes continuously, with new descriptions added every week. Cataloguing is not a perfect art; it can draw out key people, places, subjects and events, but it cannot hope to reflect everything about a collection, and the knowledge a researcher brings with them may help to draw out information from a collection that was not explicitly provided in the description. If a researcher is prepared to spend a bit of time searching, there is always the chance that they may stumble across sources that are new to them and potentially important:

“…another student who was mainly focused on the use of the Kremlin Archives did point out that [the Archives Hub] brought up the Walls and Glasier papers, which were new to [them]”.

Even if you provide a list of subjects, what does that really mean? Archives will not cover a subject comprehensively; they were not written with that in mind. They were created for other purposes, and that is their strength in many ways: it is what makes them a rich and exciting resource, but it does not make it easy to describe them accurately for researchers. Just one series of correspondence may refer to thousands of subjects, some in passing, some more substantially, but archivists generally don’t have time to go through an entire series and draw out every concept.

If the Archives Hub included a description for every archive held at an HE institution across the UK, or for every specialist repository, what would that signify? It would be comprehensive in one sense, but in a sense that may not mean much to researchers. It would be interesting to ask researchers what they see as ‘comprehensive resources’ as it is hard to see how these could really exist, particularly when talking about unpublished sources.

Relevance of Search Results

The difficulties some participants had with the relevance of results come back to the problem of how to catalogue resources that often cover a myriad of subjects, maybe superficially, maybe in detail; maybe from a very biased perspective. If a researcher looks for ‘social housing manchester’ then the results they get will be accurate in a sense – the machine will do its job and find collections with these terms, and there will be weighting of different fields (e.g. the title will be highly weighted), but they still may not get the results they expect, because collections may not explicitly be about social housing in Manchester. The researcher needs to do a bit more work to think about what might be in the collection and whether it might be relevant. However, cataloguers are at fault to some extent. We do get descriptions sent to the Hub where the subjects listed seem inadequate or they do not seem to reflect the scope and content that has been provided. Sometimes a subject is listed but there is no sense of why it is included in the rest of the description. Sometimes a person is included in the index terms but they are not described in the content. This does not help researchers to make sense of what they see.

I do think that there are lessons here for archivists, or those who catalogue archives. I don’t think that enough thought is given to the needs of the researcher. Subject terms are used inconsistently, for example, and descriptions of archives need to draw out key concepts a little more clearly. Some archivists don’t see the need to add index terms; they reason that technologies like Google can search by keyword, and that this is enough. But it isn’t enough. Researchers need more than this. They need to know what the collection is substantially about, and they need to search across other collections about similar subjects. Controlled vocabulary enables this kind of exploratory searching. There is a big difference between searching for ‘nuclear disarmament’ as a keyword, which means it might exist anywhere within the description, and searching for it as a subject – a significant topic within an archive.
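
For anyone who finds it easier to see this difference in code, here is a minimal sketch in Python. It is only an illustration, not how the Archives Hub search actually works: the two descriptions, the field names and both search functions are invented for the example. The keyword search matches the phrase anywhere in the description, while the subject search only matches collections that a cataloguer has indexed with that term.

```python
from dataclasses import dataclass, field


@dataclass
class Description:
    """A hypothetical, much simplified archive description."""
    title: str
    scope_and_content: str
    subjects: set[str] = field(default_factory=set)  # index terms added by a cataloguer


def keyword_search(descriptions, phrase):
    """Match the phrase anywhere in the description, index terms included."""
    phrase = phrase.lower()
    hits = []
    for d in descriptions:
        full_text = " ".join([d.title, d.scope_and_content, *sorted(d.subjects)]).lower()
        if phrase in full_text:
            hits.append(d)
    return hits


def subject_search(descriptions, term):
    """Match only descriptions that have been indexed with the term as a subject."""
    term = term.lower()
    return [d for d in descriptions if term in {s.lower() for s in d.subjects}]


# Two invented descriptions: one substantially about the topic, one mentioning it in passing.
descriptions = [
    Description(
        title="Papers of a peace campaigner",
        scope_and_content="Correspondence and leaflets from campaigns against nuclear weapons.",
        subjects={"Nuclear disarmament", "Peace movements"},
    ),
    Description(
        title="Records of a local council",
        scope_and_content="Minutes, including a single motion on nuclear disarmament.",
        subjects={"Local government"},
    ),
]

keyword_hits = keyword_search(descriptions, "nuclear disarmament")
subject_hits = subject_search(descriptions, "nuclear disarmament")

print([d.title for d in keyword_hits])
print([d.title for d in subject_hits])
```

In this made-up example the keyword search returns both descriptions, including the council records where the topic is only a passing mention, whereas the subject search returns only the collection that is substantially about it. It also shows the dependency on the cataloguer: if the index term had not been added to the first description, the subject search would find nothing.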

 

*Linking Lives Evaluation: Final Report (October 2012) by Lisa Charnock, Frank Manista, Janine Rigby and Joy Palmer