The modern archivist: working with people and technology

I’ve recently read Kate Theimer’s excellent post, Honest Tips for Wannabe Archivists Out There.

This is something that I’ve thought about quite a bit, as I work as the manager of an online service for Archives and I do training and teaching for archivists and archive students around creating online descriptions. I would like to direct this blog post to archive students or those considering becoming archivists. I think this applies equally to records managers, although sometimes they have a more defined role in terms of audience, so the perspective may be somewhat different.

It’s fine if you have ‘a love of history’, if you ‘feel a thrill when handling old documents’. That’s a good start. I’ve heard this kind of thing frequently as a motivation for becoming an archivist. But this is not enough. It is more important to have the desire to make those archives available to others; to provide a service for researchers. To become an archivist is to become a service provider, not an historian. It may not sound as romantic, but as far as I am concerned it is what we are, and we should be proud of the service we provide, which is extremely valuable to society. Understanding how researchers might use the archives is, of course, very important, so that you can help to support them in their work. Love of the materials, and love of the subject (especially in a specialist repository) should certainly help you with this core role. Indeed, you will build an understanding of your collections, and become more expert in them over time, which is one of the wonderful things about being an archivist.

Your core role is to make archives available to the community – for many of us, the community is potentially anyone, for some of us it may be more restricted in scope. So, you have an interest in the materials, and you need to make them available. To do this you need to understand the vital importance of cataloguing. It is this that gives people a way in to the archives. Cataloguing is a real skill, not something to be dismissed as simply creating a list of what you have. It is something to really work on and think about. I have seen enough inconsistent catalogues over the last ten years to tell you that being rigorous, systematic and standards-based in cataloguing is incredibly important, and technology is our friend in this aim. Furthermore, the whole notion of ‘cataloguing’ is changing, a change led by the opportunities of the modern digital age and the perspectives and requirements of those who use technology in their everyday life and work. We need to be aware of this, willing (even excited!) to embrace what this means for our profession and ready to adapt.

This brings me to the subject I am particularly interested in: the use of technology. Cataloguing *is* using technology, and dissemination *is* using technology. That is, it should be and it needs to be if you want to make an impact; if you want to effectively disseminate your descriptions and increase your audience. It is simply no good to see this profession as in any way apart from technology. I would say that technology is more central to being an archivist than to many professions, because we *deal in information*. It may be that you can find a position where you can keep technology at arm’s length, but these types of positions will become few and far between. How can you be someone who works professionally with information, and not be prepared to embrace the information environment? The Web, email, social networks, databases: these are what we need to use to do our jobs. We generally have limited resources, and technology can help us make the most of the resources we have; conversely, we may need to make informed choices about the technology we use and what sort of impact it will have. Should you use Flickr to disseminate content? What are the pros and cons? Is ‘augmented reality’ a reality for us? Should you be looking at Linked Data? What is it and why might it be important? What about Big Data? It may sound like the latest buzz phrase but it’s big business, and can potentially save time and money. Is your system fit for purpose? Does it create effective online catalogues? How interoperable is it? How adaptable?

Before I give the impression that you need to become some sort of technical whizz-kid, I should make clear that I am not talking about being an out-and-out techie – a software developer or programmer. I am talking about an understanding of technology and how to use it effectively. I am also talking about the ability to talk to technical colleagues in order to achieve this. Furthermore, I am talking about a willingness to embrace what technology offers and not be scared to try things out. It’s not always easy. Technology is fast-moving and sometimes bewildering. But it has to be seen as our ally, as something that can help us to bring archives to the public and to promote a greater understanding of what we do. We use it to catalogue, and I have written previously about how our choice of system has a great impact on our catalogues, and how important it is to be aware of this.

Our role in using technology is really *all about people*. I often think of myself as the middleman, between the technology (the developers) and the audience. My role is to understand technology well enough to work with it, and work with experts, to harness it in order to constantly evolve and use it to best advantage, but also to constantly communicate with archivists and with researchers, to understand their requirements and make sure that we are relevant to end-users. It’s a role, therefore, that is about working with people. For most archivists, this role will be within a record office or repository, but either way, working with people is the other side of the coin to working with technology. They are both central to the world of archives.

If you wonder how you can possibly think about everything that technology has to offer: well, you can’t. But that’s why it is even more vital now than it has ever been to think of yourself as being in a collaborative profession. You need to take advantage of the experience and knowledge of colleagues, both within the archives profession and further afield. It’s no good sitting in a bubble at your repository. We need to talk to each other and benefit from sharing our understanding. We need to be outgoing. If you are an introvert, if you are a little shy and quiet, that’s not a problem; but you may have to make a little more effort to engage and to reach out and be an active part of your profession.

They say ‘never work with children and animals’ in show business because both are unpredictable; but in our profession we should be aware that working with people and technology is our bread and butter. Understanding how to catalogue archives to make them available online, to use social networks to communicate our messages, to think about systems that will best meet the needs of archives management, to assess new technologies and tools that may help us in our work: these are vital to the role of a modern professional archivist.

HubbuB: March 2012

New collections on the Hub

A special mention for the University of Worcester Research Collections – they have now been added to the Hub as collection level descriptions, thanks largely to their HLF ‘Skills for the Future’ trainee, Sarah.

We are delighted to have the Royal College of Psychiatrists as a new contributor, adding to a number of distinguished Royal Colleges already on the Hub.

Feature for March

This month we step into the world of augmented reality with a feature about the SCARLET project:

The feature tells us that “The SCARLET ‘app’ now enables students to study early editions of Dante’s Divine Comedy, for example, while simultaneous viewing catalogue data, digital images, webpages and online learning resources on their tablet devices and phones.” It all sounds very exciting, and something that archives can really play a very active part in.

EAD Editor

We’ve been busy testing the new instance of the EAD Editor, which will be released soon. We’ll be able to tell you more about that shortly.

We now have a page giving you information about the ‘right click’ menu that helps you with things like paragraphs, lists and links:

SRU and OAI-PMH

APIs are becoming increasingly important with the open data agenda. We have provided APIs for some years now. Recently we have updated the information on these to help developers who would like to use them to access Hub descriptions: http://archiveshub.ac.uk/sru/ and http://archiveshub.ac.uk/oaipmh/

The SRU interface is used to provide data to Genesis, the portal for Women’s Studies: http://www.londonmet.ac.uk/genesis/. It means that the data is only held in one place, but a different interface provides access to select descriptions – in this case, descriptions relating to women.

APIs may not mean a great deal to you, as they are primarily something developers use to create new interfaces, mash-ups and cross-data explorations, but do pass this on if you know of developers interested in working with our data. We want to ensure that archives are at the heart of innovations in opening up and exploring data connections.
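For developers who would like a starting point, here is a minimal sketch of how request URLs for the two interfaces might be built. The base URLs are the ones given above. The parameter names (`operation`, `version`, `query`, `maximumRecords`, `verb`, `metadataPrefix`) come from the SRU and OAI-PMH protocols in general; the SRU version number, the example query and the `oai_dc` metadata prefix are assumptions for illustration and are not confirmed for the Hub, so do check the pages above before relying on them.

```python
from urllib.parse import urlencode

# Base URLs as given in this post.
SRU_BASE = "http://archiveshub.ac.uk/sru/"
OAI_BASE = "http://archiveshub.ac.uk/oaipmh/"


def sru_search_url(cql_query, maximum_records=10, version="1.2"):
    """Build an SRU searchRetrieve URL for a CQL query string."""
    params = {
        "operation": "searchRetrieve",
        "version": version,  # assumed; the service may expect a different version
        "query": cql_query,  # a CQL query; supported indexes are service-specific
        "maximumRecords": maximum_records,
    }
    return SRU_BASE + "?" + urlencode(params)


def oai_list_records_url(metadata_prefix="oai_dc"):
    """Build an OAI-PMH ListRecords URL for one metadata format."""
    params = {
        "verb": "ListRecords",
        "metadataPrefix": metadata_prefix,  # assumed; use ListMetadataFormats to check
    }
    return OAI_BASE + "?" + urlencode(params)


if __name__ == "__main__":
    # Only prints the URLs; no network request is made.
    print(sru_search_url("women"))
    print(oai_list_records_url())
```

The sketch only constructs the URLs, so it can be tried safely without sending anything to the service; fetching and parsing the XML responses is left to whatever HTTP and XML tools the developer prefers.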

Page about identifiers

Some of you may have read my recent blog post about issues with identifiers for archives and for archive descriptions. We now have a page on the Hub to help explain what a persistent unique identifier is and how you create it:

http://archiveshub.ac.uk/identifiers/

As ever, please ask us if you have any questions about this.

Former Reference

The Archives Hub now displays former reference with the label of ‘alternative ref’. This is because for some contributors the former reference is, in fact, the main reference, so we felt this was the best compromise. For example: http://archiveshub.ac.uk/data/gb1069-12 (see lower level entries).

The new EAD Editor will allow for descriptions with a former reference to be uploaded, edited and removed, but it will not provide the facility to create them from scratch.

Case Studies Wanted!

Finally, we have a case studies section – http://archiveshub.ac.uk/casestudies/. We’d love to hear from any researchers willing to provide us with a case study. It is a really useful way for us to convey the importance of the Hub to our funders.

More Product, Less Processing?

I’ve been reading a fascinating article by Mark A. Greene and Dennis Meissner, ‘More Product, Less Process: Revamping Traditional Archival Processing’ (PDF). I wanted to offer a summary of the article.

The essence of this article is that archivists spend too long processing collections (appraising, cataloguing and carrying out minor preservation). This approach is not working; the cataloguing backlog continues to increase. We are too conservative, cautious and set in our ways, and we need to think about a new approach to cataloguing that is more pragmatic and user-focussed. The article was written by archivists in the USA, but would seem to apply to archives here in the UK, where we know that the backlog is a continuing problem.

I think the article makes the argument well and with a good deal of conviction. The bottom line is that we must rethink our approach unless we are to continue to accrue backlogs and deny researchers access to hugely valuable primary source material.

However, there are arguments in support of detailed cataloguing. For digital archives it is extremely useful to provide metadata at the item level, enabling such useful resources as http://archiveshub.ac.uk/data/gb1837des-dca?page=3#id634580. With this detailed list, researchers can see digital resources described and then access them directly. It could be argued that if a collection is to be digitised, providing this sort of level of metadata is appropriate, and in general it is the more valuable and highly used collections that are digitised. But for born-digital collections, this level of detail would be totally unsustainable.

Also, I wonder if the work that volunteers do should be taken into account – they may be able to help us catalogue in more detail, whilst trained archivists continue to create the main collection or series-level descriptions. I remember a whole band of NADFAS volunteers cataloguing photographs where I used to work. Furthermore, I was speaking to an archivist recently who said that they had taken the time to weed out duplicates (something this report criticises)…and then sold them on eBay for a tidy profit, which helped them fund their very under-resourced archive (they had the rights to do this!). So, maybe there are factors to take into consideration that support a detailed approach, but I think a bold approach to examining this whole area in UK archives would be very welcome.

Some of the points made in the report:

  • Archivists spend too much time cataloguing, not necessarily doing what is necessary. We think in terms of an ideal that we have to reach, although we haven’t actually articulated what this ideal is, and really examined it.
  • We are too attached to old-fashioned ways of doing things, which worked when we had smaller collections to deal with, but are not appropriate for large 20th century collections.
  • We give a higher priority to serving the needs of our collections rather than the needs of our users.
  • We need a new set of guidelines that focus on what we absolutely need to do.
  • We need to discuss, debate and examine our approach to cataloguing, and not be defensive about our roles.
  • We tend to arrange collections down to item level. In particular, we carry out preservation activities to this level. We accept the premise that basic preservation steps necessitate an item-level approach.
  • We often remove all metal fastenings and put materials into acid-free folders. So, even if we do not describe collections down to item level (maybe we just describe at collection or series level), we go down to this level of detail in our preservation activities. Yet, with good climate control, metal fasteners should not rust, and as yet we do not have strong evidence of a detrimental effect of standard manila folders if the material is stored in a controlled environment.
  • We often weed out duplicates throughout a collection, which requires processing down to item level. Is this really worth doing?
  • The various sources of advice about the level of detail we process archives to are inconsistent. Some sources advocate description to series level, but preservation activities to item level. NARA advocates preservation in accordance with intrinsic value and anticipated use, so, for example, new folders should only be used if current ones are damaged, and metal fasteners should be removed only if ‘appropriate’ – meaning where they are causing obvious damage.
  • We seem to believe that we need to aspire to ‘a substantial, multi-layered, descriptive finding aid,’ a reflection of ‘slow, careful scholarly research’.  But in reality, maybe we should adopt a more flexible approach, taking each collection in turn on its merits. Some may justify detailed cataloguing, but many do not.
  • We should take the position that users come to do research, and that we do not have to do this for them in advance.
  • We should ‘get beyond our absurd over-cautiousness’ about providing access to unprocessed collections, and make them available unless there are good legal or preservation reasons to restrict access or the collection is of extremely high value.
  • We have very inadequate processing metrics. Attempts to quantify processing expectations have resulted in wildly differing figures. Figures given in various studies include 3, 6.9, 8, 12.7 and 10.6 hours per cubic foot. Other studies have come up with between 3 and 5.5 days per foot.
  • One major study by an archive centre revealed 15.1 hours were spent on each cubic foot, far more than the value that was placed upon what was accomplished. The study gave ‘an improved sense of the real and total costs involved’.
  • The Greene/Meissner study looked at various projects funded by NHPRC grants (National Historical Publications and Records Commission), and found an average productivity figure of 9 hours per foot, but with highs of around 67 hours per foot. It also conducted an email survey and found expectations of processing times averaged at 14.8 hours, although there was a high of 250 hours!
  • Grant funding often encourages an item-level focus, rather than helping us to really tackle our substantial backlogs. There should be more of a requirement to justify meticulous processing – it should only be for exceptional collections.
  • The study recommends aiming for a processing rate of 4 hours per cubic foot for most large 20th century collections, using a series-level approach for description and preservation.
  • Studies show a lack of standardisation, not only in our definitions but also around the levels of arrangement, preservation and access that are useful and necessary. We do not have proper administrative controls over this work. We tend to argue that each of us has a unique situation that does not allow for comparison, and we do not have a common sense of acceptable policies and procedures.
  • Whilst we continue to process to item level, a substantial number of repositories do not make catalogues available through OPACs or Websites, arguably prioritising processing over user needs.
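
To get a feel for what these rates mean in practice, here is a rough worked calculation in Python. The hours-per-cubic-foot figures are the ones quoted in the list above; the backlog of 500 cubic feet and the 35-hour working week are hypothetical numbers chosen purely for illustration.

```python
# Processing rates quoted in the summary above, in hours per cubic foot.
RATES = {
    "recommended target": 4.0,
    "NHPRC project average": 9.0,
    "email survey expectation": 14.8,
    "archive centre study": 15.1,
}

BACKLOG_CUBIC_FEET = 500  # hypothetical backlog size
HOURS_PER_WEEK = 35       # hypothetical working week for one archivist


def processing_weeks(hours_per_foot,
                     cubic_feet=BACKLOG_CUBIC_FEET,
                     hours_per_week=HOURS_PER_WEEK):
    """Working weeks one person needs to process a backlog at a given rate."""
    return hours_per_foot * cubic_feet / hours_per_week


if __name__ == "__main__":
    for label, rate in RATES.items():
        weeks = processing_weeks(rate)
        print(f"{label}: {rate} hours/foot, about {weeks:.0f} working weeks")
```

Whatever backlog size you plug in, the ratio stays the same: at the archive centre study's 15.1 hours per foot the work takes nearly four times as long as at the recommended 4 hours per foot.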

The report concludes that maybe we should recognise that ‘the use of archival records…is the ultimate purpose of identification and administration.’ (SAA, Planning for the Archival Profession, 1986). Maybe we should agree that a collection is catalogued if it ‘can be used productively for research.’ And maybe we should be willing to take a different approach for each collection, making choices and setting priorities, rather than being too caught up in a ‘love of craftsmanship’ that could be seen as fastidiousness that does not truly serve the user.

The question seems to be how much would be lost by putting speed of processing before careful examination of all documents in a collection. Maybe this does require defining good cataloguing? Maybe we believe that our professional standing is tied up with undertaking detailed cataloguing…more so than the ever-increasing growth of backlogs, where the papers are entirely inaccessible to researchers?

Greene and Meissner state that there should be a ‘golden minimum’ for processing, where we adequately address user needs and only go beyond this where there are demonstrable business reasons. They also believe that arrangement, description and preservation should all occur at the same level of detail, again, unless there are good reasons to deviate from this.

What do you think…?

Out and about or Hub contributor training

Every year we provide our contributors and potential contributors with free training on how to use our EAD editor software.

The days are great fun and we really enjoy the chance to meet archivists from around the UK and find out what they are working on.

The EAD editor has been developed so that archivists can create online descriptions of their collections without having to know EAD.  It’s intuitive and user friendly and allows contributors to easily add collection level and multi-level descriptions to the Hub.  Users can also enhance their descriptions by adding digital archival objects  – images, documents and sound files.

Contributor training day

Our training days are a mixture of presentation, demonstration and practical hands-on work. We (the training team consists of Jane, Beth and myself) tend to start by talking a little about Hub news and developments to set the scene for the day, and then we move on to why the Hub uses EAD and why using standards is important for interoperability and means that more ‘stuff’ can be done with the data. We go from here on to a hands-on session that demonstrates how to create a basic record. We also cover adding lower level components and images, and we show contributors how to add index terms to their descriptions. (Something that we heartily endorse! We LOVE standards and indexing!)

We always like to tailor our training to the users, and encourage users to bring along their own descriptions for the hands-on sessions. Some users manage to submit their first descriptions to the Hub by the end of the training session!

This year we have done training in Manchester and London, for the Lifeshare project team in Sheffield and for the Oxford colleges. We are also hoping (if we get enough take up) to run courses in Glasgow and Cardiff this year. (6th Sept at Glasgow Caledonian, Cardiff date TBC. Email archiveshub@mimas.ac.uk to book a place)

So far this year three new contributors have joined the Hub as a result of training:  Middle East Centre Archive, St Antony’s College, Oxford; Salford City Archive and the Taylor Institute, Oxford. We’ve also enabled four of our existing contributors to start updating their collections on the Hub: National Fairground Archive, the Co-operative Archive, St John’s College, Oxford and the V&A.

We have been given some great feedback this year and 100% of our attendees agreed/strongly agreed that they were satisfied with the content and teaching style of the course.

Some of our feedback:

A very good introductory session to working with the EAD editor for the Archives Hub. I have not used the Archives Hub for a long time so an excellent refresher course.

This was a fantastic workshop – excellently designed resources, Lisa and Jane were really helpful (and patient!). The hands-on aspect was really useful: I now feel quite confident about creating EAD records for the Hub, and even more confident that the Hub team are on hand with online help

The hands on experience and being able to ask questions of the course leaders as things happened was really useful. Being able to work on something relevant to me was also a bonus.

Excellent presentation and delivery. I came along with a theoretical but not a practical knowledge of the Archives Hub and its workings, and the training session was pitched perfectly and was completely relevant to my job. Many thanks.

The Hub team train archivists to use the EAD editor, teach archive students about EAD and social media, and show research students how to use the Hub to search for primary source materials. You can find a list of the training we provide on our training pages: http://archiveshub.ac.uk/trainingmodules/. We’re always happy to hear from people who are interested in training – do let us know!

HubbuB: August 2011

We are out and about in August. Jane and Joy will be going to the Society of American Archivists’ Conference this year, speaking as part of a panel session. We will be talking about Discovery, the Archives Hub and Linked Data. We’re also very excited to be visiting the OCLC offices in Dublin, Ohio. Lisa and Bethan will be at the Archives and Records Association conference in Edinburgh, so go and say hello if you are there. Lisa is also speaking at the conference.

Our Monthly Feature is all levitating women and mustachioed men, as we take a trip into Magic and Illusion at the Fairground Archive: http://archiveshub.ac.uk/features/magic/. Some great images, and a lovely photograph of Cyril Critchlow, a wizard in his 80s, performing as ‘Wizardo, Harry Potter’s grandfather’!

We’ve recently created a page of Top Tips for Cataloguing: http://archiveshub.ac.uk/cataloguingtips/. These are some of the key areas that we believe are important for good online catalogues. We do still find that archivists don’t always think about the global online environment, so it’s worth setting out some of the most important points to bear in mind. It’s partly about thinking of the audience, browsing the Web, using Google, scanning pages for relevant content, and it’s partly about descriptions – ensuring that the title is as clear and self-explanatory as possible, thinking about how best to describe the archive in a way that is user-friendly.

We’ve been talking about ways to help get descriptions onto the Hub when they are created in Microsoft Word or Excel. We’re just exploring possibilities at the moment, but we are interested in anyone who uses, or knows anyone who uses, Microsoft Word to catalogue. Maybe smaller offices, or maybe you ask volunteers to do some of this?

We know people do use Microsoft Excel as well. We are thinking about ‘Tips for using Excel’. Would this be useful? We don’t necessarily want to give the impression that Excel is the most appropriate choice for cataloguing – it’s spreadsheet software, not really designed for complex hierarchical archives. But we do realise that for some people, the choice of what to use is limited, and we want to do our best to accommodate the realities that people are faced with.

We’ve had some interest in the idea of researchers being able to request digital copies of archives through the Hub. That is, a researcher comes across an archive they would like to see, and they would like digital copies, so they indicate this in some way. Not yet fully thought out, but again, we’d need to know if there is a need for this. How many offices are starting to digitise on demand?

Finally, we’re covering music, dance, plants, medicine and the Middle East with our latest contributors. Check out who is recently on board on our contributors’ page:
http://archiveshub.ac.uk/contributors/

A bit about Resource Discovery

The UK Archives Discovery Network (UKAD) recently advertised our upcoming Forum on the archives-nra listserv. This prompted one respondent to ask whether ‘resource discovery’ is what we now call cataloguing and getting the catalogues online. The respondent went on to ask why we feel it necessary to change the terminology of what we do, and labelled the term resource discovery as ‘gobbledegook’. My first reaction to this was one of surprise, as I see it as a pretty plain-talking way of describing the location and retrieval of information, but then I thought that it’s always worth considering how people react and what leads them to take a different perspective.

It made me think that even within a fairly small community, which archivists are, we can exist in very different worlds and have very different experiences and understanding. To me, ‘resource discovery’ is a given; it is not in any way an obscure term or a novel concept. But I now work in a very different environment from when I was an archivist looking after physical collections, and maybe that gives me a particular perspective. Being manager of the Archives Hub, I have found that a significant amount of time has to be dedicated to learning new things and absorbing new terminology. There seem to be learning curves all over the place, some little and some big. Learning curves around understanding how our Hub software (Cheshire) processes descriptions, Encoded Archival Description, deciding whether to move to the EAD schema, understanding namespaces, search engine optimisation, sitemaps, application programming interfaces, character encoding, stylesheets, log reports, ways to measure impact, machine-to-machine interfaces, scripts for automated data processing, linked data and the semantic web, etc. A great deal of this is about the use of technology, and figuring out how much you need to know about technology in order to use it to maximum effect. It is often a challenge, and our current Linked Data project, Locah, is very much a case in point (see the Locah blog). Of course, it is true that terminology can sometimes get in the way of understanding, and indeed, defining and having a common understanding of terms is often itself a challenge.

My expectation is that there will always be new standards, concepts and innovations to wrestle with, try to understand, integrate or exclude, accept or reject, on pretty much a daily basis. When I was the archivist at the RIBA (Royal Institute of British Architects), back in the 1990s, my world centered much more around solid realities: around storerooms, temperature and humidity, acquisitions, appraisal, cataloguing, searchrooms and the never-ending need for more space and more resources. I certainly had to learn new things, but I also had to spend far more time than I do now on routine or familiar tasks; very important, worthwhile tasks, but still largely familiar and centered around the institution that I worked for and the concepts and terminology commonly used by archivists. If someone had asked me what resource discovery meant back then, I’m not sure how I would have responded. I think I would have said that it was to do with cataloguing, and I would have recognised the importance of consistency in cataloguing. I might have mentioned our Website, but only in as far as it provided access through to our database. The issues around cross-searching were still very new and ideas around usability and accessibility were yet to develop.

Now, I think about resource discovery a great deal, because I see it as part of my job to think of how to best represent the contributors who put time and effort into creating descriptions for the Hub. To use another increasingly pervasive term, I want to make the data that we have ‘work harder’. For me, catalogues that are available within repositories are just the beginning of the process. That’s fine if you have researchers who know that they are interested in your particular collections. But we need to think much more broadly about our potential global market: all the people out there who don’t know they are interested in archives – some, even, who don’t really know what archives are. To reach them, we have to think beyond individual repositories and we have to see things from the perspective of the researcher. How can we integrate our descriptions into the ‘global information environment’ in a much more effective way? A most basic step here, for example, is to think about search engine optimisation. Exposing archival descriptions through Google, and other search engines, has to be one very effective way to bring in new researchers. But it is not a straightforward exercise – books are written about SEO and experts charge for their services in helping optimise data for the Web. For the Archives Hub, we were lucky enough to be part of an exercise looking at SEO and how to improve it for our site. We are still (pretty much as I write) working on exposing our actual descriptions more effectively.

Linked Data provides another whole world of unfamiliar terminology to get your head round. Entities, triples, URI patterns, data models, concepts and real world things, sparql queries, vocabularies – the learning curve has indeed been steep. Working on outputting our data as RDF (a modelling framework for Linked Data) has made me think again about our approach to cataloguing and cataloguing standards. At the Hub, we’re always on about standards and interoperability, and it’s when you come to something like Linked Data, where there are exciting possibilities for all sorts of data connections, well beyond just the archive community, that you start to wish that archivists catalogued far more consistently. If only we had consistent ‘extent’ data, for example, we could look at developing a lovely map-based visualisation showing where there are archives based on specific subjects all around the country and have a sense of where there are more collections and where there are fewer collections. If only we had consistent entries for people’s names, we could do the same sort of thing here, but even with thesauri, we often have more than one name entry for the same person. I sometimes think that cataloguing is more of an art than a science, partly because it is nigh on impossible to know what the future will bring, and therefore knowing how to catalogue to make the most of as yet unknown technologies is tricky to say the least. But also, even within the environment we now have, archivists do not always fully appreciate the global and digital environment which requires new ways of thinking about description. Which brings me back to the idea of whether resource discovery is another term for cataloguing and getting catalogues online. No, it is not. It is about the user perspective, about how researchers locate resources and how we can improve that experience.
The term ‘resource’ has increasingly become identified with the Web as a way to define the fundamental elements of the Web: objects that are available and can be accessed through the Internet, in fact, any concept that has an identity expressed as a URI. Yes, cataloguing is key to archives discovery, cataloguing to recognised standards is vital, and getting catalogues online in your own particular system is great…but there is so much more to the whole subject of enabling researchers to find, understand and use archives and integrating archives into the global world of resources available via the Web.
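
For readers new to the Linked Data terminology mentioned above, a triple is simply a statement in three parts: a subject, a predicate and an object. The sketch below uses plain Python tuples rather than an RDF library, and every URI, predicate and name entry in it is made up for illustration; it does not reflect the Hub’s actual data model or URI patterns. It shows why inconsistent name entries matter: simple tidying can reconcile differences in punctuation, but a differently ordered entry still produces a second URI, so one person looks like two.

```python
import re

BASE = "http://example.org/id/"  # hypothetical URI base, not a real service


def name_to_uri(name):
    """Turn a name entry into a URI by lower-casing it and collapsing punctuation."""
    slug = re.sub(r"[^a-z0-9]+", "-", name.lower()).strip("-")
    return BASE + "person/" + slug


def triples_for(collection_id, creator_name):
    """Return (subject, predicate, object) statements for one catalogue entry."""
    collection = BASE + "collection/" + collection_id
    person = name_to_uri(creator_name)
    # Real RDF predicates are URIs too; short strings are used here for brevity.
    return [
        (collection, "createdBy", person),
        (person, "label", creator_name),
    ]


# Three hypothetical catalogue entries that all refer to the same person.
catalogue = [
    ("coll1", "Smith, John, 1900-1980"),
    ("coll2", "Smith, John (1900-1980)"),
    ("coll3", "John Smith"),
]

triples = [t for entry in catalogue for t in triples_for(*entry)]

# Distinct person URIs named as creators across the three collections.
creator_uris = {obj for (_, pred, obj) in triples if pred == "createdBy"}

if __name__ == "__main__":
    for uri in sorted(creator_uris):
        print(uri)
```

The first two entries end up with the same URI, while the third does not, which is the kind of gap that consistent cataloguing (or a shared name authority) closes far more reliably than clean-up after the fact.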