Digital Content on Archives Hub

As part of the Archives Hub Labs ‘Images and Machine Learning’ project we are currently exploring the challenges around implementing IIIF image services for archival collections, and also for Archives Hub more specifically as an aggregator of archival descriptions. This work is motivated by our desire to encourage the inclusion of more digital content on Archives Hub, and to improve our users’ experience of that content, in terms of both display and associated functionality.

Before we start to report on our progress with IIIF, we thought it would be useful to capture some of our current ideas and objectives with regard to the presentation of digital content on Archives Hub. This will help us to assess at later stages of the project how well IIIF supports those objectives, since it can be easy to get caught up in the excitement of experimenting with new technologies and lose sight of one’s starting point. It will also help our audience to understand how we’re aiming to develop the Hub, and how the Labs project supports those aims.

The poet Edward Thomas, ‘Wearing hat, c.1904’.

Why more digital content?

  • We know it’s what our users want
  • Crucial part of modern research and engagement with collections, especially after Covid
  • Another route into archives for researchers
  • Contributes to making archives more accessible
  • Will enable us to create new experiences and entry points within Archives Hub
  • To support contributing archives which can’t host or display content themselves

The Current Situation

At the moment our contributors can include digital content in their descriptions on Archives Hub. They add links to their descriptions prior to publication, and they can do this at any level, e.g. ‘item’ level for images of individually catalogued objects, or maybe ‘fonds’ or ‘collection’ level for a selection of sample images. If the links are to image files, these are displayed on the Hub as part of the description. If the links are to video or audio files, or documents, we just display a link.
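As a rough illustration of the rule described above (image links are embedded, everything else is shown as a plain link), here is a minimal Python sketch. It is not the Hub's actual code: the function name `display_mode` and the example URLs are hypothetical, and guessing the type from the file extension is an assumption made for the sake of the example.

```python
import mimetypes
from urllib.parse import urlparse


def display_mode(link: str) -> str:
    """Return 'embed' for links to image files and 'link' for anything else.

    Assumption: the content type can be guessed from the file extension
    in the URL path. Links with no recognisable extension fall back to 'link'.
    """
    path = urlparse(link).path
    mime_type, _ = mimetypes.guess_type(path)
    if mime_type is not None and mime_type.startswith("image/"):
        return "embed"
    return "link"


# Hypothetical example links, not real Hub content.
print(display_mode("https://example.org/photos/poster.jpg"))
print(display_mode("https://example.org/audio/interview.mp3"))
```

An image link comes back as 'embed', while audio, video and document links come back as 'link', which mirrors the current behaviour where only images are shown within the description.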

There are a few disadvantages to this setup: adding individual links to descriptions can be a labour-intensive process; links often go dead because content is moved, leading to disappointment for researchers; and it means contributing archives need to be able to host content themselves, which isn’t always possible.

From Glasgow School of Art Archives: Art, Design and Architecture Collection

Where images are included in descriptions, these are embedded in the page as part of the description itself. If there are multiple images they are arranged to best fit the size of the screen, which means their order isn’t preserved.

If a user clicks on an image it is opened in a pop out viewer, which has a zoom button, and arrows for browsing if there is more than one image.

The embedded image and the viewer are both quite small, so there is also a button to view the image in fullscreen.

The viewer and the fullscreen option both obscure all or part of the description itself, and there is no descriptive information included around the image other than a caption, if one has been provided.

As you can see the current interface is functional, but not ideal. Listed below are some of the key things we would like to look at and improve going forwards. The list is not intended to be exhaustive, but even so it’s pretty long, and we’re aware that we might not be able to fix everything, and certainly not in one go.

Documenting our aims, though, is an important part of steering our innovations work, even if those aims end up evolving as part of the exploration process.

Display and Viewing Experience

❐ The viewer needs updating so that users can play audio and video files in situ on the Hub, just as they can view images at the moment. It would be great if they could also read documents (PDF, Word etc).

❐ Large or high-resolution image files should load more quickly into the viewer.

❐ The viewer should also include tools for interacting with content, e.g. for images: zoom, rotate, greyscale, adjust brightness/contrast etc; for audio-visual files: play, pause, rewind, modify speed etc.

❐ When opened, any content viewer should expand to a more usable size than the current one.

❐ Should the viewer also support the display of descriptive information around the content, so that if the archive description itself is obscured, the user still has context for what they’re looking at? Any viewer should definitely clearly display rights and licensing information alongside content.

Search and Navigation

❐ The Archives Hub search interface should offer users the option to filter by the type of digital content included in their search results (e.g. image, video, PDF etc).

❐ The search interface should also highlight the presence of digital content in search results more prominently, and maybe even include a preview?

❐ When viewing the top level of a multi-level description, users should be able to identify easily which levels include digital content.

❐ Users should also be able to jump to the digital content within a multi-level description quickly – possibly being able to browse through the digital content separately from the description itself?

❐ Users should be able to begin with digital content as a route into the material on Archives Hub, rather than only being able to search the text descriptions as their starting point.

Contributor Experience

❐ The Archives Hub could offer some form of hosting service, to support archives, improve availability of digital content on the Hub, and allow for the development of workflows around managing content.

❐ We could develop a user-friendly method for linking content to descriptions, to make including and updating digital content easy and time-efficient.

❐ Any workflows or interfaces for managing digital content should be straightforward and accessible for non-technical staff.

❐ If contributors wish to publish or curate their digital content on the Archives Hub, the service could give them access to innovative but sustainable tools, which drive engagement by highlighting their collections.

❐ If possible, any resources created should be re-usable within an archive’s own sites or resources – making the most of both the material and the time invested.

❐ We could offer options for contributors to curate content in creative and inventive ways which aren’t tied to cataloguing alone, and which offer alternative ways of experiencing archival material for users.

Future Possibilities

❐ It would be exciting for users to be able to ‘collect’, customise or interact with content in more direct ways. Some examples might include:
– Creating their own collections of content
– Creating annotations or notes
– Publicly tagging or commenting on content

❐ Develop the experience for users with things like: automated tagging of images for better search; providing searchable OCR scanned text for text within images; using the tagging or classification of content to provide links to information and resources elsewhere.

Image credits

Edward Thomas: Papers of Edward Thomas (GB 1239 424/8/1/1/10), Cardiff University Archives / Prifysgol Caerdydd.

Images and Machine Learning Project

Under our new Labs umbrella, we have started a new project, ‘Images and Machine Learning’, which has three distinct but related strands.

screenshot with bullet points to describe the DAO store, IIIF and Machine Learning
The three themes of the project

We will be working on these themes with ten participants, who already contribute to the Archives Hub, and who have expressed an interest in one or more of these strands: Cardiff University, Bangor University, Brighton Design Archives at the University of Brighton, Queen’s University Belfast, the University of Hull, the Borthwick Institute for Archives at the University of York, the Geological Society, the Paul Mellon Centre, Lambeth Palace (Church of England) and Lloyds Bank.

This project is not about pre-selecting participants or content that meet any kind of criteria. The point is to work with a whole variety of descriptions and images, and not in any sense to ‘cherry pick’ descriptions or images in order to make our lives easier. We want a realistic sense of what is required to implement digital storage and IIIF display, and we want to see how machine learning tools work with a range of content. Some of the participants will be able to dedicate more time to the project, others will have very little time; some will have technical experience, others won’t. A successful implementation that runs beyond our project and into service will need to fit in with our contributors’ needs and limitations. A project that asks people for amounts of time they cannot sustain long-term is problematic, because it is then unlikely to turn into a workable service.

DAO Store

Over the years we have been asked a number of times about hosting content for our contributors. Whilst there are already options available for hosting, there are issues of cost, technical support, fit for purpose-ness, trust and security for archives that are not necessarily easily met.

Jisc can potentially provide a digital object store that is relatively inexpensive, integrated with the current Archives Hub tools and interfaces, and designed specifically to meet our own contributors’ requirements. In order to explore this proposal, we are going to invest some resource into modifying our current administrative interface, the CIIM, to enable the ingest of digital content.

We spent some time looking at the feasibility of integrating an archival digital object store with the current Jisc Preservation Service. However, for various reasons this did not prove to be a practical solution. One of the main issues is the particular nature of archives as hierarchical multi-level collections. Archival metadata has its own particular requirements. The CIIM is already set up to work with EAD descriptions and by using the CIIM we have full control over the metadata so that we can design it to meet the needs of archives. It also allows us to more easily think about enabling IIIF (see below).

The idea is that contributors use the CIIM to upload content and attach metadata. They can then organise and search their content, and publish it, in order to give it web address URIs that can be added to their archival descriptions – both in the Archives Hub and elsewhere.

It should be noted that this store is not designed to be a preservation solution. As mentioned above, Jisc already provides a preservation service, and there are many other services available. This is a store for access and use, and for providing IIIF-enabled content.

The metadata fields have not yet been finalised, but we have a working proposal and some thoughts about each field.

Title: mandatory? individual vs batch?
Dates: preferably structured, options for approx. and not dated.
Licence: possibly a URI. option to add institution’s rights statement.
Resource type: controlled list. values to be determined with participants. could upload a thesaurus. could try ML to identify type.
Keywords: free text
Tagging: enable digital objects to be grouped e.g. by topic or e.g. ‘to do’ to indicate work is required
Status: unpublished/published. May refer to IIIF enabled.
URL: unique URI of image (at individual level)
Proposed fields for the Digital Object Store

We need to think about the workflow and user interface. The images would be uploaded and not published by default, so that they would only be available to the DAO Store user at that point. On publication, they would be available at a designated URL. Would we then give the option to re-size? Would we set a maximum size? How would this fit in with IIIF and the preference for images of a higher resolution? We will certainly need to think about how to handle low resolution images.
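To make the proposal a little more concrete, the working field list and the upload-then-publish workflow described above could be modelled roughly as follows. This is only a sketch under stated assumptions, not the CIIM data model: the class name `DigitalObject`, the attribute names, the choice to make the title mandatory, and the example URL are all hypothetical, and the fields themselves have not been finalised.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class DigitalObject:
    """Hypothetical record mirroring the proposed DAO Store fields."""

    title: str                           # assumed mandatory here; still an open question
    dates: Optional[str] = None          # ideally structured; may be approximate or undated
    licence: Optional[str] = None        # possibly a URI
    resource_type: Optional[str] = None  # would come from a controlled list
    keywords: str = ""                   # free text
    tags: List[str] = field(default_factory=list)  # e.g. a topic, or 'to do'
    status: str = "unpublished"          # uploads are not published by default
    url: Optional[str] = None            # only assigned on publication

    def publish(self, url: str) -> None:
        """Mark the object as published and record its designated URL."""
        self.status = "published"
        self.url = url


# Hypothetical usage: upload first, publish later.
poster = DigitalObject(title="Festival of Britain poster", tags=["to do"])
print(poster.status, poster.url)
poster.publish("https://example.org/dao/festival-poster")
print(poster.status, poster.url)
```

Keeping the URL empty until publication reflects the idea that content is only visible to the DAO Store user until it is published.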

International Image Interoperability Framework

IIIF is a framework that enables images to be viewed in any IIIF viewer. Typically, they can be sequenced, such as for a book, and they are zoomable to a very high resolution. At the heart of IIIF is the principle that organisations expose images over the web in a way that allows researchers to use images from anywhere, using any platform that speaks IIIF. This means a researcher can group images for their own research purposes, and very easily compare them. IIIF promotes the idea of fully open digital content, and works best with high resolution images.
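For readers curious about what "speaking IIIF" looks like in practice, the IIIF Image API expresses each image request as a URL with a fixed pattern: identifier, region, size, rotation, then quality and format. The Python sketch below builds such URLs. The server address and identifier are hypothetical, and the default size value `max` follows version 3.0 of the Image API (version 2 used `full` for the same purpose).

```python
from urllib.parse import quote


def iiif_image_url(base: str, identifier: str, region: str = "full",
                   size: str = "max", rotation: str = "0",
                   quality: str = "default", fmt: str = "jpg") -> str:
    """Build a IIIF Image API request URL.

    Pattern: {base}/{identifier}/{region}/{size}/{rotation}/{quality}.{format}
    The identifier is percent-encoded so that any slashes inside it
    do not get mistaken for path separators.
    """
    encoded_id = quote(identifier, safe="")
    return f"{base.rstrip('/')}/{encoded_id}/{region}/{size}/{rotation}/{quality}.{fmt}"


# Hypothetical image server and identifier.
BASE = "https://example.org/iiif"

# The whole image at the largest size the server will provide.
print(iiif_image_url(BASE, "poster-01"))
# → https://example.org/iiif/poster-01/full/max/0/default.jpg

# A 1024x1024 pixel region from the top-left corner, scaled to 512 pixels wide.
print(iiif_image_url(BASE, "poster-01", region="0,0,1024,1024", size="512,"))
# → https://example.org/iiif/poster-01/0,0,1024,1024/512,/0/default.jpg
```

Requests for regions like the second one are what allow a viewer to zoom into a very large image without downloading the whole file.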

There are a number of demos here: https://matienzo.org/iiif-archives-demo/

And here is a demo provided by Project Mirador: http://projectmirador.org/demo/

An example from the University of Cambridge: https://cudl.lib.cam.ac.uk/view/MS-RGO-00014-00051/358

And one from the University of Manchester: https://www.digitalcollections.manchester.ac.uk/collections/ruskin/1

There are very good reasons for the Archives Hub to get involved in IIIF, but there are challenges being an aggregator that individual institutions don’t face, or at least not to the same degree. We won’t know what digital content we will receive, so we have to think about how to work with images of varying resolutions. Our contributors will have different preferences for the interface and functionality. On the plus side, we are a large and established service, with technical expertise and good relationships with our contributors. We can potentially help smaller and less well-resourced institutions into this world. In addition, we are well positioned to establish a community of use, to share experiences and challenges.

One thing that we are very convinced by: IIIF is a really effective way to surface digital content and it is an enormous boon to researchers. So, it makes total sense for us to move into this area. With this in mind, Jisc has become a member of the IIIF Consortium, and we aim to take advantage of the knowledge and experience within the community – and to contribute to it.

Machine Learning

This is a huge area, and it can feel rather daunting. It is also very complicated, and we are under no illusions that it will be a long road, probably with plenty of blind alleys. It is very exciting, but not without big challenges.

It seems as if ML is getting a bad reputation lately, with the idea that algorithms make decisions that are often unfair or unjust, or that are clearly biased. But the main issue lies with the data. ML is about machines learning from data, and if the data is inadequate, biased, or suspect in some way, then the outcomes are not likely to be good. ML offers us a big opportunity to analyse our data. It can help us surface bias and problematic cataloguing.

We want to take the descriptions and images that our participants provide and see what we can do with ML tools. Obviously we won’t do anything that affects the data without consulting with our contributors. But it is best with ML to have a large amount of data, and so this is an area where an aggregator has an advantage.

This area is truly exploratory. We are not aiming for anything other than the broad idea of improved discoverability. We will see if ML can help identify entities, such as people, places and concepts. But we are also open to looking at the results of ML and thinking about how we might benefit from them. We may conclude that ML only has limited use for us – at least, as it stands now. But it is changing all the time, and becoming more sophisticated. It is something that will only grow and become more embedded within cultural heritage.

Over the next several months we will be blogging about the project, and we would be very pleased to receive feedback and thoughts. We will also be holding some webinar sessions. These will be advertised to contributors via our contributors list, and advertised on the JiscMail archives-nra list.

Thomas Baron Pitfield (1903-1999): a visual autobiography

Archives Hub feature for July 2015

Monstrous Monster drawing, 1979
TP1.12 Pen-and-wash drawing of the Monstrous Monster, the Duophonia, and landscapes, 1979

This month’s archive one true love is the Thomas Baron Pitfield Collection at the Royal Northern College of Music in Manchester. Pitfield was, to name a handful of epithets, a composer, teacher, poet, artist, engineer, furniture maker, calligrapher and engraver.

He studied and later taught at the Royal Manchester College of Music (RMCM). He is a well-loved composer. However, it is the rest of his creative life that I wish to draw attention to here in this feature. In particular, his sketchbooks.

A bit of context

Pitfield was born 5 April 1903 to a strict Church of England family in Bolton. His parents had him late in life and according to his memoirs he was an unwanted and unplanned for child.

Pitfield was not born into an environment of plentiful inspiration and artistic encouragement. His creative nature was exactly that: his nature. Nurture was not a feature. In his autobiographies he mentions that he was given no means to entertain himself as a child save for his own resourcefulness which he believed fostered innovation in his early years.

Painted minstrel, 1933
TP1.10 Painted minstrel, 1933

By age two he was notably good at drawing and in school his ability to learn music almost instantaneously by ear was remarked upon. Much, he assures us, to the unimpressed pillars of his parents who intended for him to be a joiner like his father. He strove on however, collecting scraps from his father’s workshop and working them into toys and other objects.

At age 14 he was pulled from school and enrolled in an apprenticeship in the millwrights’ department of a local engineering firm, which he despised. It took time away from his creative and musical endeavours which he sneakily developed when everyone else was asleep. He also abhorred the idea that the machines he was helping to maintain could one day severely harm or even kill someone, as the near misses he witnessed assured him could happen.

The artist

“The artist [it is said] should be able to find his inspiration in the objects and life about him. I could never wax poetic about the gasometers and industrial plant.” (Pitfield, A Song After Supper, 1990, p84). And so he haunted the Bolton moors at the weekends, bringing sketchbooks with him. “The countryside is the backdrop of most of my creative thoughts.” (ibid, p12)

Tree drawing, 1981
TP1.15 Pen-and-wash drawing of a contorted tree, Dunham Park, 1981

Here we witness the birth of his sketchbook obsession. By the end of his life he had filled over 6,000 pages with thoughts, ideas, paintings, music, teachings, prose, poetry and designs. He calls them “a visual autobiography… so that they have become an outline of my life’s activities.” (ibid, p95)

Calligraphy, 1960
TP1.16 Calligraphy swirls, 1960

In his books we see everything that influenced his life for over seven decades. From the many pen-and-wash sketches of churches, woodlands, creatures and characters, to the incredible astuteness of his calligraphy and furniture designs. This stream of creative consciousness follows him through his short time as a student at the RMCM after quitting engineering at 21; working as a teacher of woodwork for the unemployed from 23; his fruitful composition career; his fondly remembered time returning as a teacher to the RMCM and beyond.

Philosophy and themes

Pitfield was a complex mould breaker. He remarks that early on he “began to see that an almost rabid conformity in those about me was no assurance of their sanity.” (Pitfield, No Song, No Supper, 1986, p24) In his life, themes of self-efficiency and great personal motivation permeate, whether it be stepping away from the religious upbringing, becoming vegetarian at a young age, his pacifism or his love of John Ruskin and William Morris.

Dunbleton Church sketch, 1968
TP1.1 Dunbleton Church sketch with Aaron Copland quote, 1968

Nevertheless Christian iconography is very apparent in his notebooks and sits alongside furniture designs and the wild nature scenes which uproot the carefully penned calligraphy and drafts for lino prints, prose and poetry. The finished artworks crop up elsewhere in the archive but it is in the sketchbooks, the first manifestation for many of his creative outputs, where we find an absolute wonderland of inspiration.

Thanks for reading. If you would like to know more about his wonderful creations then do get in touch: archives@rncm.ac.uk.

Heather Roberts
College Archivist
Royal Northern College of Music

Related:

The Thomas Pitfield collections on the Archives Hub: http://archiveshub.ac.uk/data/gb1179-tp.

Browse the collections of the Royal Northern College of Music on the Archives Hub.

Artworks copyright: The Pitfield Trust.

HubbuB: October 2011

Europeana and APENet

I have just come back from the Europeana Tech conference, a 2-day event on various aspects of Europeana’s work and on related topics to do with data. The big theme was ‘open, open, open’, as well, of course, as the benefits of a European portal for cultural heritage. I was interested to hear about Europeana’s Linked Data output, but my understanding is that at present, we cannot effectively link to their data, because they don’t provide URIs for concepts. In other words, they lack identifiers for names, such as http://data.archiveshub.ac.uk/doc/agent/gb97/georgebernardshaw, which would let us say, for example, that our ‘George Bernard Shaw’ is the same as the ‘George Bernard Shaw’ represented on Europeana.
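To show what such a link would look like if both sides had URIs, here is a small Python sketch that writes a single Linked Data statement in N-Triples form. Using `owl:sameAs` to say "these two identifiers refer to the same thing" is a common convention and is my choice for this illustration; the Europeana-side URI is entirely hypothetical, since the point above is that no such identifier is currently provided.

```python
# The standard OWL property for stating that two URIs identify the same thing.
OWL_SAME_AS = "http://www.w3.org/2002/07/owl#sameAs"


def same_as_triple(subject_uri: str, object_uri: str) -> str:
    """Return one N-Triples line asserting that two URIs denote the same entity."""
    return f"<{subject_uri}> <{OWL_SAME_AS}> <{object_uri}> ."


# The Hub URI is the one quoted above; the second URI is a made-up placeholder.
hub_shaw = "http://data.archiveshub.ac.uk/doc/agent/gb97/georgebernardshaw"
other_shaw = "http://example.org/europeana/agent/george-bernard-shaw"

print(same_as_triple(hub_shaw, other_shaw))
```

Without a stable URI on the other side, there is nothing to put in the final position of the statement, which is why the link cannot currently be made.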

I am starting to think about the Hub being part of APENet and Europeana. APENet is the archival aggregator for Europe. I have been in touch with them about the possibility of contributing our data, and if the Hub was to contribute, we could probably start from next year. Europeana only provide metadata for digital content, so we could only supply descriptions where the user can link to the digital content, but this may well be worth doing, as a means to promote the collections of any Hub contributors who do link to digital materials.

If you are a contributor, or potential contributor, we would like to know what you think. We have a quick question for you at http://polldaddy.com/poll/5565396/. It simply asks if you think it’s a good idea to be part of these European initiatives. We’d love to get your views, and you only have to leave your name and a comment if you want to.

Flickr: an easy way to provide images online

You will be aware that contributors can now add images to descriptions and links to digital content of all kinds. The idea is that the digital content then forms an integral whole with the metadata, and it is also interoperable with other systems.

I’ve just seen an announcement by the University of Northampton, who have recently added materials to Flickr. I know that many contributors struggle to get server space to put their digital content online, so this is one possible option, and of course it does reach a huge number of people this way. There may be risks associated with the persistence of the URIs for the images, but then that is the case wherever you put them.

On the Hub we now have a number of images and links to content, for example: http://archiveshub.ac.uk/data/gb1089ukc-joh, http://archiveshub.ac.uk/data/gb1089ukc-bigwood, http://archiveshub.ac.uk/data/gb1089ukc-wea, http://archiveshub.ac.uk/data/gb141boda?page=7#boda.03.03.02.

Ideally, contributors would supply digital content at item level, so the metadata is directly about the image/digital content, but it is fine to provide it at any level that is appropriate.  The EAD Editor makes adding links easy (http://archiveshub.ac.uk/dao/). If you aren’t sure what to do, please do email us.

Preferred Citation

We never had the field for the preferred citation in our old template for the creation of EAD, and it has not been in the EAD Editor up till now. We were prompted to think about this after seeing the results of a survey on the use of EAD fields presented at the Society of American Archivists conference. Around 80% of archive institutions do use it. We think it’s important to advise people how to cite the archive, so we are planning to provide this in the Editor and may be able to carry out global edits to add this to contributors’ data.

List of Contributors

Our list of contributors within the main search page has now been revised, and we hope it looks substantially more sensible, and that it is better for researchers. This process really reminded us how hard it is to come up with one order for institutions that works for everyone!  We are currently working on a regional search, something that will act as an alternative way to limit searching. We hope to introduce this next year.

And finally…A very engaging Linked Data interface

This interface demonstration by Tim Sherratt shows how something driven by Linked Data can really be very effective. It also uses some of the Archives Hub vocabulary from our own Linked Data work, which is a nice indication of how people have taken notice of what we have been doing. There is a great blog post about it by Pete Johnston, Storytelling, archives and Linked Data. I agree with Pete that this sort of work is so exciting, and really shows the potential of the Linked Data Web for enabling individual and collective storytelling…something we, as archivists, really must be a part of.

HubbuB

Diary of the Archives Hub, June 2011

Design Council Archive poster
Design Council Archive: Festival of Britain poster

This is the first of our monthly diary entries, where we share news, ideas and thoughts about the Archives Hub and the wider world. This diary is aimed primarily at archives that contribute to the Hub, or are thinking about contributing, but we hope that it provides useful information for others about the sorts of developments going on at the Hub and how we are working to promote archives to researchers.

Hub Contributors’ Forum

At the Hub we are always looking to maintain an active and constructive relationship with our contributors. Our Contributors’ Forum provides one way to do this. It is informal, friendly, and just meets once or twice a year to give us a chance to talk directly to archivists. We think that archivists also value the opportunity to meet other contributors and think about issues around data discovery.

We have a Contributors’ Forum on 7th July at the University of Manchester and if any contributors out there would like to come we’d love to see you. It is a chance to think about where the Hub is going and to have input into what you think we should be doing, where our priorities should lie and how to make the service effective for users. Just in case you all jump in at once, we do have a limit on numbers….but please do get in touch if you are interested.

The session will be from 10.30 to 1.00 at the University of Manchester with lunch provided. It will be with some members of the Hub Steering Committee, so a chance for all to mix and mingle and get to know each other. And for you to talk to Steering Committee members directly.

Please email Lisa if you would like to attend: lisa.jeskins@manchester.ac.uk.

Contributor Audio Tutorials

Our audio tutorial is aimed at contributors who need some help with creating descriptions for the Hub. It takes you through the use of our EAD Editor, step-by-step. It is also useful in a general sense for creating archival descriptions, as it follows the principles of ISAD(G). The tutorial can be found at http://archiveshub.ac.uk/tutorials/. It is just a simple audio tutorial, split into convenient short modules, covering basic collection-level descriptions through to multi-level and indexing. Any feedback greatly appreciated – if you want any changes or more units added, just let us know.

Archives Hub Feature: 100 Objects

We are very pleased with our monthly features, founded by Paddy, now ably run by Lisa. They are a chance to show the wealth of archive collections and provide all contributors the opportunity to showcase their holdings.  They do quite well on Google searches as well!

Our monthly feature for June comes from Bradford Special Collections, one of our stalwart contributors, highlighting their current online exhibition: 100 Objects.  Some lovely images, including my favourite, ‘Is this man an anarchist?’ (No!! he’s just trying to look after his family): http://archiveshub.ac.uk/features/100objects/Nationalunionofrailwaymenposter.html

Relevance Ranking

Relevance ranking is a tricky beast, as our developer, John, will attest. How to rank the results of a search in a way that users see as meaningful? Especially with archive descriptions, which range from a short description of a 100 box archive to a 10 page description of a 2 box archive!

John has recently worked on the algorithm used for relevance ranking so that results now look more as most users would expect. For example, if you searched for ‘Sir John Franklin’ before, the ‘Sir John Franklin archive’ would not come up near the top of the results. It now appears 1st in results rather than way down the list, as it was previously. Result.

Images

Since last year we have provided the ability to add images to Hub descriptions. The images have to be stored elsewhere, but we will embed them into descriptions at any level (e.g. you can have an image to represent a whole collection, or an image at each item level description).

We’ve recently got some great images from the Design Council Archive: http://archiveshub.ac.uk/data/gb1837des-dca – take a look at the Festival of Britain entries, which have ‘digital objects’ linked at item level, enabling researchers to get a great idea of what this splendid archive holds.

Any contributors wishing to add images, or simple links to digital content, can easily do so using the EAD Editor: http://archiveshub.ac.uk/images/. You can also add links to documents and audio files. Let us know if you would like more information on this.

Linking to descriptions

Linking to Hub descriptions from elsewhere has become simpler, thanks to our use of ‘cool URIs’. See http://archiveshub.ac.uk/linkingtodescriptions/. You simply need to use the basic URI for the Hub, with the /data/ directory, e.g. http://archiveshub.ac.uk/data/gb029ms207.
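For anyone generating these links in bulk, for example from a spreadsheet of references, the pattern above is simple enough to script. The following is a minimal Python sketch; the function name `description_url` is hypothetical, and it assumes you already have the identifier in the form the Hub uses, as in the example given.

```python
# Base address for Hub descriptions, as described above.
HUB_DATA_BASE = "http://archiveshub.ac.uk/data/"


def description_url(identifier: str) -> str:
    """Build a link to a Hub description from its identifier.

    Assumption: the identifier is already in the Hub's own form
    (e.g. 'gb029ms207'); only surrounding whitespace is removed.
    """
    return HUB_DATA_BASE + identifier.strip()


print(description_url("gb029ms207"))
# → http://archiveshub.ac.uk/data/gb029ms207
```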

Out and About

It would take up too much space to tell you about all of our wanderings, but recently Jane spent a very productive week in Prague at the European Libraries Automation Group (ELAG), a very friendly bunch of people, a good mix of librarians and developers, and a very useful conference centering on Linked Data.

Bethan is at the CILIP new professionals information day today, busy twittering about networking and sharing knowledge.

Lisa is organising our contributors’ workshops for this year (feels like our summer season of workshops) and has already run one in Manchester. More to follow in Glasgow, London and Cardiff. This is our first workshop in Wales, so please take advantage of this opportunity if you are in Wales or south west England. More information at http://archiveshub.ac.uk/contributortraining/

Joy is very busy with the exciting initiative, UKDiscovery. This is about promoting an open data agenda for archives, museums and libraries – something that we know you are all interested in. Take a look at the new website: http://discovery.ac.uk/.

With best wishes,
The Hub Team