How the Exploring British Design project informed the development of the Archives Hub

Back in 2014 the Archives Hub joined forces with The University of Brighton Design Archives for an exciting new project, funded by the Arts and Humanities Research Council, ‘Exploring British Design’ (EBD).

The project explored Britain’s design history by connecting design-related content in different archives, with the aim of giving researchers the freedom to explore around and within archives.

You can read a number of blog posts on the project, and there is also a video introducing the EBD website on YouTube, but in this post I want to set out what we have learned from the project and how it has informed the development of the new Archives Hub.

Unfortunately, we may not be able to maintain the website longer term, and so it seemed timely to reflect on how the principles used in this project are being taken forward.

Modelling the Data

A key component of EBD was our move away from the traditional approach of putting the archive collection at the centre of the user experience. Instead, we wanted to reflect the richness of the content – the people, organisations, places, subjects, events that a collection represents.

We had many discussions and filled many pieces of paper with ideas about how this might work.

rough ideas for data connectivity
Coming up with ideas for how EBD should work

We then took these ideas and translated them into our basic model.

model of data for EBD
Relationships between entities in the EBD data

Archives are represented on our model as one aspect of the whole. They are a resource to be referenced, as are bibliographic resources and objects. They relate to the whole – to agents, time periods, places and events. This essentially puts them into a whole range of contexts, which can expand as the data grows.
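To make the idea of archives as 'one aspect of the whole' more concrete, here is a minimal sketch of such a model in Python. It is not the EBD or Archives Hub implementation: the `Entity` and `Graph` classes, the relation labels and the example entities are all hypothetical, chosen only to show that an archive can be treated as one linked resource among many.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class Entity:
    """Any node in the model: agent, place, event, archive, object, and so on."""
    kind: str
    label: str


@dataclass
class Graph:
    """A tiny in-memory store of (subject, relation, object) links."""
    links: List[Tuple[Entity, str, Entity]] = field(default_factory=list)

    def connect(self, subject: Entity, relation: str, obj: Entity) -> None:
        self.links.append((subject, relation, obj))

    def related(self, entity: Entity, kind: Optional[str] = None) -> List[Tuple[str, Entity]]:
        """Return everything linked to `entity`, in either direction."""
        found = []
        for subject, relation, obj in self.links:
            if subject == entity:
                other = obj
            elif obj == entity:
                other = subject
            else:
                continue
            if kind is None or other.kind == kind:
                found.append((relation, other))
        return found


# Hypothetical entities, purely for illustration.
designer = Entity("agent", "Example Designer")
papers = Entity("archive", "Example Designer papers")
exhibition = Entity("event", "Example Exhibition, 1946")

graph = Graph()
graph.connect(designer, "is subject of", papers)
graph.connect(designer, "exhibited at", exhibition)

# The archive is one resource among several linked to the designer.
print(graph.related(designer))
# Seen from the archive, the designer is one of its contexts.
print(graph.related(papers, kind="agent"))
```

Because every link is stored the same way, new kinds of entity can be added without changing the structure, which is the flexibility the model is aiming for.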

Screenshot of EBD homepage
Homepage of Exploring British Design: People are foremost.

The Exploring British Design website was one way to reflect the inter-connected model that we created.

We have taken the principles of this approach with the new Archives Hub architecture and website, which was launched back in December 2016. Whilst the archive collection description stays very much in the forefront of the users’ experience, we have introduced additional tabs to represent themed collections and repositories. All three of these sources of information are, in a data and processing sense, treated equally. The user searches the Hub and the search runs across these three data sources. The model allows us to be flexible with how we present the data, so we could also try different interfaces in future, maybe foregrounding images, or events.

screenshot of Archives Hub search results
Search for ‘design industry’ gives results across Archive Collections, Themed Collections and Repositories

Names

The EBD project had a particular focus on people. We opted to combine machine methods of data extraction – data taken partly from our existing archive descriptions as well as from other external sources – with manual methods, to create rich records about designers. This manual approach is not sustainable for a large-scale service like the Archives Hub, but it shows what is possible in terms of creating more context and connectivity.

EBD website showing a person page

We wanted to indicate that well-structured data allows a great deal more flexibility in presentation. In this case the ‘Archive and Museum Resources’ are one link in the list of resources about or related to the individual. We could have come up with other ways to present the information, given how it was structured.

We are intending to introduce names pages to the Archives Hub, which will then more clearly echo the EBD approach. They will largely be created through automated processes, as we need to create them at scale. They will generally be quite brief, without the ideal structure or depth, but the principle remains that we can then link from a person page to a host of related resources. The Hub website will have a new tab for ‘Names’ and end users will be able to run searches that take in collections, themes, repositories, people and organisations.

The EBD project allowed us to explore standards used for the creation of names data. It was our first experience of using Encoded Archival Context (Corporate Bodies, Persons and Families) (EAC-CPF), so we could start to see what we could do with it, as well as discover some of the shortcomings of the standard, as our data went beyond what is supported. For example, we wanted to link images to people and events but this was not covered by the standard. It was useful to have this preliminary exploration of it, and what it can – and can’t – do, as we look to adopt it for names within the Archives Hub.
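For readers who have not seen EAC-CPF, the sketch below builds a heavily simplified person record with Python's standard `xml.etree.ElementTree`. The element names and the namespace follow the published EAC-CPF 2010 schema as far as I am aware, but the record is deliberately incomplete (it omits the mandatory `control` section, for instance) and would not validate. The function name `build_person_record` and the sample values are hypothetical and are not taken from the EBD data.

```python
import xml.etree.ElementTree as ET

EAC_NS = "urn:isbn:1-931666-33-4"  # namespace used by the EAC-CPF 2010 schema
ET.register_namespace("", EAC_NS)


def build_person_record(name: str, born: str, died: str, related_name: str) -> ET.Element:
    """Build a simplified, non-validating EAC-CPF-style record for a person."""

    def el(parent, tag, text=None, **attrs):
        # Create a namespaced child element, optionally with text and attributes.
        child = ET.SubElement(parent, f"{{{EAC_NS}}}{tag}", attrs)
        if text is not None:
            child.text = text
        return child

    root = ET.Element(f"{{{EAC_NS}}}eac-cpf")
    desc = el(root, "cpfDescription")

    identity = el(desc, "identity")
    el(identity, "entityType", "person")
    el(el(identity, "nameEntry"), "part", name)

    dates = el(el(el(desc, "description"), "existDates"), "dateRange")
    el(dates, "fromDate", born)
    el(dates, "toDate", died)

    # 'associative' is the schema's catch-all relation type.
    relation = el(el(desc, "relations"), "cpfRelation", cpfRelationType="associative")
    el(relation, "relationEntry", related_name)
    return root


# Hypothetical values, for illustration only.
record = build_person_record("Example, Jane", "1901", "1980", "Example Design Studio")
print(ET.tostring(record, encoding="unicode"))
```

The `cpfRelationType` attribute only offers a small set of broad values, which gives a flavour of why a project wanting finer-grained relationships soon runs up against the limits of the standard.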

Structured Data

One of the things the project did reinforce for me was the importance of indexing. On the Archives Hub we have always recommended indexing, but we have had mixed reactions from archivists, some feeling that it is less useful than detailed narrative, some saying that it is not needed ‘now we have Google’, some simply saying they don’t have time.

Indexing has many advantages, some of which I’ve touched on in various blog posts, and top of the list is that it brings the benefits of structured data. A name in a narrative can, in theory, be pulled out and utilised as a point of connectivity, but a name as an index term tends to be a great deal easier to work with: it is identified as a name, it usually has structured surname, forename content, it usually includes life dates and may include titles and epithets to help unambiguously identify an individual.
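As a rough illustration of why an index term is easier to work with than a name buried in narrative, the sketch below splits a term into its parts. It assumes a comma-separated 'Surname, Forename, YYYY-YYYY, epithet' pattern, which is a simplification: real index terms vary a good deal between contributors. The `IndexedName` class and the sample term are hypothetical.

```python
import re
from dataclasses import dataclass
from typing import Optional

# Life dates in the assumed form '1901-1980'.
LIFE_DATES = re.compile(r"^(\d{4})-(\d{4})$")


@dataclass
class IndexedName:
    surname: str
    forename: str
    born: Optional[int]
    died: Optional[int]
    epithet: str


def parse_index_term(term: str) -> IndexedName:
    """Split an index term into surname, forename, life dates and epithet."""
    parts = [part.strip() for part in term.split(",")]
    surname = parts[0]
    forename = parts[1] if len(parts) > 1 else ""
    born = died = None
    epithet_parts = []
    for part in parts[2:]:
        match = LIFE_DATES.match(part)
        if match and born is None:
            born, died = int(match.group(1)), int(match.group(2))
        else:
            # Anything that is not a date range is treated as epithet text.
            epithet_parts.append(part)
    return IndexedName(surname, forename, born, died, ", ".join(epithet_parts))


# Hypothetical index term, for illustration only.
print(parse_index_term("Example, Jane, 1901-1980, textile designer"))
```

Doing the same with a name mentioned in free text would mean first working out that it is a name at all, and then guessing at dates and role from the surrounding sentence.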

EBD was all about structured data, and we gave ourselves the luxury of adding to the data by hand, creating rich structured records about designers. This was partly to demonstrate what could be done in an interface, but we were well aware that it would be problematic to create records of that level of detail at scale. However, as we start to grapple with expanding name records in the Archives Hub, we have EBD as a reference point. It has helped us to think more about approaches and priorities when creating name records.

If we were to create an EAC Editor (similar to our EAD Editor) we would think carefully about how to facilitate creating relationships. Take the type of relationship: should there be a controlled list of relationship types, such as ‘worked with’, ‘collaborated with’, ‘had professional connection with’, ‘influenced by’ and ‘spouse of’? These are some of the relationships we used in EBD, after much discussion about how best to approach this. Or would it be more practical to stick to ‘associated with’ (i.e. not defined), which is easier but far less useful to a researcher? Could we have both? How would one combine them in an interface?

Another example is the potential to create timelines. If we wanted to provide end users with timelines, we would need to focus on time-bound events. There are many issues to consider here, not least of which is how comprehensive the timeline would be.
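On the question of relationship types, one possible way to 'have both' is a controlled list that falls back to the undefined 'associated with' whenever a label is not recognised. The sketch below is only an illustration of that idea: the list contains the EBD relationship types mentioned above, while the `RelationType` enum and the `classify_relation` function are hypothetical and not part of any existing editor.

```python
from enum import Enum


class RelationType(Enum):
    WORKED_WITH = "worked with"
    COLLABORATED_WITH = "collaborated with"
    PROFESSIONAL_CONNECTION = "had professional connection with"
    INFLUENCED_BY = "influenced by"
    SPOUSE_OF = "spouse of"
    ASSOCIATED_WITH = "associated with"  # the undefined fallback


def classify_relation(label: str) -> RelationType:
    """Map a cataloguer's label onto the controlled list, or fall back."""
    cleaned = " ".join(label.lower().split())  # normalise case and spacing
    try:
        return RelationType(cleaned)
    except ValueError:
        return RelationType.ASSOCIATED_WITH


print(classify_relation("Influenced by"))  # recognised type
print(classify_relation("friend of"))      # not in the list, so falls back
```

An interface could then offer the specific types as filters while still showing the 'associated with' links, so that less structured contributions are not lost.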

The vexed question of how to combine data from name descriptions created by several institutions is not something we really dealt with in EBD, but that will be one of the biggest challenges for us in aiming to implement name data on the Archives Hub.

The level of granularity that you decide upon has massive implications for complexity, resources and benefits. The more granular the data, the more potential for researchers to be able to drill down into lives, events, locations, etc. So including life dates allows for a search for designers from 1946; including places of education allows for exploring possible connections through education, but adding dates of education allows for a more specific focus still.
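The effect of granularity can be shown with a small sketch. It reads 'designers from 1946' as 'people alive in 1946', which is my assumption, and the `Person` and `Education` classes and the sample people and school are hypothetical. The point is simply that each extra piece of structured data allows a more specific question to be answered.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class Education:
    place: str
    start: Optional[int] = None  # dates of education may not be recorded
    end: Optional[int] = None


@dataclass
class Person:
    name: str
    born: int
    died: int
    education: List[Education] = field(default_factory=list)


def alive_in(people: List[Person], year: int) -> List[str]:
    """Life dates are enough to answer 'who was around in this year?'."""
    return [p.name for p in people if p.born <= year <= p.died]


def shared_education(a: Person, b: Person) -> List[Tuple[str, Optional[bool]]]:
    """Places both people studied at, with overlap True/False if dates allow.

    The overlap is None when only the place is known, so a connection is
    possible but cannot be confirmed.
    """
    results = []
    for ea in a.education:
        for eb in b.education:
            if ea.place != eb.place:
                continue
            if None in (ea.start, ea.end, eb.start, eb.end):
                results.append((ea.place, None))
            else:
                results.append((ea.place, ea.start <= eb.end and eb.start <= ea.end))
    return results


# Hypothetical people, for illustration only.
jane = Person("Jane Example", 1901, 1980, [Education("Example School of Art", 1919, 1922)])
john = Person("John Sample", 1900, 1940, [Education("Example School of Art", 1920, 1923)])
mary = Person("Mary Placeholder", 1925, 2001, [Education("Example School of Art")])

print(alive_in([jane, john, mary], 1946))
print(shared_education(jane, john))  # dates known, so overlap can be tested
print(shared_education(jane, mary))  # place only, so overlap is unknown
```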

Explaining our approach

One thing that struck me about this project was that it was harder than I had anticipated to convey to people what we were trying to achieve and what we could achieve. I tended to find that showing the website raised a number of expectations that I knew would be difficult to fulfill, and if I’m being honest, I sometimes felt rather frustrated at the lack of recognition of what we had achieved – it’s really not easy to combine, process and present different data sources!  It is ironic that the more we press forwards with new functionality, and try to push the boundaries of what we do, the more it seems that people ask for developments that are beyond that!  You can try to modify expectations by getting deep down and technical with the challenges involved in aggregating and enhancing data created over time, by different people, in different environments (we worked with CSV data, EAC-CPF data, RDF and geodata for example), with different perspectives and priorities.  But detailed explanations of technical challenges are not going to work for most audiences. End users see and make an assessment of the website; they shouldn’t really need to be aware of what is going on behind the scenes.

Originally, in our project specification, we asked the question: “How can we encourage researchers, archive and museum professionals, and the public, to apprehend an integrated and extended rather than collection-specific sense of Britain’s design history?” Whilst we did not get as far towards answering this question as we had hoped, the work that we did made me feel that it might be harder than I had envisaged. People are very used to the traditional catalogues and other finding aids that are out there, and it creates a certain (possibly unconscious) mindset. I know this too well, because, as an archivist, I have had to adjust my own thinking to see data in a different way and appreciate that traditional approaches to cataloguing and discoverability are not always suited to the digital online age.

Data Model

The hierarchical approach to data is very embedded among archivists, and this is what people are used to being presented with.  Unless archivists catalogue in a different way, providing more structured information about entities (names, places, etc) then actually presenting things in a more connected way is hard.

image of hierarchical folders
A folder structure is often used to represent archival hierarchy

A more inter-connected model, which eschews linear hierarchy in favour of fluid entity relationships and allows for a more flexible approach with the front-end interface to the data, relies upon the quality, structure and consistency of the data. If we don’t have place names at all we can’t provide a search by place. If we don’t have place names that are unambiguously identified (i.e. not just ‘Cambridge’) then we can provide a search by place, but a researcher will be presented with all places called Cambridge, anywhere in the world (including the US, Australia and Jamaica).

A diagram showing archives and other entities connected
An example of connected entities

The new Archives Hub was designed on the basis of a model that allows for entities to be introduced and new connections made.

Archives Hub Entity Relationship diagram
Entities within the Archives Hub system

So, the tabs that the end user sees in the interface can be modified and extended over time. Searches can be run across all entities; it is not solely about retrieving descriptions of archives. This approach allows for researchers to find e.g. repositories that are significantly about ‘design’ or repositories that are located in London. It allows us to introduce Themed Collections as a separate type of description, so a student doing a project on ‘plastics’ would discover the Museum of Design in Plastics as a resource alongside archive collections at repositories including Brighton Design Archives, the V&A and the Paul Mellon Centre.

screenshot of Archives Hub search results
Search for ‘plastics and design’ shows archives and themed resources
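The idea of one search running across several kinds of entity, with results grouped into tabs, can be sketched as follows. This is a toy illustration and not how the Archives Hub search is built: the records, their text and the `search_all` function are all hypothetical.

```python
from collections import defaultdict
from typing import Dict, List

# Hypothetical records of three different entity types.
RECORDS = [
    {"type": "archive collection", "title": "Papers of an example plastics designer",
     "text": "Drawings and correspondence on plastics and design."},
    {"type": "themed collection", "title": "Example plastics museum",
     "text": "A themed resource on the design of plastics."},
    {"type": "repository", "title": "Example Design Archive",
     "text": "A repository in London holding design collections."},
]


def search_all(records: List[dict], term: str) -> Dict[str, List[str]]:
    """Run one search over every record and group matching titles by type."""
    term = term.lower()
    tabs = defaultdict(list)
    for record in records:
        haystack = f"{record['title']} {record['text']}".lower()
        if term in haystack:
            tabs[record["type"]].append(record["title"])
    return dict(tabs)


# Each key in the result corresponds to a tab in the interface.
print(search_all(RECORDS, "plastics"))
print(search_all(RECORDS, "london"))
```

Adding a new entity type, such as names, would only mean adding records with a new `type` value; the search itself would not need to change.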

Website Maintenance

One of the things I’ve learnt from this project is that you need to factor in the ongoing costs and effort of maintaining a project website. The EBD website is quite sophisticated, which means there are substantial technical dependencies, and we ended up running into issues with security, upgrades and compatibility of software, issues that are par for the course for a website but nonetheless need dealing with promptly. Maybe we should have factored this in more than we did, as we know the systems administration required for the Archives Hub is no small thing, but when you are in the throes of a project your focus is on the objectives and final output more than the ongoing issues. We cannot maintain a site long-term that is not being regularly used. EBD does not get the level of use that would justify the resources we would have to put into it on an ongoing basis.

Conclusion

When we were creating the model for the Archives Hub, we thought as much about flexibility and future potential as anything else. This is one thing that we have learnt from running the Hub for 25 years and from projects like Exploring British Design. You need to plan for potential developments in order to start to work with cataloguers, to get the data into the shape that you need it to be. We wanted to be able to introduce additional entities, so that we could have names, places, languages, images, or any other entities as ‘first class citizens’ of the Hub. We wanted to be able to enhance the end user’s ability to take different paths, and locate relevant archives through different avenues of exploration.

We need to temper our ambitions for the Hub with the realities of cataloguing, aggregation and resources available, and we need as much information as we can get about what researchers really want; but this is why it is so important to encompass potential as well as current functionality. We may not be able to introduce everything we have envisioned or that users ask for right now; but it is important to understand the vital link between approaches to cataloguing, adherence to data standards, and front end functionality. We created visualisations for EBD and we would love to do this for the Hub, but it was not an easy thing to do, and so we would need to consider what the data allows, the software options available, whether the technical requirements are sustainable over time, and the effectiveness of the end result for the researcher.

Visualisation showing connections to Elizabeth Denby
Visualisation for Elizabeth Denby

When we demonstrated the visualisations in EBD, they had the wow factor that was arguably lacking in the main text-based site, but for serious researchers the wow factor is a great deal less important than the breadth and depth of the content, and that requires a model that is fundamentally rigorous, sustainable over time and realistic in terms of the data that you have to work with.

 

Archives Hub Search Analysis

Search logs can give us an insight into how people really search. Our current system provides ‘search logs’ that show the numbers based on the different search criteria and faceting that the Hub offers, including combined searches. We can use these to help us understand how our users search and to give us pointers to improve our interface.

The Archives Hub has a ‘default search’ on the homepage and on the main search page, so that the user can simply type a search into the box provided. This is described as a keyword search, as the user is entering their own significant search terms and the results returned include any archival description where the term(s) are used.

The researcher can also choose to narrow down their search by type. The figure below shows the main types the Archives Hub currently has. Within these types we also have boolean type options (all, exact, phrase), but we have not analysed these at this point other than for the main keyword search.

Archives Hub search box showing the types of searches available

There are caveats to this analysis.

1. Results will include spiders and spam

With our search logs, excluding bots is not straightforward, something which I refer to in a previous post: Archives Logs and Google Analytics. We are shortly to migrate to an entirely new system, so for this analysis we decided to accept that the results may be slightly skewed by these types of searches. And, of course, these crawlers often perform a genuine service, exposing archive descriptions through different search engines and other systems.

2. There is a small number of unaccounted-for searches

Unidentified searches only account for 0.5% of the total, and we could investigate the origins of these searches, but we felt the time it would take was not worth it at this point in time.

3. Figures will include searches from the browse list

These figures include searches actioned by clicking on a browse list, e.g. a list of subjects or a list of creators.

4. Creator, Subject and Repository include faceted searching

The Archives Hub currently has faceted searching for these entities, so when a user clicks to filter down by a specific subject, that counts as a subject search.

Results for One Month (October 2015)

Monthly figures for searches

For October 2015 the total searches are 19,415. The keyword search dominates, with a smaller use of the ‘any’ and ‘phrase’ options within the keyword search. This is no surprise, but this ‘default search’ still forms only 36% of the whole, which does not necessarily support the idea that researchers always want a ‘Google-type’ search box.

We did not analyse these additional filters (‘any/phrase/exact’) for all of the searches, but looking at them for ‘keyword’ gives a general sense that they are useful, but not highly used.

A clear second is search by subject, with 17% of the total. The subject search was most commonly combined with other searches, such as a keyword and further subject search. Interestingly, subject is the only search where a combined subject + other search(es) is higher than a single subject search. If we look at the results over a year, the combined subject search is by far the highest number for the whole year, in fact it is over 50% of the total searches. This strongly suggests that bots are commonly responsible for combined subject searches.

These searches are often very long and complex, as can be seen from the search logs:

[2015-09-17 07:36:38] INFO: 94.212.216.52:: [+0.000 s] search:: [+0.044 s] Searching CQL query: (dc.subject exact "books of hours" and/cql.relevant/cql.proxinfo (dc.subject exact "protestantism" and/cql.relevant/cql.proxinfo (dc.subject exact "bible o.t. psalms" and/cql.relevant/cql.proxinfo (dc.subject exact "authors, classical" and/cql.relevant/cql.proxinfo (dc.subject exact "bible o.t. psalms" and/cql.relevant/cql.proxinfo (dc.subject exact "law" and/cql.relevant/cql.proxinfo (dc.subject exact "poetry" and/cql.relevant/cql.proxinfo (dc.subject exact "bible o.t. psalms" and/cql.relevant/cql.proxinfo (dc.subject exact "sermons" and/cql.relevant/cql.proxinfo bath.personalname exact "rawlinson richard 1690-1755 antiquary and nonjuror"))))))))):: [+0.050 s] 1 Hits:: Total time: 0.217 secs

It is most likely that the bots are not nefarious; they may be search engine bots, or they may be indexing for the purposes of  information services of some kind, such as bibliographic services, but they do make attempts to assess the value of the various searches on the Hub very difficult.
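A log line of the kind shown above can be picked apart with a regular expression, which is a first step towards separating out searches that are probably automated. The sketch below assumes every search line follows the same layout as that one example, which may not hold across the whole log. The threshold of three subject clauses in `looks_automated` is an arbitrary assumption, not a tested rule, and the sample line uses a made-up IP address from the range reserved for documentation.

```python
import re
from typing import Optional

# Layout inferred from a single example line; other lines may differ.
LOG_LINE = re.compile(
    r"\[(?P<timestamp>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2})\] INFO: (?P<ip>[\d.]+)::"
    r".*?Searching CQL query: (?P<cql>.*?):: \[\+[\d.]+ s\] "
    r"(?P<hits>\d+) Hits:: Total time: (?P<seconds>[\d.]+) secs"
)


def parse_search_log(line: str) -> Optional[dict]:
    """Extract the main fields from one search log line, or None if no match."""
    match = LOG_LINE.search(line)
    if match is None:
        return None
    cql = match.group("cql")
    return {
        "timestamp": match.group("timestamp"),
        "ip": match.group("ip"),
        "cql": cql,
        "hits": int(match.group("hits")),
        "seconds": float(match.group("seconds")),
        # How many subject terms were combined in this one query.
        "subject_clauses": cql.count("dc.subject exact"),
    }


def looks_automated(entry: dict, max_subjects: int = 3) -> bool:
    """Crude heuristic: very long chains of subject terms suggest a crawler."""
    return entry["subject_clauses"] > max_subjects


# A shortened, made-up line in the same layout as the real example.
SAMPLE = (
    '[2015-09-17 07:36:38] INFO: 192.0.2.1:: [+0.000 s] search:: [+0.044 s] '
    'Searching CQL query: (dc.subject exact "law" and/cql.relevant/cql.proxinfo '
    '(dc.subject exact "poetry" and/cql.relevant/cql.proxinfo '
    'dc.subject exact "sermons")):: [+0.050 s] 1 Hits:: Total time: 0.217 secs'
)

entry = parse_search_log(SAMPLE)
print(entry["ip"], entry["hits"], entry["subject_clauses"], looks_automated(entry))
```

A heuristic like this would misclassify some genuine researchers who refine by several subjects, so it could only ever give a rough indication.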

Of the remaining search categories available from the main search page, it is no surprise that ‘title’ is used a fair bit, at 6.5%, and then after that creator, name, and organisation and personal name. These are all fairly even. For October 2015 they are around 3% of the total each, and it seems to be similar for other months.

The repository filter is popular. Researchers can select a single repository to find all of their descriptions (157), select a single repository and also search terms (916), and also search for all the descriptions from a single repository from our map of contributors (125). This is a total of 1,198, which is 6.2% of the total. If we also add the faceted filter by repository, after a search has been carried out, the total is 2,019, and the percentage is 10.4%. Looking at the whole year, the various options to select repository become an even bigger percentage of the total, in particular the faceted filter by repository. This suggests that improvements to the ability to select repositories, for example by allowing researchers to select more than one repository, or maybe type of repository, would be useful.
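The arithmetic behind the repository figures is easy to reproduce. The counts below are the ones quoted in this post for October 2015; the variable and function names are mine, and percentages are rounded to one decimal place.

```python
TOTAL_SEARCHES = 19_415  # all searches logged for October 2015

REPOSITORY_SEARCHES = {
    "single repository, all descriptions": 157,
    "single repository plus search terms": 916,
    "map of contributors": 125,
}
WITH_FACET_FILTER = 2_019  # the above plus the faceted filter by repository


def share(count: int, total: int = TOTAL_SEARCHES) -> float:
    """Percentage of all searches, rounded to one decimal place."""
    return round(100 * count / total, 1)


direct = sum(REPOSITORY_SEARCHES.values())
facet_only = WITH_FACET_FILTER - direct  # searches using the facet alone

print(f"direct repository searches: {direct} ({share(direct)}%)")
print(f"faceted repository filter:  {facet_only} ({share(facet_only)}%)")
print(f"all repository selections:  {WITH_FACET_FILTER} ({share(WITH_FACET_FILTER)}%)")
```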

Screen shot of Hub map

Google Map on the Hub showing the link to search by contributor

We have a search within multi-level descriptions, introduced a few years ago, and that clearly does get a reasonable amount of use, with 1,404 uses in this particular month, or 7.2% of the total. This is particularly striking as this is only available within multi-level descriptions. It is no surprise that this is valuable for lengthy descriptions that may span many pages.

The searches that get minimal use are identifier, genre, family name and epithet. This is hardly surprising, and illustrates nicely some of the issues around how to measure the value of something like this.

Identifier enables users to search by the archival reference. This may not seem all that useful, but it tends to be popular with archivists, who use the Hub as an administrative tool. However, the current Archives Hub reference search is poor, and the results are often confusing. It seems likely that our contributors would use this search more if the results were more appropriate. We believe it can fulfill this administrative function well if we adjust the search to give better quality results; it is never likely to be a highly popular search option for researchers as it requires knowledge of the reference numbers of particular descriptions.

Epithet is tucked away in the browse list, so a ‘search’ will only happen if someone browses by epithet and then clicks on a search result. Would it be more highly used if we had a ‘search by occupation or activity’? There seems little doubt of this. It is certainly worth considering making this a more prominent search option, or at least getting more user feedback about whether they would use a search like this. However, its efficacy may be compromised by the extremely permissive nature of epithet for archival descriptions – the information is not at all rigorous or consistent.

Family name is not provided as a main search option, and is only available by browsing for a family name and clicking on a result, as with epithet. The main ‘name’ search option enables users to search by family name. We did find the family name search was much higher for the whole year, maybe an indication of use by family historians and of the importance of family estate records.

Genre is in the main list of search options, but we have very few descriptions that provide the form or medium of the archive. However, users are not likely to know this, and so the low use may also be down to our use of ‘Media type’, which may not be clear, and a lack of clarity about what sort of media types people can search for. There is also, of course, the option that people don’t want to search on this facet. However, looking at the annual search figures, we have 1,204 searches by media type, which is much more significant, and maybe could be built up if  we had something like radio buttons for ‘photographs’, ‘manuscripts’, ‘audio’ that were more inviting to users. But, with a lack of categorisation by genre within the descriptions that we have, a search on genre will mean that users filter out a substantial amount of relevant material. A collection of photographs may not be catalogued by genre at all, and so the user would only get ‘photographs’ through a keyword search.

Place name is an interesting area. We have always believed that users would find an effective ‘search by place’ useful. Our place search is in the main search options, but most archivists do not index their descriptions by place and because of this it does not seem appropriate to promote a place name search. We would be very keen to find ways to analyse our descriptions and consider whether place names could be added as index terms, but unless this happens, place name is rather like media type – if we promote it as a means to find descriptions on the Archives Hub, then a hit list would exclude all of those descriptions that do not include place names.

This is one of the most difficult areas for a service like the Archives Hub. We want to provide search options that meet our users’ needs, but we are aware of the varied nature of the data. If a researcher is interested in ‘Bath’ then they can search for it as a keyword, but they will get all references to bath, which is not at all the same as archives that are significantly about Bath in Somerset. But if they search for place name: bath, then they exclude any descriptions that are significantly about Bath, but not indexed by place. In addition, words like this, that have different meanings, can confuse the user in terms of the relevance of the results because ‘bath’ is less likely to appear in the title. It may simply be that somewhere in the description, there is a reference to a Dr Bath, for example.

This is one reason why we feel that encouraging the use of faceted search will be better for our users. A simpler initial search is likely to give plenty of results, and then the user can go from there to filter by various criteria.
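The trade-off between keyword search, place-name search and faceting can be seen in a small sketch. The three descriptions are hypothetical, as are the function names, and the matching is far cruder than a real search engine: it is only meant to show which records each approach returns.

```python
import re
from collections import Counter
from typing import List

# Hypothetical descriptions: only the first is indexed by place.
DESCRIPTIONS = [
    {"title": "City council minutes", "text": "Minutes of meetings held in Bath.",
     "places": ["Bath"]},
    {"title": "Letters of a spa visitor", "text": "Letters describing a season in Bath.",
     "places": []},
    {"title": "Medical papers", "text": "Correspondence with Dr Bath.",
     "places": []},
]


def keyword_search(descriptions: List[dict], term: str) -> List[dict]:
    """Match the term as a whole word anywhere in the title or text."""
    pattern = re.compile(rf"\b{re.escape(term)}\b", re.IGNORECASE)
    return [d for d in descriptions if pattern.search(f"{d['title']} {d['text']}")]


def place_search(descriptions: List[dict], place: str) -> List[dict]:
    """Match only descriptions that have been indexed with this place."""
    return [d for d in descriptions if place in d["places"]]


def place_facets(results: List[dict]) -> Counter:
    """Count results per indexed place, keeping unindexed results visible."""
    counts = Counter()
    for d in results:
        if d["places"]:
            counts.update(d["places"])
        else:
            counts["(no place indexed)"] += 1
    return counts


hits = keyword_search(DESCRIPTIONS, "bath")
print([d["title"] for d in hits])                                # includes Dr Bath
print([d["title"] for d in place_search(DESCRIPTIONS, "Bath")])  # misses the letters
print(place_facets(hits))
```

Starting from the keyword results and then offering the place facet keeps the unindexed letters within reach, whereas a place-name search on its own would silently drop them.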

It is worth mentioning ‘date’ search. We did have this at one point, but it did not give good results. This is partly due to many units of description not including normalised dates. But the feedback that we have received suggests that a date search would be popular, which is not surprising for an archives service.  We are planning to provide a filter by date, as well as the ordering by date that we currently have.

Finally, I was particularly interested to see how popular our ‘search collection level only’ is. This enables users to only see ‘top level’ results, rather than all of the series and items as well. As it is a constant challenge to present hierarchical descriptions effectively, this would seem to be one means to simplify things. However, for October 2015 we had 17 uses of this function, and for the whole year only 148. This is almost negligible. It is curious that so few users chose to use this. Is it an indication that they don’t find it useful, or that they didn’t know what it means? We plan to have this as a faceted option in the future, and it will be interesting to see if that makes it more popular or not.

We are considering whether we should run this exercise using some sort of filtering to check for search engines, dubious IP addresses, spammers, etc., and therefore get a more accurate result in terms of human users.  We would be very interested to hear from anyone who has undertaken this kind of exercise.

 

From Ivory Tower to People Power

Here is a presentation I gave at ELAG 2015 to introduce our innovation project, Exploring British Design. The presentation is entitled ‘From Ivory Tower to People Power’ (YouTube link) and emphasises the collaborative nature of the project and the focus on people as a topic, rather than on archival description, which is not always the best starting place for researchers. The presentation covers:

  • Aims of the project
  • Workshops with postgraduate students about how they research and analysis of their research paths
  • Workshops with postgraduates about websites: what students do and don’t like in terms of discovery
  • Traditional archival cataloguing ‘lock in’ of entities such as people, places and events.
  • Connectivity beyond single A to B connections; ‘anything can be a focus’ and can link to a myriad of other things
  • Use of EAC-CPF (XML standard for archival authority files)
  • Creating the data, handcrafting data, limitations of our approach, too many ideas not enough time!
  • Demonstration of the Website

 

Exploring British Design: Interface Design Principles

Britain Can Make It, exhibition poster

For our AHRC project, ‘Exploring British Design’, one of the questions we asked is:

How might a website co-designed by researchers, rather than a top-down collection-defined approach to archive content, enhance engagement with and understanding of British design?

The workshops that we have run were one of the key ways that we hoped to understand more about how postgraduates and others research their topics, what they liked and didn’t like about websites, and in a general sense how they think and understand resources, and how we can tune into that thinking.

 

 

In the blog posts that we have created so far, we set out one of our central ideas:

Providing different routes into archives, showing different contexts, and enabling researchers to create their own narratives, can potentially be achieved through a focus on the ‘real things’ within an archive description; the people, organisations and places, and also the events surrounding them.

The feedback from the workshops gave us plenty to work with, and here I wanted to draw out some of the key messages that we are using to help us design an interface.

Researchers often think visually

Several of the participants in our workshops were visual thinkers. Maybe we had a slightly biased group, in that they work within or study design, but it seems reasonable to conclude that a visual approach can be attractive and engaging. We want to find a way to represent information more visually, whilst providing a rich and detailed resource. Our belief is that the visual should not dominate or hide the textual, as does often happen with cultural heritage resources, but that they should work better together.

Researchers often think in terms of creating a story or narrative

When we asked our participants to focus on an individual object, several of them thought in terms of its ‘story’. It seemed to me that most of the discussions that we had assumed a narrative type approach. It is hardly surprising, as when we talk about people, places and events we connect them together. It is a natural thing to do.

Different types of contexts provide value

When we asked workshop participants to think about how they would go about researching the object they were given, they tended to think of ways to contextualise it. They were interested in where it came from, in its physicality and its story. For example, we gave out photographs of an exhibition and they wanted to know where the photographs were taken, more about the exhibition and the designers involved in it, and what else was going on at that time. Our idea with Exploring British Design is that we can create records that allow these kinds of contexts to flourish. The participants did not concentrate on traditional archival context, as they did not tend to recognise this in the same way as archivists – it is one perspective amongst many.

We cannot provide a substitute for the value of handling the original object, and it was clear that researchers found this to be immensely valuable, but we can help to provide context that helps to scope reality.

Uncovering the obscure is a good thing

Not surprisingly, our workshop participants were keen that their research efforts should result in finding little-known information that they could utilise. They talked about the excitement of uncovering information and the benefits for their work.

Habits are part of the approach to research

The balance between being innovative and anchoring an interface in what people are familiar with seems to be important.

Trust is very important

The importance of trust was stressed at all of our workshops, and the need to know the context of information. We need to build something that researchers believe is a quality resource, with information they can rely on.

Serendipity is good…although it can lead you astray

It was clear that our participants wanted to explore, and liked the idea of coming across the unexpected. Several of them felt that the library bookshelves provide a good opportunity to browse and discover new sources (they talked about this more than the serendipity of the web). But there was also a note of caution about time wasted pursuing different avenues of information. It seems good to build in serendipity, whilst providing an interface that gives clear landmarks and signposts.

Search and Relevance

Our workshop participants were clear that choice of search terms has a big influence on what you find, and this can be a disadvantage. You may be presented with a search box, and you don’t really know what to search for to get what you want, especially if you don’t know what you want! Also, the relevance ranking can be a puzzle. Library databases often seem to give results that don’t make that much sense.

One thing that stood out to me was the contrast between the willingness to use Google, which is a simple search box, with no indication of how to search, that brings back huge numbers of results, and the criticisms of library databases, where choice of search term is crucial and where ‘too many results’ are seen as a problem. It seemed that the key here was effective relevance ranking, but our workshop participants did agree that relevance ranking can deceive: the first page of results may look good, but you don’t really know what you are missing. Google is good at providing a first page of useful-looking results… and maybe that’s enough to stop most people wondering about what they might be missing!

Exploring British Design

As our project has progressed, I think it is fair to say that we have benefitted hugely from the input of the students and academics that we have talked to, not only for this project but also more generally. But it was not possible for us to manage to implement a directly co-designed website. The logistics of the project didn’t allow for this, as we wanted to gather input to inform the project, and then we had the complications of pulling together the data, designing the back end and the API. We would probably have needed at least another 6 months on the project to go back to the workshop participants and ask them about the website design as we went along.

But I think we have achieved a good deal in terms of engagement. Our Exploring British Design project has been about other ways through content, moving away from a search box and a list of search results, and thinking about immersing researchers in a ‘landscape’, where they can orientate themselves but also explore freely. So, we are thinking about engagement in terms of a more visually attractive and immersive experience, giving researchers the opportunity to follow connections in a way that gives them a sense of movement through the design landscape, hints at the unknown, and shows the relevancy of the entities that are featured in the website. We hope to show how this can potentially expand understanding because it allows for a wider context and more varied narratives.

In the next project post we hope to present our interface for this pilot project!

 

Exploring British Design: Research Paths II

We recently ran a second workshop as part of our Exploring British Design project. The workshops aim to understand more about approaches to research, and researchers’ understanding and use of archives.

The second workshop was run largely on the same basis as the first workshop, using the same exercises.

Looking at what our researchers said and documented about their research paths over the two workshops, some points came out quite strongly:

  • Google is by far the most common starting point but its shortcomings are clear and issues of trust come up frequently.
  • There is often a strong visual emphasis to research, including searching for images and the use of Pinterest; there seems to be a split between those who gravitate towards a more text-based approach and those who think visually (many of our participants were graphic designers though!).
  • It is common to utilise the references listed in Wikipedia articles.
  • The library as a source is seen as part of a diverse landscape – it is one place to go to, albeit an important one. It is not the first port of call for the majority.
  • Aggregators are not specifically referred to very often. But they may be seen as a place to go if other searches don’t yield useful results.
  • Talking to people is very important, be it lecturers, experts, colleagues or friends.
  • Online research is more immediate, and usually takes less effort, but there are issues of trust and it may not yield specific enough results, or uncover the more obscure sources.
  • There is a tendency to start from the general and work towards the more specific. With the research paths of most of the researchers, the library/archive was somewhere in the middle of this process.
  • Personal habits and past experience play a very large part, but there is a real interest in finding new routes through research, so habit is not a sticking point, but simply the dominant influence unless it is challenged.

For the second workshop, the first exercise asked participants to document their likely research paths around a topic.

flip chart showing research paths for a topic
Research paths of two researchers for the topic of Simpsons of Piccadilly

 

We had four pairs of researchers looking at different topics, and we left them to discuss their research paths for about 45 minutes. The discussions following the exercise picked up on a number of areas:

Online vs Offline

We kicked off by asking the researchers about online versus ‘offline’ research paths. One participant commented that she saw online as a route through to traditional research – maybe to locate a library or archive – ‘online is telling me where to look’ but in itself it is too general and not specific enough; whereas the person she was paired with tended to do more research online. He saw online as giving the benefit of immediacy – at any time of day or night he could access content. The issue of trust came up in this discussion, and one participant summed it up nicely: “If you do online research there is less effort but there is less trust; if you research offline there is more effort but there is more trust.”

Following on from the discussion about how people go about using online services, there was a comment that things found online are often the more obvious, the more used and cited resources. Visiting a library or archive may give more opportunity to uncover little-known sources that help with original research. This seemed to be endorsed by most participants, one commenting that Pinterest tends to reflect what is trendy and popular. However, there was also a view that something like Pinterest can lead researchers to new sources, as they are benefiting from the efforts, and sometimes the quite obsessive enthusiasms, of a wide range of people.

There was agreement that online research can lead to ‘information dumping’, where you build up a formidable collection of resources, but are unlikely to get round to sorting them all out and using them.

Library Resources

The issue of effort came up later in the discussion when referring to a particular university library (probably typical of many university libraries), and the amount of effort involved in using its databases. There was a comment about how you need to ‘work yourself up to an afternoon in the library’ and there seemed to be a general agreement that the ‘search across all resources’ often produced quite meaningless results. When compared to Google, the issue seems to be that relevance ranking is not effective, so the top results often don’t match your requirements. There was also some discussion around the way that library resource discovery services often involve too many steps, and there is effort in understanding how the catalogue works. One participant, whose research centres on the Web and the online user experience, felt that printed sources were of little use to him, as they were out of date very quickly.

Curating your sources

One researcher talked about using Pinterest to organise findings visually. This was followed up by another researcher talking about how with online research you can organise and collect things yourself. It facilitates ‘curating’ your own collection of resources. It can also be easier to remember resources if they are visual. Comparing Pinterest to the Library – with the former you click to add the image to your board; with the Library you pay a visit, you find the book, you take it to the scanner, you pay to take a scan…although it is increasingly possible to take pictures of books using your own device. But the general feeling was that the Web was far quicker and more immediate.

Attitudes towards research

One participant felt that there might be a split between those more like him who see research as ‘a means to an end’ and those who enjoy the process itself. So maybe some are looking for the shortest route to the end goal, and others see research as a more exploratory activity and expect it to take time and effort. This may partly be a result of the nature and scope of the research. Short time scales preclude in-depth research.

Talking about serendipitous approaches, someone commented that browsing the library shelves can be constructive, as you can find books around your subject that you weren’t aware existed. This is replicated to some extent in something like Amazon, which suggests books you might be interested in. There was also some feeling that exploring too many avenues can take the researcher off topic and take up a great deal of time.

Trust and Citation

The issue of trust is important. A first-hand experience, whether of a place you are researching, or using physical archive sources, is the most trustworthy, because you are seeing with your own eyes, experiencing first hand or looking at primary sources first hand; a library provides the next level of trust, as a book is an interpretation, and you may feel it requires corroboration; the online world is the least trustworthy. You will have the least trust if you are looking at a website where you don’t know who or what is behind it. There was agreement that trust can come through crowd-sourced information, but also some discussion around how to cite this (for example, using the Harvard system to reference web pages and crowd-sourced resources). This led on to a short discussion around the credibility of what is cited within research. Maybe attitudes to Wikipedia are slowly changing, but at present there is generally still a feeling that a researcher cannot cite it as a source. There are traditions within disciplines around how to cite and what are the ‘right’ things to cite.

[Further posts on Exploring British Design will follow, with reflections on our workshops and updates on the project generally]


Exploring British Design: Research Paths

Introduction

As part of our Exploring British Design project we are organising workshops for researchers, aiming to understand more about their approaches to research, and their understanding and use of archives. Our intention is to create an interface that reflects user requirements and, potentially, explores ideas that we gather from our workshops.

Of course, we can only hope to engage with a very small selection of researchers in this way, but our first workshop at Brighton Design Archive showed us just how valuable this kind of face-to-face communication can be.

We gathered together a small group of 7 postgraduate design students. We divided them into 4 groups (3 pairs of researchers and a lone researcher), and we asked them to undertake 2 exercises. This post is about the first exercise and follow-up discussion. For this exercise, we presented each group with an event, person or building:

The Festival of Britain, 1951
Black Eyes and Lemonade Exhibition, Whitechapel Art Gallery, 1951
Natasha Kroll (1912-2004)
Simpsons of Piccadilly, London

We gave each group a large piece of paper, and simply asked them to discuss and chart their research paths around the subject they had been given. Each group was joined by a facilitator, who was not there to lead in any way, but just to clarify where necessary, listen to the students and make notes.

Case Study

Researchers charting their research paths for the Festival of Britain

I worked with two design students, Richard and Caroline, both postgraduate students researching aspects of design at The University of Brighton. They were looking at the subject of the Festival of Britain (FoB). It fascinated me that even when they were talking about how to represent their research paths, one instinctively went to list their methods, the other to draw theirs, in a more graphic kind of mind map. It was an immediate indication of how people think differently. They ended up using the listing method (see left).

 

diagram showing stages of research
Potential research paths for the Festival of Britain

The above represents the research paths of Richard and Caroline. It became clear early on that they would take somewhat different paths, although they went on to agree about many of the principles of research. Caroline immediately said that she would go to the University library first of all and then probably the central library in Brighton. It is her habit to start with the library, mainly because she likes to think locally before casting the net wider, and because she prefers the physicality of the resources to the virtual environment of the Web. She likes the opportunity to browse, and to consider the critical theory that is written around the subject as a starting point. Caroline prefers to go to a library or archive and take pictures of resources, so that she can then work through them at her leisure. She stressed the importance of being able to do this, and noted that high charges for the use of digital cameras can inhibit research.

Richard started with an online search. He thought about the sort of websites that he would gravitate towards – sites that were directly about the topic, such as an exhibition website. He referred to Wikipedia early on, but saw it as a potential starting place to find links to useful websites, through the external links that it includes, rather than using the content of Wikipedia articles.

Richard took a very visual approach. He focused in on the FoB logo (we used this as a representation of the Festival) and thought about researching that. He also talked about whether the FoB might have been an exhibition that showcased design, and liked the idea of an object-based approach, researching things such as furniture or domestic objects that might have been part of the exhibition. It was clear that his approach was based upon his own interests and background as a film maker. He focused on what interested and excited him; the more visual aspects including the concrete things that could be seen, rather than thinking in a text-based way.

Caroline had previous experience of working in an archive, and her approach reflected this, as well as a more text-based way of thinking. She talked about a preference for being in control of her research, so using familiar routes was preferable. She would email the Design Archives at Brighton, but that was not top of the list because it was more of an unknown quantity than the library that she was used to. Maybe because she has worked in an archive, she referred to using film archives for her research;  whereas Richard, although a film maker, did not think of this so readily. Past experience was clearly important here.

Both researchers saw the library as a place for serendipitous research. They agreed that this browsing approach was more effective in a library than online. They were clearly attracted to the idea of searching the library shelves, and discovering sources that they had not known about. I asked why they felt that this was more effective than an online exploration of resources. It seemed to be partly to do with the dependability of the physical environment and also because they felt that the choice of search term online has a substantial effect on what is, and isn’t, found.

Both researchers were also very focused on issues of trust; both were very much of the opinion that they would assess their sources in terms of provenance and authorship.

In addition, they liked the idea of being able to search by user-generated tags and to have the ability to add tags to content.

General Discussion

In the general discussion some of the points made in the case study were reinforced. In summary:

Participants found the exercise easy to do. It was not hard to think about how they would research the topics they were given. They found it interesting to reflect on their research paths and to share this with others.

For one other participant the library was the first port of call, but the majority started online.

Some took a more historical approach, others a much more narrative and story-based approach.  There were different emphases, which seemed to be borne out of personality, experiences and preferences. For example, some thought more about the ordering of the evidence, others thought more about what was visually stimulating.
It was therefore clear that different researchers took different approaches based on what they were drawn to, which usually reflected their interests and strengths.

There was a strong feeling about trust being vital when assessing sources. Knowing the provenance of an article or piece of writing was essential.

The participants agreed that putting time and effort into gathering evidence is part of the enjoyment of research. One mentioned the idea that ‘a bit of pain’ makes the end result all the more rewarding! They were taken aback at the idea that discovery services feel pressured to constantly simplify in order to ensure that we meet researchers’ needs. They understood that research is a skill and a process that takes time and effort (although, of course, this may not be how the majority of undergraduates or more inexperienced researchers feel). Certainly they agreed that information must not be withheld; it must be accessible. We (service providers) need to provide signposts, to allow researchers to take their own paths. There was discussion about ‘sleuthing’ as part of the research process, and trying unorthodox routes, as chance discoveries may be made. But there was consensus that researchers do not need or wish to be nannied!

All researchers did use Google at some point… usually using it to start their search. Funnily enough, some participants had quite long discussions about what they would do, before they realised they would actually have gone to Google first of all. It is so common now that most people don’t think about it. It seemed to operate very much as a starting point, from where the researchers would go to sites, assess their worth and ensure that the information was trustworthy.

[There will be follow up posts to this, providing more information about our researcher workshops, summarising the second activity, which was more focused on archive sources, and continuing to document our Exploring British Design project.]

 

 

Exploring British Design: New Routes through Content

At the moment, the Archives Hub takes a largely traditional approach to the navigation and display of archive collections. The approach is predicated on hundreds of years of archival theory, expanded upon in numerous books, articles, conferences and standards. It is built upon “respect des fonds” and original order. Archival provenance tells us that it is essential to provide the context of a single item within the whole archive collection; this is required in order to  understand and interpret said item.

ISAD(G) reinforces the ‘top down’ approach. The hierarchy of an archive collection is usually visualised as a tree structure, often using folders. The connections show a top-down or bottom-up approach, linking each parent to its child(ren).

image of hierarchical folders
A folder structure is often used to represent archival hierarchy

This principle of archival hierarchy makes very good sense. The importance of this sort of context is clear: one individual letter, one photograph, one drawing, can only reveal so much on its own. But being able to see that it forms part of a series, and part of a larger collection, gives it a fuller story.
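To make the idea of hierarchical context a little more concrete, here is a minimal sketch in Python. It assumes a very simple model in which each unit of description points to its parent; the class name, field names and sample titles are all invented for illustration and are not the Archives Hub data model.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Unit:
    """One level of description (hypothetical structure, for illustration only)."""
    title: str
    level: str                      # e.g. "collection", "series", "item"
    parent: Optional["Unit"] = None


def context_path(unit: Unit) -> list[str]:
    """Walk from a unit up to the top of its hierarchy and return the path top-down."""
    path = []
    current: Optional[Unit] = unit
    while current is not None:
        path.append(f"{current.level}: {current.title}")
        current = current.parent
    return list(reversed(path))


# Invented sample data: a single letter within a series within a collection.
collection = Unit("Papers of a designer", "collection")
series = Unit("Correspondence", "series", parent=collection)
letter = Unit("Letter to a client", "item", parent=series)

print(" > ".join(context_path(letter)))
```

The point of the sketch is simply that an item carries its archival context with it: following the parent links recovers the series and the collection it belongs to.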

However, I wonder if our strong focus on this type of context has meant that archivists have sometimes forgotten that there are other types of context, other routes through content. With the digital environment that we now have, and the tools at our disposal, we can broaden out our ambitions with regards to how to display and navigate through archives, and how we think of them alongside other sources of information. This is not an ‘either or’ scenario; we can maintain the archival context whilst enabling other ways to explore, via other interfaces and applications. This is the beauty of machine processable data – the data remains unchanged, but there can be numerous interfaces to the data, for different audiences and different purposes.

Providing different routes into archives, showing different contexts, and enabling researchers to create their own narratives, can potentially be achieved through a focus on the ‘real things’ within an archive description; the people, organisations and places, and also the events surrounding them.

image of entities and links
Very simplified model of entities within archive descriptions and links between them

This is a very simplified image, intended to convey the idea of extracting people, organisations and places from the data within archive descriptions (at all levels of description). Ideally, these entities and connections can be brought together within events, which can be built upon the principle of relationships between entities (i.e. a person was at a place at a particular time).

Exploring British Design is a project seeking to probe this kind of approach. By treating these entities as an important part of the ‘networks of things’, and by finding connections between the entities, we give researchers new routes through the content and the potential to tell new stories and make new discoveries. The idea is to explore ways to help us become more fully a part of the Web, to ensure that archives are not resources in isolation, but a part of the story.

A diagram showing archives and other entities connected
An example of connected entities

 

For this project, we are focussing on a small selection of data, around British design, extracting entities from the Archives Hub data, and considering how the content within the descriptions can be opened up to help us put it into new contexts.

We are creating biographical records that can be used to include structured data around relationships, places and events.  We aim to extract people from the archive descriptions in which they are ‘embedded’ so that we can treat them as entities – they can connect not only to archive collections they created or are associated with, but they can also connect to other people, to organisations, to events, to places and subjects. For example, Joseph Emberton designed Simpsons in Piccadilly, London, in 1936. There, we have the person, the building, the location and the time.
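As a rough illustration of what treating people, buildings and places as entities might look like, the Emberton example can be sketched as a small piece of Python. This is only a sketch under assumptions: the `Entity` and `Event` classes and the `events_for` helper are hypothetical names made up for this post, not the structure of the EBD records.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Entity:
    """A 'real thing' pulled out of a description (hypothetical structure)."""
    kind: str   # "person", "building", "place", ...
    name: str


@dataclass(frozen=True)
class Event:
    """Brings entities together at a point in time."""
    label: str
    year: int
    participants: tuple[Entity, ...]


emberton = Entity("person", "Joseph Emberton")
simpsons = Entity("building", "Simpsons")
piccadilly = Entity("place", "Piccadilly, London")

# The person, the building, the location and the time, held in one event.
design_event = Event("designed", 1936, (emberton, simpsons, piccadilly))


def events_for(entity: Entity, events: list[Event]) -> list[Event]:
    """Return every event in which the given entity takes part."""
    return [event for event in events if entity in event.participants]


# A researcher could start from the place rather than the person.
for event in events_for(piccadilly, [design_event]):
    names = ", ".join(p.name for p in event.participants)
    print(f"{event.year}: {event.label} ({names})")
```

Because the event holds all of the entities equally, the same record can be reached from the person, the building or the place.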

With this paradigm, the archive becomes one of the ‘nodes’ of the network, with the other entities equally to the fore, and the ability to connect them together shows how we can start to make connections between different archive collections. The idea is that a researcher could come into an archive from any type of starting point. The above diagram (created just as an example) ranges from ‘1970s TV comedy’ through to the use of Portland stone, and it links the Brighton Design Archive, the V&A Theatre and Performance Archive and the University of the Arts London Archive.

The long-term aim is that our endeavours to open up our data will ensure that it can be connected to other data sources (that have also been made open); sources outside of our own sphere (the Archives Hub data). The traditional interface has its merits; certainly we need to continue to provide archival context and navigation through collections; but we can be more imaginative in how we think about displaying content. We don’t need to just have one interface onto our data. We need to ensure that archives are part of the bigger story, that they can be seen in all sorts of contexts, and they are not relegated to being a bit part, isolated from everything else.
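The notion of the archive as one node among many, reachable from any starting point, can be sketched as a small graph. What follows is a minimal sketch, assuming an undirected graph and a breadth-first walk; every node name and link in it is a placeholder invented for illustration, and none of it is taken from the diagram or from real Archives Hub data.

```python
from collections import deque

# Undirected links between entities. All names and links are placeholders.
LINKS = [
    ("subject: TV comedy", "person: set designer"),
    ("person: set designer", "archive: A"),
    ("person: set designer", "building: department store"),
    ("building: department store", "material: stone"),
    ("building: department store", "archive: B"),
]


def build_graph(links):
    """Turn a list of pairs into an adjacency mapping, linking both ways."""
    graph = {}
    for a, b in links:
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set()).add(a)
    return graph


def reachable_archives(graph, start):
    """Breadth-first walk from any entity; return archives with their hop counts."""
    seen = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for neighbour in graph.get(node, ()):
            if neighbour not in seen:
                seen[neighbour] = seen[node] + 1
                queue.append(neighbour)
    return {n: hops for n, hops in seen.items() if n.startswith("archive:")}


graph = build_graph(LINKS)
# Start from a material rather than from an archive or a person.
print(reachable_archives(graph, "material: stone"))
```

In this toy network neither archive is the entry point, yet both can be reached by following connections, which is the kind of route through content that the project is exploring.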

 

Facing the Music: are researchers and information professionals dancing to different tunes?

Still of presentation at ELAG 2013
What are the chief weapons we need to use to improve the user experience?

At ELAG 2013 I gave a presentation with a colleague from The University of Amsterdam, Lukas Koster. We wanted to do something entertaining, but with a worthwhile message that we both feel strongly about. We believe that more needs to be done to integrate resources and provide them to researchers in a way that suits end-user needs. We gave a presentation where we urged our colleagues to ‘mind the gap’ between the perspective of the information professional – their jargon and their complicated systems, which often fail to link resources adequately – and the researcher, who wants an integrated approach, language that is not a barrier to use and expects the power of the Web to be used within a library context, just as they might when looking for music online.

Still of a presentation where a librarian is explaining the library system to a researcher
A researcher tries to make sense of the library systems

Our presentation included two sketches: one in a music shop, where a punter (the ‘seeker’) expects the shop owner (the ‘pusher’) to know who else bought this music and what they thought of it; and one in a library, where the seeker wants an overview of everything available, and they want to look at research data and other resources without struggling with different catalogue systems and terminology.

In our presentation we referred to the ‘seeker’ wanting a discipline-focussed approach (not format based), and access regardless of location. I highlighted one of the problems with searching by showing examples of search terms used on the Archives Hub where the researchers were confused by the results. The terms researchers use don’t always fit into our approach, using controlled vocabularies.  We talked about the importance of connections between information. Our profession is making headway here, but there is a long way to go before researchers can really pull things together across different systems.

I spoke about the danger of making assumptions about our users and showed some examples of the Archives Hub survey results. Researchers don’t always come to our websites knowing what they are or what they want; they don’t necessarily have the same understanding of ‘archives’ as we do. Lukas expanded more on our musical theme. We can learn from some of the initiatives in this area – such as the ability people have to explore the musical world in so many different ways through things like MusicBrainz. Lukas also showed examples of researcher interfaces, looking to pull things together for the end user. Isn’t the idea of giving the researcher the ability to manage all of their research in this way something libraries should be spearheading?

Image of a woman at a desk surrounded by books
A librarian contemplates the end of the index card…

We concluded that the vision of integrated, interconnected data is not easy. As information professionals we may have to move out of our comfort zones. But we don’t have any choice unless we want to be sidelined. This means that we need to change our mindsets (we talked about a ‘librarian lobe’!), and we need to ask whether it is we who need to learn information literacy, because we need to learn to think more like the end user!

Still of a scene in which the librarian cuts up a book for the researcher
The librarian has a frustrating time with a researcher who only wants one chapter!

See the slides on Slideshare.

The presentation is on YouTube, but be warned: there are scenes of book cutting that may be upsetting to some!