Municipal Archive of Girona, Thursday, April 27, 2023
Yesterday I checked out some of the ‘Artificial Intelligence and Archives Seminar‘ hosted by the Municipal Archive of Girona “within the framework of the Faber-Llull Residency (Olot, Catalonia) and the project InterPARES Trust AI of the University of British Columbia (Vancouver, Canada), and with the collaboration of the Society of Catalan Archivists“. There were some useful things discussed in this still quite new area of AI, so thought I’d share my notes.
InterPARES Trust AI Project (https://interparestrustai.org/), Muhammad Abdul-Mageed
The Trust AI goals are to:
- Identify specific AI technologies that can address critical records and archives challenges
- Determine the benefits and risks of using AI technologies on records and archives
- Ensure that archival concepts and principles inform the development of responsible AI
- Validate outcomes from Objective 3 through case studies and demonstrations
Muhammad focussed on trustworthiness as an issue for Archives. They are looking at using AI to assess and verify the authenticity of Archives through time. The essential research question: Can we develop artificial intelligence for carrying out competently and efficiently all records and archives functions while respecting the nature and ensuring the continuing trustworthiness of the record.
He noted that a fundamental difference between analog and digital records is the fact that analogue materials can be proven and verified on face value and rarely need extrinsic evidence. However for digital materials, extrinsic elements such as metadata are needed. They rely on ‘circumstantial’ evidence such as the integrity of the hosting system as well as the politics, procedures and technology surrounding the digital record.
Muhammad suggests that off-the-shelf tools are not well suited to archives, so within the Archives profession we will have to develop the systems ourselves. We are the only ones who know what to do because we are the professionals. Developers need to talk to archives professionals to find out what they want and design appropriate AI tools for them. The tools need to respect the trustworthiness of the records. The project is looking to influence the development of responsible tools.
The project looks to provide a wealth of tools and code. A very important aspect of the project is training the community. Muhammad suggested that the Archives profession will have to do a great deal of training to engage with AI tools and its possibilities.
Linking AI to Archives and Records, Peter Sullivan
The aim of the talk was to look at combining archival concepts and principles with AI. Peter used the lens of Diplomatic to consider AI solutions and how AI may interact with different components of the record including the context, act, persons, procedure, form and archival bond. Which parts of the archival record are impacted by AI and how does this inform the design of AI tools that respect diplomatic theory?
The most important component is the ‘archival bond’ which covers how aspects of records are related to each other. AI may be poor at looking at records in context of other records, and may not be able to respect the archival bond. Also, AI may not respect the context of the creation of the records and may not be aware of different levels of appraisal used.
AI may be helpful where there are different variations of names and fuzzy matching can be used to reconcile names. This aligns with the Archives Hub Names project. Dealing with records in aggregate may be somewhere AI is able to help, using topic modelling and clustering techniques. This is a use case we have identified ourselves and something we are looking at with the Archives Hub Labs Project. Finally he mentioned the interesting question of how we will archive the artefacts of AI developments themselves.
Model for an AI-Assisted Digitisation Project, Peter Sullivan
Peter talked about how AI is being used to help with the archiving of audio recordings, providing AI generated metadata enrichment. He noted this is very time-consuming to do by hand. Different types of recordings create very different challenges to AI to analyse . For UNESCO audio they are using four models, one for language translation and three for text extraction and text summarisation.
AI and Archives: Basic Requirements, Pilar Campos and Eloi Puertas
The project is aiming to provide a resource for archive professionals to assess AI solutions to help guide decision-making and create recommendations. They will provide a check list to assess AI tool performance. The rationale behind this is that there is a huge amount of interest and concern regarding AI, but a scarcity of implementation examples, along with a lack of knowledge of AI solutions for the professionals in the archives domain. There is also a degree of mistrust of the results of AI.
The expected results of the project are to provide AI knowledge in the archive domain and a list of potential risks for archivists. A SWOT analysis about AI from the Archives viewpoint will be provided, along with an assessment of the balance between our expectations of risk.
Automated Transcription: Palaeography and AI, Thiara Alves and Leonardo Fontes
The talk was essentially about using AI for automated transcription. The speakers talked about using Transkribus for transcription of text from images of documents. They found that most algorithms weren’t good at detecting old versions of Portuguese and Spanish words. The speakers felt that the context provided by the archivist was necessary for the transcribers transcriptions to be useful.
First Steps and Main Expectations from CRDI’s Experience of AI, David Inglésias
David talked about a project looking at being able to search images that haven’t been catalogued, so they don’t have metadata unless it is created by using AI. This ability is very useful for a photographic archive. They work with the Europeana Kaleidoscope project to attempt to provide archival context for images.
AI also allows for innovative new approaches to presenting photographs in addition to the standard historical ways of doing so. AI can be used for clustering photos that appear to be similar or related in someway. This could be something that the Archives Hub could look at also.
The full seminar is available on Youtube.