Archival Context: entities and multiple identities


I recently took part in a Webinar (Web seminar) on the new EAC-CPF standard. This is a standard for the encoding of information about record creators: corporate bodies, persons and families. This information can add a great deal to the context of archives, supporting a more complete understanding of the records and their provenance.

We were given a brief overview of the standard by Kathy Wisser, one of the Working Group members, and then the session was open to questions and discussion.

The standard is very new, and archivists are still working out how it fits into the landscape and how it relates to various other standards. It was interesting to note how many questions essentially involved the implementation of EAC-CPF: Who creates the records? Where are they kept? How are they searched? Who decides what?
These questions are clearly very important, but the standard is just a standard for the encoding of ISAAR(CPF) information. It will not help us to figure out how to work together to create and use EAC-CPF records effectively.
In general, archivists use EAD to include a biographical history of the record creator, and may not necessarily create or link to a whole authority record for them. The idea behind EAC-CPF is that providing separate descriptions for different entities is more logical and efficient. The principle of separation of entities is well put: “Because relations occur between the descriptive nodes [i.e. between archive collections, creators, functions, activities], they are most efficiently created and maintained outside of each node.” So if you have a collection description and a creator description, the relationship between the two is essentially maintained separately from the actual descriptions. If only EAD itself were a little more data-centric (database friendly, you might say), this would facilitate a relational approach.
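To make the principle of separation a little more concrete, here is a minimal Python sketch of relations held outside the descriptions they connect. It is only an illustration: the class names, the creator identifier and the relation type "createdBy" are all made up for the example and are not taken from EAD or EAC-CPF.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Description:
    """A descriptive node: a collection, a creator, a function, and so on."""
    identifier: str
    kind: str
    title: str


@dataclass(frozen=True)
class Relation:
    """A link between two nodes, stored outside both of them."""
    source_id: str
    relation_type: str
    target_id: str


# Hypothetical nodes; neither one mentions the other.
descriptions = {
    "collection-1": Description("collection-1", "collection", "Papers of an example creator"),
    "creator-1": Description("creator-1", "creator", "Example creator"),
}

# The relationship lives in its own list, separate from the nodes.
relations = [Relation("collection-1", "createdBy", "creator-1")]


def relations_for(identifier, all_relations):
    """Return every relation in which the given node takes part."""
    return [r for r in all_relations if identifier in (r.source_id, r.target_id)]
```

Because the link is not embedded in either description, it can be added, changed or removed without editing the collection or the creator record.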
I am interested in how we will effectively link descriptions of the same person, because I cannot see us managing to create one single authoritative record for each creator. The standard allows for this through ‘identities’: a record creator can have two or more identities, each represented by a distinct EAC-CPF instance. I think the variety of identity relationships that the standard provides for is important, although it inevitably adds a level of complexity. It is something we have implemented in our use of the tag to link to related descriptions. Whilst this kind of semantic markup is a good thing, there is a danger that the complexity will put people off.
I’m quite hung-up on the whole issue of identifiers at the moment. This may be because I’ve been looking at Linked Data and the importance of persistent URLs to identify entities (e.g. I have a URL, you have a URL, places have a URL, things have a URL, and that way we can define all these things and then provide links between them). The Archives Hub is going to be providing persistent URLs for all our descriptions, using the unique identifier made up of the country code, repository code and local reference for the collection (e.g. http://www.archiveshub.ac.uk/search/record.html?id=gb100mss, where gb is the country code, 100 is the repository code and MSS is the local reference).
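As a rough sketch of how such a URL could be assembled from its three parts, the Python below reproduces the example above. The function names are invented for illustration, and the rule of simply joining the parts and lower-casing them is an assumption based on that single example, not a description of how the Hub actually generates its identifiers.

```python
from urllib.parse import urlencode

# Base address taken from the example URL in the text.
BASE_URL = "http://www.archiveshub.ac.uk/search/record.html"


def hub_identifier(country_code, repository_code, local_reference):
    """Join the three parts and lower-case them (an assumed rule)."""
    return f"{country_code}{repository_code}{local_reference}".lower()


def persistent_url(country_code, repository_code, local_reference):
    """Build the record URL with the identifier as the 'id' query parameter."""
    identifier = hub_identifier(country_code, repository_code, local_reference)
    return f"{BASE_URL}?{urlencode({'id': identifier})}"


print(persistent_url("GB", "100", "MSS"))
# prints http://www.archiveshub.ac.uk/search/record.html?id=gb100mss
```

Real local references may contain spaces, slashes or other punctuation, which this sketch does not attempt to normalise beyond standard URL encoding.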
I feel that it will be important for ISAAR(CPF) records to have persistent URLs, and these will come from the recordID and the agencyCode. Part of me thinks the agency responsible for the EAC-CPF instance should not be part of the identifier, because the record should exist apart from the institution that created it, but then realistically, we’re not going to get consensus on some kind of independent stand-alone ISAAR(CPF) record. One of the questions I’m currently asking myself is: if two different bodies have EAC-CPF records, does it matter what the identifiers/URLs are for those records, even if they are for the same person? Is the important thing to relate them as representing the same thing? I’m sure it’s very important to have a persistent URL for all EAC-CPF instances, because that is how they will be discoverable; that is their online identity. But the question of providing one unique identifier for one person, or one corporate body, is not something I have quite made my mind up about.
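One way to picture the "relate them" option is sketched below in Python: each agency keeps its own URL for its own record, and a separate list of "same entity" links groups the records that describe the same person. Everything specific here is hypothetical: the example.org base address, the agency codes, the record IDs and the URL pattern are placeholders, not anything defined by EAC-CPF.

```python
from dataclasses import dataclass

BASE_URL = "http://example.org/eac/"  # placeholder address, not a real service


@dataclass(frozen=True)
class EacRecord:
    agency_code: str  # the body responsible for the record
    record_id: str    # the record's identifier within that body
    name: str

    @property
    def url(self):
        """Persistent URL built from agency code plus record ID (assumed pattern)."""
        return f"{BASE_URL}{self.agency_code}/{self.record_id}"


def same_entity_groups(records, same_as_pairs):
    """Group record URLs that have been linked as describing the same entity."""
    groups = {r.url: {r.url} for r in records}
    for a, b in same_as_pairs:
        merged = groups[a] | groups[b]
        for url in merged:
            groups[url] = merged
    return {frozenset(g) for g in groups.values()}


# Two hypothetical agencies describe the same person; a third record is unrelated.
rec_a = EacRecord("agency-a", "p001", "Example Person")
rec_b = EacRecord("agency-b", "x42", "Person, Example")
rec_c = EacRecord("agency-a", "p002", "Another Person")

links = [(rec_a.url, rec_b.url)]
groups = same_entity_groups([rec_a, rec_b, rec_c], links)
```

In this arrangement no single identifier stands for the person; the person is, in effect, the group of linked records, which leaves open the question of whether one unique identifier is needed at all.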
It will be interesting to see how archivists assess the standard, and to see more examples of implementation. The Archives Hub would be very interested to hear from anyone using it.