Reworking AIDA: Storage

In the fourth of our series of posts on reworking the AIDA self-assessment toolkit, we look at a technical element – Managed Storage.


In reworking the toolkit, we are now looking at the 11th Technology Element. In the “old” AIDA, this was called “Institutional Repository”, and it pretty much assessed whether the University had an Institutional Repository (IR) system and the degree to which it had been successfully implemented and was being used.

For the 2009 audience, and given the scope of what AIDA was about, an IR was probably just the right thing to assess. In 2009, Institutional Repository software was the new thing and a lot of UK HE & FE institutions were embracing it enthusiastically. Of course, a basic IR doesn’t really do storage by itself: it enables sharing of resources, provides managed access, perhaps some automated metadata creation, and allows remote submission of content. An IR system such as EPrints can be used as an interface to storage – as a matter of fact it has a built-in function called “Storage Manager” – but it isn’t a tool for configuring the servers where content is stored.

Storage in 2016

In 2016, a few things occurred to me while thinking about the storage topic.

  1. I doubt I shall ever understand everything to do with storage of digital content, but since working on the original AIDA my understanding has improved somewhat. I now know that it is at least technically possible to configure IT storage in ways that match the expected usage of the content. Personally, I’m particularly interested in such configuration for long-term preservation purposes.
  2. I’m also aware that it’s possible for a sysadmin – or even a digital archivist – to operate some kind of interface to the storage server, using for instance an application like “Storage Manager”, that enables them to choose suitable destinations for digital content.
  3. Backup is not the same as storage.
  4. Checksums are an essential part of validating the integrity of stored digital objects.

I have thus widened the scope of Element TECH 11 so that we can assess more than the limited workings of an IR. I also went back to two other related elements in the TECH leg, and attempted to enrich them.

To address (1), the capability being assessed is not just whether your organisation has a server room or network storage, but rather whether you have identified your storage needs correctly and have configured the right kind of storage to keep your digital content (and deliver it to users). We might add that this capability has nothing to do with the quantity or size of your digital materials.

To address (2), we’ve identified the requirement for an application or mechanism that helps put things into storage, take them out again, and assist with access while they are in storage. We could add that this interface mechanism is not doing the same job as metadata, the capability for which is assessed elsewhere.
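
As a loose illustration of what such an interface mechanism might do (nothing here is prescribed by the toolkit, and the class and method names are invented for this sketch), a minimal Python wrapper could put files into a storage location, take them out again, and list what is held:

    import shutil
    from pathlib import Path

    class StorageInterface:
        """Illustrative wrapper for a managed storage location: deposit content,
        retrieve it again, and list current holdings, without exposing the
        underlying server configuration."""

        def __init__(self, root: str):
            self.root = Path(root)
            self.root.mkdir(parents=True, exist_ok=True)

        def deposit(self, source: str, destination: str) -> Path:
            """Copy a file into managed storage under a chosen relative path."""
            target = self.root / destination
            target.parent.mkdir(parents=True, exist_ok=True)
            shutil.copy2(source, target)
            return target

        def retrieve(self, stored: str, copy_to: str) -> Path:
            """Copy a stored file back out, for example to create an access copy."""
            out_path = Path(copy_to)
            shutil.copy2(self.root / stored, out_path)
            return out_path

        def holdings(self) -> list:
            """List every file currently held in storage."""
            return [p.relative_to(self.root) for p in self.root.rglob("*") if p.is_file()]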

To address (3), I went back to TECH 03 and changed its name from “Ensuring Availability” to “Ensuring Availability / Backing Up”. The element description was then expanded with more detail about backup actions: we’re trying to describe the optimum backup scenario, based on actual organisational needs, and to provide caveats for cases where multiple copies can cause syncing problems. Work done on the CARDIO toolkit was very useful here.
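
To picture the syncing caveat concretely, here is a small, purely illustrative Python sketch (not part of the toolkit) that compares a primary store with a backup copy and reports anything that has drifted apart; it uses the standard filecmp module and only looks at the top level of each directory:

    import filecmp

    def report_copy_drift(primary_dir: str, backup_dir: str) -> None:
        """Compare a primary store with a backup copy and report anything missing
        from one side or apparently different between the two."""
        comparison = filecmp.dircmp(primary_dir, backup_dir)
        for name in comparison.left_only:
            print(f"Only in primary: {name}")
        for name in comparison.right_only:
            print(f"Only in backup:  {name}")
        # diff_files lists names present in both locations whose shallow comparison
        # (file size and modification time) differs; a deeper check would use
        # checksums, as suggested under TECH 04 below.
        for name in comparison.diff_files:
            print(f"Out of sync:     {name}")
        # Subdirectories can be walked via comparison.subdirs for a full-tree check.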

To incorporate (4), I thought it best to include checksums in element TECH 04, “Integrity of Information”. Checksum creation and validation is now explicitly suggested as one possible method to ensure integrity of digital content.
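
As a generic illustration of what that might involve (TECH 04 does not mandate any particular tool or algorithm), checksum creation and validation can be as simple as the following, using Python’s standard hashlib module:

    import hashlib

    def create_checksum(path: str, algorithm: str = "sha256") -> str:
        """Create a fixity value for a file by hashing its contents in chunks."""
        digest = hashlib.new(algorithm)
        with open(path, "rb") as f:
            for chunk in iter(lambda: f.read(65536), b""):
                digest.update(chunk)
        return digest.hexdigest()

    def validate_checksum(path: str, recorded_value: str, algorithm: str = "sha256") -> bool:
        """Re-hash the file and confirm it still matches the value recorded earlier."""
        return create_checksum(path, algorithm) == recorded_value

Recording the value at the point of ingest and re-running the validation on a schedule is what turns a one-off calculation into an ongoing integrity check.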

Managed storage as a whole is thus distributed among several measurable TECH elements in the new toolkit.

In this way I’m hoping to arrive at a measurable capability for managed storage that does not pre-empt the use the organisation wishes to make of such storage. The wording is such that even a digital preservation strategy could be assessed in the new toolkit – as could many other uses. If I can get this right, it would be an improvement on simply assessing the presence of an Institutional Repository.

Reworking the AIDA toolkit: why we added new sections to cover Depositors and Users

Why are we reworking the AIDA toolkit?

The previous AIDA toolkit covered digital content in an HE & FE environment. As such, it made a few basic assumptions about usage; one assessment element was not really about the users at all, but about the Institutional capability for measuring use of resources. To put it another way, an Institution might be maintaining, at some cost, a collection of material that nobody looks at. What mechanism do you have to monitor and measure use of assets?

That is useful, but also limited. For the new toolkit, I wanted to open up the whole question of usage, and base the assessment on a much wider interpretation of the “designated user community”. This catch-all term came our way via the OAIS reference model, and it seems to have caught on in the community. As I would have it, it should mean:

  • Anyone who views, reads and uses digital material.
  • They do it for many purposes and in many situations – I would like user scenarios to include internal staff looking at born-digital records in an EDRMS, readers downloading ebooks, photographers browsing a digital image gallery, or researchers running an app on a dataset.

Understanding these needs, and meeting them with appropriate mechanisms, ought to be what any self-respecting digital content service is about.

Measuring organisational commitment to users

I thought about how I could turn that organisational commitment into a measurable, assessable thing, and came up with four areas of benchmarking:

  • Creating access copies of digital content, and providing a suitable technological platform to play them on
  • Monitoring and measuring user engagement with digital content, including feedback
  • Evaluation of the user base to identify their needs
  • Some mechanism whereby the organisation relates the user experience to the actual digital content; user evaluation will be an indicator here.

This includes the original AIDA element, but adds more to it. I’d like to think a lot of services can recognise their user community provision in the above.

After that, I thought about the other side of the coin – the people who create and deposit the material with our service in the first place. Why not add a new element to benchmark this?

Measuring organisational commitment to depositors

The OAIS reference model’s collective term for these people is “Producers”, a piece of jargon I have never much cared for. We decided to stick with “Depositors” for this new element; I’m more interested in the fact that they are transferring content to us, whether or not they actually “produced” it. As I would have it, a Depositor means:

  • Anyone who is a content creator, submitter, or donor, putting digital material into your care.
  • Again, they do it in many situations: external depositors may donate collections to an archive; internal users may transfer their department’s born-digital records to an organisational record-keeping system; researchers may deposit publications, or datasets, in a repository.

When trying to benchmark this, it occurred to me that there’s a two-way obligation going on in this transfer situation; we have to do stuff, and so do the depositors. We don’t have to be specific about these obligations in the toolkit; we just assess whether they are understood and supported.

In reworking the toolkit, I came up with the following assessable things:

  • Whether obligations are understood, both by depositors and the staff administering deposits
  • Whether there are mechanisms in place for allowing transfer and deposit
  • Whether these mechanisms are governed by formal procedures
  • Whether these mechanisms are supported by documents and forms, and a good record-keeping method

For both Users and Depositors, there will of course be legal dimensions that underpin access, and which may even impact on transfer methods. However, these legal aspects are catered for in two other benchmarking elements, which will be the subject of another blog post.

Conclusion

With these two new elements, I have fed in information and experience gained from teaching the DPTP, and from my consultancy work; I hope to make the new AIDA into something applicable to a wider range of digital content scenarios and services.

Updating the AIDA toolkit

This week, I have been mostly reworking and reviewing the ULCC AIDA toolkit. We’re planning to relaunch it later this year, with a new name, new scope, and new scorecard.

AIDA toolkit – a short history

The AIDA acronym stands for “Assessing Institutional Digital Assets”. Kevin Ashley and I completed this JISC-funded project in 2009, and the idea was that it could be used by any University – i.e. an Institution – to assess its own capability for managing digital assets.

At the time, AIDA was certainly intended for an HE/FE audience; that’s reflected in the “Institutional” part of the name and in the type of digital content in scope – content likely to be familiar to anyone working in HE: digital libraries, research publications, digital datasets. As a matter of fact, AIDA was pressed into action as a toolkit directly relevant to the needs of Managing Research Data, as shown by its reworking in 2011 into the CARDIO Toolkit.

I gather that CARDIO, under the auspices of Joy Davidson, HATII and the DCC, has since been quite successful; its take-up among UK Institutions for measuring or benchmarking their own preparedness for Research Data Management perhaps indicates we were doing something right.

A new AIDA toolkit for 2016

My plan is to open up the AIDA toolkit so that it can be used by more people, apply to more content, and operate on a wider basis. In particular, I want it to apply to:

  • Not just Universities, but any Organisation that has digital content
  • Not just research / library content, but almost anything digital (the term “Digital Assets” always seemed vague to me, whereas “Digital Asset Management” is in fact something very specific and may refer to particular platforms and software)
  • Not just repository managers, but also archivists, records managers, and librarians working with digital content.

I’m also going to be adding a simpler scorecard element; we had one for AIDA before, but it got a little too “clever” with its elaborate weighted scores.
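
Purely as a hypothetical illustration of the direction of travel (the new scorecard format is not fixed, and these element names and scores are invented), a simpler scorecard might do no more than average the element scores, with each element counting equally:

    def simple_scorecard(scores: dict) -> float:
        """Average the element scores with no weighting: every element counts equally."""
        return sum(scores.values()) / len(scores)

    # Hypothetical scores on a five-point scale for a handful of elements
    example_scores = {"TECH 03": 3, "TECH 04": 4, "TECH 11": 2}
    print(f"Overall score: {simple_scorecard(example_scores):.1f}")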

Readers may legitimately wonder whether the community really “needs” another self-assessment tool: we teach several of the known models on our Digital Preservation Training Programme, including the use of the TRAC framework for self-assessment purposes, and since doing AIDA the excellent DPCMM has become available, which has indeed influenced my thinking. The new AIDA toolkit will continue to be a free download, though, and we’re aiming to retain its overall simplicity, which we believe is one of its strengths.

A new acronym

As part of this plan, I’m keen to bring out and highlight the “Capability” and “Management” parts of the AIDA toolkit, factors which have been slightly obscured by its current name and acronym. With this in mind, I need a new name and a new acronym. The elements that must be included in the title are:

  • Assessing or Benchmarking
  • Organisational
  • Capacity or Readiness [for]
  • Management [of]
  • Digital Content

I’ve already tried feeding these combinations through various online acronym generators, and come up empty. Hence we would like to invite the wider digital preservation community to use the power of crowd-sourcing and collect suggestions and ideas. Simply comment below or tweet us at @dart_ulcc using the #AIDAthatsnotmyname hashtag. Naturally, the winner(s) of this little crowd-sourcing contest will receive written credit in the final relaunched AIDA toolkit.