The AIDA toolkit: use cases

There have been a few isolated uses of the old AIDA Toolkit over the years. In this blog post I will try to recount some of them.

In the beginning…

In its first phase in 2009, I was aided greatly by five UK HE institutions which volunteered to act as guinea pigs and run test assessments, though this was mainly to help me improve the structure and the wording. However, Sarah Jones of HATII was very positive about its potential in 2010.

“AIDA is a very useful tool for seeing where your strengths and weaknesses lie. The results could provide a benchmark too, so if you go on to make some changes you can measure their effects…AIDA sounds particularly useful for your context too as this is about institutional readiness and assessing where strengths and weaknesses lie to determine areas for investment.”

I also used AIDA as part of a consultancy for a digital preservation strategy, working with the digital archivist at Diageo in 2012; they said:

“We agree that the AIDA assessment would be worthwhile doing as it will give us a good idea of where we are in terms of readiness and the areas we need to focus on to enable the implementation of a digital preservation strategy and system.”

Sarah Makinson of SOAS also undertook an AIDA assessment.

Further down the line…

Between 2011 and 2015, the toolkit was published and made available for download on a Jisc-hosted project website. During that time various uses were made of AIDA by an international audience:

Natalya Kusel used it for benchmarking collection care; she had

“been looking for some free self-assessment tools that I can use for benchmarking the current ‘health’ of collections care. I’m looking for something that will help me identify how the firm currently manages digital assets that have a long retention period so I can identify risks and plan for improvement.”

Anthony Smith used it as a teaching aid in a teaching programme sponsored by UNESCO’s Intergovernmental Oceanographic Data Exchange.

Kelcy Shepherd of Amherst College used it in her workshops.

“Coincidentally, the Five Colleges, a consortium I’m involved in, used the Toolkit a few years ago. Each institution completed the survey to ascertain levels of readiness at the various institutions, and determine areas where it would make sense to collaborate. This helped us identify some concrete steps that we could take together as a consortium.”

Walter D Ray, the Political Papers archivist at Southern Illinois University, used it to assess his library’s readiness:

“I’m glad to see work is being done on the AIDA toolkit. We used it for our self-assessment and found it helpful. As my boss, Director of Special Collections Pam Hackbart-Dean says, “the digital readiness assessment was a useful tool in helping give us direction.” I would add that it helped us define the issues we needed to confront.

“Since then we have developed some policies and procedures, revised our Deed of Gift form, set up a digital forensics workstation, and put a process in place to handle digital projects coming from elsewhere on campus. We greatly appreciate the work you’ve done on the AIDA toolkit.”

However, on the less positive side, Nathan Moles and Christoph Becker of the University of Toronto studied AIDA as part of their “in-depth review of the state of the art of assessment frameworks in Digital Preservation.” Their survey of the landscape indicates the following:

“Our work showed that (too) many models have already been designed. Most models have been designed with a focus on practice (which is good), but in very informal ways without rigorous design methods (which is not so good). Aside from a model, there’s also need for a tool, a method, guidance, and empirical evidence from real-world applications to be developed and shared.”

AIDA in particular was found wanting:

“I think AIDA provides an interesting basis to start, but also currently has some shortcomings that we would need to see addressed to ensure that the resulting insights are well-founded. Most importantly, the fundamental concepts and constructs used in the model are currently unclear and would benefit from being set on a clear conceptual foundation.”

These stories show that AIDA had more of a shelf-life, and more application, than I originally expected. Our hope is that the new AOR Toolkit will give the ideas a new lease of life and continue to be of practical help to those performing assessments.

Reworking AIDA: Storage

In the fourth of our series of posts on reworking the AIDA self-assessment toolkit, we look at a technical element – Managed Storage.

Reworking AIDA Storage

In reworking the toolkit, we are now looking at the 11th Technology Element. In the “old” AIDA, this was called “Institutional Repository”, and it pretty much assessed whether the University had an Institutional Repository (IR) system and the degree to which it had been successfully implemented and was being used.

For the 2009 audience, and given the scope of what AIDA was about, an IR was probably just the right thing to assess. In 2009, Institutional Repository software was the new thing and a lot of UK HE & FE institutions were embracing it enthusiastically. Of course your basic IR doesn’t really do storage by itself; certainly it enables sharing of resources, manages access, perhaps performs some automated metadata creation, and allows remote submission of content. An IR system such as EPrints can be used as an interface to storage – as a matter of fact it has a built-in function called “Storage Manager” – but it isn’t a tool for configuring the servers where content is stored.

Storage in 2016

In 2016, a few things occurred to me while thinking about the storage topic.

  1. I doubt I shall ever understand everything to do with storage of digital content, but since working on the original AIDA my understanding has improved somewhat. I now know that it is at least technically possible to configure IT storage in ways that match the expected usage of the content. Personally, I’m particularly interested in such configuration for long-term preservation purposes.
  2. I’m also aware that it’s possible for a sysadmin – or even a digital archivist – to operate some kind of interface with the storage server, using for instance an application like “storage manager”, that might enable them to choose suitable destinations for digital content.
  3. Backup is not the same as storage.
  4. Checksums are an essential part of validating the integrity of stored digital objects.

I have thus widened the scope of Element TECH 11 so that we can assess more than the limited workings of an IR. I also went back to two other related elements in the TECH leg, and attempted to enrich them.

To address (1), the capability being assessed is not just whether your organisation has a server room or network storage, but rather whether you have identified your storage needs correctly and configured the right kind of storage to keep your digital content (and deliver it to users). We might add that this capability has nothing to do with the quantity, number, or size of your digital materials.

To assess (2), we’ve identified the requirement for an application or mechanism that helps put things into storage, take them out again, and assist with access while they are in storage. We could add that this interface mechanism is not doing the same job as metadata, capability for which is assessed elsewhere.
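
As an illustration only – nothing the toolkit prescribes – here is a minimal Python sketch of the sort of interface mechanism I have in mind: something that puts content into managed storage, takes it out again, and assists with access while it is there. All the names here are hypothetical.

```python
from pathlib import Path
import shutil

class ManagedStorage:
    """Hypothetical interface between a curator and a storage back end."""

    def __init__(self, root: Path):
        self.root = root  # the storage location we have configured

    def put(self, source: Path, object_id: str) -> Path:
        """Put a piece of content into a chosen destination in managed storage."""
        destination = self.root / object_id
        destination.mkdir(parents=True, exist_ok=True)
        return Path(shutil.copy2(source, destination))

    def get(self, object_id: str, filename: str, target_dir: Path) -> Path:
        """Take content out of storage again."""
        return Path(shutil.copy2(self.root / object_id / filename, target_dir))

    def contents(self, object_id: str) -> list[str]:
        """Assist with access while content remains in storage."""
        return sorted(p.name for p in (self.root / object_id).iterdir())
```

Whether the real mechanism is a sysadmin’s script, an EPrints “Storage Manager” plugin, or a full repository platform, the capability being assessed is the same: can you choose suitable destinations for content, and get it back out again?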

To address (3), I went back to TECH 03 and changed its name from “Ensuring Availability” to “Ensuring Availability / Backing Up”. The element description was then enriched with more detail concerning backup actions: we try to describe the optimum backup scenario, based on actual organisational needs, and to provide caveats for when multiple copies can cause syncing problems. Work done on the CARDIO toolkit was very useful here.

To incorporate (4), I thought it best to include checksums in element TECH 04, “Integrity of Information”. Checksum creation and validation is now explicitly suggested as one possible method to ensure integrity of digital content.
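
To make the checksum suggestion concrete, here is a minimal Python sketch of creating and later re-validating a fixity value for a stored object; the file path and the choice of SHA-256 are my illustrative assumptions, not requirements of the element.

```python
import hashlib
from pathlib import Path

def checksum(path: Path, algorithm: str = "sha256") -> str:
    """Compute a fixity value for a file, reading in chunks to handle large objects."""
    h = hashlib.new(algorithm)
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            h.update(chunk)
    return h.hexdigest()

# On ingest: record the checksum alongside the object (hypothetical path).
stored = checksum(Path("deposit/report.pdf"))

# On a later integrity audit: recompute and compare.
if checksum(Path("deposit/report.pdf")) != stored:
    print("Integrity failure: object has changed since ingest")
```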

Managed storage as a whole is thus distributed among several measurable TECH elements in the new toolkit.

In this way I’m hoping to arrive at a measurable capability for managed storage that does not pre-empt the use the organisation wishes to make of such storage. The wording is such that even a digital preservation strategy could be assessed in the new toolkit – as could many other uses. If I can get this right, it would be an improvement on simply assessing the presence of an Institutional Repository.

Reworking AIDA: Legal Compliance

Today we’re looking briefly at legal obligations concerning management of your digital content. The original AIDA had only one section on this, and it covered Copyright and IPR. These issues were important in 2009 and are still important today, especially in the context of research data management, when academics need to be assured that attribution, intellectual property, and copyright are all being protected.

Legal Compliance – widening the scope

For the new toolkit, in keeping with my plan for a wider scope, I wanted to address additional legal concerns. The best solution seemed to be to add a new component to assess them.

What we’re assessing under Legal Compliance:

  1. Awareness of responsibility for legal compliance.
  2. The operation of mechanisms for controlling access to digital content, such as licences, redaction, closure, and release (which may be timed).
  3. Processes of review of digital content holdings, for identifying legal and compliance issues.

Legal Compliance – Awareness

The first one is probably the most important of the three. If nobody in the organisation is even aware of their own responsibilities, this can’t be good. My view would be that any effective information manager – archivist, librarian, records manager – is probably handling digital content with potential legal concerns regarding its access, and has a duty of care. But a good organisation will share these responsibilities and embed awareness into every role.

Legal Compliance – Mechanisms & Procedures

Secondly, we’d assess whether the organisation has any means (policies, procedures, forms) for controlling access and closure; and thirdly, whether there’s a review process that can seek out any legal concerns in certain digital collections.

Legislative regimes vary across the world, of course, and this makes it challenging to devise a model that is internationally applicable. The new version of the model name-checks specific acts in UK legislation, such as the Data Protection Act and the Freedom of Information Act. On the other hand, other countries have their own versions of similar legislation; and copyright laws are widespread, even where they differ in detail and interpretation.

The value of the toolkit, if indeed it proves to have any, is not that we’re measuring an organisation’s specific point-by-point compliance with a certain Statute; rather, we’re assessing the high-level awareness of legal compliance, and what the organisation does to meet it.

Interestingly, the high-level application of legal protection across an organisation is something which can appear somewhat undeveloped in other assessment tools.

The ISO 16363 code of practice refers to copyright implications, intellectual property and other legal restrictions on use only in the context of compiling good Content Information and Preservation Description Information.

The expectation is that “An Archive will honor all applicable legal restrictions. These issues occur when the OAIS acts as a custodian. An OAIS should understand the intellectual property rights concepts, such as copyrights and any other applicable laws prior to accepting copyrighted materials into the OAIS. It can establish guidelines for ingestion of information and rules for dissemination and duplication of the information when necessary. It is beyond the scope of this document to provide details of national and international copyright laws.”

Personally I’ve always been disappointed by the lack of engagement implied here. To be fair though, the Code does cite many strong examples of “Access Rights” metadata, when it describes instances of what exemplary “Preservation Description Information” should look like for Digital Library Collections.

The DPCMM likewise doesn’t see fit to assess legal compliance as a separate entity, and it is not singled out as one of its 15 elements. However, the concept of “ensuring long‐term access to digital content that has legal, regulatory, business, and cultural memory value” is embedded in the model.

Reworking the AIDA toolkit: why we added new sections to cover Depositors and Users

Why are we reworking the AIDA toolkit?

The previous AIDA toolkit covered digital content in an HE & FE environment. As such, it made a few basic assumptions about usage; one assessment element was not really about the users at all, but about the institutional capability for measuring use of resources. To put it another way, an institution might be maintaining (at some cost) a collection of material that nobody looks at. What mechanism do you have to monitor and measure the use of your assets?

That is useful, but also limited. For the new toolkit, I wanted to open up the whole question of usage, and base the assessment on a much wider interpretation of the “designated user community”. This catch-all term came our way via the OAIS reference model, and it has caught on in the community. As I would have it, it should mean:

  • Anyone who views, reads and uses digital material.
  • They do it for many purposes and in many situations – I would like user scenarios to include internal staff looking at born-digital records in an EDRMS, or readers downloading ebooks, or photographers browsing a digital image gallery, or researchers running an app on a dataset.

Understanding these needs, and meeting them with appropriate mechanisms, ought to be what any self-respecting digital content service is about.

Measuring organisational commitment to users

I thought about how I could turn that organisational commitment into a measurable, assessable thing, and came up with four areas of benchmarking:

  • Creating access copies of digital content, and providing a suitable technological platform to play them on
  • Monitoring and measuring user engagement with digital content, including feedback
  • Evaluation of the user base to identify their needs
  • Some mechanism whereby the organisation relates the user experience to the actual digital content; user evaluation will be an indicator here.

This includes the original AIDA element, but adds more to it. I’d like to think a lot of services can recognise their user community provision in the above.

After that, I thought about the other side of the coin – the people who create and deposit the material with our service in the first place. Why not add a new element to benchmark this?

Measuring organisational commitment to depositors

The OAIS reference model does have a collective term for these people – it calls them “Producers”, a piece of jargon I have never much cared for. We decided to stick with “Depositors” for this new element; I’m more interested in the fact that they are transferring content to us, whether or not they actually “produced” it. As I would have it, a Depositor means:

  • Anyone who is a content creator, submitter, or donor, putting digital material into your care.
  • Again, they do it in many situations: external depositors may donate collections to an archive; internal users may transfer their department’s born-digital records to an organisational record-keeping system; researchers may deposit publications, or datasets, in a repository.

When trying to benchmark this, it occurred to me there’s a two-way obligation going on in this transfer situation; we have to do stuff, and so do the depositors. We don’t have to be specific about these obligations in the toolkit; just assess whether they are understood, and supported.

In reworking the toolkit, I came up with the following assessable things:

  • Whether obligations are understood, both by depositors and the staff administering deposits
  • Whether there are mechanisms in place for allowing transfer and deposit
  • Whether these mechanisms are governed by formal procedures
  • Whether these mechanisms are supported by documents and forms, and a good record-keeping method

For both Users and Depositors, there will of course be legal dimensions that underpin access, and which may even impact on transfer methods. However, these legal aspects are catered for in two other benchmarking elements, which will be the subject of another blog post.

Conclusion

With these two new elements, I have fed in information and experience gained from teaching the DPTP, and from my consultancy work; I hope to make the new AIDA into something applicable to a wider range of digital content scenarios and services.

Updating the AIDA toolkit

This week, I have been mostly reworking and reviewing the ULCC AIDA toolkit. We’re planning to relaunch it later this year, with a new name, new scope, and new scorecard.

AIDA toolkit – a short history

The AIDA acronym stands for “Assessing Institutional Digital Assets”. Kevin Ashley and I completed this JISC-funded project in 2009, and the idea was that it could be used by any University – i.e. an Institution – to assess its own capability for managing digital assets.

At the time, AIDA was certainly intended for an HE/FE audience; that’s reflected in the “Institutional” part of the name, and in the type of digital content in scope: content likely to be familiar to anyone working in HE – digital libraries, research publications, digital datasets. As a matter of fact, AIDA was pressed into action as a toolkit directly relevant to the needs of Managing Research Data, as is shown by its reworking in 2011 into the CARDIO Toolkit.

I gather CARDIO, under the auspices of Joy Davidson, HATII and the DCC, has since been quite successful, and its take-up among UK Institutions as a means of measuring or benchmarking their own preparedness for Research Data Management perhaps indicates we were doing something right.

A new AIDA toolkit for 2016

My plan is to open up the AIDA toolkit so that it can be used by more people, apply to more content, and operate on a wider basis. In particular, I want it to apply to:

  • Not just Universities, but any Organisation that has digital content
  • Not just research / library content, but almost anything digital (the term “Digital Assets” always seemed vague to me, whereas “Digital Asset Management” is in fact something very specific and may refer to particular platforms and software)
  • Not just repository managers, but also archivists, records managers, and librarians working with digital content.

I’m also going to be adding a simpler scorecard element; we had one for AIDA before, but it got a little too “clever” with its elaborate weighted scores.
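
To illustrate the difference – with invented element names, scores and weights, not the actual AIDA scoring – a weighted scorecard versus a plain one comes down to something like this sketch:

```python
# Hypothetical scores (1-5) for three assessment elements.
scores = {"TECH 03": 4, "TECH 04": 2, "TECH 11": 3}

# The old, "clever" approach: each element carries its own weighting.
weights = {"TECH 03": 0.5, "TECH 04": 0.3, "TECH 11": 0.2}
weighted = sum(scores[e] * weights[e] for e in scores)

# The simpler scorecard: a plain average, easier to explain and to compare.
simple = sum(scores.values()) / len(scores)

print(f"weighted: {weighted:.2f}, simple: {simple:.2f}")
```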

Readers may legitimately wonder whether the community really “needs” another self-assessment tool; we teach several of the known models on our Digital Preservation Training Programme, including the use of the TRAC framework for self-assessment purposes, and since doing AIDA the excellent DPCMM has become available – indeed, the latter has influenced my thinking. The new AIDA toolkit will continue to be a free download, though, and we’re aiming to retain its overall simplicity, which we believe is one of its strengths.

A new acronym

As part of this plan, I’m keen to bring out and highlight the “Capability” and “Management” parts of the AIDA toolkit, factors which have been slightly obscured by its current name and acronym. With this in mind, I need a new name and a new acronym. The elements that must be included in the title are:

  • Assessing or Benchmarking
  • Organisational
  • Capacity or Readiness [for]
  • Management [of]
  • Digital Content

I’ve already tried feeding these combinations through various online acronym generators, and come up empty. Hence we would like to invite the wider digital preservation community and use the power of crowd-sourcing to collect suggestions and ideas. Simply comment below, or tweet us at @dart_ulcc using the #AIDAthatsnotmyname hashtag. Naturally, the winner(s) of this little crowd-sourcing contest will receive written credit in the final relaunched AIDA toolkit.
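
For the curious, the brute-force part of what those acronym generators do can be sketched in a few lines; the slot lists below are simply the title elements above, and the whole thing is a toy, not a serious naming method:

```python
from itertools import product

# One list of candidate words per slot in the title, copied from the bullets above.
slots = [
    ["Assessing", "Benchmarking"],
    ["Organisational"],
    ["Capacity", "Readiness"],
    ["Management"],
    ["Digital Content"],
]

# Build an acronym from the initial letters of one choice per slot.
for combo in product(*slots):
    acronym = "".join(word[0] for choice in combo for word in choice.split())
    print(acronym, "=", " / ".join(combo))
```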

Building a Digital Preservation Strategy

IRMS ARAI Event 19 November 2015

Last week I was in Dublin where I gave a presentation for the IRMS Ireland Group at their joint meeting with ARA Ireland. It was great for me personally to address a roomful of fellow Archivists and Records Managers, and learn more about how they’re dealing with digital concerns in Ireland. I heard a lot of success stories and met some great people.

Sarah Hayes, the Chair of IRMS Ireland, heard me speak earlier this year at the Celtic Manor Hotel (the IRMS Conference) and invited me to talk at her event. As a matter of fact I got a similar invitation from IRMS Wales this year, but Sarah wanted new content from me, specifically on the subject of Building a Digital Preservation Strategy.

How to develop a digital preservation strategy

My talk on developing a digital preservation strategy made the following points:

  • Start small, and grow the service
  • You already have knowledge of your collections and users – so build on that
  • Ask yourself why you are doing digital preservation, and who will benefit
  • Build use cases
  • Determine your own organisational capacity for the task
  • Increase your metadata power
  • Determine your digital preservation strategy (or strategies) in advance of talking to IT, or a vendor

I also presented some imaginary scenarios that would address digital preservation needs incrementally and meet requirements for different audiences:

  • Bit-level preservation (access deferred)
  • Emphasis on access and users
  • Emphasis on archival care of digital objects
  • Emphasis on legal compliance
  • Emphasis on income generation

Event Highlights

In fact the whole day was themed on Digital Preservation issues. John McDonough, the Director of the National Archives of Ireland, gave encouraging reports of how they are managing electronic records by “striding up the slope of enlightenment”. There’s an expectation that public services in Ireland must be “digital by default”, with an emphasis on continual online access to archival content in digital form. John is clear that archives in Ireland “underpin citizens’ rights” and are crucial to the “development of Nation and statehood”, which fits the picture I have of Dublin’s culture – it’s a city with a very clear sense of its own identity, and history.

In terms of change management and advocacy for working digitally, Joanne Rothwell has single-handedly transformed the records management of Waterford City and County Council, using SharePoint. Her resourceful use of an alphanumeric File Index allows machine-readable links between paper records and born-digital content, thus preserving continuity of materials. She also uses SharePoint’s site-creation facility to build a virtual space for holding “non-current” records, one which replicates existing file structures. It’s splendid to see sound records management practice carry across into the digital realm so successfully.

DPTP alumnus from the class of November 2011, Hugh Campbell of the Public Record Office of Northern Ireland, has developed a robust and effective workflow for the transfer, characterisation and preservation of digital content. It’s not only a model of good practice, but he’s done it all in-house with his own team, using open source tools and developer skills.

During the breaks I managed to mingle and met many other professionals in Ireland who have responded well to digital challenges. I was especially impressed by Liz Robinson, the Records Officer for the Health and Safety Authority in Ireland. We agreed that any system implementation should only proceed after a thorough planning period, where the organisation establishes its own workflows and procedures, and does proper requirements gathering. This ought to be a firm foundation in advance of purchasing and implementing a system. Sadly, we’ve both seen projects where the system drove the practice, rather than the other way around.

Plan, plan and plan again before you speak to a vendor: this was the underlying message of my ‘How to develop a digital preservation strategy’ talk, so it was nice to be singled out in one Tweet as a “particular highlight” of the day.

Making Progress in Digital Preservation: Part 3 – Roundtable

This one-day event on 31 October 2014 was organised by the DPC. The day concluded with a roundtable discussion, featuring a panel of the speakers and taking questions from the floor. The level of engagement from delegates throughout the event was clearly shown in the interesting questions posed to the panel, the thoughtful responses and the buzz of general discussion in this session. Among many interesting topics covered, three stand out as typical of the breadth of knowledge and interest shown at the event.

First, a fundamental question about the explosion of digital content and how it will impact on our work. How can we keep all of this stuff, where will we put it, and how much will it really cost? Sarah Middleton urged us to attend the upcoming 4C Conference in London to hear discussion of cutting-edge ideas about large-scale storage approaches. Catherine Hardman reminded us of one of the most obvious archival skills, which we sometimes tend to forget: selection. We do not have to keep “everything”, and a well-formulated selection policy continues to be an effective way to target the preservation of the most meaningful digital resources.

Next, a question on copyright and IPR as it applies to archives/archivists – and hence digital preservation – quickly spun out into the audience and back to different panel members in a lively discussion. The general inability of the current legislation, formed in a world of print, to deal with the digital reality of today was quickly identified as an obstacle both to those engaged in digital preservation and to users seeking access to digital resources.

The Hargreaves report was mentioned (by Ed Pinsent of ULCC) and given an approving nod for the sensible approach it took to bringing legislation into the 21st century. However, the slow pace at which any change has actually been implemented was of concern to all, and was felt to be damaging to the need to preserve material. The issues around copyright and IPR were knowledgeably discussed from a wide variety of perspectives, including the cultural heritage sector, specialist collections, and archaeological data and resources – and, equally importantly among delegates, the inability to fully open up collections to users in order to comply with the law as it stands.

Some hope was found, though, in the recent (and ongoing) Free Our History campaign. Using the national and international awareness of various exhibitions, broadcasts and events to mark the anniversary of the First World War, the campaign has focussed on the WW1 content that museums, libraries and archives are unable to display because of current copyright law. Led by the National Library of Scotland, many memory institutions and cultural heritage institutions have joined the CILIP campaign to prominently exhibit a blank piece of paper, the blank page representing the many items which cannot be publicly displayed. The visual impact of such displays has caught attention, and the accompanying petition is currently being addressed by the UK government.

The third issue raised during this session was the suggestion for more community activity, for example more networking and exchange of experience opportunities. Given the high rate of networking during lunchtime and breaks, not to mention the lively discussions and questions, this was greeted with enthusiasm. Kurt Helfrich from RIBA explained his idea for an informal group to organise site visits and exchange of experience sessions among themselves, perhaps based in London to start off with. Judging by the level of interest among delegates to share their own work and learn from others during this day, this would be really useful to many. Leaving the event with positive plans for practical action felt a very fitting way to end an event around making progress in digital preservation.

Download the slides from this event

Making Progress in Digital Preservation: Part 2 – Costs, Standards, Tools and Solutions

This one-day event on 31 October 2014 was organised by the DPC. After lunch Sarah Middleton of the DPC reported on progress from the 4C Project on the costs of curation. The big problem facing the digital preservation community is that the huge volumes of data we are expected to manage are increasing dramatically, yet our budgets are shrinking. Any investment we make must be strategic and highly targeted, and collaboration with others will be pretty much an essential feature of the future. To assist with this, the 4C project has built the Curation Exchange platform, which will allow participating institutions to share – anonymised, of course – financial data in a way that will enable the comparison of costs. The 4C project has worked very hard to advance us beyond the simple “costs model” paradigm, and this dynamic interactive tool will be a big step in the right direction.

William Kilbride then described the certification landscape, mentioning Trusted Digital Repositories, compliance with the OAIS Model, the Trusted Repositories Audit & Certification checklist, and the evolution of the European standards DIN 31644 and the Data Seal of Approval. William gave his personal endorsement to the Data Seal of Approval approach (it has been completed by 36 organisations, and another 30 are in the process of doing it), and suggested that we all try an exercise to see how many of the 16 elements we felt we could comply with. After ten minutes, a common lament was “there are things here beyond my control…I can’t influence my depositors!”

William went on to discuss tools for digital preservation. Very coincidentally, he had just participated in the DPC collaborative “book sprint” event for the upcoming new DPC Handbook, and helped to write a chapter on this very topic. Guess what? There are now more tools for digital preservation than we know what to do with. The huge proliferation of tools we can use, for everything from ingest to migration to access, has developed into a situation where we can hardly find them any more, let alone use them. William pins his hopes on the tools registry COPTR, a user-driven wiki with brief descriptions of the functionality and purpose of hundreds of tools – but COPTR is just one of many such registries. The field is crowded with competitors such as the APARSEN Tool Repository, DCH-RP, the Library of Congress, DCEX…ironically, we may soon need a “registry of tool registries”.

Our host James Mortlock described the commercial route his firm had taken in building a bespoke digital repository and cataloguing tool. His project management process showed him just how requirements can evolve in the lifetime of a project – what they built was not what they first envisaged, but through the process they came up with stronger ideas about how to access content.

Kurt Helfrich’s challenge was not only to unify a number of diverse web services and systems at RIBA, but also to create a seamless entity in the Cloud that could meet multiple requirements. RIBA is in a unique position to work on system platforms and their development, because of its strategic partnership with the V&A, a partner organisation with whom they even share some office space. The problem he faces is not just scattered teams, but one of mixed content – library and archive materials in various states of completion regarding their digitisation or cataloguing. Among his solutions, he trialled the Archivists’ Toolkit, which served him so well in California, and the open-source application Archivematica, with an attached AtoM catalogue and DuraCloud storage service. A keen adopter of tools, Kurt proposed that we look at the POWRR tool grid, which is especially suitable for small organisations, and BitCurator, the digital forensics system from Chapel Hill.

Download the slides from this event

Making Progress in Digital Preservation: Part 1 – The path towards a steady state

This one-day event on 31 October 2014 was organised by the DPC and hosted at the futuristic, spacious offices of HSBC, where the presentation facilities and the catering were excellent. All those attending were given plenty of mental exercises by William Kilbride. He said he wanted to build on his “Getting Started in Digital Preservation” events and help everyone move further along the path towards a steady state, where digital preservation starts to become “business as usual”. The very first thing he proposed was a brief discussion exercise in which people shared things they had tried, and what worked and what didn’t.

Kurt Helfrich from the RIBA Library said his organisation had a large number of staff administering a historic archive; various databases, created at different times for different needs, would be better if connected. He was keen to collaborate with other RIBA teams and link the “silos” in his agency.

Lindsay Ould from King’s College London said “starting small worked for us”. They’ve built a standalone virtual machine, using locally-owned kit, and are using it for “manual” preservation; when they’ve got the process right, they could automate it and bring in network help from IT.

When asked about “barriers to success”, over a dozen hands in the room went up. Common themes: getting the momentum to get preservation going in the first place, and extracting a long-term commitment from Executives, who lose interest when they see it’s not going to be finished in 12 months. There’s a need to do advocacy regularly, not just once; and a need to convince depositors to co-operate. IT departments, especially in the commercial sector, are slow to see the point of digital preservation if its “business purpose” – a euphemism for “income stream”, I would say – is not immediately apparent. Steph Taylor of ULCC pointed out that case studies of tools in our profession are mostly geared to the needs of large memory institutions, not the dozens of county archives and small organisations who were in the room.

Ed Pinsent (i.e. me) delivered a talk on conducting a preservation assessment survey, paying particular attention to the Digital Preservation Capability Maturity Model and other tools and standards. If done properly, this could tell you useful things about your capability to support digital preservation; you could even use the evidence from the survey to build a business case for investment or funding. The tricky thing is choosing the model that’s right for you; there are about a dozen available, with varying degrees of credibility as to their fundamental basis.

Catherine Hardman from the Archaeological Data Service (ADS) is one who is very much aware of “income streams”, since the profession of archaeology has become commercialised and somewhat profit-driven. She now has to engage with many depositors as paying customers. To that end, she’s devised a superb interface called ADS Easy that allows them to upload their own deposits, and add suitable metadata through a series of web forms. This process also incorporates a costing calculator, so that the real costs of archiving (based on file size) can be estimated; it even acts as a billing system, creating and sending out invoices. Putting this much onus on depositors is, in fact, a proven effective way of engaging with your users. In the same vein, ADS have published good practice guidance on things to consider when using CAD files, and advice on metadata to add to a Submission Package. Does she ever receive non-preferred formats in a transfer? Yes, and their response is to send them back – the ADS has had interesting experiences with “experimental” archaeologists in the field. Kurt Helfrich opened up the discussion here, speaking of the lengthy process before deposit that is sometimes needed; he memorably described it as a “pre-custodial intervention”. Later in the day, William Kilbride picked up this theme: maybe “starting early”, while good practice, is not ambitious enough. Maybe we have to begin our curation activities before the digital object is even created!
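
Purely as a toy illustration – not ADS’s actual formula, which I haven’t seen – a costing calculator based on file size reduces to something like this sketch, with an invented tariff:

```python
# Hypothetical tariff: a flat ingest fee plus a per-gigabyte charge.
INGEST_FEE = 50.00   # currency units per deposit
RATE_PER_GB = 2.50   # currency units per gigabyte archived

def archiving_cost(total_bytes: int) -> float:
    """Estimate the cost of archiving a deposit from its total file size."""
    gigabytes = total_bytes / 1e9
    return INGEST_FEE + RATE_PER_GB * gigabytes

# A 40 GB deposit under this invented tariff:
print(f"{archiving_cost(40_000_000_000):.2f}")
```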

Catherine also perceived an interesting shift in user expectations: they want more from digital content, and leaps in technology make them impatient for speedy delivery. As part of meeting this need, ADS have embraced the OAI-PMH protocol, which enables them to reuse their collections metadata and enhance their services to multiple external stakeholders.
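
For readers unfamiliar with OAI-PMH, a metadata harvest is pleasingly simple; here is a minimal sketch using a placeholder endpoint (the URL is not the actual ADS address):

```python
import urllib.request
import xml.etree.ElementTree as ET

# Placeholder endpoint: substitute a real OAI-PMH base URL.
BASE_URL = "https://example.org/oai"
OAI_NS = "{http://www.openarchives.org/OAI/2.0/}"
DC_NS = "{http://purl.org/dc/elements/1.1/}"

# ListRecords with the standard Dublin Core metadata prefix.
url = f"{BASE_URL}?verb=ListRecords&metadataPrefix=oai_dc"
with urllib.request.urlopen(url) as response:
    tree = ET.parse(response)

# Print the title of each harvested record.
for record in tree.iter(f"{OAI_NS}record"):
    for title in record.iter(f"{DC_NS}title"):
        print(title.text)
```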

There is no doubt that having a proper preservation policy in place would go some way to helping address issues like this. When Kirsty Lee from the University of Edinburgh asked how many of us already had a signed-off policy document, the response level was not high. She then shared with us the methodology she’s using to build a policy at Edinburgh, and it’s a meticulous, well-thought-through process indeed. Her flowcharts show her constructing a complex “matrix” of separate policy elements, all drawn from a number of reports and sources, which tend to say similar things but in different ways; her triumph has been to distil this array of information and, equally importantly, arrange the elements in a meaningful order.

Kirsty is upbeat and optimistic about the value of a preservation policy. It can be a statement of intent; a mandate for the archive to support digital records and archives. It provides authority and can be leverage for a business case; it helps get senior management buy-in. To help us understand, she gave us an excellent handout which listed some two dozen elements; the exercise was to pick only the ones that suit our organisation, and to put them in order of priority. The tough part was coming up with a “single sentence that defines the purpose of your policy” – I think we all got stumped by this!

Download the slides from this event

IT skills for archivists and librarians

In September this year Dave Thompson of the Wellcome Library asked a question on Twitter, one which is highly relevant to digital preservation practice and learning skills. Addressing digital archivists and librarians, he asked: “Do we need to be able to do all ourselves, or know how to ask for what is required?”

My answer is “we need to do both”…and I would add a third thing to Dave’s list. We also need to understand enough of what is happening when we get what we ask for, whether it’s a system, tool, application, storage interface, or whatever.

Personally, I’ve got several interests here. I’m a traditional archivist (got my diploma in 1992 or thereabouts) with a strong interest in digital preservation since about 2004.

As an archivist wedded to paper and analogue methods, for some years I was fiercely proud of my lack of IT knowledge. Whenever forced to use IT, I found I was always happier when I could open an application, see it working on the screen, and experiment with it until it did what I wanted it to do. On this basis, for example, I loved playing around with the File Information Tool Set (FITS).

When I first managed to get some output from FITS, it was like seeing the inside of a file format for the first time. I could see the tags and values of a TIFF file, some of which I was able to recognise as those elusive “significant properties” you hear so much about. So this is what they look like! From my limited understanding of XML – the format FITS outputs – I knew that XML is structured and can be stored in a database. That meant I’d be able to store those significant properties as fields in a database, and interrogate them. This would give me the intellectual control that I used to relish with my old card catalogues in the late 1980s. I could see from this how it would be possible to have “domain” over a digital object.
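
For anyone who wants to try the same desktop experiment, here is a small sketch of the kind of thing I mean: walking the XML report that FITS produces and flattening element names and values into rows one could load into a database. The filename is hypothetical, and the namespace is the one I have seen FITS use – treat it as an assumption that may vary between versions.

```python
import xml.etree.ElementTree as ET

# FITS writes its report as namespaced XML; this is the namespace
# used by the versions I have seen, but treat it as an assumption.
FITS_NS = "{http://hul.harvard.edu/ois/xml/ns/fits/fits_output}"

tree = ET.parse("fits-output.xml")  # hypothetical filename

# Flatten every element with a text value into (tag, value) rows,
# ready to be stored as fields in a database and interrogated later.
rows = []
for element in tree.iter():
    if element.text and element.text.strip():
        tag = element.tag.replace(FITS_NS, "")
        rows.append((tag, element.text.strip()))

for tag, value in rows:
    print(f"{tag}: {value}")
```

Even a toy like this makes the point: the properties are knowable, and they can be put under the same intellectual control as a card catalogue.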

There’s a huge gap, I know, between me messing around on my desktop and the full functionality of a preservation system like Preservica. But with exercises like the above, I feel closer to the goal of being able to “ask for what is required”, and more to the point, I could interpret the outputs of this functionality to some degree. I certainly couldn’t do everything myself, but I want to feel that I know enough about what’s happening in those multiple “black boxes” to give me the confidence I need as an archivist that my resources are being preserved correctly.

I would like to think it’s possible to equip archivists, librarians and data managers with the same degree of confidence; teaching them “just enough” of what is happening in these complex processes, at the same time translating machine code into concrete metaphors that an information professional can grasp and understand. In short, I believe these things are knowable, and archivists should know them. Of course it’s important that the next step is to open a meaningful discussion with the developer, data centre manager, or database engineer (i.e. “ask for what is required”), but it’s also important to keep that dialogue open, to go on asking, to continue understanding what these tools and systems are doing. There is a school of thought that progress in digital preservation can only be made when information professionals and IT experts collaborate more closely, and I would align myself with that.