Enhancing Linnean Online: The AIDA metrics

From Enhancing Linnean Online project blog

So far we’ve mentioned Beagrie’s metrics for measuring improvements to the management of academic research data, and the Ithaka metrics for measuring improvements to the delivery of content, particularly with regard to the operation of an organisation’s business model.

A third possibility is to make use of UoL’s AIDA toolkit, a desk-based assessment method which has gone through many iterations and possible applications. Over time, we’ve shown how it could be used for digital assets, records management, and even research data (although admittedly it has never been used in anger in those situations). AIDA is not intended to measure the assets themselves; instead it measures the capability of the Institution (or the owning Organisation) to preserve its own digital resources.

In July 2011 we produced a detailed reworking of AIDA specifically for use with research data. This was part of the JISC-funded IDMP project, and the intention was that AIDA could feed into the DCC’s online assessment tool, CARDIO. The reworking was greatly assisted by the expertise of numerous external consultants, recruited from a wide range of locations and skillsets, who fine-tuned the wording of the AIDA assessment statements to make it a benchmarking tool with great potential.

AIDA is predicated on the notion of “continuous improvement”, and expresses its benchmarking with an adapted version of the “Five Stages” model originally developed at Cornell University by Anne Kenney and Nancy McGovern. It also uses their “Three Legs” framework to ensure that the three mainstays of digital preservation (Organisation, Technology and Resources) are properly investigated.

We think there may be some scope for applying AIDA to JISC ELO, mainly as an analysis tool or knowledge base for interpreting the responses to questionnaires and surveys. It could assess broadly whether the Linnean Online service currently sits at Stage Two or Stage Three; we could subsequently measure whether the enhancements, once implemented, have moved the service forward to Stage Four or Stage Five.
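
To make that concrete, here is a minimal sketch of how stage scores could be recorded and compared before and after the enhancements. The legs follow AIDA; the element names and scores are invented placeholders, not taken from the actual toolkit.

```python
# A minimal sketch: record AIDA-style stage scores (1-5) per element,
# grouped under the "Three Legs", and compare two assessments.
# Element names and scores are invented for illustration.

LEGS = ("Organisation", "Technology", "Resources")

def leg_stage(assessment: dict, leg: str) -> float:
    """Average stage reached across the elements of one leg."""
    elements = assessment[leg]
    return sum(elements.values()) / len(elements)

def compare(before: dict, after: dict) -> None:
    """Print the stage movement for each leg."""
    for leg in LEGS:
        print(f"{leg}: stage {leg_stage(before, leg):.1f} -> {leg_stage(after, leg):.1f}")

# Hypothetical assessments, shaped as {leg: {element: stage}}
before = {
    "Organisation": {"Metadata": 2, "Licensing": 2, "Preservation policy": 3},
    "Technology": {"Storage and infrastructure": 3},
    "Resources": {"Revenue generation": 2},
}
after = {
    "Organisation": {"Metadata": 4, "Licensing": 3, "Preservation policy": 4},
    "Technology": {"Storage and infrastructure": 4},
    "Resources": {"Revenue generation": 3},
}
compare(before, after)
```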

This could be done with a little tweaking of the wording of the current iteration of AIDA, and through selective, partial application of its benchmarks. We think it would be a good fit for the ELO project strands which discuss Metadata, Licensing, and Preservation Policy – all of which are expressed in the Organisation leg of AIDA. The Resources leg could likewise be tweaked to measure improvements in the area of ELO’s Revenue Generation. One of AIDA’s most salient features is its flexibility.

Versions of the adapted AIDA toolkit can be found via the project blog, although the improved CARDIO version has not been published as yet.

Enhancing Linnean Online: The Ithaka metrics

From Enhancing Linnean Online project blog

In our last post, we considered whether the Beagrie metrics are going to work for this project. This time, we’ll look at another JISC-related initiative, the Ithaka study on sustainability (Sustaining Digital Resources: An On-the-Ground View of Projects Today) from July 2009.

Beagrie’s metrics were of course directed at the HE/FE sector, and the main beneficiaries in his report are Universities, researchers, staff, and students, through improved scholarly access. Ithaka, by contrast, takes the view that an organisation really needs a business model to underpin long-term access to its digital content and to manage preservation of that content. They undertook 12 case studies examining such business models in various European organisations, and identified numerous key factors for success and sustainability.

The subjects of these case studies were not commercially-oriented businesses as such, but Ithaka takes a no-nonsense view of what “sustainability” means in a digital context: whatever you do, you need to cover your operating costs. One of the report’s chief interests, then, is discovering what your revenue-generating strategy is going to be. They identify metrics for success, but it’s clear that what they mean by “success” is the financial success of the resource and its revenue model, and that is what is being measured.

The metrics proposed by Ithaka are very practical and tend to deal with tangibles. Broadly, I see three themes in the metrics:

1. Quantitative metrics which apply to the content

  • Amount of content made available
  • Usage statistics for the website

2. Quantitative metrics which apply to the revenue model (see the sketch after this list)

  • Amount of budget expected to be generated by revenue strategies
  • Numbers of subscriptions raised, against the costs of generating them
  • Numbers of sales made, against the costs of generating them

3. Intangible metrics

  • Proving the value and effectiveness of a project to the host institution
  • Proving the value and effectiveness of a project to stakeholders and beneficiaries
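
As a rough, back-of-envelope illustration of the theme-2 ratios (subscriptions or sales set against the cost of generating them), here is a minimal sketch; every figure in it is invented.

```python
# A minimal sketch of the theme-2 ratios: cost of generating each
# subscription raised or sale made. All figures are invented.

def cost_per_unit(units: int, generation_cost: float) -> float:
    """Cost of generating each subscription or sale."""
    return generation_cost / units if units else float("inf")

subscriptions, subscription_costs = 120, 3000.00  # hypothetical numbers
sales, sales_costs = 450, 1800.00

print(f"Cost per subscription raised: £{cost_per_unit(subscriptions, subscription_costs):.2f}")
print(f"Cost per sale made: £{cost_per_unit(sales, sales_costs):.2f}")
```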

How would these work for our project? My sense is that (1) ought to be easy enough to establish, particularly if we apply our before-and-after method here and compile some benchmark statistics (e.g. figures from the Linnean weblogs) at an early stage, which can then be revisited in a few years.
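
A minimal sketch of what compiling those benchmark figures could look like, assuming (and this is our assumption) that the Linnean weblogs are in the standard Apache/Nginx “combined” format:

```python
# A minimal sketch: count requests per day from a combined-format access
# log, as a baseline usage statistic for the before-and-after comparison.
# The log path and format are assumptions about the Linnean setup.

import re
from collections import Counter

# Matches "host ident user [dd/Mon/yyyy" and captures the date
LOG_LINE = re.compile(r"\S+ \S+ \S+ \[(\d{2}/\w{3}/\d{4})")

def requests_per_day(log_path: str) -> Counter:
    """Count requests per calendar day from an access log."""
    days = Counter()
    with open(log_path) as log:
        for line in log:
            match = LOG_LINE.match(line)
            if match:
                days[match.group(1)] += 1
    return days

# e.g. requests_per_day("access.log").most_common(10)
```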

As to (2), revenue generation is something we have explicitly outlined in our bid. Since the project is predicated on repository enhancements, we intend to develop those enhancements in line with the existing revenue models proposed to us by the Linnean staff. Our thinking at this time is that the digitised content can be turned into an income stream through imaginative and innovative strategies for the reuse of images and other digital content, which might involve licensing. As yet we haven’t discussed plans for a subscription service or direct sales of content.

(3) is an interesting one. The immediate metric we’re thinking of applying here is how far the enhanced repository features improve the user experience. I’m also expecting that when we interview stakeholders in more detail, they will provide more wide-ranging views about “value and effectiveness”, connected with their research and scholarship. These intangibles amount to much more than ease of navigation or speed of download, and they ought to be translatable into something of value which we can measure.

But maybe we can also look again at the host institution, and find examples of organisational goals and policies at Linnean that we could align with the enhancement programme, with a view to showing how each enhancement assists a specific goal of the organisation. As Ithaka found, however, this approach works better for a large memory institution like TNA (The National Archives), which operates under a civil service structure with key performance indicators and very strong institutional targets.

All in all, the Ithaka model looks like it can work well for this project, provided we can promote the idea of a “business model” to Linnean without sounding like we’re planning some form of corporate takeover!

Enhancing Linnean Online: Beagrie’s Metrics

From Enhancing Linnean Online project blog

We’re aiming to deliver a set of enhancements to Linnean, but how will we know if they worked? One of the aims of the ELO project is to measure the results of the programme of enhancements in terms of tangible benefits to Linnean and its stakeholders. We’re thinking about a framework that will enable us to measure the results of this before-and-after process.

Our thinking at the moment is that we could adapt the Beagrie metrics published in Benefits from the Infrastructure Projects in the JISC Managing Research Data Programme, which were devised for measuring the value of research data to an HEI.

The Institutions that Beagrie worked with were asked how their lives would improve if their research data were better managed. (Data management planning is a wide-ranging process that includes preservation as one of its outcomes.) Those consulted were very good at coming up with lists of potential benefits; coming up with reliable means of measuring those benefits proved rather harder.

Even so, the report came up with a very credible list, organised under the names of the stakeholders who would benefit the most. A little tinkering with that table allows us to put Linnean at the top of the list as the main beneficiary. We also know Linnean has researchers, and that they are concerned with scholarly access. This suggests a framework like the one below might work for us.

Benefits Metrics for Linnean

  • New research grant income
  • Number of research dataset publications generated
  • Number of research papers
  • Improvements over time in benchmark results
  • Cost savings/efficiencies
  • Re-use of infrastructure in new projects

Benefits Metrics for researchers

  • Increase in grant income/success rates
  • Increased visibility of research through data citation
  • Average time saved
  • Percentage improvement in range/effectiveness of research tool/software

Benefits Metrics for Scholarly Communication and Access

  • Number of citations to datasets in research articles
  • Number of citations to specific methods for research
  • Percentage increase in user communities
  • Number of service level agreements for nationally important datasets
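
To make the before-and-after process concrete, here is a minimal sketch of how baseline and revisited values for metrics like those above could be recorded and compared. The metric names are abbreviated and every value is an invented placeholder.

```python
# A minimal sketch: record baseline and revisited values for a few of the
# metrics above and report the movement. All values are invented.

baseline = {"Research papers": 14, "Dataset citations": 3, "User community size": 250}
revisited = {"Research papers": 22, "Dataset citations": 9, "User community size": 410}

for metric, before in baseline.items():
    after = revisited[metric]
    change = 100.0 * (after - before) / before
    print(f"{metric}: {before} -> {after} ({change:+.0f}%)")
```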

The Institutions in the report go on to give specific instances of how these metrics apply in their case. For instance, for the “Average Time Saved” metric the Sudamih project reported:

“In an attempt to measure benefit 1 (time saved by researchers by locating and retrieving relevant research notes and information more rapidly) Sudamih asked course attendees to estimate how much of their time spent writing up their research outputs is actually spent looking for notes/files/data that they know they already have and wish to refer to. The average was 18%, although in some instances it was substantially more, especially amongst those who had already spent many years engaged in research (and presumably therefore had more material to sift through). This would indicate that there is at least considerable scope to save time (and improve research efficiency) by offering training that over the long term could improve information management practices.”

However, the report is also clear that enhancements of any kind (technical, administrative, cultural) can take some time to bed down before their benefits are even visible, let alone measurable. “Measuring benefits therefore might be best undertaken over a longer time-scale” is one possible conclusion. That is a caveat we’ll have to bear in mind, but it doesn’t preclude us from devising our own bespoke set of metrics.

Every man his own modified digital object

We’ve just completed our Future-Proofing study at ULCC and sent the final report to the JISC Programme Manager, with hopes of a favourable sign-off so that we can publish the results on our blog.

It was a collaboration between Kit Good, the records manager here at UoL, and myself, and we’re quite pleased with the results. We wanted to see if we could create preservation copies of core business documents that require permanent preservation, but do it with a very simple intervention and zero overheads. So we worked with a simple toolkit of services and software that can plug into a network drive, using open source migration and validation tools. Our case study sought to demonstrate the viability of this approach. Along the way we learned a lot about how the Xena digital preservation software operates, and how (combined with OpenOffice) it does a very credible job of producing bare-bones Archival Information Packages and putting information into formats with improved long-term prospects.

The project worked on a small test corpus of common Institutional digital records, performed preservation transformations on them, and conducted a systematic evaluation to ensure that the conversions worked; that the finished documents render correctly; that sufficient metadata has been generated for preservation purposes, and can feasibly be extracted and stored in a database; and that the results are satisfactory and fit for purpose.
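
For a flavour of the migrate-and-record cycle, here is a minimal sketch. It drives LibreOffice’s headless converter as a stand-in (Xena’s own interfaces are not shown here), and the paths and metadata field names are our own illustrative choices.

```python
# A minimal sketch: migrate a document to ODF Text via LibreOffice's
# headless converter, then write a bare-bones metadata sidecar (checksums,
# timestamp, tool) alongside it. Paths and field names are illustrative.

import hashlib
import json
import subprocess
from datetime import datetime, timezone
from pathlib import Path

def sha256(path: Path) -> str:
    """Fixity checksum for the metadata record."""
    return hashlib.sha256(path.read_bytes()).hexdigest()

def migrate(source: Path, out_dir: Path) -> Path:
    """Convert source to .odt and record preservation metadata."""
    out_dir.mkdir(parents=True, exist_ok=True)
    subprocess.run(
        ["soffice", "--headless", "--convert-to", "odt",
         str(source), "--outdir", str(out_dir)],
        check=True,
    )
    preserved = out_dir / (source.stem + ".odt")
    metadata = {
        "source": source.name,
        "source_sha256": sha256(source),
        "preserved": preserved.name,
        "preserved_sha256": sha256(preserved),
        "migrated_at": datetime.now(timezone.utc).isoformat(),
        "tool": "LibreOffice headless --convert-to odt",
    }
    (out_dir / (source.stem + ".metadata.json")).write_text(json.dumps(metadata, indent=2))
    return preserved

# e.g. migrate(Path("minutes.doc"), Path("aip/minutes"))
```

(Rendering still has to be checked by eye; the sidecar gives us something a database can ingest later.)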

The results show that it is possible to build a low-cost, practical preservation solution that addresses immediate preservation problems, makes use of available open source tools, and requires minimal IT support. We think the results of the case study can feasibly be used by other Institutions facing similar difficulties, and scaled up to apply to the preservation of other, more complex digital objects. It will enable non-specialist information professionals to perform certain preservation and information management tasks with a minimum of preservation-specific theoretical knowledge.

Future-Proofing won’t solve your records management problems, but it stands a chance of empowering records managers by allowing them to create preservation-worthy digital objects out of their organisation’s records, without the need for an expensive bespoke solution.