Things to consider before undertaking a digitisation project

Counter-intuitive as it may seem, this blog post will try to advance the idea that embarking on a project to digitise your paper collections isn’t always a great idea. This isn’t to say you should abandon the idea completely, but we would encourage you to think it through. You could read this post as a sort of cautionary tale.

The Harvard report Selecting Research Collections for Digitization proposes a number of very sound reasons why an HFE Institution should pause before it commits resources to any large and complex digitisation project. Its authors provide the reader with a series of questions that will help a good project planner steer a way through the decision process.

Among the reasons identified by these experts, I will single out two of my favourite themes:

Is anyone even interested?

Look at the material you’re intending to digitise. Does it have any value? Do you think readers, users, researchers and customers are going to be interested in it? Even if they are interested, why does it improve the situation for them to access it in digital form? Will usage of the material increase? If you increase access to thousands more people around the world who look at the material through your online catalogue, is that a genuine improvement? Why?

The answers to these questions may seem obvious to you, but this line of thinking can also expose some of our assumptions and pre-conceived ideas about our relationship with our audience, and the real value of serving content digitally.

We might assume a collection is going to be popular when it isn’t. We might assume that simply scanning a book and putting images of the pages online is all we need to do. Have we even asked the readers what they would like?

Can you go on supporting it?

This is about the very real problem of ongoing costs. We may assume that once all the scans are produced, the project budget can be closed. In fact, it continues to cost you money to store, support, manage and steward your digitised collections; and that’s leaving aside the cost of long-term preservation, should you realise there’s permanent value in the digital material you have created. In short, it may cost more than you think.

My former colleague Patricia Sleeman carried out a survey in 2009 of a number of HFE Institutions that had received JISC funding for digitisation projects over the previous decade. She found:

“Four principal themes surfaced through analysis of the preservation plans of the digitisation projects that relate the maturity of institution to the likely success of their digitisation efforts. These are the need for preservation policies; collection management procedures; robust preservation infrastructures; and sustainability. In short, institutions or consortia which have clarity in these four areas considerably reduce the risks associated with long term access to digitized collections.”

Both of these reports may have been aimed primarily at HFE audiences in a research context, but I think the lessons apply to any organisation that intends to digitise content, including those in the commercial sector.

You’re considering spending a lot of money on digitising this collection, and potentially committing the resources of people, technology, and time. If you proceed with the project on overly-optimistic assumptions, it can lead to difficulties in the future.

However, don’t let this discourage you…

When you’ve decided to say “yes”

The benefits of doing digitisation have probably occurred to you already (saves wear and tear on originals, disseminates more content to a wider audience, benefits the organisation, may help with income generation…). I also like to encourage project managers to rethink, if possible, what the collection’s potential is for engaging with its intended audience. Are we happy to continue the traditional model of the searcher visiting the searchroom and looking at a box of photographs with captions, only doing it in a “digital” manner? Wouldn’t we like to use web tools like page-turners and zoom devices to enhance and improve on the experience in some way?

The great thing is that if you’ve done scanning according to best practices, you can repurpose your resources (as Access Copies) in a myriad of ways, making the most of access technologies. You’re now opening the doors for a potential dialogue with your user community, responding to changes in user needs and repurposing the way you serve your content. All your hard work will have paid off.

Priorities for business scanning

A business may decide to scan all their current paperwork, but this is not quite the same as a managed digitisation project.

Quite often a project like this is undertaken for a number of reasons: to save money, to improve efficiency, and to save space occupied by paper. The dream of the “paperless office” has been haunting us for about 30 years now. It still hasn’t come true, at least not in the way they promised us. I can personally recall a time when scanning bureaux appeared in the UK almost overnight, offering to convert the contents of 25 file cabinets into digital scans, and put them all onto a single CD ROM.

The prospect of doing this often appealed to senior executives, especially as the next logical step in their minds would be to get all that paper destroyed (a suggestion that usually causes an information manager to shudder).

How it differs from a traditional digitisation project

Which brings us to the next aspect that interests me. How long are we intending to keep these scans? A digitisation project for a library or archive collection will most likely result in digital content which we wish to preserve and keep permanently, because it’s both a valuable digital asset and a digital surrogate of an important part of our collections. However, when we take on “scanning for business”, as I call it, it’s possible the scans might have a relatively short shelf-life.

This is where it starts to shade into a records management concern. In fact my ideal would be to see a scanning project owned by the records manager, with one eye on user satisfaction, another on protecting business and legal needs, and a third eye on the possible long-term retention needs.

Taking all this into account, ideally we’d try to frame this project with a different emphasis from the one we have when doing digitisation for preservation purposes. Our list of priorities when scanning for business might look a bit like this:

Metadata

People need to find stuff again, and any automated retrieval system will only work if there’s sufficient metadata for the objects stored in it. We’d like to think about using a pre-determined metadata schema, depending on the nature of the content, along with tags, folders, and naming rules that will help users retrieve content. My point here is that metadata decisions will tend to be driven by immediate user needs, rather than archival or library cataloguing standards.
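As a minimal sketch of what such naming rules might look like in practice, the Python below builds and checks file names from a handful of metadata fields. The field names (department, doc_type, date, sequence) and the pattern itself are assumptions made purely for illustration, not a recommended standard; you would substitute whatever your users actually search by.

```python
import re
from datetime import date
from typing import Optional

# Hypothetical naming rule: DEPT_doctype_YYYY-MM-DD_NNNN.pdf
NAME_PATTERN = re.compile(
    r"^(?P<department>[A-Z]{2,5})_"
    r"(?P<doc_type>[a-z]+)_"
    r"(?P<date>\d{4}-\d{2}-\d{2})_"
    r"(?P<sequence>\d{4})\.pdf$"
)


def build_name(department: str, doc_type: str, doc_date: date, sequence: int) -> str:
    """Assemble a file name that follows the rule above."""
    name = (
        f"{department.upper()}_{doc_type.lower()}_"
        f"{doc_date.isoformat()}_{sequence:04d}.pdf"
    )
    if NAME_PATTERN.match(name) is None:
        raise ValueError(f"fields do not fit the naming rule: {name}")
    return name


def parse_name(filename: str) -> Optional[dict]:
    """Recover the metadata fields from a file name, or None if it breaks the rule."""
    match = NAME_PATTERN.match(filename)
    return match.groupdict() if match else None
```

A rule like this lets the file name itself carry enough metadata to be searched and sorted, and lets you spot files that were saved without following it (parse_name returns None for them).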

Image quality

For a long-term preservation project, our first thought would be of high-resolution image files encoded in robust, open formats. For business scanning, we may well be able to compromise on the quality. If we can get away with lower-resolution images in compressed files, it’s worth considering. It may depend on whether the staff want OCR as well as images, which is yet another consideration.
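To get a feel for what that compromise is worth, here is a rough back-of-envelope calculation in Python. The page size (A4, about 8.27 by 11.69 inches) and the two scanning profiles are assumptions chosen for illustration only, and the figures are for raw pixel data before any compression is applied.

```python
def uncompressed_size_bytes(
    width_in: float, height_in: float, dpi: int, bits_per_pixel: int
) -> int:
    """Size of the raw pixel data for one scanned page, ignoring file headers."""
    width_px = round(width_in * dpi)
    height_px = round(height_in * dpi)
    return width_px * height_px * bits_per_pixel // 8


# Assumed page size: A4, roughly 8.27 x 11.69 inches.
A4 = (8.27, 11.69)

# Two hypothetical profiles: a colour master and a greyscale working copy.
preservation = uncompressed_size_bytes(*A4, dpi=600, bits_per_pixel=24)
business = uncompressed_size_bytes(*A4, dpi=200, bits_per_pixel=8)

print(f"600 dpi colour:    {preservation / 1_000_000:.1f} MB per page")
print(f"200 dpi greyscale: {business / 1_000_000:.1f} MB per page")
print(f"ratio: {preservation / business:.0f}x")
```

Tripling the resolution multiplies the pixel count by nine, and moving from greyscale to colour triples it again, which is why the choice matters so much once you multiply it across thousands of pages.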

Retention and disposal

Our plan for scanning must align with records management plans. The content is still maintained for as long as there’s a business need, just the same as when it was in paper form. Likewise, we’d hope staff co-operate with our recommended best practices for file naming and description, to assist with those retention decisions.
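One way to picture this alignment is a small retention schedule that the scanned files are checked against. The sketch below is Python, and the record classes and retention periods in it are invented for illustration; real periods come from your own records management policy and legal advice.

```python
from datetime import date

# Hypothetical retention schedule: record class -> years to keep after the record date.
RETENTION_YEARS = {
    "invoice": 7,
    "staff_file": 6,
    "meeting_minutes": 10,
}


def disposal_due(record_class: str, record_date: date) -> date:
    """Earliest date on which a record of this class may be reviewed for disposal."""
    years = RETENTION_YEARS[record_class]
    try:
        return record_date.replace(year=record_date.year + years)
    except ValueError:
        # 29 February in a non-leap target year: fall back to 1 March.
        return date(record_date.year + years, 3, 1)


def is_due(record_class: str, record_date: date, today: date) -> bool:
    """True when the retention period has run out as of `today`."""
    return today >= disposal_due(record_class, record_date)
```

This only works if each scan carries a record class and a date, which is another reason the file naming and description practices matter.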

Authenticity

We’re all concerned with creating “authentic” digital objects, but the business need in this scenario might be slightly different to how an archivist or a researcher regards an authentic digital object. In the archives scenario, the archivist wants to be sure the preserved object is a genuine representation of the original, and so do their users. In the business scenario, we not only need to be assured of that, but we also want hard evidence that this is the case, for when the auditors start asking questions. We’re thus facing at least two tough tasks – ensuring the scans themselves are authentic when they’re created, and then making sure we maintain that authenticity through daily use of the scans. We’d certainly want some form of evidence chain and audit trail for that.
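A common building block for this kind of evidence is a checksum recorded when the scan is made, plus a log of later events. The Python sketch below shows the idea using SHA-256 from the standard library; the log format (one JSON object per line) and the function names are assumptions for illustration, and a real system would also need to protect the log itself from tampering.

```python
import hashlib
import json
from datetime import datetime, timezone
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Compute the SHA-256 checksum of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def log_event(audit_log: Path, scan: Path, event: str, operator: str) -> dict:
    """Append one audit line: who did what, when, and the checksum at that time."""
    entry = {
        "file": scan.name,
        "event": event,
        "operator": operator,
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "sha256": sha256_of(scan),
    }
    with audit_log.open("a", encoding="utf-8") as handle:
        handle.write(json.dumps(entry) + "\n")
    return entry


def is_unchanged(audit_log: Path, scan: Path) -> bool:
    """Compare the file's current checksum with the first one recorded for it."""
    with audit_log.open(encoding="utf-8") as handle:
        for line in handle:
            entry = json.loads(line)
            if entry["file"] == scan.name:
                return entry["sha256"] == sha256_of(scan)
    raise LookupError(f"no audit entry for {scan.name}")
```

Note the limits of this approach: a matching checksum shows that the file has not changed since it was logged, not that the scan was a faithful copy of the paper in the first place. That first step still depends on quality checks at the point of scanning.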

From here, we’ve got the bare bones of a successful business scanning project. We might soon be in a better position to safely destroy paper originals, if indeed that was one of the drivers or project goals. That destruction needs to be carried out with due care and attention. You’d certainly want all the digital content signed off as authentic first, to support the admissibility of the digital objects as legal documents.

If, however, you succeed in securely shredding a large number of boxes of paper, you’ve now freed up storage space and shelf space. That is something that has a cash value. If you keep metrics of progress in this area, you’re ready to start proving the value of your project to the organisation.
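The arithmetic behind such a metric can be very simple. The Python sketch below turns a count of destroyed boxes into shelf space and a yearly saving, and sets that against the project cost. All of the inputs (shelf metres per box, cost per metre, project cost) are hypothetical figures you would need to supply from your own organisation.

```python
def space_saving(
    boxes_destroyed: int,
    shelf_metres_per_box: float,
    annual_cost_per_metre: float,
    project_cost: float,
) -> dict:
    """Translate a count of securely destroyed boxes into space and money."""
    metres_freed = boxes_destroyed * shelf_metres_per_box
    annual_saving = metres_freed * annual_cost_per_metre
    return {
        "metres_freed": metres_freed,
        "annual_saving": annual_saving,
        # Years of saved storage cost needed to cover the scanning project.
        "payback_years": project_cost / annual_saving,
    }


# Illustrative figures only.
report = space_saving(
    boxes_destroyed=400,
    shelf_metres_per_box=0.5,
    annual_cost_per_metre=20.0,
    project_cost=10_000.0,
)
```

Including the project cost keeps the metric honest: the saving is real, but so is the money spent on scanning to achieve it.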

Projects like this aren’t necessarily easier to carry out than a library/archive focussed digitisation project, and they still require much planning and engagement with stakeholders. As I’ve tried to show above, the priorities have a slightly different emphasis. However, the results can be something of genuine benefit to your organisation, and will prove the value of the Information Manager/Records Manager/Archivist roles and services.

Five benefits of a digitisation programme

We see digitisation as a form of project management, and any managed project needs to have at least three core things – costs, risks, and benefits. It’s important to think about the benefits that a digitisation programme will bring, and not just to you as a collection manager, but to your users, and to your organisation. Sometimes these benefits can be overlooked, or not considered and assessed in detail. In this post we’ll pick out some of the possible benefits digitisation can bring.

Saves originals

Archivists and librarians will recognise the scenario – there’s a precious irreplaceable resource, or one that is fragile (the paper may be crumbling), or it’s the only available copy in the country. What’s more, it’s in constant demand, and so is subjected to frequent handling every time it’s retrieved from the stacks by the staff, then further handling in the searchroom. These precious documents and books don’t like being out in the light too often. Digitisation greatly reduces these risks by providing what, in the old analogue world, would have been called a “surrogate” copy.

Main beneficiaries: archivists, librarians

Meets user needs

This may seem obvious, but it’s surprising how many digitisation projects still start and end with the collection manager’s decision, and don’t take the audience of users into account. There ought to be a formal process of assessing user needs at the start of a project, and the application of metrics to determine whether user needs have actually been met. This doesn’t always happen; digitisation decisions can be driven instead by internal staff meetings, advisory boards, or the recommendations of external consultants.

It might be more beneficial to consider user-centric methods and approaches like focus groups, customer surveys, online questionnaires, and statistics on searchroom use. A successful digitisation project aimed directly at satisfying a real user need can reap visible dividends for the organisation, in terms of visits, web page hits, raised profile, user satisfaction, and user engagement.

Main beneficiaries: users, the institution

Improves or enhances access

This is surely one of the main benefits of digitising any resource. If planned and executed correctly it can result in a string of related benefits for you and your organisation: increased access through the web, more users reached, and growth not just in the numbers but in the diversity of your audience. But it’s not enough to just throw an existing image collection on the web in a gallery browser and let the power of the internet do the rest.

Collection managers should take the opportunity to rethink the potential of the resources, listen to user needs, and use technology to provide more imaginative ways to recast and enhance access to the content. There are possibilities for discovery metadata as well as cataloguing metadata, for navigational links that allow many entry points to a collection instead of a traditional hierarchical catalogue, and plug-in tools that can deliver popular and attractive ways to serve the content to users.

One of the most prominent of these is the page-turner and zoom tool device, so common with online books. These things are not merely gimmicks to be used for their own sake, but can offer your users more direct engagement with your collections. And we haven’t even mentioned crowd-sourcing yet…

Main beneficiaries: users, researchers

Saves space

This scenario is a bit of an outlier, and it’s primarily a records management/organisational change story (although other information management professionals may consider it too). The common motivator here is that the office is running out of space and that it would be convenient to scan all the current papers into digital form, and start “working digitally”. Managers who have this bright idea can immediately see a cost saving in terms of storage space, with visions of now-empty filing cabinets being removed from costly office space.

True, space saving can be a massive benefit – but people still have to find the materials. A project like this has to be managed very carefully and with a lot of preparation, especially giving due attention to metadata, which doesn’t automatically appear from nowhere when you take folders out of an organised filing system. And scanning is not cost-neutral either. Even so, if you can do this right, you’ll be contributing a genuine improvement to current working practice, and you will save money and space.

Main beneficiaries: staff, organisation, managers

A step towards digital preservation

The gain here is that the digitisation process can seriously lengthen the life of your valuable resources. Through digitisation, you could begin the process of long-term digital preservation. The scenario would be that you continue to keep the original analogue materials, but also keep the digitised version you have created; after all, it has cost you a lot of money to create it (staff time, server space), and its ongoing value to the organisation is already being demonstrated.

Treat the digitised resource with as much care and respect as you would your archival originals, and you’re on the road to digital preservation. As part of the project planning you would want to factor in the long-term preservation goal, before you even lift the lid of the scanner.

Main beneficiaries: archivist, institution

These are just five of the many benefits that a well-managed digitisation project can bring. Other topics would include income generation, pro-active user engagement, and attracting new customers to your offering. Understanding benefits (along with the costs and risks) is a positive way of understanding the digitisation task and delivering the project successfully.