Turning Paper into Electronic Documents

I would like to offer three different strategies for managing the process of scanning existing documents:

Back File Scanning

A Hybrid System, part electronic part paper

Scanning on demand

Back File Scanning

The first of these options is often used as the first step in a strategy to remove paper from the office. I have said this before and I will say it again: it is unlikely that you will get any benefit from this until you have stopped the flow of paper into your organisation. Even after you have adopted all the ideas we have discussed in the previous sections, it is doubtful that you will get any benefit. If I explain the process you will see where the problems lie. The first step is to obtain scanners large enough to accommodate the volumes needed. Remember that while a file is being scanned it is out of use to your business; how long can this last before there is an impact? The cost of scanners rises steeply as they grow in size. There are some very low cost personal scanners around now and departmental scanners are reasonably priced, but the large scanners with high volume feeders are very expensive. Some of this cost can be reduced by using a bureau service; however, using a bureau will make some of the other disadvantages worse. It often means the documents have to leave your office, increasing the amount of time they are unavailable as well as the risk of a security breach, a lost file etc.

To see if back file conversion is suitable for your archives you will need to understand the process. First the folders are opened, all clips and staples are removed and the documents are fed into the scanner. This job is normally performed by semi-skilled labour who cannot be expected to understand the importance and significance of each piece of paper, so everything will be pushed through the scanner, including duplicates, misfiled papers and irrelevant papers. To make this process beneficial a senior employee of your organisation must oversee the operation, filtering and sorting each sheet. Do you think this would be possible in your company? If not, the mess that was contained in each of the files will simply become an electronic mess. The original objective of making paper information easy to find will not have been achieved, as sifting through digital images can be harder than looking through a pile of papers.

Often the only time bulk scanning is beneficial is when all the papers are of the same type, census forms for example. If you look at your customer files and find that all your contracts are in one file and correspondence in another then you are OK. If you keep a folder for each customer that contains an assortment of contracts, letters etc., the back file scanning option is most likely not for you.

Hybrid System

It is possible, and quite common, for a company to run a hybrid system, that is, a combination of electronic and paper documents. The common policy is that all documents created before a cut-off date are retained on paper and all documents created after this date are electronic. This is a very cost-effective solution for situations where a case has a definite close time. This option can be difficult for users if they are working on a case that has both paper and electronic documents.

Scan on Demand

In the majority of cases this option will deliver the best return on investment and the cost of the hardware will be lower. It also has the benefit that the documents will be sorted and cases cleaned up over time. The process is simple and probably similar to those you already have in place. It brings together some of the principles we have already discussed.

It has to be assumed that you have implemented at least some of the concepts already covered in this guide and that you now have the majority of new information in electronic form. Because most document management systems can manage both electronic and paper-based documents, the user would search for a document and then attempt to check it out to work on it. If the document already exists in electronic form then the check-out process will continue as described in the previous sections of this guide. However, if the document is still stored on paper the user will be presented with a “Document Request Form” (an e-form of course). The form will ask them for the specifics of the information needed, the urgency etc. This form will then be sent to the archive or registry staff and appear in their electronic inbox. The archivist will take the document from the archive, scan it into the Document Management System and return a message to the originator, who can then carry on working on the case. Now that the document is in electronic form, any subsequent retrievals will call up the electronic document.

As you can see, over time all the active documents will be converted leaving all dead and unwanted files behind.
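The scan on demand process described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the behaviour of any particular document management product: the names Document, DocumentRequest, Archive, check_out and process_inbox are all invented for the example, and the physical act of scanning is reduced to flipping a flag.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Document:
    doc_id: str
    is_electronic: bool  # False means the document still exists only on paper


@dataclass
class DocumentRequest:
    """Hypothetical stand-in for the "Document Request Form" e-form."""
    doc_id: str
    requested_by: str
    details: str = ""
    urgent: bool = False


@dataclass
class Archive:
    """Hypothetical registry: the document store plus the archivist's inbox."""
    documents: Dict[str, Document]
    inbox: List[DocumentRequest] = field(default_factory=list)

    def check_out(self, doc_id: str, user: str, details: str = "",
                  urgent: bool = False) -> Optional[Document]:
        """Return the document if it is electronic, otherwise file a request."""
        doc = self.documents[doc_id]
        if doc.is_electronic:
            return doc
        # Still on paper: the request form goes to the archive staff.
        self.inbox.append(DocumentRequest(doc_id, user, details, urgent))
        return None

    def process_inbox(self) -> List[str]:
        """Archivist works through the inbox, urgent requests first."""
        messages = []
        for request in sorted(self.inbox, key=lambda r: not r.urgent):
            # Stands in for fetching the paper and scanning it.
            self.documents[request.doc_id].is_electronic = True
            messages.append(
                f"{request.requested_by}: {request.doc_id} is now available electronically"
            )
        self.inbox.clear()
        return messages


if __name__ == "__main__":
    archive = Archive({
        "contract-17": Document("contract-17", is_electronic=False),
        "letter-3": Document("letter-3", is_electronic=True),
    })
    print(archive.check_out("contract-17", "alice", urgent=True))  # paper: request filed
    print(archive.process_inbox())
    print(archive.check_out("contract-17", "bob"))  # now served electronically
```

Only documents that somebody actually asks for ever reach the scanner, which is why the dead files are left behind.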

Reducing the cost of scanning

The costs associated with document scanning generally fall into two areas. The first is the hardware, i.e. scanners, PCs, network etc.; this cost will be very specific to your own project. The remaining cost is the number of man hours required to complete the project.

There are four phases to the project:

Document Preparation
Document Scanning
Document Indexing
Document Re-Filing/Destruction

Calculating the cost of these phases is fairly easy. It is usual to use agency or temporary staff. First find their hourly rate, then find the time it takes for each task.

For document preparation you should have access to example folders. If your staff cost you £10 per hour and they can prepare a file for scanning in 10 minutes, then the cost to you is about £1.67 per file. For 180,000 files the cost to you will then be £300,000.

The same calculation can then be made for scanning (here the time taken will depend upon scanner speed) and indexing (here the time will mainly depend on how quickly the system can display the image and the indexing window).
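The labour calculation is simple enough to put into a small script, so that you can try different rates and timings. The sketch below uses the figures quoted for document preparation (£10 per hour, 10 minutes per file, 180,000 files). The function names are my own, and the timings for the other three phases are invented placeholders, to be replaced with your own measurements.

```python
from typing import Dict


def cost_per_file(hourly_rate: float, minutes_per_file: float) -> float:
    """Labour cost of handling one file for one phase."""
    return hourly_rate * minutes_per_file / 60


def phase_cost(hourly_rate: float, minutes_per_file: float, file_count: int) -> float:
    """Labour cost of one phase across the whole back file."""
    return hourly_rate * minutes_per_file * file_count / 60


def project_cost(hourly_rate: float, minutes_by_phase: Dict[str, float],
                 file_count: int) -> float:
    """Total labour cost: the sum of the cost of every phase."""
    return sum(
        phase_cost(hourly_rate, minutes, file_count)
        for minutes in minutes_by_phase.values()
    )


if __name__ == "__main__":
    # Figures from the text: 10 pounds per hour, 10 minutes to prepare a file.
    print(round(cost_per_file(10, 10), 2))   # cost of preparing one file
    print(phase_cost(10, 10, 180_000))       # preparation cost for 180,000 files

    # Placeholder timings (assumptions, not measurements) for the other phases.
    example_minutes = {"preparation": 10, "scanning": 2, "indexing": 3, "refiling": 1}
    print(project_cost(10, example_minutes, 180_000))
```

Because every phase is multiplied by the number of files, shaving even one minute off a phase saves £30,000 at these rates and volumes.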

Here are a few ideas on how you can reduce your costs:

Add indexing information at document preparation time. Most scanning software can detect a barcode. If you pre-prepare stickers containing the company name and registration number, these can be added to a blank page at the front of the pile of papers that make up that company file.

Use an existing system. If you intend to integrate with an existing system after the scanning, consider doing it beforehand, as you can then use the information it holds to reduce the amount of manual indexing. For example, if the existing system contains the customer address then you can pull this information automatically into your imaging system.

When digitising A3 paper it is probable that a traditional document scanner will be very inefficient. A document scanner is only effective when working with high volumes of A4 paper. For mixed paper types a horizontally mounted digital camera will make an effective and low cost document capture device. The camera is connected to a networked PC that has access to the Document Management System. To capture a document the operator spreads it between guide lines on the table below the camera. A foot switch operates the camera, and the quality of the image can be checked on the PC.
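The first two ideas work well together: the barcode on the cover sheet supplies a key, and the existing system supplies the rest of the index. The sketch below assumes that the scanning software has already decoded the barcode into a text value and that this value is the registration number. The record layout, the EXISTING_RECORDS data and the function name are all invented for the illustration.

```python
from typing import Dict

# Hypothetical extract from the existing system, keyed by registration number.
EXISTING_RECORDS: Dict[str, Dict[str, str]] = {
    "01234567": {"company_name": "Example Ltd", "address": "1 High Street, Anytown"},
}


def build_index_entry(barcode_value: str,
                      records: Dict[str, Dict[str, str]]) -> Dict[str, str]:
    """Build the index entry for a scanned file from its cover sheet barcode."""
    registration_number = barcode_value.strip()
    entry = {"registration_number": registration_number}
    record = records.get(registration_number)
    if record is None:
        # Nothing known about this number: fall back to keying by hand.
        entry["needs_manual_indexing"] = "yes"
    else:
        # Pull the name and address straight from the existing system.
        entry.update(record)
        entry["needs_manual_indexing"] = "no"
    return entry


if __name__ == "__main__":
    print(build_index_entry("01234567", EXISTING_RECORDS))
    print(build_index_entry("99999999", EXISTING_RECORDS))
```

Files whose barcode matches a record need no keying at all, so the indexing staff only see the exceptions.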

Extracting Information from Paper

Although we have been able to reduce the amount of paper and the burden of scanning, there will still be instances where paper is best. When a piece of paper is scanned the contents of the document cannot be read by the system. As far as the Document Management System is concerned, the scanned version of a contract is the same as the scanned version of a piece of customer correspondence. So when we scan a document we also need to give the Document Management System some information so that the document can be retrieved. This is known as indexing. The usual method is for a computer user to be presented with an image of the paper on their screen; they then read the text and key the indexing information into an electronic form: Document type = “invoice”, Customer name = “Smith” etc.
To automate this process there are a few technologies available. I want to summarise some of these and discuss whether they are relevant.

Bar Coding

We have all seen barcodes in the supermarket and know that they are a very cheap and accurate way for a computer system to recognise information. If there are documents that you send out from your company, make sure they have a barcode on them. When they are returned the scanner can easily read the information and know where to store the document. Take your customer contract as an example. Remember we discussed how the e-form could be printed for signature: if the e-form contains a barcode, the form can be posted out and then, when it returns, matched up with the original e-form.

Optical Character Recognition

This is the mechanism for reading typewritten text from documents. Nowadays we use word processors rather than typewriters, so instead of using OCR you should be looking at capturing each document in its original format, as we discussed at the beginning of the guide.

Intelligent Character Recognition

The term ICR has changed its meaning over time, but it is now generally used to describe the recognition of handwriting. This is a very expensive and inaccurate technology and cannot be justified in the majority of applications. ICR cannot be used for general handwritten letters, only for text that has been written inside pre-printed boxes. Although suppliers of this software may claim “up to 90% accuracy”, this is often per character, so if you are reading a name with 10 letters, the chances are one will be wrong. It also does not take human nature into account. Let me give you an example. If I send out a form and say, “Fill out this form for the chance to win a new Mercedes car, but if you write outside the boxes your entry will become invalid”, you will see some very neat handwriting. However, let's look at the other extreme: “Fill out this form so that I can collect your taxes, if you write outside the boxes it will make it harder for me to collect this money”. Then compare the neatness. I think you get the message.
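The per character accuracy point can be checked with a little arithmetic. The sketch below makes one simplifying assumption that is mine rather than any supplier's: each character is recognised independently of the others, with the same accuracy. The function names are invented for the example.

```python
def field_accuracy(per_char_accuracy: float, length: int) -> float:
    """Chance that every character in a field of this length is read correctly."""
    return per_char_accuracy ** length


def expected_errors(per_char_accuracy: float, length: int) -> float:
    """Average number of wrongly read characters in a field of this length."""
    return (1 - per_char_accuracy) * length


if __name__ == "__main__":
    # A 10 letter name read at the quoted 90% per character accuracy.
    print(round(expected_errors(0.9, 10), 2))  # wrong characters on average
    print(round(field_accuracy(0.9, 10), 3))   # chance the whole name is right
```

At 90% per character, a 10 letter name contains one wrong letter on average, and the whole name comes out right only about 35% of the time. Roughly two names in three would still need correcting by hand.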

If your office is in Central London or Manhattan, where staff and office costs are extreme, you may be able to justify this technology. If not, and particularly where there is high unemployment, it is probably best to use manual indexing.