Automatic indexing with OCR, rubber band OCR, keyword document separation, image redaction, clipping and more…

We refer to the CaptureBites MetaTool as our Swiss army knife. It turns Kofax Express into powerful and intelligent data extraction and image editing software. You can use it to automate indexing with OCR and easy-to-configure rules. (OCR stands for Optical Character Recognition, a technique that converts a scanned image of machine-printed text into data.)

You can separate documents based on keywords, reformat your index data, generate image clippings, clean up and redact images, etc. The Validation client makes clever use of shortcuts, rubber band OCR, quick choice lists and database lookup to complete or correct data.

  • MetaTool functionality: press the buttons to learn more about each function

The MetaTool presents itself as a standard Kofax Express export connector and passes extracted data and processed images through to an export destination of your choice, such as Email, Folder Structure, FTP Server, Database, MS SharePoint Server & Online, Alfresco, OpenText Content Server, Xerox DocuShare, or another DMS.
  • No coding but configuration

Instead of one complex setup screen with hundreds of options, the MetaTool setup is based on a range of easy-to-understand rules such as “Extract Zone with OCR”, “Find line containing Words”, “Replace Text”, “Format Date”, etc. Testing the rules is just one click away. You can try your rules on any image in your current Kofax Express batch using the viewer integrated in the MetaTool setup screen.
With the CaptureBites MetaTool you can tune Kofax Express to your very specific needs without scripting or a single line of code. It’s customization without the disadvantages and cost of custom development. With custom development, you need expensive interventions for every change. With MetaTool, changing or adding rules is a matter of tuning the setup yourself.


  • Enabling MetaTool

To enable the CaptureBites MetaTool, just select it as the export connector in Kofax Express. Then press the setup button next to the MetaTool to configure it.

Short explanation of the MetaTool setup screen below

  • The MetaTool setup shows the images of your current batch in the left panel. You can navigate through all the images and use them to test your configuration. The green zone indicates the extraction zone, which in our example is the bottom half of the page.
  • The middle panel shows your extraction rules. You can add as many rules as required. MetaTool features 4 types of rules:
    1. Zonal extraction rules (a zone can be as large as the whole page)
    2. Find data rules
    3. Edit data rules
    4. Format data rules
  • The right panel shows the index fields defined in Kofax Express. Pressing the Test button shows the result in the “Processed value” column.

In our example, we can automatically extract the Giro Code with the MetaTool using just two rules. The first rule extracts the full text of the bottom half of each page. The second rule extracts the Giro Code with a Find word with mask rule (a word in MetaTool is text or digits preceded and followed by a space).

The Kofax Express MetaTool job is now configured and Kofax Express can be used to scan money transfer documents and index the Giro Codes completely automatically.

Just scan a range of money transfer documents, or import them into Kofax Express when using a multifunctional digital copier (MFP). Then press Export and the MetaTool will automatically extract the floating Giro Codes by searching for their format. In our example, we named the resulting PDF files according to the extracted Giro Codes. The result after export looks like this:
  • Automatic indexing with OCR

Extract index information completely automatically with OCR from fixed machine-printed zones during export. OCR works very well on clean machine print. In many cases the recognition process will be 100% accurate and completely automatic, so you won’t have to correct the results. Just scan and export, that’s it.

This is what the OCR setup looks like. Just draw a zone around the text you want to extract and quickly check with the Test button what the result will be. You can test with any image in your current batch. Finally, select the Kofax Express index field you want to fill in automatically with the result during export. Setup done!

  • Extracting floating data

You can also read large text blocks and then use a mask to extract floating data such as an account number, invoice number, total amount, product ID etc.  Anything that has a more or less fixed format can be extracted from a large block of text.

Below you can see how the Giro Code (a standardized Transaction Code that describes the purpose of a European money transfer) is floating around in the bottom half of the page. In this case we will read the full text of the bottom half of the page. Reading the full text with OCR takes less than a second on a modern PC.

We will then extract the Giro Code based on its format. The standardized Giro Code consists of three groups of digits separated by slashes, preceded and followed by three + signs. An example of a Giro Code is: +++004/3525/92888+++
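
The idea of extracting a floating value by its format can be illustrated outside MetaTool with a regular expression. The sketch below is not MetaTool's mask syntax; it is a plain Python approximation, and the group lengths (3, 4 and 5 digits) are an assumption taken from the single example above.

```python
import re
from typing import Optional

# Assumed format: +++ddd/dddd/ddddd+++ (group lengths inferred from the example).
GIRO_CODE = re.compile(r"\+{3}\d{3}/\d{4}/\d{5}\+{3}")


def find_giro_code(ocr_text: str) -> Optional[str]:
    """Return the first Giro Code found anywhere in the OCR text, or None."""
    match = GIRO_CODE.search(ocr_text)
    return match.group(0) if match else None


# Hypothetical OCR output of the bottom half of a page.
sample = "Amount: 125,00 EUR\nReference +++004/3525/92888+++ Date 12/03"
print(find_giro_code(sample))
```

Because the pattern is searched for in the whole text block, the position of the code on the page does not matter, which is the point of the floating extraction described above.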

  • Signature & Mark Detection

Use the Mark Detection rule to detect the presence of a date, signature or check mark in a box.

  • Reformat index data

You can also use editing and formatting rules to split a bar code value into several segments, or to replace slashes with dashes in dates so you can put them in the filename.

You can also change extracted index data to uppercase to generate consistent output.

In our example, the Giro Code is sometimes preceded with +++ and sometimes with ***.

To get consistent output, we can define a “Replace Text” rule that replaces every * with +, so that all codes use leading and trailing +++. Because we want to use the Giro Code to name the resulting PDF file, we also use a Replace Text rule to replace all / (slashes) with - (dashes).
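
As a rough illustration of what these two replacements do to a value, here is a minimal Python sketch. The function name is hypothetical and the code only mimics the effect of the rules; it is not how MetaTool implements them.

```python
def normalize_giro_code(code: str) -> str:
    """Mimic two Replace Text rules: * becomes +, and / becomes -."""
    return code.replace("*", "+").replace("/", "-")


# A code that was printed with asterisks instead of plus signs.
print(normalize_giro_code("***004/3525/92888***"))  # +++004-3525-92888+++
```

The slash has to go because it is not allowed in file names on common file systems.
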
  • High speed data validation

The MetaTool also features a highly optimized validation viewer. You can configure it to only display the fields you want to validate. Based on validation rules, the viewer will only show documents that don’t pass the rules. The relevant zone on the image is highlighted while the rest of the image is dimmed but still readable. For consistent data entry, you can define multiple choice fields. Through single letter shortcuts, any of the choices can be selected with a single key press. For very critical data, you can force the user to always check a field even if it passes the validation rules.

Use database lookup to complete other index fields efficiently and rubber band OCR to fill in data by drawing a rectangle around the information.

  • Database lookup

Kofax Express features basic database lookup out of the box, but MetaTool adds:

  • Automatic lookup using extracted OCR data: For example, extract the Tax ID (VAT number, ABN, …) by means of MetaTool’s OCR and extraction rules and look up the matching supplier name and email using database lookup.
  • Multi Search Field lookup: Search for a record through multiple lookup fields. For example, search for a supplier by Tax ID, Name or Telephone number.
  • Drill down search: This is similar to searching for an address in a GPS or sat nav device by drilling down from country to city to street. In business applications, drill down searches are used to search large databases of employees, customers or products. Every search step reduces the number of possible matches. For example, search first for the state, then for a company in that state, then for a customer name in that company.
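
The drill down idea can be sketched in a few lines of Python. The records, field names and prefix matching below are assumptions made for illustration only; MetaTool runs its lookups against a real database and its matching behavior may differ.

```python
from typing import Dict, List

Record = Dict[str, str]

# Hypothetical sample data standing in for a customer database.
CUSTOMERS: List[Record] = [
    {"state": "Texas", "company": "Acme", "customer": "Jane Doe"},
    {"state": "Texas", "company": "Acme", "customer": "John Roe"},
    {"state": "Texas", "company": "Globex", "customer": "Ann Lee"},
    {"state": "Ohio", "company": "Acme", "customer": "Bob Ray"},
]


def drill_down(records: List[Record], field: str, prefix: str) -> List[Record]:
    """Keep only the records whose field starts with the typed prefix."""
    return [r for r in records if r[field].lower().startswith(prefix.lower())]


# Each step searches only within the matches of the previous step.
step1 = drill_down(CUSTOMERS, "state", "tex")
step2 = drill_down(step1, "company", "acme")
step3 = drill_down(step2, "customer", "ja")
print(len(step1), len(step2), step3)
```
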

As an example, we want to index foreign student records by looking up data in the student database. We want to be able to search for the students via ID number, name or city of origin. We also want the option to search the whole database, or only search the active records or closed records. By means of the File Status quick choice in the screenshot below, we can filter the database and only look for active records, closed records or all records. If the ID number is on the document, you can type in the first digits and pick the correct student from the lookup list.

As soon as you select the correct student, all the other fields are automatically filled in.

If the ID Number is missing on the document, you can also look up the student by Last Name or by City:

Machine-printed student IDs or names will be extracted and looked up automatically without any manual intervention. MetaTool features an option to show the validation client if multiple hits occur, which is very useful if an automatic name search results in multiple hits. The validation client presents all the possible matches and the operator just selects the correct one. If the automatic lookup results in only one match, the document is processed automatically.
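
The decision between automatic processing and operator validation can be sketched as follows. The student records and the function name are hypothetical, and routing a lookup with zero hits to validation is an assumption; the text above only describes the one-hit and multiple-hit cases.

```python
from typing import Dict, List, Tuple

# Hypothetical student database.
STUDENTS: List[Dict[str, str]] = [
    {"id": "1001", "last_name": "Garcia", "city": "Madrid"},
    {"id": "1002", "last_name": "Garcia", "city": "Lima"},
    {"id": "1003", "last_name": "Tanaka", "city": "Osaka"},
]


def auto_lookup(field: str, value: str) -> Tuple[str, List[Dict[str, str]]]:
    """Return ("auto", hits) for a unique match, otherwise ("validate", hits)."""
    hits = [s for s in STUDENTS if s[field].lower() == value.lower()]
    # Exactly one hit: process without intervention.
    # Several hits (or none, by assumption): let the operator decide.
    status = "auto" if len(hits) == 1 else "validate"
    return status, hits


print(auto_lookup("last_name", "Tanaka")[0])  # auto
print(auto_lookup("last_name", "Garcia")[0])  # validate
```
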

  • Rubber Band OCR

During validation, use rubber band OCR to perform on-the-fly OCR of a zone. For example, to look up the email address that belongs to an incoming item of snail mail, just rubber band the name.

The OCR result is instantly entered into the search field and the other information is looked up without a single keystroke.

You can configure MetaTool to jump to the next mail item automatically if there is a unique match. That way, you can just draw rectangles, one mail item after the other, and MetaTool takes care of the rest.

  • Keyword document separation

Kofax Express features automatic document separation using bar codes and patch codes. But what if your documents don’t have any bar codes or patch codes?

With keyword document separation you can use MetaTool to search for a specific page number (for example “1 of”), a form number or title in a specific place, keywords like “summary sheet”, a company name, etc.

When MetaTool finds the defined keyword(s) it triggers a document separation.

You can also separate on last page or use zonal bar code recognition to separate documents.
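
To make the separation logic concrete, here is a minimal Python sketch that splits a batch wherever a keyword appears in a page's OCR text. The page texts and the simple substring test are assumptions for illustration; MetaTool's own rules can restrict the search to a zone and are configured, not coded.

```python
from typing import List


def split_on_keyword(page_texts: List[str], keyword: str) -> List[List[int]]:
    """Group 1-based page numbers into documents; a keyword hit starts a new one."""
    documents: List[List[int]] = []
    for page_number, text in enumerate(page_texts, start=1):
        starts_new_document = keyword.lower() in text.lower()
        # The very first page always opens a document, keyword or not.
        if starts_new_document or not documents:
            documents.append([])
        documents[-1].append(page_number)
    return documents


# Hypothetical OCR text of six scanned pages.
pages = [
    "Page 1 of 2 Invoice",
    "Page 2 of 2",
    "Page 1 of 1 Summary",
    "Page 1 of 3",
    "Page 2 of 3",
    "Page 3 of 3",
]
print(split_on_keyword(pages, "1 of"))  # [[1, 2], [3], [4, 5, 6]]
```

A plain substring test like this would also fire on “Page 11 of 12”, so a real setup needs a stricter pattern or a restricted search zone.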

  • Merge documents with the same index field value

If you have a batch of unsorted documents and you want to merge all documents with the same bar code value, for example, then you can use MetaTool’s merge feature.

All documents with the same bar code value will be merged into a single document, regardless of where the documents are positioned in the batch.
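
Conceptually, the merge groups documents by their index value. The following Python sketch shows that grouping on a hypothetical batch, where each document is a pair of a bar code value and its list of pages; it is an illustration of the idea, not MetaTool's implementation.

```python
from typing import Dict, List, Tuple


def merge_by_value(batch: List[Tuple[str, List[str]]]) -> Dict[str, List[str]]:
    """Merge the pages of all documents that share the same index value."""
    merged: Dict[str, List[str]] = {}
    for value, doc_pages in batch:
        # Pages are appended in batch order, wherever the document sits.
        merged.setdefault(value, []).extend(doc_pages)
    return merged


# Hypothetical unsorted batch: A100 appears twice, separated by B200.
batch = [("A100", ["p1", "p2"]), ("B200", ["p3"]), ("A100", ["p4"])]
print(merge_by_value(batch))  # {'A100': ['p1', 'p2', 'p4'], 'B200': ['p3']}
```
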

  • High speed image editing, cleanup & redaction

Kofax VRS already solves a lot of image quality problems during scanning. Yet, there are still some image cleanup and editing functions that cannot be automated. For manual cleanup the MetaTool’s Image Editor is the most efficient way to produce perfect images in the shortest possible time.

Redaction to erase sensitive or confidential information

Erase signatures or sensitive information from scanned images with the Erase Inside function. With automatic color detection, you can easily erase the selection with the surrounding background color. The Image Editor adjusts the erase color based on the place where you start drawing the selection.

Cleanup rough edges of damaged or old documents

With the erase outside function, you can erase anything outside the selection. The erase color is automatically selected based on where you start drawing the selection.

Irregular edges cannot be corrected by VRS auto-crop.
Just select the part you want to keep. The erase color is automatically selected based on where you start drawing.
Release the mouse button and the area outside of the selection is filled with the correct color.

Crop part of an image.

Use the crop function when you need to crop photos to create security passes or store them in a database, or when you want to store signature clippings in a signature verification database. In the MetaTool Image Editor setup, which can be configured per Kofax Express job, you can define a default selection size to obtain a consistent output size.

For example in the settings on the left, we have defined a square default selection of exactly 256 x 256 pixels. We will crop all our photos to that exact dimension and export them as JPEG files to a folder.

The image editor’s setup includes a viewer to draw the default selection. You can use any image of your current batch as a sample image.

For accurate work, you can enter pixel-precise dimensions to define your default selection.

We also enabled sticky selection mode to make the selection appear on every image automatically.

Then simply scan all your documents and export. The MetaTool Image Editor will open and the default selection will be prepositioned. If required, you can simply click inside the selection to move it.

Then press the Apply button, or press ENTER (for left-handed people) or SPACE BAR (for right-handed people), to apply the actual crop.

Next press ENTER or SPACE BAR again to move to the next image.

Triple level undo

If you are not happy with the last action, press the undo button or press Backspace or CTRL+Z to undo the last action. A second undo clears the selection and a third undo restores the original image. You can apply triple level undo on any corrected image even after navigating away and back to it. Thanks to the undo function, there are no more annoying and inefficient “Are you really sure?” messages. If you make a mistake, just press the Backspace key.
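
The three undo levels can be modeled as a small state machine. The Python sketch below is one possible reading of the behavior described above, with images represented as plain strings; the class and method names are hypothetical and this is not the Image Editor's actual code.

```python
class ImageEditor:
    """Toy model of the three undo levels; images are plain strings here."""

    def __init__(self, original):
        self.original = original
        self.history = [original]  # image state after each applied action
        self.selection = None
        self.undo_level = 0        # number of consecutive undo presses

    @property
    def image(self):
        return self.history[-1]

    def select(self, box):
        self.selection = box
        self.undo_level = 0

    def apply(self, action_name):
        # A real editor would change pixels; here we only record the action.
        self.history.append(f"{self.image}+{action_name}")
        self.undo_level = 0

    def undo(self):
        self.undo_level += 1
        if self.undo_level == 1:
            if len(self.history) > 1:
                self.history.pop()          # level 1: undo the last action
        elif self.undo_level == 2:
            self.selection = None           # level 2: clear the selection
        else:
            self.history = [self.original]  # level 3: restore the original


editor = ImageEditor("scan")
editor.select((10, 10, 266, 266))
editor.apply("erase")
editor.apply("crop")
editor.undo()
print(editor.image, editor.selection)  # scan+erase (10, 10, 266, 266)
editor.undo()
print(editor.selection)  # None
editor.undo()
print(editor.image)  # scan
```
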

  • Generate Clippings

The MetaTool can also be used to clip a portion of an image and save it as a separate image. That way you can preserve both the original and the clipped image. In combination with the CaptureBites Multi Export connector you can even export each version to a different destination. For example, by means of a special clipping viewer, you can select the address from an envelope and export it as a separate image to populate a database or generate a report. Our Certified Mail Solution is a good example of how address clipping can be used to fill out registered mail forms quickly and accurately.

  • Platform for new functions

Thanks to its modular design based on a collection of simple Rules, the CaptureBites MetaTool will be regularly updated with new functionality. So visit this page often to check what’s new or contact us and let us know what you would like to see next in the CaptureBites MetaTool.

About the download and included demo jobs

The download button on top of this page installs a functional version of this CaptureBites product. It also includes some demo images and a demo job to show the functionality.

In demo mode, a demo seal will be stamped in all exported images. You can switch the demo version to full production mode by entering an activation code which you can purchase from our partners. You can continue using any of the jobs you configured in demo mode after activation of the software.