Automatic indexing with OCR, rubber band OCR, keyword document separation, image redaction, clipping and more…
We refer to the CaptureBites MetaTool as our Swiss army knife. It turns Kofax Express into powerful and intelligent data extraction and image editing software. You can use it to automate indexing with OCR and easy-to-configure rules. OCR stands for Optical Character Recognition and is a technique to convert a scanned image with machine printed text into data.
You can separate documents based on keywords, reformat your index data, generate image clippings, clean up and redact images, etc. The Validation client makes clever use of shortcuts, rubber band OCR, quick choice lists and database lookup to complete or correct data.
No coding but configuration
Short explanation of the MetaTool setup screen below
- The MetaTool setup shows the images of your current batch in the left panel. You can navigate through all the images and use them to test your configuration. The green zone indicates the extraction zone. In our example, this is the bottom half of the page.
- The middle panel shows your extraction rules. You can add as many rules as required. MetaTool features 4 types of rules:
  - Zonal extraction rules (a zone can be as large as the whole page)
  - Find data rules
  - Edit data rules
  - Format data rules
- The right panel shows the index fields defined in Kofax Express. Pressing the test button shows the result in the “Processed value” column.
In our example, we can automatically extract the Giro Code with the MetaTool using just two rules. The first rule extracts the full text of the bottom half of each page. The second rule extracts the Giro Code with a Find word with mask rule (a word in MetaTool is text or digits preceded and followed by a space).
The Kofax Express MetaTool job is now configured and Kofax Express can be used to scan money transfer documents and index the Giro Codes completely automatically.
Automatic indexing with OCR
Extracting floating data
Below you can see how the Giro Code (a standardized Transaction Code that describes the purpose of a European money transfer) floats around in the bottom half of the page. In this case we will read the full text of the bottom half of the page. Reading the full text with OCR takes less than a second on a modern PC.
We will then extract the Giro Code based on its format. The standardized Giro Code consists of 3 groups of numbers separated by slashes, preceded and followed by 3 + signs. An example of a Giro Code is: +++004/3525/92888+++
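MetaTool handles this through configuration, not code. Purely to illustrate what matching a Giro Code by format means, here is a minimal Python sketch. The group lengths (3, 4 and 5 digits) are an assumption taken from the single example above, and the names `GIRO_CODE` and `find_giro_code` are hypothetical.

```python
import re

# Assumed mask: 3 + signs, three digit groups separated by slashes, 3 + signs.
# The group lengths 3/4/5 are inferred from the example +++004/3525/92888+++.
GIRO_CODE = re.compile(r"\+{3}\d{3}/\d{4}/\d{5}\+{3}")

def find_giro_code(ocr_text):
    """Return the first Giro Code found in the OCR text, or None."""
    match = GIRO_CODE.search(ocr_text)
    return match.group(0) if match else None

# Simulated OCR text of the bottom half of a page.
sample = "Amount 120,00 EUR  Reference +++004/3525/92888+++  Thank you"
print(find_giro_code(sample))  # +++004/3525/92888+++
```

Because the pattern describes the format and not the position, the code is found wherever it floats inside the extracted text.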
Signature & Mark Detection
Reformat index data
You can also change extracted index data to uppercase to generate consistent output.
In our example, the Giro Code is sometimes preceded by +++ and sometimes by ***.
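As an illustration only, the following Python sketch shows one way such a value could be brought to a single consistent form. It assumes the goal is to output the +++ variant; the function name `normalize_giro_code` is hypothetical and is not a MetaTool rule name.

```python
import re

# Accept either +++ or *** around the three slash-separated digit groups.
MARKED_CODE = re.compile(r"(?:\+{3}|\*{3})(\d+/\d+/\d+)(?:\+{3}|\*{3})")

def normalize_giro_code(value):
    """Rewrite a code wrapped in *** or +++ so that it always uses +++."""
    match = MARKED_CODE.fullmatch(value.strip())
    if match is None:
        return value  # leave anything that does not look like a Giro Code untouched
    return "+++" + match.group(1) + "+++"

print(normalize_giro_code("***004/3525/92888***"))  # +++004/3525/92888+++
```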
High speed data validation
Use database lookup to complete other index fields efficiently and rubber band OCR to fill in data by drawing a rectangle around the information.
- Automatic lookup using extracted OCR data: For example, extract the Tax ID (VAT Nr., ABN Nr., …) by means of MetaTool’s OCR and extraction rules, and look up the matching supplier name and email using database lookup.
- Multi Search Field lookup: Search a record through multiple lookup fields. For example, search for a supplier by Tax ID, Name or Telephone number.
- Drill down search: This is similar to searching for an address in a GPS (Sat Nav) device by drilling down from country to city to street. In business applications, drill down searches are used to search large databases of employees, customers or products. Every search step reduces the number of possible matches. For example, search first for the state, then for a company in that state, then for a customer name in that company.
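The drill down idea can be sketched in a few lines of Python. This is an illustration of the filtering logic only, not how MetaTool implements it; the field names and sample records are invented for the example.

```python
def drill_down(records, steps):
    """Apply each (field, value) step in order; every step narrows the matches."""
    matches = list(records)
    for field, value in steps:
        matches = [r for r in matches if r[field].lower() == value.lower()]
    return matches

# Hypothetical customer records.
customers = [
    {"state": "Texas", "company": "Acme", "name": "Ann Lee"},
    {"state": "Texas", "company": "Acme", "name": "Bob Ray"},
    {"state": "Texas", "company": "Globex", "name": "Cy Dow"},
    {"state": "Ohio", "company": "Acme", "name": "Di Fox"},
]

in_state = drill_down(customers, [("state", "Texas")])
in_company = drill_down(customers, [("state", "Texas"), ("company", "Acme")])
print(len(in_state), len(in_company))  # 3 2
```

Each additional step works on the result of the previous one, which is why the list of candidates shrinks quickly even in a large database.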
As an example, we want to index foreign student records by looking up data in the student database. We want to be able to search for the students via ID number, name or city of origin. We also want the option to search the whole database, or to search only the active records or only the closed records. By means of the File Status quick choice in the screenshot below, we can filter the database and look only for active records, closed records or all records. If the ID number is on the document, you can type in the first digits and pick the correct student from the lookup list.
As soon as you select the correct student, all the other fields are automatically filled in.
If the ID Number is missing on the document, you can also look up the student by Last Name or by City:
Machine printed student IDs or names will be extracted and looked up automatically without any manual intervention. MetaTool features an option to show the validation client if multiple hits occur, which is very useful if an automatic name search results in multiple hits. The validation client presents all the possible matches and the operator just selects the correct one. If the automatic lookup results in only one match, the document is processed automatically.
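The student example can be summarized as a small decision: search on one of several fields, optionally filter by file status, then either process automatically or ask the operator. The Python sketch below illustrates that logic with invented records and hypothetical names (`lookup`, `route`). How MetaTool treats a search with no hits is not described here, so sending that case to the validation client is an assumption.

```python
def lookup(students, field, typed, status=None):
    """Return students whose field starts with the typed text, optionally filtered by status."""
    typed = typed.lower()
    return [
        s for s in students
        if s[field].lower().startswith(typed)
        and (status is None or s["status"] == status)
    ]

def route(hits):
    """One hit is processed automatically; anything else goes to the operator (assumption for zero hits)."""
    return "process automatically" if len(hits) == 1 else "show validation client"

# Hypothetical student database.
students = [
    {"id": "10234", "last_name": "Silva", "city": "Porto", "status": "active"},
    {"id": "10299", "last_name": "Sato", "city": "Osaka", "status": "closed"},
    {"id": "20411", "last_name": "Silva", "city": "Lisbon", "status": "active"},
]

print(route(lookup(students, "id", "102", status="active")))  # process automatically
print(route(lookup(students, "last_name", "Silva")))          # show validation client
```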
Rubber Band OCR
The OCR recognition is instantly filled out in the search field and other information is looked up without a single keystroke.
You can configure MetaTool to jump to the next mail item automatically if there is a unique match. In that way, you can just draw rectangles, one mail item after the other, and MetaTool takes care of the rest.
Keyword document separation
With keyword document separation you can use MetaTool to search for a specific page number (for example “1 of”), a form number or title in a specific place, keywords like “summary sheet”, a company name, etc.
When MetaTool finds the defined keyword(s) it triggers a document separation.
You can also separate on last page or use zonal bar code recognition to separate documents.
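To make the keyword separation logic concrete, here is a minimal Python sketch that works on the OCR text of each page. It assumes the keyword appears on the first page of each document; the function name `separate` and the sample pages are invented for the illustration.

```python
def separate(pages, keywords):
    """Group pages into documents; a page containing a keyword starts a new document."""
    documents = []
    for text in pages:
        starts_new = any(k.lower() in text.lower() for k in keywords)
        if starts_new or not documents:
            documents.append([])
        documents[-1].append(text)
    return documents

# Simulated OCR text, one string per scanned page.
pages = [
    "Invoice page 1 of 2",
    "Invoice page 2 of 2",
    "Summary sheet page 1 of 1",
]

for number, document in enumerate(separate(pages, ["1 of"]), start=1):
    print(number, len(document))
# 1 2
# 2 1
```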
Merge documents with the same index field value
All documents with the same bar code value will be merged into a single document, regardless of where the documents are positioned in the batch.
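As an illustration of the merge logic only, the Python sketch below groups documents by their index value. The batch layout (a list of value and page-list pairs) and the name `merge_by_value` are assumptions made for the example.

```python
def merge_by_value(batch):
    """Merge the pages of all documents that share the same index value."""
    merged = {}
    for value, pages in batch:
        merged.setdefault(value, []).extend(pages)
    return merged

# Hypothetical batch: (bar code value, pages) in scan order.
batch = [
    ("A100", ["a-1", "a-2"]),
    ("B200", ["b-1"]),
    ("A100", ["a-3"]),
]

print(merge_by_value(batch))
# {'A100': ['a-1', 'a-2', 'a-3'], 'B200': ['b-1']}
```

The documents with value A100 end up together even though another document sits between them in the batch.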
High speed image editing, cleanup & redaction
Redaction to erase sensitive or confidential information
Erase signatures or sensitive information from scanned images with the Erase Inside function. With automatic color detection, you can easily erase the selection with the surrounding background color. The Image Editor adjusts the erase color based on the place where you start drawing the selection.
Cleanup rough edges of damaged or old documents
With the Erase Outside function, you can erase anything outside the selection. The erase color is automatically selected based on where you start drawing the selection.
Crop part of an image.
Use the crop function when you need to crop photos to create security passes or store them in a database, or when you want to store signature clippings in a signature verification database. In the MetaTool Image Editor setup, which can be configured per Kofax Express job, you can define a default selection size to obtain a consistent output size.
The image editor’s setup includes a viewer to draw the default selection. You can use any image of your current batch as a sample image.
For accurate work, you can enter pixel-precise dimensions to define your default selection.
We also enabled sticky selection mode to make the selection appear on every image automatically.
Then press the Apply button or press ENTER (for left handed people) or SPACE BAR (for right handed people) to apply the actual crop.
Next press ENTER or SPACE BAR again to move to the next image.
Triple level undo
If you are not happy with the last action, press the undo button or press Backspace or CTRL+Z to undo the last action. A second undo clears the selection and a third undo restores the original image. You can apply triple level undo on any corrected image even after navigating away and back to it. Thanks to the undo function, there are no more annoying and inefficient “Are you really sure?” messages. If you make a mistake, just press the Backspace key.
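One possible reading of the three undo levels is sketched below in Python. It is an interpretation of the description above, not the Image Editor's actual implementation: the class `EditedImage`, its attributes and the use of strings to stand in for images are all hypothetical.

```python
class EditedImage:
    """Toy model: undo 1 reverts the last action, undo 2 clears the selection, undo 3 restores the original."""

    def __init__(self, original):
        self.original = original
        self.versions = [original]  # versions[-1] is the current image
        self.selection = None
        self.undo_level = 0

    @property
    def current(self):
        return self.versions[-1]

    def apply(self, action, selection):
        """Apply an edit to the current image and remember the selection used."""
        self.selection = selection
        self.versions.append(action(self.current))
        self.undo_level = 0

    def undo(self):
        self.undo_level += 1
        if self.undo_level == 1:
            if len(self.versions) > 1:
                self.versions.pop()          # revert the last action only
        elif self.undo_level == 2:
            self.selection = None            # clear the selection
        else:
            self.versions = [self.original]  # restore the original image
            self.undo_level = 0

image = EditedImage("scan")
image.apply(lambda img: img + "+crop", (0, 0, 100, 150))
image.apply(lambda img: img + "+erase", (10, 10, 40, 20))
image.undo()
print(image.current)    # scan+crop
image.undo()
print(image.selection)  # None
image.undo()
print(image.current)    # scan
```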
Platform for new functions
About the download and included demo jobs
The download button on top of this page installs a functional version of this CaptureBites product. It also includes some demo images and a demo job to show the functionality.
In demo mode, a demo seal will be stamped in all exported images. You can switch the demo version to full production mode by entering an activation code which you can purchase from our partners. You can continue using any of the jobs you configured in demo mode after activation of the software.