060-550 MetaTool Extraction – Mark Detection Rule

MetaTool’s Mark Detection rule makes it possible to detect the presence of a check mark in a check box or a signature in a signature box.

For example, legal documents, contracts and agreements must contain a signature; if they are not signed, they should be flagged as invalid. Forms often contain check boxes, and the Mark Detection rule can be used to detect whether they are checked.

01 Mark Detection – Add Rule

Mark Detection is defined in the MetaTool Extract tab.
Press the Add button and select Zonal Extraction / Mark Detection to add the extraction rule.

The Mark Detection Setup window opens.

Select the index field to hold the extracted data. In this case, we want to detect the presence of a signature.

Next, select the zone you would like to extract from. The zone can be full page, top/bottom half or a custom zone specified with the lasso tool.

With the lasso tool we draw a zone around the signature location.

Next, we’ll adjust the Image Processing and/or OCR settings, starting with the Image Processing settings.

02 Mark Detection – Image Processing Settings

03 – Brightness: (represented by the small sun symbol)

By increasing the brightness value, you make the scanned image brighter. This can be very useful when working with documents that contain a lot of noise or a background pattern, like the document in the screenshot below.
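To illustrate why brightening helps with noisy backgrounds, here is a minimal Python sketch. It is not MetaTool's actual implementation: it assumes an 8-bit grayscale zone stored as a list of rows, a simple additive brightness offset, and a fixed black/white cut-off of 128. All function names and values are hypothetical.

```python
def brighten(gray, offset):
    """Add offset to every 8-bit grayscale pixel, clamping to 0..255."""
    return [[max(0, min(255, p + offset)) for p in row] for row in gray]


def count_black(gray, threshold=128):
    """Count pixels darker than the threshold (assumed cut-off for 'black')."""
    return sum(1 for row in gray for p in row if p < threshold)


# Hypothetical 2x4 zone: gray background noise (110), dark ink (20), paper (250).
zone = [
    [110, 20, 110, 250],
    [250, 110, 20, 110],
]

noisy = count_black(zone)                   # noise and ink both count as black
cleaned = count_black(brighten(zone, 60))   # noise rises to 170, ink stays at 80
```

In this sketch the background noise is lifted above the cut-off while the ink remains below it, so only the real marks are counted.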

04 – Drop out: when working with forms whose lines and labels are printed in red, green or blue, we can filter them out by using the drop out setting.

The following example contains such a document. The check boxes are printed in red. We can remove them completely by setting the drop out filter to red. This makes it very easy to detect if the box is checked or not. Without a check mark, the zone will contain close to 0% black pixels. Any trace of a check mark will increase the black level percentage considerably.

Open the Drop out pick list and select the color you wish to filter, in this case Red. Press Test and the red check boxes disappear completely; only the check marks remain. The Black level can then reliably tell a marked check box from an empty one.
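The effect of a color drop out on the black percentage can be sketched in Python as follows. This is an illustration under stated assumptions, not MetaTool's algorithm: the image is a list of rows of RGB tuples, a pixel is dropped when the chosen channel is bright and clearly stronger than the other two (the limits 150 and 80 are arbitrary), and "black" means an average intensity below 128. All names are hypothetical.

```python
def drop_out(pixels, color):
    """Replace pixels dominated by the given channel with white."""
    idx = {"red": 0, "green": 1, "blue": 2}[color]
    white = (255, 255, 255)
    out = []
    for row in pixels:
        new_row = []
        for px in row:
            others = [v for i, v in enumerate(px) if i != idx]
            # Assumed rule: the channel is bright and well above the other two.
            if px[idx] > 150 and px[idx] - max(others) > 80:
                new_row.append(white)
            else:
                new_row.append(px)
        out.append(new_row)
    return out


def black_percentage(pixels, threshold=128):
    """Percentage of pixels whose average intensity is below the threshold."""
    total = sum(len(row) for row in pixels)
    dark = sum(1 for row in pixels for px in row if sum(px) / 3 < threshold)
    return 100.0 * dark / total


RED, WHITE, BLACK = (220, 30, 30), (255, 255, 255), (10, 10, 10)

# Hypothetical 4x4 zone: a red printed box with an empty white interior.
empty_box = [
    [RED, RED, RED, RED],
    [RED, WHITE, WHITE, RED],
    [RED, WHITE, WHITE, RED],
    [RED, RED, RED, RED],
]
# The same box with a single black check-mark pixel inside.
checked_box = [row[:] for row in empty_box]
checked_box[1][1] = BLACK

before = black_percentage(empty_box)                        # red border counts as dark
after_empty = black_percentage(drop_out(empty_box, "red"))
after_checked = black_percentage(drop_out(checked_box, "red"))
```

Without the drop out, the printed box itself contributes to the black percentage; with it, the empty box drops to zero and any remaining black comes from the check mark alone.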

05 – Thickening: in some cases, the documents have been signed or checked using a pencil.
You can use the thickening option to make the data bolder in the selected direction(s) to enhance faint check marks or signatures.
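Thickening of this kind is commonly implemented as a directional dilation. The following Python sketch shows the idea on a binary image (1 = ink); it assumes a one-pixel growth per direction and is not taken from MetaTool itself. The function and parameter names are hypothetical.

```python
def thicken(binary, horizontal=True, vertical=False):
    """Grow every ink pixel by one pixel in the chosen direction(s)."""
    height = len(binary)
    width = len(binary[0]) if height else 0
    offsets = [(0, 0)]
    if horizontal:
        offsets += [(0, -1), (0, 1)]
    if vertical:
        offsets += [(-1, 0), (1, 0)]
    out = [[0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            if binary[y][x]:
                for dy, dx in offsets:
                    ny, nx = y + dy, x + dx
                    # Ignore neighbors that fall outside the zone.
                    if 0 <= ny < height and 0 <= nx < width:
                        out[ny][nx] = 1
    return out


# A single faint pencil pixel in a 3x5 zone.
faint = [
    [0, 0, 0, 0, 0],
    [0, 0, 1, 0, 0],
    [0, 0, 0, 0, 0],
]
bolder = thicken(faint, horizontal=True, vertical=True)
```

Here one ink pixel becomes five, which raises the black percentage of a faint mark well above that of an empty zone.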

TIP: When designing a check box form, it is recommended to add clear instructions such as: “Fill out this form using a black or blue pen”.
06 – Append to original value: the result will be added to the value that was already in the index field. Disable this option to overwrite the previous value with the new result.
07 – Clear original value if result is blank: when the Mark Detection process returns nothing, any value already in the index field generated by previous rules or by Kofax Express will be cleared.
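The interaction of options 06 and 07 can be summarized with a small Python sketch. It rests on two assumptions that the text above does not confirm: appended values are joined without a separator, and a blank result leaves the existing value untouched when option 07 is disabled. The function name and parameters are hypothetical.

```python
def update_index_field(original, result, append=False, clear_if_blank=False):
    """Return the new index field value after a Mark Detection rule runs."""
    if result == "":
        # Blank result: clear only if option 07 is enabled (assumed behavior).
        return "" if clear_if_blank else original
    if append:
        # Option 06: add the result to the existing value (no separator assumed).
        return original + result
    # Option 06 disabled: overwrite the previous value.
    return result


appended = update_index_field("Signed:", "Yes", append=True)
overwritten = update_index_field("Signed:", "Yes", append=False)
cleared = update_index_field("Old value", "", clear_if_blank=True)
kept = update_index_field("Old value", "", clear_if_blank=False)
```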

08 Mark Detection – OCR Settings

09 – On Page(s): sometimes the information is on a page other than page 1. With this option, you can specify exactly which page(s) to extract data from.
10 – First document only: only reads the pages of the first document.
11 – Align Zone: when documents in a batch are of varying sizes or mixed orientations (portrait and landscape mixed together), you can align your Mark Detection zone in relation to any of the 4 corners of the image: the top left or right corner or the bottom left or right corner. That way the Mark Detection zone will be positioned correctly on all sizes and orientations.
Bottom right alignment of a Mark Detection zone on a portrait-oriented image
Bottom right alignment of the same Mark Detection zone on a landscape-oriented image
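Corner alignment amounts to storing the zone as a distance from the chosen corner instead of from the top left. A minimal Python sketch, assuming pixel coordinates with the origin at the top left of the image and hypothetical names throughout:

```python
def align_zone(image_w, image_h, corner, dx, dy, zone_w, zone_h):
    """Return (left, top, right, bottom) for a zone anchored to an image corner.

    dx and dy are the distances from the chosen corner to the nearest
    edges of the zone.
    """
    if corner not in ("top_left", "top_right", "bottom_left", "bottom_right"):
        raise ValueError("unknown corner: " + corner)
    left = dx if corner.endswith("left") else image_w - dx - zone_w
    top = dy if corner.startswith("top") else image_h - dy - zone_h
    return (left, top, left + zone_w, top + zone_h)


# A 600x200 signature zone, 100 pixels in from the bottom right corner.
portrait = align_zone(2550, 3300, "bottom_right", 100, 100, 600, 200)
landscape = align_zone(3300, 2550, "bottom_right", 100, 100, 600, 200)
```

The same setting yields different absolute coordinates on the two images, but the zone keeps the same distance from the bottom right corner in both.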

12 – Black level: the Mark Detection tool calculates the percentage of black pixels in the Mark Detection zone. The Black level is the threshold that decides whether a box counts as checked. Zones containing a higher percentage of black than the Black level are considered checked (or signed). Zones containing a lower percentage of black than the Black level are considered not checked (not signed).

If a box is checked, we can define the returned index value in the If value reached setting. If the box is not checked, the value defined in the Otherwise setting will be used.

In this example, the Black level is set to 1%. When more than 1% of the zone is black, the “Bookmark” index value will be set to “Anesthesia Record”. Otherwise, when less than 1% of the zone is black, the index value will be blank.

Checked box “Anesthesia Record” 
Unchecked box “Anesthesia Record”
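The Black level decision described above can be sketched in Python as follows. The zone is assumed to be already binarized (1 = black), the comparison is assumed to be strictly "greater than" since the text does not say what happens at exactly the threshold, and all names are hypothetical.

```python
def mark_detection(binary_zone, black_level, if_reached, otherwise=""):
    """Return if_reached when the black percentage exceeds black_level."""
    total = sum(len(row) for row in binary_zone)
    if total == 0:
        return otherwise  # empty zone: nothing to measure
    black = sum(sum(row) for row in binary_zone)
    percent = 100.0 * black / total
    return if_reached if percent > black_level else otherwise


# Hypothetical 10x10 zones: one empty, one with a 5-pixel diagonal stroke (5%).
unchecked = [[0] * 10 for _ in range(10)]
checked = [row[:] for row in unchecked]
for i in range(3, 8):
    checked[i][i] = 1

bookmark_checked = mark_detection(checked, 1.0, "Anesthesia Record")
bookmark_unchecked = mark_detection(unchecked, 1.0, "Anesthesia Record")
```

With a Black level of 1%, the checked zone returns the “If value reached” value and the empty zone returns the blank “Otherwise” value.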

If you want to see the complete job configurations, install MetaTool and inspect the “CB MetaTool OMR Separators” and “CB MetaTool Signature Detection” sample jobs.