060-530 MetaTool Extraction – OCR (extra languages) Rule
MetaTool’s Advanced OCR rule is the recommended choice for traditional Western languages, which it handles quickly and accurately. For languages that use a different character set, such as Russian (Cyrillic) or Arabic, use the OCR (extra languages) rule instead.
01 OCR (extra languages) – Add Rule
OCR (extra languages) is defined in the MetaTool Extract tab.
Press the Add button and select Zonal Extraction / OCR (extra languages) to add the extraction rule.
The OCR (extra languages) Setup window opens.
Select the index field to hold the extracted data.
Next, select the zone you would like to extract from. The zone can be the full page, the top or bottom half, or a custom zone drawn with the lasso tool.
In this example we want to read the whole page, so we select the Full Page Zone.
After this, we’ll adjust the OCR settings.
02 OCR (extra languages) – OCR Settings
08 – Languages: this setting enables the character set and dictionary of each selected language. Select only the languages that actually appear in your documents.
In the example below, the Russian language setting is disabled. The returned test result contains incorrect characters that do not match the original text of the document.
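If you post-process extracted values outside MetaTool, a quick script check can flag results that were read with the wrong language setting. The following is a minimal sketch in Python and is not part of MetaTool: the function name `script_share` and both sample strings are hypothetical, chosen only to illustrate how a Russian line tends to come back as Latin look-alikes when the Russian language is not enabled.

```python
import unicodedata


def script_share(text: str, script: str = "CYRILLIC") -> float:
    """Return the fraction of letters in `text` that belong to `script`.

    A letter counts as belonging to the script when its Unicode
    character name starts with the script name (e.g. "CYRILLIC").
    """
    letters = [ch for ch in text if ch.isalpha()]
    if not letters:
        return 0.0
    matching = sum(
        1 for ch in letters if unicodedata.name(ch, "").startswith(script)
    )
    return matching / len(letters)


# Hypothetical OCR results for the same Russian line (illustration only).
with_russian = "Счёт на оплату № 42"        # Russian language enabled
without_russian = "C4eT Ha onnaTy No 42"    # Russian language disabled

print(script_share(with_russian))     # share of Cyrillic letters
print(script_share(without_russian))  # share of Cyrillic letters
```

A low share on a document you know to be Russian suggests that the Languages setting should be reviewed. What counts as "low" depends on your documents, so treat any threshold as an assumption to tune.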