160-020 MetaServer Convert – Convert to Searchable PDF
Important: To enable the searchable PDF action, you first need to install the MetaServer Searchable PDF Module.
With MetaServer’s Convert to Searchable PDF action, you can convert image-based (scanned) PDF, TIF and JPG files to searchable PDF files.
It has the unique capability to make a PDF partially searchable, which reduces processing time. For example, to make only the first 3 pages searchable, specify pages “1-3” in the settings. More about this later.
To output Searchable PDF:
Step 1: add the Convert to Searchable PDF action just before the Export action(s) used to output the Searchable PDF.
Step 2: in your Export action(s), select “Processed PDF” as File source. More about that later.
01 Convert to Searchable PDF – Adding the Action
To add a Convert to Searchable PDF action, select the action after which you want to insert the Convert to Searchable PDF action and press Add -> Convert -> to Searchable PDF. The Setup window will automatically open.
You can also open the setup window of an existing Convert to Searchable PDF action by double-clicking the action or by pressing the setup button on the right side of the action or in the ribbon, as shown below.
In our example, we will make use of the “CB – DPE” workflow. This workflow is automatically installed with CaptureBites MetaServer.
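01 – Page(s): the Convert to Searchable PDF action has the unique ability to make a PDF partially searchable. Specify page numbers or page ranges, separated by commas. Negative numbers count from the end of the document, so -1 is the last page and -2 is the page before the last. To convert all pages, leave the field empty.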
1-5 = convert the 1st page to the 5th page
1, 3, -1 = convert the 1st, the 3rd and the last page
-1 = convert the last page
2--1 = convert the 2nd page to the last page
1, 3-5, -2 = convert the 1st page, the 3rd to the 5th page and the page before the last
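To make the page syntax above concrete, the following Python sketch resolves a page specification into actual page numbers for a document of a given length. It is only an illustration of the rules described here, not MetaServer's own code: the function name `resolve_pages` is hypothetical, and the handling of pages that fall outside the document (they are silently skipped) is an assumption.

```python
import re

# One token is either a single page ("3", "-1") or a range ("1-5", "2--1").
# In a range, the first hyphen after the start number is the separator;
# a second hyphen makes the end number negative (counted from the end).
_TOKEN = re.compile(r"^(-?\d+)(?:\s*-\s*(-?\d+))?$")


def resolve_pages(spec: str, page_count: int) -> list[int]:
    """Return the sorted 1-based page numbers selected by `spec`.

    Hypothetical helper: an empty spec selects every page, negative
    numbers count from the last page (-1 = last, -2 = one before last).
    """
    if not spec.strip():
        return list(range(1, page_count + 1))

    def absolute(n: int) -> int:
        if n == 0:
            raise ValueError("page numbers start at 1")
        return n if n > 0 else page_count + 1 + n

    selected = set()
    for token in spec.split(","):
        match = _TOKEN.match(token.strip())
        if match is None:
            raise ValueError(f"invalid page token: {token!r}")
        start = absolute(int(match.group(1)))
        end = start if match.group(2) is None else absolute(int(match.group(2)))
        # Assumption: pages outside the document are skipped, not reported.
        selected.update(p for p in range(start, end + 1) if 1 <= p <= page_count)
    return sorted(selected)
```

For a 10-page document, `resolve_pages("1, 3-5, -2", 10)` selects pages 1, 3, 4, 5 and 9, and `resolve_pages("", 10)` selects all ten pages.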
02 – Languages: press the dropdown arrow to select the language used in most of your documents.
You can select multiple languages, but we recommend doing so only when you regularly process documents in each of the selected languages. If documents in another language are rare, select only the main language. Every additional language slows down the conversion process.
03 – Improve OCR on low resolution images: to create a searchable PDF, we make use of OCR (Optical Character Recognition) technology. OCR works best with documents scanned at a resolution between 300 and 400 DPI. If you have scans with a lower resolution, such as 100 or 200 DPI, we recommend enabling this option to improve the OCR result. This option does not affect scans of 300 DPI or higher.
TIP: you can copy the current settings and paste them into another setup window of the same type. To do this, press the Settings button in the bottom left of the Setup window and select Copy. Then open another setup window of the same type and select Paste.