Text Recognition (OCR)

How to Use Text Recognition (OCR):

Ui.Vision includes two different local OCR engines: a Javascript OCR engine and the XModule OCR engine. OCR is used for the XClickText/XClickTextRelative commands and for screen scraping with OCRExtract/OCRSearch. In both cases, absolutely no data is sent to the cloud; all OCR processing is done locally on your machine.


Local OCR with the Javascript OCR engine. See also How to use XClickText and XClickTextRelative?

Difference between Javascript OCR and XModule OCR

Both solutions work 100% locally. After a fresh installation, the default is the local Javascript OCR engine.

The advantages of using Javascript OCR are:

  • Works on Windows, Mac and Linux (by contrast, XModule OCR supports Windows and Mac only).
  • Good OCR results, especially for black text on white backgrounds (e.g. typical websites, receipts, PDFs).
  • New OCR languages can be added on short notice. You can request new OCR languages in our RPA forum. Enterprise RPA users can request new languages directly from tech support.

The engine code for this OCR engine is 98, so you can select it with store | 98 | !OCREngine.

The advantages of using XModule OCR are:

  • Better OCR results. This OCR engine is often better for numbers and for text on complicated backgrounds (images, videos, games). This can be especially important for OCR screen scraping.
  • Faster. The larger the OCR input area, the larger the performance advantage.

The engine code for this OCR engine is 99, so you can select it with store | 99 | !OCREngine.

Text processing with Cloud OCR

Before you can use Cloud OCR, you need to enter an API key. You can get an OCR API key for free here: Get Free OCR API. Once you receive the API key, enter it on the Ui.Vision "OCR" tab and press "Test" to store it. If you need to change API keys, enter the new key and press "Test" again. This overwrites the old API key and stores the new one instead.

You can test the cloud OCR quality here: Online Ocr.

The advantages of using Cloud OCR are:

  • Works on Windows, Mac and Linux.
  • High-quality Cloud OCR. The text recognition is done on powerful machines in remote datacenters with a very strict privacy policy.
  • Built-in support for the OCR.Space PRO/PRO PDF endpoints. If you have a commercial OCR API PRO or PRO PDF API key, you can use it here. Ui.Vision detects the use of a PRO API key and automatically uses the faster and redundant OCR.Space PRO endpoints.

The engine codes for the Cloud OCR engines are 1 and 2. They select OCR Engine 1 or OCR Engine 2 from OCR.Space. You can select one with store | 1 (or 2) | !OCREngine.
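To keep the four engine codes from this page in one place, here is a small Python sketch that maps each engine to its code and prints the matching store line. The dictionary keys and the function name are illustrative labels chosen for this example and are not Ui.Vision identifiers. Only the codes (98, 99, 1, 2) and the store | code | !OCREngine form come from the text above.

```python
# Engine codes as stated on this page. The keys are hypothetical labels
# made up for this example; Ui.Vision itself only sees the numeric code.
OCR_ENGINE_CODES = {
    "javascript": 98,      # local Javascript OCR (default after a fresh install)
    "xmodule": 99,         # local XModule OCR
    "cloud_engine_1": 1,   # OCR.Space engine 1 (needs an API key)
    "cloud_engine_2": 2,   # OCR.Space engine 2 (needs an API key)
}


def select_engine_command(engine: str) -> str:
    """Return the macro line that selects the engine, as Command | Target | Value."""
    code = OCR_ENGINE_CODES[engine]
    return f"store | {code} | !OCREngine"


if __name__ == "__main__":
    # Print one selection line per engine.
    for name in OCR_ENGINE_CODES:
        print(f"{name:15} -> {select_engine_command(name)}")
```

Place the resulting store line in the macro before the first OCR command that should use that engine.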

Wildcards for text matching

You can use the * and ? wildcards with any text matching command: XClickText, XClickTextRelative, XMoveText, XMoveTextRelative, OCRSearch, OCRExtract and OCRExtractRelative. The * wildcard character matches zero or more characters. The ? wildcard character matches any single character.
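The wildcard rules above can be illustrated with a short Python sketch. This is not Ui.Vision's implementation. It only models the stated semantics (* matches zero or more characters, ? matches exactly one). It assumes that the pattern must match the whole text and that matching is case-sensitive, neither of which is specified on this page. The function names are made up for this example.

```python
import re


def wildcard_to_regex(pattern: str) -> re.Pattern:
    """Translate a * / ? wildcard pattern into a compiled regular expression."""
    parts = []
    for ch in pattern:
        if ch == "*":
            parts.append(".*")           # zero or more characters
        elif ch == "?":
            parts.append(".")            # exactly one character
        else:
            parts.append(re.escape(ch))  # every other character is literal
    return re.compile("".join(parts), re.DOTALL)


def wildcard_match(pattern: str, text: str) -> bool:
    """Assumption: the pattern has to cover the whole text, not just a part of it."""
    return wildcard_to_regex(pattern).fullmatch(text) is not None


if __name__ == "__main__":
    print(wildcard_match("Order *", "Order 12345"))  # * covers "12345"
    print(wildcard_match("Order *", "Order "))       # * may also match nothing
    print(wildcard_match("Item ?", "Item 7"))        # ? covers one character
    print(wildcard_match("Item ?", "Item 42"))       # two characters, so no match
```

Because OCR output can contain small recognition errors, a ? in place of an easily confused character (for example 0 versus O) can make a text match more tolerant.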

