Visual Desktop Automation

Desktop Automation with an open-source browser extension? Yes, no problem with UI Vision and its XModules.

Automate your desktop workflow the "Kantu-way" by using computer vision. UI.Vision RPA is the fastest way to create stable robotic process automation (RPA) scripts with image and text recognition on Windows, Mac and Linux.

Rock-stable visual desktop automation, screen scraping and application UI testing

UI.Vision RPA uses the latest image and text recognition technologies to automate applications just like a human does. Leave window titles, window handles, class names and other Windows internals to the developers. And even if you are a developer, UI.Vision RPA gives you a break while it tests your app.

Even if you have the skills to automate your application by "coding" a test, wouldn't you rather spend your time building the application than debugging and testing the test automation scripts themselves?

Screen Scraping

Data extraction ("screen scraping") is a very important technique in data migration and integration scenarios with surface automation. With its accurate OCR screen scraping features, UI.Vision RPA essentially adds a "Data API" to every Windows, Mac and Linux application. For more information please read screen scraping with OCR.

Citrix UI Automation

In many enterprises the end user applications are published via Citrix XenApp. A Citrix server only sends screenshots of the live application that runs on the server back to the client, so classical automation tools fail because they cannot access the logical elements that make up the user interface. But Citrix automation is no problem with UI.Vision RPA's computer vision: with its image recognition engine and on-screen OCR (optical character recognition), UI.Vision RPA can automate Citrix applications just like a regular desktop application.

App Scripting via API

UI.Vision RPA contains a command-line application programming interface (API) to automate more complicated tasks and integrate with other programs or scripts for complete Robotic Process Automation (RPA).

This means that you can access UI.Vision RPA's web and desktop automation functionality from any programming language on Windows, Mac and Linux. Developers can use Python, PowerShell, C#, Java, SAP, VBS, Visual Basic, or any other programming or scripting language to embed and control UI.Vision RPA directly in their applications.
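As a rough illustration of what calling UI.Vision RPA from another language could look like, the Python sketch below only composes a launch URL that a script might hand to the browser. The helper page name `ui.vision.html` and the query parameters `macro`, `direct` and `closeRPA` are assumptions made for this example; check them against the command-line API documentation before relying on them. The function name `build_launch_url` is hypothetical.

```python
from urllib.parse import quote, urlencode


def build_launch_url(helper_page: str, macro: str, close_rpa: bool = True) -> str:
    """Compose a file:// URL that asks the extension to run one macro.

    helper_page is an absolute POSIX-style path to the (assumed) helper page.
    """
    params = {
        "macro": macro,              # assumed parameter: name of the macro to run
        "direct": 1,                 # assumed parameter: start without a prompt
        "closeRPA": int(close_rpa),  # assumed parameter: close the RPA window afterwards
    }
    return "file://" + quote(helper_page) + "?" + urlencode(params)


url = build_launch_url("/home/user/ui.vision.html", "DemoXDesktopAutomation")
print(url)
```

A calling script would then pass this URL to the browser executable as its start page, for example through `subprocess.run`. That step is left out here because browser paths differ per system.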

Any task you can do on a computer can be scripted with UI.Vision RPA and thus automated. Awesome!
Frank Zimmermann, Lufthansa IT

Desktop UI Automation, also known as Robotic Process Automation (RPA)
UI.Vision RPA is a tool for cross-platform Robotic Process Automation (RPA): it runs on Windows, Mac and Linux.

User Manual: Desktop Automation with UI.Vision RPA

The DesktopAutomation XModule is a native app for Windows, Mac and Linux. It adds "hands" and "eyes" to the UI.Vision RPA core. The XModule directly interacts with the operating system and allows UI.Vision RPA to run computer vision directly on the desktop, move the mouse and simulate keystrokes. The DesktopAutomation XModule is included in the UI.Vision RPA XModules Installer.

How to create desktop automation macros:

  • In the UI.Vision RPA settings go to the VISION tab and select "Desktop Automation" as operating mode. This switches the UI.Vision RPA eyes from the web browser to the desktop. It also switches the "Select" and "Find" buttons to operate on the desktop.

  • Visual macros are best constructed like a Lego car: add one XClick command after another to the macro, and build the macro step by step.

  • Note that the "Record" button is only for Selenium IDE-style browser automation macros. Recording is not available for the XClick, XMove and XType commands.


XDesktopAutomation | true/false: You can use this command to switch the UI.Vision RPA eyes between browser and desktop. It overrides the global UI.Vision RPA eyes setting on the "Vision" settings page for the current macro. Note that a switch between desktop and browser scope changes the coordinate system, too. In browser mode XClick (0,0) is the top left point inside the browser viewport, and in desktop mode XClick (0,0) refers to the top left point on your desktop.
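To make the scope switch concrete, here is a minimal Python sketch that assembles a small macro and prints it as JSON. It assumes the usual macro layout of a "Name" plus a list of "Commands" with "Command", "Target" and "Value" fields, and it assumes XClick accepts an "x,y" coordinate target. The macro name and the coordinates are made up for illustration.

```python
import json


def build_macro(name, steps):
    """Turn (command, target, value) tuples into a macro dictionary."""
    return {
        "Name": name,
        "Commands": [
            {"Command": command, "Target": target, "Value": value}
            for command, target, value in steps
        ],
    }


macro = build_macro(
    "ScopeSwitchSketch",  # hypothetical macro name
    [
        ("XDesktopAutomation", "true", ""),   # eyes on the desktop
        ("XClick", "50,50", ""),              # 50,50 measured from the desktop's top left
        ("XDesktopAutomation", "false", ""),  # eyes back on the browser
        ("XClick", "50,50", ""),              # same numbers, now relative to the viewport
    ],
)
print(json.dumps(macro, indent=2))
```

The two XClick steps use identical coordinates on purpose: after the scope switch they would address different points on the screen.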

CaptureDesktopScreenshot | file name or full path: This command allows you to script taking desktop screenshots. If the screenshot name parameter is just a name (e.g. "LinuxScreenshot"), the screenshot is stored in the internal HTML5 storage, just like the "inside the browser" commands CaptureScreenshot and captureEntirePageScreenshot do. If the parameter is an absolute path (e.g. "c:\test\desktopscreenshot.png"), the screenshot is stored directly on the hard drive.
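The name-versus-path rule can be sketched in a few lines of Python. This is only an illustration of the rule as described above, not the logic UI.Vision RPA actually runs; the function name `screenshot_destination` and the returned labels are hypothetical.

```python
import ntpath
import posixpath


def screenshot_destination(target: str) -> str:
    """Classify a CaptureDesktopScreenshot target as described in the text.

    An absolute Windows or POSIX path means the file goes to the hard drive;
    a bare name means it stays in the browser's internal storage.
    """
    if ntpath.isabs(target) or posixpath.isabs(target):
        return "hard drive"
    return "browser storage"


for example in ("LinuxScreenshot", r"c:\test\desktopscreenshot.png"):
    print(example, "->", screenshot_destination(example))
```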

Desktop Automation Demo Macros

UI.Vision RPA ships with the DemoXDesktopAutomation and DemoXDesktopAutomation_OCR demo macros.

The DemoXDesktopAutomation macro uses UI.Vision RPA itself to demo visual GUI automation and GUI testing. For this purpose the macro first restricts the image search range (the clipping area) to the tab area under test. Then, one after another, the macro selects the tabs and tests the "Clear" button. Finally, on the "Vision" tab, it uses visualAssert to make sure that the test sequence was successful.

The DemoXDesktopAutomation_OCR macro automates exactly the same workflow, but instead of using images to select the tabs, it uses text recognition to read the text on the screen. For example, it uses the string "Logs" to find the "Logs" tab. Text recognition (OCR) makes the macro very robust against color and font changes. The macro is also easier to create: you do not have to create an input image for each command, you can just type the text.

Other UI automation and RPA demos: Browser Extension Testing with UI.Vision RPA (forum post).
