The XClick and XMove commands allow you to send real user mouse clicks to web page elements. The advantage of this approach is that the web app reacts exactly as it would if a real human user performed the click. In contrast, the standard Selenium IDE Click and ClickAt commands operate on the Javascript level inside the browser DOM. They can sometimes require dedicated debugging and tweaking to make them work (e. g. finding the right locator), or they can fail completely on complex websites when the website logic swallows the click event and does not send it to the right element. XClick avoids all these problems by simulating a real user.
This video shows how to build a macro with XClick and XMove. The video uses the desktop automation mode, but everything works the same in web automation, except that the screenshots are taken directly inside the web browser. Note that XClick and XMove commands can not be recorded like Selenium commands. Instead, you build these visual macros by adding one visual command after another.
The XClick, XMove and XType commands need the RealUser XModule to be installed. Please note that the RealUser module requires an unlocked user desktop to simulate events. No real human user can work with a locked desktop either.
The classic Click commands as used by Ui.Vision RPA and any other Selenium IDE or browser automation tool do not simulate real mouse clicks! Instead they operate on the Javascript level inside the browser DOM. This is the reason why these commands all first require a locator to identify a web page element, and then send a Javascript mouse click event to this element. On simple websites this is often enough and works well. But on modern complex websites (e. g. Google Gmail, Google Spreadsheet, Microsoft Office 365, Facebook, Instagram,...) it is often difficult to find the right element locator in the first place. And even if you have found the right locator, the website logic can swallow the Javascript click event so that it never reaches the target element. In contrast, the XClick and XMove commands avoid all these problems by simulating real user clicks and mouse events.
Example of a web app click that fails with the standard CLICK command, but works just fine with XCLICK. Note that the same locator is used in both cases.
The real-user commands XClick and XMove have one drawback: As they simulate real user input, they require an unlocked user desktop to simulate events, and the browser window has to be in the foreground. This is the same requirement that you would have for a real human operator: no human can click on something if the desktop has been locked. Here the classic click/clickAt commands win: They work just fine with the browser in the background and/or hidden, as everything happens inside the web browser.
Multi-Monitor Support: Note that the image search/OCR for XClick/XMove is always done on the main screen. So even if Ui.Vision runs e. g. on the 3rd monitor, the computer vision will search the main screen only. This means the browser or app that you want to automate with XClick needs to be placed on the main screen. We plan to add multi-screen support for image search in a future update.
So for standard websites we recommend first testing whether the classic Click command works. If it works, great. And if not, you have XClick in your toolbox. It will work for sure!
Automating websites with canvas elements (e. g. drawing apps like Sketchpad.io and most online games) using the ClickAt command could work, but it is impossible to find an XPath or CSS locator for ClickAt inside the canvas element. The best solution for canvas test automation is to use the computer vision image search of the RPA software. It searches visually for the image (e. g. a button image) that you want to click. For more details about image-driven test automation, see the Visual UI Testing page. RPA computer vision plus XClick/XMove/XType can automate everything inside canvas elements. More about using images as locators follows further below.
The standard Selenium IDE way to trigger clicks inside an iframe is to first use selectFrame to switch to the right frame and then send a click event to the element. However, finding the right iframe number and the right locator can be tricky. And to make things even more complex, on some websites the iframe order and/or locator can change with each page reload. But the good news is that there is a better way to click inside iframes: Use XClick (image). The short screencast below shows how this is done for an embedded YouTube video:
The locator input tells the command the location on the browser viewport where to send the native click and move events.
"Locator" for XClick/XMove | Comment |
---|---|
(xpath or ID) | XClick can work with a normal page element locator, exactly the same one as you would use for the standard Click or Type commands. In this case Ui.Vision RPA locates the element, runs element.getBoundingClientRect() and then sends a click event to the center of the rectangle. |
image@conf.Level | XClick can use images as input! This works exactly the same way as with other visual commands like visualAssert. This is the easiest method to locate places to click, as it works 100% image-driven. Ui.Vision RPA's built-in computer vision finds the image in the active vision area and then sends a click event to the center of the image. If more than one image matches the input image, the best match is selected. But you can force Ui.Vision to select another match above the threshold with #. So to click the 3rd matching image use image_dpi_96.png#3 or image_dpi_96.png@0.75#3. Once # is added, all matches above the confidence level are treated as equal and are sorted from top/left to bottom/right. Related forum posts: Top/left to Bottom/Right sorting explained and How to click all matching images. |
XClickText \| text to click (exact match), XClickText \| *text to click* (wildcard), XClickText \| *text to click*@pos=X | XClickText can read just like a human: When you use XClickText \| text, Ui.Vision RPA takes a screenshot of the page, reads it (OCR) and then finds the text to click. Since this works on the rendered website image, it operates just like a human. Unlike with the HTML-based xpath text() command, there are no tricky xpath, iframe and Javascript issues to take care of. The built-in computer vision of the RPA software finds the text in the active vision area and then sends a click event to the center of the text. This works with text inside images, videos or PDF documents, too. Before V9.2.0: The default text match mode is partial match: "XClickText \| day" matches "day" but also "today" or "yesterday's". For exact match use square brackets [text]. V9.2.0 and newer: We improved the text search and introduced wildcard support (* and ?). Now the Ui.Vision search behaves like a standard text search on Windows. The default text match mode is exact match: "OCRSearch \| day" matches "day" but not "today" or "yesterday's". For partial text match use wildcards: "OCRSearch \| *day*" matches "day" and also "today" and "yesterday's". To click the X-th occurrence of a text string, use text@pos=X. The occurrences are counted from top left to bottom right. Another option to exclude some matches is to limit the search area. Related forum post: How to use XClickText and XClickTextRelative? |
x,y | The raw way to send clicks by using x/y coordinates is also available. This can be useful if you know or can calculate the exact x/y position. For example, you can first do an image search, then calculate the coordinates as an offset to this initial position. If you want to re-use x/y from a previous image search, you find them in ${!imagex} and ${!imagey}. Also, there are good tools to determine x/y manually. Note that in browser automation mode the XClick \| 0,0 value is the top left of the browser viewport. In desktop automation mode it is the top left of the main desktop. |
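The image locator syntax from the table (image@confidence#index) can be illustrated with a small parser. This is a minimal Python sketch and not Ui.Vision's actual implementation: the names `parse_image_locator` and `pick_match`, the `default_confidence` value, and the way "top/left to bottom/right" is modelled (sort by y, then by x) are assumptions made for illustration only. The forum post on sorting describes the real ordering rules.

```python
from typing import NamedTuple, Optional


class ImageLocator(NamedTuple):
    image: str                   # image file name, e.g. "image_dpi_96.png"
    confidence: Optional[float]  # value after "@", None if not given
    index: Optional[int]         # 1-based match number after "#", None if not given


def parse_image_locator(locator: str) -> ImageLocator:
    """Split 'name.png@0.75#3' into its parts (illustrative only)."""
    index = None
    if "#" in locator:
        locator, raw_index = locator.rsplit("#", 1)
        index = int(raw_index)
    confidence = None
    if "@" in locator:
        locator, raw_conf = locator.rsplit("@", 1)
        confidence = float(raw_conf)
    return ImageLocator(locator, confidence, index)


def pick_match(matches, locator: ImageLocator, default_confidence: float = 0.6):
    """Choose one match from a list of (x, y, score) tuples.

    Assumptions: default_confidence is a made-up fallback value, and the
    position order is modelled as sorting by y first, then by x.
    """
    threshold = locator.confidence if locator.confidence is not None else default_confidence
    candidates = [m for m in matches if m[2] >= threshold]
    if not candidates:
        return None
    if locator.index is None:
        # No "#": the best-scoring match wins.
        return max(candidates, key=lambda m: m[2])
    # With "#": all candidates count as equal and are ordered by position.
    ordered = sorted(candidates, key=lambda m: (m[1], m[0]))
    if 1 <= locator.index <= len(ordered):
        return ordered[locator.index - 1]
    return None
```

For example, `parse_image_locator("image_dpi_96.png@0.75#3")` yields the image name, the confidence 0.75 and the match number 3, and `pick_match` then returns the third candidate in position order instead of the best-scoring one.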
If you only want to check for the presence of an image you can use visualAssert.
XClick can simulate all kinds of mouse clicks.
Click event command | Comment |
---|---|
„“ (empty) or #left | Standard left mouse click |
#doubleclick | Simulates double-click of the left mouse button |
#tripleclick | Simulates triple-click of the left mouse button (related RPA tutorial video, starts at 2:50) |
#ctrlclick | Simulates CONTROL-Click with the left mouse button (related forum post) |
#shiftclick | Simulates SHIFT-Click with the left mouse button |
#middle | Simulates click of the middle mouse button |
#right | Simulates a mouse right-click. Typically this is used to bring up the context menu. |
XMove simulates mouse events. For simulating mouse clicks, use XClick.
Mouse event command | Comment |
---|---|
„“ (empty) or #move | Simulates mouse move events. A move over an element also triggers the mouse over effect. |
#up | Simulates mouse up event of the left mouse button |
#down | Simulates mouse down event of the left mouse button |
The DemoXClick macro that ships with Ui.Vision includes many XMove commands, too. It uses the command to move the mouse cursor when drawing a rectangle on the canvas element.
XMove is used for dragging & dropping items in the web browser and on the desktop. Drag and Drop is a sequence of the two mouse events XMove |...|#DOWN and XMove |...|#UP:
Drag and Drop with XMove. For details see this forum post: Drag & Drop with RPA software
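The two-step drag and drop sequence can be written down as a pair of macro commands. The following Python sketch only generates the command rows in the Command | Target | Pattern/Text layout used on this page. The helper names `drag_and_drop` and `as_table` and the image file names are hypothetical and only serve as an illustration.

```python
def drag_and_drop(source: str, destination: str):
    """Return the two XMove commands that make up one drag and drop."""
    return [
        ("XMove", source, "#down"),     # press the left button over the item
        ("XMove", destination, "#up"),  # release it over the drop target
    ]


def as_table(commands) -> str:
    """Format commands in the Command | Target | Pattern/Text layout."""
    rows = ["Command | Target | Pattern/Text |", "---|---|---|"]
    rows += [f"{cmd} | {target} | {value} |" for cmd, target, value in commands]
    return "\n".join(rows)


if __name__ == "__main__":
    # Hypothetical image names; replace them with your own screenshots.
    print(as_table(drag_and_drop("item_dpi_96.png", "dropzone_dpi_96.png")))
```

The target of each XMove can be any locator type that the command accepts, so the same pattern also applies to x,y coordinates.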
XClickText works like XClick but uses TEXT instead of an image to find the place to click. And just like XClick automatically retries the image search if the image is not found, XClickText automatically retries the text search if the word is not found. It retries until the !timeout_wait limit is reached. So the retries work exactly as with XClick, but for text search instead of image search.
Text recognition (OCR) is very stable against screen resolution changes. Even changing the screen resolution/screen scaling from e. g. 100% to 200% does not confuse an XClickText command. See also this forum post: How to use XClickText and XClickTextRelative?
XMoveText works the same as XClickText, but triggers a "mouse over" event instead of a click.
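The text match rules for V9.2.0 and newer (exact match by default, * and ? as wildcards, @pos=X for the X-th occurrence) can be sketched in a few lines of Python. This is an illustration of the matching rules as described on this page, not the actual OCR search code. The function names are hypothetical, the word list is assumed to arrive in top-left to bottom-right order, and the comparison here is case-sensitive, which may differ from the real behavior.

```python
import re


def wildcard_to_regex(pattern: str) -> "re.Pattern[str]":
    """Translate * (any run of characters) and ? (one character) to a regex."""
    parts = []
    for ch in pattern:
        if ch == "*":
            parts.append(".*")
        elif ch == "?":
            parts.append(".")
        else:
            parts.append(re.escape(ch))
    return re.compile("".join(parts))


def text_matches(pattern: str, word: str) -> bool:
    """Exact match unless the pattern contains wildcards."""
    return wildcard_to_regex(pattern).fullmatch(word) is not None


def find_occurrence(pattern: str, words, pos: int = 1):
    """Return the list index of the pos-th matching word (1-based), or None."""
    hits = [i for i, word in enumerate(words) if text_matches(pattern, word)]
    return hits[pos - 1] if 1 <= pos <= len(hits) else None
```

With these rules `text_matches("day", "today")` is false while `text_matches("*day*", "today")` is true, which mirrors the "day" versus "*day*" examples in the table above.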
XClickTextRelative is available for web automation and desktop automation. If you use it for desktop automation, see the Calibrate XClickTextRelative paragraph below. In web automation mode (OCR inside the web browser) you do not need the calibration.
To get started with the command, we recommend that you run "DemoXClickTextRelative" to see the command in action. This macro is installed with Ui.Vision by default.
What does for example "#R5,-10" mean?
XClickTextRelative | Login#R5,-10 means that computer vision first finds the word "Login". Then, starting from the center of this word, the RPA software moves the cursor 5 characters to the right and 10 characters down (down because the value is -10; up would be +10).
The term "#R" has no special meaning, it is only the separator between anchor word and coordinates.
XClickTextRelative works similarly to XClickRelative. But instead of an anchor image, it looks for an anchor WORD. Then, from this word, it goes X times left/right and Y times up/down.
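The meaning of a target such as Login#R5,-10 can be made concrete with a short calculation. The Python sketch below is a hedged illustration: `parse_relative_target` and `click_point` are hypothetical helpers, and `char_width`/`char_height` (the pixel size of one "character" step) are assumed inputs, since this page does not state how the RPA software derives them.

```python
import re
from typing import Tuple


def parse_relative_target(target: str) -> Tuple[str, int, int]:
    """Split 'Login#R5,-10' into the anchor word and the x/y offsets."""
    match = re.fullmatch(r"(.+)#R([+-]?\d+),([+-]?\d+)", target)
    if match is None:
        raise ValueError(f"not a relative text target: {target!r}")
    return match.group(1), int(match.group(2)), int(match.group(3))


def click_point(anchor_center, dx: int, dy: int, char_width: float, char_height: float):
    """Compute the click position in screen pixels.

    Positive dx moves right. Positive dy moves up and negative dy moves down,
    so dy is subtracted because screen y coordinates grow downwards.
    """
    cx, cy = anchor_center
    return cx + dx * char_width, cy - dy * char_height
```

For an anchor word centered at (100, 200) and an assumed character size of 8 x 16 pixels, Login#R5,-10 ends up 40 pixels to the right of and 160 pixels below the anchor center.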
Image: You can also use the ...TextRelative command to shift the click position from the default (center of the bounding box).
XMoveTextRelative works the same as XClickTextRelative, but triggers a "mouse over" event instead of a mouse click.
If you use XClickTextRelative, XMoveTextRelative or OCRExtractRelative in desktop mode AND you are on Windows AND you are using a HiDPI screen AND you are using a screen scaling that is different from 100% then (and only then) you need to calibrate the relative distance calculation used by the RPA software. If you are on a different system and use the calibration, nothing bad happens. It is just not needed then.
To calibrate, do this:
- Enter your screen scaling value in the box, e. g. "200" for a 200% screen scaling.
- Press the "Calibrate XClickTextRelative" button. This also calibrates XMoveTextRelative and OCRExtractRelative.
- A desktop OCR preview window opens. You can close it.

The value in the OCRTEXTX box gets adjusted automatically by the calibration (if needed). For example, it might change from "6" to "13". The calibration is now done.
(This page is a work in progress. You can find a full example in the Demo-XClick macro that gets installed with the RPA software.)
Command | Target | Pattern/Text |
---|---|---|
open | https://ui.vision/ | |
XClick | button.png | #doubleclick |
echo | done | |
Relative clicks allow you to click on something relative to something else. The idea is always that you find something (easy to locate) and then click on an often changing, very small or not unique element next to it. Or you find a label and then click in the input box to the right/left/top/below of it. Using Relative Clicks helps to create very stable visual automation scripts.
Screenshot: Using relative clicks. You draw the green box around the area to find, and the pink box indicates the area to click. So Ui.Vision RPA only searches for the image inside the green box. If you want to use this feature, make sure to use an X... command that ends with "...Relative".
Screenshot: Make sure that the green and pink boxes are not inside each other, as that confuses the computer vision.
OCRExtractRelative is explained in this video. Creating the green/pink box works the same for all ....Relative commands. The green box is always the anchor image. The difference is the meaning of the pink box: (1) For OCR, the pink box marks the text area to read. (2) For visionLimitSearchAreaRelative the pink box marks the new search area for computer vision. (3) For XClickRelative/XMoveRelative the middle (center) of the pink box marks the point to click.
You can draw the green and pink boxes with any image editor, but it will be easiest to use the built-in editor. It is probably the most basic image editor in the world, as its only features are drawing green and pink boxes (XClickRelative forum post).
If you draw the boxes with an external image editor such as Photoshop, Microsoft Paint or Snagit, make sure that the colors of the boxes are code #00FF00 for green and #FE1492 for pink. Also make sure that the frames are one solid color, and that the tools do not apply any shadow effects, color gradients or similar visual effects to the border. This could disturb the box-finding logic. Of course, if you draw the boxes with the Ui.Vision RPA editor, this is automatically taken care of.
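If you draw the boxes with an external tool, you can check the exported frame colors before using the image. The sketch below is a minimal Python illustration under stated assumptions: `hex_to_rgb` and `is_solid_frame` are hypothetical helpers, and the border pixels are assumed to be already available as (R, G, B) tuples. How you read them from the image file (e. g. with an imaging library) is left out.

```python
GREEN = "#00FF00"  # anchor box color required by the ...Relative commands
PINK = "#FE1492"   # target box color required by the ...Relative commands


def hex_to_rgb(code: str):
    """Convert '#RRGGBB' into an (R, G, B) tuple of integers."""
    code = code.lstrip("#")
    return tuple(int(code[i:i + 2], 16) for i in (0, 2, 4))


def is_solid_frame(border_pixels, expected_hex: str) -> bool:
    """True if every sampled border pixel has exactly the expected color.

    Any shadow, gradient or anti-aliasing on the frame produces pixels that
    differ from the expected value and makes this check fail.
    """
    expected = hex_to_rgb(expected_hex)
    pixels = list(border_pixels)
    return bool(pixels) and all(tuple(p) == expected for p in pixels)
```

A frame that an editor has softened with anti-aliasing will typically contain near-matches such as (0, 254, 0), which this strict comparison rejects.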
Ui.Vision RPA for Chrome, Edge and Ui.Vision RPA for Firefox with the XRealUser module installed.
The XType, XClick and XMove commands need the RealUser Simulation XModule to be installed.
XMouseWheel simulates mouse wheel scroll events. A negative value means to scroll down/zoom in and a positive value is used to scroll up or zoom out:
How to simulate mouse wheel scrolling with XMouseWheel: The video shows how to automate zooming in and out on Google Maps. First we "mouse over" the city of Oxford with XMove and then we zoom in (-200) and out again (+200). We use two XMouseWheel commands for each scroll direction simply to slow the scrolling/zooming down a bit, so it is easier to see.
Demo-XClick
The ready-to-import-and-run source code of all demo macros can be found in the Open-Source RPA software Github repository.
XMove, XType, XModules User Manual
As inspiration for what can be done with the XClick, XMove and XType commands, we compiled a list of web automation feature requests that we received from users in the months before the release of the RealUser XModule. Now, all of these tasks can be automated with the RealUser XModule.
How do I press "Ctrl C" and "Ctrl V" for copy and paste? It says can't find element - I'm trying to target cells on an excel sheet in google sheets and they don't have unique identifiers so I need to target "by coordinates" and then submit copy and paste WITHOUT caring what element i'm "sending" to.
I was trying to automate actions at https://www.ibm.com/... (URL removed) and would click on main documentation navigation headings (on the left) but the system couldn't detect when the AJAX response finished so it could verify an element. In retrospect I think it wasn't detecting the frame I was accessing.
It can't detect bootstrap modal.
Didn't work in Google Slides: I tried to record changing color and it didn't record anything except opening Google Slides (when it was already open)
I wanted to record click/drag operations in screen space, not on page elements. Ui.Vision RPA does not appear to support this.
add either the ability to send control keys or run an extension
I want mouseMove, mouseDown, mouseUp, keyboard events and timeline to record use cases for debugging purposes
Selenium IDE failed on my tests with our website (SPA, multiple frames). The error message was "can't access dead object"
compatibility with arrow, command/control etc. keys
In the website I'm testing, most links are accessed by hovering over a menu and then a submenu. When re-running a recorded macro, it failed after a time-out period when the link did not appear (because it did not perform the hover).
Probably outside the scope of the program, I'm looking for something with position based clicking rather than dom targeting.
flash support?
looked like it worked but didn't actually activate buttons within gmail app to delete emails. Nothing appeard in the Trash folder??
Add the possibility to handle the closing of the actual tab.
I needed to click on an icon extension, but it did not work
When the website is created with Angular JS, the widgets change dynamically so the IDs of the elements change and Ui.Vision RPA is not able to see the elements.
doubleclick not supported,
it doesnt work on calendars year navigation,
Don't record angular webs
I need it to simulate all mouse movements not just controls being clicked or text boxes being filled
All of this can be done now. So if one of these questions is similar to yours, and you are not sure how to solve it, please ask in our forum.