To use the runner's local API, first check that visual automation is enabled and which port it uses.
Once you know the port, open the API address in your browser.
If visual automation is enabled on a port other than the default 7777, change the port in the link accordingly.
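A minimal sketch of building endpoint URLs for a non-default port. The helper name `api_url` is hypothetical; only the host, the default port 7777 and the `/api/v1/...` paths come from this page.

```python
def api_url(path, port=7777, host="localhost"):
    """Build the URL of a local API endpoint for the given port."""
    return f"http://{host}:{port}/{path.lstrip('/')}"


print(api_url("/api/v1/debug"))        # default port
print(api_url("/api/v1/debug", 8080))  # runner configured on another port
```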
Debug
Enables or disables the debug overlay. Example: visualization of visual automation looking for the Chrome icon.
# Request
PUT http://localhost:7777/api/v1/debug
Parameters:
"enable": true, #enable/disable overlay
"time": 5 #number of seconds for which the overlay is to be visible
# Response
200 OK
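The request above could be prepared with Python's standard library roughly as follows. This is a sketch under one loud assumption: the page does not say how parameters are transmitted, so sending them as a JSON body is a guess. `build_debug_request` is a hypothetical helper name.

```python
import json
import urllib.request


def build_debug_request(enable, seconds, port=7777):
    """Prepare (but do not send) the PUT request that toggles the overlay."""
    # Assumption: parameters travel as a JSON object in the request body.
    body = json.dumps({"enable": enable, "time": seconds}).encode("utf-8")
    return urllib.request.Request(
        f"http://localhost:{port}/api/v1/debug",
        data=body,
        method="PUT",
        headers={"Content-Type": "application/json"},
    )


request = build_debug_request(True, 5)
print(request.get_method(), request.full_url)

# Sending it requires a running runner:
# with urllib.request.urlopen(request) as response:
#     print(response.status)
```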
Click
Clicks on the screen; several variants are available. To avoid any doubt about the coordinate system: coordinates are counted from the lower left corner of the given screen.
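Many tools report positions from the upper left corner, so a conversion may be needed before calling the click endpoints. The sketch below assumes pixel rows are numbered 0..height-1 in both systems; that off-by-one convention is an assumption, not something this page states, and `to_lower_left` is a hypothetical helper.

```python
def to_lower_left(x, y_from_top, screen_height):
    """Convert a top-left-origin point to a lower-left-origin point."""
    # Assumption: rows are indexed 0..screen_height-1 in both systems,
    # so the top row (0) maps to screen_height - 1.
    return x, screen_height - 1 - y_from_top


print(to_lower_left(123, 0, 1080))
```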
By coordinates
The application clicks in the given coordinates on the screen.
# Request
POST /api/v1/click/coordinates
Parameters:
"screen": 0, # screen index, counted from 0; defaults to 0
"button": 0, # mouse button code: "left", "right" or "middle"; defaults to 0 (left)
"double": false, # true for a double click; defaults to false
"x": 123,
"y": 1233
# Response
201 Created - When the click went through without a problem
422 Unprocessable Entity - When the click failed or the data is incorrect
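A sketch of a client for this endpoint that maps the two documented status codes to a boolean. Assumptions: parameters are sent as a JSON body, and the button is passed by name; the page shows both a numeric 0 and the names "left", "right", "middle", so the exact accepted form is uncertain. `click_payload` and `click_at` are hypothetical names.

```python
import json
import urllib.error
import urllib.request


def click_payload(x, y, screen=0, button="left", double=False):
    """Build the parameter set for a click at the given coordinates."""
    if button not in ("left", "right", "middle"):
        raise ValueError(f"unknown button: {button!r}")
    return {"screen": screen, "button": button, "double": double, "x": x, "y": y}


def click_at(x, y, port=7777, **options):
    """Return True on 201 Created, False on 422 Unprocessable Entity."""
    request = urllib.request.Request(
        f"http://localhost:{port}/api/v1/click/coordinates",
        data=json.dumps(click_payload(x, y, **options)).encode("utf-8"),
        method="POST",
        headers={"Content-Type": "application/json"},  # assumption: JSON body
    )
    try:
        with urllib.request.urlopen(request) as response:
            return response.status == 201
    except urllib.error.HTTPError as error:
        if error.code == 422:
            return False
        raise


print(click_payload(123, 1233))
```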
By image
The application finds a region on the screen that matches the supplied image and clicks the center of the match.
# Request
POST /api/v1/click/image
Parameters:
"screen": 0, # screen index, counted from 0; defaults to 0
"button": 0, # mouse button code: "left", "right" or "middle"
"double": false, # true for a double click; defaults to false
"index": 0, # when the image occurs several times, selects which occurrence to click: 0 is the first, 1 the next, and so on
"offset_x": 123, # may be positive or negative, for example -100, 0 or 100; defaults to 0
"offset_y": 1233, # may be positive or negative, for example -100, 0 or 100; defaults to 0
"threshold": 998 # optional; accuracy in the range 0..1000, converted internally to a float as x/1000.0, e.g. 999 => 0.999
+ the image to look for (png/jpg), uploaded as a multipart part of the request
# Response
200 ok
404 not found
422 invalid request data
503 lack of synchronization
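The sketch below shows one way to assemble such a multipart request body with the standard library, together with the threshold conversion described above. Everything about the wire format is an assumption: the page does not name the file field (the name "image" here is hypothetical) and does not say whether the other parameters are form fields.

```python
import json
import uuid


def certainty_to_threshold(certainty):
    """Convert a 0.0..1.0 certainty to the 0..1000 integer threshold."""
    if not 0.0 <= certainty <= 1.0:
        raise ValueError("certainty must be between 0.0 and 1.0")
    return round(certainty * 1000)


def encode_multipart(fields, file_field, filename, file_bytes,
                     content_type="image/png"):
    """Return (body, Content-Type header value) for a multipart/form-data upload."""
    boundary = uuid.uuid4().hex
    parts = []
    for name, value in fields.items():
        # Non-string values are rendered JSON-style (false, 998, ...).
        text = value if isinstance(value, str) else json.dumps(value)
        parts += [
            f"--{boundary}".encode(),
            f'Content-Disposition: form-data; name="{name}"'.encode(),
            b"",
            text.encode("utf-8"),
        ]
    parts += [
        f"--{boundary}".encode(),
        f'Content-Disposition: form-data; name="{file_field}"; '
        f'filename="{filename}"'.encode(),
        f"Content-Type: {content_type}".encode(),
        b"",
        file_bytes,
        f"--{boundary}--".encode(),
        b"",
    ]
    return b"\r\n".join(parts), f"multipart/form-data; boundary={boundary}"


fields = {"screen": 0, "double": False,
          "threshold": certainty_to_threshold(0.999)}
# "image" is a hypothetical field name; the bytes stand in for a real PNG file.
body, content_type = encode_multipart(fields, "image", "icon.png",
                                      b"\x89PNG\r\n\x1a\n")
print(content_type.split(";")[0], len(body))
```

The resulting `body` and `content_type` can be passed to `urllib.request.Request` as `data` and as the `Content-Type` header.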
By text
The application finds text on the screen and then clicks it.
# Request
POST /api/v1/click/text
Parameters:
"lang": "pl", # language of the text
"text": "lorem ipsum", # text to find
"screen": 0, # screen index, counted from 0; defaults to 0
"button": 0, # mouse button code: "left", "right" or "middle"
"double": false, # true for a double click; defaults to false
"index": 0, # when the text occurs several times, selects which occurrence to click: 0 is the first, 1 the next, and so on
"offset_x": 123, # may be positive or negative, for example -100, 0 or 100; defaults to 0
"offset_y": 1233, # may be positive or negative, for example -100, 0 or 100; defaults to 0
"roi_x": 0, # X coordinate of the region of interest (ROI) in which the text is searched. By default, the ROI covers the entire selected screen
"roi_y": 0, # Y coordinate of the ROI
"roi_w": 1920, # width of the ROI. NOTE: width and height extend to the right and down from the given coordinates
"roi_h": 1080, # height of the ROI
"black_text": true, # Controls how the screenshot is preprocessed before running OCR.
# A value of false optimizes detection for light text on a dark background at the expense of
# dark text on a light background. Optional value, useful if you have a problem
# with detecting light text on a dark background. The default value is true
"do_not_preprocess": false # Optional parameter. Allows you to disable image preprocessing
# for OCR purposes (except scaling to ~300dpi). It can help in special
# cases if none of the 'black_text' values gave good results
# Response
200 ok
404 not found
422 invalid request data
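The comments on `black_text` and `do_not_preprocess` suggest a fallback order when text is not found (404). A sketch of that order as a generator of parameter sets; the retry strategy itself is an illustration, not documented runner behavior, and `preprocessing_variants` is a hypothetical helper.

```python
def preprocessing_variants(base):
    """Yield copies of the base parameters, one per OCR preprocessing mode."""
    yield {**base, "black_text": True}         # default: dark text on light
    yield {**base, "black_text": False}        # light text on dark
    yield {**base, "do_not_preprocess": True}  # last resort: no preprocessing


base = {"lang": "pl", "text": "lorem ipsum", "screen": 0}
for params in preprocessing_variants(base):
    print(params)
```

A caller would send each variant in turn and stop at the first 200 response.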
Hover
Moves the cursor to the specified coordinates.
# Request
POST /api/v1/hover
Parameters:
"screen": 0, # defaults to 0, but counts from 0+
"x": 123,
"y": 1233
# Response
201 Created - When the cursor was moved without a problem
422 Unprocessable Entity - When the move failed or the data is incorrect
Type
Simulates pressing keys on a keyboard.
# Request
POST http://localhost:7777/api/v1/type
keys: "[control][space]" # presses together
keys: "safari[enter]" # writes and presses enter
keys: "jasiu[control][c]" # writes, presses together
keys: "[control][v]" # presses together
The "keys" syntax consists of regular characters and sticky characters.
Sticky characters that are directly next to each other are executed together.
When a group of sticky characters contains a modifier, it is executed
as mod_down + keys_press + mod_up.
Supported characters: 0-9, A-Z.
Supported keys: shift, space, escape, enter, control, win, alt, altgr.
The catalog of supported keys is expandable.
Default character input delay: ~100ms
# Response
201 Created - When the keys were typed without a problem
422 Unprocessable Entity - When typing failed or the data is incorrect
The keys parameter contains the sequence of keys pressed. Type simulates pressing real keys, which, unlike pasting text, allows you to use keyboard shortcuts native to your system such as Alt+F4 or CTRL+V.
Keys are pressed at 50ms intervals to faithfully replicate the way you type on a keyboard without the risk of generating an unpredictable situation in a text field. When pasting a value into a text field, many editors behave differently (a single event) than when there are dozens of key press events (a multi-event of changing the contents of a text field).
Modifier keys are so-called sticky keys (alt, win, control, altgr, shift). When executing a sequence, put the key codes in [].
Examples:
Pressing ctrl+c combination => keys="[control][c]"
Pressing alt+f4 combination => keys="[alt][f4]"
Press ctrl+l (go to browser bar), then type the address and press Enter => keys="[control][l]www.cloudflare.com[enter]"
Pressing ctrl+l and typing http://www.w3schools.com/HTML/tryit.asp?filename=tryhtml5_draganddrop in the browser bar and pressing Enter => keys="[CONTROL][L]https[shift][;]//www.w3schools.com/[shift][h][shift][t][shift][M][shift][L]/tryit.asp[shift][?]filename=tryhtml5[shift][-]draganddrop[enter]"
Maximize window with win key and up arrow => keys="[win][up]"
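To make the grouping rules concrete, here is a small client-side model of the keys syntax. It is an illustration of the rules described above, not the runner's actual parser: adjacent bracketed keys form one group, and a group with modifiers expands to mod_down + keys_press + mod_up. The function names and the event tuples are hypothetical.

```python
import re

# Modifier ("sticky") keys as listed in this section.
MODIFIERS = {"shift", "control", "alt", "altgr", "win"}
_TOKEN = re.compile(r"\[([^\[\]]+)\]|([^\[\]]+)")


def parse_keys(keys):
    """Split a keys string into ("text", str) and ("combo", [names]) steps."""
    steps = []
    for match in _TOKEN.finditer(keys):
        bracketed, plain = match.groups()
        if plain is not None:
            steps.append(("text", plain))
        elif steps and steps[-1][0] == "combo":
            # A bracketed key directly after another joins the same group.
            steps[-1][1].append(bracketed.lower())
        else:
            steps.append(("combo", [bracketed.lower()]))
    return steps


def expand_combo(names):
    """Expand one group into modifier-down, key-press, modifier-up events."""
    mods = [name for name in names if name in MODIFIERS]
    keys = [name for name in names if name not in MODIFIERS]
    return ([("down", m) for m in mods]
            + [("press", k) for k in keys]
            + [("up", m) for m in reversed(mods)])


print(parse_keys("[control][l]www.cloudflare.com[enter]"))
print(expand_combo(["control", "c"]))
```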
The key support catalog is expandable. The currently available keys can be seen in the screenshot below. In addition to the special keys, the characters 0-9 and A-Z are supported as standard.
OCR
OCR offers two operations: Find, which returns the coordinates of the text you are looking for, and Get, which retrieves the entire text from the screen.
Find
Finds the coordinates of a text on the screen.
# Request
POST http://localhost:7777/api/v1/ocr/find
Parameters:
"screen": 0, # screen index, counted from 0; defaults to 0
"language": "eng", # language of the text
"index": 0, # index of the occurrence we are looking for; 0 is the first found
"roi_x": 0, # X coordinate of the region of interest (ROI) in which the text is searched. By default, the ROI covers the entire selected screen
"roi_y": 0, # Y coordinate of the ROI
"roi_w": 1920, # width of the ROI. NOTE: width and height extend to the right and down from the given coordinates
"roi_h": 1080, # height of the ROI
"black_text": true, # Controls how the screenshot is preprocessed before running OCR.
# A value of false optimizes detection for light text on a dark background at the expense of
# dark text on a light background. Optional value, useful if you have a problem
# with detecting light text on a dark background. The default value is true
"do_not_preprocess": false # Optional parameter. Allows you to disable image preprocessing
# for OCR purposes (except scaling to ~300dpi). It can help in special
# cases if none of the 'black_text' values gave good results
# Response
200 ok
404 not found
422 invalid request data
Get
Retrieves all the text from the screen.
# Request
POST http://localhost:7777/api/v1/ocr/get
Parameters:
"screen": 0, # screen index, counted from 0; defaults to 0
"language": "eng", # language of the text
"index": 0, # index of the occurrence we are looking for; 0 is the first found
"roi_x": 0, # X coordinate of the region of interest (ROI) in which the text is searched. By default, the ROI covers the entire selected screen
"roi_y": 0, # Y coordinate of the ROI
"roi_w": 1920, # width of the ROI. NOTE: width and height extend to the right and down from the given coordinates
"roi_h": 1080, # height of the ROI
"black_text": true, # Controls how the screenshot is preprocessed before running OCR.
# A value of false optimizes detection for light text on a dark background at the expense of
# dark text on a light background. Optional value, useful if you have a problem
# with detecting light text on a dark background. The default value is true
"do_not_preprocess": false # Optional parameter. Allows you to disable image preprocessing
# for OCR purposes (except scaling to ~300dpi). It can help in special
# cases if none of the 'black_text' values gave good results
# Response
200 ok
404 not found
422 invalid request data
{
"plain_text": "Lorem ipsum, Lorem ipsum\n", # string with text dumped from the entire screen
"boxed_lines": [ # list of AltoTextLine objects
{
"x": 18,
"y": 8,
"w": 2299,
"h": 30,
"text_line": [
{
"x": 18,
"y": 8,
"w": 256, # width
"h": 30, # height
"certainty": 0.9,
"word": "Lorem"
},
{
"x": 18,
"y": 8,
"w": 256,
"h": 30,
"certainty": 0.3,
"word": "ipsum"
}
]
},
{
"x": 18,
"y": 8,
"w": 2299,
"h": 30,
"text_line": [
{
"x": 18,
"y": 8,
"w": 256,
"h": 30,
"certainty": 0.9,
"word": "Lorem"
},
{
"x": 18,
"y": 8,
"w": 256,
"h": 30,
"certainty": 0.3,
"word": "ipsum"
}
]
}
]
}
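Once decoded, a response of this shape can be filtered by certainty. A minimal sketch, assuming the body is JSON with exactly the field names shown above; the 0.5 cut-off and the helper name `confident_words` are arbitrary illustrative choices.

```python
def confident_words(ocr_result, min_certainty=0.5):
    """Return the words whose OCR certainty meets the minimum."""
    return [
        word["word"]
        for line in ocr_result["boxed_lines"]
        for word in line["text_line"]
        if word["certainty"] >= min_certainty
    ]


# Trimmed copy of the example response above (one line only).
sample = {
    "plain_text": "Lorem ipsum",
    "boxed_lines": [
        {
            "x": 18, "y": 8, "w": 2299, "h": 30,
            "text_line": [
                {"x": 18, "y": 8, "w": 256, "h": 30,
                 "certainty": 0.9, "word": "Lorem"},
                {"x": 18, "y": 8, "w": 256, "h": 30,
                 "certainty": 0.3, "word": "ipsum"},
            ],
        }
    ],
}

print(confident_words(sample))
```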
Visual automation
Finds an image on the screen.
# Request
POST http://localhost:7777/api/v1/visual/find
"screen": 0,
"index": 0, # optional; if omitted, a list of matches is returned
"threshold": 998 # optional
+ the image to look for (png/jpg), uploaded as a multipart part of the request
# Response
200 - ok
404 not found
422 invalid request data
If an index is specified, the response contains only one entry.
The x and y coordinates define the upper left corner of the detected area in
the coordinate system with (0,0) in the lower left corner.
[
{
"x": 765,
"y": 649,
"w": 99,
"h": 99,
"certainty": 1.000000
},
{
"x": 765,
"y": 1439,
"w": 99,
"h": 99,
"certainty": 1.000000
},
{
"x": 765,
"y": 1643,
"w": 99,
"h": 99,
"certainty": 1.000000
}
]
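Because y grows upward while the reported corner is the upper left one, the center of a match lies half a height below the reported y. A sketch of that arithmetic, derived from the description above rather than from the runner's code; `match_center` is a hypothetical helper.

```python
def match_center(match):
    """Return the center (x, y) of a match in lower-left-origin coordinates."""
    # x grows to the right: the center is half a width right of the corner.
    # y grows upward: the center is half a height below the upper edge.
    return match["x"] + match["w"] / 2, match["y"] - match["h"] / 2


matches = [
    {"x": 765, "y": 649, "w": 99, "h": 99, "certainty": 1.0},
    {"x": 765, "y": 1439, "w": 99, "h": 99, "certainty": 1.0},
]
for match in matches:
    print(match_center(match))
```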
Drag and drop
# Request
POST http://localhost:7777/api/v1/drag_and_drop
"screen": 0,
"button": 1,
"start_x": 123,
"start_y": 213,
"end_x": 123,
"end_y": 213,
"speed": 1 # time of action in seconds, defaults to 1
# Response
201 Created - done
422 invalid request data
Screenshot
Downloads a PNG with a screenshot of the selected monitor.
# Request
GET http://localhost:7777/api/v1/screenshot
"screen": 0
# Response
200 OK
422 invalid request data
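A sketch of fetching and validating the screenshot. Passing `screen` as a query-string parameter is an assumption, since the page does not say how a GET request carries it; the PNG signature check relies only on the fixed 8-byte header every PNG file starts with. The function names are hypothetical.

```python
import urllib.parse
import urllib.request

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def is_png(data):
    """Check that the bytes start with the standard PNG file signature."""
    return data.startswith(PNG_SIGNATURE)


def fetch_screenshot(screen=0, port=7777):
    """Download the screenshot bytes (requires a running runner)."""
    # Assumption: the screen index is passed in the query string.
    query = urllib.parse.urlencode({"screen": screen})
    url = f"http://localhost:{port}/api/v1/screenshot?{query}"
    with urllib.request.urlopen(url) as response:
        return response.read()


def save_screenshot(data, path):
    """Write the bytes to disk only if they look like a PNG."""
    if not is_png(data):
        raise ValueError("response is not a PNG image")
    with open(path, "wb") as handle:
        handle.write(data)


print(is_png(PNG_SIGNATURE + b"...image data..."))
```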
Developer guides
Finding by image
When searching for an image (visual find) or clicking an image (click image), it is crucial to prepare the reference (master) image correctly.
The quality of the results and the ability to match are closely linked to the size of the reference image. Pay attention to the DPI of the monitor on which the image will be searched. If the monitor operates at a high DPI (high pixel density), a search using a low-DPI image will fail. The reverse case, a high-DPI image on a low-DPI monitor, does not work either.
Searching with a higher-DPI image is possible if the image is first scaled down according to the DPI of the monitor. Information about the pixel density is available in the status.
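The scaling step amounts to multiplying the image size by the ratio of the two pixel densities. A sketch of that calculation; the DPI values are illustrative, and the actual resizing would be done with an image library of your choice. `scaled_size` is a hypothetical helper.

```python
def scaled_size(width, height, image_dpi, monitor_dpi):
    """Return the pixel size a reference image should have on the monitor."""
    scale = monitor_dpi / image_dpi
    # Never shrink a dimension below one pixel.
    return max(1, round(width * scale)), max(1, round(height * scale))


# An icon captured on a 192 DPI display, searched on a 96 DPI display.
print(scaled_size(200, 100, image_dpi=192, monitor_dpi=96))
```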
This requires the preparation of a master image of the selected icon.
A well-prepared icon
Poorly prepared icon
If the reference icon contains margins that are too large, the search will start returning a match on almost every round icon in the icon bar on the page. This happens because of the matching area: percentage-wise, the white margin and the "roundness" of the symbol become more important than the actual icon inside. A properly prepared icon can even give a 100% match (certainty 1.0).
For example, consider searching for or clicking the ruby icon on a website.