Computer Vision

With your project properly loaded, you can now use BotCity Studio to generate code automatically through computer vision.

The UI tab is where the magic happens: there you select the clippings and actions so that BotCity Studio can generate code that interacts with these elements, reproducing the indicated action.

Capturing elements for interaction

Bringing the target screen into the Studio

The first step to start the capture process is to bring into BotCity Studio the screen you want to interact with.

To do this, press 'Print Screen' on your keyboard to capture the screen.

BotCity Studio will automatically load this screenshot on the UI tab.


On macOS, use F9 or ⏩ instead of Print Screen due to OS limitations.

As an example, let's open the BotCity website, and follow the steps above.

Take a look at how the BotCity Studio looks after the capture:


Selecting elements for interaction

With the screen loaded in the UI tab, click once on the element or image you want to interact with so that BotCity Studio can zoom in on it.

Now, click and drag the mouse to select the area you want to interact with (outlined in red).

In this example, we will interact with the 'Sign Up' button, as displayed in the image below.


When the crop is finished, the crop actions window will open.

Here, you should name this crop and choose one of the available actions to be performed after the image is found.

Available actions

  • Find: Find an element defined by the label on the screen.
  • Click: Click on the found element.
  • Move: Move to the center position of the found element.
  • Click Relative: Click at a position relative to the found element.
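In the generated Python code, these four actions correspond to method calls on the bot (in the botcity-framework-core package, `find`, `click`, `move`, and `click_relative` on `DesktopBot`). The sketch below uses a stand-in class that only records the calls, so it runs without a display; the crop name `sign_up` is a hypothetical label from this example.

```python
# Illustrative stub mapping the four crop actions to bot method calls.
# In real generated code these are methods of botcity.core.DesktopBot;
# this stand-in just records the calls so the sketch runs anywhere.
class StubBot:
    def __init__(self):
        self.calls = []

    def find(self, label, matching=0.97, waiting_time=10000):
        # Find: search the screen for the image cropped under `label`.
        self.calls.append(("find", label))
        return True  # pretend the element was found

    def click(self):
        # Click: click on the element located by the last find().
        self.calls.append(("click",))

    def move(self):
        # Move: move the cursor to the center of the found element.
        self.calls.append(("move",))

    def click_relative(self, x, y):
        # Click Relative: click at an (x, y) pixel offset from the
        # found element.
        self.calls.append(("click_relative", x, y))

bot = StubBot()
if bot.find("sign_up"):  # Find the crop named "sign_up"...
    bot.click()          # ...then perform the chosen action on it.
```

The find-then-act pattern is the key point: every action operates on the element located by the most recent successful `find`.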

After configuring everything, press Submit and check the automatically generated code in the Code tab.
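The code that appears in the Code tab typically follows the pattern sketched below for a crop named "sign_up" with the Click action (exact output may vary by Studio version). Real bots subclass `botcity.core.DesktopBot`; a tiny stand-in is used here so the example runs without a screen.

```python
# Hedged sketch of the snippet BotCity Studio typically generates for a
# crop named "sign_up" with the Click action. FakeDesktopBot stands in
# for botcity.core.DesktopBot so the example is self-contained.
class FakeDesktopBot:
    def find(self, label, matching=0.97, waiting_time=10000):
        print(f"looking for '{label}' (matching={matching})")
        return True  # pretend the crop was located on the screen

    def not_found(self, label):
        raise ValueError(f"Element not found: {label}")

    def click(self):
        print("clicked on the last found element")

class Bot(FakeDesktopBot):
    def action(self):
        # This is the shape of the generated code: try to find the
        # crop, report failure if it is missing, then click it.
        if not self.find("sign_up", matching=0.97, waiting_time=10000):
            self.not_found("sign_up")
        self.click()

Bot().action()
```

The `matching` parameter is the image-similarity threshold and `waiting_time` is the search timeout in milliseconds; both can be tuned after generation.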



Capture Visual Documentation


You can enable the visual documentation tool in BotCity Studio by clicking the icon above.

This icon helps to document the construction of your automation.

With it, you can select the area of the screen you are interacting with at a given moment and display it alongside your code in BotCity Studio.

This feature is handy for refactoring, identifying errors quickly, and understanding what the screen looked like when the automation was built.



With BotCity Studio, you can automate any system, including desktop, web, and legacy/terminal environments.