📒 Script Specifications

This section explains the specifications of scripts that can be used in PowerClicker.

Overview

  • PowerClicker uses Lua 5.2 as the scripting language.
  • In addition to standard Lua functions, PowerClicker provides custom functions (e.g., tap()) to define scenario actions.
  • When a scenario runs, the run() function is called automatically. This function must be implemented in the script.
  • If a scenario requires screen capture for image detection or text detection, the script must implement the should_capture_screen() function and return true. If false is returned or if should_capture_screen() is missing, the scenario will terminate when attempting to perform image or text detection.
  • If an error occurs while a script scenario is running, execution is interrupted and the error details are written to the Scenario Execution Log. Check the log to identify and resolve the issue.
  • The print() function in Lua is available for use. Any output from print() will be displayed in the Scenario Execution Log.
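
The requirements above can be sketched as a minimal script. The coordinates and log messages are placeholders chosen for illustration, not values taken from PowerClicker.

```lua
-- Minimal scenario script skeleton (illustrative; coordinates are placeholders).

-- Return true so that image and text detection are allowed to capture the screen.
function should_capture_screen()
    return true
end

-- Entry point: called automatically when the scenario runs.
function run()
    print("scenario started")  -- appears in the Scenario Execution Log
    tap(540, 960, 100)         -- tap at (540, 960), held for 100 ms
    delay(500)                 -- wait 500 ms
    print("scenario finished")
end
```

A script that performs no image or text detection can presumably omit should_capture_screen(), since the text above only requires it for those two features.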

PowerClicker Custom Functions

delay(milliSeconds)

Pauses execution for the given time (in milliseconds).

  • Arguments
    • milliSeconds: Time to wait in milliseconds.
  • Return Value

    None

tap(x, y, duration)

Taps on the screen at the given (x, y) coordinates for the specified duration.

  • Arguments
    • x: X-coordinate of the tap position.
    • y: Y-coordinate of the tap position.
    • duration: Time (in milliseconds) to hold the tap.
  • Return Value

    None

gesture(path, duration)

Performs a gesture operation on the screen using a specified path. This function allows you to execute a single-finger gesture by providing a sequence of keyframes.

  • Arguments
    • path: Table. A list of keyframes defining the gesture path.
      Each keyframe is specified as {x, y, frame}, where:
      • x : X-coordinate of the touch position.
      • y : Y-coordinate of the touch position.
      • frame: Time in milliseconds from the start of the gesture when the touch point reaches this coordinate.

      Example: Swipe from (500, 500) to (100, 100) in 100 milliseconds

      gesture({{500, 500, 0}, {100, 100, 100}}, 100)
    • duration: The execution time for the gesture.
      • If the duration matches the last keyframe’s frame value, the gesture will move at a constant speed.
      • If a smaller duration is specified, the gesture will execute faster than the original timing defined in the keyframes.
  • Return Value

    None
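
A straight swipe needs only two keyframes, so the path can be built by a small helper. This is a sketch: swipe_path and swipe are hypothetical names, and the coordinates are placeholders.

```lua
-- Hypothetical helper: build a two-keyframe path for a straight swipe.
function swipe_path(x1, y1, x2, y2, duration_ms)
    return {
        {x1, y1, 0},            -- start point at 0 ms
        {x2, y2, duration_ms},  -- end point reached at duration_ms
    }
end

-- Hypothetical helper: run the swipe, using the last frame value as the duration.
function swipe(x1, y1, x2, y2, duration_ms)
    gesture(swipe_path(x1, y1, x2, y2, duration_ms), duration_ms)
end

function run()
    swipe(500, 1500, 500, 500, 300)  -- swipe upward over 300 ms
end
```

Curved or multi-segment gestures would add intermediate keyframes with increasing frame values between the first and last entries.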

gesture(path_list, duration)

Performs a gesture operation on the screen using multiple paths, allowing for multi-finger gestures (e.g., pinch, zoom, multi-finger swipes).

  • Arguments
    • path_list: Table. A list of paths, where each path represents the movement of a single finger. Each path consists of keyframes defined as {x, y, frame}, where:
      • x : X-coordinate of the touch position.
      • y : Y-coordinate of the touch position.
      • frame: Time in milliseconds from the start of the gesture when the touch point reaches this coordinate.

      Example: Spread gesture (two fingers moving apart) centered around (500, 500) over 1000 ms

      gesture({{{450, 450, 0}, {200, 200, 1000}}, {{550, 550, 0}, {800, 800, 1000}}}, 1000)
    • duration: The execution time for the gesture.
      • If the duration matches the last keyframe’s frame value, the gesture will move at a constant speed.
      • If a smaller duration is specified, the gesture will execute faster than the original timing defined in the keyframes.
  • Return Value

    None
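
Two-finger paths are usually symmetric around a center point, so they can be generated rather than typed by hand. The sketch below assumes diagonal finger movement; spread_paths is a hypothetical helper, not a PowerClicker function.

```lua
-- Hypothetical helper: build two finger paths that move diagonally
-- from start_offset to end_offset away from the center (cx, cy).
function spread_paths(cx, cy, start_offset, end_offset, duration_ms)
    local finger1 = {
        {cx - start_offset, cy - start_offset, 0},
        {cx - end_offset, cy - end_offset, duration_ms},
    }
    local finger2 = {
        {cx + start_offset, cy + start_offset, 0},
        {cx + end_offset, cy + end_offset, duration_ms},
    }
    return {finger1, finger2}
end

function run()
    -- Fingers start 50 units from (500, 500) and end 300 units away, over 1000 ms.
    gesture(spread_paths(500, 500, 50, 300, 1000), 1000)
end
```

Swapping the two offsets (for example 300 and 50) makes the fingers move toward each other instead.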

input_text(text)

Inputs text into the currently focused text input field.

  • Arguments
    • text: The text to be entered.
  • Return Value

    Whether the text input was successful

input_text(element, text)

Inputs text into the specified text input field.

  • Arguments
    • element: The element information of the text input field.
      This can be obtained as the return value of detect_ui.
    • text: The text to be entered.
  • Return Value

    Whether the text input was successful
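
Because input_text reports whether the input succeeded, a script can log failures instead of continuing silently. The sketch below assumes the return value is true on success and false otherwise; type_text is a hypothetical helper and the tap coordinates are placeholders.

```lua
-- Hypothetical helper: type into the focused field and log a failure.
function type_text(text)
    local ok = input_text(text)
    if not ok then
        print("input_text failed for: " .. text)  -- shown in the Scenario Execution Log
    end
    return ok
end

function run()
    tap(540, 400, 100)  -- placeholder position of a text field, tapped to focus it
    delay(300)
    type_text("hello")
end
```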

detect_image(image_id, require_detection_count, parameters, timeout)

Performs image detection. The image used for detection must be pre-registered in the app.

detect_image returns the detection result.
You can use the detected bounding boxes in the result to perform actions such as tapping the detected image.

  • Example: Detect Image ID “3” and Tap its Center
    local result = detect_image("3", 1, {grayscale = true, error_tolerance = 70, scaling_level = 0}, 1000)
    if result ~= nil then
        local r1 = result.rects[1] -- The first detection rectangle {x, y, width, height}
        -- Tap the center of the detected image
        tap(r1[1] + r1[3] * 0.5, r1[2] + r1[4] * 0.5, 100)
        delay(100)
    else
        print("Image 3 was not detected") -- Shown in the Scenario Execution Log
    end
  • Arguments
    • image_id: String. The ID of the image to detect.
    • require_detection_count: The minimum number of successful detections required to return a result. If fewer than this number of images are detected, the function returns nil. Use 2 or higher when handling multiple detections (e.g., drag between detected targets).
    • parameters: A table specifying detection parameters. If omitted, default values are used.
      • max_detection: Maximum number of detections to return. Must be ≥ require_detection_count. Default is 1.
      • target_rect: Specifies the detection area in rectangle format. Default is fullscreen.
      • grayscale: Enables grayscale detection. Default is true.
      • error_tolerance: Tolerance for matching errors (range: 0–100). Lower values make detection stricter. Default is 70.
      • scaling_level: Controls scaling for detection. Smaller values shrink the image for efficiency. Scaling factor is 2^(scaling_level). Default is 0.

      For more details, refer to Image Detection Step Settings.

    • timeout: Specifies the detection timeout in milliseconds.
  • Return Value

    The function returns a table containing the detection result. If detection fails, it returns nil.

    The table format is { rects = list of detected bounding boxes }. Each detected bounding box is in rectangle format.
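
When max_detection is greater than 1, result.rects can hold several bounding boxes. The sketch below taps the center of each one. The image ID "3", the limit of 5, and the timings are placeholders, and rect_center is a hypothetical helper.

```lua
-- Hypothetical helper: center point of a rectangle {x, y, width, height}.
function rect_center(rect)
    return rect[1] + rect[3] * 0.5, rect[2] + rect[4] * 0.5
end

function should_capture_screen()
    return true  -- image detection requires screen capture
end

function run()
    -- Request up to 5 matches of the placeholder image "3"; at least 1 is required.
    local result = detect_image("3", 1, {max_detection = 5}, 2000)
    if result == nil then
        print("image not found")
        return
    end
    for i, rect in ipairs(result.rects) do
        local x, y = rect_center(rect)
        print("tapping detection " .. i)
        tap(x, y, 100)
        delay(200)
    end
end
```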

detect_text(text, require_detection_count, parameters, timeout)

Performs text detection.

detect_text returns the detection result.
The detected bounding boxes can be used to perform actions such as tapping the detected text.

  • Example

    Detect the text “Test” or “Scenario”, and tap the center of the detected bounding box.

    local result = detect_text({"Test", "Scenario"}, 1, {matches_only_element = false, scaling_level = 0}, 1000)
    if result ~= nil then
        local r1 = result.rects[1] -- The first detection rectangle {x, y, width, height}
        -- Tap the center of the detected text
        tap(r1[1] + r1[3] * 0.5, r1[2] + r1[4] * 0.5, 100)
        delay(100)
    else
        print("Text was not detected") -- Shown in the Scenario Execution Log
    end
  • Arguments
    • text: The text to detect. Can be a string or an array of strings.
    • require_detection_count: The minimum number of detections required. Returns a value only if this number or more detections are found. Use 2 or more for operations like dragging between detected targets.
    • parameters: A table specifying detection parameters. If omitted, default values are used.
      • target_rect: Specifies the detection area in rectangle format. Default is fullscreen.
      • matches_only_element: If true, detection succeeds only when an entire detected element matches. Default is false.
      • scaling_level: Controls scaling for detection. Smaller values shrink the image for efficiency. The scale factor is 2^(scaling_level). Default is 0.

      See Text Detection Step Settings for more details.

    • timeout: Specifies the detection timeout in milliseconds.
  • Return Value

    A table representing the text detection result. Returns nil if detection fails.

    The table format is:

    {
        rects = { ... },  -- Array of detected bounding boxes
        text = "detected_text",  -- Detected text string
        rect_map = { ["detected_text"] = { bounding_boxes } }  -- Mapping of detected texts to bounding boxes
    }

    Since multiple texts can be detected at once, all detected information is included in rect_map.
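
The rect_map field can be walked with pairs() to see which texts were found. This sketch assumes each rect_map value is an array of rectangles, as the format above suggests; count_by_text is a hypothetical helper.

```lua
function should_capture_screen()
    return true  -- text detection requires screen capture
end

-- Hypothetical helper: count the bounding boxes recorded for each detected text.
function count_by_text(rect_map)
    local counts = {}
    for text, boxes in pairs(rect_map) do
        counts[text] = #boxes
    end
    return counts
end

function run()
    local result = detect_text({"Test", "Scenario"}, 1,
        {matches_only_element = false, scaling_level = 0}, 1000)
    if result == nil then
        print("no text detected")
        return
    end
    for text, count in pairs(count_by_text(result.rect_map)) do
        print(text .. ": " .. count .. " match(es)")
    end
end
```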

detect_ui(condition, require_detection_count, settings, timeout)

Performs screen element detection.

detect_ui returns the detection result.
You can use the detection result, including the bounding rectangles and element information, to perform actions such as text input on the detected element.

  • Example
    Detect a text input field that contains the text “Type” and input “Test” into it:
    local result = detect_ui(
        {
            class_name = "android.widget.EditText",
            contained_text = "Type"
        },
        1,
        {
            match_settings = {
                class_name = "required",
                contained_text = "required"
            },
            text_match_settings = {
                contained_text = "exact"
            }
        },
        1000
    )
    if result ~= nil then
        -- Enter text into the best-matching element
        input_text(result.candidates[1], "Test")
        delay(100)
    else
        print("Text input field was not detected") -- Shown in the Scenario Execution Log
    end
  • Arguments

    • condition: A table defining the target element conditions. Use the following keys to specify attributes such as the app name:
      Attribute         | Key                 | Description
      App Name          | package_name        | The identifier (package name) of the app that contains the target screen element.
      Element Type      | class_name          | The type of screen element to detect.
      Internal ID       | view_id             | The internal ID (viewId) assigned to the screen element. This can be used when defined by the app developer.
      Text Content      | text                | The text displayed on the screen element.
      Contained Text    | contained_text      | Text that is displayed inside the specified element.
      Description       | content_description | The content description assigned to the screen element. This may differ from the visually displayed text.
      Element Position  | bounds              | The rectangular area (bounding box) representing the position and size of the element on the screen.
      Element Structure | path                | A number representing the order of the element in the screen hierarchy, indicating its position within the structure.
    • require_detection_count: The minimum number of detections required. Returns a value only if this number or more detections are found. Use 2 or more for operations like dragging between detected targets.
    • settings: A table to configure matching behavior.
      • clickable_only: A boolean value that specifies whether to limit detection to clickable elements. Defaults to false if omitted.
      • editable_only: A boolean value that specifies whether to limit detection to text-editable elements. Defaults to false if omitted.
      • match_settings: A table that defines how each attribute should be used for matching. Unspecified attributes are ignored. Valid values:
        Usage mode | Value     | Description
        Required   | required  | If this does not match, the element will be excluded.
        Preferred  | preferred | If this matches, the element will be prioritized among candidates.
        Ignored    | ignored   | This attribute will not be used for comparison.
      • text_match_settings: A table defining how to match text-related fields (text, contained_text, content_description). Valid values:
        Text matching method | Value       | Description
        Exact Match          | exact       | Matches elements whose text exactly equals the given text.
        Contains             | contains    | Matches elements that contain the given text.
        Starts With          | starts_with | Matches elements that start with the given text.
        Ends With            | ends_with   | Matches elements that end with the given text.
    • timeout: Specifies the detection timeout in milliseconds.
  • Return Value

    A table representing the result of screen element detection.

    The result is in the format { rects = array of bounding rectangles, candidates = array of detected elements }. Both arrays are sorted based on the “preferred” match settings.
    The most relevant results are rects[1] and candidates[1].
    You can pass candidates[1] to input_text or other related functions.
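
The bounding rectangles can also drive a tap. The sketch below looks for a clickable element by exact text, ranks buttons first, and taps the best match. It assumes that detect_ui returns nil when nothing is found and that rects uses the rectangle format; tap_element_with_text, the class name, and the label "OK" are illustrative.

```lua
-- Hypothetical helper: find a clickable element by its exact text and tap its center.
-- Returns true if an element was found and tapped, false otherwise.
function tap_element_with_text(label, timeout_ms)
    local result = detect_ui(
        { text = label, class_name = "android.widget.Button" },
        1,
        {
            clickable_only = true,
            match_settings = {
                text = "required",         -- elements without this text are excluded
                class_name = "preferred",  -- buttons are ranked ahead of other matches
            },
            text_match_settings = { text = "exact" },
        },
        timeout_ms
    )
    if result == nil then
        return false
    end
    local r = result.rects[1]  -- best-ranked bounding rectangle
    tap(r[1] + r[3] * 0.5, r[2] + r[4] * 0.5, 100)
    return true
end

function run()
    if not tap_element_with_text("OK", 1000) then
        print("OK button not found")
    end
end
```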

launch_app(package_name)

Launches the specified app.

  • Arguments
    • package_name: The package name of the app to launch.
  • Return Value

    None

loop_continue()

Performs the same action as the “Repeat this scenario” step.

  • Arguments

    None

  • Return Value

    None

scenario_return()

Performs the same action as the “Return to previous scenario” step.

  • Arguments

    None

  • Return Value

    None

stop_all()

Performs the same action as the “End all scenario playback” step.

  • Arguments

    None

  • Return Value

    None
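
The app launch and flow-control functions can be combined as in the sketch below. The package name, the text "Welcome", and the timings are placeholders. Whether statements after loop_continue() or stop_all() still execute is not stated here, so the sketch returns immediately after calling them.

```lua
function should_capture_screen()
    return true  -- text detection requires screen capture
end

function run()
    launch_app("com.example.app")  -- placeholder package name
    delay(3000)                    -- give the app time to start

    local result = detect_text("Welcome", 1,
        {matches_only_element = false, scaling_level = 0}, 5000)
    if result == nil then
        print("start screen not found; ending playback")
        stop_all()       -- same as the "End all scenario playback" step
        return
    end

    local r = result.rects[1]
    tap(r[1] + r[3] * 0.5, r[2] + r[4] * 0.5, 100)
    loop_continue()      -- same as the "Repeat this scenario" step
    return
end
```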

Custom Format

Rectangle Format

Represents a rectangle as a numerical array in the format:

{top-left x, top-left y, width, height}

Example: A rectangle with a top-left position at (0, 100), a width of 200, and a height of 50:

{0, 100, 200, 50}
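
Because the format stores width and height rather than the opposite corner, the right and bottom edges have to be computed. The helpers below are a hypothetical sketch of that arithmetic, not PowerClicker functions.

```lua
-- Hypothetical helpers for rectangles in the form {x, y, width, height}.

-- Returns the left, top, right, and bottom edges.
function rect_edges(rect)
    return rect[1], rect[2], rect[1] + rect[3], rect[2] + rect[4]
end

-- Returns true if the point (x, y) lies inside the rectangle
-- (the right and bottom edges are treated as outside).
function rect_contains(rect, x, y)
    local left, top, right, bottom = rect_edges(rect)
    return x >= left and x < right and y >= top and y < bottom
end

local example = {0, 100, 200, 50}
print(rect_edges(example))               -- left 0, top 100, right 200, bottom 150
print(rect_contains(example, 100, 125))  -- true
```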