cortex desktop

The cortex desktop command provides cross-platform desktop automation tools that agents can use to interact with a GUI. It supports screenshots, mouse actions, keyboard input, and clipboard management on Linux, macOS, and Windows.

Usage

cortex desktop <subcommand> [options]

Subcommands

Subcommand    Description
dockerfile    Generate a Dockerfile for containerized desktop automation
entrypoint    Generate an entrypoint script for the desktop container
screenshot    Capture a screenshot (PNG or JPEG)
click         Mouse click at specified coordinates
type          Type text at the current focus
clipboard     Read or write clipboard contents

Platform Backends

Platform    Backend
Linux       xdotool (mouse/keyboard), ImageMagick import (screenshots), xclip (clipboard)
macOS       AppleScript + screencapture
Windows     PowerShell + Win32 APIs (Add-Type for GDI)
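
To make the Linux row concrete, here is a minimal sketch of how a few actions could map onto command lines for xdotool, ImageMagick's import, and xclip. The function name linux_argv and its parameter names are hypothetical, and the commands shown are the tools' standard invocations, not necessarily what cortex runs internally.

```python
from typing import List


def linux_argv(action: str, **params) -> List[str]:
    """Build an argv list for one action using the Linux tools in the table.

    Assumption: these are the tools' standard command lines, not
    necessarily the exact commands cortex executes.
    """
    if action == "screenshot":
        # ImageMagick's import captures the whole X display via the root window.
        return ["import", "-window", "root", params["path"]]
    if action == "move":
        return ["xdotool", "mousemove", str(params["x"]), str(params["y"])]
    if action == "click":
        # xdotool allows chaining: move the pointer, then press button 1 (left).
        return ["xdotool", "mousemove", str(params["x"]), str(params["y"]),
                "click", "1"]
    if action == "type":
        return ["xdotool", "type", params["text"]]
    if action == "get_clipboard":
        # -o prints the current clipboard selection to standard output.
        return ["xclip", "-selection", "clipboard", "-o"]
    if action == "set_clipboard":
        # -i makes xclip read the new clipboard contents from standard input.
        return ["xclip", "-selection", "clipboard", "-i"]
    raise ValueError(f"unsupported action: {action}")


if __name__ == "__main__":
    print(linux_argv("click", x=100, y=200))
```

Returning an argv list instead of a shell string avoids quoting problems when the typed text or file path contains spaces.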

Desktop Actions

Action           Description
screenshot       Capture screen (PNG/JPEG format)
click            Single mouse click
dblclick         Double mouse click
type             Type text string
keypress         Press a key with optional modifiers
drag             Mouse drag from point to point
get_clipboard    Read clipboard text
set_clipboard    Write text to clipboard
wait             Wait specified milliseconds
move             Move mouse to coordinates
scroll           Scroll mouse wheel
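
The action list above could be modeled as a small lookup table that checks a request before it is dispatched. This is only a sketch: the action names come from the table, but the request shape (a dict with an "action" key) and every parameter name (x, y, to_x, to_y, text, key, ms, amount) are assumptions made for illustration, not the documented schema.

```python
from typing import Any, Dict, List

# Action names are taken from the table; the parameter names are hypothetical.
REQUIRED_PARAMS: Dict[str, List[str]] = {
    "screenshot": [],
    "click": ["x", "y"],
    "dblclick": ["x", "y"],
    "type": ["text"],
    "keypress": ["key"],
    "drag": ["x", "y", "to_x", "to_y"],
    "get_clipboard": [],
    "set_clipboard": ["text"],
    "wait": ["ms"],
    "move": ["x", "y"],
    "scroll": ["amount"],
}


def missing_params(request: Dict[str, Any]) -> List[str]:
    """Return the required parameters that are absent from a request."""
    name = request.get("action")
    if name not in REQUIRED_PARAMS:
        raise ValueError(f"unknown action: {name!r}")
    return [p for p in REQUIRED_PARAMS[name] if p not in request]


if __name__ == "__main__":
    print(missing_params({"action": "drag", "x": 10, "y": 20}))
```

Validating up front means a malformed request fails before any mouse or keyboard event reaches the desktop.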

Containerized Execution

Desktop automation can run inside a Docker container built from the Dockerfile and entrypoint script that the dockerfile and entrypoint subcommands generate. Running in a container isolates GUI automation tasks from the host and makes them reproducible.
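
One way to assemble a build context from the two generator subcommands is sketched below. It assumes that cortex desktop dockerfile and cortex desktop entrypoint print their output to standard output, which this page does not confirm, and the file names Dockerfile and entrypoint.sh are likewise assumptions. The helper write_build_context is hypothetical.

```python
import subprocess
from pathlib import Path
from typing import Callable, List


def _capture(argv: List[str]) -> str:
    """Run a command and return what it wrote to standard output."""
    result = subprocess.run(argv, check=True, capture_output=True, text=True)
    return result.stdout


def write_build_context(
    out_dir: str,
    capture: Callable[[List[str]], str] = _capture,
) -> List[Path]:
    """Write the generated files into out_dir and return their paths.

    Assumption: both subcommands emit the generated file on stdout.
    """
    target = Path(out_dir)
    target.mkdir(parents=True, exist_ok=True)
    # Output file names are assumptions, not documented behavior.
    outputs = {
        "Dockerfile": ["cortex", "desktop", "dockerfile"],
        "entrypoint.sh": ["cortex", "desktop", "entrypoint"],
    }
    written = []
    for filename, argv in outputs.items():
        path = target / filename
        path.write_text(capture(argv))
        written.append(path)
    return written


if __name__ == "__main__":
    for path in write_build_context("desktop-image"):
        print(path)
```

The capture parameter exists so the helper can be exercised without cortex installed. If the assumptions hold, the resulting directory can then be passed to docker build as its build context.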

Examples

# Capture a screenshot
cortex desktop screenshot

# Generate Dockerfile for containerized desktop automation
cortex desktop dockerfile

# Generate entrypoint script
cortex desktop entrypoint