Cua Documentation

Quickstart (for Developers)

Get up and running with cua in 5 simple steps.

Introduction

cua combines two components for automating desktop apps: Computer, the interface that performs clicks and typing, and Agent, the AI that provides the intelligence.

Set Up Your Computer Environment

Choose how you want to run your cua computer. Cloud containers are recommended for the easiest setup:

Cloud container (easiest and safest way to get started)

  1. Go to trycua.com/signin
  2. Navigate to Dashboard > Containers > Create Instance
  3. Create a Medium, Ubuntu 22 container
  4. Note your container name and API key

Your cloud container will be automatically configured and ready to use.

macOS (lume)

  1. Install lume cli
/bin/bash -c "$(curl -fsSL https://raw.githubusercontent.com/trycua/cua/main/libs/lume/scripts/install.sh)"
  2. Start a local cua container
lume run macos-sequoia-cua:latest

Windows Sandbox

  1. Enable Windows Sandbox (requires Windows 10 Pro/Enterprise or Windows 11)
  2. Install pywinsandbox dependency
pip install -U git+https://github.com/karkason/pywinsandbox.git
  3. Windows Sandbox will be automatically configured when you run the CLI

Docker

  1. Install Docker Desktop or Docker Engine
  2. Pull the CUA Ubuntu container
docker pull --platform=linux/amd64 trycua/cua-ubuntu:latest
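
For the cloud option, the container name and API key you noted can be kept out of source code. The following is a minimal sketch, assuming you export them yourself under the hypothetical variable names CUA_CONTAINER_NAME and CUA_API_KEY (cua is not known to read these on its own). It builds the same keyword arguments that the Computer example later in this guide passes by hand.

```python
import os
from typing import Mapping, Optional

# Hypothetical variable names chosen for this sketch; set them yourself.
NAME_VAR = "CUA_CONTAINER_NAME"
KEY_VAR = "CUA_API_KEY"


def load_cloud_settings(env: Optional[Mapping[str, str]] = None) -> dict:
    """Build keyword arguments for Computer(...) from environment variables."""
    env = os.environ if env is None else env
    # Collect every missing or empty variable so the error names all of them
    missing = [var for var in (NAME_VAR, KEY_VAR) if not env.get(var)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return {
        "os_type": "linux",
        "provider_type": "cloud",
        "name": env[NAME_VAR],
        "api_key": env[KEY_VAR],
    }
```

With both variables set, the result can be unpacked directly, for example Computer(**load_cloud_settings()).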

Install cua

pip install "cua-agent[all]" cua-computer

# or install specific providers
pip install "cua-agent[openai]"        # OpenAI computer-use-preview support
pip install "cua-agent[anthropic]"     # Anthropic Claude support
pip install "cua-agent[omni]"          # Omniparser + any LLM support
pip install "cua-agent[uitars]"        # UI-TARS
pip install "cua-agent[uitars-mlx]"    # UI-TARS + MLX support
pip install "cua-agent[uitars-hf]"     # UI-TARS + Huggingface support
pip install "cua-agent[glm45v-hf]"     # GLM-4.5V + Huggingface support
pip install "cua-agent[ui]"            # Gradio UI support

# TypeScript / Node.js
npm install @trycua/computer
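
After installing the Python packages, a quick sanity check can confirm that the modules are importable before you write any automation code. This is a small sketch using only the standard library. The module names computer and agent are taken from the import statements used later in this guide, and check_modules is a helper name made up for this example.

```python
import importlib.util


def check_modules(names):
    """Return {module_name: True/False} for whether each top-level module can be found."""
    return {name: importlib.util.find_spec(name) is not None for name in names}


if __name__ == "__main__":
    # Module names as imported in the examples in this guide
    for name, found in check_modules(["computer", "agent"]).items():
        print(f"{name}: {'ok' if found else 'missing'}")
```

A "missing" entry usually means the pip install ran in a different Python environment than the one running the check.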

Using Computer

Python:

import asyncio
from computer import Computer

async def main():
    # Connect to the cloud container created earlier
    async with Computer(
        os_type="linux",
        provider_type="cloud",
        name="your-container-name",
        api_key="your-api-key"
    ) as computer:
        # Take screenshot
        screenshot = await computer.interface.screenshot()

        # Click and type
        await computer.interface.left_click(100, 100)
        await computer.interface.type("Hello!")

asyncio.run(main())

TypeScript:

import { Computer, OSType } from '@trycua/computer';

const computer = new Computer({
  osType: OSType.LINUX,
  name: "your-container-name",
  apiKey: "your-api-key"
});

await computer.run();

try {
  // Take screenshot
  const screenshot = await computer.interface.screenshot();

  // Click and type
  await computer.interface.leftClick(100, 100);
  await computer.interface.typeText("Hello!");
} finally {
  // Always release the connection, even if an action fails
  await computer.close();
}
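
The examples above capture a screenshot but do nothing with it. The sketch below shows one way to persist it, assuming the Python screenshot call returns raw PNG bytes (an assumption, not confirmed by this guide). save_screenshot is a hypothetical helper, not part of cua.

```python
from pathlib import Path

# Every PNG file starts with this fixed 8-byte signature
PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def save_screenshot(data: bytes, path: str) -> Path:
    """Write screenshot bytes to disk, rejecting data that is not PNG."""
    if not data.startswith(PNG_SIGNATURE):
        raise ValueError("data does not start with the PNG signature")
    target = Path(path)
    target.write_bytes(data)
    return target
```

Inside the async with block you could then call save_screenshot(screenshot, "screenshot.png"). If the call turns out to return a different type or image format, the signature check will fail loudly rather than write an unreadable file.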

Using Agent

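import asyncio
from agent import ComputerAgent
from computer import Computer

async def main():
    # Reuse the connection settings from "Using Computer"
    async with Computer(
        os_type="linux",
        provider_type="cloud",
        name="your-container-name",
        api_key="your-api-key"
    ) as computer:
        agent = ComputerAgent(
            model="anthropic/claude-3-5-sonnet-20241022",
            tools=[computer],
            max_trajectory_budget=5.0
        )

        messages = [{"role": "user", "content": "Take a screenshot and tell me what you see"}]

        # Print every text message the agent produces as results stream in
        async for result in agent.run(messages):
            for item in result["output"]:
                if item["type"] == "message":
                    print(item["content"][0]["text"])

asyncio.run(main())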
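
The loop above prints only the first content part of each message. If you would rather collect the text for later use, a small helper along these lines may be convenient. It is a sketch that assumes each result has the shape the loop above relies on (an "output" list whose "message" items carry a "content" list of parts with a "text" field). collect_text and the sample data are made up for illustration and are not real agent output.

```python
def collect_text(result: dict) -> list:
    """Return the text of every message part in one agent result."""
    texts = []
    for item in result.get("output", []):
        if item.get("type") != "message":
            continue  # skip non-message items
        for part in item.get("content", []):
            # Only keep parts that carry a "text" field
            if isinstance(part, dict) and "text" in part:
                texts.append(part["text"])
    return texts


# Fabricated sample shaped like the result the loop above expects
sample = {
    "output": [
        {"type": "message", "content": [{"text": "I see a desktop."}]},
        {"type": "other"},
    ]
}
print(collect_text(sample))  # prints ['I see a desktop.']
```

Unlike indexing with ["content"][0], this version tolerates messages with no content parts or with several.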

Next Steps