
Human-In-The-Loop

Use humans as agents for evaluation, demonstrations, and interactive control

The Agent SDK provides a human tool with native support for human-in-the-loop workflows: use a human as the agent to evaluate your environment and tools, or to create demonstrations. Specify the model string `human/human` directly, or compose it with a grounding model as `grounding_model+human/human`.

Getting Started

To start the human tool, run:

```shell
python -m agent.human_tool
```

The UI will show you pending completions. Select a completion to take control of the agent.
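Conceptually, the queue behaves like the minimal sketch below (an illustration only, not the SDK's actual implementation): completions wait in a pending list until a human selects one and supplies a response.

```python
# Illustrative model of the review queue the UI presents: each pending
# completion blocks the agent until a human responds to it.
pending = [
    {"id": "comp-1", "messages": [{"role": "user", "content": "Click OK"}]},
]

def respond(completion_id, action):
    # A human selects a pending completion and supplies the next action.
    for comp in pending:
        if comp["id"] == completion_id:
            pending.remove(comp)
            return {"id": completion_id, "output": action}
    raise KeyError(completion_id)

print(respond("comp-1", {"type": "click", "x": 100, "y": 200}))
```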

Usage Examples

Direct Human Agent

```python
from agent import ComputerAgent
from agent.computer import computer

# Route every completion to a human reviewer via the human tool
agent = ComputerAgent(
    "human/human",
    tools=[computer]
)

async for _ in agent.run("Take a screenshot, analyze the UI, and click on the most prominent button"):
    pass
```
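`agent.run` is consumed with `async for`, so it must execute inside an event loop. The runnable sketch below shows that pattern with a stand-in generator (`fake_run` is hypothetical and only mimics the yielding behavior, so the snippet works without the SDK installed):

```python
import asyncio

# Stand-in for agent.run(...): an async generator that yields one chunk
# per step, mimicking how the real agent streams responses.
async def fake_run(task):
    for step in ("screenshot", "analyze", "click"):
        yield {"step": step}

async def main():
    steps = []
    async for chunk in fake_run("Take a screenshot and click the button"):
        steps.append(chunk["step"])
    return steps

print(asyncio.run(main()))
```

With the real SDK, you would wrap the `async for` loop over `agent.run(...)` in the same kind of `asyncio.run(main())` entry point.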

Composed with Grounding Model

```python
# The grounding model resolves element descriptions to coordinates,
# while the human acts as the planner
agent = ComputerAgent(
    "huggingface-local/HelloKKMe/GTA1-7B+human/human",
    tools=[computer]
)

async for _ in agent.run("Navigate to the settings page and enable dark mode"):
    pass
```
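For illustration, the composed model string can be read as two components joined by `+` (this decomposition is an assumption about the string format, shown here only to make the two roles explicit):

```python
# Split a composed model string into its grounding and planning parts
model = "huggingface-local/HelloKKMe/GTA1-7B+human/human"
grounding, planner = model.split("+")
print(grounding)  # the local grounding model
print(planner)    # the human acting as planner
```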

Features

The human-in-the-loop interface provides:

  • Interactive UI: Web-based interface for reviewing and responding to agent requests
  • Image Display: Screenshots with click handlers for direct interaction
  • Action Accordions: Support for various computer actions (click, type, keypress, etc.)
  • Tool Calls: Full OpenAI-compatible tool call support
  • Real-time Updates: Smart polling for responsive UI updates
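As a sketch of the tool-call support, a request surfaced to the reviewer might carry an OpenAI-style function call like the following (field names follow the OpenAI function-calling format; the exact schema the SDK uses is not confirmed here):

```python
import json

# Hypothetical OpenAI-compatible tool call a human reviewer might
# inspect, approve, or edit in the UI.
tool_call = {
    "type": "function",
    "function": {
        "name": "computer",
        "arguments": json.dumps({"action": "click", "x": 320, "y": 240}),
    },
}

# Arguments arrive JSON-encoded, as in the OpenAI format
args = json.loads(tool_call["function"]["arguments"])
print(args["action"])
```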

Use Cases

  • Evaluation: Have humans evaluate agent performance and provide ground truth responses
  • Demonstrations: Create training data by having humans demonstrate tasks
  • Interactive Control: Take manual control when automated agents need human guidance
  • Testing: Validate agent, tool, and environment behavior manually
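For the demonstrations use case, one plausible way to structure captured data is as a list of observation/action pairs (an illustrative structure, not an SDK API):

```python
# Accumulate a human demonstration as a trajectory of
# (observation, action) records usable as training data.
trajectory = []

def record(observation, action):
    trajectory.append({"observation": observation, "action": action})

record("screenshot_001.png", {"type": "click", "x": 12, "y": 34})
record("screenshot_002.png", {"type": "type", "text": "dark mode"})
print(len(trajectory))
```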

For more details on the human tool implementation, see the Human Tool Documentation.