ScreenSpot-Pro
High-resolution GUI grounding benchmark
ScreenSpot-Pro is a benchmark for evaluating click prediction accuracy on high-resolution GUI screenshots with complex layouts.
Usage
# Run the benchmark
cd libs/python/agent/benchmarks
python ss-pro.py
# Run with custom sample limit
python ss-pro.py --samples 50
Results
| Model | Accuracy | Failure Rate | Samples |
|---|---|---|---|
| Coming Soon | - | - | - |
Results will be populated after running benchmarks with various models.
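As a rough illustration of how the table columns could be derived, the sketch below scores a click as correct when the predicted point falls inside the target element's bounding box, and counts a failure when the model returns no usable click. This is a minimal sketch, not the code in `ss-pro.py`: the `Sample` class, the `is_hit` and `summarize` helpers, and the exact definition of failure rate are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Sequence, Tuple

Point = Tuple[float, float]
BBox = Tuple[float, float, float, float]  # (x1, y1, x2, y2) in pixels


@dataclass
class Sample:
    """Hypothetical record for one benchmark item (not from ss-pro.py)."""
    bbox: BBox                   # ground-truth target element
    prediction: Optional[Point]  # None when the model gave no usable click


def is_hit(point: Point, bbox: BBox) -> bool:
    """Return True when the predicted click lies inside the target box."""
    x, y = point
    x1, y1, x2, y2 = bbox
    return x1 <= x <= x2 and y1 <= y <= y2


def summarize(samples: Sequence[Sample]) -> Dict[str, float]:
    """Compute the three table columns under the assumptions stated above."""
    total = len(samples)
    if total == 0:
        return {"accuracy": 0.0, "failure_rate": 0.0, "samples": 0}
    hits = sum(
        1 for s in samples
        if s.prediction is not None and is_hit(s.prediction, s.bbox)
    )
    # Assumption: a "failure" is a sample with no parseable prediction.
    failures = sum(1 for s in samples if s.prediction is None)
    return {
        "accuracy": hits / total,
        "failure_rate": failures / total,
        "samples": total,
    }


if __name__ == "__main__":
    demo = [
        Sample(bbox=(100, 100, 200, 150), prediction=(150, 120)),  # inside
        Sample(bbox=(100, 100, 200, 150), prediction=(300, 120)),  # outside
        Sample(bbox=(10, 10, 40, 40), prediction=None),            # no output
        Sample(bbox=(10, 10, 40, 40), prediction=(10, 40)),        # on the edge
    ]
    print(summarize(demo))
```

Under this definition a failed sample also counts against accuracy, since the denominator is the full sample count rather than only the samples with a valid prediction.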