Cua Documentation

Configuration

Detection Parameters

Box Threshold (0.3)

Controls the confidence threshold for accepting detections:

Figure: confidence thresholds in object detection. A high-confidence detection is accepted and a low-confidence detection is rejected.
  • Higher values (0.3) yield more precise but fewer detections
  • Lower values (0.01) catch more potential icons but increase false positives
  • The default of 0.3 is chosen to balance precision against recall
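
The effect of the box threshold can be sketched as a simple confidence filter. This is a minimal illustration, not the library's own code: the `Detection` class, its field names, and the `filter_by_confidence` helper are hypothetical, and whether the comparison is inclusive (`>=`) is an assumption.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Detection:
    # Hypothetical container; the field names are illustrative only.
    box: Tuple[float, float, float, float]  # (x1, y1, x2, y2) in pixels
    confidence: float


def filter_by_confidence(detections: List[Detection],
                         box_threshold: float = 0.3) -> List[Detection]:
    """Keep detections whose confidence is at least box_threshold."""
    return [d for d in detections if d.confidence >= box_threshold]


detections = [
    Detection((10, 10, 50, 50), 0.82),
    Detection((60, 10, 100, 50), 0.12),
    Detection((10, 60, 50, 100), 0.31),
]

strict = filter_by_confidence(detections, box_threshold=0.3)
permissive = filter_by_confidence(detections, box_threshold=0.01)
print(len(strict), len(permissive))  # 2 3
```

At 0.3 the 0.12-confidence box is dropped; at 0.01 all three candidates survive, which is where the extra false positives come from.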

IOU Threshold (0.1)

Sets the overlap (Intersection over Union) above which two detections are treated as duplicates and merged:

Figure: Intersection over Union (IOU). Two boxes with low overlap are kept separate; two boxes with high overlap are merged.
  • Lower values (0.1) remove overlapping boxes more aggressively
  • Higher values (0.5) allow more overlapping detections
  • Default is 0.1 to handle densely packed UI elements
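
To make the IOU threshold concrete, the sketch below computes IOU for axis-aligned boxes and applies a greedy, score-ordered suppression pass. It illustrates the general technique (standard non-maximum suppression) under the assumption that boxes are `(x1, y1, x2, y2)` tuples; the function names `iou` and `suppress_overlaps` are hypothetical, and the library's actual merging strategy may differ.

```python
def iou(a, b):
    """Intersection over Union of two (x1, y1, x2, y2) boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def suppress_overlaps(boxes, scores, iou_threshold=0.1):
    """Greedy suppression: visit boxes from highest to lowest score and keep
    a box only if its IOU with every already-kept box is <= iou_threshold.
    Returns the indices of the kept boxes."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    kept = []
    for i in order:
        if all(iou(boxes[i], boxes[k]) <= iou_threshold for k in kept):
            kept.append(i)
    return kept


boxes = [(0, 0, 10, 10), (5, 0, 15, 10), (20, 0, 30, 10)]
scores = [0.9, 0.8, 0.7]

# The first two boxes overlap with IOU = 50 / 150 = 1/3.
print(suppress_overlaps(boxes, scores, iou_threshold=0.1))  # [0, 2]
print(suppress_overlaps(boxes, scores, iou_threshold=0.5))  # [0, 1, 2]
```

With the default of 0.1, the second box is discarded because its overlap of 1/3 exceeds the threshold; raising the threshold to 0.5 lets both overlapping boxes through.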

OCR Configuration

  • Engine: EasyOCR
    • Primary choice for all platforms
    • Fast initialization and processing
    • Built-in English language support
    • GPU acceleration when available
  • Settings:
    • Timeout: 5 seconds
    • Confidence threshold: 0.5
    • Paragraph mode: Disabled
    • Language: English only
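
The settings above could map onto EasyOCR roughly as follows. This is a sketch under stated assumptions: `OCRSettings`, `keep_confident`, and `run_ocr` are hypothetical names, and because EasyOCR's `readtext` has no timeout parameter, the 5-second limit is approximated here with a worker thread. How the library actually enforces the timeout is not stated in this document.

```python
from concurrent.futures import ThreadPoolExecutor, TimeoutError as FutureTimeout
from dataclasses import dataclass


@dataclass(frozen=True)
class OCRSettings:
    # Values taken from the settings list; the class itself is illustrative.
    languages: tuple = ("en",)
    timeout_s: float = 5.0
    confidence_threshold: float = 0.5
    paragraph: bool = False


def keep_confident(results, threshold):
    """Filter (bbox, text, confidence) triples, the shape readtext returns
    when paragraph mode is off and detail is left at its default."""
    return [(bbox, text, conf) for bbox, text, conf in results
            if conf >= threshold]


def run_ocr(image_path, settings=OCRSettings(), gpu=True):
    """Run EasyOCR on one image and return the confident results."""
    import easyocr  # third-party; imported lazily so the helpers work without it

    reader = easyocr.Reader(list(settings.languages), gpu=gpu)
    pool = ThreadPoolExecutor(max_workers=1)
    future = pool.submit(reader.readtext, image_path,
                         paragraph=settings.paragraph)
    try:
        results = future.result(timeout=settings.timeout_s)
    except FutureTimeout:
        # Assumption: on timeout, give up and return no text. The worker
        # thread is not killed; it finishes in the background.
        results = []
    finally:
        pool.shutdown(wait=False)
    return keep_confident(results, settings.confidence_threshold)


sample = [
    ([[0, 0], [40, 0], [40, 12], [0, 12]], "File", 0.97),
    ([[50, 0], [90, 0], [90, 12], [50, 12]], "Ed1t", 0.34),
]
print([text for _, text, _ in keep_confident(sample, 0.5)])  # ['File']
```

Paragraph mode stays disabled in this sketch partly because EasyOCR's paragraph output omits per-item confidence, which the 0.5 filter relies on.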

Performance

Hardware Acceleration

MPS (Metal Performance Shaders)

  • Multi-scale detection (640px, 1280px, 1920px)
  • Test-time augmentation enabled
  • Half-precision (FP16)
  • Average detection time: ~0.4s
  • Best for production use when available

CPU

  • Single-scale detection (1280px)
  • Full-precision (FP32)
  • Average detection time: ~1.3s
  • Reliable fallback option
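
The two hardware profiles can be summarized as data, with the device chosen at runtime. This is an illustrative sketch: `DetectionProfile`, `PROFILES`, and `pick_device` are hypothetical names, and test-time augmentation being off on CPU is an assumption, since the list above mentions it only for MPS.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DetectionProfile:
    image_sizes: tuple            # input scales in pixels
    test_time_augmentation: bool
    half_precision: bool          # True = FP16, False = FP32


PROFILES = {
    "mps": DetectionProfile((640, 1280, 1920), True, True),
    # Assumption: TTA is disabled on CPU (not stated explicitly above).
    "cpu": DetectionProfile((1280,), False, False),
}


def pick_device():
    """Return "mps" when PyTorch reports Metal support, otherwise "cpu"."""
    try:
        import torch  # third-party; fall back to CPU if it is missing
    except ImportError:
        return "cpu"
    mps = getattr(torch.backends, "mps", None)  # absent on older PyTorch
    return "mps" if mps is not None and mps.is_available() else "cpu"


device = pick_device()
profile = PROFILES[device]
print(device, profile.image_sizes, "FP16" if profile.half_precision else "FP32")
```

Keeping the profiles in one table makes the trade-off explicit: the MPS path spends its speed advantage on multiple scales and augmentation, while the CPU path stays at a single scale.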

Example Output Structure

examples/output/
├── {timestamp}_no_ocr/
│   ├── annotated_images/
│   │   └── screenshot_analyzed.png
│   ├── screen_details.txt
│   └── summary.json
└── {timestamp}_ocr/
    ├── annotated_images/
    │   └── screenshot_analyzed.png
    ├── screen_details.txt
    └── summary.json
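
A short script can walk this layout and load each run's summary. The sketch below assumes only the directory and file names shown in the tree; `list_runs` is a hypothetical helper, and it makes no assumption about the keys inside `summary.json` or the format of `{timestamp}`.

```python
import json
from pathlib import Path


def list_runs(output_root):
    """Collect one record per run directory under output_root."""
    root = Path(output_root)
    if not root.is_dir():
        return []
    runs = []
    for run_dir in sorted(root.iterdir()):
        if not run_dir.is_dir():
            continue
        # Check "_no_ocr" first: that suffix also ends with "_ocr".
        if run_dir.name.endswith("_no_ocr"):
            mode = "no_ocr"
        elif run_dir.name.endswith("_ocr"):
            mode = "ocr"
        else:
            continue
        summary_path = run_dir / "summary.json"
        summary = (json.loads(summary_path.read_text(encoding="utf-8"))
                   if summary_path.is_file() else None)
        images = sorted((run_dir / "annotated_images").glob("*.png"))
        runs.append({"path": run_dir, "mode": mode,
                     "summary": summary, "images": images})
    return runs


for run in list_runs("examples/output"):
    print(run["mode"], run["path"].name, len(run["images"]), "image(s)")
```

Pairing the `_no_ocr` and `_ocr` runs that share a timestamp is then a matter of grouping the records by the directory name with its suffix removed.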