Introduction
agent-device is a CLI that lets agents automate iOS simulators and physical devices, as well as Android emulators and physical devices. It provides:
- Accessibility snapshots for UI understanding
- Deterministic interactions (tap, type, scroll)
- Session-aware workflows and replay
If you know agent-browser, this is the mobile-native counterpart for iOS/Android UI automation.
For exploratory QA and bug-hunting workflows, see skills/dogfood/SKILL.md in this repository.
What it’s good at
- Capturing structured UI state for LLMs
- Driving common UI actions with refs or semantic selectors
- Replaying flows for regression checks
Platform support highlights
- iOS core runner commands: snapshot, diff snapshot, wait, click, fill, get, is, find, press, long-press, focus, type, scroll, scrollintoview, back, home, app-switcher, open (app), close, screenshot, apps, appstate, install, install-from-source, reinstall, trigger-app-event.
- iOS appstate is session-scoped on the selected target device.
- iOS simulator-only: alert, pinch, settings, push, clipboard.
- Session performance metrics: perf/metrics is available on iOS and Android and currently reports startup timing sampled from open command round-trip duration.
- iOS record supports simulators and physical devices.
  - Simulators use native simctl io ... recordVideo.
  - Physical devices use runner screenshot capture (XCUIScreen.main.screenshot() frames) stitched into MP4, so FPS is best-effort (not guaranteed 60 even with --fps 60).
  - Physical-device recording requires an active app session context (open <app> first).
  - Physical-device recording defaults to 15 FPS and supports --fps caps.
- Android supports the same core interaction set, plus push notification simulation, clipboard read/write, and keyboard status|get|dismiss via adb shell commands.
- App-event triggers are available on iOS and Android through app-defined deep-link hooks (trigger-app-event), using active session context or explicit device selectors.
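To make the "startup timing sampled from open command round-trip duration" idea concrete, here is a minimal Python sketch of timing a single request/response round trip. It is an illustration only, not agent-device's implementation: timed_round_trip and fake_open are hypothetical names, and fake_open merely stands in for sending an open request and waiting for its reply.

```python
import time
from typing import Callable, Tuple, TypeVar

T = TypeVar("T")


def timed_round_trip(
    send: Callable[[], T],
    clock: Callable[[], float] = time.perf_counter,
) -> Tuple[T, float]:
    """Run `send` and return (result, elapsed time in milliseconds)."""
    start = clock()
    result = send()  # blocks until the reply arrives
    elapsed_ms = (clock() - start) * 1000.0
    return result, elapsed_ms


def fake_open() -> str:
    # Hypothetical stand-in for an `open` request; the sleep simulates app launch latency.
    time.sleep(0.01)
    return "ok"


result, startup_ms = timed_round_trip(fake_open)
print(f"open -> {result}, startup sample: {startup_ms:.1f} ms")
```

A figure measured this way includes transport and dispatch overhead as well as the app launch itself, so it is best read as a sample rather than a precise launch time.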
Architecture (high level)
- CLI sends requests to the daemon.
- The daemon manages sessions and dispatches to platform drivers.
- iOS uses an XCTest runner for snapshots and input on both simulators and physical devices.
- Android uses ADB-based tooling.
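The flow above (CLI request, daemon, session lookup, platform driver) can be sketched in a few lines of Python. This is a toy model under stated assumptions, not the project's code: Daemon, IOSDriver, AndroidDriver, open_session, and handle are hypothetical names, and the drivers return placeholder strings instead of talking to an XCTest runner or adb.

```python
from dataclasses import dataclass, field
from typing import Dict, Protocol


class Driver(Protocol):
    def run(self, command: str) -> str: ...


class IOSDriver:
    def run(self, command: str) -> str:
        # Placeholder: a real driver would forward the command to an XCTest runner.
        return f"ios-runner:{command}"


class AndroidDriver:
    def run(self, command: str) -> str:
        # Placeholder: a real driver would invoke ADB-based tooling.
        return f"adb:{command}"


@dataclass
class Daemon:
    drivers: Dict[str, Driver]
    # Maps session name to the platform that session is bound to.
    sessions: Dict[str, str] = field(default_factory=dict)

    def open_session(self, name: str, platform: str) -> None:
        if platform not in self.drivers:
            raise ValueError(f"unsupported platform: {platform}")
        self.sessions[name] = platform

    def handle(self, session: str, command: str) -> str:
        # Look up the session's platform, then dispatch to the matching driver.
        platform = self.sessions[session]
        return self.drivers[platform].run(command)


daemon = Daemon(drivers={"ios": IOSDriver(), "android": AndroidDriver()})
daemon.open_session("s1", "ios")
daemon.open_session("s2", "android")
print(daemon.handle("s1", "snapshot"))  # prints "ios-runner:snapshot"
print(daemon.handle("s2", "snapshot"))  # prints "adb:snapshot"
```

Keeping session state in the daemon rather than in the CLI means each CLI invocation can stay stateless and only needs to name the session it targets.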
