# Introduction

`agent-device` is an agent-native CLI for app verification and QA from coding agents. It gives agents structured UI access, deterministic interactions, debugging evidence, performance signals, and replayable flows across iOS, Android, tvOS, Android TV, macOS, and Linux desktop targets.

Use it when an agent needs to inspect and operate a real app, not just reason about source code or screenshots.

## Where it shines

- **App verification for agents**: run the app, inspect visible UI, act through refs/selectors, and verify expected state.
- **Token-efficient UI context**: accessibility snapshots give agents compact structure instead of screenshot-only reasoning.
- **Runtime evidence**: capture screenshots, recordings, logs, network traffic, traces, CPU/memory/perf snapshots, and crash-related logs when the happy path breaks.
- **Replayable checks**: turn stable exploratory sessions into `.ad` replay scripts that can run again without AI.
- **React Native and Expo workflows**: pair device automation with optional React DevTools profiling for component trees, props/state/hooks, slow renders, and rerenders.
- **Local devices and app surfaces**: drive simulators, emulators, physical devices, TV targets, and desktop apps through one CLI.

If you know `agent-browser`, `agent-device` is the app/device counterpart for mobile, TV, and desktop targets.

## Development loop

`agent-device` closes the agentic development loop: agents can write code, run the real app, verify the UI end-to-end, collect screenshots/videos/logs/perf evidence, and feed bugs, crashes, or performance findings back into the next fix iteration before a human reviews the PR.

![Sketch showing agent-device as the live app verification layer in the agentic development loop](/agentic-development-loop.svg)

## How agents use it

The normal loop is:

```bash
agent-device apps --platform ios
agent-device open <app-or-url> --platform ios
agent-device snapshot -i
agent-device press @e12
agent-device diff snapshot -i
agent-device close
```

Installed CLI help is the version-matched operating guide. Start there before planning device work:

```bash
agent-device help workflow
agent-device help debugging
agent-device help react-devtools
agent-device help dogfood
```

Use [AI Agent Setup](/agent-device/docs/agent-setup.md) for Cursor, Codex, Claude Code, Windsurf, Cline, Goose, skills, and MCP setup. Use [Commands](/agent-device/docs/commands.md) for detailed command groups and platform behavior.

## Where it fits

`agent-device` is for agents, but humans still install it, grant permissions, review artifacts, and decide what ships.

It complements scripted test frameworks such as Appium, Maestro, Detox, XCTest, and Espresso. Keep those for stable human-authored coverage. Use `agent-device` when an agent needs to explore, reproduce, debug, profile, collect evidence, or record a replay from live app behavior.

MCP support is intentionally a thin discovery router for install/status/help. App and device automation remains explicit CLI activity in the terminal.

## Next steps

- Install the CLI: [Installation](/agent-device/docs/installation.md)
- Set up an agent client: [AI Agent Setup](/agent-device/docs/agent-setup.md)
- Run the first commands: [Quick Start](/agent-device/docs/quick-start.md)
- Inspect all command groups: [Commands](/agent-device/docs/commands.md)
- Collect runtime evidence: [Debugging & Profiling](/agent-device/docs/debugging-profiling.md)
- Record deterministic flows: [Replay & E2E](/agent-device/docs/replay-e2e.md)


---
url: /agent-device/404.md
---

404

# PAGE NOT FOUND

[Take me home](/agent-device/)


---
url: /agent-device/docs/agent-setup.md
---

# AI Agent Setup

`agent-device` is built for AI agents, but humans usually install it, grant device permissions, and decide which agent client should use it.

Use this page to wire Cursor, Codex, Claude Code, Windsurf, Cline, Goose, or another coding agent into mobile, TV, and desktop app verification. It covers skills, project rules, and MCP setup for React Native QA, Expo app verification, iOS Simulator automation, Android Emulator automation, tvOS checks, Android TV checks, debugging, profiling, and exploratory QA.

The short version: install the CLI, make the agent read version-matched help, and let the agent run CLI commands in a terminal. MCP is available for discovery and help, not broad device control.

## Prerequisite: install the CLI

```bash
npm install -g agent-device@latest
agent-device --version
agent-device help workflow
```

For one-off use without a global install:

```bash
npx -y agent-device@latest --version
npx -y agent-device@latest help workflow
```

Global install is better for normal agent workflows because repeated commands, skills, and terminal sessions resolve to one stable version.

For Node, Xcode, Android SDK, macOS, and iOS device prerequisites, see [Installation](/agent-device/docs/installation.md).

## Install the skill

Install the skill when your agent runtime supports skills:

```bash
npx skills add callstackincubator/agent-device
```

The bundled [agent-device skill](https://github.com/callstackincubator/agent-device/blob/main/skills/agent-device/SKILL.md) is the canonical router for skill-aware clients. It intentionally points agents back to installed CLI help instead of duplicating the command manual.

## Recommended agent rule

Add this as a project rule, custom instruction, or skill equivalent when your agent client supports it:

```text
Use agent-device only for app/device automation tasks. Before planning commands, run `agent-device --version` and read `agent-device help workflow`. For exploratory QA, read `agent-device help dogfood`. For logs, network, traces, or runtime failures, read `agent-device help debugging`. For React Native component trees, props/state/hooks, slow renders, or rerenders, read `agent-device help react-devtools`.

Use the CLI in the integrated terminal. MCP is only a discovery/help router and does not expose device automation tools. Prefer `open -> snapshot -i -> act -> re-snapshot -> verify -> close`. Use current refs such as `@e3` for exploration and selectors for durable replay. Keep mutating commands against one session serial. Capture screenshots, logs, network, perf, traces, recordings, and `.ad` replay scripts only when they add evidence.
```

## MCP router

`agent-device mcp` starts the official stdio MCP router for discovery-oriented clients. It exposes only `status`, `install`, and `help` tools plus workflow prompts/resources. Device automation still runs through the CLI commands returned by version-matched help.

Global install configuration:

```json
{
  "mcpServers": {
    "agent-device": {
      "command": "agent-device",
      "args": ["mcp"]
    }
  }
}
```

No global install variant:

```json
{
  "mcpServers": {
    "agent-device": {
      "command": "npx",
      "args": ["-y", "agent-device@latest", "mcp"]
    }
  }
}
```

Registry metadata uses MCP name `io.github.callstackincubator/agent-device`, npm package `agent-device`, stdio transport, `mcpName` package verification, `server.json`, and `smithery.yaml`.

## Cursor

Use Agent mode with the integrated terminal. Add the recommended rule above as a project rule, then run:

```bash
agent-device help workflow
agent-device apps --platform ios
agent-device open <app-or-url> --platform ios
agent-device snapshot -i
```

Optional: paste the [MCP router](#mcp-router) configuration into `.cursor/mcp.json`.

## Codex

Put the recommended rule in `AGENTS.md` or the project instructions. Let Codex run `agent-device` in the terminal:

```bash
agent-device help workflow
agent-device boot --platform ios
agent-device open <app-or-url> --platform ios
agent-device snapshot -i
```

For reviews or planning-only tasks, tell the agent not to run devices unless explicitly requested.

## Claude Code

Use the bundled skill when your Claude setup supports skills. Otherwise put the recommended rule in `CLAUDE.md`.

```bash
agent-device --version
agent-device help workflow
agent-device help dogfood
```

If you configure MCP, keep using CLI commands for automation. The MCP router gives Claude install/status/help context only.

## Windsurf, Cline, Goose, and other MCP clients

Use the [MCP router](#mcp-router) configuration when the client supports `mcpServers`, then tell the agent to run device commands through the terminal.

If the client has project rules or custom instructions, add the recommended agent rule above. If it does not, start the conversation by asking the agent to run `agent-device help workflow` before planning.

## Why this setup works

The CLI stays the auditable automation surface, installed help stays version-matched with the commands, skills and rules route agents toward the right help topics, and MCP gives discovery-oriented clients a small install/status/help entry point.

For the broader positioning, supported targets, observability features, and how `agent-device` differs from scripted test frameworks, see [Introduction](/agent-device/docs/introduction.md). For exact command groups and platform behavior, see [Commands](/agent-device/docs/commands.md).

For the local execution model, permissions, artifacts, and sensitive data guidance, see [Security & Trust](/agent-device/docs/security-trust.md).

## Agent-readable docs

Use [llms-full.txt](https://incubator.callstack.com/agent-device/llms-full.txt) when an agent needs a single text bundle of the current docs. The installed CLI remains authoritative for exact command syntax:

```bash
agent-device help
agent-device help workflow
agent-device help dogfood
```


---
url: /agent-device/docs/batching.md
---

# Batching

Use `batch` to run multiple commands in a single daemon request.

This is useful for agent workflows that already know the next sequence of actions and want to reduce orchestration overhead.

## CLI examples

From a file:

```bash
agent-device batch \
  --session sim \
  --platform ios \
  --udid 00008150-001849640CF8401C \
  --steps-file /tmp/batch-steps.json \
  --json
```

Inline for small payloads:

```bash
agent-device batch --steps '[{"command":"open","positionals":["settings"]},{"command":"wait","positionals":["100"]}]'
```

## Step payload format

`batch` accepts a JSON array of steps:

```json
[
  { "command": "open", "positionals": ["settings"], "flags": {} },
  { "command": "wait", "positionals": ["label=\"Privacy & Security\"", "3000"], "flags": {} },
  { "command": "click", "positionals": ["label=\"Privacy & Security\""], "flags": {} },
  { "command": "get", "positionals": ["text", "label=\"Tracking\""], "flags": {} }
]
```

Notes:

- `positionals` is optional (defaults to `[]`).
- `flags` is optional (defaults to `{}`).
- Unknown top-level step fields are rejected. Supported keys are `command`, `positionals`, `flags`, and `runtime`.
- nested `batch` and `replay` steps are rejected.
- `--on-error stop` is the supported behavior.

## Response shape

Success:

```json
{
  "success": true,
  "data": {
    "total": 4,
    "executed": 4,
    "totalDurationMs": 1810,
    "results": [
      { "step": 1, "command": "open", "ok": true, "durationMs": 1020 },
      { "step": 2, "command": "wait", "ok": true, "durationMs": 320 },
      { "step": 3, "command": "click", "ok": true, "durationMs": 260 },
      { "step": 4, "command": "get", "ok": true, "durationMs": 210, "data": { "text": "..." } }
    ]
  }
}
```

In non-JSON mode, `batch` also prints a short per-step summary after the overall completion line.

Failure:

```json
{
  "success": false,
  "error": {
    "code": "COMMAND_FAILED",
    "message": "Batch failed at step 3 (click): ...",
    "details": {
      "step": 3,
      "command": "click",
      "positionals": ["label=\"Privacy & Security\""],
      "executed": 2,
      "total": 4,
      "partialResults": [
        { "step": 1, "command": "open", "ok": true },
        { "step": 2, "command": "wait", "ok": true }
      ]
    }
  }
}
```

## Agent best practices

- Batch only one related screen flow at a time.
- After mutating steps (`open`, `click`, `fill`, `swipe`), add a sync guard (`wait`, `is exists`) before critical reads.
- Treat prior refs/snapshots as stale after UI changes.
- Prefer `--steps-file` over inline JSON.
- Keep batches moderate (about 5-20 steps).
- Replan from the failing step using `details.step` and `details.partialResults`.

## Canonical recipes

Open app -> open thread -> type -> send

```json
[
  { "command": "open", "positionals": ["com.example.chat"], "flags": { "platform": "android" } },
  { "command": "wait", "positionals": ["text", "Inbox", "3000"], "flags": {} },
  { "command": "press", "positionals": ["label=\"Inbox\" role=button"], "flags": {} },
  { "command": "press", "positionals": ["label=\"Adam Horodyski\""], "flags": {} },
  {
    "command": "fill",
    "positionals": ["label=\"Message\" role=text-field", "filed the expense"],
    "flags": {}
  },
  { "command": "press", "positionals": ["label=\"Send\" role=button"], "flags": {} },
  { "command": "wait", "positionals": ["text", "filed the expense", "3000"], "flags": {} }
]
```

Open app -> open action menu -> choose option -> verify

```json
[
  { "command": "open", "positionals": ["com.example.app"], "flags": { "platform": "android" } },
  { "command": "wait", "positionals": ["text", "Home", "3000"], "flags": {} },
  { "command": "press", "positionals": ["label=\"More actions\" role=button"], "flags": {} },
  { "command": "wait", "positionals": ["text", "Camera scan", "2000"], "flags": {} },
  { "command": "press", "positionals": ["label=\"Camera scan\""], "flags": {} },
  { "command": "wait", "positionals": ["text", "Expense created", "15000"], "flags": {} },
  { "command": "is", "positionals": ["visible", "label=\"Expense created\""], "flags": {} }
]
```

## Stale accessibility tree risk

Rapid UI changes can outpace accessibility tree updates. Mitigate by inserting explicit waits and splitting long workflows into phases:

1. navigate
2. verify/extract
3. cleanup


---
url: /agent-device/docs/client-api.md
---

# Typed Client

Use `createAgentDeviceClient()` when you want to drive the daemon from application code instead of shelling out to the CLI.

For remote Metro-backed flows, import the reusable Node APIs instead of spawning the `agent-device` binary. The CLI uses the same helpers internally.

Public subpath API exposed for Node consumers:

- `agent-device/io`
  - artifact adapter types, file input refs, and file output refs
- `agent-device/metro`
  - `prepareRemoteMetro(options)`
  - `ensureMetroTunnel(options)`
  - `reloadRemoteMetro(options)`
  - `stopMetroTunnel(options)`
  - `buildIosRuntimeHints(baseUrl)`
  - `buildAndroidRuntimeHints(baseUrl)`
  - types: `PrepareRemoteMetroOptions`, `PrepareRemoteMetroResult`, `EnsureMetroTunnelOptions`, `EnsureMetroTunnelResult`, `ReloadRemoteMetroOptions`, `ReloadRemoteMetroResult`, `StopMetroTunnelOptions`, `MetroRuntimeHints`, `MetroBridgeResult`
- `agent-device/batch`
  - `runBatch(req, sessionName, invoke)`
  - `validateAndNormalizeBatchSteps(steps, maxSteps)`
  - `buildBatchStepFlags(parentFlags, stepFlags)`
  - `DEFAULT_BATCH_MAX_STEPS`
  - `BATCH_BLOCKED_COMMANDS`
  - `INHERITED_PARENT_FLAG_KEYS`
  - types: `BatchInvoke`, `BatchRequest`, `BatchStep`, `BatchStepResult`, `NormalizedBatchStep`
- `agent-device/remote-config`
  - `resolveRemoteConfigPath(options)`
  - `resolveRemoteConfigProfile(options)`
  - types: `RemoteConfigProfile`, `RemoteConfigProfileOptions`, `ResolvedRemoteConfigProfile`
- `agent-device/contracts`
  - types: `SessionRuntimeHints`, `DaemonInstallSource`, `DaemonLockPolicy`, `DaemonRequestMeta`, `DaemonRequest`, `DaemonArtifact`, `DaemonResponseData`, `DaemonError`, `DaemonResponse`
- `agent-device/selectors`
  - `parseSelectorChain(expression)`
  - `tryParseSelectorChain(expression)`
  - `resolveSelectorChain(nodes, chain, options)`
  - `findSelectorChainMatch(nodes, chain, options)`
  - `formatSelectorFailure(chain, diagnostics, options)`
  - `isNodeVisible(node)`
  - `isSelectorToken(token)`
  - `isNodeEditable(node, platform)`
  - types: `Selector`, `SelectorChain`, `SelectorDiagnostics`, `SelectorResolution`, `SnapshotNode`
- `agent-device/finders`
  - `findBestMatchesByLocator(nodes, locator, query, requireRectOrOptions)`
  - `normalizeRole(value)`
  - `normalizeText(value)`
  - `parseFindArgs(args)`
  - types: `FindLocator`, `FindMatchOptions`, `SnapshotNode`
- `agent-device/install-source`
  - `ARCHIVE_EXTENSIONS`
  - `isBlockedIpAddress(address)`
  - `isBlockedSourceHostname(hostname)`
  - `isTrustedInstallSourceUrl(sourceUrl)`
  - `materializeInstallablePath(options)`
  - `validateDownloadSourceUrl(url)`
  - types: `MaterializeInstallSource`, `MaterializedInstallable`
- `agent-device/artifacts`
  - `resolveAndroidArchivePackageName(archivePath)`
- `agent-device/android-snapshot-helper`
  - `ensureAndroidSnapshotHelper(options)`
  - `captureAndroidSnapshotWithHelper(options)`
  - `parseAndroidSnapshotHelperOutput(output)`
  - `parseAndroidSnapshotHelperXml(xml, metadata?, options?, maxNodes?)`
  - `prepareAndroidSnapshotHelperArtifactFromManifestUrl(options)`
  - `verifyAndroidSnapshotHelperArtifact(artifact)`
  - types: `AndroidAdbExecutor`, `AndroidSnapshotHelperArtifact`, `AndroidSnapshotHelperManifest`, `AndroidSnapshotHelperOutput`, `AndroidSnapshotHelperParsedSnapshot`
- `agent-device/android-adb`
  - `createAndroidPortReverseManager(provider)`
  - `createLocalAndroidAdbProvider(device)`
  - `captureAndroidLogcatWithAdb(executor, options?)`
  - `streamAndroidLogcatWithAdb(provider, options?)`
  - `readAndroidClipboardWithAdb(executor)` / `writeAndroidClipboardWithAdb(executor, text)`
  - `getAndroidKeyboardStatusWithAdb(executor)` / `dismissAndroidKeyboardWithAdb(executor)`
  - `openAndroidAppWithAdb(executor, packageName, options?)`
  - `forceStopAndroidAppWithAdb(executor, packageName)`
  - `resolveAndroidLaunchComponentWithAdb(executor, packageName, categories?)`
  - `listAndroidAppsWithAdb(executor, options?)`
  - `getAndroidAppStateWithAdb(executor)`
  - types: `AndroidAdbProvider`, `AndroidAdbExecutor`, `AndroidAdbExecutorOptions`, `AndroidAdbExecutorResult`, `AndroidAdbProcess`, `AndroidAdbSpawner`, `AndroidPortReverseProvider`

The `contracts`, `selectors`, `finders`, `install-source`, `android-adb`, `artifacts`, `batch`, `metro`, `remote-config`, and `io` subpaths are the supported Node entry points. The former compatibility subpaths `agent-device/android-apps` and `agent-device/daemon`, plus hosted-runtime subpaths `agent-device/commands`, `agent-device/backend`, `agent-device/testing/conformance`, and `agent-device/observability`, are no longer published.

## Basic usage

```ts
import { createAgentDeviceClient } from 'agent-device';

const client = createAgentDeviceClient({
  session: 'qa-ios',
  lockPolicy: 'reject',
  lockPlatform: 'ios',
});

const devices = await client.devices.list({ platform: 'ios' });
const apps = await client.apps.list({ platform: 'ios', appsFilter: 'user-installed' });
const ensured = await client.simulators.ensure({
  device: 'iPhone 16',
  boot: true,
});

await client.apps.open({
  app: 'com.apple.Preferences',
  platform: 'ios',
  udid: ensured.udid,
  runtime: {
    metroHost: '127.0.0.1',
    metroPort: 8081,
  },
});

const snapshot = await client.capture.snapshot({ interactiveOnly: true });

await client.sessions.close();
```

## Android snapshot helper providers

Remote Android providers should import `agent-device/android-snapshot-helper` and inject their own
ADB-shaped executor. The executor receives arguments after `adb`, so local callers may wrap
`adb -s <serial>`, while cloud providers can route the same operations through an ADB tunnel.

```ts
import {
  captureAndroidSnapshotWithHelper,
  ensureAndroidSnapshotHelper,
  parseAndroidSnapshotHelperXml,
  prepareAndroidSnapshotHelperArtifactFromManifestUrl,
} from 'agent-device/android-snapshot-helper';

const artifact = await prepareAndroidSnapshotHelperArtifactFromManifestUrl({
  manifestUrl:
    'https://github.com/callstackincubator/agent-device/releases/download/v0.13.3/agent-device-android-snapshot-helper-0.13.3.manifest.json',
});

await ensureAndroidSnapshotHelper({
  adb: runProviderAdb,
  artifact,
  installPolicy: 'missing-or-outdated',
});

const output = await captureAndroidSnapshotWithHelper({
  adb: runProviderAdb,
  timeoutMs: 8000,
});

const snapshot = parseAndroidSnapshotHelperXml(output.xml, output.metadata);
```

Helper captures report `metadata.captureMode` as `interactive-windows` when Android returns
interactive window roots, or `active-window` when the helper falls back to
`getRootInActiveWindow()`. `metadata.windowCount` is the number of serialized roots.

## Android ADB providers

Use `agent-device/android-adb` when a bridge owns Android device access but wants upstream command
behavior for ADB-shaped operations. Executors receive arguments after `adb`; local callers can use
`createLocalAndroidAdbProvider(device)`, while remote bridges can route the same argument arrays
through an abstract provider backed by an ADB tunnel, websocket API, or another remote transport.

The provider contract covers normal stdout/stderr commands, binary stdout, stdin, timeout/signal
cancellation, optional long-running spawn support for logcat-style streams, and optional reverse
support for port mappings. Public helpers accept an executor/provider directly and do not expose the
daemon's scoped adb interception internals.

`streamAndroidLogcatWithAdb(provider, options?)` requires `provider.spawn`; exec-only providers can
use `captureAndroidLogcatWithAdb(executor, options?)`.

Providers can also expose `reverse` for first-class port reverse ownership. Plain executors do not
advertise reverse support automatically; call `createAndroidPortReverseManager(providerOrExecutor)`
only when the provider supports `adb reverse` argument semantics. The manager makes duplicate setup
idempotent for the same owner and rejects conflicting owners for the same local endpoint.

```ts
import {
  getAndroidAppStateWithAdb,
  listAndroidAppsWithAdb,
} from 'agent-device/android-adb';

const provider = {
  exec: async (args, options) => await runAdbThroughRemoteTunnel(args, options),
};

const apps = await listAndroidAppsWithAdb(provider.exec, { filter: 'user-installed' });
const foreground = await getAndroidAppStateWithAdb(provider.exec);
```

## Command methods

Use `client.command.<method>()` for command-level device actions. It uses the same daemon transport path as the higher-level client methods, including session metadata, tenant/run/lease fields, normalized daemon errors, and remote artifact handling.

Results are daemon-shaped objects with typed known fields, so command semantics stay aligned with the CLI.

```ts
await client.command.wait({
  text: 'Continue',
  timeoutMs: 5_000,
});

await client.command.keyboard({
  action: 'dismiss',
});

await client.command.clipboard({
  action: 'write',
  text: 'hello from Node',
});

await client.command.back({
  mode: 'system',
});

await client.command.appSwitcher();
```

Supported command methods:

- `wait`
- `appState`
- `back`
- `home`
- `rotate`
- `appSwitcher`
- `keyboard`
- `clipboard`
- `alert`

Additional CLI-backed methods are exposed on their domain groups with typed option objects so Node consumers do not need to build raw daemon requests:

- `client.devices.boot()`
- `client.apps.push()`
- `client.apps.triggerEvent()`
- `client.capture.diff()`
- `client.interactions.click()`, `press()`, `longPress()`, `swipe()`, `focus()`, `type()`, `fill()`, `scroll()`, `pinch()`, `get()`, `is()`, `find()`
- `client.replay.run()` and `client.replay.test()`
- `client.batch.run()`
- `client.observability.perf()`, `logs()`, and `network()`
- `client.recording.record()` and `client.recording.trace()`
- `client.settings.update()`

`client.observability.perf()` returns daemon-shaped JSON so local and remote transports expose the same metrics payload. On Android and supported Apple targets, `data.metrics.fps.droppedFramePercent` is the primary frame-smoothness value. Android derives it from the current `adb shell dumpsys gfxinfo <package> framestats` window; connected iOS devices derive it from `xcrun xctrace` Animation Hitches for the active app process. Frame samples include `windowStartedAt`, `windowEndedAt`, and `worstWindows` so agents can correlate dropped-frame clusters with logs, network entries, and their own session actions. A successful Android read resets Android frame stats; `open <app>` resets the Android frame window too, so agents can call `perf`, perform a transition or gesture, then call `perf` again to inspect that focused window. iOS simulator and macOS app sessions report frame health as unavailable rather than inventing FPS or dropped-frame values.

`client.recording.record({ action: 'start', path, quality: 5 })` starts a smaller 50% resolution video; omit `quality` to keep native/current resolution.

## Batch orchestration for custom transports

Use `agent-device/batch` when a bridge or in-process runner receives daemon-shaped requests but owns command dispatch itself. The helper keeps validation, inherited flags, serial execution, partial results, and error envelopes aligned with the daemon batch command.

```ts
import { runBatch } from 'agent-device/batch';
import type { BatchRequest } from 'agent-device/batch';
import type { DaemonResponse } from 'agent-device/contracts';

async function handleBatch(req: BatchRequest): Promise<DaemonResponse> {
  return await runBatch(req, req.session ?? 'default', async (stepReq) => {
    try {
      return { ok: true, data: await dispatch(stepReq) };
    } catch (error) {
      return bridgeErrorToDaemonResponse(error);
    }
  });
}
```

## Android `installFromSource()`

```ts
const androidClient = createAgentDeviceClient({ session: 'qa-android' });

const installed = await androidClient.apps.installFromSource({
  platform: 'android',
  retainPaths: true,
  retentionMs: 60_000,
  source: { kind: 'url', url: 'https://example.com/app.apk' },
});

await androidClient.apps.open({
  platform: 'android',
  app: installed.launchTarget,
});

console.log(installed.packageName, installed.launchTarget);

if (installed.materializationId) {
  await androidClient.materializations.release({
    materializationId: installed.materializationId,
  });
}

await androidClient.sessions.close();
```

On Android, a successful `installFromSource()` response returns enough app identity to relaunch the installed app:

- `packageName`
- `launchTarget`

If the daemon cannot determine installed app identity, the request fails instead of returning an empty success payload.

## URL source rules

`installFromSource()` URL sources are intentionally limited:

- Private and loopback hosts are blocked by default.
- Archive-backed URL installs are only supported for trusted artifact services, currently GitHub Actions and EAS.
- For existing reachable artifact URLs, use `source: { kind: 'url', url: ... }`.
- For local artifacts, use `source: { kind: 'path', path: ... }` or the CLI `install`/`reinstall` commands.
- For compatible remote daemons that resolve CI artifacts server-side, pass a GitHub Actions artifact source:

```ts
await client.apps.installFromSource({
  platform: 'android',
  source: {
    kind: 'github-actions-artifact',
    owner: 'acme',
    repo: 'mobile',
    artifactId: 1234567890,
  },
});
```

Remote daemons may also support `{ kind: 'github-actions-artifact', owner, repo, artifactName }` or `{ kind: 'github-actions-artifact', owner, repo, runId, artifactName }`. The local client preserves these payloads and does not perform GitHub authentication or artifact download.

Direct Android `.apk` and `.aab` URL sources can still resolve package identity from the downloaded install artifact. Trusted GitHub Actions and EAS archive URLs may contain one installable `.apk`, `.aab`, `.ipa`, or iOS `.app` tar archive.

## Remote Metro helpers

```ts
import {
  prepareRemoteMetro,
  reloadRemoteMetro,
  stopMetroTunnel,
} from 'agent-device/metro';
import { resolveRemoteConfigProfile } from 'agent-device/remote-config';

const remoteConfig = resolveRemoteConfigProfile({
  configPath: './agent-device.remote.json',
  cwd: process.cwd(),
});

const prepared = await prepareRemoteMetro({
  projectRoot: remoteConfig.profile.metroProjectRoot!,
  kind: remoteConfig.profile.metroKind ?? 'auto',
  proxyBaseUrl: remoteConfig.profile.metroProxyBaseUrl,
  proxyBearerToken: remoteConfig.profile.metroBearerToken,
  bridgeScope: {
    tenantId: remoteConfig.profile.tenant!,
    runId: remoteConfig.profile.runId!,
    leaseId: remoteConfig.profile.leaseId!,
  },
  profileKey: remoteConfig.resolvedPath,
});

console.log(prepared.iosRuntime, prepared.androidRuntime);

await reloadRemoteMetro({
  runtime: prepared.iosRuntime,
});

await stopMetroTunnel({
  projectRoot: remoteConfig.profile.metroProjectRoot!,
  profileKey: remoteConfig.resolvedPath,
});
```

Use `agent-device/remote-config` for profile loading and path resolution, `agent-device/metro` for Metro preparation, reload, and tunnel lifecycle, and `agent-device/contracts` when a server consumer needs daemon request or runtime contract types. For bridged remote Metro, `proxyBaseUrl` is the bridge origin and `publicBaseUrl` is optional; the bridge descriptor supplies cloud iOS wildcard HTTPS hints and Android runtime-route hints. `reloadRemoteMetro()` calls Metro's `/reload` endpoint, matching the terminal `r` reload path for connected React Native apps.

## Selector helpers

Use `agent-device/selectors` when a remote daemon or bridge needs to parse and match selector expressions without deep-importing daemon internals. Matching is platform-aware because role normalization and editability checks differ by backend.

```ts
import { findSelectorChainMatch, parseSelectorChain } from 'agent-device/selectors';

const chain = parseSelectorChain('role=button label="Continue" visible=true');

const match = findSelectorChainMatch(snapshot.nodes, chain, {
  platform: 'android',
  requireRect: true,
});

if (!match) {
  // Build a daemon-shaped error with formatSelectorFailure(...) if needed.
}
```


---
url: /agent-device/docs/commands.md
---

# Commands

This page summarizes the primary command groups.

For persistent defaults and project-scoped CLI settings, see [Configuration](/agent-device/docs/configuration.md).

For agent workflow guidance that is matched to the installed CLI, run:

```bash
agent-device help
agent-device help workflow
agent-device help debugging
agent-device help react-devtools
agent-device help remote
agent-device help macos
agent-device help dogfood
```

Skills are recommended for auto-routing when your agent runtime supports them, but they are not required. The CLI help topics are the version-matched operating contract.

For MCP-aware clients that need discovery instead of direct device control, run:

```bash
agent-device mcp
```

The MCP router exposes `status`, `install`, and `help` tools plus workflow prompts/resources. It does not expose device automation or generic shell execution over MCP.

## Navigation

```bash
agent-device boot
agent-device boot --platform ios
agent-device boot --platform android
agent-device boot --platform android --device Pixel_9_Pro_XL --headless
agent-device open [app|url] [url]
agent-device open --platform macos --surface frontmost-app
agent-device open --platform macos --surface desktop
agent-device close [app]
agent-device back
agent-device back --in-app
agent-device back --system
agent-device home
agent-device rotate portrait
agent-device rotate landscape-left
agent-device app-switcher
```

- `boot` ensures the selected target is ready without launching an app.
- `boot` requires either an active session or an explicit device selector.
- `--platform apple` is an alias for the Apple automation backend (`ios`, `tvOS`, `macOS` selection).
- Use `--target mobile|tv|desktop` with `--platform` (required) to select phone/tablet vs TV-class vs desktop-class targets.
- `boot` is mainly needed when starting a new session and `open` fails because no booted simulator/emulator is available.
- Android: `boot --platform android --device <avd-name>` launches that emulator in GUI mode when needed.
- Android: add `--headless` to launch without opening a GUI window.
- `open [app|url] [url]` already boots/activates the selected target when needed.
- `open <url>` deep links are supported on Android and iOS.
- `open <app> <url>` opens a deep link on iOS.
- `open --platform macos --surface app|frontmost-app|desktop|menubar` selects the macOS session surface explicitly. `app` is the default when an app argument is provided.
- `back` now defaults to app-owned back navigation. On Apple targets that means visible in-app back UI only. On Android this currently maps to the same back keyevent because Android routes in-app back through that platform event.
- `back --in-app` is an explicit alias for the default app-owned behavior.
- `back --system` asks for system back input explicitly. On Android this is the normal back keyevent. On iOS and tvOS it uses the platform back gesture or Siri Remote menu action. On macOS, where there is no generic system back input, `back --system` reports unavailable instead of falling back to app-owned navigation.
- `rotate <orientation>` forces a mobile device into `portrait`, `portrait-upside-down`, `landscape-left`, or `landscape-right`.
- `rotate` is supported on iOS and Android mobile targets. macOS and tvOS do not expose it.
- On iOS devices, `http(s)://` URLs open in Safari when no app is active. Custom scheme URLs require an active app in the session.
- `AGENT_DEVICE_SESSION` and `AGENT_DEVICE_PLATFORM` can pre-bind a default session/platform for CLI automation runs, so normal commands (`open`, `snapshot`, `press`, `fill`, `screenshot`, `devices`, and `batch`) do not need those flags repeated on every call.
- A configured `AGENT_DEVICE_SESSION` implies bound-session lock mode by default. The CLI forwards that policy to the daemon, which enforces the same conflict handling for CLI, typed client, and direct RPC requests.
- `--session-lock reject|strip` sets the lock policy for a single CLI invocation, including nested batch steps.
- `AGENT_DEVICE_SESSION_LOCK=reject|strip` sets the default lock policy for bound-session automation runs. The older `--session-locked`, `--session-lock-conflicts`, `AGENT_DEVICE_SESSION_LOCKED`, and `AGENT_DEVICE_SESSION_LOCK_CONFLICTS` forms remain supported as compatibility aliases.
- Direct RPC callers can pass `meta.lockPolicy` and optional `meta.lockPlatform` on `agent_device.command` requests for the same daemon-enforced behavior.
- In `batch`, steps that omit `platform` still inherit the parent batch `--platform`; lock-mode defaults do not override that parent setting.
- Tenant-scoped daemon runs can pass `--tenant`, `--session-isolation tenant`, `--run-id`, and `--lease-id` to enforce lease admission.
- Remote daemon clients can pass `--daemon-base-url http(s)://host:port[/base-path]` to skip local daemon discovery/startup and call a remote HTTP daemon directly.
- Use `--daemon-auth-token <token>` (or `AGENT_DEVICE_DAEMON_AUTH_TOKEN`) for explicit service/API-token automation against non-loopback remote daemon URLs; the client sends it in both the JSON-RPC request token and HTTP auth headers.
- For human cloud access, `connect` can discover a cloud connection profile, while `connect --remote-config ...` uses a local profile. Both refresh a stored CLI session into a short-lived `adc_agent_...` token when needed. If no CLI session exists, interactive shells start login automatically; CI and non-interactive shells fail with API-token setup instructions. Use `--no-login` to disable implicit login. `AGENT_DEVICE_CLOUD_BASE_URL` is the bridge/control-plane API origin; its `/api-keys` route may redirect to the dashboard for token creation.
- For remote `connect` and `connect --remote-config` flows, see [Remote Metro workflow](#remote-metro-workflow).
- Android React Native relaunch flows require an installed package name for `open --relaunch`; install/reinstall the APK first, then relaunch by package. `open <apk|aab> --relaunch` is rejected because runtime hints are written through the installed app sandbox.
- For Metro-backed React Native JS changes, use `metro reload` before `open <app> --relaunch`; it mirrors pressing `r` in the Metro terminal and keeps the native process alive.
- Remote daemon screenshots and recordings are downloaded back to the caller path, so `screenshot page.png` and `record start session.mp4` remain usable when the daemon runs on another host.

```bash
agent-device open "https://example.com" --platform ios           # open link in web browser
agent-device open MyApp "myapp://screen/to" --platform ios       # open deep link to MyApp
agent-device back --platform ios                                 # tap visible app back UI only
agent-device back --system --platform ios                        # use edge-swipe or remote back action
agent-device reinstall MyApp /path/to/app-debug.apk --platform android --serial emulator-5554
agent-device open com.example.myapp --platform android --serial emulator-5554 --session my-session --relaunch
agent-device metro reload
```

## Device isolation scopes

```bash
agent-device devices --platform ios --ios-simulator-device-set /tmp/tenant-a/simulators
agent-device devices --platform android --android-device-allowlist emulator-5554,device-1234
```

- `--ios-simulator-device-set <path>` constrains simulator discovery and simulator command execution via `xcrun simctl --set <path> ...`.
- `--android-device-allowlist <serials>` constrains Android discovery/selection to comma or space separated serials.
- Scope is applied before selectors (`--device`, `--udid`, `--serial`), so out-of-scope selectors fail with `DEVICE_NOT_FOUND`.
- With iOS simulator-set scope enabled, iOS physical devices are not enumerated.
- Environment equivalents:
  - iOS: `AGENT_DEVICE_IOS_SIMULATOR_DEVICE_SET` (compat: `IOS_SIMULATOR_DEVICE_SET`)
  - Android: `AGENT_DEVICE_ANDROID_DEVICE_ALLOWLIST` (compat: `ANDROID_DEVICE_ALLOWLIST`)
- CLI scope flags override environment values unless bound-session lock mode is active with `strip`, in which case conflicting per-call selectors are ignored.

## Device discovery

```bash
agent-device devices
agent-device devices --platform ios
agent-device devices --platform android
agent-device devices --platform ios --ios-simulator-device-set /tmp/tenant-a/simulators
agent-device devices --platform android --android-device-allowlist emulator-5554,device-1234
```

- `devices` lists available targets after applying any platform selector or isolation scope flags.
- Use `--platform` to narrow discovery to Apple-family (`ios`, `tvOS`, `macOS`) or Android targets.
- Use `--ios-simulator-device-set` and `--android-device-allowlist` when you need tenant- or lab-scoped discovery.

## Simulator provisioning

```bash
agent-device ensure-simulator --device "iPhone 16" --platform ios
agent-device ensure-simulator --device "iPhone 16" --runtime com.apple.CoreSimulator.SimRuntime.iOS-18-4 --ios-simulator-device-set /tmp/tenant-a/simulators
agent-device ensure-simulator --device "iPhone 16" --ios-simulator-device-set /tmp/tenant-a/simulators --boot
```

- `ensure-simulator` ensures a named iOS simulator exists inside a device set, creating it via `simctl create` if missing.
- Requires `--device <name>` (the simulator name / device type, e.g. `"iPhone 16 Pro"`).
- `--runtime <id>` pins a specific CoreSimulator runtime (e.g. `com.apple.CoreSimulator.SimRuntime.iOS-18-4`). Omit to use the newest compatible runtime.
- `--boot` boots the simulator after ensuring it exists.
- Reuse of an existing matching simulator is the default; the command is idempotent.
- JSON output includes `udid`, `device`, `runtime`, `ios_simulator_device_set`, `created`, and `booted`.
- Does not require an active session — safe to call before `open`.

## TV targets

```bash
agent-device open YouTube --platform android --target tv
agent-device apps --platform android --target tv
agent-device open Settings --platform ios --target tv
agent-device screenshot apple-tv.png --platform ios --target tv
```

- AndroidTV app launch and app listing resolve TV launchable activities via `LEANBACK_LAUNCHER`.
- TV target selection supports both simulator/emulator and connected physical devices (AppleTV + AndroidTV).
- tvOS supports the same runner-driven interaction/snapshot flow as iOS (`snapshot`, `wait`, `press`, `fill`, `get`, `scroll`, `back`, `home`, `app-switcher`, `record`, and related selector flows).
- On tvOS, runner `back`/`home`/`app-switcher` map to Siri Remote actions (`menu`, `home`, double-home).
- tvOS follows iOS simulator-only command semantics for helpers like `pinch`, `settings`, and `push`.

## Desktop targets

```bash
agent-device devices --platform macos
agent-device open TextEdit --platform macos
agent-device open --platform macos --surface desktop
agent-device snapshot -i --platform apple --target desktop
```

- `--platform macos` selects the host Mac as a `desktop` target.
- `--platform apple --target desktop` selects the same macOS backend through the Apple-family alias.
- Use `app` sessions for normal app control: `open`, `snapshot`, `click`, `fill`, `press`, `scroll`, `back`, `screenshot`, `record`.
- Use `frontmost-app`, `desktop`, and `menubar` when you need to inspect desktop-global UI before choosing one app.
- `open --platform macos --surface frontmost-app` inspects the currently focused app without naming it first.
- `open --platform macos --surface desktop` inspects visible windows across the desktop.
- `open --platform macos --surface menubar` inspects the active app menu bar and system menu extras.
- `open <app> --platform macos --surface menubar` targets one menu bar app's extras bar, which is useful for status-item apps.
- Status-item apps often expose little or no useful UI through the default macOS `app` surface. Prefer `--surface menubar` for discovery when the app lives in the top menu bar.
- Use `frontmost-app`, `desktop`, and `menubar` mainly for `snapshot`, `get`, `is`, and `wait`.
- If you inspect with `desktop` or `menubar` and then need to click or fill inside one app, open that app in a normal `app` session.
- macOS also supports `clipboard read|write`, `trigger-app-event`, `logs`, `network dump`, `alert`, `pinch` in app sessions, `settings appearance`, and `settings permission <grant|reset> <accessibility|screen-recording|input-monitoring>`.
- In macOS app sessions, `screenshot` captures the target app window bounds rather than the full desktop.
- Prefer selector or `@ref`-driven interactions on macOS. Window position can shift between runs, so raw x/y point commands are less stable than snapshot-derived targets.
- Use `click --button secondary` for context menus on macOS, then run `snapshot -i` again.
- Mobile-only helpers remain unsupported on macOS: `boot`, `home`, `rotate`, `app-switcher`, `install`, `reinstall`, `install-from-source`, and `push`.

Recommended loops:

```bash
# One app, full interaction
agent-device open TextEdit --platform macos
agent-device snapshot -i
agent-device fill @e3 "hello"
agent-device screenshot textedit.png
agent-device close

# Desktop-global inspection first
agent-device open --platform macos --surface desktop
agent-device snapshot -i
agent-device is visible 'role="window" label="Notes"'
agent-device screenshot desktop.png --fullscreen
agent-device close

# Menubar / menu-extra inspection
agent-device open --platform macos --surface menubar
agent-device snapshot -i
agent-device wait 'label~="Wi-Fi|Control Center|Battery"'
agent-device close

# Targeted menu bar app inspection
agent-device open MenuBarApp --platform macos --surface menubar
agent-device snapshot -i
agent-device close
```

## Snapshot and inspect

```bash
agent-device snapshot [--diff] [-i] [-c] [-d <depth>] [-s <scope>] [--raw]
agent-device diff snapshot [-i] [-c] [-d <depth>] [-s <scope>] [--raw]
agent-device get text @e1
agent-device get attrs @e1
```

- iOS snapshots use XCTest on simulators and physical devices.
- Android snapshots use the bundled Android snapshot helper when the npm package includes it. The
  first helper-backed snapshot verifies and installs the helper APK if it is missing or outdated;
  helper failures fall back to stock UIAutomator and include `androidSnapshot.fallbackReason` in
  typed results. Source checkouts without a bundled helper use stock UIAutomator. The helper
  serializes Android interactive window roots when available, so keyboard and system-overlay nodes
  can appear alongside the app root; `androidSnapshot.captureMode` and
  `androidSnapshot.windowCount` describe the capture.
- `diff snapshot` compares the current snapshot with the previous session baseline and then updates baseline.
- `snapshot --diff` is an alias for `diff snapshot`.

## Wait and alerts

```bash
agent-device wait 1500
agent-device wait text "Welcome back"
agent-device wait @e12
agent-device wait 'role="button" label="Continue"' 5000
agent-device alert
agent-device alert get
agent-device alert wait 3000
agent-device alert accept
agent-device alert dismiss
```

- `wait` accepts a millisecond duration, `text <value>`, a snapshot ref (`@eN`), or a selector.
- `wait <selector> [timeoutMs]` polls until the selector resolves or the timeout expires.
- `wait @ref [timeoutMs]` requires an existing session snapshot from a prior `snapshot` command.
- `wait @ref` resolves the ref to its label/text from that stored snapshot, then polls for that text; it does not track the original node identity.
- Because `wait @ref` is text-based after resolution, duplicate labels can match a different element than the original ref target.
- `wait` shares the selector/snapshot resolution flow used by `click`, `fill`, `get`, and `is`.
- `alert` inspects or handles system alerts on iOS simulator and macOS desktop targets.
- `alert` without an action is equivalent to `alert get`.
- `alert wait [timeout]` waits for an alert to appear before returning it.
- If an iOS permission sheet is visible in `snapshot` or `screenshot` but `alert accept` reports no alert, fall back to a scoped `snapshot -i -s "<visible label>"` plus `press @ref`; not every simulator permission surface is exposed as a native XCTest alert.

## Interactions

```bash
agent-device click @e1
agent-device click @e1 --button secondary   # macOS secondary click / context menu
agent-device focus @e2
agent-device fill @e2 "text"          # Clear then type
agent-device fill @e2 "search" --delay-ms 80
agent-device type "text"              # Type into focused field without clearing
agent-device type "query" --delay-ms 80
agent-device press 300 500
agent-device press 300 500 --count 12 --interval-ms 45
agent-device press 300 500 --count 6 --hold-ms 120 --interval-ms 30 --jitter-px 2
agent-device swipe 540 1500 540 500 120
agent-device swipe 540 1500 540 500 120 --count 8 --pause-ms 30 --pattern ping-pong
agent-device longpress 300 500 800
agent-device scroll down 0.5
agent-device scroll down --pixels 320
agent-device pinch 2.0          # zoom in 2x (Apple simulator or macOS app session)
agent-device pinch 0.5 200 400 # zoom out at coordinates (Apple simulator or macOS app session)
```

`fill` clears then types. `type` does not clear.
`type` accepts text only. Do not pass `@ref` to `type`; use `fill @ref "text"` to target a field directly, or `press @ref` then `type "text"` to append in the focused field.
Use `--delay-ms` on `type` or `fill` for debounced search fields and search-as-you-type inputs that miss characters when text is injected too quickly.
Delayed typing prefers paced character entry over clipboard-style fallbacks so the target field still receives incremental updates.
On Android, `fill` also verifies text and performs one clear-and-retry pass on mismatch.
Some Android images cannot enter non-ASCII text over shell input; in that case use a trusted ADB keyboard IME and verify APK checksum/signature before install.
`click --button secondary` is the desktop context-menu flow on macOS.
`click --button middle` is reserved for future runner support and currently returns an explicit unsupported-operation error on macOS.
`swipe` accepts an optional `durationMs` argument (default `250ms`, range `16..10000`).
On iOS, swipe duration is clamped to a safe range (`16..60ms`) to avoid longpress side effects.
`scroll` accepts either a relative amount (`0.5` means roughly half of the viewport on that axis) or `--pixels <n>` for a fixed-distance gesture. Large distances are clamped to the usable drag band so the gesture stays reliable across Android, iOS, and macOS.
Default snapshot text output is visible-first, so off-screen interactive content is summarized instead of shown as tappable refs.
When a target only appears in an off-screen summary, use `scroll <direction>` and then take a fresh `snapshot -i`. For repeated checks, a small shell loop is enough:

```bash
previous=''
for _ in 1 2 3 4 5 6; do
  current="$(agent-device snapshot -i)"
  printf '%s\n' "$current"
  printf '%s\n' "$current" | grep -q 'Sign in' && break
  [ "$current" = "$previous" ] && break
  previous="$current"
  agent-device scroll down 0.5 >/dev/null
done
```

`longpress` is supported on iOS and Android.
`pinch` is supported on Apple simulators and macOS app sessions.

## Find (semantic)

```bash
agent-device find "Sign In" click
agent-device find label "Email" fill "user@example.com"
agent-device find role button click
```

## Assertions

```bash
agent-device is visible 'role="button" label="Continue"'
agent-device is exists 'id="primary-cta"'
agent-device is hidden 'text="Loading..."'
agent-device is editable 'id="email"'
agent-device is selected 'label="Wi-Fi"'
agent-device is text 'id="greeting"' "Welcome back"
```

- `is` evaluates UI predicates against a selector expression and exits non-zero on failure.
- Supported predicates are `visible`, `hidden`, `exists`, `editable`, `selected`, and `text`.
- `is visible` checks whether the resolved element is present in the current visible snapshot viewport. A node without its own rect still passes when a visible ancestor within the viewport provides the on-screen geometry.
- `is exists` only checks whether the selector matches in the current snapshot.
- `wait text` is a text-presence wait, not a hittability assertion.
- `is text <selector> <value>` compares the resolved element text against the expected value.
- `is` does not accept snapshot refs like `@e3`; use a selector expression instead.
- `is` accepts the same selector-oriented snapshot flags as `click`, `fill`, `get`, and `wait`.

## Replay

```bash
agent-device open Settings --platform ios --session e2e --save-script [path]
agent-device replay ./session.ad      # Run deterministic replay from .ad script
agent-device test ./suite             # Run every .ad file in a folder or glob serially
agent-device test ./suite --timeout 60000 --retries 1
agent-device replay -u ./session.ad   # Update selector drift and rewrite .ad script in place
```

- `replay` runs deterministic `.ad` scripts.
- `test` runs one or more `.ad` scripts as a serial suite from files, directories, or glob inputs.
- `test --platform <platform>` filters suite files by `context platform=...` metadata instead of overriding the script target.
- `test --timeout <ms>` and `test --retries <n>` apply per script attempt; `context timeout=...` and `context retries=...` can be declared inside the `.ad` header. Retries are capped at `3`, duplicate metadata keys are rejected, and timeouts are cooperative.
- `test --artifacts-dir <path>` overrides the default suite artifact root at `.agent-device/test-artifacts`.
- `test` prints failures and flaky passed-on-retry tests by default, and prints a short `Running replay suite...` line before dispatch; add `--verbose` to print pass and skip lines too.
- `replay -u` updates stale recorded actions and rewrites the same script.
- `--save-script` records a replay script on `close`; optional path is a file path and parent directories are created.

See [Replay & E2E (Experimental)](/agent-device/docs/replay-e2e.md) for recording and CI workflow details.

## Batch

```bash
agent-device batch --steps-file /tmp/batch-steps.json --json
agent-device batch --steps '[{"command":"open","positionals":["settings"]}]'
```

- `batch` runs a JSON array of steps in a single daemon request.
- Each step has `command`, optional `positionals`, optional `flags`, and optional `runtime`.
- Unknown top-level step fields are rejected.
- Stop-on-first-error is the supported behavior (`--on-error stop`).
- Use `--max-steps <n>` to tighten per-request safety limits.
- Batch requests inherit the same daemon lock policy and session binding metadata as the parent command.
- In non-JSON mode, successful batches print a short per-step summary.

See [Batching](/agent-device/docs/batching.md) for payload format, response shape, and usage guidelines.

## App install (in-place)

```bash
agent-device install com.example.app ./build/app.apk --platform android
agent-device install com.example.app ./build/MyApp.app --platform ios
```

- `install <app> <path>` installs from binary path without uninstalling first.
- Supports Android devices/emulators, iOS simulators, and iOS physical devices.
- Useful for upgrade flows where you want to keep existing app data when supported by the platform.
- Remote daemons automatically upload local app artifacts for `install`; prefix the path with `remote:` to use a daemon-side path verbatim.
- Supported binary formats: Android `.apk`/`.aab`, iOS `.app`/`.ipa`.
- `.aab` requires `bundletool` in `PATH`, or `AGENT_DEVICE_BUNDLETOOL_JAR=<absolute-path-to-bundletool-all.jar>` with `java` in `PATH`.
- Optional: `AGENT_DEVICE_ANDROID_BUNDLETOOL_MODE=<mode>` overrides bundletool `build-apks --mode` (default: `universal`).
- `.ipa` installs by extracting `Payload/*.app`; if multiple app bundles exist, `<app>` is used as a bundle id/name hint to select one.

## App reinstall (fresh state)

```bash
agent-device reinstall com.example.app ./build/app.apk --platform android
agent-device reinstall com.example.app ./build/MyApp.app --platform ios
```

- `reinstall <app> <path>` uninstalls and installs in one command.
- Supports Android devices/emulators, iOS simulators, and iOS physical devices.
- Useful for login/logout reset flows and deterministic test setup.
- Remote daemons automatically upload local app artifacts for `reinstall`; prefix the path with `remote:` to use a daemon-side path verbatim.
- Supported binary formats: Android `.apk`/`.aab`, iOS `.app`/`.ipa`.
- `.aab` accepts the same bundletool requirements and optional `AGENT_DEVICE_ANDROID_BUNDLETOOL_MODE` override as `install`.
- `.ipa` uses `<app>` as the selection hint when multiple `Payload/*.app` bundles are present.

## App install from source URL

```bash
agent-device install-from-source https://example.com/builds/app.apk --platform android
agent-device install-from-source https://example.com/builds/app.aab --platform android
agent-device install-from-source --github-actions-artifact thymikee/RNCLI83:6635342232 --platform android
```

- `install-from-source <url>` installs from a URL source through the normal daemon artifact flow.
- `install-from-source --github-actions-artifact <owner/repo:artifact>` passes a typed GitHub Actions artifact source through to a compatible remote daemon. Numeric artifacts are sent as `artifactId`; non-numeric artifacts are sent as `artifactName`.
- Repeat `--header <name:value>` for authenticated or signed artifact requests.
- Supports the same device coverage as `install`: Android devices/emulators, iOS simulators, and iOS physical devices.
- Use `install` or `reinstall` for local `.apk`, `.aab`, `.app`, and `.ipa` paths; use `install-from-source` when the artifact already exists at a URL reachable by the daemon.
- Direct Android URL sources may be `.apk` or `.aab`.
- Trusted artifact service URLs may resolve to archives containing one installable `.apk`, `.aab`, `.ipa`, or iOS `.app` tar archive. Prefer `--github-actions-artifact` for GitHub Actions artifacts that a compatible remote daemon can resolve with its own credentials.
- `--retain-paths` keeps retained materialized artifact paths after install, and `--retention-ms <ms>` sets their TTL.
- URL downloads follow the same `installFromSource()` safety checks and host restrictions as the JS client API.

## Push notification simulation

```bash
agent-device push com.example.app ./payload.apns --platform ios
agent-device push com.example.app '{"aps":{"alert":"Welcome","badge":1}}' --platform ios
agent-device push com.example.app '{"action":"com.example.app.PUSH","extras":{"title":"Welcome","unread":3,"promo":true}}' --platform android
```

- `push <bundle|package> <payload.json|inline-json>` simulates push notification delivery.
- iOS push simulation is simulator-only (`xcrun simctl push`) and requires an APNs-style JSON object payload.
- Android uses `adb shell am broadcast` and accepts payload shape:
  `{"action":"<intent-action>","receiver":"<optional component>","extras":{"key":"value","flag":true,"count":3}}`.
- Android extras support `string`, `boolean`, and `number` values.
- `push` works with the active session device, or with explicit selectors (`--platform`, `--device`, `--udid`, `--serial`).

## App event triggers (app hook)

```bash
agent-device trigger-app-event screenshot_taken '{"source":"qa"}'
```

- `trigger-app-event <event> [payloadJson]` dispatches app-defined events via deep link.
- `trigger-app-event` requires either an active session or explicit device selectors (`--platform`, `--device`, `--udid`, `--serial`).
- On macOS, use `AGENT_DEVICE_MACOS_APP_EVENT_URL_TEMPLATE` to override the desktop deep-link template.
- On iOS physical devices, custom-scheme deep links require active app context (open app first in the session).
- Configure one of:
  - `AGENT_DEVICE_APP_EVENT_URL_TEMPLATE`
  - `AGENT_DEVICE_IOS_APP_EVENT_URL_TEMPLATE`
  - `AGENT_DEVICE_MACOS_APP_EVENT_URL_TEMPLATE`
  - `AGENT_DEVICE_ANDROID_APP_EVENT_URL_TEMPLATE`
- Template placeholders: `{event}`, `{payload}`, `{platform}`.
- Example template: `myapp://agent-device/event?name={event}&payload={payload}`.
- `payloadJson` must be a JSON object.
- This is app-hook-based simulation and does not inject OS-global notifications.

## Settings helpers

```bash
agent-device settings wifi on
agent-device settings wifi off
agent-device settings airplane on
agent-device settings airplane off
agent-device settings location on
agent-device settings location off
agent-device settings animations off
agent-device settings animations on
agent-device settings appearance light
agent-device settings appearance dark
agent-device settings appearance toggle
agent-device settings faceid match
agent-device settings faceid nonmatch
agent-device settings faceid enroll
agent-device settings faceid unenroll
agent-device settings touchid match
agent-device settings touchid nonmatch
agent-device settings touchid enroll
agent-device settings touchid unenroll
agent-device settings fingerprint match
agent-device settings fingerprint nonmatch
agent-device settings permission grant camera
agent-device settings permission deny microphone
agent-device settings permission grant photos limited
agent-device settings permission reset notifications
agent-device settings permission grant accessibility --platform macos
agent-device settings permission reset screen-recording --platform macos
```

- iOS `settings` support is simulator-only except for `settings appearance` and the macOS permission subset on macOS.
- macOS supports only `settings appearance <light|dark|toggle>` and `settings permission <grant|reset> <accessibility|screen-recording|input-monitoring>`.
- `settings wifi|airplane|location|animations` remain intentionally unsupported on macOS.
- Android `settings animations off|on` toggles the global `window_animation_scale`, `transition_animation_scale`, and `animator_duration_scale` values. Use it as an opt-in stabilizer for automation runs with heavy system or app animations, then restore with `settings animations on` when needed.
- `settings appearance` maps to macOS appearance, iOS simulator appearance, and Android night mode.
- Face ID and Touch ID controls are iOS simulator-only.
- Fingerprint simulation is supported on Android targets where `cmd fingerprint` or `adb emu finger` is available.
  On physical Android devices, only `cmd fingerprint` is attempted.
- Permission actions are scoped to the active session app.
- iOS permission targets: `camera`, `microphone`, `photos` (`full` or `limited`), `contacts`, `notifications`.
- Android permission targets: `camera`, `microphone`, `photos`, `contacts`, `notifications`.
- macOS permission targets: `accessibility`, `screen-recording`, `input-monitoring`.
- On macOS, `settings permission grant ...` checks/request access and opens System Settings guidance when needed; it does not silently grant TCC permissions.
- On macOS, `settings permission deny ...` is intentionally unsupported.
- Android uses `pm grant|revoke` for runtime permissions (`reset` maps to revoke) and `appops` for notifications.
- `full|limited` mode is supported only for iOS `photos`; other targets reject mode.
- Use `match`/`nonmatch` to simulate valid/invalid Face ID, Touch ID, and Android fingerprint outcomes.

## App state and app lists

```bash
agent-device appstate
agent-device apps --platform ios
agent-device apps --platform ios --all
agent-device apps --platform android
agent-device apps --platform android --all
```

- Android `appstate` reports live foreground package/activity.
- iOS `appstate` is session-scoped and reports the app tracked by the active session on the target device.
- `apps` includes default/system apps by default (use `--user-installed` to filter).

## Clipboard

```bash
agent-device clipboard read
agent-device clipboard write "https://example.com"
agent-device clipboard write ""   # clear clipboard
```

- `clipboard read` returns clipboard text for the selected target.
- Treat `clipboard read` output as sensitive data; it can include secrets copied by the user or app.
- `clipboard write <text>` updates clipboard text on the selected target.
- Works with an active session device or explicit selectors (`--platform`, `--device`, `--udid`, `--serial`).
- Supported on macOS, Android emulator/device, and iOS simulator.
- iOS physical devices currently return `UNSUPPORTED_OPERATION` for clipboard commands.

## Keyboard

```bash
agent-device keyboard status
agent-device keyboard get
agent-device keyboard dismiss
```

- `keyboard status` (or `keyboard get`) returns keyboard visibility and best-effort input type classification on Android.
- `keyboard dismiss` attempts a non-navigation keyboard dismissal on Android and a native dismiss gesture/control on iOS, including common safe controls such as a keyboard toolbar `Done` button, then confirms the keyboard is hidden.
- If the keyboard remains visible after the platform-native dismiss path, the command returns an explicit `UNSUPPORTED_OPERATION` error instead of falling back to back navigation.
- On iOS, `keyboard dismiss` is best-effort and can fail when the active app exposes no native dismiss gesture/control. Prefer a visible app dismiss control, or use `back --system` only when system navigation is an acceptable side effect.
- Works with active sessions and explicit selectors (`--platform`, `--device`, `--udid`, `--serial`).
- `keyboard status|get` is supported on Android emulator/device.
- `keyboard dismiss` is supported on Android emulator/device and best-effort on iOS simulator/device.

## Performance metrics

```bash
agent-device perf --json
agent-device metrics --json
```

- `perf` (alias: `metrics`) returns a session-scoped metrics JSON blob.
- Without `--json`, `perf` prints a compact summary: frame health when reliable frame data is available, otherwise CPU/memory when those samples are available.
- `startup` is sampled from `open-command-roundtrip`: elapsed wall-clock time around each `open` command dispatch for the active session app target.
- Android app sessions with an active package also sample:
  - `fps` frame health from `adb shell dumpsys gfxinfo <package> framestats`, with `droppedFramePercent` as the primary value and `worstWindows` for dropped-frame clusters
  - `memory` from `adb shell dumpsys meminfo <package>` with values reported in kilobytes (`kB`)
  - `cpu` from `adb shell dumpsys cpuinfo`, aggregated across matching package processes and reported as a recent percentage snapshot
- Apple app sessions with an active bundle ID also sample:
  - `fps` frame health from `xcrun xctrace` Animation Hitches on connected iOS devices, with `droppedFramePercent` as the primary value and `worstWindows` for hitch clusters
  - `memory` from process RSS snapshots reported in kilobytes (`kB`)
  - `cpu` from process CPU usage snapshots reported as a recent percentage
- Platform support:
  - `startup`: iOS simulator, iOS physical device, Android emulator/device
  - `memory` and `cpu`: Android emulator/device, macOS app sessions, iOS simulators with an active app session (`open <app>` first), and iOS physical devices with an active app session
  - `fps`: Android emulator/device app sessions and connected iOS device app sessions. iOS simulator and macOS frame health is reported unavailable because Apple tooling does not expose trustworthy app hitch data there.
- If no startup sample exists yet for the session, run `open <app|url>` first and retry `perf`.
- Android URL/deep-link opens infer the foreground package after launch when possible, including Expo Go/dev-client shells. If the session still has no app package/bundle ID, package-bound metrics remain unavailable until you `open <app>`.
- Android frame health is reset after each successful `perf` read and after `open <app>`, so run `perf`, perform the interaction, then run `perf` again for a focused window.
- On physical iOS devices, `perf` records short `xcrun xctrace` Activity Monitor and Animation Hitches samples. Keep the device unlocked, connected, and the app active in the foreground while sampling.
- Interpretation note: this startup metric is command round-trip timing and does not represent true first frame / first interactive app instrumentation.
- CPU data is a lightweight process snapshot, so an idle app may legitimately read as `0`.

## React Native component internals

```bash
agent-device react-devtools status
agent-device react-devtools wait --connected
agent-device react-devtools get tree --depth 3
agent-device react-devtools get component @c5
agent-device react-devtools find Button
agent-device react-devtools profile start
agent-device react-devtools profile stop
agent-device react-devtools profile slow --limit 5
agent-device react-devtools profile rerenders --limit 5
```

- `react-devtools` dynamically runs pinned `agent-react-devtools@0.4.0` through npm and passes arguments through 1:1.
- The first run may download the pinned package from npm; later runs can reuse the npm cache.
- `agent-device` global flags work before or after `react-devtools`. Use `--` before downstream flags only when they intentionally share an `agent-device` global flag name.
- Use it when a React Native workflow needs component hierarchy, props, state, hooks, render causes, slow components, or re-render counts.
- Keep using `snapshot`, `press`, `fill`, `logs`, `network`, and `perf` for device/app runtime evidence. Use `react-devtools` for React internals.
- React Native development builds can connect to the DevTools daemon on port 8097. For Android emulators or physical devices, run `adb reverse tcp:8097 tcp:8097` if the app cannot reach the host. If Metro is local, also run `adb reverse tcp:8081 tcp:8081`.
- For Android and iOS sessions connected through a remote bridge profile, `react-devtools` registers a lease-scoped companion tunnel to the sandbox-local DevTools daemon at `127.0.0.1:8097`. Android bridge profiles use the bridge-owned remote `adb reverse` mapping; iOS bridge profiles use the bridge-owned wildcard Metro host tunnel. The CLI keeps the companion alive until `agent-device react-devtools stop` or `agent-device disconnect`.
- For remote iOS bridge sessions, open the app once to create the bridge session, run `agent-device react-devtools start`, then relaunch the same bundle id with `agent-device open <bundle-id> --platform ios --relaunch` before `wait --connected`. React Native attempts the legacy DevTools websocket during JavaScript startup, so starting DevTools after the first launch can miss that connection attempt.
- Remote bridge React DevTools assumes the React Native-bundled DevTools behavior in React Native 0.83+. Older browser/Chromium DevTools workflows are not assumed to exist inside remote sandboxes. Expo projects should be verified against the SDK's bundled React Native version before relying on this path; this release does not claim a separately verified Expo SDK version.
- For cross-platform validation with explicit target selectors, prefer an isolated `--state-dir` over separate named sessions. Named sessions enable bound-session locks during setup. Restart `react-devtools` between iOS and Android runs.

## Metro reload

```bash
agent-device metro reload
agent-device metro reload --metro-host localhost --metro-port 8081
agent-device metro reload --bundle-url "http://localhost:8081/index.bundle?platform=ios"
```

- `metro reload` calls Metro's `/reload` endpoint, the same mechanism used by pressing `r` in the Metro terminal.
- Use it for React Native dev builds that are already connected to Metro when JS changes should be loaded without restarting the native app process.
- If an active remote connection has Metro runtime hints, `metro reload` uses those saved hints. Otherwise it defaults to `http://localhost:8081/reload`.
- Pass `--metro-host`, `--metro-port`, or `--bundle-url` when you need to target a specific Metro instance.
- Fall back to `open <app> --relaunch` when the app is not connected to Metro, reload fails, or the native process itself must restart.

## Media and logs

```bash
agent-device screenshot                 # Auto filename
agent-device screenshot page.png        # Explicit screenshot path
agent-device screenshot page.png --max-size 1024  # Downscale longest edge for agent-friendly artifacts
agent-device screenshot page.png --overlay-refs  # Draw current @eN refs and target rectangles onto the PNG
agent-device screenshot textedit.png    # App-session window capture on macOS
agent-device screenshot --fullscreen    # Force full-screen capture on macOS app sessions
agent-device open --platform macos --surface desktop && agent-device screenshot desktop.png
agent-device diff screenshot --baseline baseline.png --out diff.png
agent-device diff screenshot --baseline baseline.png current.png --out diff.png
agent-device diff screenshot --baseline baseline.png --out diff.png --overlay-refs
agent-device record start               # Start screen recording to auto filename
agent-device record start session.mp4   # Start recording to explicit path
agent-device record start session.mp4 --fps 30  # Override iOS device runner FPS
agent-device record start session.mp4 --quality 7 # Scale recording resolution to 70%
agent-device record stop                # Stop active recording
```

- Recordings always produce a video artifact. When touch visualization is enabled, they also produce a gesture telemetry sidecar that can be used for post-processing or inspection.
- `screenshot --max-size <px>` preserves aspect ratio and only downscales when the saved PNG's longest edge is larger than the requested size.
- `screenshot --overlay-refs` captures a fresh full snapshot and burns visible `@eN` refs plus their target rectangles into the saved PNG.
- `screenshot --max-size <px> --overlay-refs` writes a smaller image and draws refs for that final image size; avoid very small max sizes when text, icons, or labels need to remain readable.
- `diff screenshot` compares the current live screenshot to `--baseline`, or compares `--baseline` to an optional saved `current.png` path without requiring an active session, then prints ranked changed regions with screen-space rectangles, shape, size, density, average color, and luminance, and writes a diff PNG with a light grayscale current-screen context, red-tinted changed pixels, and outlined changed regions when `--out` is provided. JSON also includes normalized bounds.
- If `tesseract` is installed, `diff screenshot` also adds best-effort OCR text deltas, movement clusters, and bbox size-change hints to the text and JSON output. OCR improves descriptions only; it does not change the pixel comparison or the diff PNG.
- When OCR is available, `diff screenshot` also reports best-effort non-text visual deltas by masking OCR text boxes out of the diff and clustering remaining residuals. These are hints for icons, controls, and separators, not semantic icon recognition.
- `diff screenshot --overlay-refs` additionally writes a separate current-screen overlay guide for live captures without using that annotated image for the pixel comparison. If current-screen refs intersect changed regions, the output lists the best ref matches under those regions. Saved-image comparisons do not have live accessibility refs, so `--overlay-refs` is unavailable when a `current.png` path is provided.
- In `--json` mode, each overlay ref also includes a screenshot-space `center` point for coordinate fallback like `press <x> <y>`.
- Burned-in touch overlays are exported only on macOS hosts, because the overlay pipeline depends on Swift + AVFoundation helpers.
- On Linux or other non-macOS hosts, `record stop` still succeeds and returns the raw video plus telemetry sidecar, and includes `overlayWarning` when burn-in overlays were skipped.

**Session app logs (token-efficient debugging):** Logging is off by default in normal flows. Enable it on demand for debugging. Logs are written to a file so agents can grep instead of loading full output into context.

```bash
agent-device logs path                  # Print session log file path (e.g. ~/.agent-device/sessions/default/app.log)
agent-device logs start                 # Start streaming app stdout/stderr to that file (requires open first)
agent-device logs stop                  # Stop streaming
agent-device logs clear                 # Truncate app.log + remove rotated app.log.N files (requires stopped stream)
agent-device logs clear --restart       # Stop stream, clear log files, and start streaming again
agent-device logs doctor                # Show logs backend/tool checks and readiness hints
agent-device logs mark "before submit"  # Insert timeline marker into app.log
agent-device network dump 25            # Parse recent HTTP(s) requests (method/url/status) from session app log
agent-device network dump 25 --include all # Include parsed headers/body when available (truncated)
```

- Supported on iOS simulator, iOS physical device, and Android.
- Preferred debug entrypoint: `logs clear --restart` for clean-window repro loops.
- `logs start` appends to `app.log` and rotates to `app.log.1` when the file exceeds 5 MB.
- `network dump [limit] [summary|headers|body|all]` parses recent HTTP(s) entries from `app.log`; `network log ...` is an alias.
- Prefer `--include headers|body|all` when you want explicit detail level without relying on positional ordering.
- On macOS, `logs` and `network dump` are app-scoped and parse Unified Logging output associated with the active session app.
- Network dump limits: scans up to 4000 recent log lines, returns up to 200 entries, and truncates payload/header fields at 2048 characters.
- Android `network dump` also surfaces logcat timestamps and can backfill status and duration from adjacent GIBSDK packet lines when the URL is logged separately.
- Android log streaming automatically rebinds to the app PID after process restarts.
- iOS simulator log capture now streams from inside the simulator with `simctl spawn <udid> log ...`, and `network dump` can recover recent simulator log history with `simctl log show` when the live app-log window is sparse.
- iOS log capture still relies on Unified Logging signals (for example `os_log`); plain stdout/stderr output may be limited depending on app/runtime.
- On iOS, `network dump` can return zero HTTP entries for real app activity when the app does not emit request metadata into Unified Logging. The response notes now distinguish between an empty repro window and a non-network app log window.
- Retention knobs: set `AGENT_DEVICE_APP_LOG_MAX_BYTES` and `AGENT_DEVICE_APP_LOG_MAX_FILES` to override rotation limits.
- Optional write-time redaction patterns: set `AGENT_DEVICE_APP_LOG_REDACT_PATTERNS` to a comma-separated regex list.

**Grepping app logs:** Use `logs path` to get the file path, then run `grep` (or `grep -E`) on that path so only matching lines enter context—keeping token use low.

```bash
# Get path first (e.g. ~/.agent-device/sessions/default/app.log)
agent-device logs path

# Then grep the path; -n adds line numbers for reference
grep -n "Error\|Exception\|Fatal" ~/.agent-device/sessions/default/app.log
grep -n -E "Error|Exception|Fatal|crash" ~/.agent-device/sessions/default/app.log
grep -n -E "agent-device.*mark|before submit" ~/.agent-device/sessions/default/app.log

# Last 50 lines only (bounded context)
tail -50 ~/.agent-device/sessions/default/app.log
```

- Use `-n` to include line numbers. Use `-E` for extended regex and `|` without escaping in the pattern.

- Prefer targeted patterns (e.g. `Error`, `Exception`, your log tags) over reading the whole file.

- `logs mark "before submit"` lines are prefixed with `[agent-device][mark][...]`, so grep for `agent-device.*mark` when you need timing markers back quickly.

- iOS `record` works on simulators and physical devices.

- iOS simulator recording uses native `simctl io ... recordVideo`.

- Physical iOS device capture is runner-based and built from repeated `XCUIScreen.main.screenshot()` frames (no native video stream/audio capture).

- Physical iOS device recording requires an active app session context (`open <app>` first).

- Physical iOS device capture is best-effort: dropped frames are expected and true 60 FPS is not guaranteed even with `--fps 60`.

- Physical-device capture defaults to 15 FPS.

- `--fps <n>` (1-120) applies to physical iOS device recording as an explicit FPS cap.

- `--quality <5-10>` scales recording resolution from 50% through native resolution without changing FPS. Omitting it preserves the platform's current/native recording resolution.

## Tracing

```bash
agent-device trace start
agent-device trace start session.trace
agent-device trace stop
agent-device trace stop session.trace
```

- `trace start [path]` begins trace-log capture for the active session.
- `trace stop [path]` stops capture and optionally writes or finalizes the trace artifact at the provided path.
- `trace` is intended for lower-level session diagnostics than `record` or `logs`.

## Remote Metro workflow

When the cloud control plane owns the connection profile, connect can discover it directly:

```bash
agent-device connect
agent-device open com.example.myapp --relaunch
agent-device snapshot -i
agent-device disconnect
```

For local profile files, create an `agent-device.remote.json`:

```json
{
  "daemonBaseUrl": "https://bridge.example.com/agent-device",
  "daemonTransport": "http",
  "tenant": "acme",
  "runId": "run-123",
  "session": "adc-ios",
  "sessionIsolation": "tenant",
  "platform": "ios",
  "leaseBackend": "ios-instance",
  "metroProjectRoot": ".",
  "metroProxyBaseUrl": "https://bridge.example.com"
}
```

```bash
agent-device connect --remote-config ./agent-device.remote.json
agent-device open com.example.myapp --relaunch
agent-device snapshot -i
agent-device disconnect
```

For self-contained scripts, pass the same profile to each step:

```bash
agent-device install-from-source https://example.com/builds/Demo.app.zip --remote-config ./agent-device.remote.json --platform ios
agent-device open com.example.myapp --remote-config ./agent-device.remote.json --relaunch
agent-device snapshot --remote-config ./agent-device.remote.json -i
agent-device disconnect --remote-config ./agent-device.remote.json
```

- `connect` without `--remote-config` authenticates to cloud when needed, fetches the connection profile, writes a generated local profile, stores the remote scope locally, and defers tenant lease allocation plus Metro preparation until a later command needs them.
- Cloud connection profile responses must return a JSON object at `connection.remoteConfigProfile`. The older `connection.remoteConfig` JSON string shape is no longer accepted.
- `--remote-config <path>` points to a local remote workflow profile that captures stable host, tenant/run, and any optional session, platform, lease backend, or Metro overrides for `connect`.
- `connect --remote-config ...` follows the same state and deferred-preparation flow using the local profile instead of cloud discovery.
- Auth management commands are available for inspection and recovery: `agent-device auth status`, `agent-device auth login`, and `agent-device auth logout`. Human login stores a revocable CLI session locally; it does not create or persist an `adc_live_...` service token.
- Cloud auth uses three credential classes: `adc_agent_...` short-lived command tokens, revocable CLI session refresh credentials, and explicit `adc_live_...` service/API tokens for CI. The CLI implements credential selection, CI refusal, local storage permissions, logout, and output redaction; the cloud API must enforce token expiry, tenant/run scope, revocation, one-time device approval, polling rate limits, and dashboard/API separation.
- `AGENT_DEVICE_CLOUD_BASE_URL` should point at the bridge/control-plane API origin, not necessarily the dashboard origin. API-token setup links use `/api-keys` on that origin so the bridge can redirect users to the right dashboard page.
- Deferred Metro preparation also applies to `batch` when any step opens an app and the batch does not provide its own per-step runtime.
- After `connect`, `install-from-source`, `open`, `snapshot`, `devices`, `press`, `fill`, `screenshot`, and other normal commands reuse active connection state so agents do not repeat remote host/session/lease selectors inline. If `connection status` shows `leaseId=pending`, the first platform-bound command allocates or refreshes the lease. Passing the same `--remote-config` to a normal command is also supported for self-contained scripts; the CLI reuses matching saved state or creates it before dispatch.
- Self-contained remote scripts should end with `disconnect --remote-config <path>` or `disconnect` to release the lease and stop the owned Metro companion.
- Explicit command-line flags override connected defaults. When `open` uses explicit remote daemon or tenant flags without saved runtime hints, the CLI warns because React Native apps may launch without Metro bundle/runtime hints.
- `metroProxyBaseUrl` is the bridge origin. Do not prebuild `/api/metro/...` paths in the client profile; the CLI calls the bridge endpoints itself.
- For cloud stock React Native iOS, the bridge descriptor supplies direct wildcard HTTPS Metro hints such as `<runtime>.metro.agent-device.dev:443`. The XCTest runner package is still used for runner-backed device commands, not for Metro reachability.
- Android keeps using bridge-provided runtime routes such as `/api/metro/runtimes/<runtimeId>/...`.
- `metroPublicBaseUrl` is only needed for direct/non-bridge bundle hints. Bridged profiles can omit it and rely on `metroProxyBaseUrl`.
- `metro prepare --remote-config ...` remains an advanced inspection/debug path and can still write a `--runtime-file <path>` artifact when needed.
- The local Metro companion runs on the same machine as the React Native project and Metro. `disconnect` stops the companion owned by the connection, but it does not stop the user’s Metro server.

### Cloud profile response migration

`/api/control-plane/connection-profile` must return an object at `connection.remoteConfigProfile`, for example `{"connection":{"remoteConfigProfile":{"daemonBaseUrl":"https://bridge.example.com/agent-device","daemonTransport":"http","tenant":"acme","runId":"run-123"}}}`. The old `connection.remoteConfig` JSON-string wrapper is rejected.

## Session inspection

```bash
agent-device session list
agent-device session list --json
```

- `session list` shows active daemon sessions and their tracked device/app context.
- Use `--json` when you want to inspect or script against the raw session metadata.

## iOS device prerequisites

- Xcode + `xcrun devicectl` available.
- Paired physical device with Developer Mode enabled.
- Use Automatic Signing in Xcode, or pass optional env overrides:
  - `AGENT_DEVICE_IOS_TEAM_ID`
  - `AGENT_DEVICE_IOS_SIGNING_IDENTITY` (optional)
  - `AGENT_DEVICE_IOS_PROVISIONING_PROFILE`
  - `AGENT_DEVICE_IOS_BUNDLE_ID` (runner bundle-id base; tests use `<id>.uitests`)
- Free Apple Developer (Personal Team) accounts can fail on unavailable generic bundle IDs; set `AGENT_DEVICE_IOS_BUNDLE_ID` to a unique reverse-DNS value.
- If first-run XCTest setup/build is slow, increase daemon request timeout:
  - `AGENT_DEVICE_DAEMON_TIMEOUT_MS=120000` (default is `90000`)
- For daemon startup troubleshooting:
  - follow stale metadata hints for `<state-dir>/daemon.json` and `<state-dir>/daemon.lock` (`state-dir` defaults to `~/.agent-device`)


---
url: /agent-device/docs/configuration.md
---

# Configuration

Create an `agent-device.json` file to set persistent CLI defaults instead of repeating flags on every command.

## Config file locations

agent-device checks these sources in priority order:

| Priority    | Location                      | Scope                   |
| ----------- | ----------------------------- | ----------------------- |
| 1 (lowest)  | `~/.agent-device/config.json` | User-level defaults     |
| 2           | `./agent-device.json`         | Project-level overrides |
| 3           | `AGENT_DEVICE_*` env vars     | Override config values  |
| 4 (highest) | CLI flags                     | Override everything     |

Project-level values override user-level values. Environment variables override both. CLI flags always win.

Use `--config <path>` or `AGENT_DEVICE_CONFIG` to load one specific config file instead of the default locations.

## Config format

Config files use JSON objects with camelCase keys matching existing CLI flag names.

Environment variables follow the same fields using `AGENT_DEVICE_*` uppercase snake case names, for example:

- `session` -> `AGENT_DEVICE_SESSION`
- `daemonBaseUrl` -> `AGENT_DEVICE_DAEMON_BASE_URL`
- `iosSimulatorDeviceSet` -> `AGENT_DEVICE_IOS_SIMULATOR_DEVICE_SET`
- `androidDeviceAllowlist` -> `AGENT_DEVICE_ANDROID_DEVICE_ALLOWLIST`

When a CLI option has multiple flag aliases, config and environment sources use the canonical option value rather than a flag alias. Example:

- config: `"appsFilter": "user-installed"`
- env: `AGENT_DEVICE_APPS_FILTER=user-installed`
- not: `--user-installed`

Legacy compatibility env vars are still accepted for device scoping:

- `IOS_SIMULATOR_DEVICE_SET`
- `ANDROID_DEVICE_ALLOWLIST`

Example:

```json
{
  "platform": "ios",
  "device": "iPhone 16",
  "session": "qa-ios",
  "snapshotDepth": 3,
  "daemonBaseUrl": "http://127.0.0.1:4310/agent-device"
}
```

For non-loopback remote daemon URLs, also set `daemonAuthToken` or `AGENT_DEVICE_DAEMON_AUTH_TOKEN`. The client rejects non-loopback remote daemon URLs without auth.

Common keys include:

- `stateDir`
- `daemonBaseUrl`
- `daemonAuthToken`
- `daemonTransport`
- `daemonServerMode`
- `tenant`
- `sessionIsolation`
- `runId`
- `leaseId`
- `leaseBackend`
- `sessionLock`
- `sessionLocked`
- `sessionLockConflicts`
- `platform`
- `target`
- `device`
- `udid`
- `serial`
- `iosSimulatorDeviceSet`
- `androidDeviceAllowlist`
- `session`
- `verbose`
- `json`

Command-specific defaults are supported too, for example `snapshotDepth`, `snapshotScope`, `activity`, `relaunch`, `shutdown`, `fps`, `quality`, `stepsFile`, or `saveScript`.

`install-from-source` can also read a structured GitHub Actions artifact source from config when a compatible remote daemon resolves CI artifacts server-side:

```json
{
  "platform": "android",
  "installSource": {
    "type": "github-actions-artifact",
    "repo": "thymikee/RNCLI83",
    "artifact": "rn-android-emulator-debug-pr-19"
  }
}
```

Use a numeric `artifact` value for an artifact ID. Use a string `artifact` value for an artifact name.

Bound-session defaults use the same config and env mapping too:

- `sessionLock` -> `AGENT_DEVICE_SESSION_LOCK`
- `sessionLocked` -> `AGENT_DEVICE_SESSION_LOCKED`
- `sessionLockConflicts` -> `AGENT_DEVICE_SESSION_LOCK_CONFLICTS`

## Command-specific defaults

Command-specific keys are applied only when the current command supports them.

Examples:

- A default `snapshotDepth` applies to `snapshot`, `diff snapshot`, `click`, `fill`, `get`, `wait`, `find`, and `is`.
- The same `snapshotDepth` value is ignored for commands like `open`, `close`, or `devices`.

This keeps one shared config file usable across different command families.

## Failure behavior

- If `--config` or `AGENT_DEVICE_CONFIG` points to a missing file, agent-device fails during CLI parse before contacting the daemon.
- Invalid JSON, unknown keys, or invalid values in config files also fail during CLI parse with `INVALID_ARGS`.


---
url: /agent-device/docs/debugging-profiling.md
---

# Debugging & Profiling

Use `agent-device` when the task moves past UI automation and you need runtime evidence from the app or device layer.

## What `agent-device` covers well

- Session app logs for targeted debugging windows
- Network inspection from recent HTTP(s) entries in app logs via `network dump`
- Performance snapshots with `perf` / `metrics`
- Screenshots, recordings, and replayable repro flows

## React Native component internals

If the task needs the React Native component tree, props, state, hooks, or render profiling, use the `react-devtools` passthrough:

```bash
agent-device react-devtools status
agent-device react-devtools get tree --depth 3
agent-device react-devtools get component @c5
agent-device react-devtools profile start
agent-device react-devtools profile stop
agent-device react-devtools profile slow --limit 5
```

`agent-device` remains centered on the device and app runtime layer. The `react-devtools` command dynamically runs pinned `agent-react-devtools` commands for React internals.

## Fast path

```bash
agent-device open MyApp --platform ios
agent-device logs clear --restart
agent-device network dump 25 --include headers
agent-device perf --json
agent-device logs path
```

Use this flow when you need a clean repro window with logs, recent network activity, and a quick perf sample from the active app session.

## Core commands

### Logs

```bash
agent-device logs start
agent-device logs stop
agent-device logs clear --restart
agent-device logs path
agent-device logs doctor
agent-device logs mark "before submit"
```

- Logging is off by default; enable it only for focused debugging windows.
- Prefer `logs clear --restart` for clean repro loops.
- Use `logs path` and then grep the file instead of loading whole logs into agent context.

### Network inspection

```bash
agent-device network dump 25
agent-device network dump 25 --include headers
agent-device network dump 25 --include all
```

- `network dump` parses recent HTTP(s) entries from the session app log.
- `network log` is an alias for `network dump`.
- Parsed results depend on what the app emits into the platform log backend.

### Performance snapshots

```bash
agent-device perf --json
agent-device metrics --json
```

- `perf` returns session-scoped startup and, where supported, CPU, memory, and Android frame-health samples.
- Startup is measured around the `open` command; it is not first-frame instrumentation.
- CPU, memory, and Android frame-health availability depend on platform and whether the active session is bound to an app/package.
- On Android, use `metrics.fps.droppedFramePercent` for the health check and `metrics.fps.worstWindows` to line up jank clusters with logs, network activity, or recent actions.

## Where to go deeper

- Full command reference: [Commands](/agent-device/docs/commands.md)
- Typed client observability APIs: [Typed Client](/agent-device/docs/client-api.md)
- Session behavior and lifecycle: [Sessions](/agent-device/docs/sessions.md)


---
url: /agent-device/docs/installation.md
---

# Installation

Install `agent-device` on the machine where the coding agent will run terminal commands.

## Global install

```bash
npm install -g agent-device@latest
agent-device --version
agent-device help
```

Use global install for normal agent workflows. It gives agents a stable `agent-device` command and version-matched help topics:

```bash
agent-device help workflow
agent-device help debugging
agent-device help react-devtools
```

For Cursor, Codex, Claude Code, Windsurf, Cline, Goose, skills, project rules, and MCP configuration, see [AI Agent Setup](/agent-device/docs/agent-setup.md). For the first app automation commands, see [Quick Start](/agent-device/docs/quick-start.md).

Interactive CLI runs periodically check for a newer published `agent-device` package in the background. When an upgrade is available, the CLI suggests reinstalling the package globally:

```bash
npm install -g agent-device@latest
agent-device --version
```

Set `AGENT_DEVICE_NO_UPDATE_NOTIFIER=1` to disable the notice.

## Agent clients and MCP

The official MCP router is intentionally thin. It helps clients discover install/status/help, while app and device automation remains explicit CLI activity in the terminal.

```bash
agent-device mcp
```

Use [AI Agent Setup](/agent-device/docs/agent-setup.md#mcp-router) for copy-paste MCP client configuration.

## Without installing

```bash
npx -y agent-device@latest --version
npx -y agent-device@latest help workflow
npx -y agent-device@latest open Settings --platform ios
```

One-off `npx` usage is fine for humans and scripts. For agents, prefer global install so repeated commands and any installed skills resolve to the same CLI version. If an agent cannot rely on skills, it should run `agent-device help` or `agent-device help workflow` before planning device commands.

## Requirements

- Node.js 22+
- Xcode for iOS simulator/device automation (`simctl` + `devicectl`)
- Android SDK / ADB for Android
- On macOS desktop targets, Swift 5.9+ / Xcode command-line tools are used to build the local `agent-device-macos-helper` on first use from source checkouts

## macOS desktop notes

- The macOS desktop path uses a local `agent-device-macos-helper` for permission checks (`settings permission ...`), alert handling, and helper-backed desktop snapshot surfaces (`frontmost-app`, `desktop`, `menubar`).
- Source checkouts build the helper lazily on first use and cache it under `~/.agent-device/macos-helper/current/`.
- Release distribution should ship a stable signed/notarized helper build so macOS trust/TCC state is tied to a durable code signature instead of an ad-hoc local binary.
- Local helper overrides through `AGENT_DEVICE_MACOS_HELPER_BIN` are intended for operators and packaged distributions; the value must be an absolute executable path.

## iOS physical device prerequisites

- Device is paired and visible in `xcrun devicectl list devices`.
- Developer Mode enabled on device.
- Signing configured in Xcode (Automatic Signing recommended), or use:
- `AGENT_DEVICE_IOS_TEAM_ID`
- `AGENT_DEVICE_IOS_SIGNING_IDENTITY`
- `AGENT_DEVICE_IOS_PROVISIONING_PROFILE`
- `AGENT_DEVICE_IOS_BUNDLE_ID` (optional runner bundle-id base override)
- Free Apple Developer (Personal Team) accounts can fail with "bundle identifier is not available" for generic IDs; set `AGENT_DEVICE_IOS_BUNDLE_ID` to a unique reverse-DNS value (for example `com.yourname.agentdevice.runner`).
- If device setup is slow, increase daemon timeout:
  - `AGENT_DEVICE_DAEMON_TIMEOUT_MS=120000` (default is `90000`)
- If daemon startup reports stale metadata, remove stale files and retry:
  - `<state-dir>/daemon.json`
  - `<state-dir>/daemon.lock`
  - default state dir is `~/.agent-device` unless `AGENT_DEVICE_STATE_DIR` or `--state-dir` is set
- Optional remote tenancy/lease controls:
  - `AGENT_DEVICE_MAX_SIMULATOR_LEASES=<n>`
  - `AGENT_DEVICE_LEASE_TTL_MS=<ms>`
  - `AGENT_DEVICE_LEASE_MIN_TTL_MS=<ms>`
  - `AGENT_DEVICE_LEASE_MAX_TTL_MS=<ms>`


---
url: /agent-device/docs/known-limitations.md
---

# Known Limitations

Platform constraints that affect automation behavior.

## iOS: "Allow Paste" dialog suppressed under XCUITest

iOS 16+ shows an "Allow Paste" system prompt when an app reads `UIPasteboard.general` in the foreground. When an app is launched or activated through the XCUITest runner (which `agent-device` uses for iOS), the iOS runtime detects the testing context and silently grants pasteboard access — the prompt never appears.

This is an Apple platform constraint that affects all XCUITest-based automation tools.

**Workarounds:**

- **Pre-fill the pasteboard via simctl** — set clipboard content without triggering the dialog:
  ```bash
  echo "some text" | xcrun simctl pbcopy booted
  ```
- **Test the dialog manually** — the "Allow Paste" UX cannot be exercised through XCUITest-based automation.

## Android: non-ASCII text over `adb shell input text`

Some Android system images fail to inject non-ASCII text (for example Chinese characters or emoji) through `adb shell input text`.

**Workarounds:**

- **Use an ADB keyboard IME for test runs**:
  ```bash
  adb -s <serial> install <path-to-adbkeyboard.apk>
  adb -s <serial> shell ime enable com.android.adbkeyboard/.AdbIME
  adb -s <serial> shell ime set com.android.adbkeyboard/.AdbIME
  ```
- **Use trusted APK sources only** (official project: [https://github.com/senzhk/ADBKeyBoard](https://github.com/senzhk/ADBKeyBoard) or F-Droid: [https://f-droid.org/packages/com.android.adbkeyboard/](https://f-droid.org/packages/com.android.adbkeyboard/)), and verify checksum/signature before installing.
- **Revert to your normal IME after automation**:
  ```bash
  adb -s <serial> shell ime list -s
  adb -s <serial> shell ime set <previous-ime-id>
  ```


---
url: /agent-device/docs/quick-start.md
---

# Quick Start

For client-specific setup in Cursor, Codex, Claude Code, Windsurf, Cline, Goose, and other coding agents, see [AI Agent Setup](/agent-device/docs/agent-setup.md). For a single text bundle that agents can ingest, use [llms-full.txt](https://incubator.callstack.com/agent-device/llms-full.txt).

Every device automation follows this pattern:

```bash
# 1. Discover the installed app identifier when needed
agent-device apps --platform ios # or android

# 2. Navigate
agent-device open SampleApp --platform ios # or android

# 3. Snapshot to get element refs
agent-device snapshot -i
# Output:
# @e1 [heading] "Sample App"
# @e2 [button] "Settings"
# [off-screen below] 2 interactive items: "Privacy", "Battery"

# 4. Interact using refs
agent-device click @e2

# 5. Re-snapshot before next interactions; if a target only appears in an off-screen summary, scroll and re-snapshot first
agent-device snapshot -i

# 6. Optional: see structural changes since last baseline
agent-device diff snapshot
# or, from snapshot-focused help/examples:
agent-device snapshot --diff
```

React Native dev or debug builds often show warning or error overlays that can intercept taps or hide the real UI state. Check for them near app open and after major transitions. If they are not the requested behavior, dismiss them and continue, but mention them in your summary if you saw them.

Boot target if there is no ready device/simulator:

```bash
agent-device boot --platform ios # or android
# Android emulator launch by AVD name (GUI mode):
agent-device boot --platform android --device Pixel_9_Pro_XL
# Android headless emulator boot (AVD name):
agent-device boot --platform android --device Pixel_9_Pro_XL --headless
```

## Common commands

```bash
agent-device apps --platform android    # Discover the exact package name when unsure
agent-device open SampleApp
agent-device snapshot -i                 # Get visible interactive elements with refs
agent-device diff snapshot               # Preferred exploration form for structural deltas
agent-device click @e2                   # Click by ref
agent-device fill @e3 "test@example.com" # Clear then type (Android verifies and retries once if needed)
agent-device press @e3
agent-device type " more" --delay-ms 80  # Append into the already focused field
agent-device get text @e1                # Get text content
agent-device screenshot page.png         # Save to specific path
agent-device install com.example.app ./build/app.apk     # Install app binary in-place
agent-device install-from-source https://example.com/builds/app.apk --platform android
agent-device reinstall com.example.app ./build/app.apk   # Fresh-state uninstall + install
agent-device close
```

`install`/`reinstall` binary format support:

- Android: `.apk` and `.aab`
- iOS: `.app` and `.ipa`
- `.aab` requires `bundletool` in `PATH`, or `AGENT_DEVICE_BUNDLETOOL_JAR=<absolute-path-to-bundletool-all.jar>` with `java` in `PATH`.
- Optional: `AGENT_DEVICE_ANDROID_BUNDLETOOL_MODE=<mode>` overrides bundletool `build-apks --mode` (default: `universal`).
- `.ipa` installs extract `Payload/*.app`; if multiple app bundles exist, `<app>` selects the target by bundle id or bundle name.
- Use `install-from-source` for existing artifact URLs, including direct Android `.apk`/`.aab` URLs and trusted archives with one installable artifact. Use `install-from-source --github-actions-artifact <owner/repo:artifact>` for daemon-resolved GitHub Actions artifacts.

If `open` fails because no booted simulator/emulator/device is available, run `boot --platform ios|android` and retry.
If `open` fails because the app id is wrong or missing, run `apps` and retry with the discovered package or bundle id instead of guessing.

## Fast batching

When an agent already knows a short sequence of actions, batch them:

```bash
agent-device batch \
  --platform ios \
  --steps-file /tmp/batch-steps.json \
  --json
```

Example batch payload for a known chat flow:

```json
[
  { "command": "open", "positionals": ["ChatApp"], "flags": { "platform": "android" } },
  { "command": "click", "positionals": ["label=\"Travel chat\""], "flags": {} },
  { "command": "wait", "positionals": ["label=\"Message\"", "3000"], "flags": {} },
  { "command": "fill", "positionals": ["label=\"Message\"", "Filed the expense"], "flags": {} },
  { "command": "press", "positionals": ["label=\"Send\""], "flags": {} }
]
```

See [Batching](/agent-device/docs/batching.md) for payload format, failure handling, and best practices.

## Semantic discovery

Use `find` for human-readable targeting without refs:

```bash
agent-device find "Sign In" click
agent-device find label "Email" fill "user@example.com"
agent-device find role button click
```

## Replay (experimental)

For deterministic replay scripts and E2E guidance, see [Replay & E2E (Experimental)](/agent-device/docs/replay-e2e.md).

## Scrolling

Navigate content that extends beyond the viewport:

```bash
agent-device scroll down 0.5            # Scroll down half screen
agent-device scroll up 0.3              # Scroll up 30%
agent-device scroll down --pixels 320   # Scroll down by a fixed distance
```

## Settings helpers

Toggle device settings directly:

```bash
agent-device settings wifi on
agent-device settings airplane on
agent-device settings appearance toggle
agent-device settings location off
agent-device settings permission grant camera
```

Note: iOS `settings` commands are simulator-only. On macOS, only `settings appearance ...` and `settings permission <grant|reset> <accessibility|screen-recording|input-monitoring>` are supported.

## JSON output

For programmatic parsing in scripts:

```bash
agent-device snapshot --json
agent-device get text @e1 --json
```

Note: The default text output is more compact and preferred for AI agents.


---
url: /agent-device/docs/replay-e2e.md
---

# Replay & E2E Testing (Experimental)

Agents use refs for exploration and authoring. Replay scripts are deterministic runs that can be used for E2E testing.

## Core model

Two-pass workflow:

1. Agent pass: discover and interact with refs (`snapshot` -> `click @e..` / `fill @e..`).
2. Deterministic pass: run recorded `.ad` script with `replay`.

## Record a replay script

Enable recording during a session:

```bash
agent-device open Settings --platform ios --session e2e --save-script
agent-device snapshot -i --session e2e
agent-device click @e13 --session e2e
agent-device close --session e2e
```

By default, on `close`, a replay script is written to:

```text
~/.agent-device/sessions/<session>-<timestamp>.ad
```

You can also provide a custom output file path:

```bash
agent-device open Settings --platform ios --session e2e --save-script ./workflows/e2e-settings.ad
```

- `--save-script` value is treated as a file path.
- Parent directories are created automatically when they do not exist.
- For ambiguous bare values, use `--save-script=workflow.ad` or a path-like value such as `./workflow.ad`.

## Run replay

```bash
agent-device replay ~/.agent-device/sessions/e2e-2026-02-09T12-00-00-000Z.ad --session e2e-run
```

- Replay reads `.ad` scripts.

## Run a lightweight `.ad` suite

```bash
agent-device test ./workflows
agent-device test "./workflows/**/*.ad" --platform android
agent-device test ./workflows --timeout 60000 --retries 1
agent-device test ./workflows --artifacts-dir ./tmp/agent-device-artifacts
```

- `test` discovers `.ad` files from files, directories, or globs and runs them serially.
- `context platform=...` inside each `.ad` file is the target source of truth for suite execution.
- `--platform` is a filter for suite discovery; files without platform metadata are skipped when a filter is present.
- `context timeout=...` and `context retries=...` can be declared per script; CLI flags override metadata. Retries are capped at `3`, and duplicate keys in the context header fail fast instead of silently overriding each other.
- By default, suite artifacts are written under `.agent-device/test-artifacts/<run-id>/...`. Each attempt writes `replay.ad` and `result.txt`; failed attempts also keep copied logs and artifact files when the replay produced them.
- Timeouts are cooperative: the runner marks the attempt failed at the timeout boundary, then gives the underlying replay a short grace period to stop before session cleanup.
- The default text reporter prints the suite summary, failed tests, and passed-on-retry flaky tests; use `--verbose` to print every test result.
- When `--fail-fast` and retries are both set, the current test still consumes its retries before the suite stops.

## Parametrise `.ad` scripts

Substitute `${VAR}` tokens in `.ad` scripts using values from the CLI, shell env, script-local `env` directives, or built-ins.

```sh
context platform=android
env APP_ID=settings
env WAIT_SHORT=500

open ${APP_ID} --relaunch
wait ${WAIT_SHORT}
click "label=${APP_ID}"
```

### Precedence

| Source                       | Priority | Example                                                                 |
| ---------------------------- | -------- | ----------------------------------------------------------------------- |
| CLI `-e KEY=VALUE`           | highest  | `agent-device test flow.ad -e APP_ID=demo`                              |
| Shell env prefixed `AD_VAR_` |          | `AD_VAR_APP_ID=demo agent-device test flow.ad` (imported as `APP_ID`)   |
| Script `env KEY=VALUE`       |          | `env APP_ID=settings` in header                                         |
| Built-ins                    | runtime  | `AD_PLATFORM`, `AD_SESSION`, `AD_FILENAME`, `AD_DEVICE`, `AD_ARTIFACTS` |

### Built-ins

Built-ins are provided by replay/test runtime and use the reserved `AD_*` namespace.

- `AD_PLATFORM` - matches `context platform=...` or the selected platform when available
- `AD_SESSION` - active session name
- `AD_FILENAME` - path of the running `.ad` file
- `AD_DEVICE` - device identifier (when `--device` is set)
- `AD_ARTIFACTS` - attempt artifacts directory (when running under `test`)

User-defined keys starting with `AD_` are rejected in `env`, `-e`, and shell imports such as `AD_VAR_AD_FOO`, so built-ins cannot be overridden.

Substitution happens inside parsed string values. It does not create extra arguments, so quote selectors or text values that contain spaces:

```sh
env SETTINGS="label=Account || label=Profile"
click "${SETTINGS}"
```

### Fallback and escape

```sh
wait ${WAIT_MS:-500}
```

`${VAR:-default}` yields `default` when `VAR` is unset.

```sh
echo "Price: \${APP}"
```

`\${APP}` emits a literal `${APP}` with no substitution.

### Recipes

Run one flow against two app variants in CI:

```sh
agent-device test ./flows/login.ad -e APP_ID=com.example.debug
agent-device test ./flows/login.ad -e APP_ID=com.example.release
```

Tune timings locally without editing the script:

```sh
AD_VAR_WAIT_SHORT=2000 agent-device replay ./flow.ad
```

Extract a reusable selector. Before:

```sh
click "label=Account || label=Profile || label=User"
wait 500
click "label=Account || label=Profile || label=User"
```

After:

```sh
env SETTINGS="label=Account || label=Profile || label=User"

click "${SETTINGS}"
wait 500
click "${SETTINGS}"
```

Quote `${VAR}` inside selector expressions so the whole expression is treated as a single argument.

### Notes

- `replay -u` does not yet preserve `env` directives or `${VAR}` tokens. Workaround: temporarily inline the literal values, run `-u`, re-parametrise.
- Shell env (`AD_VAR_*`) is collected on the CLI/client side at request time, so the same values are seen whether the daemon runs locally or remotely.
- No nested fallback. `${A:-${B}}` is not supported.
- Unresolved `${VAR}` fails with a `file:line` reference. Typos are loud.

## Update stale selectors in replay scripts

```bash
agent-device replay -u ~/.agent-device/sessions/e2e-2026-02-09T12-00-00-000Z.ad --session e2e-run
```

When a replay step fails, update can:

- Take a fresh snapshot.
- Resolve a stable replacement target.
- Retry the step.
- Rewrite the failing line in the same `.ad` file.

Current update targets:

- `click`
- `fill`
- `get`
- `is`
- `wait`

## `replay -u` before/after examples

Example 1: stale selector rewritten in place

```sh
# Before
click "id=\"old_continue\" || label=\"Continue\""

# After `replay -u`
click "id=\"auth_continue\" || label=\"Continue\""
```

Example 2: stale ref-based action upgraded to selector form

```sh
# Before
snapshot -i -c -s "Continue"
click @e13 "Continue"

# After `replay -u`
snapshot -i -c -s "Continue"
click "id=\"auth_continue\" || label=\"Continue\""
```

Use `replay -u` locally during maintenance, review the rewritten `.ad` lines, then commit the updated script.

## Troubleshooting

- Replay fails after UI/layout changes:
  - Run `replay -u` locally and review the rewritten lines.
- Updating cannot resolve a unique target:
  - Re-record that flow (`--save-script`) from a fresh exploratory pass.
- Replay file parse error:
  - Validate quoting in `.ad` lines (unclosed quotes are rejected).


---
url: /agent-device/docs/security-trust.md
---

# Security & Trust

`agent-device` runs locally by default and controls the devices, simulators, emulators, and desktop apps available to the current user. Treat it like any other developer tool that can interact with apps, capture screens, and read diagnostic output.

## Local control

- Device automation runs through the installed CLI and platform tooling such as Xcode, ADB, macOS accessibility APIs, and Linux AT-SPI.
- The MCP server is a thin discovery router. It exposes status, install guidance, version-matched help, prompts, and resources; it does not expose device automation or generic shell execution over MCP.
- Mutating commands should run serially against one session. Use separate sessions/devices for parallel work.

## Sensitive artifacts

Screenshots, recordings, traces, logs, network dumps, replay files, and reports can contain private UI state, credentials, tokens, request data, or customer information. Store them in a controlled directory, review before sharing, and avoid committing artifacts unless they are intentionally sanitized fixtures.

## Permissions

Some targets require local permissions or developer setup:

- iOS/tvOS/macOS automation uses Xcode tooling and may require signing or Developer Mode for physical devices.
- Android automation uses ADB and connected emulator/device trust.
- macOS desktop automation requires Accessibility permission, and screen capture workflows may require Screen Recording permission.

## Network and updates

Interactive CLI runs may check npm for newer `agent-device` releases and print an upgrade suggestion. Set `AGENT_DEVICE_NO_UPDATE_NOTIFIER=1` to disable the notice.

Network inspection commands only collect traffic available to the active app/session tooling. Review network artifacts before sharing because headers and payloads can contain secrets.

## Responsible disclosure

Report security issues by contacting Callstack privately at [hello@callstack.com](mailto:hello@callstack.com). Do not open a public issue for a vulnerability that exposes user data, credentials, device access, or remote execution risk.


---
url: /agent-device/docs/selectors.md
---

# Selectors

Use `find` to locate elements by semantic attributes instead of raw refs.

```bash
agent-device find "Settings" click
agent-device find text "Sign In" click
agent-device find label "Email" fill "user@example.com"
agent-device find value "Search" click
agent-device type "query"
agent-device find role button click
agent-device find id "com.example:id/login" click
```

Tips:

- Use `find ... wait <timeoutMs>` to wait for UI to appear.
- Combine with scoped snapshots using `snapshot -s "<label>"` for speed.
- \[Android] If a matched node is not hittable, agent-device will click/focus the nearest hittable ancestor.
- Use `fill` when you want find-plus-targeted text replacement in one step. Use `click` or `press` plus `type` when you need append semantics in the focused field.

## Response shape (click)

`find "<query>" click --json` returns deterministic matched-target metadata derived from the resolved snapshot node:

```json
{
  "ref": "@e3",
  "locator": "any",
  "query": "Sign In",
  "x": 195,
  "y": 422
}
```

- `ref` — snapshot ref of the matched (or nearest hittable ancestor) element.
- `locator` — find strategy used (`any`, `text`, `label`, `value`, `role`, `id`).
- `query` — the search term as provided.
- `x`, `y` — tap coordinates derived from the matched element's rect center.


---
url: /agent-device/docs/sessions.md
---

# Sessions

Sessions keep device state and snapshots consistent across commands.

```bash
agent-device open Settings --platform ios
agent-device session list
agent-device open Contacts          # change app while reusing the default session
agent-device close
```

Open another session independently (for parallel work):

```bash
agent-device open Contacts --platform ios --session my-session
agent-device snapshot -i
agent-device close --session my-session
```

Shut down the simulator/emulator on close (iOS simulators and Android emulators, prevents resource leakage in CI/multi-tenant workloads):

```bash
agent-device close --shutdown
```

Notes:

- `open <app>` within an existing session switches the active app and updates the session bundle id.
- `open <url>` in iOS sessions opens deep links.
- `open <app> <url>` in iOS sessions opens deep links.
- On iOS devices, `http(s)://` URLs open in Safari when no app is active. Custom scheme URLs require an active app in the session.
- On iOS, `appstate` is session-scoped and requires a matching active session on the target device.
- For remote `connect --remote-config` sessions, see [Commands](/agent-device/docs/commands.md#remote-metro-workflow).
- Use `--session <name>` to run multiple sessions in parallel. Do not parallelize mutating commands against the same session; serialize stateful actions such as open, press, fill, type, scroll, back, alert, replay, batch, and close.

For replay scripts and deterministic E2E guidance, see [Replay & E2E (Experimental)](/agent-device/docs/replay-e2e.md).


---
url: /agent-device/docs/snapshots.md
---

# Snapshots

Snapshots provide a structured view of the UI and generate current-screen refs.

```bash
agent-device snapshot                    # Full accessibility tree
agent-device snapshot -i                 # Interactive elements only (recommended)
agent-device snapshot -c                 # Compact (remove empty elements)
agent-device snapshot -d 3               # Limit depth to 3 levels
agent-device snapshot -s "Contacts"      # Scope to label/identifier
agent-device snapshot -i -c -d 5         # Combine options
agent-device diff snapshot               # Preferred structural diff vs previous session baseline
agent-device snapshot --diff             # Alias for the same diff operation
```

| Option       | Description               |
| ------------ | ------------------------- |
| `-i`         | Interactive-only output   |
| `-c`         | Compact structural noise  |
| `-d <depth>` | Limit tree depth          |
| `-s <scope>` | Scope to label/identifier |

Note: If XCTest returns 0 nodes (foreground app changed), agent-device fails explicitly.
It does not automatically switch to AX.

## Efficient snapshot usage

- iOS and Android share the same mobile snapshot contract: visible-first output, actionable-now refs, and hidden list content communicated via discovery hints.
- Default to `snapshot -i` for agent loops.
- Default human-readable snapshot output is visible-first. Off-screen interactive content is collapsed into compact discovery summaries such as `[off-screen below] 3 interactive items: "Privacy", "Battery", "About"`.
- If a target only appears in an off-screen summary, use `scroll <direction>` and re-snapshot until the target becomes visible.
- When container ownership is known, hidden content is shown inline under the visible scroll/list container, for example `[content above scroll-area hidden]` or `[content below list hidden]`.
- Those summaries intentionally show only a few labels for token efficiency. Use `snapshot --raw` when you need the full off-screen tree instead of the compact summary.
- Add `-s "<label>"` (or `-s @ref`) to keep results screen-local.
- Add `-d <depth>` when you only need upper hierarchy layers.
- If `snapshot -i` returns 0 nodes on Android but the screen is visibly populated, trust `screenshot` as visual truth, wait briefly, then take one fresh `snapshot -i`.
- If `snapshot -i -d <n>` says the interactive output is empty at that depth, retry once without `-d` before taking more shallow snapshots.
- Re-snapshot after any UI mutation before reusing refs.
- On Android after navigation or submit, snapshot capture retries suspicious trees for a short post-action deadline and `@ref` interactions refresh while that freshness window is active. If `snapshot -i` still disagrees with the visible screen, trust `screenshot`, wait briefly, then take one fresh snapshot instead of looping stale snapshots.
- For automation runs affected by Android animation churn, use `settings animations off` as an opt-in stabilizer and restore with `settings animations on` after the run.
- Use `diff snapshot` between mutations to validate structural changes with lower output volume.
- Use `snapshot --diff` when you discover the feature from snapshot help, but keep `diff snapshot` as the default exploration command.
- Keep `--raw` for troubleshooting only when you need the full tree instead of visible-first output.

`diff snapshot` and `snapshot --diff` behavior:

- First run initializes baseline (`baselineInitialized: true` in JSON).
- Later runs return unified-style lines (`+` added, `-` removed, unchanged context) and update baseline after each call.

## Example output:

```bash
agent-device snapshot -i
# Output:
# Snapshot: 9 visible nodes (14 total)
# @e1 [application] "Contacts"
#   @e2 [window]
#     @e3 [other]
#   @e4 [other] "Lists"
#     @e5 [navigation-bar] "Lists"
#       @e6 [button] "Lists"
#       @e7 [text] "Contacts"
#     @e8 [other] "John Doe"
#       @e9 [other] "John Doe"
# [off-screen below] 2 interactive items: "All Contacts", "New List"
```

## Backends (iOS):

- `xctest` (default): full fidelity, fast, no Accessibility permission required.
- `ax`: fast accessibility tree, may miss details, requires Accessibility permission; simulator-only.


---
url: /agent-device/index.md
---

# Agent-native app automation for mobile, TV, and desktop.

> agent-device gives coding agents structured UI access, deterministic interactions, and app-level evidence across mobile, TV, and desktop targets. Use it to verify generated code, debug broken flows, profile runtime behavior, and turn exploratory QA into replayable checks.

[Get Started](/docs/agent-setup) | [Commands](/docs/commands)

## Features

- **One CLI, many app surfaces**: Control iOS, Android, tvOS, Android TV, macOS, and Linux desktop targets with consistent snapshot and interaction commands.
- **Accessibility-first snapshots**: Accessibility trees give agents compact UI context without forcing screenshot-only reasoning.
- **Agent-native interactions**: Tap, swipe, scroll, focus, type, assert, and find visible UI through refs, selectors, and semantic finders.
- **Built-in debugging and profiling**: Collect session logs, inspect recent HTTP traffic, capture screenshots and recordings, and sample CPU, memory, startup, and frame-health metrics.
- **Session and replay**: Open apps, keep stateful context, and replay recorded `.ad` actions to reproduce flows without AI at runtime.
- **React Native internals**: Use agent-device react-devtools to inspect React Native component trees, props, state, hooks, and render profiles through pinned agent-react-devtools.