use-agentvision

Control the user's real macOS screen via the `agent-vision` CLI — session management, element targeting, and UI interaction on live windows. Triggers: user says "look at my screen", "use agent-vision", "the app/browser/simulator is open", "take a screenshot of my screen", "fill this form" (when app is already open), "check the UI", "watch the browser", "navigate to" (in an open app), "scroll through", "click on" (in a visible window), "I have X open", visual QA of running applications, iOS Simulator or Android emulator interaction, before/after visual comparison of live UI, or any task requiring real screen capture and control. NOT for: headless browser testing, Playwright/Puppeteer scripts, code-only reviews, file-based screenshots, or building screen capture features.
rvanbaalen/skills · ★ 0 · Testing & QA · score 61
Install: claude install-skill rvanbaalen/skills
# Agent Vision

Agent Vision is a macOS CLI that gives you eyes and hands on the user's screen. You can screenshot a selected region and control the mouse, keyboard, and UI elements within that region.

**Use it for**: visual feedback loops during UI development, navigating applications, filling forms, visual QA, testing mobile emulators, and any task that requires seeing and interacting with what's on screen.

> Reference files in this skill's directory:
> - `references/cli-reference.md`: full command syntax, flags, and error table
> - `references/app-tips.md`: per-app behaviors and shortcuts
> - `references/clipboard.md`: sharing files into apps via the macOS clipboard
> - `references/install.md`: install and permission setup

## Before You Start

Check that agent-vision is installed:

```bash
which agent-vision
```

If not found, read `references/install.md` and guide the user through it (`brew install rvanbaalen/tap/agent-vision` + Screen Recording + Accessibility permissions).

## Session Lifecycle

Every agent-vision interaction happens within a **session** that scopes all commands to a user-selected screen region.

### Starting a session

**Preferred: `open`, when you know the app**

```bash
agent-vision open Safari
# add --title "..." to disambiguate multiple windows of the same app
```

Launches (or activates) the app and automatically selects its window. No manual interaction required.

**Manual area selection: `start`**

```bash
agent-vision start
```

Block