browser-automation
Use this skill when an agent must interact with real websites the way a human user would.
Capabilities
- Navigate and wait for rendered JS content
- Login flows with selector fallback maps (config/selectors/common.json)
- Cookie/session persistence via storageState
- OAuth redirect handling with success URL checks
- Multi-step click/fill/wait workflows
- Screenshot capture to local artifacts directory
- Dynamic text scraping from rendered elements
- MFA/captcha pause-resume hook (pauseForMfa)
- Robust error handling (timeouts, missing elements, blocked flows)
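The selector fallback idea from the capabilities above can be sketched roughly as follows. This is only an illustration, not the skill's actual implementation: the helper name `firstMatchingSelector` and the shape of the candidate list are assumptions, and the structure of config/selectors/common.json is not shown here. The only Playwright call used is `page.locator(selector).count()`.

```javascript
// Hypothetical helper: try each candidate selector in order and return the
// first one that matches at least one element on the page.
// `page` is expected to be a Playwright Page (or any object exposing the
// same `locator(selector).count()` shape).
async function firstMatchingSelector(page, candidates) {
  const attempted = [];
  for (const selector of candidates) {
    attempted.push(selector);
    const count = await page.locator(selector).count();
    if (count > 0) {
      return { selector, attempted };
    }
  }
  // Nothing matched: report every selector that was tried so the caller
  // can include them in its error details.
  const error = new Error(`No selector matched: ${attempted.join(', ')}`);
  error.attempted = attempted;
  throw error;
}
```

A login step might call this with a list such as `['input[name="email"]', '#username']` (example values only) and then fill whichever selector comes back.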
Required scripts
- scripts/launch_browser.js: browser sanity launch + screenshot
- scripts/login_flow.js: reusable login/session capture helper
- scripts/run_task.js: generic step runner (goto/click/fill/wait/screenshot/scrapeText)
Quick start
cd skills/browser-automation
npm install playwright
npx playwright install chromium
1) Validate browser stack
node scripts/launch_browser.js ./artifacts
2) Run login/session capture
Edit config/login.example.json and run:
node scripts/login_flow.js config/login.example.json
3) Run a task flow
Edit config/task.example.json and run:
node scripts/run_task.js config/task.example.json
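For orientation, a generic step runner of the kind described above might look something like the sketch below. The step shape (`type`, `url`, `selector`, `value`, `path`, `key`) is an assumption made for illustration; it is not the documented format of config/task.example.json, and `runSteps` is a hypothetical name rather than the contents of scripts/run_task.js. The `page` argument is assumed to be a Playwright Page.

```javascript
// Hypothetical step runner. Each step is a plain object whose `type` is one
// of the step kinds listed for run_task.js; every other field name here is
// an assumption.
async function runSteps(page, steps) {
  const data = {};
  const screenshots = [];
  for (const [index, step] of steps.entries()) {
    switch (step.type) {
      case 'goto':
        await page.goto(step.url);
        break;
      case 'click':
        await page.click(step.selector);
        break;
      case 'fill':
        await page.fill(step.selector, step.value);
        break;
      case 'wait':
        // Explicit wait for an element to appear in the rendered DOM.
        await page.waitForSelector(step.selector);
        break;
      case 'screenshot':
        await page.screenshot({ path: step.path });
        screenshots.push(step.path);
        break;
      case 'scrapeText':
        // Store the scraped text under a caller-chosen key.
        data[step.key] = await page.textContent(step.selector);
        break;
      default:
        throw new Error(`Unknown step type "${step.type}" at index ${index}`);
    }
  }
  return { data, screenshots, finalUrl: page.url() };
}
```

Keeping the runner a single switch over plain objects means a task can be described entirely in JSON and reviewed without reading code.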
Input contract
- Required:
  - Target URL(s)
  - Task intent (login / form / scrape / screenshot)
- Optional:
  - Selector list and fallback selectors
  - headless mode toggle
  - storageStatePath
  - output directory
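One way to check the required inputs before launching a browser is sketched below. The allowed intents come from the contract above; the property names `urls` and `intent` and the function name `validateInput` are assumptions for illustration only.

```javascript
// Intent values are taken from the input contract; everything else in this
// block (property names, return shape) is hypothetical.
const ALLOWED_INTENTS = ['login', 'form', 'scrape', 'screenshot'];

function validateInput(input) {
  const problems = [];
  const urls = Array.isArray(input.urls) ? input.urls : [];
  if (urls.length === 0) {
    problems.push('at least one target URL is required');
  }
  for (const url of urls) {
    try {
      // The URL constructor throws on anything that is not an absolute URL.
      new URL(url);
    } catch {
      problems.push(`not a valid absolute URL: ${url}`);
    }
  }
  if (!ALLOWED_INTENTS.includes(input.intent)) {
    problems.push(`intent must be one of: ${ALLOWED_INTENTS.join(', ')}`);
  }
  return { valid: problems.length === 0, problems };
}
```

Collecting every problem instead of failing on the first one lets the caller fix a bad request in a single round trip.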
Output contract
Return structured JSON including:
- ok
- data (scraped values)
- screenshots (local file paths)
- finalUrl
- error with failing step details (if failed)
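As a rough sketch of the output contract, the two helpers below build a success result and a failure result. The top-level field names (`ok`, `data`, `screenshots`, `finalUrl`, `error`) follow the contract above; the inner shape of `error` (`stepIndex`, `stepType`, `selector`, `message`) and the helper names are assumptions.

```javascript
// Builds the result for a run that completed every step.
function successResult({ data, screenshots, finalUrl }) {
  return { ok: true, data, screenshots, finalUrl, error: null };
}

// Builds the result for a run that stopped at a failing step.
// The fields inside `error` are hypothetical; the contract only says the
// error should carry failing step details.
function failureResult({ screenshots, finalUrl, stepIndex, stepType, selector, message }) {
  return {
    ok: false,
    data: {},
    screenshots,
    finalUrl,
    error: { stepIndex, stepType, selector, message },
  };
}
```

Both objects contain only plain values, so `JSON.stringify` can serialize them directly for the calling agent.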
Error handling standards
- On timeout: capture failure screenshot and return actionable selector/url details.
- On missing element: include selector attempted and step type.
- On OAuth/cookie expiry: re-run login flow and refresh storage state.
- On captcha/MFA: pause automation and request human intervention.
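The timeout and missing-element standards above could be applied with a wrapper along these lines. It is a sketch under stated assumptions: `runStepSafely`, the step shape, and the `failurePath` argument are hypothetical, and the check on `err.name` assumes Playwright timeouts surface as errors named `TimeoutError`.

```javascript
// Hypothetical wrapper: runs one step and, on failure, captures a screenshot
// and returns the selector, step type, and URL for the caller to report.
// `page` is assumed to be a Playwright Page; `action` performs the step.
async function runStepSafely(page, step, action, failurePath) {
  try {
    await action();
    return { ok: true };
  } catch (err) {
    let screenshot = null;
    try {
      await page.screenshot({ path: failurePath });
      screenshot = failurePath;
    } catch {
      // The page may already be closed; keep reporting the original error.
    }
    return {
      ok: false,
      error: {
        // Assumption: Playwright timeout errors carry name "TimeoutError".
        kind: err.name === 'TimeoutError' ? 'timeout' : 'failure',
        stepType: step.type,
        selector: step.selector ?? null,
        url: page.url(),
        message: err.message,
        screenshot,
      },
    };
  }
}
```

Returning the failure as data rather than rethrowing lets the runner decide whether to retry, re-run the login flow, or hand off to a human.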
Edge cases
- Dynamic DOM updates: prefer role/text selectors and explicit waits.
- Redirect chains: use waitForURL with expected pattern.
- Session loss: re-create storageState from login flow.
- Anti-bot challenge: switch to headed mode + manual assist.
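The redirect-chain and session-loss cases above can be combined in one small helper, sketched here. The name `finishLogin`, the default timeout, and the arguments are assumptions supplied by the caller; only `page.waitForURL`, `page.context().storageState`, and `page.url` are Playwright calls.

```javascript
// Hypothetical helper: wait for a redirect chain to settle on the expected
// URL, then persist the session so later runs can reuse it.
// `page` is assumed to be a Playwright Page.
async function finishLogin(page, successPattern, storageStatePath, timeoutMs = 30000) {
  // waitForURL accepts a glob string, a RegExp, or a predicate function.
  await page.waitForURL(successPattern, { timeout: timeoutMs });
  // Write cookies and local storage to disk as the new storage state.
  await page.context().storageState({ path: storageStatePath });
  return page.url();
}
```

A later run could load the saved file with `browser.newContext({ storageState: storageStatePath })`; if the session has been lost, re-running the login flow and calling the helper again regenerates the file.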