Capability
20 artifacts provide this capability.
via “dom element interaction via css/xpath selectors”
Automate browser interactions and take screenshots via Puppeteer MCP.
Unique: Exposes Puppeteer's selector-based element APIs ($ and $$) as MCP tools with built-in visibility validation, allowing LLM clients to reason about DOM structure without learning Puppeteer's JavaScript evaluation syntax. Handles selector resolution errors gracefully with descriptive error messages.
vs others: More accessible than raw JavaScript evaluation for LLM clients; provides semantic feedback about element state (visible, clickable) rather than requiring clients to write defensive JS code.
via “element interaction via accessibility-aware selectors”
Automate browsers and run web tests via Playwright MCP.
Unique: Uses accessibility tree semantics to generate robust element selectors that survive DOM refactoring, unlike brittle CSS/XPath selectors; validates element state before interaction to prevent silent failures
vs others: More robust than pixel-based clicking (screenshot + vision) because it uses semantic element properties that don't change with styling; more reliable than CSS selectors because it references accessibility roles that persist across DOM restructuring
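To illustrate why role-based targeting can survive restructuring, here is a minimal sketch that looks up an element by accessible role and name in a toy tree. The dictionary shape (`role`, `name`, `children`) and the `find_by_role` helper are assumptions made for this example; they are not Playwright MCP's actual data model or API.

```python
def find_by_role(node, role, name):
    """Depth-first search for the first node with the given accessible role and name."""
    if node.get("role") == role and node.get("name") == name:
        return node
    for child in node.get("children", []):
        match = find_by_role(child, role, name)
        if match is not None:
            return match
    return None


# Hypothetical accessibility tree before a DOM refactor.
before = {"role": "main", "name": "", "children": [
    {"role": "form", "name": "Checkout", "children": [
        {"role": "button", "name": "Submit", "children": []},
    ]},
]}

# The same page after wrappers were added and renamed; the button keeps its role and name.
after = {"role": "main", "name": "", "children": [
    {"role": "region", "name": "Order", "children": [
        {"role": "group", "name": "", "children": [
            {"role": "button", "name": "Submit", "children": []},
        ]},
    ]},
]}
```

The same `find_by_role(tree, "button", "Submit")` call resolves in both trees, whereas a path-style selector tied to the old nesting would likely need updating.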
via “adaptive element relocation and dynamic selector resolution”
🕷️ An adaptive Web Scraping framework that handles everything from a single request to a full-scale crawl!
Unique: Implements automatic selector relocation using structural DOM analysis and fallback matching strategies, enabling selectors to survive DOM mutations without manual updates—most competitors require static selectors or manual maintenance when HTML changes
vs others: More resilient than Selenium's static selectors because it adapts to DOM changes automatically, and more maintainable than regex-based extraction because it understands HTML structure semantically
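One way to picture selector relocation is to save a fingerprint of the element when the selector last worked and, when it later fails, pick the most similar element in the new document. The sketch below uses only the Python standard library; the fingerprint format, the `relocate` helper, and the 0.5 threshold are assumptions for illustration and are not the framework's actual algorithm.

```python
from difflib import SequenceMatcher
from html.parser import HTMLParser


class _Collector(HTMLParser):
    """Collects a {tag, attrs, text} record for every element in a document."""

    def __init__(self):
        super().__init__()
        self.elements = []
        self._stack = []

    def handle_starttag(self, tag, attrs):
        record = {"tag": tag, "attrs": dict(attrs), "text": ""}
        self.elements.append(record)
        self._stack.append(record)

    def handle_endtag(self, tag):
        if self._stack:
            self._stack.pop()

    def handle_data(self, data):
        # Attach text to the innermost open element.
        if self._stack:
            self._stack[-1]["text"] += data.strip()


def parse(html):
    collector = _Collector()
    collector.feed(html)
    return collector.elements


def fingerprint(el):
    attrs = " ".join(f"{k}={v}" for k, v in sorted(el["attrs"].items()))
    return f"{el['tag']} {attrs} {el['text']}"


def relocate(saved, html, threshold=0.5):
    """Return the element most similar to the saved fingerprint, or None."""
    best, best_score = None, threshold
    for el in parse(html):
        score = SequenceMatcher(None, fingerprint(saved), fingerprint(el)).ratio()
        if score > best_score:
            best, best_score = el, score
    return best


# Hypothetical page before and after a markup change.
old_html = '<div><span class="price" id="p1">$10</span></div>'
new_html = '<section><b class="product-price" id="p1">$10</b><a href="/x">More</a></section>'
saved = parse(old_html)[1]          # the <span> that the original selector matched
found = relocate(saved, new_html)   # the renamed <b> element is the closest match
```

A production system would likely weigh structural position and ancestors as well; string similarity alone is used here only to keep the sketch short.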
via “dom-element-interaction-and-selection”
Experimental MCP server for browser automation using Puppeteer (inspired by @modelcontextprotocol/server-puppeteer)
Unique: Wraps Puppeteer's element query and interaction methods (page.$, page.click, page.type) as discrete MCP tools, allowing LLM agents to compose multi-step interactions (find element → extract property → click → wait) without managing Puppeteer's page object
vs others: More granular than Selenium (which requires explicit driver management) and more accessible than raw Puppeteer (no JavaScript knowledge required from LLM client, works via tool schemas)
via “unified html parsing with css and xpath selector chaining”
🕷️ An adaptive Web Scraping framework that handles everything from a single request to a full-scale crawl!
Unique: Unified Selector interface inherited by all Response objects enables identical CSS/XPath syntax across static HTTP, browser, and stealth fetchers. Lazy evaluation defers selector execution until terminal operations, reducing memory overhead in large-scale crawls by avoiding intermediate DOM tree materialization.
vs others: BeautifulSoup requires separate parsing for each fetcher type, whereas Scrapling's unified Response/Selector interface works identically across all fetchers. Lazy evaluation is claimed to reduce memory usage by roughly 30-40% on large documents relative to eager parsing such as Scrapy's immediate selector evaluation.
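The general idea of deferring selector execution until a terminal operation can be sketched with generators. The `LazySelection` class below is a hypothetical illustration built on the standard library's `xml.etree.ElementTree`, not Scrapling's API; note that this sketch still parses the document eagerly and only defers the query stages.

```python
import xml.etree.ElementTree as ET


class LazySelection:
    """A chainable selection whose queries run only when results are consumed."""

    def __init__(self, source):
        # `source` is a zero-argument callable returning an iterator of elements.
        self._source = source

    @classmethod
    def from_markup(cls, markup):
        root = ET.fromstring(markup)
        return cls(lambda: iter([root]))

    def find_all(self, path):
        # Builds a new lazy stage; no elements are visited here.
        def stage():
            for element in self._source():
                yield from element.iterfind(path)
        return LazySelection(stage)

    def first_text(self):
        # Terminal operation: stops pulling elements after the first match.
        for element in self._source():
            return (element.text or "").strip()
        return None

    def texts(self):
        # Terminal operation: consumes every match.
        return [(e.text or "").strip() for e in self._source()]


markup = "<ul><li><a>one</a></li><li><a>two</a></li></ul>"
links = LazySelection.from_markup(markup).find_all("li").find_all("a")
```

Calling `links.first_text()` visits only as much of the chain as needed to produce the first result, while `links.texts()` walks all matches.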
via “dom-element-interaction-with-selector-based-targeting”
Chrome DevTools for coding agents
Unique: Uses Chrome DevTools Protocol DOM domain to resolve selectors and validate element interactability before executing actions, with Mutex-protected sequential execution ensuring deterministic state across multiple interactions. Provides detailed error messages (element not found, not clickable, etc.) enabling agents to handle failures gracefully.
vs others: Validates element interactability via CDP before executing an action instead of attempting it blindly, which reduces flaky interactions and yields detailed error feedback. Raw Puppeteer, by contrast, may act on non-interactable elements and fail silently.
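Mutex-protected sequential execution can be sketched with an `asyncio.Lock`: concurrent tool calls queue up and run one at a time, so each action sees the state left by the previous one. The `ActionQueue` class and the fake actions are assumptions for illustration only and do not reflect the server's actual code.

```python
import asyncio


class ActionQueue:
    """Serializes actions so that only one runs at a time."""

    def __init__(self):
        self._lock = asyncio.Lock()
        self.log = []

    async def run(self, name, action):
        async with self._lock:
            self.log.append(f"start {name}")
            result = await action()
            self.log.append(f"end {name}")
            return result


async def demo():
    queue = ActionQueue()

    async def slow_click():
        await asyncio.sleep(0.02)   # stands in for a slow browser round-trip
        return "clicked"

    async def fast_type():
        await asyncio.sleep(0)
        return "typed"

    # Both calls are issued concurrently; the lock forces them to run in order.
    results = await asyncio.gather(
        queue.run("click", slow_click),
        queue.run("type", fast_type),
    )
    return results, queue.log
```

Without the lock, the faster action could finish in the middle of the slower one; with it, the log never interleaves.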
via “dom-query-and-element-inspection”
MCP server for Chrome DevTools
Unique: Exposes CDP's Runtime domain for DOM queries through MCP, allowing agents to inspect elements without context switching to browser console. Returns structured metadata (bounding boxes, computed styles) in a single call, reducing round-trips compared to sequential property queries.
vs others: More efficient than Puppeteer's page.$() because it returns computed styles and layout info in one call rather than requiring separate property accesses, reducing network overhead in agent workflows.
via “dom-interaction-via-playwright-selectors”
Playwright Model Context Protocol Server - Tool to automate Browsers and APIs in Claude Desktop, Cline, Cursor IDE and More 🔌
Unique: Wraps Playwright's locator engine with MCP tool contracts, enabling LLMs to use role-based and text-based selectors (e.g., 'button with text Submit') instead of brittle CSS selectors, with built-in visibility and interactability validation via Playwright's isVisible() and isEnabled() checks before action execution
vs others: More robust than raw Selenium WebDriver for LLM use because Playwright's locator strategies (role, text, label) are more resilient to DOM changes, and the MCP abstraction eliminates the need for agents to manage WebDriver waits or exception handling
via “interactive element interaction (click, type, select, submit)”
Playwright MCP server
Unique: Uses Playwright's locator API with built-in retry and wait logic, automatically handling element staleness, dynamic rendering, and actionability checks without requiring explicit waits in the tool call
vs others: More reliable than raw Playwright API calls because it includes automatic waits and retry logic; more flexible than screenshot-based interaction because it uses semantic element location rather than pixel coordinates
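Automatic waiting of this kind usually amounts to polling an element's state until it is actionable or a deadline passes. The following is a minimal sketch under that assumption; `wait_until_actionable`, the `probe` callable, and the state dictionary are hypothetical names for this example and are not Playwright's API.

```python
import time


class NotActionableError(Exception):
    pass


def wait_until_actionable(probe, timeout=1.0, interval=0.01):
    """Poll `probe` until it reports a visible, enabled element or time runs out.

    `probe` is a hypothetical callable returning a dict such as
    {"visible": True, "enabled": True}, or None when the element is absent.
    """
    deadline = time.monotonic() + timeout
    while True:
        state = probe()
        if state and state.get("visible") and state.get("enabled"):
            return state
        if time.monotonic() >= deadline:
            raise NotActionableError(f"element not actionable, last state: {state}")
        time.sleep(interval)


# Simulated element: absent, then hidden, then ready.
states = iter([
    None,
    {"visible": False, "enabled": True},
    {"visible": True, "enabled": True},
])
ready = wait_until_actionable(lambda: next(states))
```

Because the wait lives inside the tool, the calling agent issues a single click or type request and receives either a result or a descriptive error.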
via “dom element selection and interaction via css/xpath selectors”
An MCP server using Playwright for browser automation and web scraping
Unique: Wraps Playwright's locator API with MCP tool definitions, exposing both CSS and XPath selector support with automatic waiting and error handling. Provides structured feedback on element interaction success/failure.
vs others: More reliable than regex-based selector matching; uses Playwright's native waiting mechanisms to handle dynamic content and timing issues that simpler selector tools struggle with.
via “ui element selection and interaction via accessibility tree parsing”
The most powerful Android RPA agent framework, next generation mobile automation.
Unique: Combines UIAutomator2 accessibility tree parsing with direct ADB input event injection, allowing element selection via semantic properties (text, resource-id) while maintaining pixel-perfect interaction accuracy. Caches hierarchy snapshots to reduce query latency and supports both absolute coordinates and relative positioning within element bounds.
vs others: More reliable than Appium for local Android devices because it uses native UIAutomator2 without HTTP overhead; more flexible than image-based automation (OCR) because it works with dynamic content and doesn't require visual training data.
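Selecting by semantic property and then tapping inside the element's bounds can be sketched against a UI hierarchy dump of the kind `uiautomator dump` produces. The sample XML, package name, and helper names below are made up for illustration and are not this framework's code; the sketch only builds the `adb shell input tap` command rather than running it.

```python
import re
import xml.etree.ElementTree as ET

# Bounds appear in the dump as "[left,top][right,bottom]".
BOUNDS = re.compile(r"\[(\d+),(\d+)\]\[(\d+),(\d+)\]")

SAMPLE = """
<hierarchy rotation="0">
  <node class="android.widget.FrameLayout" bounds="[0,0][1080,1920]">
    <node class="android.widget.Button" text="Login"
          resource-id="com.example:id/login" bounds="[100,800][980,920]" />
  </node>
</hierarchy>
""".strip()


def find_node(hierarchy_xml, **attrs):
    """Return the first <node> matching all attributes (underscores map to hyphens)."""
    root = ET.fromstring(hierarchy_xml)
    for node in root.iter("node"):
        if all(node.get(k.replace("_", "-")) == v for k, v in attrs.items()):
            return node
    return None


def center(node):
    left, top, right, bottom = map(int, BOUNDS.fullmatch(node.get("bounds")).groups())
    return (left + right) // 2, (top + bottom) // 2


def tap_command(node):
    """Build (but do not run) the adb command that taps the node's center."""
    x, y = center(node)
    return ["adb", "shell", "input", "tap", str(x), str(y)]


button = find_node(SAMPLE, text="Login")
command = tap_command(button)
```

The selector stays the same if the button moves, because the coordinates are recomputed from the current hierarchy on every lookup.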
via “dom-element-selection-and-querying”
Model Context Protocol servers for Playwright
Unique: Exposes Playwright's locator API as MCP tools with rich metadata responses (bounding box, visibility, attributes), enabling LLMs to make informed decisions about element interaction without trial-and-error clicking, and supporting both CSS and XPath with automatic selector validation
vs others: Returns structured element metadata (visibility, enabled state, bounding box) in a single query, reducing the number of round-trips needed compared to frameworks that require separate queries for element existence, visibility, and interaction readiness
via “ui element selection and interaction via accessibility hierarchy inspection”
The most powerful Android RPA agent framework, next generation mobile automation.
Unique: Leverages Android's native Accessibility API and UIAutomator2 framework for robust element selection instead of image recognition or coordinate-based clicking, enabling selector-based automation that survives UI layout changes
vs others: More reliable than image-based automation (Appium with OpenCV) because it uses semantic element attributes; more maintainable than coordinate-based scripts because selectors adapt to layout changes
via “dom-element-interaction-with-selector-based-targeting”
Your browser is the API. CLI + MCP server for AI agents to control Chrome with your login state.
Unique: Uses CDP protocol for direct DOM interaction with built-in element visibility waits and multi-element batch operations. Integrates with the authenticated browser context to interact with pages as the logged-in user.
vs others: More reliable than Playwright/Selenium for authenticated pages because it uses the real browser session; built-in waits reduce flakiness vs raw CDP usage
via “dom querying and element interaction with css selectors”
JS reverse engineering MCP server designed for AI agents, with agent-first tool design and built-in anti-detection. Rebuilt from chrome-devtools-mcp.
Unique: Wraps CDP element interaction commands into agent-native tool definitions with automatic element waiting and stale element recovery, vs raw CDP which requires agents to handle timing and retry logic manually
vs others: More agent-friendly than Puppeteer's page.$(selector) because it returns structured metadata and handles common failure modes (stale elements, visibility checks) automatically; simpler than raw CDP for agents unfamiliar with low-level browser protocol
via “dom-aware-element-selection-with-multi-strategy-matching”
🌐Web Agent Protocol (WAP) - Record and replay user interactions in the browser with MCP support
Unique: Implements intelligent fallback chain with selector strategy caching — learns which selector type works for each element and reuses it, reducing retry overhead on subsequent interactions
vs others: More resilient than single-strategy selectors (pure CSS or XPath) because it adapts to DOM changes, but more performant than brute-force fuzzy matching because it caches successful strategies
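A fallback chain with strategy caching can be sketched as follows: strategies are tried in order, and whichever one succeeds for a given element is tried first next time. The `MultiStrategyLocator` class, the dictionary-based fake page, and the strategy names are assumptions for this example, not WAP's actual implementation.

```python
class MultiStrategyLocator:
    """Tries selector strategies in order and remembers which one worked per element."""

    def __init__(self, strategies):
        self.strategies = strategies   # name -> callable(page, target) -> element or None
        self.cache = {}                # element key -> last successful strategy name
        self.attempts = []             # strategies tried so far, kept for inspection

    def locate(self, page, key, targets):
        order = list(self.strategies)
        cached = self.cache.get(key)
        if cached in order:
            # Move the strategy that worked last time to the front.
            order.remove(cached)
            order.insert(0, cached)
        for name in order:
            if name not in targets:
                continue
            self.attempts.append(name)
            element = self.strategies[name](page, targets[name])
            if element is not None:
                self.cache[key] = name
                return element
        self.cache.pop(key, None)
        return None


# Hypothetical page where the CSS id was removed but the visible text remains.
page = {"css": {}, "text": {"Buy now": "<button>"}}
locator = MultiStrategyLocator({
    "css": lambda p, target: p["css"].get(target),
    "text": lambda p, target: p["text"].get(target),
})
targets = {"css": "#buy", "text": "Buy now"}
first = locator.locate(page, "buy-button", targets)    # tries css, then text
second = locator.locate(page, "buy-button", targets)   # tries text first
```

The second lookup skips the failing CSS attempt entirely, which is where the reduced retry overhead would come from.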
via “dom-based element selection and targeting”
Hey HN, Claude Code is pretty agentic now. It writes scripts, calls APIs, uses CLIs. But when something requires actually clicking through a website, it stops and asks me to do it. Problem is, I'm often unfamiliar with these platforms myself. "Go to App Store Connect and generate a P8 key"
Unique: Exposes DOM element metadata as structured data through MCP, allowing Claude to reason about page structure programmatically rather than relying solely on visual screenshots or trial-and-error clicking.
vs others: More reliable than coordinate-based clicking because it targets semantic elements rather than pixel positions, making automation resistant to layout changes or responsive design variations.
via “interactive element extraction and coordinate mapping”
[NAACL2025] LiteWebAgent: The Open-Source Suite for VLM-Based Web-Agent Applications
Unique: Provides dual targeting methods (coordinates + DOM selectors) with automatic fallback, enabling robust element interaction even when page layout changes or coordinate-based targeting fails
vs others: More reliable than coordinate-only targeting (which breaks on layout changes) and more flexible than selector-only approaches (which fail on dynamic elements)
via “dom-element-interaction-and-manipulation”
Fork and update (v0.6.5) of the original @modelcontextprotocol/server-puppeteer MCP server for browser automation using Puppeteer.
Unique: Wraps Puppeteer's ElementHandle operations as stateless MCP tools that re-query the DOM on each call, avoiding stale reference issues common in long-running automation scripts. Includes automatic visibility waiting before interaction.
vs others: More robust than direct Puppeteer ElementHandle usage for agent workflows because it handles element re-querying and visibility waiting transparently, reducing agent-side error handling complexity.
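The difference between holding an element handle and re-querying on each call can be sketched with a toy page whose handles expire on re-render. `FakePage`, `click_tool`, and the generation counter are stand-ins invented for this example; they are not Puppeteer's API or this server's code.

```python
class StaleElementError(Exception):
    pass


class FakePage:
    """A stand-in for a browser page whose elements are replaced on re-render."""

    def __init__(self):
        self._generation = 0
        self._dom = {"#save": "Save"}

    def rerender(self):
        self._generation += 1   # every previously issued handle becomes stale

    def query(self, selector):
        if selector not in self._dom:
            return None
        return {"selector": selector, "generation": self._generation}

    def click(self, handle):
        if handle["generation"] != self._generation:
            raise StaleElementError(handle["selector"])
        return f"clicked {self._dom[handle['selector']]}"


def click_tool(page, selector):
    """Stateless tool: resolve the selector at call time, then act on the fresh handle."""
    handle = page.query(selector)
    if handle is None:
        return {"ok": False, "error": f"no element matches {selector}"}
    return {"ok": True, "result": page.click(handle)}


page = FakePage()
old_handle = page.query("#save")   # a long-lived handle, as a script might keep
page.rerender()                    # the framework replaces the DOM nodes
outcome = click_tool(page, "#save")
```

Clicking `old_handle` after the re-render raises `StaleElementError`, while `click_tool` succeeds because it never keeps a handle between calls.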
via “selective dom element extraction via css/xpath selectors”
A command-line tool acting as an MCP (ModelContextProtocol) server, using Playwright to crawl web content for AI models.
Unique: Leverages Playwright's locator API with built-in retry logic and cross-browser selector compatibility, avoiding regex-based extraction or DOM parsing libraries — selectors are evaluated in the browser context for accuracy
vs others: More reliable than Cheerio selectors because execution happens in the actual browser engine; faster than full-page parsing when only specific fields are needed
© 2026 Unfragile. The platform for software for agents.