Dom Element Interaction And Manipulation

1

Puppeteer MCP ServerMCP Server82/100

via “dom element interaction via css/xpath selectors”

Automate browser interactions and take screenshots via Puppeteer MCP.

Unique: Exposes Puppeteer's selector-based element APIs ($ and $$) as MCP tools with built-in visibility validation, allowing LLM clients to reason about DOM structure without learning Puppeteer's JavaScript evaluation syntax. Handles selector resolution errors gracefully with descriptive error messages.

vs others: More accessible than raw JavaScript evaluation for LLM clients; provides semantic feedback about element state (visible, clickable) rather than requiring clients to write defensive JS code.

2

Playwright MCP ServerMCP Server81/100

via “element interaction via accessibility-aware selectors”

Automate browsers and run web tests via Playwright MCP.

Unique: Uses accessibility tree semantics to generate robust element selectors that survive DOM refactoring, unlike brittle CSS/XPath selectors; validates element state before interaction to prevent silent failures

vs others: More robust than pixel-based clicking (screenshot + vision) because it uses semantic element properties that don't change with styling; more reliable than CSS selectors because it references accessibility roles that persist across DOM restructuring

3

puppeteer-mcp-serverMCP Server59/100

via “dom-element-interaction-and-selection”

Experimental MCP server for browser automation using Puppeteer (inspired by @modelcontextprotocol/server-puppeteer)

Unique: Wraps Puppeteer's element query and interaction methods (page.$, page.click, page.type) as discrete MCP tools, allowing LLM agents to compose multi-step interactions (find element → extract property → click → wait) without managing Puppeteer's page object

vs others: More granular than Selenium (which requires explicit driver management) and more accessible than raw Puppeteer (no JavaScript knowledge required from LLM client, works via tool schemas)

4

browser-useAgent55/100

via “dom-to-text serialization with interactive element indexing”

🌐 Make websites accessible for AI agents. Automate tasks online with ease.

Unique: Uses a Watchdog pattern with event-driven re-serialization instead of full-page re-parsing on every state change, reducing overhead. Implements visibility calculation via viewport intersection, CSS computed styles, and z-index stacking context analysis. Maintains a stable element index mapping across DOM mutations, enabling consistent LLM references even as the page updates.

vs others: More efficient than Selenium's element finding because it pre-computes all interactive elements and their coordinates in a single pass; more accurate than regex-based HTML parsing because it uses actual CSS computed styles for visibility.

5

chrome-devtools-mcpMCP Server54/100

via “dom-element-interaction-with-selector-based-targeting”

Chrome DevTools for coding agents

Unique: Uses Chrome DevTools Protocol DOM domain to resolve selectors and validate element interactability before executing actions, with Mutex-protected sequential execution ensuring deterministic state across multiple interactions. Provides detailed error messages (element not found, not clickable, etc.) enabling agents to handle failures gracefully.

vs others: Validates element interactability via CDP before action execution (vs blind action attempts), reducing flaky interactions and providing detailed error feedback, whereas raw Puppeteer may execute actions on non-interactable elements causing silent failures.

6

mcp-playwrightMCP Server53/100

via “dom-interaction-via-playwright-selectors”

Playwright Model Context Protocol Server - Tool to automate Browsers and APIs in Claude Desktop, Cline, Cursor IDE and More 🔌

Unique: Wraps Playwright's locator engine with MCP tool contracts, enabling LLMs to use role-based and text-based selectors (e.g., 'button with text Submit') instead of brittle CSS selectors, with built-in visibility and interactability validation via Playwright's isVisible() and isEnabled() checks before action execution

vs others: More robust than raw Selenium WebDriver for LLM use because Playwright's locator strategies (role, text, label) are more resilient to DOM changes, and the MCP abstraction eliminates the need for agents to manage WebDriver waits or exception handling

7

playwright-mcpMCP Server52/100

via “interactive element interaction (click, type, select, submit)”

Playwright MCP server

Unique: Uses Playwright's locator API with built-in retry and wait logic, automatically handling element staleness, dynamic rendering, and actionability checks without requiring explicit waits in the tool call

vs others: More reliable than raw Playwright API calls because it includes automatic waits and retry logic; more flexible than screenshot-based interaction because it uses semantic element location rather than pixel coordinates

8

Playwright MCP ServerMCP Server49/100

via “dom element selection and interaction via css/xpath selectors”

** - An MCP server using Playwright for browser automation and webscrapping

Unique: Wraps Playwright's locator API with MCP tool definitions, exposing both CSS and XPath selector support with automatic waiting and error handling. Provides structured feedback on element interaction success/failure.

vs others: More reliable than regex-based selector matching; uses Playwright's native waiting mechanisms to handle dynamic content and timing issues that simpler selector tools struggle with.

9

@executeautomation/playwright-mcp-serverMCP Server48/100

via “dom-element-selection-and-querying”

Model Context Protocol servers for Playwright

Unique: Exposes Playwright's locator API as MCP tools with rich metadata responses (bounding box, visibility, attributes), enabling LLMs to make informed decisions about element interaction without trial-and-error clicking, and supporting both CSS and XPath with automatic selector validation

vs others: Returns structured element metadata (visibility, enabled state, bounding box) in a single query, reducing the number of round-trips needed compared to frameworks that require separate queries for element existence, visibility, and interaction readiness

10

bb-browserMCP Server46/100

via “dom-element-interaction-with-selector-based-targeting”

Your browser is the API. CLI + MCP server for AI agents to control Chrome with your login state.

Unique: Uses CDP protocol for direct DOM interaction with built-in element visibility waits and multi-element batch operations. Integrates with the authenticated browser context to interact with pages as the logged-in user.

vs others: More reliable than Playwright/Selenium for authenticated pages because it uses the real browser session; built-in waits reduce flakiness vs raw CDP usage

11

js-reverse-mcpMCP Server46/100

via “dom querying and element interaction with css selectors”

为 AI Agent 设计的 JS 逆向 MCP Server，内置反检测，基于 chrome-devtools-mcp 重构 | JS reverse engineering MCP server with agent-first tool design and built-in anti-detection. Rebuilt from chrome-devtools-mcp.

Unique: Wraps CDP element interaction commands into agent-native tool definitions with automatic element waiting and stale element recovery, vs raw CDP which requires agents to handle timing and retry logic manually

vs others: More agent-friendly than Puppeteer's page.$(selector) because it returns structured metadata and handles common failure modes (stale elements, visibility checks) automatically; simpler than raw CDP for agents unfamiliar with low-level browser protocol

12

Comet MCP – Give Claude Code a browser that can clickMCP Server39/100

via “dom-based element selection and targeting”

Hey HN,Claude Code is pretty agentic now. It writes scripts, calls APIs, uses CLIs. But when something requires actually clicking through a website, it stops and asks me to do it.Problem is, I'm often unfamiliar with these platforms myself. "Go to App Store Connect and generate a P8 key&qu

Unique: Exposes DOM element metadata as structured data through MCP, allowing Claude to reason about page structure programmatically rather than relying solely on visual screenshots or trial-and-error clicking.

vs others: More reliable than coordinate-based clicking because it targets semantic elements rather than pixel positions, making automation resistant to layout changes or responsive design variations.

13

@hisma/server-puppeteerMCP Server37/100

via “dom-element-interaction-and-manipulation”

Fork and update (v0.6.5) of the original @modelcontextprotocol/server-puppeteer MCP server for browser automation using Puppeteer.

Unique: Wraps Puppeteer's ElementHandle operations as stateless MCP tools that re-query the DOM on each call, avoiding stale reference issues common in long-running automation scripts. Includes automatic visibility waiting before interaction.

vs others: More robust than direct Puppeteer ElementHandle usage for agent workflows because it handles element re-querying and visibility waiting transparently, reducing agent-side error handling complexity.

14

Safari MCPMCP Server37/100

via “interactive element manipulation (click, type, scroll)”

Native Safari browser automation for AI agents — 80 tools via AppleScript, zero Chrome overhead, keeps logins, runs silently. macOS only.

Unique: Uses AppleScript event simulation for native input handling rather than synthetic DOM events, providing more realistic user interaction that triggers native browser handlers. Includes pre-interaction visibility validation to prevent silent failures.

vs others: More reliable than synthetic DOM events because it uses native OS-level input; better error detection than Puppeteer because it validates element visibility before interaction; less flexible than low-level WebDriver but more user-friendly for typical form automation.

15

puppeteer-mcp-serverMCP Server34/100

via “dom-element-interaction-and-querying”

Experimental MCP server for browser automation using Puppeteer (inspired by @modelcontextprotocol/server-puppeteer)

Unique: Exposes Puppeteer's element querying and evaluation as MCP tools, allowing agents to chain selector queries with property extraction and interactions in a single tool call. Uses page.evaluate() to run JavaScript in page context for reliable property access.

vs others: More flexible than REST API scraping because it can interact with dynamic elements; more reliable than regex-based HTML parsing because it queries the live DOM after JavaScript execution.

16

BrowserbaseMCP Server34/100

via “dom-aware element targeting and interaction”

** - Automate browser interactions in the cloud (e.g. web navigation, data extraction, form filling, and more)

Unique: Wraps Playwright's element targeting and interaction APIs through MCP, exposing multiple selector strategies and automatic wait-for-interactability logic as a unified tool interface. Includes built-in retry logic for stale element references and automatic scroll-into-view, reducing the need for agents to implement custom error handling for common web automation edge cases.

vs others: More robust than raw Playwright for agent workflows because the MCP abstraction handles common failure modes (stale elements, visibility waits) automatically, and more flexible than simple REST scraping APIs because it supports interactive workflows beyond read-only data extraction.

17

onestep-puppeteer-mcp-serverMCP Server33/100

via “dom-element-interaction-and-selection”

Experimental MCP server for browser automation using Puppeteer (inspired by @modelcontextprotocol/server-puppeteer)

Unique: Wraps Puppeteer element APIs (page.$, page.$$, element.click, element.type) as discrete MCP tools, allowing agents to compose multi-step interactions. Includes element property introspection (text, attributes, visibility) for conditional branching.

vs others: More granular than Selenium/Playwright wrappers that often batch operations; allows agents to inspect element state between actions for adaptive behavior

18

playwright-mcpMCP Server33/100

via “element-interaction-and-form-filling”

MCP server: playwright-mcp

Unique: Wraps Playwright's actionability checks (visibility, enabled state, in-viewport) as implicit validation before each interaction, preventing agents from attempting to interact with hidden or disabled elements. Provides detailed error messages when interactions fail due to element state.

vs others: More robust than raw Selenium WebDriver bindings because Playwright's auto-waiting and actionability checks reduce flakiness. Simpler than building custom element detection logic because it delegates to Playwright's proven element location and validation.

19

browser-useMCP Server33/100

via “dom-to-llm serialization with interactive element indexing”

Make websites accessible for AI agents

Unique: Uses event-driven watchdog pattern with CDP event listeners to detect DOM mutations and incrementally re-serialize only changed subtrees, rather than full-page re-parsing on each step. Combines bounding box visibility calculation with viewport intersection to filter non-visible elements before serialization, reducing token overhead by 30-50% vs naive full-DOM approaches.

vs others: More efficient than Selenium/Playwright's raw HTML dumps because it pre-processes visibility and coordinates server-side, eliminating the need for LLMs to parse raw HTML or calculate element positions themselves.

20

skyvernMCP Server33/100

via “selector-based-element-interaction”

MCP server: skyvern

Unique: Provides robust selector-based element interaction through MCP tools with built-in wait conditions and error handling. Implements fallback strategies for stale elements and dynamic content.

vs others: More reliable than screenshot-based element detection for structured pages, but less adaptive than AI-powered visual element detection

Top Matches

Also Known As

Company