cross-browser automation with unified API
Provides a single high-level Python API that abstracts over Chromium, Firefox, and WebKit browser engines, translating method calls into the Chrome DevTools Protocol (CDP) or equivalent wire protocols for each browser. Uses an async/await pattern with context managers for resource lifecycle management, enabling developers to write browser automation code once and run it against multiple engines without engine-specific branching logic.
Unique: Unified API across three major browser engines (Chromium, Firefox, WebKit) using native protocol bindings rather than WebDriver, enabling faster execution and access to DevTools-level capabilities like network interception and performance metrics
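vs alternatives: Typically faster than Selenium/WebDriver because it drives each engine over its native debugging protocol (CDP for Chromium, equivalent protocols for Firefox and WebKit) instead of the WebDriver protocol, and supports more engines natively than Puppeteer (which is primarily Chromium-focused)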
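A minimal sketch of the write-once, run-on-three-engines idea, assuming the library described here is Playwright for Python (the text does not name it). `ENGINES`, `browser_type_for`, `title_on_engine`, and `titles_on_all_engines` are hypothetical names chosen for illustration; the Playwright import is deferred so the module loads even where Playwright is not installed.

```python
ENGINES = ("chromium", "firefox", "webkit")


def browser_type_for(playwright, engine_name: str):
    """Look up the launcher for one engine, e.g. playwright.chromium."""
    if engine_name not in ENGINES:
        raise ValueError(f"unknown engine: {engine_name!r}")
    return getattr(playwright, engine_name)


async def title_on_engine(playwright, engine_name: str, url: str) -> str:
    """Open `url` in a single engine and return the page title."""
    browser = await browser_type_for(playwright, engine_name).launch()
    try:
        page = await browser.new_page()
        await page.goto(url)
        return await page.title()
    finally:
        # Always release the browser process, even if navigation fails.
        await browser.close()


async def titles_on_all_engines(url: str) -> dict:
    """Run the same steps against every engine with no per-engine branching."""
    # Assumption: Playwright for Python is the library in question.
    from playwright.async_api import async_playwright

    async with async_playwright() as playwright:
        return {
            name: await title_on_engine(playwright, name, url)
            for name in ENGINES
        }
```

Calling `asyncio.run(titles_on_all_engines("https://example.com"))` would return one title per engine, provided the three browsers have been installed.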
network request/response interception and mocking
Intercepts HTTP/HTTPS requests at the browser protocol level before they reach the network, allowing modification of request headers, bodies, and URLs, or replacement with mock responses without touching the application code. Uses route handlers registered on page or context objects that match requests by URL pattern or custom predicates, enabling test isolation and deterministic response injection.
Unique: Operates at the browser protocol level (the Chrome DevTools Protocol on Chromium), intercepting requests before they leave the browser context, enabling full request/response manipulation including headers and body content without proxy setup or network-level tools
vs alternatives: More flexible than mock server libraries because it intercepts at the browser protocol level rather than requiring HTTP proxy configuration, and supports both request modification and response mocking in a single API
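The route-handler mechanism described above can be sketched as follows, assuming Playwright for Python is the library in question. The URL patterns, the `x-test-run` header, the `MOCK_USERS` payload, and the function names are all hypothetical; the two patterns are deliberately non-overlapping so each request matches exactly one handler.

```python
import json

# Hypothetical canned payload returned instead of the real backend response.
MOCK_USERS = [{"id": 1, "name": "Ada"}]


def mock_json_body(payload) -> str:
    """Serialize a payload for use as a mocked response body."""
    return json.dumps(payload)


async def install_routes(page) -> None:
    """Register one mocking handler and one request-modifying handler."""

    async def fulfill_users(route):
        # Answer from the test itself; the request never reaches the network.
        await route.fulfill(
            status=200,
            content_type="application/json",
            body=mock_json_body(MOCK_USERS),
        )

    async def tag_request(route):
        # Forward the request, adding one extra header to the originals.
        headers = {**route.request.headers, "x-test-run": "1"}
        await route.continue_(headers=headers)

    await page.route("**/api/users", fulfill_users)
    await page.route("**/api/orders/**", tag_request)
```

Registering the same handlers on a browser context instead of a page would apply them to every page opened in that context.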
geolocation and permissions mocking
Mocks browser permissions (camera, microphone, geolocation, notifications) and geolocation coordinates at the context level, allowing tests to simulate location-based features and permission prompts without user interaction. Uses the browser's automation protocol (the Chrome DevTools Protocol on Chromium) to inject mock permission states and geolocation data, enabling testing of location-aware applications and permission-gated features.
Unique: Mocks browser permissions and geolocation at the context level through the browser's automation protocol, enabling testing of location-aware and permission-gated features without physical devices or user interaction
vs alternatives: More integrated than manual permission handling because permissions are set at context creation time, and more flexible than WebDriver permissions because it supports multiple permission types and geolocation coordinates
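As a rough illustration of setting location and permissions at context creation time, assuming Playwright for Python: `geo_context_options` and `new_located_context` are hypothetical helpers, and the range checks are an addition for the example rather than something the library requires of callers.

```python
def geo_context_options(latitude: float, longitude: float) -> dict:
    """Build keyword arguments for a context with a fixed, pre-granted location."""
    if not -90 <= latitude <= 90:
        raise ValueError("latitude must be between -90 and 90")
    if not -180 <= longitude <= 180:
        raise ValueError("longitude must be between -180 and 180")
    return {
        "geolocation": {"latitude": latitude, "longitude": longitude},
        # Granting the permission up front means no prompt is shown.
        "permissions": ["geolocation"],
    }


async def new_located_context(browser, latitude: float, longitude: float):
    """Create an isolated context that reports the given coordinates."""
    return await browser.new_context(**geo_context_options(latitude, longitude))
```

Pages opened from the returned context would see the mocked coordinates when they call `navigator.geolocation`.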
accessibility testing with ARIA and role inspection
Provides utilities to inspect the accessibility tree (ARIA roles, labels, descriptions) and validate semantic HTML structure, enabling automated accessibility testing without external tools. Exposes element roles, accessible names, and descriptions through the accessibility tree, allowing assertions that support checks for keyboard navigation, screen reader compatibility, and WCAG compliance.
Unique: Exposes the browser's accessibility tree (ARIA roles, labels, descriptions) natively through the page API, enabling accessibility assertions without external tools or axe-core integration
vs alternatives: More integrated than external accessibility tools because it uses the browser's native accessibility tree, and more flexible than manual ARIA inspection because it supports programmatic assertions
DOM element selection and interaction with wait strategies
Provides CSS selector, XPath, and text-based element locators that automatically wait for elements to become actionable (visible, enabled, stable) before performing actions like click, fill, or type. Uses internal polling with exponential backoff and timeout configuration to handle dynamic DOM updates, reducing flakiness from race conditions between script execution and DOM rendering.
Unique: Built-in wait-for-actionable logic with automatic polling and timeout handling, combined with multiple selector strategies (CSS, XPath, text, ARIA) in a single locator API, eliminating the need for explicit sleep() or WebDriverWait patterns
vs alternatives: More reliable than Selenium because waits are implicit and built into every action, and supports text/ARIA-based selection natively without custom XPath construction
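A short sketch of the locator styles and built-in waiting described above, assuming Playwright for Python. The form labels, the "Welcome" text, and the helper names are hypothetical; no explicit sleeps appear because each action waits for its target to become actionable, up to the given timeout.

```python
def xpath_selector(expression: str) -> str:
    """Mark a selector string explicitly as XPath for the locator API."""
    return "xpath=" + expression


async def submit_login(page, username: str, password: str,
                       timeout_ms: float = 5000) -> None:
    """Fill and submit a hypothetical login form using four locator styles."""
    # Label-based locator: finds the input associated with this label text.
    await page.get_by_label("Username").fill(username, timeout=timeout_ms)
    # CSS locator.
    await page.locator("input[type=password]").fill(password, timeout=timeout_ms)
    # XPath locator: waits until the button is visible, enabled, and stable.
    submit = page.locator(xpath_selector("//form//button[@type='submit']"))
    await submit.click(timeout=timeout_ms)
    # Text locator used as an explicit wait for the post-login state.
    await page.get_by_text("Welcome").wait_for(state="visible", timeout=timeout_ms)
```

If any element fails to become actionable within `timeout_ms`, the corresponding call raises a timeout error instead of acting on a half-rendered page.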
screenshot and PDF capture with layout options
Captures visual snapshots of pages or specific elements as PNG/JPEG images or full-page PDFs, with options for full-page scrolling capture, clipped regions, and custom viewport sizing. Renders the page through the browser's rendering engine at specified dimensions, enabling pixel-perfect visual regression testing and documentation generation without external screenshot tools.
Unique: Captures screenshots and PDFs directly through the browser rendering engine without external tools, supporting full-page scrolling capture and element-level clipping with native viewport and scale control
vs alternatives: More integrated than external screenshot tools because it operates within the browser context and respects CSS media queries and responsive design, and supports PDF generation natively without headless Chrome subprocess calls
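The capture options above might be used as in this sketch, assuming Playwright for Python. The `header` selector, the output file names, and both helper names are hypothetical. One caveat to state as an assumption about that library: its `page.pdf` call is supported on Chromium only, so the PDF step would fail on Firefox or WebKit.

```python
from pathlib import Path


def clip_region(x: float, y: float, width: float, height: float) -> dict:
    """Build a clip rectangle in CSS pixels for a region screenshot."""
    if width <= 0 or height <= 0:
        raise ValueError("width and height must be positive")
    return {"x": x, "y": y, "width": width, "height": height}


async def capture_all(page, out_dir: str) -> None:
    """Save a full-page PNG, a clipped JPEG, an element PNG, and a PDF."""
    out = Path(out_dir)
    # Full scrollable page rather than just the visible viewport.
    await page.screenshot(path=out / "full.png", full_page=True)
    # Fixed region of the viewport, saved as JPEG with reduced quality.
    await page.screenshot(
        path=out / "region.jpg",
        type="jpeg",
        quality=80,
        clip=clip_region(0, 0, 800, 600),
    )
    # Single element, cropped to its bounding box.
    await page.locator("header").screenshot(path=out / "header.png")
    # Assumption: running on Chromium, where PDF output is available.
    await page.pdf(path=out / "page.pdf", format="A4", print_background=True)
```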
browser context and cookie/storage management
Creates isolated browser contexts (equivalent to private browsing sessions) with independent cookies, local storage, session storage, and IndexedDB, allowing parallel test execution without cross-contamination. Contexts can be pre-populated with authentication state, cookies, or storage data, and state can be persisted to disk and reloaded, enabling test setup optimization and session replay.
Unique: Provides first-class context isolation with automatic storage management (cookies, localStorage, sessionStorage, IndexedDB) and state persistence/reload, enabling efficient parallel test execution and session replay without manual state cleanup
vs alternatives: More efficient than creating separate browser instances because contexts share a single browser process, and more flexible than WebDriver sessions because storage state can be serialized and reused across test runs
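Persisting and reloading state, as described above, could look like the following, assuming Playwright for Python. The function names are hypothetical and the login steps are left as a placeholder. In that library the saved state covers cookies and local storage; session storage is not part of the file.

```python
async def save_login_state(browser, login_url: str, state_path: str) -> None:
    """Log in once in a throwaway context and write its state to disk."""
    context = await browser.new_context()
    try:
        page = await context.new_page()
        await page.goto(login_url)
        # Placeholder: fill in credentials and submit the form here.
        await context.storage_state(path=state_path)
    finally:
        await context.close()


async def new_authenticated_context(browser, state_path: str):
    """Start a fresh, isolated context pre-populated from the saved state."""
    return await browser.new_context(storage_state=state_path)


def cookie_names(state: dict) -> list:
    """List cookie names from a storage-state dictionary, sorted for stable output."""
    return sorted(cookie["name"] for cookie in state.get("cookies", []))
```

Each test can then call `new_authenticated_context` to get its own context, so parallel tests share the login work but not each other's cookies.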
performance metrics and network monitoring
Captures browser performance metrics (page load time, DOM content loaded, first contentful paint) and network activity (requests, responses, timing) through the browser's debugging protocol (the Chrome DevTools Protocol on Chromium), exposing raw HAR (HTTP Archive) files and parsed metrics for performance analysis. Enables real-time network monitoring without external proxy tools or performance monitoring libraries.
Unique: Exposes raw DevTools-level metrics and HAR recording natively, enabling detailed performance analysis and network debugging without external APM tools or proxy configuration
vs alternatives: More detailed than WebDriver performance APIs because it captures full HAR files and DevTools metrics, and more integrated than external monitoring tools because it operates within the browser context
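A sketch of HAR recording plus live response monitoring, assuming Playwright for Python. `record_har`, `slow_requests`, and the threshold are hypothetical; `slow_requests` relies only on the standard HAR layout (`log.entries[].time` in milliseconds and `request.url`).

```python
async def record_har(browser, url: str, har_path: str) -> list:
    """Visit `url` while recording a HAR file; return (status, url) pairs seen."""
    context = await browser.new_context(record_har_path=har_path)
    seen = []
    try:
        page = await context.new_page()
        # Real-time monitoring: called once for every response received.
        page.on("response", lambda response: seen.append((response.status, response.url)))
        await page.goto(url)
    finally:
        # The HAR file is written when the context closes.
        await context.close()
    return seen


def slow_requests(har: dict, threshold_ms: float) -> list:
    """Return URLs of HAR entries whose total time exceeds the threshold."""
    return [
        entry["request"]["url"]
        for entry in har["log"]["entries"]
        if entry["time"] > threshold_ms
    ]
```

After a run, loading the file with `json.load` and passing it to `slow_requests` gives a quick list of requests worth a closer look.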