{"passport":{"unfragile":{"@version":"1.0","version":"2026-05","artifact":{"id":"github_mcp-web-agent-master-google-search","slug":"mcp-web-agent-master-google-search","name":"google-search","type":"mcp","url":"https://github.com/web-agent-master/google-search","page_url":"https://unfragile.ai/mcp-web-agent-master-google-search","categories":["mcp-servers"],"tags":["ai","google-search","llm","mcp-server","web-scraping"],"pricing":{"model":"open_source","free":true,"starting_price":null},"status":"active","verified":false},"capabilities":[{"id":"github_mcp-web-agent-master-google-search__cap_0","uri":"capability://search.retrieval.playwright.based.google.search.execution.with.anti.bot.evasion","name":"playwright-based google search execution with anti-bot evasion","description":"Executes real Google searches using Playwright browser automation while implementing multiple anti-detection strategies (user-agent rotation, viewport randomization, request throttling, browser state persistence) to bypass Google's anti-scraping mechanisms. The core googleSearch() function in src/search.ts orchestrates browser navigation, DOM waiting, and result extraction without relying on external SERP APIs, enabling unlimited searches without rate limits or API quotas.","intents":["Execute Google searches programmatically without API keys or paid SERP services","Retrieve real-time search results directly from Google's live index","Integrate live search capability into AI agents and LLM applications","Avoid SERP API costs and rate limiting for high-volume search workloads"],"best_for":["AI agents and LLM applications requiring real-time search without external API dependencies","Developers building local-first search tools with no cloud infrastructure","Teams migrating from paid SERP APIs (SerpAPI, DataForSEO) to self-hosted alternatives"],"limitations":["Subject to Google's dynamic anti-bot detection; may encounter CAPTCHAs or IP blocks on high-frequency searches","Browser state persistence (./browser-state.json) mitigates but doesn't eliminate CAPTCHA challenges","Single-threaded Playwright execution limits concurrent search parallelism","No built-in proxy rotation; requires external proxy infrastructure for large-scale scraping","Slower than cached SERP APIs (Playwright startup + navigation overhead ~3-5 seconds per search)"],"requires":["Node.js 18+","Playwright browser binaries (auto-installed via npm)","Writable filesystem for browser state persistence","Network access to google.com (not blocked by corporate firewall/ISP)"],"input_types":["search query (string)","optional: limit (number, default 10)","optional: timeout (milliseconds, default 60000)","optional: language/locale parameter"],"output_types":["SearchResponse JSON with title, link, snippet fields","raw HTML of search results page (--get-html flag)","screenshot/saved HTML (--save-html flag)"],"categories":["search-retrieval","web-scraping"],"confidence":0.5,"matches":0,"success_rate":0},{"id":"github_mcp-web-agent-master-google-search__cap_1","uri":"capability://tool.use.integration.mcp.server.integration.for.ai.assistant.search.access","name":"mcp server integration for ai assistant search access","description":"Wraps the core googleSearch() function as a Model Context Protocol (MCP) server using the MCP SDK, enabling AI assistants like Claude to invoke Google searches via standardized tool-calling interface. The mcp-server.ts component manages McpServer instance, StdioServerTransport for stdio communication, and a global persistent Playwright browser to serve multiple search requests from a single AI session without browser restart overhead.","intents":["Enable Claude and other MCP-compatible AI assistants to perform real-time web searches","Integrate live search as a native tool in AI agent workflows without custom API wrappers","Provide AI models with fresh information beyond training data cutoff","Build AI agents that can research topics, verify facts, and retrieve current information"],"best_for":["AI agent developers using Claude or other MCP-compatible LLMs","Teams building AI research assistants requiring real-time web search","Developers extending Claude's capabilities with local search tools"],"limitations":["MCP server runs as separate process; requires stdio communication overhead (~10-50ms per request)","Global browser instance is shared across all concurrent MCP requests; no request isolation","No built-in request queuing; concurrent searches may block each other if Playwright browser is busy","Requires manual MCP server configuration in Claude/AI assistant config files","No authentication/authorization layer; any process with stdio access can invoke searches"],"requires":["Node.js 18+","MCP SDK (installed via npm)","Claude or MCP-compatible AI assistant with MCP server support","Proper MCP server configuration in assistant's config file (e.g., claude_desktop_config.json)"],"input_types":["MCP tool call with search query parameter","optional: limit parameter","optional: timeout parameter"],"output_types":["MCP tool result with SearchResponse JSON","structured text representation of search results for LLM consumption"],"categories":["tool-use-integration","search-retrieval"],"confidence":0.5,"matches":0,"success_rate":0},{"id":"github_mcp-web-agent-master-google-search__cap_2","uri":"capability://search.retrieval.command.line.interface.with.configurable.search.parameters","name":"command-line interface with configurable search parameters","description":"Exposes search functionality via CLI using the commander package (src/index.ts) with options for result limit, timeout, headless mode toggle, browser state file path, and HTML extraction modes. Parses command-line arguments and invokes the core googleSearch() function with validated parameters, supporting both structured JSON output and raw HTML retrieval for downstream processing.","intents":["Execute one-off Google searches from terminal without writing code","Integrate Google search into shell scripts and CI/CD pipelines","Debug search behavior with headless mode disabled (visible browser)","Extract raw HTML for custom parsing or analysis workflows"],"best_for":["DevOps engineers integrating search into automation scripts","Researchers performing batch searches from command line","Developers debugging search extraction logic with visible browser"],"limitations":["CLI blocks until search completes; no async/streaming output","No built-in result pagination; --limit capped at practical browser rendering limits (~100 results)","Browser state file (./browser-state.json) is global; concurrent CLI invocations may conflict","No result filtering or post-processing; raw output requires piping to jq or custom scripts"],"requires":["Node.js 18+","google-search package installed globally or via npx","Bash/shell environment (Windows requires WSL or Git Bash for bin/google-search script)"],"input_types":["search query (positional argument)","optional: --limit <number> (default 10)","optional: --timeout <milliseconds> (default 60000)","optional: --no-headless flag","optional: --state-file <path> (default ./browser-state.json)","optional: --get-html flag","optional: --save-html flag"],"output_types":["JSON (stdout) with SearchResponse structure","raw HTML string (with --get-html)","HTML file saved to disk (with --save-html)"],"categories":["search-retrieval","automation-workflow"],"confidence":0.5,"matches":0,"success_rate":0},{"id":"github_mcp-web-agent-master-google-search__cap_3","uri":"capability://safety.moderation.browser.state.persistence.for.captcha.mitigation","name":"browser state persistence for captcha mitigation","description":"Saves and restores Playwright browser state (cookies, localStorage, sessionStorage) to a JSON file (default ./browser-state.json) between search invocations. This stateful approach preserves Google's session context and reduces CAPTCHA triggers by maintaining browser identity across multiple searches, unlike stateless HTTP clients that appear as fresh visitors to Google on each request.","intents":["Reduce CAPTCHA challenges when performing multiple searches in sequence","Maintain browser session identity across CLI invocations and MCP requests","Enable long-running search workflows without manual CAPTCHA solving","Preserve Google's trust signals (cookies, session tokens) across tool invocations"],"best_for":["Batch search workflows requiring 10+ searches without interruption","Long-running AI agents performing repeated searches","Automated research tools requiring sustained search access"],"limitations":["State file is global and not thread-safe; concurrent invocations may corrupt state","Stale state file (>24 hours old) may be rejected by Google; requires periodic refresh","CAPTCHA mitigation is probabilistic; high-frequency searches (>50/hour) still trigger blocks","State file contains sensitive cookies; requires filesystem permissions protection","No automatic state rotation; single state file becomes 'burned' after IP block"],"requires":["Writable filesystem with persistent storage","Playwright browser instance to load/save state","Initial successful search to bootstrap state file"],"input_types":["state-file path (--state-file CLI option, default ./browser-state.json)"],"output_types":["JSON file containing browser state (cookies, storage, session data)"],"categories":["safety-moderation","automation-workflow"],"confidence":0.5,"matches":0,"success_rate":0},{"id":"github_mcp-web-agent-master-google-search__cap_4","uri":"capability://data.processing.analysis.structured.result.extraction.with.title.link.snippet.fields","name":"structured result extraction with title, link, snippet fields","description":"Parses Google search result DOM using Playwright's page.locator() and evaluate() methods to extract structured data (title, link, snippet) from each result element. Returns SearchResponse JSON array with typed fields, enabling downstream processing without regex parsing or HTML string manipulation. Extraction logic handles Google's dynamic DOM structure and adapts to layout variations.","intents":["Extract search results as structured JSON for programmatic processing","Feed search results into LLM context without manual HTML parsing","Build search result pipelines with type-safe data structures","Avoid brittle regex-based HTML parsing by using DOM selectors"],"best_for":["AI agents consuming search results as structured context","Data pipelines requiring typed search result objects","Developers building search result aggregators or analyzers"],"limitations":["Extraction depends on Google's DOM structure; layout changes may break selectors","No built-in fallback selectors; single DOM change can cause extraction failure","Snippet text is truncated by Google (typically 150-160 characters); full content unavailable","Sponsored/featured results may have different DOM structure; extraction may skip or misparse them","No ranking/position metadata; results returned in Google's order without explicit ranking"],"requires":["Playwright browser instance with loaded Google search results page","JavaScript execution enabled in browser context"],"input_types":["Playwright page object with rendered Google search results"],"output_types":["SearchResponse[] JSON array with {title, link, snippet} objects"],"categories":["data-processing-analysis","search-retrieval"],"confidence":0.5,"matches":0,"success_rate":0},{"id":"github_mcp-web-agent-master-google-search__cap_5","uri":"capability://data.processing.analysis.raw.html.retrieval.and.screenshot.capture.for.custom.analysis","name":"raw html retrieval and screenshot capture for custom analysis","description":"Provides --get-html flag to return raw HTML string of search results page and --save-html flag to capture and save full page screenshot/HTML to disk. Enables custom parsing, archival, or visual debugging workflows where structured extraction is insufficient. Playwright's page.content() and page.screenshot() methods handle full-page capture including dynamic content.","intents":["Retrieve raw HTML for custom parsing or analysis beyond standard extraction","Archive search results as HTML snapshots for historical comparison","Debug search result layout and DOM structure visually","Extract data not available in structured SearchResponse (ads, featured snippets, knowledge panels)"],"best_for":["Researchers analyzing search result layout and presentation","Developers building custom search result parsers","Compliance/audit workflows requiring search result archival"],"limitations":["--get-html returns raw HTML string; requires external parsing (cheerio, jsdom) for processing","--save-html creates large files (2-10MB per page); disk space overhead for bulk searches","Screenshots capture viewport size only (typically 1920x1080); full-page height may be truncated","HTML contains inline styles and scripts; not suitable for direct republishing without sanitization","No built-in deduplication; repeated searches generate duplicate HTML files"],"requires":["Playwright browser instance","Writable filesystem (for --save-html)","Sufficient disk space for HTML/screenshot storage"],"input_types":["--get-html flag (returns HTML string)","--save-html flag (saves to disk)"],"output_types":["raw HTML string (--get-html)","HTML file on disk (--save-html)","PNG screenshot (--save-html with screenshot option)"],"categories":["data-processing-analysis","search-retrieval"],"confidence":0.5,"matches":0,"success_rate":0},{"id":"github_mcp-web-agent-master-google-search__cap_6","uri":"capability://automation.workflow.configurable.timeout.and.headless.mode.control","name":"configurable timeout and headless mode control","description":"Exposes --timeout <milliseconds> (default 60000) and --no-headless CLI options to control Playwright browser behavior. Timeout parameter sets page navigation and element waiting limits; --no-headless disables headless mode to show visible browser window for debugging. Enables developers to tune performance vs. reliability and visually inspect search execution.","intents":["Adjust search timeout for slow networks or high-latency environments","Debug search failures by observing browser behavior visually","Optimize performance by reducing timeout for fast networks","Diagnose anti-bot detection triggers by watching browser interaction"],"best_for":["Developers debugging search extraction logic","DevOps engineers tuning timeouts for production deployments","Network engineers testing search performance in constrained environments"],"limitations":["Timeout applies globally to all page operations; no per-operation granularity","--no-headless requires display server (X11/Wayland on Linux, not available in Docker/CI)","Visible browser window slows execution (~2-3x slower than headless); unsuitable for production","No timeout recovery; exceeded timeout terminates search without retry","Timeout value must be manually tuned per network; no automatic adaptation"],"requires":["Node.js 18+","Display server for --no-headless (X11/Wayland on Linux, native on macOS/Windows)"],"input_types":["--timeout <milliseconds> (default 60000)","--no-headless flag (boolean)"],"output_types":["search results (on success) or timeout error (on failure)"],"categories":["automation-workflow"],"confidence":0.5,"matches":0,"success_rate":0},{"id":"github_mcp-web-agent-master-google-search__cap_7","uri":"capability://automation.workflow.pino.based.structured.logging.for.debugging.and.monitoring","name":"pino-based structured logging for debugging and monitoring","description":"Implements logging via Pino logger (src/logger.ts) with structured JSON output, enabling developers to track search execution flow, anti-bot detection events, and errors. Logs include timestamps, log levels, and contextual data suitable for parsing by log aggregation systems (ELK, Datadog, CloudWatch). Supports configurable log levels for production vs. development environments.","intents":["Debug search failures and anti-bot detection triggers","Monitor search performance and latency in production","Aggregate logs from multiple search invocations for analysis","Diagnose CAPTCHA and IP block events"],"best_for":["DevOps engineers monitoring production search deployments","Developers debugging search failures","Teams using centralized log aggregation (ELK, Datadog)"],"limitations":["Pino outputs JSON by default; requires log parsing for human readability","No built-in log rotation; requires external log management (logrotate, systemd)","Structured logging adds ~5-10ms overhead per log entry","No built-in alerting; requires external monitoring system to trigger on error logs","Log verbosity not configurable via CLI; requires environment variable or code change"],"requires":["Pino logger (installed via npm)","Log aggregation system for production (optional but recommended)"],"input_types":["search execution events (navigation, DOM parsing, extraction)"],"output_types":["JSON log entries (stdout/stderr) with timestamp, level, message, context"],"categories":["automation-workflow"],"confidence":0.5,"matches":0,"success_rate":0},{"id":"github_mcp-web-agent-master-google-search__cap_8","uri":"capability://code.generation.editing.typescript.type.system.with.searchresponse.and.htmlresponse.interfaces","name":"typescript type system with searchresponse and htmlresponse interfaces","description":"Defines typed interfaces (src/types.ts) for SearchResponse (array of {title, link, snippet} objects) and HtmlResponse (raw HTML string) using TypeScript. Enables type-safe consumption of search results in TypeScript applications and provides IDE autocomplete for result fields. Type definitions document expected output structure and catch type errors at compile time.","intents":["Enable type-safe result processing in TypeScript applications","Provide IDE autocomplete for search result fields","Document expected output structure for API consumers","Catch type errors at compile time vs. runtime"],"best_for":["TypeScript developers building search integrations","Teams using strict TypeScript configurations (noImplicitAny, strictNullChecks)","Projects requiring type safety for search result processing"],"limitations":["Type definitions only available in TypeScript; JavaScript consumers get no type checking","No runtime validation; types are erased at compile time","Type definitions must be manually updated if Google's result structure changes","No schema validation; malformed results pass type checking if structure matches"],"requires":["TypeScript 4.0+ (for type definitions)","TypeScript compiler or ts-node for type checking"],"input_types":["TypeScript type definitions (SearchResponse, HtmlResponse)"],"output_types":["typed SearchResponse[] or HtmlResponse objects"],"categories":["code-generation-editing"],"confidence":0.5,"matches":0,"success_rate":0},{"id":"github_mcp-web-agent-master-google-search__cap_9","uri":"capability://safety.moderation.multi.layered.anti.detection.strategy.with.user.agent.and.viewport.randomization","name":"multi-layered anti-detection strategy with user-agent and viewport randomization","description":"Implements anti-bot evasion through user-agent rotation (randomizing User-Agent header), viewport randomization (varying browser window size), and request throttling (adding delays between navigation and interactions). These strategies operate at the Playwright browser level, making searches appear as legitimate user traffic rather than automated bots. Combines multiple evasion techniques to increase success rate against Google's detection heuristics.","intents":["Bypass Google's anti-bot detection to execute searches successfully","Reduce CAPTCHA and IP block frequency","Appear as legitimate user traffic to Google's detection systems","Enable sustained search workflows without manual intervention"],"best_for":["High-volume search workflows requiring sustained access","AI agents performing repeated searches without manual CAPTCHA solving","Developers building search tools for production use"],"limitations":["Anti-detection is probabilistic; no guarantee against detection on any single search","User-agent rotation alone insufficient; Google detects via behavioral patterns (timing, request sequences)","Viewport randomization has minimal impact; Google detects via request metadata, not browser dimensions","Request throttling adds latency (~500ms-2s per search); slows batch workflows","No proxy rotation; same IP address across searches increases detection risk","Detection evasion is arms race; Google updates detection heuristics regularly"],"requires":["Playwright browser instance","Network access to google.com","Patience for throttled requests (anti-detection adds latency)"],"input_types":["search query and parameters"],"output_types":["search results (on successful evasion) or CAPTCHA/block error (on detection)"],"categories":["safety-moderation","automation-workflow"],"confidence":0.5,"matches":0,"success_rate":0}],"trust":{"score":41,"verified":false,"data_access_risk":"high","permissions":["Node.js 18+","Playwright browser binaries (auto-installed via npm)","Writable filesystem for browser state persistence","Network access to google.com (not blocked by corporate firewall/ISP)","MCP SDK (installed via npm)","Claude or MCP-compatible AI assistant with MCP server support","Proper MCP server configuration in assistant's config file (e.g., claude_desktop_config.json)","google-search package installed globally or via npx","Bash/shell environment (Windows requires WSL or Git Bash for bin/google-search script)","Writable filesystem with persistent storage"],"failure_modes":["Subject to Google's dynamic anti-bot detection; may encounter CAPTCHAs or IP blocks on high-frequency searches","Browser state persistence (./browser-state.json) mitigates but doesn't eliminate CAPTCHA challenges","Single-threaded Playwright execution limits concurrent search parallelism","No built-in proxy rotation; requires external proxy infrastructure for large-scale scraping","Slower than cached SERP APIs (Playwright startup + navigation overhead ~3-5 seconds per search)","MCP server runs as separate process; requires stdio communication overhead (~10-50ms per request)","Global browser instance is shared across all concurrent MCP requests; no request isolation","No built-in request queuing; concurrent searches may block each other if Playwright browser is busy","Requires manual MCP server configuration in Claude/AI assistant config files","No authentication/authorization layer; any process with stdio access can invoke searches","builder identity is not verified yet","no observed match outcomes yet"],"rank_breakdown":{"adoption":0.384115718658587,"quality":0.45,"ecosystem":0.55,"match_graph":0.25,"freshness":0.52,"weights":{"adoption":0.25,"quality":0.25,"ecosystem":0.15,"match_graph":0.23,"freshness":0.12}},"observed_outcomes":{"matches":0,"success_rate":0,"avg_confidence":0,"top_intents":[],"last_matched_at":null},"maintenance":{"status":"active","updated_at":"2026-05-24T12:16:22.065Z","last_scraped_at":"2026-05-03T14:23:44.761Z","last_commit":"2025-04-06T06:05:15Z"},"community":{"stars":590,"forks":95,"weekly_downloads":null,"model_downloads":null,"model_likes":null}},"distribution":{"claim_url":"https://unfragile.ai/submit?claim=mcp-web-agent-master-google-search","compare_url":"https://unfragile.ai/compare?artifact=mcp-web-agent-master-google-search"}},"signature":"fuCu8hZOIcM8JnHdQMPw6NNPJQ6kZgtODYOsNKQAs9Nqo8JgFJfKVZGj30/lQAztVk2jdXiXXX4fdvJJdAFuCA==","signedAt":"2026-06-20T04:04:45.488Z","signedBy":"unfragile.ai","version":1},"_links":{"self":"https://unfragile.ai/api/v1/passport/mcp-web-agent-master-google-search","artifact":"https://unfragile.ai/mcp-web-agent-master-google-search","verify":"https://unfragile.ai/api/v1/verify?slug=mcp-web-agent-master-google-search","publicKey":"https://unfragile.ai/api/v1/trust-passport-public-key","spec":"https://unfragile.ai/trust","schema":"https://unfragile.ai/schema.json","docs":"https://unfragile.ai/docs"}}