Llm Vulnerability Scanning

1

GiskardBenchmark63/100

via “automated llm vulnerability scanning with multi-detector pattern”

AI testing for quality, safety, compliance — vulnerability scanning, bias/toxicity detection.

Unique: Uses a pluggable detector architecture where each vulnerability class (hallucination, injection, bias, etc.) is a separate detector inheriting from a base scanner, enabling independent scaling and customization. The ScanReport abstraction automatically converts scan findings into executable GiskardTest suites, closing the gap between vulnerability discovery and test automation.

vs others: More comprehensive than point-solution tools like Promptfoo (which focus on output comparison) because it detects structural vulnerabilities like hallucination and prompt injection through LLM-as-judge evaluation rather than regex or keyword matching.

2

SourceryAgent59/100

via “security vulnerability scanning with dependency risk assessment”

AI code review agent for pull requests.

Unique: Combines dependency vulnerability scanning (CVE-based) with LLM-based logic error detection to identify both known vulnerabilities and novel security patterns (e.g., insecure deserialization, weak cryptography usage). Integrates with VCS webhooks for automated scanning without manual trigger.

vs others: More comprehensive than dependency-only scanners (Dependabot, Snyk) because it also detects logic-based vulnerabilities (SQL injection, XSS) through code analysis. Faster than manual security review and more accessible than hiring dedicated security engineers.

3

hexstrike-aiMCP Server58/100

via “advanced vulnerability research with adaptive tool chaining”

HexStrike AI MCP Agents is an advanced MCP server that lets AI agents (Claude, GPT, Copilot, etc.) autonomously run 150+ cybersecurity tools for automated pentesting, vulnerability discovery, bug bounty automation, and security research. Seamlessly bridge LLMs with real-world offensive security capa

Unique: Implements VulnerabilityResearchManager with feedback loops that chain vulnerability discovery, root cause analysis via reverse engineering, and exploitation testing, enabling adaptive research that adjusts analysis depth based on vulnerability complexity rather than static analysis workflows

vs others: Deeper than automated scanning tools; combines multiple analysis techniques (scanning, reverse engineering, exploitation testing) with AI-driven adaptation, enabling comprehensive vulnerability research without manual tool orchestration

4

promptfooCLI Tool57/100

via “automated red-team vulnerability scanning”

LLM prompt testing and evaluation — compare models, detect regressions, assertions, CI/CD.

Unique: Implements a modular attack strategy system where each vulnerability type (jailbreak, injection, prompt leaking, toxicity, bias) is a pluggable provider that generates test cases. Strategies can be composed and parameterized (e.g., 'crescendo jailbreak with 5 iterations'), and results are graded against guardrails (safety checks) to produce a structured vulnerability report.

vs others: Purpose-built red-teaming system integrated into evaluation pipeline (not a separate tool); supports custom attack strategies via plugins; generates reproducible adversarial test cases that can be version-controlled and shared

5

Llama Guard 3Model57/100

via “red-team and blue-team cybersecurity benchmarking framework (cyberseceval)”

Meta's safety classifier for LLM content moderation.

Unique: CyberSecEval v3 is the first industry-wide cybersecurity benchmark suite that combines multiple attack vectors (prompt injection, MITRE ATT&CK, code interpreter abuse, visual injection, spear phishing, autonomous operations) in a single framework with multi-provider LLM abstraction, enabling comparative security evaluation across different model families and versions.

vs others: More comprehensive than single-vector benchmarks (e.g., prompt injection-only tests) and more practical than manual red-teaming because it provides reproducible, scalable evaluation across multiple LLM providers with standardized metrics.

6

Llama GuardModel57/100

via “cybersecurity benchmark evaluation and red-teaming integration”

Meta's LLM safety classifier for content policy enforcement.

Unique: Llama Guard is integrated into CyberSecEval, a comprehensive cybersecurity benchmark framework that includes MITRE-mapped attacks, prompt injection tests, code interpreter abuse scenarios, and autonomous offensive cyber operations — providing structured red-teaming coverage beyond generic safety classification.

vs others: More comprehensive than ad-hoc red-teaming because it provides standardized benchmarks and evaluation protocols, though benchmarks lag behind real-world attack evolution

7

LLM GuardFramework57/100

via “litellm integration for transparent scanner injection into llm calls”

Open-source LLM input/output security scanner toolkit.

Unique: Integrates with LiteLLM proxy layer enabling transparent scanner injection without application code changes; supports configuration-driven per-model/provider scanning policies; works with all LiteLLM-compatible providers (OpenAI, Anthropic, Ollama, Azure, etc.) in unified framework

vs others: More transparent than manual scanner calls because it integrates at LiteLLM middleware layer; more flexible than provider-specific security solutions because it works across all LiteLLM providers; enables security-by-default without requiring developers to remember to call scanners

8

Patronus AIProduct55/100

via “automated-red-teaming-and-adversarial-testing”

Enterprise LLM evaluation for hallucination and safety.

Unique: Automated red-teaming integrated into Patronus's experiment platform, enabling systematic adversarial testing without manual prompt engineering. Results are tracked alongside other evaluations (hallucination, toxicity, PII) for holistic vulnerability assessment.

vs others: Provides automated red-teaming as part of a comprehensive evaluation suite, reducing the need for manual security testing and enabling continuous regression testing across model updates.

9

promptfooCLI Tool53/100

via “automated red-team vulnerability scanning and attack generation”

Test your prompts, agents, and RAGs. Red teaming/pentesting/vulnerability scanning for AI. Compare performance of GPT, Claude, Gemini, Llama, and more. Simple declarative configs with command line and CI/CD integration. Used by OpenAI and Anthropic.

Unique: Uses a plugin-based attack strategy architecture where each attack type (jailbreak, prompt injection, PII extraction) is implemented as a composable plugin with metadata. Attack providers (which can be LLMs themselves) generate adversarial inputs, and results are graded using pluggable graders that can be LLM-based classifiers or custom functions. This enables extending attack coverage without modifying core code.

vs others: More comprehensive than manual red-teaming because it systematically explores multiple attack vectors in parallel, and more actionable than generic vulnerability scanners because it provides concrete failing prompts and categorized results specific to LLM behavior.

10

strixRepository50/100

via “llm-controlled multi-agent penetration testing orchestration”

Open-source AI hackers to find and fix your app’s vulnerabilities.

Unique: Uses LLM agents in isolated Docker containers with specialized system prompts for different attack vectors, enabling dynamic proof-of-concept validation rather than static pattern matching. Implements inter-agent communication and centralized vulnerability deduplication to coordinate findings across parallel testing threads.

vs others: Automates the entire penetration testing workflow from reconnaissance to exploitation with PoC validation, whereas traditional SAST tools produce false positives and manual penetration testing requires expensive security experts.

11

AppMapExtension47/100

via “security-vulnerability-detection-in-code-analysis”

AI-driven chat with a deep understanding of your code. Build effective solutions using an intuitive chat interface and powerful code visualizations.

Unique: Integrates security analysis into the code review workflow using LLM reasoning combined with codebase context, rather than relying solely on pattern matching or static analysis rules. Can incorporate runtime execution traces to detect data flow-based vulnerabilities.

vs others: Provides LLM-powered security analysis integrated into the IDE workflow, unlike external SAST tools or manual security reviews, though less comprehensive than dedicated security scanning platforms.

12

jadx-ai-mcpMCP Server45/100

via “automated vulnerability detection and sast recommendations via llm analysis”

Plugin for JADX to integrate MCP server

Unique: Delegates vulnerability detection to the LLM's semantic reasoning rather than using hardcoded SAST rules. The system provides rich context (code, resources, xrefs) and lets the AI identify vulnerabilities based on understanding of security principles, enabling detection of novel or context-specific issues that rule-based tools miss.

vs others: More flexible than traditional SAST tools (Checkmarx, Fortify) because it adapts to new vulnerability patterns without rule updates; more accurate than simple pattern matching because it understands code semantics and context.

13

agent-scanCLI Tool43/100

via “offline local vulnerability inspection without remote submission”

Security scanner for AI agents, MCP servers and agent skills.

Unique: Implements local-first vulnerability detection using built-in heuristics and pattern signatures, enabling offline scanning without external API dependencies; trades detection accuracy for privacy and network isolation

vs others: Enables security scanning in restricted environments where remote API calls are prohibited, while maintaining the same CLI interface as remote scanning for operational consistency

14

agentsealCLI Tool41/100

via “local-skill-inventory-scanning”

Security toolkit for AI agents. Scan your machine for dangerous skills and MCP configs, monitor for supply chain attacks, test prompt injection resistance, and audit live MCP servers for tool poisoning.

Unique: Performs offline, filesystem-based skill enumeration with threat pattern matching against a curated dangerous-operations database, enabling detection of risky capabilities before they're exposed to untrusted LLM inputs — unlike cloud-based security scanners that require uploading agent configs

vs others: Faster and more privacy-preserving than cloud-based agent security scanners because it runs entirely locally without transmitting skill definitions or configurations to external services

15

mcpsafetywardenMCP Server36/100

via “llm-powered security scanning”

A security layer for MCP wraps any MCP server to add behavioral profiling, LLM-powered security scanning, schema tamper detection, risk gating, cross-tool exfiltration analysis and lot more. Drop it in front of your existing MCP servers to get visibility into what tools are actually doing before the

Unique: Utilizes a fine-tuned LLM specifically for security scanning, providing context-aware insights unlike generic code analysis tools.

vs others: Offers deeper contextual understanding than traditional static analysis tools.

16

MCP Security Scanning Tool for CI/CDMCP Server36/100

via “integration with llm agents for autonomous security workflows”

Show HN: MCP Security Scanning Tool for CI/CD

Unique: Designs all security capabilities as composable MCP tools that LLM agents can chain together for autonomous workflows, vs traditional security tools that require human orchestration

vs others: Enables autonomous security workflows through LLM agent orchestration vs manual security review processes or rigid automation scripts

17

MESH by Viscount Vuln ScannerMCP Server35/100

via “single ip scanning for mesh systems”

A comprehensive MCP server for scanning and analyzing MESH by Viscount systems for default credential vulnerabilities. This tool is designed for security research and educational purposes only. ## 🚨 Important Notice **This tool is for educational and security research purposes only.** Unauthorize

Unique: Utilizes asynchronous scanning techniques to minimize downtime and maximize efficiency when probing individual IPs.

vs others: More efficient than traditional tools that perform synchronous scans, reducing overall time for single IP assessments.

18

@sunchao116/mcp-auditMCP Server34/100

via “local-npm-dependency-vulnerability-scanning”

A Model Context Protocol (MCP) server tool for auditing npm package dependencies, supporting both local and remote repository security audits

Unique: Exposes npm audit as an MCP tool endpoint, allowing LLM agents to invoke vulnerability scanning as a native capability within their reasoning loop rather than requiring shell command execution or separate API calls. Bridges the gap between CLI-based npm audit and agent-driven security workflows.

vs others: Unlike running npm audit directly in CI/CD, this MCP server allows LLMs to interpret and act on audit results in real-time, enabling dynamic decision-making (e.g., 'block deployment if critical vulnerabilities found')

19

kali_mcp-mcp-serverMCP Server33/100

via “automated vulnerability scanning workflows”

Streamline ethical security testing with a curated set of Kali-based reconnaissance, web, crypto, reversing, and forensics workflows. Run reproducible assessments with managed workspaces and shareable results. Use only on systems you own or have explicit permission to test..

Unique: Incorporates a scheduling mechanism that allows for automated, time-based vulnerability scans, unlike manual execution methods.

vs others: More efficient than manual scanning processes, enabling regular assessments without user intervention.

20

MCPWatchCLI Tool32/100

via “multi-scanner vulnerability orchestration with parallel execution”

** - A comprehensive security scanner for Model Context Protocol (MCP) servers that detects vulnerabilities and security issues in your MCP server implementations.

Unique: Implements a modular scanner architecture with 11 research-backed vulnerability detectors coordinated through a single orchestrator class, enabling extensible security scanning specific to MCP protocol implementations rather than generic code analysis

vs others: Purpose-built for MCP security with domain-specific vulnerability patterns from VulnerableMCP database and HiddenLayer research, whereas generic SAST tools lack MCP protocol-specific detection rules

Top Matches

Also Known As

Company