GitHub Repository With Evaluation Code And Implementation

1

VBench | Benchmark | 63/100

16-dimension benchmark for video generation quality.

Unique: Provides an open-source implementation of the evaluation pipeline, enabling local execution and community contributions, unlike a proprietary closed-source benchmark. This supports transparency and lets researchers understand and extend the methodology.

vs others: Open-source code enables local evaluation, customization, and community contributions, whereas closed-source benchmarks limit transparency and extensibility. However, code quality, documentation, and maintenance status have not been reviewed.

2

mcp-evals | MCP Server | 48/100

via “evaluation result reporting and github integration”

GitHub Action for evaluating MCP server tool calls using LLM-based scoring

Unique: Native GitHub Actions integration that automatically posts evaluation results as check runs and PR comments without requiring custom GitHub API orchestration, making results immediately visible in developers' existing GitHub workflows.

vs others: Simpler than building custom GitHub integrations because it provides pre-built reporting templates and a GitHub API abstraction, whereas generic evaluation tools require manual GitHub API integration.

3

mcp-evals | MCP Server | 29/100

via “evaluation result reporting and github integration”

GitHub Action for evaluating MCP server tool calls using LLM-based scoring

Unique: Multi-channel reporting that leverages GitHub's native check runs and PR comment APIs to provide contextual feedback at the point of code review, rather than requiring developers to check a separate dashboard.

vs others: More integrated into GitHub's native workflow than external dashboards or email reports, reducing friction for developers to see and act on evaluation results.

4

Cognition AI | Product

via “github-repository-analysis-and-implementation”
