Browse 9 code generation AI artifacts on Unfragile.
Real-world software engineering task evaluation suite
OpenAI's benchmark for evaluating code generation models
Mostly Basic Programming Problems (beginner-friendly code)
Live coding benchmark with recent LeetCode problems
Extension of HumanEval with additional, harder test cases
Alibaba's Qwen 2.5 variant specialized for code generation and understanding
Meta's CodeLlama: a Llama-based model specialized for code
BigCode's StarCoder 2: a multilingual code generation model
DeepSeek's Coder V2: a model specialized for code generation and understanding
© 2026 Unfragile. Stronger through disorder.