AI Models
The model layer — from frontier foundation models (GPT-4, Claude, Gemini, LLaMA) to fine-tuned specialists, quantized variants, and domain-specific models for code, vision, audio, and more.
Open-source Deep Research alternative for reasoning and searching over private data. Written in Python.
LLM-powered stock analysis system for A-share/H-share/US markets: multi-source market data, real-time news, an LLM decision dashboard, and multi-channel notifications, with zero-cost scheduled runs on free services.
Text- and image-to-video generation: CogVideoX (2024) and CogVideo (ICLR 2023).
FULL Augment Code, Claude Code, Cluely, CodeBuddy, Comet, Cursor, Devin AI, Junie, Kiro, Leap.new, Lovable, Manus, NotionAI, Orchids.app, Perplexity, Poke, Qoder, Replit, Same.dev, Trae, Traycer AI, VSCode Agent, Warp.dev, Windsurf, Xcode, Z.ai Code, Dia & v0. (And other Open Sourced) System Prompts
A high-throughput and memory-efficient inference and serving engine for LLMs
The AI Toolkit for TypeScript. From the creators of Next.js, the AI SDK is a free open-source library for building AI-powered applications and agents
Open-source AI hackers to find and fix your app’s vulnerabilities.
Convert documents to structured data effortlessly. Unstructured is an open-source ETL solution for transforming complex documents into clean, structured formats for language models. Visit our website to learn more about our enterprise-grade Platform product for production-grade workflows.
Web UI for training and running open models like Gemma 4, Qwen3.5, DeepSeek, gpt-oss locally.
LLM-powered framework for deep document understanding, semantic retrieval, and context-aware answers using RAG paradigm.
The open source platform for AI-native application development.
AutoGPT is the vision of accessible AI for everyone, to use and to build on. Our mission is to provide the tools so that you can focus on what matters.
LlamaIndex is the leading document agent and OCR platform
CLI proxy that reduces LLM token consumption by 60-90% on common dev commands. Single Rust binary, zero dependencies
🚀💪 Maximize your efficiency and productivity. The ultimate hub to manage, customize, and share prompts (English/中文/Español/العربية). AI shortcuts that double your productivity: manage prompts more efficiently and discover inspiration for different scenarios in the sharing community.
Implement a ChatGPT-like LLM in PyTorch from scratch, step by step
Opinionated RAG for integrating GenAI in your apps 🧠 Focus on your product rather than the RAG. Easy integration into existing products, with customisation! Any LLM: GPT4, Groq, Llama. Any vector store: PGVector, Faiss. Any files. Any way you want.
Test your prompts, agents, and RAGs. Red teaming/pentesting/vulnerability scanning for AI. Compare performance of GPT, Claude, Gemini, Llama, and more. Simple declarative configs with command line and CI/CD integration. Used by OpenAI and Anthropic.
Postgres with GPUs for ML/AI apps.
Ready-to-run cloud templates for RAG, AI pipelines, and enterprise search with live data. 🐳Docker-friendly.⚡Always in sync with Sharepoint, Google Drive, S3, Kafka, PostgreSQL, real-time data APIs, and more.
🙌 OpenHands: AI-Driven Development
Open Source AI Platform - AI Chat with advanced features that works with every LLM
Get up and running with Kimi-K2.5, GLM-5, MiniMax, DeepSeek, gpt-oss, Qwen, Gemma and other models.
✨ AI Coding, Vim Style
A scalable generative AI framework built for researchers and developers working on Large Language Models, Multimodal, and Speech AI (Automatic Speech Recognition and Text-to-Speech)
This repository showcases various advanced techniques for Retrieval-Augmented Generation (RAG) systems. Each technique has a detailed notebook tutorial.
22 prompt engineering techniques with hands-on Jupyter Notebook tutorials, from fundamental concepts to advanced strategies for leveraging LLMs.
Run frontier LLMs and VLMs with day-0 model support across GPU, NPU, and CPU, with comprehensive runtime coverage for PC (Python/C++), mobile (Android & iOS), and Linux/IoT (Arm64 & x86 Docker). Supporting OpenAI GPT-OSS, IBM Granite-4, Qwen-3-VL, Gemma-3n, Ministral-3, and more.
An app that integrates mainstream large language models and image-generation models, built with Flutter, with fully open-source code.
A CLI tool to convert your codebase into a single LLM prompt with source tree, prompt templating, and token counting.
Course to get into Large Language Models (LLMs) with roadmaps and Colab notebooks.
Milvus is a high-performance, cloud-native vector database built for scalable vector ANN search
UFO³: Weaving the Digital Agent Galaxy
Build high-quality LLM apps - from prototyping, testing to production deployment and monitoring.
A modular graph-based Retrieval-Augmented Generation (RAG) system
Welcome to the Llama Cookbook! This is your go-to guide for building with Llama: getting started with inference, fine-tuning, and RAG. We also show you how to solve end-to-end problems using the Llama model family across various provider services.
AI memory OS for LLM and Agent systems (moltbot, clawdbot, openclaw), enabling persistent skill memory for cross-task skill reuse and evolution.
AirLLM: 70B-model inference on a single 4GB GPU.
Unified framework for building enterprise RAG pipelines with small, specialized models
An AI prompt optimizer for writing better prompts and getting better AI results.
🪢 Open source LLM engineering platform: LLM Observability, metrics, evals, prompt management, playground, datasets. Integrates with OpenTelemetry, Langchain, OpenAI SDK, LiteLLM, and more. 🍊YC W23
LangChain4j is an idiomatic, open-source Java library for building LLM-powered applications on the JVM. It offers a unified API over popular LLM providers and vector stores, and makes implementing tool calling (including MCP support), agents, and RAG easy. It integrates seamlessly with enterprise Java.
Your AI second brain. Self-hostable. Get answers from the web or your docs. Build custom agents, schedule automations, do deep research. Turn any online or local LLM into your personal, autonomous AI (gpt, claude, gemini, llama, qwen, mistral). Get started - free.
🪨 why use many token when few token do trick — Claude Code skill that cuts 65% of tokens by talking like caveman
🤗 Transformers: the model-definition framework for state-of-the-art machine learning models in text, vision, audio, and multimodal models, for both inference and training.
Unified Efficient Fine-Tuning of 100+ LLMs & VLMs (ACL 2024)
Sample code and notebooks for Generative AI on Google Cloud, with Gemini on Vertex AI
Open-source framework for building AI-powered apps in JavaScript, Go, and Python, built and used in production by Google
🌟 The Multi-Agent Framework: First AI Software Company, Towards Natural Language Programming
Retrieval and Retrieval-augmented LLMs
f.k.a. Awesome ChatGPT Prompts. Share, discover, and collect prompts from the community. Free and open source — self-host for your organization with complete privacy.
Local, open-source AI app builder for power users ✨ v0 / Lovable / Replit / Bolt alternative 🌟 Star if you like it!
Open-source AI orchestration framework for building context-engineered, production-ready LLM applications. Design modular pipelines and agent workflows with explicit control over retrieval, routing, memory, and generation. Built for scalable agents, RAG, multimodal applications, semantic search, and more.
📚 Build a large language model from scratch
Debug, evaluate, and monitor your LLM applications, RAG systems, and agentic workflows with comprehensive tracing, automated evaluations, and production-ready dashboards.
Langchain-Chatchat (formerly langchain-ChatGLM): a local-knowledge-base RAG and Agent application built on LangChain and language models such as ChatGLM, Qwen, and Llama.
Build Conversational AI in minutes ⚡️
The open-source hub to build & deploy GPT/LLM Agents ⚡️
Python SDK, Proxy Server (AI Gateway) to call 100+ LLM APIs in OpenAI (or native) format, with cost tracking, guardrails, loadbalancing and logging. [Bedrock, Azure, OpenAI, VertexAI, Cohere, Anthropic, Sagemaker, HuggingFace, VLLM, NVIDIA NIM]
Extracted system prompts from ChatGPT (GPT-5.4, GPT-5.3, Codex), Claude (Opus 4.6, Sonnet 4.6, Claude Code), Gemini (3.1 Pro, 3 Flash, CLI), Grok (4.2, 4), Perplexity, and more. Updated regularly.
FinGPT: open-source financial large language models. 🔥 Trained models are released on Hugging Face.
Deeplake is an AI data runtime for agents. It provides serverless Postgres with a multimodal data lake, enabling scalable retrieval and training.
A RAG (Retrieval-Augmented Generation) framework by TrueFoundry for building modular, open-source applications for production.
AutoRAG: An Open-Source Framework for Retrieval-Augmented Generation (RAG) Evaluation & Optimization with AutoML-Style Automation
A curated list of modern Generative Artificial Intelligence projects and services
Generative AI reference workflows optimized for accelerated infrastructure and microservice architecture.
A tremendous feat of documentation, this guide covers Claude Code from beginner to power user, with production-ready templates for Claude Code features, guides on agentic workflows, and a lot of great learning material, including quizzes and a handy cheatsheet.
Everything you need to know to build your own RAG application
🧑‍🚀 A summary of the world's best LLM resources (multimodal generation, agents, coding assistance, AI paper review, data processing, model training, model inference, o1 models, MCP, small language models, vision-language models).
Fetch source code for npm packages to give AI coding agents deeper context
A foundational, 65-billion-parameter large language model by Meta.
DALL·E 2 by OpenAI is an AI system that can create realistic images and art from a description in natural language.
Real-time object detection, segmentation, and pose estimation.
01.AI's high-performance reasoning model.
01.AI's bilingual 34B model with 200K context option.
OpenAI's best speech recognition model for 100+ languages.
OpenAI's open-source speech recognition — 99 languages, translation, timestamps, runs locally.
1.1B model pre-trained on 3T tokens for edge use.
Open code model trained on 600+ languages.
Widely adopted open image model with massive ecosystem.
Stability AI's 8B parameter flagship image generation model.
Open-source image generation — SD3, SDXL, massive ecosystem of LoRAs, ControlNets, runs locally.
Snowflake's 480B MoE model for enterprise data tasks.
Hugging Face's small model family for on-device use.
Google's safety content classifiers built on Gemma.
Meta's foundation model for visual segmentation.
Google's vision-language-action model for robotics.
Alibaba's 32B reasoning model with chain-of-thought.
Alibaba's code-specialized model matching GPT-4o on coding.
Alibaba's 72B open model trained on 18T tokens.
Meta's prompt injection and jailbreak detection classifier.
Mistral's 124B multimodal model with vision capabilities.
Microsoft's compact model for edge deployment.
Microsoft's 14B model rivaling 70B through data quality.
Microsoft's 3.8B model with 128K context for edge deployment.
Google's vision-language model for fine-grained tasks.
This model always redirects to the latest model in the Claude Opus family.
Palmyra X5 is Writer's most advanced model, purpose-built for building and scaling AI agents across the enterprise. It delivers industry-leading speed and efficiency on context windows up to 1 million tokens.
What are AI Models?
AI models are the foundation layer — the neural networks that generate text, code, images, audio, and video. The landscape ranges from frontier foundation models (GPT-4o, Claude 3.5 Sonnet, Gemini 1.5 Pro, LLaMA 3) to specialized models for code, vision, embeddings, and speech. Key decisions: closed-source APIs vs. open-weight models, cloud vs. local inference, and general-purpose vs. domain-specific.
How to Choose
Start with the task, not the model. For text generation and reasoning, benchmark on YOUR use case (not general benchmarks). For embeddings, test retrieval quality on your domain. For code, test on your language and codebase complexity. Key trade-offs: capability vs. cost vs. latency vs. privacy. Open models give you control and privacy; API models give you convenience and frontier performance.
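One way to make "benchmark on YOUR use case" concrete is a tiny harness that scores each candidate model against a handful of domain test cases. A minimal sketch, where `ask` stands in for whatever client you use; `fake_ask` and the toy cases below are placeholders for illustration, not a real model:

```python
from typing import Callable, Iterable

def accuracy(ask: Callable[[str, str], str], model: str,
             cases: Iterable[tuple[str, str]]) -> float:
    """Fraction of test cases where the model's answer contains the expected string."""
    cases = list(cases)
    hits = sum(expected.lower() in ask(model, prompt).lower()
               for prompt, expected in cases)
    return hits / len(cases)

# Stub "model" for illustration: echoes a canned answer.
def fake_ask(model: str, prompt: str) -> str:
    return "Paris" if "France" in prompt else "unsure"

cases = [("Capital of France?", "Paris"), ("Capital of Atlantis?", "Nowhere")]
print(accuracy(fake_ask, "my-model", cases))  # 0.5 on this toy set
```

Swap `fake_ask` for real API or local clients and run the same cases through every candidate; the numbers you get on your own data matter more than any leaderboard.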
Key Capabilities to Evaluate
Common Patterns
API access: call a hosted model via its API. The simplest path and the highest capability, but data leaves your infrastructure.
Local inference: run models on your own hardware via Ollama, llama.cpp, or vLLM. Full privacy, but requires GPU resources.
Fine-tuning: adapt a base model to your domain with custom training data. Higher quality for specific tasks, but requires data and compute.
Model routing: route requests to different models based on task complexity. Use a smaller model for simple tasks, a larger one for complex reasoning.
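The hosted-API pattern looks like this in practice. A minimal sketch using only the standard library, assuming an OpenAI-style chat-completions endpoint; the request is built but not sent, and the key and model name are placeholders:

```python
import json
import urllib.request

def chat_request(api_key: str, model: str, prompt: str) -> urllib.request.Request:
    """Build an OpenAI-style chat-completion request (construction only, not sent)."""
    payload = {"model": model,
               "messages": [{"role": "user", "content": prompt}]}
    return urllib.request.Request(
        "https://api.openai.com/v1/chat/completions",
        data=json.dumps(payload).encode(),
        headers={"Authorization": f"Bearer {api_key}",
                 "Content-Type": "application/json"},
        method="POST",
    )

req = chat_request("sk-...", "gpt-4o", "Summarize RAG in one sentence.")
# resp = urllib.request.urlopen(req)  # uncomment with a real key to actually call the API
```

In production you would typically use the provider's SDK instead, but the request shape is the same.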
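For the local-inference pattern, a sketch against Ollama's REST API, assuming `ollama serve` is running on its default port with a pulled model (the model name `llama3` is an example):

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/generate"  # Ollama's default endpoint

def ollama_payload(prompt: str, model: str = "llama3") -> bytes:
    """Non-streaming generate-request body for Ollama's REST API."""
    return json.dumps({"model": model, "prompt": prompt, "stream": False}).encode()

def ollama_generate(prompt: str, model: str = "llama3") -> str:
    """Send the prompt to a local Ollama server and return the completion text."""
    req = urllib.request.Request(OLLAMA_URL, data=ollama_payload(prompt, model),
                                 headers={"Content-Type": "application/json"})
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["response"]

# print(ollama_generate("Why is the sky blue?"))  # requires `ollama serve` + a pulled model
```

Nothing leaves your machine: the same prompt/response loop as a hosted API, pointed at localhost.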
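The data-preparation half of fine-tuning can be sketched as rendering (question, answer) pairs into chat-format JSONL, the shape much supervised fine-tuning tooling expects; the exact schema varies by provider, so treat this layout and the default system prompt as assumptions to check against your own stack:

```python
import json

def to_sft_jsonl(pairs, system="You are a domain expert assistant."):
    """Render (question, answer) pairs as chat-format JSONL lines for SFT."""
    lines = []
    for question, answer in pairs:
        lines.append(json.dumps({"messages": [
            {"role": "system", "content": system},
            {"role": "user", "content": question},
            {"role": "assistant", "content": answer},
        ]}))
    return "\n".join(lines)

print(to_sft_jsonl([("What is RAG?", "Retrieval-Augmented Generation.")]))
```

The resulting file is what you hand to the training job; the compute-heavy half (LoRA or full fine-tuning) then runs on top of this data.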
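A model router can be as simple as a heuristic over the prompt. A sketch, with `small-model`/`large-model` as placeholder names and a length-plus-keyword heuristic chosen purely for illustration (real routers often use a classifier or a cheap LLM call instead):

```python
def pick_model(prompt: str, small: str = "small-model", large: str = "large-model") -> str:
    """Crude complexity heuristic: long prompts or reasoning cues go to the larger model."""
    reasoning_cues = ("why", "explain", "prove", "compare", "step by step")
    is_complex = (len(prompt.split()) > 200
                  or any(cue in prompt.lower() for cue in reasoning_cues))
    return large if is_complex else small

assert pick_model("Translate 'hello' to French") == "small-model"
assert pick_model("Explain why quicksort is O(n log n) on average") == "large-model"
```

Routing most traffic to the small model and escalating only hard requests is often the easiest cost lever in an LLM app.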
What to Watch Out For
Top Capabilities
Analyzes selected code or entire files and generates natural language explanations of what the code does, how it works, and why certain patterns were chosen. The feature can produce documentation in multiple formats (docstrings, comments, markdown) and supports various documentation styles (JSDoc, Sphinx, etc.). Developers can request explanations at different levels of detail (high-level overview, line-by-line breakdown, architectural context) through the chat interface, with responses appearing as formatted text or code comments.
Translates non-English speech directly to English text using the same Transformer encoder-decoder architecture by prepending a 'translate' task token during decoding, bypassing explicit transcription. The AudioEncoder processes mel spectrograms identically to transcription, but the TextDecoder generates English tokens directly from audio embeddings. This end-to-end approach avoids cascading errors from intermediate transcription-then-translation pipelines and enables language-agnostic audio understanding.
Detects the spoken language in audio by analyzing the AudioEncoder embeddings and using the TextDecoder to predict a language token before generating transcription text. Language detection is implicit in the multitask training; the model learns to identify language from acoustic features without a separate classification head. Supports 99 languages with varying confidence based on training data representation (English: 65% of training data, others: 0.1-2%).
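The task-token mechanism behind both translation and language detection can be illustrated with a small sketch. The special-token names follow the openai/whisper tokenizer, though the real decoder is conditioned on integer token ids rather than strings:

```python
def decoder_prompt(language: str, task: str, timestamps: bool = False) -> list[str]:
    """Special-token prefix Whisper's TextDecoder is conditioned on.

    task is "transcribe" (same-language text) or "translate" (English text);
    the language token is what language detection predicts when not supplied.
    """
    assert task in ("transcribe", "translate")
    tokens = ["<|startoftranscript|>", f"<|{language}|>", f"<|{task}|>"]
    if not timestamps:
        tokens.append("<|notimestamps|>")
    return tokens

print(decoder_prompt("fr", "translate"))
# ['<|startoftranscript|>', '<|fr|>', '<|translate|>', '<|notimestamps|>']
```

Swapping the single `<|translate|>` token for `<|transcribe|>` is the entire difference between the two tasks; the encoder side is untouched.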
Maintains conversation history within a single chat session, allowing developers to ask follow-up questions, request refinements, and build on previous responses without re-providing context. The extension manages conversation state (messages, responses, context) and sends the full conversation history to ChatGPT's API with each request, enabling contextual understanding of refinement requests like 'make it faster' or 'add error handling'.
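The resend-the-full-history mechanic described above fits in a few lines; `call_api` is a placeholder for any chat-completions client, and the echo stub below exists only to make the state handling visible:

```python
class Conversation:
    """Minimal chat-state holder: the full history is resent with every request,
    which is how follow-ups like 'make it faster' stay grounded in prior turns."""

    def __init__(self, system: str = "You are a coding assistant."):
        self.messages = [{"role": "system", "content": system}]

    def ask(self, user_text: str, call_api) -> str:
        self.messages.append({"role": "user", "content": user_text})
        reply = call_api(self.messages)  # e.g. an OpenAI-style chat endpoint
        self.messages.append({"role": "assistant", "content": reply})
        return reply

# Stub API for illustration: answers with the running message count.
chat = Conversation()
echo = lambda msgs: f"reply #{len(msgs)}"
assert chat.ask("Write a sort function", echo) == "reply #2"
assert chat.ask("Make it faster", echo) == "reply #4"
```

The cost of this simplicity is that token usage grows with every turn, which is why long sessions eventually need truncation or summarization of older messages.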
Generates new code snippets based on natural language descriptions by sending the user's intent and current editor selection context to OpenAI's API, then inserting the generated code at the cursor position or displaying it in the sidebar. The extension reads the active editor's selected text to provide code context, enabling the model to generate syntactically appropriate code for the detected language. Generation is triggered via keyboard shortcut (Ctrl+Alt+G), command palette, or toolbar button.
Generates docstrings, comments, and API documentation for functions, classes, and modules by analyzing code structure and semantics using GPT-4o. The extension detects function signatures, parameter types, and return types, then generates documentation in multiple formats (JSDoc, Python docstrings, Javadoc, etc.) matching the language and project conventions. Generated docs are inserted inline with proper indentation and formatting.
Analyzes staged or modified code changes in the current Git repository and generates descriptive commit messages using the configured AI provider. The feature integrates with VS Code's Git context to identify changed files and diffs, then sends this information to the AI model to produce commit messages following conventional commit formats or project-specific conventions. This automation reduces the cognitive load of writing commit messages while maintaining code quality and repository history clarity.
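The diff-to-commit-message flow reduces to two steps: read the staged diff, then wrap it in a prompt for the model. A sketch, where the prompt wording and the conventional-commit instruction are illustrative assumptions rather than any particular extension's implementation:

```python
import subprocess

def staged_diff() -> str:
    """Staged changes, as `git diff --cached` reports them (run inside a repo)."""
    return subprocess.run(["git", "diff", "--cached"],
                          capture_output=True, text=True).stdout

def build_commit_prompt(diff: str) -> str:
    """Prompt asking the model for a conventional commit message for this diff."""
    return ("Write a one-line conventional commit message "
            "(type(scope): summary) for the following diff:\n\n" + diff)

# Example with an inline diff; in practice you would pass staged_diff().
prompt = build_commit_prompt("+ def add(a, b):\n+     return a + b")
```

The returned prompt then goes to whichever provider is configured, and the reply is offered as the commit message.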
Offers a freemium pricing structure where basic problem detection and explanations are available for free, with premium features (likely advanced fix generation, priority support, or higher API quotas) available through a paid subscription. The free tier includes GNN-based problem detection and LLM-powered explanations using Metabob's default backend, while premium tiers likely unlock OpenAI ChatGPT integration, higher analysis quotas, or team features. Pricing details are not publicly documented in the marketplace listing.
Browse Other Types
Agents: Autonomous AI systems that act on your behalf
MCP Servers: Model Context Protocol tools and integrations
Repositories: Open-source AI projects on GitHub
APIs: Programmatic endpoints for AI capabilities
Extensions: Browser and IDE extensions powered by AI
Workflows: Automation sequences and AI pipelines
Frequently Asked Questions
What is the best AI model in 2026?
There is no single best model — it depends on your task, budget, and constraints. For general reasoning, Claude and GPT-4o lead. For open-source, LLaMA 3 and Mistral are top choices. For code, Claude and DeepSeek-Coder excel. For images, Midjourney and DALL-E 3 lead. Always benchmark on your specific use case.
Should I use open-source or closed-source AI models?
Open-source models give you control, privacy, and no per-token costs — but require infrastructure and may lag frontier capabilities. Closed-source APIs give you the best models with zero infrastructure — but create vendor dependency and data privacy concerns. Many teams use both: closed-source for complex tasks, open-source for simple/high-volume tasks.