{"passport":{"unfragile":{"@version":"1.0","version":"2026-05","artifact":{"id":"ollama-llama3","slug":"llama3","name":"Llama 3 (8B, 70B)","type":"model","url":"https://ollama.com/library/llama3","page_url":"https://unfragile.ai/llama3","categories":["text-writing"],"tags":["ollama","open-source","meta"],"pricing":{"model":"open_source","free":true,"starting_price":null},"status":"active","verified":false},"capabilities":[{"id":"ollama-llama3__cap_0","uri":"capability://text.generation.language.instruction.tuned.dialogue.generation.with.8k.context.window","name":"instruction-tuned dialogue generation with 8k context window","description":"Generates contextually coherent multi-turn conversations using a Transformer architecture fine-tuned for instruction-following. The model processes chat messages in role/content JSON format, maintaining dialogue state across up to 8,192 tokens of context. Fine-tuning optimizes for natural dialogue patterns rather than raw text prediction, enabling the model to follow user instructions and maintain conversational coherence across multiple exchanges.","intents":["Build a chatbot that understands and follows user instructions naturally","Create a conversational AI assistant that maintains context across multiple turns","Deploy a local LLM that doesn't require cloud API calls for dialogue tasks","Integrate an open-source alternative to proprietary chat models like GPT-4"],"best_for":["Solo developers building local-first LLM applications","Teams deploying on-premises AI without cloud dependencies","Builders prototyping conversational agents with privacy requirements","Organizations evaluating open-source alternatives to commercial LLMs"],"limitations":["Hard 8K token context limit — cannot process documents or conversations longer than ~6,000 words without truncation","No knowledge cutoff date documented — unclear when training data ends, limiting reliability for current-events queries","Instruction-tuning optimizations may reduce raw text generation capability compared to base models","No multimodal support — text input/output only, cannot process images, audio, or video"],"requires":["Ollama runtime (local deployment) OR Ollama Cloud account (cloud deployment)","For local: minimum GPU VRAM requirement unknown (not documented)","For cloud: Free/Pro/Max tier subscription with concurrent request limits (Free=1, Pro=3, Max=10)","HTTP client or SDK (Python, JavaScript/Node.js, or cURL) to call REST API"],"input_types":["text (chat messages in JSON format with role and content fields)"],"output_types":["text (streaming or buffered response generation)"],"categories":["text-generation-language","conversational-ai"],"confidence":0.5,"matches":0,"success_rate":0},{"id":"ollama-llama3__cap_1","uri":"capability://tool.use.integration.local.rest.api.inference.with.streaming.output","name":"local rest api inference with streaming output","description":"Exposes Llama 3 inference through HTTP endpoints (`/api/chat` and `/api/generate`) that support both streaming and buffered response modes. The Ollama runtime handles model loading, quantization, and GPU memory management transparently, allowing developers to call the model via standard HTTP POST requests with JSON payloads. Streaming responses use server-sent events (SSE) or chunked transfer encoding for real-time token delivery.","intents":["Call a local LLM from any programming language without language-specific bindings","Stream model outputs in real-time to build responsive UI applications","Avoid cloud API latency and costs by running inference on local hardware","Integrate Llama 3 into existing REST-based microservice architectures"],"best_for":["Full-stack developers building web applications with local LLM backends","Teams with privacy requirements who cannot send data to cloud APIs","Builders prototyping LLM features without committing to cloud vendor pricing","Systems integrators adding LLM capabilities to existing REST-based services"],"limitations":["Ollama runtime must be running on the same machine or accessible network — adds operational overhead vs managed cloud APIs","Streaming implementation details (SSE vs chunked encoding) not documented — may require client-side adaptation","No built-in authentication or rate limiting in Ollama REST API — requires external reverse proxy (nginx, etc.) for production security","Concurrent request handling depends on Ollama plan tier (Free=1 concurrent, Pro=3, Max=10) — exceeding limits queues requests with fixed queue size"],"requires":["Ollama runtime installed and running (local or remote)","HTTP client library (curl, requests, fetch, axios, etc.)","Network connectivity to Ollama endpoint (localhost:11434 by default)","For cloud: Ollama Cloud account with appropriate tier"],"input_types":["JSON (chat messages with role/content structure or raw prompt text)"],"output_types":["text (streaming or buffered JSON responses with token-level granularity)"],"categories":["tool-use-integration","api-orchestration"],"confidence":0.5,"matches":0,"success_rate":0},{"id":"ollama-llama3__cap_10","uri":"capability://automation.workflow.session.based.usage.limits.with.time.based.resets","name":"session-based usage limits with time-based resets","description":"Ollama Cloud enforces session timeouts (5-hour limit per session) and weekly usage resets, preventing indefinite resource consumption and enforcing fair-use policies across users. Sessions expire after 5 hours of inactivity or absolute time, and weekly limits reset every 7 days. This pattern is designed for shared cloud infrastructure where per-user resource quotas prevent any single user from monopolizing resources.","intents":["Understand session lifetime constraints for long-running applications","Plan for weekly usage resets when designing batch processing workflows","Implement session management and reconnection logic for applications exceeding 5-hour sessions","Budget inference usage across weekly reset boundaries"],"best_for":["Prototyping and development workloads with predictable, short-lived sessions","Applications with weekly usage patterns that align with reset cycles","Teams implementing session management and reconnection logic","Organizations with bursty inference loads that fit within weekly quotas"],"limitations":["5-hour session limit is impractical for long-running batch jobs or 24/7 services — requires frequent reconnection","Weekly reset timing not documented — unclear if resets are UTC-based or per-user timezone","No documented way to extend sessions or request higher limits — forces application redesign for longer sessions","Unclear how session limits interact with concurrency limits — cannot predict behavior when session expires mid-request","No usage monitoring or quota visibility documented — cannot track consumption before hitting limits"],"requires":["Ollama Cloud account with understanding of session/weekly limits","Client-side session management and reconnection logic","Batch job design that respects 5-hour session boundaries"],"input_types":["HTTP requests (any format supported by Ollama API)"],"output_types":["HTTP responses (success or session-expired error)"],"categories":["automation-workflow","safety-moderation"],"confidence":0.5,"matches":0,"success_rate":0},{"id":"ollama-llama3__cap_11","uri":"capability://memory.knowledge.23.5m.model.downloads.with.community.validation","name":"23.5m+ model downloads with community validation","description":"Llama 3 has been downloaded 23.5M+ times via Ollama, indicating broad community adoption and implicit validation of model quality and usability. The high download count suggests the model is production-ready and widely trusted, though this is a social signal rather than formal certification. Ollama's model registry includes community ratings, reviews, and usage statistics that help developers assess model reliability.","intents":["Assess model maturity and community adoption before committing to production deployment","Gain confidence that model is battle-tested across diverse use cases","Discover community-reported issues, limitations, or best practices","Benchmark against other open-source models based on adoption metrics"],"best_for":["Teams evaluating open-source models for production use","Developers seeking reassurance that a model is widely adopted and stable","Organizations comparing model maturity across the open-source ecosystem","Builders looking for community-validated alternatives to proprietary models"],"limitations":["Download count is a social signal, not a quality metric — high adoption does not guarantee suitability for specific use cases","No breakdown of downloads by use case, industry, or geography — cannot assess relevance to specific domains","Community reviews and ratings not documented — unclear how feedback is aggregated or moderated","No correlation between downloads and actual production usage — many downloads may be for experimentation only"],"requires":["Access to Ollama model registry to view download statistics and community feedback"],"input_types":["none (informational capability)"],"output_types":["metadata (download count, community ratings, usage statistics)"],"categories":["memory-knowledge","community-validation"],"confidence":0.5,"matches":0,"success_rate":0},{"id":"ollama-llama3__cap_2","uri":"capability://text.generation.language.dual.variant.model.selection.instruct.vs.pre.trained.base","name":"dual-variant model selection (instruct vs pre-trained base)","description":"Provides both instruction-tuned and pre-trained base model variants of Llama 3 (8B and 70B), allowing developers to choose between dialogue-optimized models (`llama3`, `llama3:70b`) and raw foundation models (`llama3:text`, `llama3:70b-text`). The instruct variants are fine-tuned for chat/dialogue tasks, while base variants preserve the original pre-training for tasks requiring raw text generation, completion, or custom fine-tuning.","intents":["Use a dialogue-optimized model for chatbot and assistant applications without additional fine-tuning","Access the base pre-trained model for custom fine-tuning on domain-specific tasks","Compare instruction-tuned vs base model performance on the same hardware","Build applications that require raw text generation without instruction-following constraints"],"best_for":["Researchers comparing instruction-tuning effectiveness on the same base architecture","Teams planning custom fine-tuning on domain-specific data","Developers building both chat and text-generation features in the same application","Organizations evaluating trade-offs between pre-tuned convenience and base model flexibility"],"limitations":["Base model variants may require additional fine-tuning or prompt engineering to achieve dialogue quality — not drop-in replacements for instruct variants","No documented performance differences between instruct and base variants — unclear which is faster or more memory-efficient","Both variants share the same 8K token context limit — no variant offers extended context","Training data composition not documented for either variant — cannot assess domain coverage or bias characteristics"],"requires":["Ollama runtime with model selection capability","Sufficient disk space: 4.7GB (8B) or 40GB (70B) per variant","For custom fine-tuning of base models: training framework (PyTorch, Hugging Face Transformers) and GPU with sufficient VRAM"],"input_types":["text (chat format for instruct variants, raw text for base variants)"],"output_types":["text (dialogue-optimized for instruct, raw generation for base)"],"categories":["text-generation-language","model-selection"],"confidence":0.5,"matches":0,"success_rate":0},{"id":"ollama-llama3__cap_3","uri":"capability://text.generation.language.parameter.efficient.model.sizing.8b.and.70b.variants","name":"parameter-efficient model sizing (8b and 70b variants)","description":"Offers two distinct parameter counts (8 billion and 70 billion) to balance inference speed, memory footprint, and capability. The 8B variant fits on consumer GPUs and runs faster with lower latency, while the 70B variant provides higher quality outputs at the cost of increased memory and compute requirements. Both variants use the same Transformer architecture and training approach, enabling direct capability/performance comparisons.","intents":["Deploy a lightweight LLM on consumer hardware (laptops, edge devices) without sacrificing too much capability","Run a high-capability model on server GPUs for production applications requiring better output quality","Benchmark model size vs quality trade-offs for a specific use case","Choose between latency-optimized (8B) and quality-optimized (70B) deployments"],"best_for":["Solo developers with limited hardware resources (laptops, single GPU)","Teams deploying on edge devices or resource-constrained environments","Organizations comparing inference cost vs output quality for their workload","Builders prototyping with 8B before scaling to 70B for production"],"limitations":["Exact GPU VRAM requirements not documented — unclear if 8B fits on 8GB GPUs or requires 12GB+","No published latency or throughput benchmarks for either variant — cannot predict inference speed on specific hardware","70B variant requires 40GB disk space — prohibitive for many edge/mobile deployments","No intermediate sizes (e.g., 13B, 34B) — limited granularity for fine-tuning the speed/quality trade-off"],"requires":["For 8B: GPU with sufficient VRAM (exact requirement unknown, likely 8-12GB)","For 70B: GPU with 40GB+ VRAM (e.g., A100, H100, or multiple GPUs)","Ollama runtime optimized for the target hardware","40GB+ disk space for 70B variant, 4.7GB for 8B variant"],"input_types":["text (same format for both variants)"],"output_types":["text (quality/latency trade-off varies by variant)"],"categories":["text-generation-language","model-optimization"],"confidence":0.5,"matches":0,"success_rate":0},{"id":"ollama-llama3__cap_4","uri":"capability://automation.workflow.cloud.and.local.deployment.flexibility.with.usage.based.billing","name":"cloud and local deployment flexibility with usage-based billing","description":"Supports both local execution (via Ollama CLI/API on user hardware) and cloud execution (via Ollama Cloud with paid tiers). Cloud deployment uses usage-based billing tied to GPU time, with tier-based concurrency limits (Free=1, Pro=3, Max=10 concurrent requests). Local deployment requires no subscription but demands hardware management; cloud deployment trades hardware costs for operational simplicity and automatic scaling.","intents":["Start with free cloud tier for prototyping without hardware investment","Scale from local development to cloud production without code changes","Avoid GPU hardware costs by using pay-as-you-go cloud inference","Run locally for privacy-critical applications, then migrate to cloud for scaling"],"best_for":["Startups prototyping LLM features with limited hardware budgets","Teams with variable inference load (bursty traffic) where cloud elasticity adds value","Developers wanting to avoid GPU hardware management and maintenance","Organizations with privacy requirements starting local and scaling selectively to cloud"],"limitations":["Cloud pricing model not documented — no per-token or per-minute rates published, making cost prediction difficult","Concurrency limits are strict — Free tier (1 concurrent) severely limits production use; Pro/Max tiers still queue requests beyond limits with fixed queue size","Session timeout of 5 hours and weekly reset limits — unclear how this affects long-running applications or batch processing","No documented SLA or uptime guarantees for cloud tier — reliability vs self-hosted unknown","Local deployment requires GPU hardware management — no automatic scaling or failover"],"requires":["For local: Ollama runtime + GPU hardware (VRAM requirements unknown)","For cloud: Ollama Cloud account (free or paid tier)","Network connectivity to Ollama Cloud endpoint","HTTP client to call REST API from either deployment"],"input_types":["text (same format for both local and cloud)"],"output_types":["text (same format for both local and cloud)"],"categories":["automation-workflow","deployment-infrastructure"],"confidence":0.5,"matches":0,"success_rate":0},{"id":"ollama-llama3__cap_5","uri":"capability://tool.use.integration.chat.api.with.role.based.message.structure","name":"chat api with role-based message structure","description":"Implements OpenAI-compatible chat API (`/api/chat`) that accepts messages with role (user/assistant/system) and content fields in JSON format. The model processes multi-turn conversations by maintaining message history and generating contextually appropriate responses. This pattern enables drop-in compatibility with existing chat application frameworks and libraries designed for OpenAI's API.","intents":["Build chat applications using existing OpenAI-compatible libraries without rewriting integration code","Maintain conversation history and context across multiple user/assistant exchanges","Implement system prompts to guide model behavior without fine-tuning","Migrate from OpenAI API to local Llama 3 with minimal code changes"],"best_for":["Developers familiar with OpenAI API who want to use open-source models","Teams building chat UIs that need to support multiple LLM backends","Organizations migrating from proprietary to open-source LLMs","Builders leveraging existing chat frameworks (LangChain, LlamaIndex, etc.) that expect OpenAI-compatible APIs"],"limitations":["8K token context limit applies to entire conversation history — multi-turn conversations quickly exhaust context","No documented support for function calling or tool use — cannot extend chat API with external tool invocation","System prompt support not explicitly documented — unclear if system role is fully supported or treated as user message","No streaming configuration options documented — unclear if streaming is always enabled or configurable"],"requires":["Ollama runtime with `/api/chat` endpoint","HTTP client or OpenAI-compatible SDK (e.g., OpenAI Python library with custom base_url)","Knowledge of JSON message format with role/content structure"],"input_types":["JSON (array of messages with role and content fields)"],"output_types":["JSON (streaming or buffered response with role and content)"],"categories":["tool-use-integration","api-orchestration"],"confidence":0.5,"matches":0,"success_rate":0},{"id":"ollama-llama3__cap_6","uri":"capability://text.generation.language.raw.text.generation.with.prompt.based.completion","name":"raw text generation with prompt-based completion","description":"Provides a `/api/generate` endpoint for raw text completion tasks, accepting a prompt string and generating continuations without role-based structure. This mode is optimized for tasks like code generation, creative writing, summarization, and other non-dialogue text generation. The model generates tokens sequentially until reaching a stop condition (max tokens, end-of-sequence token, or user-specified stop sequences).","intents":["Generate code completions from partial function signatures or docstrings","Create creative writing or story continuations from a prompt","Summarize documents by prompting with 'Summarize: [text]' pattern","Build text generation pipelines that don't require dialogue structure"],"best_for":["Developers building code generation or completion tools","Content creators using LLMs for writing assistance","Teams building text transformation pipelines (summarization, translation, etc.)","Builders using base model variants for custom prompt engineering"],"limitations":["No built-in prompt engineering or few-shot example support — developers must manually construct prompts","8K token context limit applies to prompt + generation — long documents or many examples quickly exhaust context","No documented stop sequence support — unclear how to reliably terminate generation at desired boundaries","Instruction-tuned variants may be over-optimized for dialogue, potentially reducing raw text generation quality vs base models"],"requires":["Ollama runtime with `/api/generate` endpoint","HTTP client (curl, requests, fetch, etc.)","Prompt engineering knowledge to structure effective completion requests"],"input_types":["text (raw prompt string, no role structure)"],"output_types":["text (streaming or buffered token-by-token generation)"],"categories":["text-generation-language","code-generation-editing"],"confidence":0.5,"matches":0,"success_rate":0},{"id":"ollama-llama3__cap_7","uri":"capability://automation.workflow.quantization.transparent.model.distribution.via.ollama","name":"quantization-transparent model distribution via ollama","description":"Ollama distributes Llama 3 models in a proprietary quantized format (likely GGUF-based, though not explicitly documented) that abstracts quantization details from users. The runtime automatically selects appropriate quantization levels based on available GPU VRAM and hardware capabilities, handling model loading, memory management, and inference optimization transparently without requiring users to manually download or configure quantized weights.","intents":["Deploy Llama 3 without understanding quantization formats or GGUF specifications","Automatically optimize model loading for available hardware without manual tuning","Reduce model size and memory footprint compared to full-precision weights","Avoid managing multiple quantization variants (Q4, Q5, Q8) manually"],"best_for":["Developers new to LLM deployment who want to avoid quantization complexity","Teams prioritizing ease-of-use over fine-grained performance control","Organizations deploying across heterogeneous hardware (laptops, servers, edge devices)","Builders who want automatic hardware-aware optimization without manual configuration"],"limitations":["Quantization format and levels not documented — cannot predict exact model size, memory usage, or quality trade-offs","No option to use full-precision weights — all models distributed in quantized form","Quantization selection is automatic and opaque — cannot manually select specific quantization levels (Q4 vs Q5 vs Q8)","Unknown quantization impact on output quality — no benchmarks comparing Ollama-quantized vs original weights"],"requires":["Ollama runtime (handles quantization transparently)","Sufficient disk space for quantized model (4.7GB for 8B, 40GB for 70B)","GPU with sufficient VRAM (exact requirements unknown)"],"input_types":["text (same as any Llama 3 deployment)"],"output_types":["text (same as any Llama 3 deployment)"],"categories":["automation-workflow","model-optimization"],"confidence":0.5,"matches":0,"success_rate":0},{"id":"ollama-llama3__cap_8","uri":"capability://tool.use.integration.multi.language.sdk.support.python.javascript.curl","name":"multi-language sdk support (python, javascript, curl)","description":"Ollama provides language-specific bindings and examples for Python, JavaScript/Node.js, and cURL, enabling developers to call Llama 3 inference from their preferred language without implementing HTTP clients from scratch. Each SDK abstracts the REST API details while maintaining the same underlying HTTP interface, allowing polyglot teams to integrate the same model across different services.","intents":["Call Llama 3 from Python scripts or Django/FastAPI applications","Integrate Llama 3 into Node.js/Express backends or browser-based applications","Test Llama 3 endpoints quickly using cURL without writing code","Build polyglot microservices where different services use different languages"],"best_for":["Full-stack teams using multiple programming languages","Python developers building data science or ML applications","JavaScript/Node.js developers building web backends or full-stack applications","DevOps engineers testing Ollama endpoints in shell scripts"],"limitations":["SDK documentation and feature parity not documented — unclear if all SDKs support streaming, error handling, or advanced features equally","No official SDKs for Go, Rust, Java, or other languages — limits adoption in polyglot organizations","SDK implementation details not provided — cannot assess abstraction quality or performance overhead","No async/await patterns documented for JavaScript SDK — unclear if streaming is properly handled in async contexts"],"requires":["For Python: Python 3.6+ (exact version not specified)","For JavaScript: Node.js 12+ (exact version not specified)","For cURL: curl command-line tool (any version)","Ollama runtime running and accessible at configured endpoint"],"input_types":["text (same format across all SDKs)"],"output_types":["text (same format across all SDKs)"],"categories":["tool-use-integration","api-orchestration"],"confidence":0.5,"matches":0,"success_rate":0},{"id":"ollama-llama3__cap_9","uri":"capability://automation.workflow.concurrent.request.handling.with.tier.based.limits","name":"concurrent request handling with tier-based limits","description":"Ollama Cloud enforces concurrency limits based on subscription tier (Free=1, Pro=3, Max=10 concurrent requests), queuing requests that exceed the limit with a fixed queue size. Requests beyond the queue capacity are rejected with an error. This pattern prevents resource exhaustion on shared cloud infrastructure while allowing burst traffic up to the queue limit.","intents":["Understand how many simultaneous requests your tier can handle","Plan for request queuing behavior when traffic exceeds concurrency limits","Choose appropriate tier based on expected concurrent user load","Implement client-side retry logic for rejected requests"],"best_for":["Teams with predictable, low-concurrency workloads (Free tier for prototyping)","Applications with moderate concurrent users (Pro tier for day-to-day use)","High-traffic production systems (Max tier for sustained heavy load)","Developers implementing retry logic and graceful degradation"],"limitations":["Queue size not documented — cannot predict how many requests will be queued before rejection","No priority queuing or request prioritization — all requests treated equally","Strict concurrency limits may cause cascading failures if clients don't implement exponential backoff","No documented SLA for queue wait times — unclear how long queued requests wait before execution","Free tier (1 concurrent) is impractical for any production use — forces immediate upgrade"],"requires":["Ollama Cloud account with appropriate tier (Free, Pro, or Max)","Client-side retry logic with exponential backoff for handling rejections","Understanding of expected concurrent request load for tier selection"],"input_types":["HTTP requests (any format supported by Ollama API)"],"output_types":["HTTP responses (success or rejection with error code)"],"categories":["automation-workflow","safety-moderation"],"confidence":0.5,"matches":0,"success_rate":0}],"trust":{"score":24,"verified":false,"data_access_risk":"high","permissions":["Ollama runtime (local deployment) OR Ollama Cloud account (cloud deployment)","For local: minimum GPU VRAM requirement unknown (not documented)","For cloud: Free/Pro/Max tier subscription with concurrent request limits (Free=1, Pro=3, Max=10)","HTTP client or SDK (Python, JavaScript/Node.js, or cURL) to call REST API","Ollama runtime installed and running (local or remote)","HTTP client library (curl, requests, fetch, axios, etc.)","Network connectivity to Ollama endpoint (localhost:11434 by default)","For cloud: Ollama Cloud account with appropriate tier","Ollama Cloud account with understanding of session/weekly limits","Client-side session management and reconnection logic"],"failure_modes":["Hard 8K token context limit — cannot process documents or conversations longer than ~6,000 words without truncation","No knowledge cutoff date documented — unclear when training data ends, limiting reliability for current-events queries","Instruction-tuning optimizations may reduce raw text generation capability compared to base models","No multimodal support — text input/output only, cannot process images, audio, or video","Ollama runtime must be running on the same machine or accessible network — adds operational overhead vs managed cloud APIs","Streaming implementation details (SSE vs chunked encoding) not documented — may require client-side adaptation","No built-in authentication or rate limiting in Ollama REST API — requires external reverse proxy (nginx, etc.) for production security","Concurrent request handling depends on Ollama plan tier (Free=1 concurrent, Pro=3, Max=10) — exceeding limits queues requests with fixed queue size","5-hour session limit is impractical for long-running batch jobs or 24/7 services — requires frequent reconnection","Weekly reset timing not documented — unclear if resets are UTC-based or per-user timezone","builder identity is not verified yet","no observed match outcomes yet"],"rank_breakdown":{"adoption":0.05,"quality":0.34,"ecosystem":0.38999999999999996,"match_graph":0.25,"freshness":0.75,"weights":{"adoption":0.35,"quality":0.2,"ecosystem":0.1,"match_graph":0.3,"freshness":0.05}},"observed_outcomes":{"matches":0,"success_rate":0,"avg_confidence":0,"top_intents":[],"last_matched_at":null},"maintenance":{"status":"active","updated_at":"2026-05-24T12:16:24.483Z","last_scraped_at":"2026-05-03T15:20:48.403Z","last_commit":null},"community":{"stars":null,"forks":null,"weekly_downloads":null,"model_downloads":null,"model_likes":null}},"distribution":{"claim_url":"https://unfragile.ai/submit?claim=llama3","compare_url":"https://unfragile.ai/compare?artifact=llama3"}},"signature":"peicZZ7GWN+pYGPB1Os71mlXO/t+DjruQAqf+OMvaaYvZV89KfrZs2mAP7BFaN3Fjl8wXHfXkS2pzFDWoVkbBg==","signedAt":"2026-06-21T07:17:39.756Z","signedBy":"unfragile.ai","version":1},"_links":{"self":"https://unfragile.ai/api/v1/passport/llama3","artifact":"https://unfragile.ai/llama3","verify":"https://unfragile.ai/api/v1/verify?slug=llama3","publicKey":"https://unfragile.ai/api/v1/trust-passport-public-key","spec":"https://unfragile.ai/trust","schema":"https://unfragile.ai/schema.json","docs":"https://unfragile.ai/docs"}}