Via “iOS SDK with Metal GPU acceleration and app extension support”
Run frontier LLMs and VLMs with day-0 model support across GPU, NPU, and CPU, with runtime coverage for PC (Python/C++), mobile (Android & iOS), and Linux/IoT (Arm64 & x86 Docker). Supported models include OpenAI GPT-OSS, IBM Granite-4, Qwen-3-VL, Gemma-3n, Ministral-3, and more.
Unique: the iOS SDK runs inference through Metal GPU compute shaders, achieving a 2-3x speedup over CPU on A-series chips. App extension support enables inference in restricted contexts (e.g. Siri and keyboard extensions) through careful memory management and background task handling; both mechanisms are sketched below.
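Neither the SDK's API nor its shader code appears in this entry, so the two sketches below are illustrative only. The first shows the kind of Metal compute dispatch an inference engine issues to run a kernel on the A-series GPU; the kernel name `matmul_f16`, the buffer layout, and the function name are assumptions, not the SDK's actual shaders.

```swift
import Metal

/// Illustrative only: a minimal Metal compute dispatch of the kind an
/// on-device inference engine uses to run a shader (e.g. a matmul or
/// dequantize kernel). "matmul_f16" is a placeholder kernel name.
func dispatchMatmulKernel(input: MTLBuffer, weights: MTLBuffer, output: MTLBuffer,
                          rows: Int, cols: Int) throws {
    guard let device = MTLCreateSystemDefaultDevice(),
          let queue = device.makeCommandQueue(),
          let kernel = try device.makeDefaultLibrary(bundle: .main).makeFunction(name: "matmul_f16") else {
        throw NSError(domain: "MetalSetup", code: -1)
    }
    let pipeline = try device.makeComputePipelineState(function: kernel)

    guard let commandBuffer = queue.makeCommandBuffer(),
          let encoder = commandBuffer.makeComputeCommandEncoder() else {
        throw NSError(domain: "MetalEncode", code: -2)
    }
    encoder.setComputePipelineState(pipeline)
    encoder.setBuffer(input,   offset: 0, index: 0)
    encoder.setBuffer(weights, offset: 0, index: 1)
    encoder.setBuffer(output,  offset: 0, index: 2)

    // One GPU thread per output element; threadgroup size is bounded by the
    // pipeline's limit. dispatchThreads handles non-uniform grids on A11+.
    let grid = MTLSize(width: cols, height: rows, depth: 1)
    let side = max(Int(Double(pipeline.maxTotalThreadsPerThreadgroup).squareRoot()), 1)
    let group = MTLSize(width: side, height: side, depth: 1)
    encoder.dispatchThreads(grid, threadsPerThreadgroup: group)
    encoder.endEncoding()

    commandBuffer.commit()
    commandBuffer.waitUntilCompleted()
}
```

The second sketch illustrates the restricted-context handling using only public platform APIs: `os_proc_available_memory()` to respect the extension's much smaller memory ceiling, and `ProcessInfo.performExpiringActivity(withReason:)` so generation finishes or bails out before the extension is suspended. The `LLMEngine` protocol and the memory threshold are hypothetical stand-ins, not the SDK's interface.

```swift
import Foundation
import os

/// Hypothetical stand-in for whatever engine the SDK exposes.
protocol LLMEngine {
    func generate(prompt: String, maxTokens: Int) -> String
}

/// Sketch of running inference inside an app extension (keyboard, Siri),
/// where memory is capped far below the host app's limit and the process
/// can be suspended at any moment.
final class ExtensionInferenceRunner {
    private let engine: LLMEngine
    private let logger = Logger(subsystem: "com.example.inference", category: "extension")

    init(engine: LLMEngine) { self.engine = engine }

    func complete(prompt: String, handler: @escaping (String?) -> Void) {
        // Keyboard and Siri extensions typically get only tens of MB, so check
        // headroom before touching model weights. (Threshold is illustrative.)
        let availableBytes = Int(os_proc_available_memory())
        guard availableBytes > 50 * 1024 * 1024 else {
            logger.warning("Skipping on-device inference: only \(availableBytes) bytes free")
            handler(nil)
            return
        }

        // performExpiringActivity grants a short grace period to finish (or
        // abandon) the generation before the system suspends the extension.
        ProcessInfo.processInfo.performExpiringActivity(withReason: "llm-generation") { expired in
            guard !expired else {
                handler(nil)   // about to be suspended: bail out cleanly
                return
            }
            handler(self.engine.generate(prompt: prompt, maxTokens: 64))
        }
    }
}
```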
vs others: the only on-device inference SDK for iOS with native Metal GPU acceleration and app extension support; competitors such as Ollama and LM Studio ship no iOS SDK at all.