Ollama
The easiest way to run open models locally and serve them through a developer-friendly API.
Run AI models locally on your hardware. Privacy-first, offline-capable, open-source tools for self-hosted inference.
The C/C++ engine powering local AI — lightning-fast inference that Ollama and LM Studio build on.
Desktop app for discovering, running, chatting with, and serving local AI models.
Single-file portable local LLM: download and run anywhere.
Self-hosted AI interface for Ollama, OpenAI-compatible APIs, tools, RAG, and teams.
High-throughput LLM serving engine — the production standard for GPU inference at scale.
Open-source ChatGPT alternative that runs 100% offline on your computer.
Free, local, privacy-aware AI — run chatbots on consumer hardware with no GPU required.
Open-source web UI for running, testing, and serving local language models.
Self-hosted OpenAI-compatible API — drop-in replacement for cloud AI in your infrastructure.
Open local AI writing and roleplay ecosystem centered on KoboldCpp and KoboldAI Lite.