§ Best of · Updated May 2026
Open-source AI is no longer the consolation prize — it's competitive on capability and decisive on cost, privacy, and control. The tools below are the open-source options that have crossed the production-ready bar.
§ The picks
The easiest way to run open models locally and serve them through a developer-friendly API.
Run open-weight models locally with one command. The friendliest entry point to local AI.
Single-file portable local LLM — download and run anywhere
Download a single executable and run an LLM — no Python, CUDA, or setup ceremony required.
The C/C++ engine powering local AI — lightning-fast inference that Ollama and LM Studio build on.
The local inference workhorse behind countless desktop and server deployments. Boring in the best possible way.
High-throughput LLM serving engine — the production standard for GPU inference at scale.
High-throughput open-source serving for serious inference workloads. The pick once local demos become production traffic.
Data framework and managed services for RAG, agents, document parsing, and knowledge apps.
Open-source RAG framework. The fastest path from documents to a production retrieval pipeline.
The most popular framework for building LLM applications — chains, agents, and RAG made easy.
Open-source agent and chain orchestration. Polarizing in the community, but ubiquitous in real codebases.
Black Forest Labs image models for high-quality generation, editing, and open-weight workflows.
Open-weight image generation that rivals closed-source quality. The bedrock of community fine-tunes in 2026.
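Several of the picks above are used through a local HTTP API rather than a GUI. As a rough sketch of what that looks like, the snippet below builds and sends a request to an Ollama server, assuming it is running on its default address (`localhost:11434`) and that a model has already been pulled. The helper names and the model name you pass in are illustrative, not part of any tool's API.

```python
import json
import urllib.request

# Assumption: an Ollama server is running locally on its default port.
OLLAMA_URL = "http://localhost:11434/api/generate"


def build_generate_request(model: str, prompt: str) -> urllib.request.Request:
    """Build a non-streaming text-generation request (hypothetical helper)."""
    payload = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def generate(model: str, prompt: str, timeout: float = 120.0) -> str:
    """Send the request and return the generated text from the JSON reply."""
    request = build_generate_request(model, prompt)
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        return json.loads(resp.read())["response"]
```

Calling `generate("<model-you-pulled>", "Say hello in five words.")` only works with the server running. Nothing in the sketch leaves your machine, which is the privacy argument in practice.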
§ Common questions
Closed-source still leads at the absolute frontier (reasoning, agentic work, longest context). For the 80% of tasks below the frontier, open-source models are competitive — and you control the deployment.
For 7B-13B models, a modern Mac with 16-32GB RAM works. For 70B+ models, you'll want a GPU server (A100, H100) or a cloud inference provider. Ollama's default model downloads are typically already quantized, which helps on resource-constrained setups.
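The sizing advice above follows from simple arithmetic: weight memory is roughly parameter count times bits per weight, divided by 8. A back-of-the-envelope sketch, assuming 16-bit and 4-bit weights as the two reference points and counting weights only (no KV cache, activations, or runtime overhead, so real usage is higher):

```python
def weight_memory_gb(params_billions: float, bits_per_weight: float) -> float:
    """Approximate memory for model weights alone, in GB (1 GB = 1e9 bytes)."""
    total_bytes = params_billions * 1e9 * bits_per_weight / 8
    return total_bytes / 1e9


for size in (7, 13, 70):
    fp16 = weight_memory_gb(size, 16)
    q4 = weight_memory_gb(size, 4)
    print(f"{size}B: ~{fp16:.0f} GB at 16-bit, ~{q4:.1f} GB at 4-bit")

# Prints:
# 7B: ~14 GB at 16-bit, ~3.5 GB at 4-bit
# 13B: ~26 GB at 16-bit, ~6.5 GB at 4-bit
# 70B: ~140 GB at 16-bit, ~35.0 GB at 4-bit
```

This is why quantized 7B-13B models fit comfortably in 16-32GB of RAM, while a 70B model pushes you toward server-class hardware. Practical 4-bit formats store slightly more than 4 bits per weight, so treat these numbers as a lower bound.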
Three reasons: privacy (data doesn't leave your infrastructure), cost (no per-token pricing), and control (no model deprecation, no surprise rate limits). Pay the OSS tax in setup time; collect the dividend forever after.