Developer Andyyyy64 released whichllm, a command-line tool that identifies the optimal local language model for specific hardware configurations. Posted to Hacker News as a Show HN on May 15, 2026, the tool received 255 points and 48 comments while accumulating 426 GitHub stars.
Evidence-Based Ranking System Addresses Benchmark Inflation
whichllm integrates multiple benchmark sources, including LiveBench, Artificial Analysis, Aider, Chatbot Arena ELO, and the Open LLM Leaderboard. The tool assigns confidence scores to prevent outdated evaluations from inflating older models' rankings, addressing what the developer describes as the need for "real, recency-aware benchmarks, not parameter count."
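The post does not publish the exact weighting scheme, but a recency-demoted confidence weight can be sketched minimally. The per-source weights and 180-day half-life below are illustrative assumptions, not values from the project:

```python
from datetime import date

# Illustrative per-source weights -- assumptions, not whichllm's actual values.
SOURCE_WEIGHT = {
    "livebench": 1.0,
    "artificial_analysis": 0.9,
    "aider": 0.9,
    "chatbot_arena_elo": 0.8,
    "open_llm_leaderboard": 0.6,
}

def confidence(source: str, eval_date: date, half_life_days: int = 180) -> float:
    """Demote stale evaluations: the weight halves every half_life_days."""
    age_days = (date.today() - eval_date).days
    return SOURCE_WEIGHT.get(source, 0.5) * 0.5 ** (age_days / half_life_days)

def merged_score(evals: list[tuple[str, float, date]]) -> float:
    """Confidence-weighted mean over (source, score 0-100, eval_date) entries."""
    weights = [confidence(src, d) for src, _, d in evals]
    total = sum(weights)
    if total == 0:
        return 0.0
    return sum(w * score for w, (_, score, _) in zip(weights, evals)) / total
```

Under this scheme a year-old leaderboard entry contributes a fraction of the weight of a fresh LiveBench result, which is what keeps older models from coasting on stale evaluations.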
The ranking engine uses a 0-100 scoring system that factors in fit type, speed, and evidence quality, helping users navigate the hundreds of available local LLMs without manually cross-referencing benchmarks and hardware requirements.
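The factor weights are likewise undocumented; a minimal sketch of a 0-100 composite, assuming each input is pre-normalized and using illustrative weights, might look like this:

```python
# Hypothetical fit-type multipliers: how comfortably the model fits in VRAM.
FIT_FACTOR = {"full_gpu": 1.0, "partial_offload": 0.7, "cpu_only": 0.4}

def composite_score(quality: float, fit: str, tok_per_s: float,
                    evidence: float, max_tok_per_s: float = 100.0) -> float:
    """Blend quality (0-100), hardware fit, speed, and evidence (0-1) into 0-100.

    The 0.6 / 0.25 / 0.15 weights are illustrative assumptions, not the
    project's published values.
    """
    speed = min(tok_per_s / max_tok_per_s, 1.0)
    blended = 0.6 * (quality / 100) + 0.25 * speed + 0.15 * evidence
    return round(100 * blended * FIT_FACTOR.get(fit, 0.4), 1)

print(composite_score(quality=72.0, fit="full_gpu", tok_per_s=55, evidence=0.8))
```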
Hardware Intelligence and Resource Estimation
The tool auto-detects NVIDIA, AMD, and Apple Silicon GPUs, then estimates VRAM requirements covering weights, KV cache, activation memory, and overhead. It also calculates expected throughput from memory bandwidth and quantization factors, yielding actionable recommendations for what will actually run on the user's hardware.
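The article does not show whichllm's exact accounting, but standard back-of-envelope formulas capture the idea: weight memory scales with parameter count and quantization bits, the KV cache with layers, KV heads, head dimension, and context length, and decode throughput is roughly memory bandwidth divided by weight size. A minimal sketch under those assumptions:

```python
def estimate_vram_gb(params_b: float, bits: int, n_layers: int,
                     kv_heads: int, head_dim: int, ctx_len: int,
                     overhead_frac: float = 0.10) -> float:
    """Rough VRAM estimate: weights + KV cache + activations + overhead.

    Standard back-of-envelope formulas, not whichllm's exact accounting.
    """
    weights = params_b * 1e9 * bits / 8                    # quantized weights, bytes
    kv = 2 * n_layers * kv_heads * head_dim * ctx_len * 2  # K and V, fp16 (2 bytes)
    activations = 0.05 * weights                           # coarse assumption
    total = (weights + kv + activations) * (1 + overhead_frac)
    return total / 1e9

def estimate_tok_per_s(weights_gb: float, bandwidth_gbps: float) -> float:
    """Decode is memory-bound: each token streams the full weights once."""
    return bandwidth_gbps / weights_gb

# Example: an 8B model at 4-bit (0.5 bytes/param) on a ~1000 GB/s GPU.
vram = estimate_vram_gb(8, 4, 32, 8, 128, 8192)
print(f"~{vram:.1f} GB VRAM, ~{estimate_tok_per_s(8 * 0.5, 1000):.0f} tok/s")
```

The example works out to roughly 6 GB of VRAM and on the order of 250 tokens per second, which is why an 8B model at 4-bit is a comfortable fit on a 24 GB card.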
Technical Features and Availability
whichllm comprises four main modules, composed as sketched after this list:
- Hardware detection covering GPU specs, RAM, and compute capability
- Model fetching via live HuggingFace API with caching
- Benchmark aggregation with merged scores and recency demotion
- Ranking engine factoring multiple performance dimensions
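These stages suggest a linear pipeline. The sketch below is a hypothetical composition with stand-in names and data, not whichllm's actual internal API:

```python
from dataclasses import dataclass

# Hypothetical stand-ins for the four modules; names, types, and data are
# illustrative, not whichllm's real internals.

@dataclass
class Hardware:
    gpu: str
    vram_gb: float

def detect_hardware() -> Hardware:
    # Stand-in for GPU/RAM/compute-capability probing.
    return Hardware(gpu="example", vram_gb=24.0)

def fetch_models() -> list[str]:
    # Stand-in for the cached HuggingFace API fetch.
    return ["model-a", "model-b"]

def aggregate_benchmarks() -> dict[str, float]:
    # Stand-in for merged, recency-demoted benchmark scores.
    return {"model-a": 72.0, "model-b": 65.5}

def rank(models: list[str], hw: Hardware,
         scores: dict[str, float]) -> list[tuple[str, float]]:
    # Stand-in ranking: sort by merged score; the real engine also
    # factors fit type, speed, and evidence quality.
    return sorted(((m, scores.get(m, 0.0)) for m in models),
                  key=lambda t: t[1], reverse=True)

def recommend(top_n: int = 5) -> list[tuple[str, float]]:
    hw = detect_hardware()
    return rank(fetch_models(), hw, aggregate_benchmarks())[:top_n]

print(recommend())
```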
The tool supports model simulation, JSON output for scripting integration, and instant chat sessions via the whichllm run command. It requires Python 3.11+ and is distributed under the MIT license via PyPI, Homebrew, and source installation.
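The JSON output makes recommendations easy to consume from scripts; a minimal sketch, assuming a hypothetical --json flag (the post confirms JSON output but not the flag's actual name):

```python
import json
import subprocess

# Assumes a hypothetical `--json` flag; the post confirms JSON output for
# scripting but not the exact CLI flag name.
result = subprocess.run(["whichllm", "--json"], capture_output=True,
                        text=True, check=True)
data = json.loads(result.stdout)
print(json.dumps(data, indent=2))
```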
Hacker News commenters praised the tool for automating the choice among hundreds of local LLMs, with multiple users noting they had previously been solving the same problem by hand.
Key Takeaways
- whichllm automates selection of optimal local LLMs based on hardware and multiple benchmark sources
- The tool integrates LiveBench, Artificial Analysis, Aider, Chatbot Arena ELO, and Open LLM Leaderboard
- Auto-detects NVIDIA, AMD, and Apple Silicon GPUs and estimates VRAM requirements and throughput
- Released on May 15, 2026; it has since accumulated 426 GitHub stars and a strong Hacker News reception
- Available via PyPI, Homebrew, and source installation under MIT license