Developer Andyyyy64 released whichllm, a command-line tool that identifies the optimal local language model for specific hardware configurations. Posted to Hacker News as a Show HN on May 15, 2026, the tool received 255 points and 48 comments while accumulating 426 GitHub stars.
Evidence-Based Ranking System Addresses Benchmark Inflation
whichllm integrates multiple benchmark sources, including LiveBench, Artificial Analysis, Aider, Chatbot Arena ELO, and the Open LLM Leaderboard. The tool assigns confidence scores to prevent outdated evaluations from inflating older models' rankings, addressing what the developer describes as the need for "real, recency-aware benchmarks, not parameter count."
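The post does not publish the exact weighting scheme, but a recency-demoted confidence weight can be sketched minimally. The per-source weights and 180-day half-life below are illustrative assumptions, not values from the project:

```python
from datetime import date

# Illustrative per-source weights -- assumptions, not whichllm's actual values.
SOURCE_WEIGHT = {
    "livebench": 1.0,
    "artificial_analysis": 0.9,
    "aider": 0.9,
    "chatbot_arena_elo": 0.8,
    "open_llm_leaderboard": 0.6,
}

def confidence(source: str, eval_date: date, half_life_days: int = 180) -> float:
    """Demote stale evaluations: the weight halves every half_life_days."""
    age_days = (date.today() - eval_date).days
    return SOURCE_WEIGHT.get(source, 0.5) * 0.5 ** (age_days / half_life_days)

def merged_score(evals: list[tuple[str, float, date]]) -> float:
    """Confidence-weighted mean over (source, score 0-100, eval_date) entries."""
    weights = [confidence(src, d) for src, _, d in evals]
    total = sum(weights)
    if total == 0:
        return 0.0
    return sum(w * score for w, (_, score, _) in zip(weights, evals)) / total
```

Under this scheme a year-old leaderboard entry contributes a fraction of the weight of a fresh LiveBench result, which is what keeps older models from coasting on stale evaluations.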
The ranking engine uses a 0-100 scoring system that factors in fit type, speed, and evidence quality, helping users navigate the hundreds of available local LLMs without manually cross-referencing benchmarks and hardware requirements.
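The factor weights are likewise undocumented; a minimal sketch of a 0-100 composite, assuming each input is pre-normalized and using illustrative weights, might look like this:

```python
# Hypothetical fit-type multipliers: how comfortably the model fits in VRAM.
FIT_FACTOR = {"full_gpu": 1.0, "partial_offload": 0.7, "cpu_only": 0.4}

def composite_score(quality: float, fit: str, tok_per_s: float,
                    evidence: float, max_tok_per_s: float = 100.0) -> float:
    """Blend quality (0-100), hardware fit, speed, and evidence (0-1) into 0-100.

    The 0.6 / 0.25 / 0.15 weights are illustrative assumptions, not the
    project's published values.
    """
    speed = min(tok_per_s / max_tok_per_s, 1.0)
    blended = 0.6 * (quality / 100) + 0.25 * speed + 0.15 * evidence
    return round(100 * blended * FIT_FACTOR.get(fit, 0.4), 1)

print(composite_score(quality=72.0, fit="full_gpu", tok_per_s=55, evidence=0.8))
```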
Hardware Intelligence and Resource Estimation
The tool auto-detects NVIDIA, AMD, and Apple Silicon GPUs, then estimates VRAM requirements covering weights, KV cache, activation memory, and overhead. It also calculates expected throughput from memory bandwidth and quantization factors, yielding actionable recommendations for what will actually run on the user's hardware.
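The article does not show whichllm's exact accounting, but standard back-of-envelope formulas capture the idea: weight memory scales with parameter count and quantization bits, the KV cache with layers, KV heads, head dimension, and context length, and decode throughput is roughly memory bandwidth divided by weight size. A minimal sketch under those assumptions:

```python
def estimate_vram_gb(params_b: float, bits: int, n_layers: int,
                     kv_heads: int, head_dim: int, ctx_len: int,
                     overhead_frac: float = 0.10) -> float:
    """Rough VRAM estimate: weights + KV cache + activations + overhead.

    Standard back-of-envelope formulas, not whichllm's exact accounting.
    """
    weights = params_b * 1e9 * bits / 8                    # quantized weights, bytes
    kv = 2 * n_layers * kv_heads * head_dim * ctx_len * 2  # K and V, fp16 (2 bytes)
    activations = 0.05 * weights                           # coarse assumption
    total = (weights + kv + activations) * (1 + overhead_frac)
    return total / 1e9

def estimate_tok_per_s(weights_gb: float, bandwidth_gbps: float) -> float:
    """Decode is memory-bound: each token streams the full weights once."""
    return bandwidth_gbps / weights_gb

# Example: an 8B model at 4-bit (0.5 bytes/param) on a ~1000 GB/s GPU.
vram = estimate_vram_gb(8, 4, 32, 8, 128, 8192)
print(f"~{vram:.1f} GB VRAM, ~{estimate_tok_per_s(8 * 0.5, 1000):.0f} tok/s")
```

The example works out to roughly 6 GB of VRAM and on the order of 250 tokens per second, which is why an 8B model at 4-bit is a comfortable fit on a 24 GB card.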
Technical Features and Availability
whichllm comprises four main modules, composed as sketched after this list:
- Hardware detection covering GPU specs, RAM, and compute capability
- Model fetching via live HuggingFace API with caching
- Benchmark aggregation with merged scores and recency demotion
- Ranking engine factoring multiple performance dimensions
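These stages suggest a linear pipeline. The sketch below is a hypothetical composition with stand-in names and data, not whichllm's actual internal API:

```python
from dataclasses import dataclass

# Hypothetical stand-ins for the four modules; names, types, and data are
# illustrative, not whichllm's real internals.

@dataclass
class Hardware:
    gpu: str
    vram_gb: float

def detect_hardware() -> Hardware:
    # Stand-in for GPU/RAM/compute-capability probing.
    return Hardware(gpu="example", vram_gb=24.0)

def fetch_models() -> list[str]:
    # Stand-in for the cached HuggingFace API fetch.
    return ["model-a", "model-b"]

def aggregate_benchmarks() -> dict[str, float]:
    # Stand-in for merged, recency-demoted benchmark scores.
    return {"model-a": 72.0, "model-b": 65.5}

def rank(models: list[str], hw: Hardware,
         scores: dict[str, float]) -> list[tuple[str, float]]:
    # Stand-in ranking: sort by merged score; the real engine also
    # factors fit type, speed, and evidence quality.
    return sorted(((m, scores.get(m, 0.0)) for m in models),
                  key=lambda t: t[1], reverse=True)

def recommend(top_n: int = 5) -> list[tuple[str, float]]:
    hw = detect_hardware()
    return rank(fetch_models(), hw, aggregate_benchmarks())[:top_n]

print(recommend())
```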
The tool supports model simulation, JSON output for scripting integration, and instant chat sessions via the whichllm run command. It requires Python 3.11+ and is distributed under the MIT license via PyPI, Homebrew, and source installation.
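The JSON output makes recommendations easy to consume from scripts; a minimal sketch, assuming a hypothetical --json flag (the post confirms JSON output but not the flag's actual name):

```python
import json
import subprocess

# Assumes a hypothetical `--json` flag; the post confirms JSON output for
# scripting but not the exact CLI flag name.
result = subprocess.run(["whichllm", "--json"], capture_output=True,
                        text=True, check=True)
data = json.loads(result.stdout)
print(json.dumps(data, indent=2))
```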
Hacker News commenters praised the tool for automating the choice among hundreds of local LLMs, with multiple users noting they had previously been solving the same problem by hand.
Key Takeaways
- whichllm automates selection of optimal local LLMs based on hardware and multiple benchmark sources
- The tool integrates LiveBench, Artificial Analysis, Aider, Chatbot Arena ELO, and Open LLM Leaderboard
- Auto-detects NVIDIA, AMD, and Apple Silicon GPUs and estimates VRAM requirements and throughput
- Released on May 15, 2026; it has since accumulated 426 GitHub stars and a strong Hacker News reception
- Available via PyPI, Homebrew, and source installation under MIT license