Karpathy's Autoresearch Pattern Spawns Generalized Claude Code Skill for Autonomous Iteration

Andrej Karpathy released "autoresearch" on March 6, 2026: a minimal autonomous research lab that runs on a single GPU and lets AI agents iterate on code independently. The project gained 30,307 GitHub stars within one week, making it one of the fastest-growing repositories in GitHub history. Developer Udit Goenka recognized the pattern's broader potential and, on March 13, 2026, released a generalized Claude Code skill that extends autonomous iteration beyond machine learning to any domain with measurable outcomes.

Autoresearch Implements Modify-Verify-Keep/Discard Loop

The autoresearch concept gives an AI agent a small but fully functional LLM training environment where it can iterate autonomously. The agent modifies training code, runs a 5-minute experiment, checks whether validation performance improved, and then keeps or discards the change, repeating this cycle overnight without human supervision. Users wake up to a complete log of experiments and, ideally, a meaningfully better model.
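The loop is simple enough to sketch in a few lines of Python. The version below is an illustration under stated assumptions, not code from the autoresearch repository: the names run_loop, toy_validation_loss, and propose_change are hypothetical, a single number stands in for the training code, a quadratic function stands in for the 5-minute experiment, and lower scores are assumed to be better.

```python
import random
from typing import Callable, Dict, List, Tuple


def run_loop(
    initial: float,
    propose: Callable[[float], float],
    evaluate: Callable[[float], float],
    rounds: int,
) -> Tuple[float, float, List[Dict[str, object]]]:
    """Modify, verify, then keep or discard, for a fixed number of rounds."""
    state = initial
    best = evaluate(state)  # baseline measurement before any change
    log: List[Dict[str, object]] = []
    for i in range(rounds):
        candidate = propose(state)   # modify
        score = evaluate(candidate)  # verify
        kept = score < best          # assumption: lower is better
        if kept:
            state, best = candidate, score  # keep
        # otherwise the candidate is dropped and state is unchanged (discard)
        log.append({"round": i, "score": score, "kept": kept})
    return state, best, log


def toy_validation_loss(x: float) -> float:
    # Hypothetical stand-in for a short training run; minimum at x = 3.
    return (x - 3.0) ** 2


rng = random.Random(0)  # seeded so the toy run is repeatable


def propose_change(x: float) -> float:
    # Hypothetical stand-in for the agent editing the training code.
    return x + rng.gauss(0.0, 0.5)


final_state, final_loss, log = run_loop(0.0, propose_change, toy_validation_loss, rounds=50)
kept_count = sum(1 for entry in log if entry["kept"])
print(f"kept {kept_count} of {len(log)} changes, final loss {final_loss:.4f}")
```

Because a change survives only when it beats the best score so far, the final result can never be worse than the baseline, and the log records every experiment whether or not it was kept.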

The core innovation treats the program.md file as a lightweight "skill": a natural language specification that defines agent behavior without traditional code. This shifts the developer's role from writing the code directly to writing markdown instructions that tell agents how to write and test it.

Generalized Skill Enables Autonomous Iteration Across Domains

Goenka's generalized skill extends the autoresearch approach to any problem with measurable outcomes. The repository describes it as "Autonomous goal-directed iteration for Claude Code" and has gained 608 GitHub stars in approximately 3 days. The skill implements the core principle that "constraint + mechanical metric + autonomous iteration = compounding gains."

Applications for the generalized skill include:

Security audits with vulnerability counts as metrics
API performance optimization with latency and throughput measurements
Code quality improvements with complexity or coverage metrics
Any domain where success can be quantified and tested automatically

The skill extends Claude Code's capabilities by enabling truly autonomous overnight iteration on tasks that previously required constant human supervision. By establishing clear success metrics upfront, users can let the agent explore the solution space systematically.
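One way to picture "constraint + mechanical metric" outside machine learning is as a small acceptance rule. The sketch below is hypothetical and does not reflect the skill's actual implementation: the Goal class, the Snapshot type, and the metric names p95_latency_ms and tests_failed are assumptions chosen for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Dict

# Hypothetical: the measurements taken for one candidate change.
Snapshot = Dict[str, float]


@dataclass(frozen=True)
class Goal:
    metric: str                             # which measurement to optimize
    higher_is_better: bool                  # direction of improvement
    constraint: Callable[[Snapshot], bool]  # must hold for any kept change

    def improves(self, current: Snapshot, candidate: Snapshot) -> bool:
        """Accept a candidate only if it satisfies the constraint and moves the metric."""
        if not self.constraint(candidate):
            return False
        if self.higher_is_better:
            return candidate[self.metric] > current[self.metric]
        return candidate[self.metric] < current[self.metric]


# Example goal: reduce API latency without breaking any tests.
latency_goal = Goal(
    metric="p95_latency_ms",
    higher_is_better=False,
    constraint=lambda s: s["tests_failed"] == 0,
)

baseline = {"p95_latency_ms": 120.0, "tests_failed": 0}
faster = {"p95_latency_ms": 95.0, "tests_failed": 0}
faster_but_broken = {"p95_latency_ms": 60.0, "tests_failed": 2}

accept_faster = latency_goal.improves(baseline, faster)
accept_broken = latency_goal.improves(baseline, faster_but_broken)
print(accept_faster, accept_broken)
```

The constraint is what keeps an unattended run honest: a change that improves the number by breaking the tests is rejected, so the agent cannot game the metric in that particular way.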

Community Response and Documentation

Multiple developer guides appeared in mid-March 2026, including "Getting Started with Andrej Karpathy's 'autoresearch' — Full Guide" by Nikhil in Neural Notions and "Andrej Karpathy's AutoResearch: Bye Bye Researchers" by Mehul Gupta in Data Science in Your Pocket. The rapid creation of generalized versions demonstrates how quickly the AI development community can identify and extend powerful patterns.

This development is particularly significant because it makes overnight autonomous optimization accessible to developers without ML expertise, democratizing a capability previously limited to research labs with extensive infrastructure. The autoresearch pattern represents a meta-level innovation: rather than building better AI models, it provides a framework for AI agents to improve systems autonomously through systematic experimentation.

Key Takeaways

Karpathy's autoresearch achieved 30,307 GitHub stars in one week, becoming one of the fastest-growing repositories in GitHub history
The core pattern uses a modify-verify-keep/discard loop that runs autonomously overnight on a single GPU
Udit Goenka's generalized Claude Code skill extends the pattern beyond ML to any domain with measurable outcomes, gaining 608 stars in 3 days
The approach treats markdown files as "skills" that define agent behavior through natural language rather than traditional code
Applications span security audits, API optimization, code quality improvements, and any quantifiable problem domain
