MIT CSAIL researchers Yulu Gan and Phillip Isola have discovered that large, well-pretrained models contain numerous task-specific solutions densely packed around their original weights, challenging fundamental assumptions about model fine-tuning. Their paper, "Neural Thickets: Diverse Task Experts Are Dense Around Pretrained Weights," published on arXiv on March 12, 2026, introduces RandOpt—a simple random sampling method that matches the performance of sophisticated techniques like PPO and GRPO without any gradient-based optimization.
Large Models Have Dense Expert Solutions, Small Models Do Not
The researchers treat pretrained weights not as a single starting point but as a distribution containing multiple task-specific solutions. In large, well-pretrained models, diverse task-improving specialists populate a substantial fraction of the neighborhood around pretrained weights. In contrast, smaller models have sparse expert solutions that require gradient-based optimization to discover. This fundamental difference in loss landscape geometry means the optimal post-training strategy depends heavily on model scale.
RandOpt: Random Sampling Matches Sophisticated Optimization Methods
The RandOpt method is remarkably simple: randomly sample N parameter perturbations around the pretrained weights, select the top K performers on the task, and combine them via majority voting. This fully parallel approach achieves results competitive with PPO, GRPO, and evolutionary strategies. The success of such a simple method suggests that for large models, finding good task specialists may be easier than previously assumed—reframing post-training from an optimization problem into a sampling and selection problem.
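The sample–select–vote recipe described above can be sketched on a toy task. Note this is an illustrative reconstruction, not the paper's implementation: the linear classification task, the synthetic "pretrained" weights, and the hyperparameters (`n_samples`, `k`, `sigma`) are all assumptions chosen to keep the example self-contained.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-in task: 2-class linear classification on synthetic data.
X = rng.normal(size=(200, 8))
w_true = rng.normal(size=8)
y = (X @ w_true > 0).astype(int)

# "Pretrained" weights: a noisy but decent solution (an assumption here,
# standing in for a large model's pretrained checkpoint).
w_pre = w_true + 0.5 * rng.normal(size=8)

def accuracy(w):
    """Task score for a candidate weight vector."""
    return ((X @ w > 0).astype(int) == y).mean()

def randopt(w_pre, n_samples=500, k=10, sigma=0.3):
    """RandOpt sketch: sample perturbations around the pretrained weights,
    keep the top-k by task score, and ensemble them by majority vote.
    Every candidate can be evaluated independently, so the loop is
    embarrassingly parallel in a real setting."""
    candidates = [w_pre + sigma * rng.normal(size=w_pre.shape)
                  for _ in range(n_samples)]
    top_k = sorted(candidates, key=accuracy, reverse=True)[:k]

    def predict(X):
        # Each selected specialist votes; the majority label wins.
        votes = np.stack([(X @ w > 0).astype(int) for w in top_k])
        return (votes.mean(axis=0) > 0.5).astype(int)

    return top_k, predict

top_k, predict = randopt(w_pre)
print(f"pretrained acc: {accuracy(w_pre):.3f}  "
      f"RandOpt ensemble acc: {(predict(X) == y).mean():.3f}")
```

No gradients are computed anywhere: the only operations are sampling, scoring, sorting, and voting, which is what lets the method trade optimization machinery for parallel evaluation.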
Open Implementation and Interactive Demos Available
The researchers have released their code on GitHub at sunrainyg/RandOpt and created an interactive project website at thickets.mit.edu. The work has significant practical implications: it suggests that practitioners working with large pretrained models may be able to skip expensive gradient-based fine-tuning in favor of simple random search, potentially reducing both computational cost and implementation complexity for many tasks.
Key Takeaways
- Large pretrained models contain dense concentrations of task-specific solutions around their original weights, unlike smaller models where experts are sparse
- RandOpt achieves performance competitive with PPO and GRPO using only random sampling and majority voting, with no gradient descent required
- The finding reframes post-training as a sampling problem rather than an optimization problem for large models
- Code is available on GitHub at sunrainyg/RandOpt with interactive demos at thickets.mit.edu
- The discovery challenges fundamental assumptions about how loss landscape geometry changes with model scale