SimpleNews.ai

UC Berkeley Researchers Teach Robots to Peel Vegetables With 90% Success Rate

Wednesday, March 4, 2026

Researchers at UC Berkeley have developed a two-stage learning framework that enables robots to peel vegetables with over 90% success rates, pairing imitation learning from just 50-200 demonstration trajectories with preference-based finetuning. The approach combines force-aware imitation learning with human preference feedback to master contact-rich manipulation tasks where success criteria are subjective rather than binary.

Two-Stage Framework Addresses Implicit Success Criteria in Manipulation

Many essential manipulation tasks—including food preparation, surgery, and craftsmanship—remain challenging for robots because success is continuous and context-dependent rather than easily quantifiable. The Berkeley team's framework addresses this through complementary stages:

Stage 1: Robust initial policy

  • Force-aware data collection captures contact dynamics during peeling
  • Imitation learning from 50-200 trajectories establishes baseline competence
  • Enables generalization across object variations within and across categories
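The Stage 1 recipe above can be sketched as plain behavior cloning over force-augmented observations. The sketch below is illustrative only: synthetic data, a linear policy, and invented dimensions stand in for the Berkeley team's actual architecture and sensors.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative dimensions (assumptions, not from the paper):
STATE_DIM, FORCE_DIM, ACTION_DIM = 6, 3, 4
TRUE_W = rng.normal(size=(STATE_DIM + FORCE_DIM, ACTION_DIM))  # synthetic "expert"

def collect_demo(n_steps=50):
    """Stand-in for one teleoperated peeling trajectory: proprioceptive
    state concatenated with force/torque sensor readings."""
    states = rng.normal(size=(n_steps, STATE_DIM))
    forces = rng.normal(size=(n_steps, FORCE_DIM))
    obs = np.hstack([states, forces])   # force-aware observation
    actions = obs @ TRUE_W              # synthetic expert action labels
    return obs, actions

# Aggregate 50 demonstrations, mirroring the 50-200 trajectory budget.
demos = [collect_demo() for _ in range(50)]
X = np.vstack([obs for obs, _ in demos])
Y = np.vstack([act for _, act in demos])

# Behavior cloning: least-squares fit of a linear policy pi(obs) = obs @ W.
W, *_ = np.linalg.lstsq(X, Y, rcond=None)

def policy(obs):
    return obs @ W
```

The point of the sketch is the observation design: contact forces enter the policy input directly, so the cloned behavior can react to contact dynamics rather than position alone.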

Stage 2: Preference-based refinement

  • Learned reward model combines quantitative metrics with qualitative human feedback
  • Aligns policy behavior with human notions of task quality
  • Improves performance by up to 40% over baseline imitation learning

This methodology bridges techniques from language model alignment (RLHF) with physical skill learning, demonstrating that preference-based approaches transfer effectively to robotic manipulation.
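One simple way to close the loop from a learned reward back to the policy is reward-weighted regression: a generic stand-in here for whatever update rule the paper actually uses, shown with synthetic rollout data.

```python
import numpy as np

rng = np.random.default_rng(2)

# Synthetic rollouts from the Stage 1 policy (placeholder shapes).
obs = rng.normal(size=(500, 9))        # force-aware observations
actions = rng.normal(size=(500, 4))    # actions taken during rollouts

# Per-step scores from the learned preference reward model (synthetic here).
rewards = rng.normal(size=500)

# Reward-weighted regression: exponentiated rewards become sample weights,
# then the policy is re-fit by weighted least squares, pulling it toward
# the actions the reward model scores highly.
beta = 1.0                                          # temperature (assumption)
weights = np.exp(beta * (rewards - rewards.max()))  # max-shift for stability
sw = np.sqrt(weights)[:, None]
W_refined, *_ = np.linalg.lstsq(sw * obs, sw * actions, rcond=None)

def refined_policy(o):
    return o @ W_refined
```

This mirrors the RLHF pattern at a glance: demonstrations bootstrap a competent policy, preferences train a reward, and the reward reshapes which behaviors the policy reproduces.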

System Achieves Strong Generalization Across Produce Categories

The framework demonstrates robust performance across challenging test cases:

  • Over 90% success rates on cucumbers, apples, and potatoes
  • Up to 40% performance improvement through preference-based finetuning over imitation learning alone
  • Zero-shot generalization: Policies trained on one produce category maintain 90%+ success on unseen instances of that category
  • Cross-category transfer: Strong performance on out-of-distribution produce from different categories

The research team includes Toru Lin, Shuying Deng, Zhao-Heng Yin, Pieter Abbeel, and Jitendra Malik from UC Berkeley.

Implications for Subjective Skill Learning in Robotics

The work demonstrates that robots can learn contact-rich skills with implicit success criteria from minimal demonstration data when the learning framework properly separates robust execution from quality refinement. The force-sensitive manipulation component handles the physical dynamics of peeling, while preference-based alignment captures the subjective aspects of task quality.

This approach could extend to other domains where "good enough" varies by context—surgical assistance, craftsmanship, personal care—where task quality exists on a continuum rather than as a binary outcome. The combination of force-aware control with preference learning provides a template for teaching robots skills that require both physical competence and aesthetic judgment.

Key Takeaways

  • UC Berkeley's framework achieves over 90% success rates on vegetable peeling using just 50-200 demonstration trajectories
  • Two-stage approach combines force-aware imitation learning with preference-based finetuning for continuous quality improvement
  • Preference-based refinement improves performance by up to 40% over baseline imitation learning policies
  • System demonstrates strong zero-shot generalization within and across produce categories
  • Methodology bridges RLHF techniques from language models to physical manipulation tasks with subjective success criteria