Researchers have released OpenMobile, an open-source framework that achieves 64.7% accuracy on AndroidWorld by addressing the data gap in mobile agent research. Published on arXiv April 16, 2026, the framework introduces a scalable task synthesis pipeline and policy-switching trajectory rollout that captures error-recovery data missing from standard imitation learning approaches.
Scalable Task Synthesis Pipeline Addresses Data Scarcity
While leading mobile agent models approach 70% on AndroidWorld, their training data and synthesis methods remain closed. OpenMobile's key innovation is a three-step synthesis pipeline:
- Building global environment memory from exploration
- Leveraging memory to generate diverse, grounded instructions
- Ensuring tasks are executable on real Android environments
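The three steps above can be pictured with a toy sketch. This is not OpenMobile's code: the `EnvironmentMemory` class, the function names, the template-based instruction writer, and the "is the app installed" check are all hypothetical stand-ins chosen for illustration. A real pipeline would presumably use a model to write instructions and a live Android environment to verify them.

```python
from dataclasses import dataclass, field


@dataclass
class EnvironmentMemory:
    """Hypothetical global memory: app name -> functions seen while exploring."""
    functions: dict[str, set[str]] = field(default_factory=dict)

    def record(self, app: str, function: str) -> None:
        self.functions.setdefault(app, set()).add(function)


def explore(observations: list[tuple[str, str]]) -> EnvironmentMemory:
    """Step 1: build memory from (app, function) pairs observed during exploration."""
    memory = EnvironmentMemory()
    for app, function in observations:
        memory.record(app, function)
    return memory


def generate_instructions(memory: EnvironmentMemory) -> list[str]:
    """Step 2: ground every instruction in a function that was actually observed.

    A fixed template stands in for whatever instruction writer a real system uses.
    """
    return [
        f"In {app}, {function}"
        for app in sorted(memory.functions)
        for function in sorted(memory.functions[app])
    ]


def filter_executable(instructions: list[str], installed_apps: set[str]) -> list[str]:
    """Step 3: keep only tasks whose target app exists in the environment.

    Stand-in check only; real verification would run against an Android device.
    """
    return [
        text
        for text in instructions
        if any(text.startswith(f"In {app},") for app in installed_apps)
    ]


observations = [
    ("Contacts", "add a new contact"),
    ("Clock", "set an alarm"),
    ("Contacts", "add a new contact"),  # duplicates collapse in memory
    ("Maps", "search for a cafe"),
]
memory = explore(observations)
tasks = filter_executable(generate_instructions(memory), {"Contacts", "Clock"})
print(tasks)
```

Because instructions are generated only from functions stored in memory, every task is grounded in something the exploration phase actually saw, and the final filter drops tasks the target environment cannot run.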
This approach generates training data that covers broad app functionality without requiring proprietary datasets. The researchers conducted transparent overlap analysis between synthetic instructions and benchmark test sets, verifying that performance gains stem from functionality coverage rather than benchmark overfitting.
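The article does not say how the overlap analysis was computed. One common way to run such a check, shown here purely as an assumption, is to compare word n-grams between each synthetic instruction and every benchmark task and flag pairs above a similarity threshold. The function names, the bigram choice, and the 0.5 threshold are illustrative only.

```python
def word_ngrams(text: str, n: int = 2) -> set[tuple[str, ...]]:
    """Lowercased word n-grams of a string."""
    words = text.lower().split()
    return {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}


def jaccard(a: set, b: set) -> float:
    """Jaccard similarity; defined as 0.0 when both sets are empty."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def max_overlap(synthetic: str, benchmark: list[str], n: int = 2) -> float:
    """Highest n-gram similarity between one synthetic task and any benchmark task."""
    grams = word_ngrams(synthetic, n)
    return max((jaccard(grams, word_ngrams(task, n)) for task in benchmark), default=0.0)


def flag_overlaps(
    synthetic: list[str], benchmark: list[str], threshold: float = 0.5, n: int = 2
) -> list[str]:
    """Synthetic instructions that look too close to a benchmark task."""
    return [s for s in synthetic if max_overlap(s, benchmark, n) >= threshold]


benchmark_tasks = ["Set an alarm for 7 am"]
synthetic_tasks = [
    "Set an alarm for 7 am tomorrow",  # near-duplicate of a benchmark task
    "Add a new contact named Alex",    # unrelated
]
print(flag_overlaps(synthetic_tasks, benchmark_tasks))
```

A low flagged fraction would support the claim that gains come from covering app functionality rather than from memorizing test instructions. A lexical check like this one cannot detect paraphrases, so it is only a first pass.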
Policy-Switching Captures Essential Error-Recovery Data
OpenMobile introduces policy-switching trajectory rollout, which alternates between learner and expert models during data collection. When the learner model fails, the expert demonstrates recovery, creating training examples that teach agents how to handle mistakes. Standard imitation learning captures only successful trajectories, so it never shows an agent how to recover from an error.
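The idea can be illustrated with a toy rollout loop. Everything here is a simplifying assumption rather than OpenMobile's implementation: the environment is a number line, "failure" means the learner moved away from the goal, and the expert keeps control for a fixed number of steps (`recovery_steps`) before handing back. The article does not describe how failure is detected or when control returns to the learner.

```python
from typing import Callable

Policy = Callable[[int, int], int]  # (state, goal) -> action in {-1, +1}


def expert_policy(state: int, goal: int) -> int:
    """Always steps toward the goal."""
    return 1 if state < goal else -1


def make_flawed_learner(bad_calls: set[int]) -> Policy:
    """Hypothetical learner that steps the wrong way on the given call indices."""
    calls = {"n": 0}

    def policy(state: int, goal: int) -> int:
        action = expert_policy(state, goal)
        if calls["n"] in bad_calls:
            action = -action
        calls["n"] += 1
        return action

    return policy


def policy_switching_rollout(
    learner: Policy,
    expert: Policy,
    goal: int,
    max_steps: int = 20,
    recovery_steps: int = 2,
) -> list[tuple[int, int, str]]:
    """Roll out with the learner; after a learner mistake, let the expert recover.

    Returns (state, action, actor) tuples so later stages can see who acted.
    """
    state, expert_budget = 0, 0
    trajectory: list[tuple[int, int, str]] = []
    for _ in range(max_steps):
        if state == goal:
            break
        actor = "expert" if expert_budget > 0 else "learner"
        action = (expert if actor == "expert" else learner)(state, goal)
        next_state = state + action
        trajectory.append((state, action, actor))
        if actor == "expert":
            expert_budget -= 1
        elif abs(goal - next_state) > abs(goal - state):
            # Toy failure signal: the learner moved away from the goal.
            expert_budget = recovery_steps
        state = next_state
    return trajectory


for step in policy_switching_rollout(make_flawed_learner({1}), expert_policy, goal=3):
    print(step)
```

The expert steps that follow a learner mistake start from an off-track state, which is exactly the kind of example a dataset of purely successful expert trajectories would never contain.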
Performance results demonstrate the effectiveness of this approach:
- Qwen3-VL fine-tuned on OpenMobile data: 64.7% on AndroidWorld
- Qwen2.5-VL fine-tuned on OpenMobile data: 51.7% on AndroidWorld
- Performance far surpasses existing open-data approaches
- Competitive with leading closed models
Transparent Evaluation Across Multiple Benchmarks
The framework was evaluated on AndroidWorld and two additional dynamic mobile agent benchmarks. Unlike many mobile agent projects, OpenMobile documents its synthesis pipeline and verification methods in full rather than releasing only the final data.
The researchers attribute the shortcomings of existing approaches to three limitations: closed models do not share their synthesis methods, standard imitation learning lacks error-recovery demonstrations, and synthetic tasks often lack grounding in real app functionality. OpenMobile addresses all three.
Complete Release Facilitates Broader Research
The complete framework is available at https://njucckevin.github.io/openmobile/, including the synthesis pipeline, training data, and model checkpoints. By making high-quality mobile agent development accessible without proprietary data access, OpenMobile could accelerate mobile automation research across the community.
The release demonstrates that transparent, open approaches can achieve performance competitive with closed systems while enabling broader participation in advancing mobile agent capabilities.
Key Takeaways
- OpenMobile achieves 64.7% on AndroidWorld using Qwen3-VL, competitive with leading closed models
- Policy-switching trajectory rollout captures error-recovery data by alternating between learner and expert models during data collection
- Transparent overlap analysis verifies performance comes from broad functionality coverage, not benchmark overfitting
- The complete synthesis pipeline and training data are publicly available, addressing the data gap in mobile agent research
- The framework makes high-quality mobile agent development accessible without requiring proprietary datasets