PrismML Ships Bonsai Image 4B: First Image Generation Model Running on iPhone

PrismML has released Bonsai Image 4B, a family of compact image-generation models and the first in its parameter class capable of running directly on an iPhone. The models were released on May 26, 2026 under the Apache 2.0 open-source license. Relative to the baseline FLUX.2 Klein 4B model, the 1-bit variant is 8.3x smaller, while the ternary variant is 6.4x smaller and retains 95% of the baseline's accuracy.

Two Variants Enable Different Performance Tradeoffs

Bonsai Image 4B comes in two quantized variants optimized for edge deployment:

1-bit variant: Uses binary {−1, +1} transformer weights with FP16 group-wise scaling, achieving a 0.93 GB footprint (8.3x reduction from the 7.75 GB baseline)
Ternary variant: Uses {−1, 0, +1} transformer weights, resulting in a 1.21 GB footprint (6.4x reduction)
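To make the two weight formats concrete, here is a minimal Python sketch of group-wise quantization. It illustrates the general idea only and is not PrismML's method: the group size of 4, the mean-absolute-value scale, the ternary threshold of half the scale, the use of a per-group scale for the ternary case, and all function names are assumptions made for this example. Scales are held as ordinary Python floats rather than FP16.

```python
from typing import Iterator, List, Tuple

GROUP_SIZE = 4  # hypothetical; the article does not state the real group size


def _groups(weights: List[float], group_size: int) -> Iterator[List[float]]:
    """Yield consecutive slices of `weights`, each `group_size` long (last may be shorter)."""
    for start in range(0, len(weights), group_size):
        yield weights[start:start + group_size]


def _scale(group: List[float]) -> float:
    """Assumed scale rule: mean absolute value of the group."""
    return sum(abs(w) for w in group) / len(group)


def quantize_binary(weights: List[float],
                    group_size: int = GROUP_SIZE) -> Tuple[List[int], List[float]]:
    """Map every weight to -1 or +1 and keep one scale per group."""
    codes: List[int] = []
    scales: List[float] = []
    for group in _groups(weights, group_size):
        scales.append(_scale(group))
        codes.extend(1 if w >= 0 else -1 for w in group)
    return codes, scales


def quantize_ternary(weights: List[float],
                     group_size: int = GROUP_SIZE) -> Tuple[List[int], List[float]]:
    """Map every weight to -1, 0 or +1; small weights (assumed threshold) become 0."""
    codes: List[int] = []
    scales: List[float] = []
    for group in _groups(weights, group_size):
        scale = _scale(group)
        scales.append(scale)
        threshold = 0.5 * scale  # hypothetical cutoff for rounding to zero
        for w in group:
            if abs(w) <= threshold:
                codes.append(0)
            else:
                codes.append(1 if w > 0 else -1)
    return codes, scales


def dequantize(codes: List[int], scales: List[float],
               group_size: int = GROUP_SIZE) -> List[float]:
    """Reconstruct approximate weights as code * scale of the code's group."""
    return [code * scales[i // group_size] for i, code in enumerate(codes)]


weights = [0.75, -0.5, 0.125, -0.625, 0.25, 0.0, -1.0, 0.75]
print(quantize_binary(weights))   # codes are only -1 or +1
print(quantize_ternary(weights))  # codes may also be 0
```

In this sketch the ternary format spends extra storage on a third state so that small weights can be zeroed instead of being forced to plus or minus the group scale, which is one plausible reading of why the ternary variant is larger but more accurate.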

The full deployment payloads on Apple Silicon are 3.42 GB for the 1-bit model and 3.88 GB for the ternary model, compared to 15.97 GB for the original FLUX.2 Klein 4B.
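The reduction factors quoted above can be checked from the sizes given in the article. The short Python sketch below does that arithmetic; the helper name `reduction_factor` is invented for this example, and only the gigabyte figures come from the article.

```python
def reduction_factor(baseline_gb: float, compressed_gb: float) -> float:
    """How many times smaller the compressed size is than the baseline."""
    return baseline_gb / compressed_gb


# Transformer weight footprints reported in the article (GB).
BASELINE_TRANSFORMER_GB = 7.75
ONE_BIT_TRANSFORMER_GB = 0.93
TERNARY_TRANSFORMER_GB = 1.21

# Full deployment payloads on Apple Silicon reported in the article (GB).
BASELINE_PAYLOAD_GB = 15.97
ONE_BIT_PAYLOAD_GB = 3.42
TERNARY_PAYLOAD_GB = 3.88

transformer_1bit = round(reduction_factor(BASELINE_TRANSFORMER_GB, ONE_BIT_TRANSFORMER_GB), 1)
transformer_ternary = round(reduction_factor(BASELINE_TRANSFORMER_GB, TERNARY_TRANSFORMER_GB), 1)
payload_1bit = round(reduction_factor(BASELINE_PAYLOAD_GB, ONE_BIT_PAYLOAD_GB), 1)
payload_ternary = round(reduction_factor(BASELINE_PAYLOAD_GB, TERNARY_PAYLOAD_GB), 1)

print(transformer_1bit, transformer_ternary)  # 8.3 6.4
print(payload_1bit, payload_ternary)          # 4.7 4.1
```

The full payload shrinks by a smaller factor (roughly 4.7x and 4.1x) than the transformer weights alone. The article does not break the payload down, so which other components account for the remainder is not stated.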

Performance Metrics Show Practical Mobile Inference

The models deliver practical generation speeds on consumer hardware:

iPhone 17 Pro Max: Generates 512x512 images in 9.4 seconds
Mac M4 Pro: Approximately 6 seconds per image (5.6x faster than the stock FLUX pipeline)

Memory usage during generation is dramatically reduced. For 512x512 images, the 1-bit variant uses 1.5 GB compared to 11.74 GB for the original model. For 1024x1024 images, memory requirements drop to 1.95 GB from 14.39 GB.
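As a rough illustration of what those memory figures mean, the Python sketch below computes the share of generation memory saved by the 1-bit variant at each resolution. The helper name `percent_saved` is invented for this example; the gigabyte values are the ones reported in the article.

```python
def percent_saved(baseline_gb: float, reduced_gb: float) -> float:
    """Percentage of the baseline memory that is no longer needed."""
    return (1.0 - reduced_gb / baseline_gb) * 100.0


# (resolution, original model GB, 1-bit variant GB) as reported in the article.
MEMORY_FIGURES = [
    ("512x512", 11.74, 1.5),
    ("1024x1024", 14.39, 1.95),
]

savings = {
    resolution: round(percent_saved(baseline, reduced), 1)
    for resolution, baseline, reduced in MEMORY_FIGURES
}

for resolution, saved in savings.items():
    print(f"{resolution}: {saved}% less memory than the original model")
```

By this arithmetic the 1-bit variant needs about 87% less memory at 512x512 and about 86% less at 1024x1024.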

Benchmark Results Demonstrate Minimal Accuracy Loss

On standard evaluation benchmarks (GenEval, HPSv3, DPG-Bench), the ternary model retains 95% of FLUX.2 Klein 4B's accuracy, while the 1-bit variant retains 88%. The ternary model also substantially outperforms established models including SDXL, Stable Diffusion 1.5, and PixArt-Σ XL 2 at a similar or smaller footprint.

Open Release Includes Multiple Distribution Channels

PrismML, founded by Caltech researchers with backing from Khosla Ventures, Cerberus, and Google, has made the models available through multiple channels:

Hugging Face model repository with open weights and code
Bonsai Studio iOS app for mobile deployment
WebGPU demo for browser-based testing
GitHub source code repository

According to PrismML's announcement, Bonsai Image 4B represents "a new deployment regime for image generation: capable outputs, open weights, and practical local inference."

Key Takeaways

Bonsai Image 4B is the first 4B-parameter image generation model capable of running directly on iPhone hardware
The ternary variant achieves a 6.4x size reduction while retaining 95% of the FLUX.2 Klein 4B baseline's accuracy
iPhone 17 Pro Max generates 512x512 images in 9.4 seconds; the 1-bit variant uses 1.5 GB of memory at 512x512 and 1.95 GB at 1024x1024
Released under Apache 2.0 license with open weights available on Hugging Face
Full deployment payload is 3.42-3.88 GB compared to 15.97 GB for the original model
