@mkopcins/release0.9.1 #1228

Merged
mkopcins merged 5 commits into release/0.9 from @mkopcins/release0.9.1 on Jun 11, 2026

@mkopcins

Collaborator

Description

Release 0.9.1

Introduces a breaking change?

  • Yes
  • No

Type of change

  • Bug fix (change which fixes an issue)
  • New feature (change which adds functionality)
  • Documentation update (improves or adds clarity to existing documentation)
  • Other (chores, tests, code style improvements etc.)

Tested on

  • iOS
  • Android

Testing instructions

Screenshots

Related issues

Checklist

  • I have performed a self-review of my code
  • I have commented my code, particularly in hard-to-understand areas
  • I have updated the documentation accordingly
  • My changes generate no new warnings

Additional notes

IgorSwat and others added 3 commits June 11, 2026 11:35
Updates the Whisper model URLs to point to the recently uploaded CoreML fp16
models.

The newly uploaded fp16 models are roughly 50% smaller and ~30%
faster than the old fp32 ones.

Introduces a breaking change?

- [ ] Yes
- [x] No

Type of change

- [ ] Bug fix (change which fixes an issue)
- [ ] New feature (change which adds functionality)
- [ ] Documentation update (improves or adds clarity to existing documentation)
- [x] Other (chores, tests, code style improvements etc.)

Tested on

- [x] iOS
- [ ] Android

Checklist

- [x] I have performed a self-review of my code
- [ ] I have commented my code, particularly in hard-to-understand areas
- [ ] I have updated the documentation accordingly
- [x] My changes generate no new warnings
Introduces a breaking change?

- [ ] Yes
- [x] No

Type of change

- [ ] Bug fix (change which fixes an issue)
- [x] New feature (change which adds functionality)
- [x] Documentation update (improves or adds clarity to existing documentation)
- [ ] Other (chores, tests, code style improvements etc.)

Tested on

- [x] iOS
- [x] Android
Testing instructions

Test by running the apps/llm app: use the LLM screen for the text-only
model and the multimodal screen for the audio-vision-text model. The text
model should behave like any other LLM. The multimodal model can process
audio chunks of up to 30 seconds as well as image inputs, and should be
able to transcribe audio, describe pictures, and handle similar tasks.
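
The 30-second limit on audio input suggests that longer recordings need to be split before they reach the multimodal model. A minimal sketch of such a split, assuming mono PCM samples in a `Float32Array` and a 16 kHz sample rate; `splitAudio` and both of these assumptions are hypothetical and not taken from this PR:

```typescript
// Split raw PCM samples into chunks of at most `maxSeconds` seconds.
// Assumption: mono audio, one Float32Array element per sample.
function splitAudio(
  samples: Float32Array,
  sampleRate: number,
  maxSeconds = 30,
): Float32Array[] {
  const maxSamples = Math.floor(sampleRate * maxSeconds);
  if (maxSamples <= 0) {
    throw new Error("sampleRate * maxSeconds must be at least 1 sample");
  }
  const chunks: Float32Array[] = [];
  for (let start = 0; start < samples.length; start += maxSamples) {
    // subarray creates a view on the same buffer (no copy) and clamps the
    // end index to the array length, so the last chunk may be shorter.
    chunks.push(samples.subarray(start, start + maxSamples));
  }
  return chunks;
}

// Usage: 70 seconds of silence at an assumed 16 kHz sample rate.
const chunks = splitAudio(new Float32Array(16000 * 70), 16000);
```

Each chunk would then be passed to the model separately; how the partial results are combined depends on the task and is not covered here.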

Checklist

- [x] I have performed a self-review of my code
- [x] I have commented my code, particularly in hard-to-understand areas
- [x] I have updated the documentation accordingly
- [ ] My changes generate no new warnings

mkopcins force-pushed the @mkopcins/release0.9.1 branch from 728338d to c233809 on June 11, 2026 09:37
…emma 4 E2B (#1223)

Bumps ExecuTorch to 1.3 and adds two GPU backends with Gemma 4 E2B
support:

- **MLX (iOS / Apple GPU)**: new backend, with metadata-driven chunked
prefill. The MLX `forward` is exported with a sliding-window cap on the
sequence dimension, and a one-shot prefill spikes Metal memory, so MLX
models are prefilled in steps of the forward's declared max input length
(read from the method metadata). Non-MLX backends keep the original
one-shot path.
- **Vulkan (Android GPU)**: Gemma 4 E2B now runs on Vulkan. The
prebuilt `libexecutorch.so` (arm64-v8a, x86_64) is rebuilt from the labs
1.3 fork with the Gemma 4 Vulkan support: the `aten.rms_norm` lowering
and the Gemma SDPA shaders, ported onto 1.3's tile-load helper API with
the DHSB Q/K/V layout the Gemma 4 export uses.
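
The chunked prefill described for the MLX backend can be illustrated with a short sketch. This is not the code from this PR: `ForwardFn`, `chunkedPrefill`, and `maxInputLen` are hypothetical names, and the real implementation presumably lives in native code and reads the limit from the method metadata rather than taking it as an argument.

```typescript
// Hypothetical forward signature: consumes a slice of prompt tokens that
// starts at position `startPos` and returns the logits for its last token.
type ForwardFn = (tokens: number[], startPos: number) => number[];

// Prefill the prompt in steps of at most `maxInputLen` tokens instead of in
// one shot. Returns the logits produced by the final chunk.
function chunkedPrefill(
  prompt: number[],
  maxInputLen: number,
  forward: ForwardFn,
): number[] {
  if (!Number.isInteger(maxInputLen) || maxInputLen <= 0) {
    throw new Error("maxInputLen must be a positive integer");
  }
  if (prompt.length === 0) {
    throw new Error("prompt must not be empty");
  }
  let logits: number[] = [];
  for (let pos = 0; pos < prompt.length; pos += maxInputLen) {
    const chunk = prompt.slice(pos, pos + maxInputLen);
    logits = forward(chunk, pos);
  }
  return logits;
}

// Usage with a stub forward that records the calls it receives.
const calls: Array<{ len: number; startPos: number }> = [];
const stubForward: ForwardFn = (tokens, startPos) => {
  calls.push({ len: tokens.length, startPos });
  return [tokens[tokens.length - 1]];
};
const lastLogits = chunkedPrefill(
  [1, 2, 3, 4, 5, 6, 7, 8, 9, 10],
  4,
  stubForward,
);
```

With a 10-token prompt and a limit of 4, the stub is called three times, with chunks of 4, 4, and 2 tokens. A one-shot path corresponds to a single call with the whole prompt.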

`models.llm.gemma4_e2b` is registered with `mlx` / `xnnpack` / `vulkan`
variants and defaults to **MLX on iOS** and **Vulkan on Android**.
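
The per-platform default can be pictured as a small lookup. The sketch below is only an illustration under assumptions: the `ModelVariants` shape, the `resolveVariant` helper, and the placeholder sources are hypothetical and do not reflect the actual registry in `modelUrls.ts`.

```typescript
type Backend = "mlx" | "xnnpack" | "vulkan";
type TargetPlatform = "ios" | "android";

// Hypothetical registry entry: one model source per backend variant plus a
// default backend per platform.
interface ModelVariants {
  variants: Record<Backend, string>;
  defaultFor: Record<TargetPlatform, Backend>;
}

// Placeholder sources, not real model URLs.
const gemma4E2b: ModelVariants = {
  variants: {
    mlx: "<mlx-model-source>",
    xnnpack: "<xnnpack-model-source>",
    vulkan: "<vulkan-model-source>",
  },
  defaultFor: { ios: "mlx", android: "vulkan" },
};

// Pick the platform default unless the caller asks for a specific backend.
function resolveVariant(
  model: ModelVariants,
  platform: TargetPlatform,
  override?: Backend,
): { backend: Backend; source: string } {
  const backend = override ?? model.defaultFor[platform];
  return { backend, source: model.variants[backend] };
}

const iosChoice = resolveVariant(gemma4E2b, "ios");
const androidChoice = resolveVariant(gemma4E2b, "android");
const cpuChoice = resolveVariant(gemma4E2b, "android", "xnnpack");
```

In a React Native app the platform value would typically come from `Platform.OS`; it is passed in explicitly here so the sketch runs without React Native.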

Introduces a breaking change?

- [ ] Yes
- [x] No

Type of change

- [x] Bug fix (change which fixes an issue)
- [x] New feature (change which adds functionality)
- [ ] Documentation update (improves or adds clarity to existing documentation)
- [x] Other (chores, tests, code style improvements etc.)

Tested on

- [x] iOS
- [x] Android

Testing instructions

1. Build and run the LLM example app (`apps/llm`) on a physical device
(Vulkan and MLX need a real GPU, so the simulator and emulator will not work).
2. In the model picker, select **Gemma 4 E2B**.
3. Send a prompt and confirm coherent generation:
   - iOS → runs on the MLX backend.
   - Android → runs on the Vulkan backend.
4. Confirm generation does not stop immediately after prefill and
produces multiple tokens.

Checklist

- [x] I have performed a self-review of my code
- [x] I have commented my code, particularly in hard-to-understand areas
- [ ] I have updated the documentation accordingly
- [x] My changes generate no new warnings

Additional notes

The Vulkan Gemma variant will not work until @mkopcins's PR is merged.

---------

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
mkopcins marked this pull request as ready for review on June 11, 2026 14:33
mkopcins requested a review from msluszniak on June 11, 2026 14:34
barhanc self-requested a review on June 11, 2026 14:41
@msluszniak

Member

I won't be able to review changes till 7pm CET

Comment thread packages/react-native-executorch/src/constants/modelUrls.ts
Comment thread packages/react-native-executorch/src/constants/modelUrls.ts
mkopcins merged commit bca2a54 into release/0.9 on Jun 11, 2026
2 checks passed
mkopcins deleted the @mkopcins/release0.9.1 branch on June 11, 2026 15:02
5 participants