Gemma 4 audio with MLX
Thanks to a tip from Rahim Nathwani, here's a uv run recipe for transcribing an audio file on macOS using the 10.28 GB Gemma 4 E2B model with MLX and mlx-vlm:

    uv run --python 3.13 --with mlx_vlm --with torchvision --with gradio \
      mlx_vlm.generate \
      --model google/gemma-4-e2b-it \
      --audio file.wav \
      --prompt "Transcribe this audio" \
      --max-tokens 500 \
      --temperature 1.0

I tried it on this 14 second .wav file and it output the following:

This front…
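To reuse the recipe across several files, it can help to build the command from a script. Here's a minimal sketch in Python, assuming only the flags shown in the recipe. The function names `build_transcribe_command` and `transcribe` are hypothetical helpers, not part of mlx-vlm, and the sketch assumes `uv` is on your `PATH`.

```python
import shlex
import subprocess
from pathlib import Path


def build_transcribe_command(audio_path, model="google/gemma-4-e2b-it",
                             prompt="Transcribe this audio",
                             max_tokens=500, temperature=1.0):
    """Return the argv list for the uv recipe, using only flags from the post."""
    return [
        "uv", "run", "--python", "3.13",
        "--with", "mlx_vlm", "--with", "torchvision", "--with", "gradio",
        "mlx_vlm.generate",
        "--model", model,
        "--audio", str(audio_path),
        "--prompt", prompt,
        "--max-tokens", str(max_tokens),
        "--temperature", str(temperature),
    ]


def transcribe(audio_path):
    """Run the recipe and return its raw stdout.

    Hypothetical helper: needs macOS, uv, and the model download, and the
    output may contain more than just the transcript text.
    """
    result = subprocess.run(build_transcribe_command(audio_path),
                            capture_output=True, text=True, check=True)
    return result.stdout


if __name__ == "__main__":
    # Print the command rather than running it, so this is safe anywhere.
    print(shlex.join(build_transcribe_command(Path("file.wav"))))
```

Passing the arguments as a list avoids shell quoting problems with the prompt, and `shlex.join` gives you a copy-pasteable version of the same command for checking by eye.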
Support Simon Willison's Weblog by viewing the original resource