Tiny 27M-parameter English ASR built for real-time, on-device transcription
Moonshine Tiny is the smallest model in Useful Sensors' Moonshine family — a 27M-parameter,
sequence-to-sequence English speech-recognition model designed for real-time, on-device
transcription on hardware that is severely constrained in memory and compute. Trained on 200,000
hours of audio, it transcribes English speech to text and, despite its size, is reported to be more
accurate than existing ASR systems of comparable scale on standard benchmarks. It targets developers building
live transcription and voice-command experiences on low-cost devices. Like other autoregressive ASR
models it can occasionally hallucinate or repeat on very short or clipped segments, so robust
in-domain evaluation is recommended before deployment. This OpenASR repo repackages the original
weights as .oasr packs that run natively in the OpenASR runtime — no Python at inference time. The
q8_0 build is the recommended default (near-reference accuracy at roughly a third of the
footprint); fp16 is for verification or maximum fidelity.
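For the in-domain evaluation recommended above, word error rate (WER) is the usual metric. The following is a minimal sketch, not part of OpenASR: the function names `word_errors` and `corpus_wer` and the sample sentence pairs are hypothetical, and the text normalization (lowercasing and whitespace splitting only) is a simplifying assumption. Real evaluations typically also normalize punctuation and numerals.

```python
def word_errors(reference: str, hypothesis: str) -> tuple[int, int]:
    """Return (word-level edit distance, number of reference words)."""
    # Assumption: lowercase + whitespace split is enough normalization here.
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    # prev[j] is the edit distance between the reference words seen so far
    # and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            curr[j] = min(
                prev[j - 1] + (ref_word != hyp_word),  # match or substitution
                prev[j] + 1,                           # deletion
                curr[j - 1] + 1,                       # insertion
            )
        prev = curr
    return prev[-1], len(ref)


def corpus_wer(pairs: list[tuple[str, str]]) -> float:
    """Total word errors divided by total reference words across all pairs."""
    errors = total = 0
    for reference, hypothesis in pairs:
        e, n = word_errors(reference, hypothesis)
        errors += e
        total += n
    if total == 0:
        raise ValueError("references must contain at least one word")
    return errors / total


# Hypothetical in-domain pairs: (reference transcript, model output).
pairs = [
    ("turn on the kitchen lights", "turn on the kitchen light"),
    ("set a timer for ten minutes", "set a timer for ten minutes"),
]
print(f"WER: {corpus_wer(pairs):.3f}")
```

Summing errors over the whole set before dividing keeps short utterances from dominating the score, which matters for voice-command data where many references are only a few words long.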
.oasr packs run with no Python at inference, engineered for peak performance on CPU & GPU.
These are CLI / local-server examples. The desktop app runs this model without typing a command; see the desktop install path above.
$ openasr pull moonshine-tiny:q8
↓ moonshine-tiny.oasr 32.8 MB
✓ verified sha256
$ openasr transcribe meeting.wav --backend native --model-pack ~/.openasr/models/moonshine-tiny/q8_0/moonshine-tiny-q8_0.oasr
✓ local transcript · 0 bytes sent
$ openasr serve --backend native --model-pack ~/.openasr/models/moonshine-tiny/q8_0/moonshine-tiny-q8_0.oasr --addr 127.0.0.1:8080
▶ http://127.0.0.1:8080 · model=moonshine-tiny · 0 bytes will leave this host
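from openai import OpenAI

# Point the OpenAI client at the local OpenASR server started above.
client = OpenAI(base_url="http://127.0.0.1:8080/v1", api_key="local")

# Open the audio in a context manager so the file handle is always closed.
with open("meeting.wav", "rb") as audio:
    transcript = client.audio.transcriptions.create(model="moonshine-tiny", file=audio)

# The call returns a transcription object; the recognized text is in its .text field.
print(transcript.text)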
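Because the model can occasionally repeat itself on very short or clipped segments, a caller may want to screen the returned text before acting on it. The sketch below is one simple heuristic, not an OpenASR feature: the helper name `has_repeated_ngram` and its thresholds (three-word phrases, more than two occurrences) are assumptions chosen for illustration and should be tuned on in-domain data.

```python
from collections import Counter


def has_repeated_ngram(text: str, n: int = 3, max_repeats: int = 2) -> bool:
    """Return True if any n-word phrase occurs more than max_repeats times."""
    words = text.lower().split()
    # Slide a window of n words across the text; texts shorter than n words
    # produce no windows and are never flagged.
    ngrams = [tuple(words[i:i + n]) for i in range(len(words) - n + 1)]
    counts = Counter(ngrams)
    return any(count > max_repeats for count in counts.values())


# Hypothetical transcripts, standing in for text returned by the local server.
looping = "thank you for watching thank you for watching thank you for watching"
normal = "please turn on the lights in the kitchen"

for candidate in (looping, normal):
    status = "suspect" if has_repeated_ngram(candidate) else "ok"
    print(f"{status}: {candidate}")
```

A flagged transcript is only a hint that the segment deserves a second look, for example by re-transcribing it with more surrounding audio; legitimate speech can also contain repeated phrases.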