Moonshine Tiny

Tiny 27M-parameter English ASR built for real-time, on-device transcription

English · MIT

Desktop app

Open the Models screen and click Install.

CLI

$ openasr pull moonshine-tiny:q8

Overview

Moonshine Tiny is the smallest model in Useful Sensors' Moonshine family: a 27M-parameter, sequence-to-sequence English speech-recognition model designed for real-time, on-device transcription on hardware with severely constrained memory and compute. It was trained on 200,000 hours of audio and, per the Moonshine paper, is more accurate on standard benchmarks than existing ASR systems of comparable size. It targets developers building live transcription and voice-command experiences on low-cost devices.

Like other autoregressive ASR models, it can occasionally hallucinate or repeat text on very short or clipped segments, so evaluate it on in-domain audio before deployment.

This OpenASR repo repackages the original weights as .oasr packs that run natively in the OpenASR runtime, with no Python at inference time. The q8_0 build is the recommended default: near-reference accuracy at roughly a third of the fp16 footprint (32.8 MB versus 103.8 MB). The fp16 build is for verification or maximum fidelity.
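The in-domain evaluation recommended above can start from a plain word error rate (WER) check against transcripts you trust. The following is a minimal sketch, assuming you already have reference transcripts for your own recordings. The `word_error_rate` helper and the sample strings are illustrative and are not part of OpenASR or Moonshine; real evaluations usually also normalize punctuation and numerals before scoring.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference transcript is empty")

    # prev[j] is the edit distance between the reference words seen so far
    # (one fewer than the current row) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # deletion: reference word missing from hypothesis
                curr[j - 1] + 1,     # insertion: extra word in hypothesis
                prev[j - 1] + cost,  # substitution or exact match
            ))
        prev = curr
    return prev[-1] / len(ref)


# Illustrative strings only; replace with your own reference and model output.
reference = "turn on the kitchen lights"
hypothesis = "turn on the kitchen light"
print(f"WER: {word_error_rate(reference, hypothesis):.1%}")  # WER: 20.0%
```

Running the same helper over the fp16 and q8_0 transcripts of the same audio gives a simple way to compare the two builds on your own data.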

Highlights

🪶 Just 27M parameters — the smallest Moonshine, sized for memory- and compute-constrained edge hardware
⚡ Real-time on-device — engineered by Useful Sensors for live transcription and voice commands on low-cost devices
🎯 Accurate for its size — beats similarly-sized ASR systems on standard English benchmarks (per the Moonshine paper)
🗣️ English speech-to-text — sequence-to-sequence ASR trained on 200K hours of audio
🦀 Native in OpenASR — .oasr packs run with no Python at inference, engineered for peak performance on CPU & GPU

Pull string	Size	Quant	JFK ΔWER
`moonshine-tiny:fp16`	103.8 MB	fp16	0%
`moonshine-tiny:q8` (default)	32.8 MB	q8_0	0%
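
As a quick check of the footprint figures in the table, the arithmetic can be sketched as follows. The sizes are copied from the table; the variable names are illustrative.

```python
# Pack sizes in MB, taken from the table above.
sizes_mb = {"fp16": 103.8, "q8_0": 32.8}

ratio = sizes_mb["q8_0"] / sizes_mb["fp16"]
saved_mb = sizes_mb["fp16"] - sizes_mb["q8_0"]

print(f"q8_0 is {ratio:.1%} of fp16, saving {saved_mb:.1f} MB")
# q8_0 is 31.6% of fp16, saving 71.0 MB
```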

Usage

These are CLI / local-server examples. The desktop app runs this model without typing a command — see the desktop install path above.

bash · transcribe a file

$ openasr pull moonshine-tiny:q8
↓ moonshine-tiny.oasr  32.8 MB  ✓ verified sha256
$ openasr transcribe meeting.wav --backend native --model-pack ~/.openasr/models/moonshine-tiny/q8_0/moonshine-tiny-q8_0.oasr
✓ local transcript · 0 bytes sent

bash · serve a local API

$ openasr serve --backend native --model-pack ~/.openasr/models/moonshine-tiny/q8_0/moonshine-tiny-q8_0.oasr --addr 127.0.0.1:8080
▶ http://127.0.0.1:8080 · model=moonshine-tiny · 0 bytes will leave this host

python · client.py

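from openai import OpenAI

# Point the OpenAI client at the local OpenASR server; the API key is a placeholder.
client = OpenAI(base_url="http://127.0.0.1:8080/v1", api_key="local")
with open("meeting.wav", "rb") as audio:
    result = client.audio.transcriptions.create(model="moonshine-tiny", file=audio)
print(result.text)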
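
Because the overview notes that very short or clipped segments can trigger hallucinated or repeated text, one option is to check a clip's duration before sending it to the server. The following is a minimal sketch using only Python's standard `wave` module. The helper names and the 1.0 second threshold are assumptions for illustration, not documented Moonshine or OpenASR limits; tune the threshold against your own audio.

```python
import wave

# Assumed cutoff for "too short"; not a documented model limit.
MIN_SECONDS = 1.0


def wav_duration_seconds(path: str) -> float:
    """Return the duration of a PCM WAV file in seconds."""
    with wave.open(path, "rb") as wav:
        return wav.getnframes() / wav.getframerate()


def is_long_enough(path: str, min_seconds: float = MIN_SECONDS) -> bool:
    """True if the clip is at least min_seconds long."""
    return wav_duration_seconds(path) >= min_seconds
```

A caller could skip or flag clips for which `is_long_enough("meeting.wav")` returns `False` and transcribe the rest with the client above.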
