Qwen3-ASR 0.6B

Multilingual speech recognition across 52 languages & dialects — the fast, lightweight Qwen3-ASR

Multilingual · Apache-2.0

Desktop app

Open the Models screen and click install.

CLI

$ openasr pull qwen3-asr-0.6b:q8

Overview

Qwen3-ASR-0.6B is the compact, efficiency-optimized member of Alibaba's Qwen3-ASR family, built on the Qwen3-Omni audio-understanding foundation. It performs language identification and speech recognition across 30 languages and 22 Chinese dialects (52 in total), and stays robust across audio types: clean speech, singing voice, and songs with background music. A single unified checkpoint handles both offline and real-time streaming transcription and can process long audio. The 0.6B size targets a strong accuracy-vs-efficiency trade-off (the Qwen team reports up to ~2000× throughput at high concurrency), which makes it the family's go-to for lightweight, high-throughput deployments.

This OpenASR repo repackages the original weights as .oasr packs that run natively in the OpenASR runtime, with no Python at inference time. The q8_0 build is the recommended default (near-reference accuracy at roughly half the fp16 footprint); q4_k suits tight-memory devices, and fp16 is for verification or maximum fidelity. For word-level timestamps, pair it upstream with Qwen3-ForcedAligner-0.6B.

Highlights

🌍 52 languages & dialects — 30 languages plus 22 Chinese dialects, with built-in spoken-language identification
🎧 Robust on hard audio — clean speech, singing voice, and songs over background music
⚡ Fast & light — the efficiency-tuned member of the Qwen3-ASR family; one model for both offline and streaming
🦀 Native in OpenASR — .oasr packs run with no Python at inference, engineered for peak performance on CPU & GPU

Pull string	Size	Quant	JFK ΔWER
`qwen3-asr-0.6b:fp16`	1.8 GB	fp16	0%
`qwen3-asr-0.6b:q8` (default)	960.2 MB	q8_0	0%
`qwen3-asr-0.6b:q4`	571 MB	q4_k	0%
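
The choice between the three builds can be sketched as a small lookup. This is only an illustration: `pick_pull_string` and its `budget_mb` parameter are hypothetical names, the sizes are copied from the table above (with 1.8 GB taken as 1800 MB), and comparing the budget against the pack size is an assumption, since this page does not state how much memory each build needs at runtime.

```python
# Pack sizes in MB, copied from the table above (1.8 GB taken as 1800 MB).
PACK_SIZES_MB = {
    "qwen3-asr-0.6b:fp16": 1800.0,
    "qwen3-asr-0.6b:q8": 960.2,
    "qwen3-asr-0.6b:q4": 571.0,
}

# q8 is the documented default; q4 is the fallback for tight-memory devices.
PREFERENCE = ["qwen3-asr-0.6b:q8", "qwen3-asr-0.6b:q4"]


def pick_pull_string(budget_mb, want_max_fidelity=False):
    """Return a pull string whose pack fits in budget_mb, or None if none fits.

    Assumption: pack size is used as a rough proxy for the space required.
    """
    order = (["qwen3-asr-0.6b:fp16"] if want_max_fidelity else []) + PREFERENCE
    for pull_string in order:
        if PACK_SIZES_MB[pull_string] <= budget_mb:
            return pull_string
    return None
```

The fp16 build is only considered when explicitly requested, which mirrors the guidance that it is meant for verification or maximum fidelity rather than everyday use.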

Usage

These are CLI / local-server examples. The desktop app runs this model without typing a command — see the desktop install path above.

bash · transcribe a file

$ openasr pull qwen3-asr-0.6b:q8
↓ qwen3-asr-0.6b.oasr  960.2 MB  ✓ verified sha256
$ openasr transcribe meeting.wav --backend native --model-pack ~/.openasr/models/qwen3-asr-0.6b/q8_0/qwen3-asr-0.6b-q8_0.oasr
✓ local transcript · 0 bytes sent
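
For scripted use, the same command can be driven from Python. The sketch below is a thin, hypothetical wrapper (`build_transcribe_command` and `transcribe` are illustrative names, not part of OpenASR). It uses only the flags and the q8_0 pack path shown above, and it assumes the CLI writes its result to stdout, which this page does not confirm.

```python
import subprocess
from pathlib import Path

# Pack location copied from the example above; other builds may live elsewhere.
DEFAULT_PACK = Path.home() / ".openasr/models/qwen3-asr-0.6b/q8_0/qwen3-asr-0.6b-q8_0.oasr"


def build_transcribe_command(audio_path, model_pack=DEFAULT_PACK):
    """Return the argv list for `openasr transcribe`, using only the flags shown above."""
    return [
        "openasr", "transcribe", str(audio_path),
        "--backend", "native",
        "--model-pack", str(model_pack),
    ]


def transcribe(audio_path, model_pack=DEFAULT_PACK):
    """Run the CLI and return whatever it prints to stdout (output format not assumed)."""
    result = subprocess.run(
        build_transcribe_command(audio_path, model_pack),
        capture_output=True, text=True, check=True,  # raise if the CLI exits non-zero
    )
    return result.stdout
```

Keeping the argv construction separate from the `subprocess.run` call makes the command easy to inspect or log before anything is executed.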

bash · serve a local API

$ openasr serve --backend native --model-pack ~/.openasr/models/qwen3-asr-0.6b/q8_0/qwen3-asr-0.6b-q8_0.oasr --addr 127.0.0.1:8080
▶ http://127.0.0.1:8080 · model=qwen3-asr-0.6b · 0 bytes will leave this host

python · client.py

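from openai import OpenAI

# Point the OpenAI SDK at the local OpenASR server started above.
client = OpenAI(base_url="http://127.0.0.1:8080/v1", api_key="local")
with open("meeting.wav", "rb") as audio:  # closed automatically after the request
    transcript = client.audio.transcriptions.create(model="qwen3-asr-0.6b", file=audio)
print(transcript.text)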
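
The single-file call above extends naturally to a folder of recordings. The following is a minimal sketch under stated assumptions: `transcribe_folder` is a hypothetical helper, the `*.wav` pattern is only an example, and the client is passed in so the function works with any object exposing the same `audio.transcriptions.create` call as the OpenAI SDK.

```python
from pathlib import Path


def transcribe_folder(client, folder, model="qwen3-asr-0.6b", pattern="*.wav"):
    """Transcribe every file in `folder` matching `pattern`; return {filename: text}."""
    results = {}
    # Sort so the output order is stable across runs and platforms.
    for path in sorted(Path(folder).glob(pattern)):
        with path.open("rb") as audio:
            response = client.audio.transcriptions.create(model=model, file=audio)
        results[path.name] = response.text
    return results
```

With the `client` from client.py, `transcribe_folder(client, "recordings")` would send each file to the local server in turn; requests are sequential here, so this sketch does not exercise the server's concurrency.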
