Qwen3-ASR 0.6B

Multilingual speech recognition across 52 languages & dialects — the fast, lightweight Qwen3-ASR

Multilingual · Apache-2.0
Desktop app: Open the Models screen and click Install.
CLI
$ openasr pull qwen3-asr-0.6b:q8
Download .oasr

Overview

Qwen3-ASR-0.6B is the compact, efficiency-optimized member of Alibaba's Qwen3-ASR family, built on the Qwen3-Omni audio-understanding foundation. It performs language identification and speech recognition across 30 languages and 22 Chinese dialects (52 in total), and stays robust across clean speech, singing voice, and songs with background music. A single unified checkpoint handles both offline and real-time streaming transcription and can process long audio. The 0.6B size targets a strong accuracy-vs-efficiency trade-off (the Qwen team reports up to ~2000× throughput at high concurrency), making it the family's go-to for lightweight, high-throughput deployments.

This OpenASR repo repackages the original weights as .oasr packs that run natively in the OpenASR runtime, with no Python at inference time. The q8_0 build is the recommended default (near-reference accuracy at roughly half the fp16 footprint); q4_k suits tight-memory devices, and fp16 is for verification or maximum fidelity. For word-level timestamps, pair it upstream with Qwen3-ForcedAligner-0.6B.

Highlights

  • 🌍 52 languages & dialects — 30 languages plus 22 Chinese dialects, with built-in spoken-language identification
  • 🎧 Robust on hard audio — clean speech, singing voice, and songs over background music
  • Fast & light — the efficiency-tuned member of the Qwen3-ASR family; one model for both offline and streaming
  • 🦀 Native in OpenASR: .oasr packs run with no Python at inference, engineered for peak performance on CPU & GPU

Tags

Pull string                    Size      Quant  JFK ΔWER
qwen3-asr-0.6b:fp16            1.8 GB    fp16   0%
qwen3-asr-0.6b:q8 (default)    960.2 MB  q8_0   0%
qwen3-asr-0.6b:q4              571 MB    q4_k   0%
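
The trade-off between the three tags can be sketched as a small selection helper. This is an illustration only: `pick_tag` and `PACK_SIZES_MB` are hypothetical names, not part of OpenASR. The sizes are copied from the table above (1.8 GB taken as 1800 MB), and pack size on disk is used as a rough proxy for the memory a build needs, which will be higher in practice.

```python
# Pack sizes in MB, copied from the Tags table (1.8 GB taken as 1800 MB).
PACK_SIZES_MB = {
    "qwen3-asr-0.6b:fp16": 1800.0,
    "qwen3-asr-0.6b:q8": 960.2,
    "qwen3-asr-0.6b:q4": 571.0,
}


def pick_tag(budget_mb, max_fidelity=False):
    """Return the preferred tag whose pack fits budget_mb, or None.

    q8 is tried first because it is the recommended default; q4 is the
    fallback for tight budgets. fp16 is considered only when the caller
    explicitly asks for maximum fidelity.
    """
    order = ["fp16", "q8", "q4"] if max_fidelity else ["q8", "q4"]
    for quant in order:
        tag = f"qwen3-asr-0.6b:{quant}"
        if PACK_SIZES_MB[tag] <= budget_mb:
            return tag
    return None


print(pick_tag(4000))                     # qwen3-asr-0.6b:q8
print(pick_tag(800))                      # qwen3-asr-0.6b:q4
print(pick_tag(4000, max_fidelity=True))  # qwen3-asr-0.6b:fp16
```

The returned string can be passed directly to `openasr pull`.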

Usage

These are CLI / local-server examples. The desktop app runs this model without any typed commands; see the desktop install path above.

bash · transcribe a file
$ openasr pull qwen3-asr-0.6b:q8
↓ qwen3-asr-0.6b.oasr  960.2 MB  ✓ verified sha256
$ openasr transcribe meeting.wav --backend native --model-pack ~/.openasr/models/qwen3-asr-0.6b/q8_0/qwen3-asr-0.6b-q8_0.oasr
✓ local transcript · 0 bytes sent
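
To transcribe a whole folder, the same command can be driven from a short script. The sketch below uses only the flags shown above. It assumes the other quant builds follow the same `<model>/<quant>/<model>-<quant>.oasr` layout as the q8_0 path, which is not confirmed here; `pack_path`, `transcribe_command`, and `transcribe_all` are hypothetical helper names.

```python
import subprocess
from pathlib import Path

MODEL = "qwen3-asr-0.6b"


def pack_path(quant="q8_0", root=Path.home() / ".openasr" / "models"):
    """Build the pack path, assuming the <model>/<quant>/<model>-<quant>.oasr layout."""
    return root / MODEL / quant / f"{MODEL}-{quant}.oasr"


def transcribe_command(audio, quant="q8_0"):
    """Return the argv list for one `openasr transcribe` call."""
    return [
        "openasr", "transcribe", str(audio),
        "--backend", "native",
        "--model-pack", str(pack_path(quant)),
    ]


def transcribe_all(folder, quant="q8_0"):
    """Run the CLI once per .wav file in folder, stopping on the first failure."""
    for audio in sorted(Path(folder).glob("*.wav")):
        subprocess.run(transcribe_command(audio, quant), check=True)
```

Calling `transcribe_all("recordings")` would process every .wav file in that folder in name order; `check=True` raises if any single run exits with an error.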
bash · serve a local API
$ openasr serve --backend native --model-pack ~/.openasr/models/qwen3-asr-0.6b/q8_0/qwen3-asr-0.6b-q8_0.oasr --addr 127.0.0.1:8080
▶ http://127.0.0.1:8080 · model=qwen3-asr-0.6b · 0 bytes will leave this host
python · client.py
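from openai import OpenAI

# Point the OpenAI client at the local server. The client requires an api_key
# value; "local" is just a placeholder.
client = OpenAI(base_url="http://127.0.0.1:8080/v1", api_key="local")

# Open the file in a with-block so it is closed after the upload.
with open("meeting.wav", "rb") as audio:
    transcript = client.audio.transcriptions.create(model="qwen3-asr-0.6b", file=audio)
print(transcript.text)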
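
The client above assumes the server is already listening. A script that starts both may want to wait for the port first. The following is a minimal sketch using only the Python standard library; `wait_for_server` is a hypothetical helper, not part of OpenASR or the OpenAI client, and it only checks that the TCP port accepts connections, not that the model has finished loading.

```python
import socket
import time


def wait_for_server(host="127.0.0.1", port=8080, timeout=10.0, interval=0.25):
    """Return True once a TCP connection to host:port succeeds, False on timeout."""
    deadline = time.monotonic() + timeout
    while True:
        try:
            # A successful connect means something is listening on the port.
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            if time.monotonic() >= deadline:
                return False
            time.sleep(interval)
```

A caller might run `wait_for_server()` right after launching `openasr serve` and only send the transcription request if it returns True.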
