Dedicated 2B ASR for 14-language transcription
Cohere Transcribe 03-2026 is Cohere and Cohere Labs' open release of a
2B-parameter automatic speech recognition model. It is a dedicated audio-in,
text-out architecture with a Conformer-based acoustic encoder and a lightweight
Transformer decoder, trained from scratch for transcription. The upstream model
card lists support for 14 languages, covering English plus European, APAC, and
MENA languages, and reports Apache-2.0 licensing. This OpenASR repo repackages the
original CohereLabs/cohere-transcribe-03-2026 weights as .oasr packs that
run natively in the OpenASR runtime with no Python at inference time. For most
users the q8_0 build is the recommended default; q4_k is for tighter memory
budgets and fp16 is for verification or maximum fidelity.
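The build choice above can be sketched as a small helper that maps a build label to a pack path. This is a minimal sketch, not part of OpenASR: `pack_path` and `BUILDS` are hypothetical names, and the directory layout is inferred from the single q8_0 path that appears in this page's CLI examples. The q4_k and fp16 paths are assumed to follow the same pattern.

```python
from pathlib import Path

# Assumption: build labels double as directory names, as "q8_0" does in the
# CLI examples on this page. The "q4_k" and "fp16" layouts are inferred.
BUILDS = ("q8_0", "q4_k", "fp16")

# Assumption: default model root, taken from the ~/.openasr/models path shown here.
DEFAULT_ROOT = Path.home() / ".openasr" / "models"


def pack_path(model: str, build: str = "q8_0", root: Path = DEFAULT_ROOT) -> Path:
    """Return the expected location of a .oasr pack for the given build."""
    if build not in BUILDS:
        raise ValueError(f"unknown build {build!r}; expected one of {BUILDS}")
    return root / model / build / f"{model}-{build}.oasr"


if __name__ == "__main__":
    # Default build (q8_0), then the smaller q4_k build for tight memory budgets.
    print(pack_path("cohere-transcribe-03-2026"))
    print(pack_path("cohere-transcribe-03-2026", build="q4_k"))
```

The resulting path is what would be passed to `--model-pack`, provided the pack has been downloaded to that location.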
.oasr packs run with no Python at inference and are engineered for CPU and Apple Silicon.
These are CLI and local-server examples. The desktop app runs this model without typing a command; see the desktop install path above.
$ openasr pull cohere-transcribe-03-2026:q8
↓ cohere-transcribe-03-2026.oasr 2.3 GB
✓ verified sha256
$ openasr transcribe meeting.wav --backend native --model-pack ~/.openasr/models/cohere-transcribe-03-2026/q8_0/cohere-transcribe-03-2026-q8_0.oasr
✓ local transcript · 0 bytes sent
$ openasr serve --backend native --model-pack ~/.openasr/models/cohere-transcribe-03-2026/q8_0/cohere-transcribe-03-2026-q8_0.oasr --addr 127.0.0.1:8080
▶ http://127.0.0.1:8080 · model=cohere-transcribe-03-2026 · 0 bytes will leave this host
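from openai import OpenAI

# Point the OpenAI client at the local server; the API key is a placeholder.
client = OpenAI(base_url="http://127.0.0.1:8080/v1", api_key="local")

# Open the audio in binary mode and close it once the request completes.
with open("meeting.wav", "rb") as audio:
    transcript = client.audio.transcriptions.create(model="cohere-transcribe-03-2026", file=audio)

print(transcript.text)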
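For environments without the `openai` package, the same request can be sketched with the Python standard library alone. This is a minimal sketch under stated assumptions: it assumes the local server accepts the OpenAI-style multipart POST to `/v1/audio/transcriptions` with `model` and `file` fields (the request the OpenAI client sends) and returns JSON with a `text` field. `build_transcription_request` and `transcribe` are hypothetical helper names, not part of OpenASR.

```python
import json
import urllib.request
import uuid


def build_transcription_request(base_url, model, filename, audio_bytes):
    """Build (without sending) a multipart POST for an OpenAI-style transcription."""
    boundary = uuid.uuid4().hex
    head = (
        f"--{boundary}\r\n"
        'Content-Disposition: form-data; name="model"\r\n\r\n'
        f"{model}\r\n"
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="file"; filename="{filename}"\r\n'
        "Content-Type: application/octet-stream\r\n\r\n"
    ).encode("utf-8")
    tail = f"\r\n--{boundary}--\r\n".encode("utf-8")
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/audio/transcriptions",
        data=head + audio_bytes + tail,
        headers={
            "Content-Type": f"multipart/form-data; boundary={boundary}",
            # Placeholder key, mirroring api_key="local" in the client example.
            "Authorization": "Bearer local",
        },
        method="POST",
    )


def transcribe(path, base_url="http://127.0.0.1:8080/v1",
               model="cohere-transcribe-03-2026"):
    """Send the file at `path` to the local server and return the transcript text."""
    with open(path, "rb") as f:
        req = build_transcription_request(base_url, model, path, f.read())
    with urllib.request.urlopen(req) as resp:
        # Assumption: the response body is JSON with a "text" field.
        return json.load(resp)["text"]
```

With the server from the `openasr serve` example running, `transcribe("meeting.wav")` would return the transcript as a string, assuming the server follows the OpenAI response shape.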