
Whisper.cpp

Transcribe audio to text using whisper.cpp. You can run the whisper.cpp server locally or remotely.

Setup

  1. Install whisper.cpp following the instructions in the whisper.cpp repository.
  2. Start the whisper.cpp server: ./server
  3. (Optional) Download larger models and start the server with the --model parameter.
  4. (Optional) Enable input conversion on the server with the --convert parameter.
Note

Without the --convert parameter, the server expects WAV files with a 16 kHz sample rate and 16-bit PCM encoding. You can convert other formats with ffmpeg:

    ffmpeg -i input.mp3 -ar 16000 -ac 1 -c:a pcm_s16le output.wav
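If the server runs without --convert, it can help to check a file before sending it. The following is a minimal sketch, not part of ModelFusion or whisper.cpp: the helper names `readWavFormat` and `isServerReadyWav` are hypothetical, and the code assumes a standard RIFF/WAVE layout with a `fmt ` chunk. It checks only the two properties named in the note above (16 kHz sample rate, 16-bit PCM).

```typescript
import { Buffer } from "node:buffer";

// Fields read from the "fmt " chunk of a RIFF/WAVE file.
interface WavFormat {
  audioFormat: number; // 1 means uncompressed PCM
  channels: number;
  sampleRate: number;
  bitsPerSample: number;
}

// Hypothetical helper: walks the RIFF chunks and returns the format
// description, or undefined if the data is not a parseable WAV file.
function readWavFormat(data: Buffer): WavFormat | undefined {
  if (data.length < 12) return undefined;
  const isRiff = data.toString("ascii", 0, 4) === "RIFF";
  const isWave = data.toString("ascii", 8, 12) === "WAVE";
  if (!isRiff || !isWave) return undefined;

  let offset = 12; // first chunk starts right after the RIFF header
  while (offset + 8 <= data.length) {
    const id = data.toString("ascii", offset, offset + 4);
    const size = data.readUInt32LE(offset + 4);
    if (id === "fmt ") {
      // The basic PCM format block is 16 bytes long.
      if (size < 16 || offset + 8 + 16 > data.length) return undefined;
      return {
        audioFormat: data.readUInt16LE(offset + 8),
        channels: data.readUInt16LE(offset + 10),
        sampleRate: data.readUInt32LE(offset + 12),
        bitsPerSample: data.readUInt16LE(offset + 22),
      };
    }
    // Skip this chunk; chunk bodies are padded to an even byte count.
    offset += 8 + size + (size % 2);
  }
  return undefined;
}

// Hypothetical helper: true if the file matches what the server
// expects when started without --convert.
function isServerReadyWav(data: Buffer): boolean {
  const fmt = readWavFormat(data);
  return (
    fmt !== undefined &&
    fmt.audioFormat === 1 &&
    fmt.sampleRate === 16000 &&
    fmt.bitsPerSample === 16
  );
}
```

A caller could run `isServerReadyWav(await fs.promises.readFile("data/test.wav"))` and fall back to the ffmpeg conversion when it returns `false`.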

Model Functions

Examples

Generate Transcription

WhisperCppTranscriptionModel API

import fs from "node:fs";
import { whispercpp, generateTranscription } from "modelfusion";

const transcription = await generateTranscription({
  model: whispercpp.Transcriber(),
  mimeType: "audio/wav",
  audioData: await fs.promises.readFile("data/test.wav"),
});

Configuration

API Configuration

Whisper.cpp API Configuration

const api = whispercpp.Api({
  baseUrl: {
    host: "localhost",
    port: "9000",
  },
  // ...
});

const model = whispercpp.Transcriber({
  api,
  // ...
});
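When the server may run locally or remotely, the host and port can be resolved at startup instead of being hard-coded. The sketch below is one possible approach under stated assumptions: the function `resolveWhisperBaseUrl` and the environment variable names `WHISPER_HOST` and `WHISPER_PORT` are hypothetical, and the fallback values simply mirror the example above rather than any documented default.

```typescript
// Shape of the host/port pair used in the baseUrl example above.
interface BaseUrlParts {
  host: string;
  port: string; // kept as a string, as in the configuration example
}

// Hypothetical helper: reads WHISPER_HOST and WHISPER_PORT (assumed
// names) and falls back to the values from the example configuration.
function resolveWhisperBaseUrl(
  env: Record<string, string | undefined>
): BaseUrlParts {
  const host = env["WHISPER_HOST"]?.trim() || "localhost";
  const port = env["WHISPER_PORT"]?.trim() || "9000";

  // Reject anything that is not a valid TCP port number.
  const portNumber = Number(port);
  if (!/^\d+$/.test(port) || portNumber < 1 || portNumber > 65535) {
    throw new Error(`Invalid whisper.cpp server port: ${port}`);
  }

  return { host, port };
}
```

The result can then be passed as `baseUrl`, for example `whispercpp.Api({ baseUrl: resolveWhisperBaseUrl(process.env) })`.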