Whisper.cpp
Transcribe speech to text using whisper.cpp. You can run the whisper.cpp server locally or remotely.
Setup
- Install whisper.cpp following the instructions in the whisper.cpp repository.
- Start the whisper.cpp server:
  ./server
- (optional): Download larger models and start the server with the --model parameter.
- (optional): Enable input conversion on the server using the --convert parameter.
Note: Without the --convert parameter, the server expects WAV files with a 16 kHz sample rate and 16-bit PCM encoding. You can use ffmpeg for conversion:
ffmpeg -i input.mp3 -ar 16000 -ac 1 -c:a pcm_s16le output.wav
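When the server runs without --convert, it can help to check a file before sending it. The following is a minimal sketch, assuming Node.js; `readWavFormat` and `isServerReadyWav` are hypothetical helper names, not part of ModelFusion or whisper.cpp. It walks the RIFF chunks of a WAV file, reads the "fmt " chunk, and checks for 16 kHz, 16-bit PCM.

```typescript
import { Buffer } from "node:buffer";

// Fields of the WAV "fmt " chunk that matter for this check.
interface WavFormat {
  audioFormat: number; // 1 means uncompressed PCM
  channels: number;
  sampleRate: number;
  bitsPerSample: number;
}

// Hypothetical helper: walks the RIFF chunks and returns the "fmt " chunk contents.
function readWavFormat(data: Buffer): WavFormat {
  if (
    data.length < 12 ||
    data.toString("ascii", 0, 4) !== "RIFF" ||
    data.toString("ascii", 8, 12) !== "WAVE"
  ) {
    throw new Error("Not a RIFF/WAVE file");
  }

  let offset = 12; // first chunk starts after the 12-byte RIFF header
  while (offset + 8 <= data.length) {
    const chunkId = data.toString("ascii", offset, offset + 4);
    const chunkSize = data.readUInt32LE(offset + 4);

    if (chunkId === "fmt ") {
      if (chunkSize < 16 || offset + 24 > data.length) {
        throw new Error("Truncated fmt chunk");
      }
      return {
        audioFormat: data.readUInt16LE(offset + 8),
        channels: data.readUInt16LE(offset + 10),
        sampleRate: data.readUInt32LE(offset + 12),
        bitsPerSample: data.readUInt16LE(offset + 22),
      };
    }

    // Skip this chunk; chunk bodies are padded to an even number of bytes.
    offset += 8 + chunkSize + (chunkSize % 2);
  }
  throw new Error("No fmt chunk found");
}

// Hypothetical helper: true if the file is 16 kHz, 16-bit PCM.
function isServerReadyWav(data: Buffer): boolean {
  const fmt = readWavFormat(data);
  return (
    fmt.audioFormat === 1 && fmt.sampleRate === 16000 && fmt.bitsPerSample === 16
  );
}
```

A file that fails this check could be converted with the ffmpeg command above before it is passed to the server.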
Model Functions
Generate Transcription
WhisperCppTranscriptionModel API
import fs from "node:fs";
import { whispercpp, generateTranscription } from "modelfusion";

const transcription = await generateTranscription({
  model: whispercpp.Transcriber(),
  mimeType: "audio/wav",
  audioData: await fs.promises.readFile("data/test.wav"),
});
Configuration
API Configuration
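import { whispercpp } from "modelfusion";

// Point the client at the whisper.cpp server (local or remote).
const api = whispercpp.Api({
  baseUrl: {
    host: "localhost",
    port: "9000",
  },
  // ...
});

// Use the custom API configuration for transcription.
const model = whispercpp.Transcriber({
  api,
  // ...
});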
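For a remote server, the address often comes from a single URL string rather than separate host and port values. The following is a rough sketch, assuming Node.js and its global `URL` class; `toHostAndPort` is a hypothetical helper, not a ModelFusion function, and the fallback port is whatever the caller decides to pass in.

```typescript
// Shape matching the host and port fields used in the baseUrl object above.
interface HostAndPort {
  host: string;
  port: string;
}

// Hypothetical helper (not part of ModelFusion): splits a server URL
// into host and port strings.
function toHostAndPort(serverUrl: string, fallbackPort: string): HostAndPort {
  const url = new URL(serverUrl); // throws a TypeError on an invalid URL
  return {
    host: url.hostname,
    // URL.port is an empty string when the URL has no explicit port
    // or when the port equals the scheme default.
    port: url.port !== "" ? url.port : fallbackPort,
  };
}
```

The returned object could then be passed as the baseUrl value of whispercpp.Api in place of the literal host and port shown above.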