Node.js bindings for OpenAI's Whisper. Runs locally on CPU.
whisper-node is an npm package that provides Node.js bindings for OpenAI's Whisper speech-to-text model. Because the model runs locally on the CPU, audio never leaves the machine, which matters for applications with data-privacy requirements or unreliable internet connectivity.
To add whisper-node to a project, run `npm install whisper-node`. This installs the package and its dependencies, after which Whisper's transcription API is available to your application, even if you are new to Node.js or machine learning.
With whisper-node, applications can convert spoken language into written text for transcriptions, subtitles, or voice commands. Since inference runs on the CPU, no GPU is required, making the package usable on commodity hardware for a range of applications, from educational software and accessibility tools to voice-operated systems, while keeping user data on the device.
Core dependencies: readline-sync, shelljs
Dev dependencies: @types/node, nodemon, ts-node, typescript
A README file for the whisper-node code repository.
Node.js bindings for OpenAI's Whisper. Transcription is done locally.
npm install whisper-node
npx whisper-node download
Requirement for Windows: install the `make` command from here.
import whisper from 'whisper-node';
const transcript = await whisper("example/sample.wav");
console.log(transcript); // output: [ {start,end,speech} ]
[
  {
    "start": "00:00:14.310", // time stamp begin
    "end": "00:00:16.480", // time stamp end
    "speech": "howdy" // transcription
  }
]
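The returned array is easy to post-process. A minimal sketch (the `Segment` interface and helper names below are illustrative, not part of the whisper-node API) that flattens the segments shown above into plain text and parses a timestamp into seconds:

```typescript
// Shape of each transcript segment, per the output example above (illustrative type).
interface Segment {
  start: string;  // e.g. "00:00:14.310"
  end: string;    // e.g. "00:00:16.480"
  speech: string; // transcribed text
}

// Join all segment texts into one plain-text transcript.
function toPlainText(segments: Segment[]): string {
  return segments.map((s) => s.speech.trim()).join(" ");
}

// Convert an "HH:MM:SS.mmm" timestamp to seconds.
function toSeconds(ts: string): number {
  const [h, m, s] = ts.split(":");
  return Number(h) * 3600 + Number(m) * 60 + Number(s);
}
```

For example, `toPlainText` applied to the sample output above yields `"howdy"`, and `toSeconds("00:00:14.310")` yields `14.31`.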
import whisper from 'whisper-node';

const filePath = "example/sample.wav"; // required

const options = {
  modelName: "base.en", // default
  // modelPath: "/custom/path/to/model.bin", // use model in a custom directory (cannot use along with 'modelName')
  whisperOptions: {
    language: 'auto', // default (use 'auto' for auto detect)
    gen_file_txt: false, // outputs .txt file
    gen_file_subtitle: false, // outputs .srt file
    gen_file_vtt: false, // outputs .vtt file
    word_timestamps: true // timestamp for every word
    // timestamp_size: 0 // cannot use along with word_timestamps:true
  }
};

const transcript = await whisper(filePath, options);
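The `gen_file_subtitle` option already writes an .srt file for you. If you instead want to build SRT text in memory from the returned segments, a minimal sketch (the `Segment` type and `toSrt` helper are illustrative, not part of whisper-node):

```typescript
// Shape of each transcript segment, per the output example above (illustrative type).
interface Segment {
  start: string;
  end: string;
  speech: string;
}

// Format segments as SRT cues. SRT uses a comma before the milliseconds,
// so "00:00:14.310" becomes "00:00:14,310".
function toSrt(segments: Segment[]): string {
  return segments
    .map(
      (s, i) =>
        `${i + 1}\n${s.start.replace(".", ",")} --> ${s.end.replace(".", ",")}\n${s.speech}\n`
    )
    .join("\n");
}
```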
Files must be .wav and sampled at 16kHz.
Example: convert an .mp3 file with FFmpeg: `ffmpeg -i input.mp3 -ar 16000 output.wav`
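In a Node.js pipeline you might shell out to FFmpeg for the same conversion. A sketch under stated assumptions (ffmpeg is installed and on PATH; the helper names and the `-y` overwrite flag are additions, not from the README):

```typescript
import { execFileSync } from "node:child_process";

// Build the FFmpeg arguments for converting input audio to the
// 16 kHz .wav that whisper-node expects (mirrors the command above).
function whisperWavArgs(input: string, output: string): string[] {
  return ["-y", "-i", input, "-ar", "16000", output];
}

// Run the conversion synchronously (assumes ffmpeg is on PATH).
function toWhisperWav(input: string, output: string): void {
  execFileSync("ffmpeg", whisperWavArgs(input, output));
}
```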
d.ts file
npm run dev
- runs nodemon and tsc on '/src/test.ts'
npm run build
- runs tsc, outputs to '/dist', and makes 'dist/download.js' executable