WhisperSpeechTranscription
: A yarp device which performs audio-to-text transcription using OpenAI Whisper models.
Public Member Functions | |
WhisperSpeechTranscription () | |
virtual | ~WhisperSpeechTranscription () |
WhisperSpeechTranscription (const WhisperSpeechTranscription &)=delete | |
WhisperSpeechTranscription (WhisperSpeechTranscription &&)=delete | |
WhisperSpeechTranscription & | operator= (const WhisperSpeechTranscription &)=delete |
WhisperSpeechTranscription & | operator= (WhisperSpeechTranscription &&)=delete |
bool | open (yarp::os::Searchable &config) override |
Open the DeviceDriver. | |
bool | close () override |
Close the DeviceDriver. | |
virtual bool | setLanguage (const std::string &language) override |
Sets the language for speech transcription. | |
virtual bool | getLanguage (std::string &language) override |
Gets the current language set for speech transcription. | |
virtual bool | transcribe (const yarp::sig::Sound &sound, std::string &transcription, double &score) override |
Performs the speech transcription. | |
Public Member Functions inherited from yarp::dev::DeviceDriver | |
DeviceDriver () | |
DeviceDriver (const DeviceDriver &other)=delete | |
DeviceDriver (DeviceDriver &&other) noexcept=delete | |
DeviceDriver & | operator= (const DeviceDriver &other)=delete |
DeviceDriver & | operator= (DeviceDriver &&other) noexcept=delete |
virtual | ~DeviceDriver () |
virtual std::string | id () const |
Return the id assigned to the PolyDriver. | |
virtual void | setId (const std::string &id) |
Set the id for this device. | |
template<class T > | |
bool | view (T *&x) |
Get an interface to the device driver. | |
virtual DeviceDriver * | getImplementation () |
Some drivers are bureaucrats, pointing at others. | |
Public Member Functions inherited from yarp::dev::ISpeechTranscription | |
virtual | ~ISpeechTranscription () |
This device implements the ISpeechTranscription interface. It can be used together with a speechTranscription_nws_yarp device and an AudioRecorderWrapper to transcribe audio in real time.
Parameters required by this device are:
Parameter name | SubParameter | Type | Units | Default Value | Required | Description | Notes |
---|---|---|---|---|---|---|---|
model | - | string | - | - | Yes | Full path to the model file, e.g. ggml-base.en.bin | |
language | - | string | - | auto | No | Language used for transcription (accepted values TBC) | |
remove_symbols | - | bool | - | true | No | Removes symbols from the output text, e.g. ...[bla bla]... | |
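The effect of remove_symbols can be pictured with a minimal sketch. It assumes that "symbols" means bracketed annotations such as [bla bla]; the device's actual filtering rule is not documented here and may differ. The helper name stripBracketed is hypothetical and is not part of the device API.

```cpp
#include <string>

// Hypothetical helper, NOT part of WhisperSpeechTranscription.
// Assumption: "symbols" are spans enclosed in square brackets.
// Drops every "[...]" span and collapses repeated or leading/trailing blanks.
std::string stripBracketed(const std::string& text)
{
    std::string out;
    int depth = 0;  // current nesting level of '[' ... ']'
    for (char c : text) {
        if (c == '[') { ++depth; continue; }
        if (c == ']') { if (depth > 0) { --depth; } continue; }
        if (depth > 0) { continue; }  // inside brackets: skip the character
        // Skip a blank that would start the output or follow another blank.
        if (c == ' ' && (out.empty() || out.back() == ' ')) { continue; }
        out.push_back(c);
    }
    // Remove a blank left at the end by a trailing bracketed span.
    while (!out.empty() && out.back() == ' ') { out.pop_back(); }
    return out;
}
```

With this sketch, an input such as "hello [music] world" would be reduced to "hello world".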
Definition at line 31 of file whisperSpeechTranscription.h.
WhisperSpeechTranscription::WhisperSpeechTranscription () | |
Definition at line 25 of file whisperSpeechTranscription.cpp.
WhisperSpeechTranscription::~WhisperSpeechTranscription () | virtual |
Definition at line 33 of file whisperSpeechTranscription.cpp.
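WhisperSpeechTranscription::WhisperSpeechTranscription (const WhisperSpeechTranscription &) | delete |
WhisperSpeechTranscription::WhisperSpeechTranscription (WhisperSpeechTranscription &&) | delete |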
bool WhisperSpeechTranscription::close () | override virtual |
Close the DeviceDriver.
Reimplemented from yarp::dev::DeviceDriver.
Definition at line 150 of file whisperSpeechTranscription.cpp.
bool WhisperSpeechTranscription::getLanguage (std::string &language) | override virtual |
Gets the current language set for speech transcription.
language | the returned string (code) representing the speech language (e.g. ita, eng...). Default value is "auto". |
Implements yarp::dev::ISpeechTranscription.
Definition at line 167 of file whisperSpeechTranscription.cpp.
bool WhisperSpeechTranscription::open (yarp::os::Searchable &config) | override virtual |
Open the DeviceDriver.
config | is a list of parameters for the device. Which parameters are effective for your device can vary. See device invocation examples. If there is no example for your device, you can run the "yarpdev" program with the verbose flag set to probe what parameters the device is checking. If that fails too, you'll need to read the source code (please nag one of the yarp developers to add documentation for your device). |
Reimplemented from yarp::dev::DeviceDriver.
Definition at line 38 of file whisperSpeechTranscription.cpp.
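WhisperSpeechTranscription & WhisperSpeechTranscription::operator= (const WhisperSpeechTranscription &) | delete |
WhisperSpeechTranscription & WhisperSpeechTranscription::operator= (WhisperSpeechTranscription &&) | delete |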
bool WhisperSpeechTranscription::setLanguage (const std::string &language) | override virtual |
Sets the language for speech transcription.
language | a string (code) representing the speech language (e.g. ita, eng...). Default value is "auto". |
Implements yarp::dev::ISpeechTranscription.
Definition at line 160 of file whisperSpeechTranscription.cpp.
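bool WhisperSpeechTranscription::transcribe (const yarp::sig::Sound &sound, std::string &transcription, double &score) | override virtual |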
Performs the speech transcription.
sound | the audio data to transcribe |
transcription | the returned transcription (it may be empty) |
score | the returned score/confidence value, in the range 0 to 1.0. It may not be implemented. |
Implements yarp::dev::ISpeechTranscription.
Definition at line 173 of file whisperSpeechTranscription.cpp.
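The calling convention of the three interface methods (bool return, results delivered through reference parameters) can be sketched without YARP as follows. FakeSound and FakeTranscriber are hypothetical stand-ins for yarp::sig::Sound and for a device implementing ISpeechTranscription, and transcribeOrEmpty is an illustrative helper. The sketch assumes that a true return value signals success, which this page does not state explicitly.

```cpp
#include <string>
#include <vector>

// Stand-in for yarp::sig::Sound (assumption: only raw samples matter here).
struct FakeSound
{
    std::vector<short> samples;
};

// Stand-in that mirrors the documented method signatures.
// It does NOT run Whisper: it returns a fixed text for non-empty audio.
class FakeTranscriber
{
public:
    bool setLanguage(const std::string& language)
    {
        m_language = language;
        return true;
    }
    bool getLanguage(std::string& language)
    {
        language = m_language;
        return true;
    }
    bool transcribe(const FakeSound& sound, std::string& transcription, double& score)
    {
        const bool hasAudio = !sound.samples.empty();
        transcription = hasAudio ? "hello" : "";  // may legitimately be empty
        score = hasAudio ? 1.0 : 0.0;             // documented range: 0 to 1.0
        return true;
    }

private:
    std::string m_language = "auto";  // documented default
};

// Caller-side pattern: check the return value first, then read the outputs.
std::string transcribeOrEmpty(FakeTranscriber& dev, const FakeSound& sound)
{
    std::string text;
    double score = 0.0;
    if (!dev.transcribe(sound, text, score)) {
        return "";  // call failed: nothing to report
    }
    return text;    // an empty string here means nothing was recognized
}
```

In real code the interface pointer would typically be obtained through the view() method listed among the inherited DeviceDriver members, and the caller should be prepared for an empty transcription and for a score that carries no information.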