YARP
Yet Another Robot Platform
 
WhisperSpeechTranscription Class Reference

WhisperSpeechTranscription: A YARP device which performs audio-to-text transcription using OpenAI Whisper models.

#include </home/runner/work/yarp-documentation/yarp-documentation/yarp/opt-modules/yarp-device-speechTranscription-whisper/src/devices/whisperSpeechTranscription/whisperSpeechTranscription.h>


Public Member Functions

 WhisperSpeechTranscription ()
 
virtual ~WhisperSpeechTranscription ()
 
 WhisperSpeechTranscription (const WhisperSpeechTranscription &)=delete
 
 WhisperSpeechTranscription (WhisperSpeechTranscription &&)=delete
 
WhisperSpeechTranscription & operator= (const WhisperSpeechTranscription &)=delete
 
WhisperSpeechTranscription & operator= (WhisperSpeechTranscription &&)=delete
 
bool open (yarp::os::Searchable &config) override
 Open the DeviceDriver.
 
bool close () override
 Close the DeviceDriver.
 
virtual bool setLanguage (const std::string &language) override
 Sets the language for speech transcription.
 
virtual bool getLanguage (std::string &language) override
 Gets the current language set for speech transcription.
 
virtual bool transcribe (const yarp::sig::Sound &sound, std::string &transcription, double &score) override
 Performs the speech transcription.
 
- Public Member Functions inherited from yarp::dev::DeviceDriver
 DeviceDriver ()
 
 DeviceDriver (const DeviceDriver &other)=delete
 
 DeviceDriver (DeviceDriver &&other) noexcept=delete
 
DeviceDriver & operator= (const DeviceDriver &other)=delete
 
DeviceDriver & operator= (DeviceDriver &&other) noexcept=delete
 
virtual ~DeviceDriver ()
 
virtual std::string id () const
 Return the id assigned to the PolyDriver.
 
virtual void setId (const std::string &id)
 Set the id for this device.
 
template<class T >
bool view (T *&x)
 Get an interface to the device driver.
 
virtual DeviceDriver * getImplementation ()
 Some drivers are bureaucrats, pointing at others.
 
- Public Member Functions inherited from yarp::dev::ISpeechTranscription
virtual ~ISpeechTranscription ()
 

Detailed Description

WhisperSpeechTranscription: A YARP device which performs audio-to-text transcription using OpenAI Whisper models.

This device implements the ISpeechTranscription interface and can be used with a speechTranscription_nws_yarp device and an AudioRecorderWrapper to transcribe audio in real time.

Parameters required by this device are:

| Parameter name | SubParameter | Type | Units | Default Value | Required | Description | Notes |
|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|
| model | - | string | - | - | Yes | Full path to the model file, e.g. ggml-base.en.bin | |
| language | - | string | - | auto | No | Language (??? TBC) | |
| remove_symbols | - | bool | - | true | No | Removes symbols from the output text, e.g. ...[bla bla]... | |
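
The remove_symbols row above only says that symbols such as [bla bla] are removed from the output text; the exact rule the device applies is not documented here. As a rough illustration, the sketch below assumes that every square-bracketed span is dropped and that the spaces left behind are collapsed. The helper name stripBracketedSymbols is hypothetical and is not part of the device's API.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper (NOT part of WhisperSpeechTranscription): removes every
// "[...]" span from a transcription. Assumes brackets are not nested; an
// unclosed '[' causes the rest of the text to be dropped.
std::string stripBracketedSymbols(const std::string& text)
{
    std::string out;
    bool inBracket = false;
    for (char c : text) {
        if (c == '[') { inBracket = true; continue; }
        if (c == ']' && inBracket) { inBracket = false; continue; }
        if (inBracket) { continue; }
        // Skip leading spaces and spaces doubled by a removed annotation.
        if (c == ' ' && (out.empty() || out.back() == ' ')) { continue; }
        out.push_back(c);
    }
    // Trim a single trailing space, if one is left.
    if (!out.empty() && out.back() == ' ') { out.pop_back(); }
    return out;
}

// Small usage check for the helper above.
inline void exampleStripBracketedSymbols()
{
    assert(stripBracketedSymbols("hello [music] world") == "hello world");
    assert(stripBracketedSymbols("[noise]").empty());
}
```

A filter of this kind is useful when the transcription feeds a dialogue system, where non-speech annotations would otherwise be treated as spoken words.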

Definition at line 31 of file whisperSpeechTranscription.h.

Constructor & Destructor Documentation

◆ WhisperSpeechTranscription() [1/3]

WhisperSpeechTranscription::WhisperSpeechTranscription ( )

Definition at line 25 of file whisperSpeechTranscription.cpp.

◆ ~WhisperSpeechTranscription()

WhisperSpeechTranscription::~WhisperSpeechTranscription ( )
virtual

Definition at line 33 of file whisperSpeechTranscription.cpp.

◆ WhisperSpeechTranscription() [2/3]

WhisperSpeechTranscription::WhisperSpeechTranscription ( const WhisperSpeechTranscription & )
delete

◆ WhisperSpeechTranscription() [3/3]

WhisperSpeechTranscription::WhisperSpeechTranscription ( WhisperSpeechTranscription &&  )
delete
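
Because the copy and move constructors above are deleted (as are the assignment operators listed in the summary), an instance cannot be copied or moved once it has been created. The sketch below uses a hypothetical stand-in class, NonCopyableDevice, with the same special-member pattern to show what that implies for callers: the object itself stays put, and ownership is passed around through a pointer instead.

```cpp
#include <cassert>
#include <memory>
#include <type_traits>

// Hypothetical stand-in that mirrors the special members of the class above.
class NonCopyableDevice
{
public:
    NonCopyableDevice() = default;
    virtual ~NonCopyableDevice() = default;
    NonCopyableDevice(const NonCopyableDevice&) = delete;
    NonCopyableDevice(NonCopyableDevice&&) = delete;
    NonCopyableDevice& operator=(const NonCopyableDevice&) = delete;
    NonCopyableDevice& operator=(NonCopyableDevice&&) = delete;
};

// The compiler rejects any attempt to copy or move such an object.
static_assert(!std::is_copy_constructible<NonCopyableDevice>::value, "no copy construction");
static_assert(!std::is_move_constructible<NonCopyableDevice>::value, "no move construction");
static_assert(!std::is_copy_assignable<NonCopyableDevice>::value, "no copy assignment");
static_assert(!std::is_move_assignable<NonCopyableDevice>::value, "no move assignment");

// Ownership can still be transferred by moving a smart pointer to the object.
std::unique_ptr<NonCopyableDevice> makeDevice()
{
    return std::make_unique<NonCopyableDevice>();
}

// Small usage check: the pointer moves, the object does not.
inline void exampleOwnershipTransfer()
{
    std::unique_ptr<NonCopyableDevice> first = makeDevice();
    NonCopyableDevice* address = first.get();
    std::unique_ptr<NonCopyableDevice> second = std::move(first);
    assert(first == nullptr);
    assert(second.get() == address);
}
```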

Member Function Documentation

◆ close()

bool WhisperSpeechTranscription::close ( )
override virtual

Close the DeviceDriver.

Returns
true/false on success/failure.

Reimplemented from yarp::dev::DeviceDriver.

Definition at line 150 of file whisperSpeechTranscription.cpp.

◆ getLanguage()

bool WhisperSpeechTranscription::getLanguage ( std::string &  language)
override virtual

Gets the current language set for speech transcription.

Parameters
language: the returned string (code) representing the speech language (e.g. ita, eng...). Default value is "auto".
Returns
a ReturnValue, convertible to true/false

Implements yarp::dev::ISpeechTranscription.

Definition at line 167 of file whisperSpeechTranscription.cpp.

◆ open()

bool WhisperSpeechTranscription::open ( yarp::os::Searchable & config)
override virtual

Open the DeviceDriver.

Parameters
config: a list of parameters for the device. Which parameters are effective for your device can vary. See device invocation examples. If there is no example for your device, you can run the "yarpdev" program with the verbose flag set to probe what parameters the device is checking. If that fails too, you'll need to read the source code (please nag one of the YARP developers to add documentation for your device).
Returns
true/false upon success/failure

Reimplemented from yarp::dev::DeviceDriver.

Definition at line 38 of file whisperSpeechTranscription.cpp.
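
To make the role of config more concrete, the sketch below shows one way the three parameters from the table in the Detailed Description (model, language, remove_symbols) could be read and validated. It is only an illustration under stated assumptions: the real device receives a yarp::os::Searchable, whereas this sketch uses a plain std::map so that it stays self-contained, and the names WhisperConfig and parseConfig are hypothetical. How the device actually parses its configuration is not shown on this page.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical holder for the documented parameters and their defaults.
struct WhisperConfig {
    std::string model;              // required, no default
    std::string language = "auto";  // optional
    bool removeSymbols = true;      // optional
};

// Hypothetical parser over a key/value map (stand-in for yarp::os::Searchable).
// Returns false, leaving `out` untouched, when the required `model` is missing.
bool parseConfig(const std::map<std::string, std::string>& in, WhisperConfig& out)
{
    WhisperConfig cfg;  // starts from the documented defaults

    auto model = in.find("model");
    if (model == in.end() || model->second.empty()) {
        return false;  // `model` is marked as required in the table
    }
    cfg.model = model->second;

    auto language = in.find("language");
    if (language != in.end()) {
        cfg.language = language->second;
    }

    auto removeSymbols = in.find("remove_symbols");
    if (removeSymbols != in.end()) {
        // Assumption: "true" and "1" enable the option, anything else disables it.
        cfg.removeSymbols = (removeSymbols->second == "true" || removeSymbols->second == "1");
    }

    out = cfg;
    return true;
}

// Small usage check: only `model` is given, so the defaults apply.
inline void exampleParseConfig()
{
    WhisperConfig cfg;
    assert(parseConfig({{"model", "ggml-base.en.bin"}}, cfg));
    assert(cfg.language == "auto");
    assert(cfg.removeSymbols);
}
```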

◆ operator=() [1/2]

WhisperSpeechTranscription & WhisperSpeechTranscription::operator= ( const WhisperSpeechTranscription & )
delete

◆ operator=() [2/2]

WhisperSpeechTranscription & WhisperSpeechTranscription::operator= ( WhisperSpeechTranscription &&  )
delete

◆ setLanguage()

bool WhisperSpeechTranscription::setLanguage ( const std::string &  language)
override virtual

Sets the language for speech transcription.

Parameters
language: a string (code) representing the speech language (e.g. ita, eng...). Default value is "auto".
Returns
a ReturnValue, convertible to true/false

Implements yarp::dev::ISpeechTranscription.

Definition at line 160 of file whisperSpeechTranscription.cpp.
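
Both getLanguage() and setLanguage() report success through their return value and, in the getter's case, hand the result back through a reference parameter. The sketch below reproduces that calling convention with a hypothetical stand-in class, LanguageHolder, so that it can be compiled without YARP. Rejecting an empty code is an assumption made for the example; this page does not say which codes the real device accepts.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in that follows the same getter/setter convention.
class LanguageHolder
{
public:
    // Stores the language code; assumption: an empty code is rejected.
    bool setLanguage(const std::string& language)
    {
        if (language.empty()) {
            return false;
        }
        m_language = language;
        return true;
    }

    // Writes the current code into the caller's string.
    bool getLanguage(std::string& language) const
    {
        language = m_language;
        return true;
    }

private:
    std::string m_language = "auto";  // documented default value
};

// Small usage check: default, successful update, rejected update.
inline void exampleLanguageRoundTrip()
{
    LanguageHolder holder;
    std::string current;
    assert(holder.getLanguage(current) && current == "auto");
    assert(holder.setLanguage("ita"));
    assert(holder.getLanguage(current) && current == "ita");
    assert(!holder.setLanguage(""));
}
```

Callers should check the returned value before trusting the output string, since the interface does not promise that the string is set on failure.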

◆ transcribe()

bool WhisperSpeechTranscription::transcribe ( const yarp::sig::Sound & sound,
std::string & transcription,
double & score
)
override virtual

Performs the speech transcription.

Parameters
sound: the audio data to transcribe
transcription: the returned transcription (it may be empty)
score: the returned score/confidence value, in the range 0 to 1.0. It may not be implemented.
Returns
a ReturnValue, convertible to true/false

Implements yarp::dev::ISpeechTranscription.

Definition at line 173 of file whisperSpeechTranscription.cpp.
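
Since transcribe() can succeed and still return an empty transcription, callers typically need two checks: the return value and the text itself. The sketch below shows one way to wrap both. It is written as a template so that it works with any object exposing a transcribe(sound, transcription, score) member of this shape; FakeSound and FakeTranscriber are hypothetical stand-ins used only to keep the example self-contained, and their behavior (failing on empty audio, returning a canned string) is not a description of the real device.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for yarp::sig::Sound.
struct FakeSound {
    std::vector<short> samples;
};

// Hypothetical stand-in for a transcription device with the same signature.
class FakeTranscriber
{
public:
    bool transcribe(const FakeSound& sound, std::string& transcription, double& score)
    {
        transcription.clear();
        score = 0.0;
        if (sound.samples.empty()) {
            return false;  // assumption: empty audio is reported as a failure
        }
        transcription = "hello";  // canned result for the example
        score = 1.0;
        return true;
    }
};

// Returns true only when the call succeeded AND produced non-empty text.
// `text` and `score` are left untouched otherwise.
template <class Transcriber, class Sound>
bool transcribeNonEmpty(Transcriber& device, const Sound& sound, std::string& text, double& score)
{
    std::string candidate;
    double candidateScore = 0.0;
    if (!device.transcribe(sound, candidate, candidateScore)) {
        return false;
    }
    if (candidate.empty()) {
        return false;
    }
    text = candidate;
    score = candidateScore;
    return true;
}

// Small usage check with the stand-in types.
inline void exampleTranscribe()
{
    FakeTranscriber device;
    FakeSound sound;
    sound.samples = {1, 2, 3};
    std::string text;
    double score = 0.0;
    assert(transcribeNonEmpty(device, sound, text, score));
    assert(text == "hello");
}
```

Because the score may not be implemented, it is safer to treat it as advisory and not to reject a transcription on the score alone.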


The documentation for this class was generated from the following files: