SaraKIT is equipped with three microphones and a dedicated sound processor that cleans up the voice signal and supports speech recognition on Raspberry Pi from distances of up to 16.4 feet (5 meters). A voice-operated ChatGPT can be built in various ways, and many examples are available on GitHub. Here, I propose a solution based on Vosk, an offline speech recognition tool used for wake word detection and command recognition, and Piper for speech generation. Both are among the best offline Speech to Text (STT) and Text to Speech (TTS) solutions currently available for Raspberry Pi. The offline approach means that speech recognition and synthesis need no internet connection, which protects privacy and keeps these tools free; only the ChatGPT requests themselves go over the network.

For more details on Piper, see here: https://github.com/SaraEye/SaraKIT-Text-To-Speech-Piper-Raspberry-Pi

For more on Vosk, check out: https://github.com/SaraEye/SaraKIT-Speech-Recognition-Vosk-Raspberry-Pi

 

Installation on SaraKIT

Assuming the basic SaraKIT drivers are already installed ([Getting Started with SaraKIT](https://sarakit.saraai.com/getting-started/software)), follow these steps to install the required tools:

sudo apt-get update
sudo apt-get install -y python3-pip python3-pyaudio libasound2-dev libfmt-dev libspdlog-dev

sudo pip3 install vosk piper-tts openai

git clone https://github.com/SaraEye/SaraKIT-Voice-ChatGPT-Raspberry-Pi VoiceChatGPT
cd VoiceChatGPT

Before running the script, you'll need an OpenAI API key, which you can obtain by registering on OpenAI's website.
Insert the key in this line of VoiceChatGPT.py:

client = OpenAI(api_key="YOUR_API_KEY_HERE")
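
If you would rather not keep the key inside the script, one common alternative is to read it from an environment variable. The helper below is only a sketch: `load_api_key` is a hypothetical name that does not appear in the repository, and it assumes you have exported `OPENAI_API_KEY` in your shell beforehand.

```python
import os


def load_api_key(env_var: str = "OPENAI_API_KEY") -> str:
    """Return the API key stored in env_var, or raise a clear error.

    Hypothetical helper, not part of VoiceChatGPT.py.
    """
    key = os.environ.get(env_var, "").strip()
    if not key:
        raise RuntimeError(f"Set the {env_var} environment variable first")
    return key


# Possible use in the script (requires the openai package installed above):
# from openai import OpenAI
# client = OpenAI(api_key=load_api_key())
```

With this approach the key stays out of the source file, which helps if you later share or commit your modified script.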

Set your wake word in the line:

WakeWord="sarah"
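
To illustrate how a wake word can be matched, here is a minimal sketch. It assumes the recognized text arrives as the JSON string that Vosk's recognizer returns (an object with a `text` field); `heard_wake_word` is a hypothetical helper, and the actual script may perform this check differently.

```python
import json

WakeWord = "sarah"


def heard_wake_word(result_json: str, wake_word: str = WakeWord) -> bool:
    """Return True if wake_word appears as a whole word in the result.

    result_json is assumed to look like '{"text": "hey sarah"}'.
    """
    text = json.loads(result_json).get("text", "")
    # Compare whole words so that e.g. "sarahs" does not trigger the assistant.
    return wake_word.lower() in text.lower().split()
```

Matching whole words rather than substrings reduces false activations. Short, distinctive wake words that the loaded Vosk model recognizes reliably tend to work best.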

To use a language other than English, or a different Piper voice, download and load the appropriate Vosk and Piper models. See the Piper and Vosk GitHub pages linked above for guidance.

To run:

python3 VoiceChatGPT.py

Initially, the chat waits for the wake word, by default "sarah". After recognizing it, you can ask ChatGPT anything, and it will respond verbally.
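
The flow described above can be pictured as a small two-state loop: wait for the wake word, then treat the next utterance as the question. The sketch below is an illustration under assumptions, not the code from the repository. `run_assistant`, `ask` and `speak` are hypothetical names standing in for the Vosk, ChatGPT and Piper parts, and returning to the waiting state after every answer is my assumption.

```python
from typing import Callable, Iterable


def run_assistant(
    utterances: Iterable[str],
    ask: Callable[[str], str],
    speak: Callable[[str], None],
    wake_word: str = "sarah",
) -> None:
    """Process recognized utterances one by one.

    utterances: recognized text, e.g. from a speech recognizer
    ask:        sends a question to the chatbot and returns the reply
    speak:      reads the reply aloud (or prints it while testing)
    """
    awake = False
    for text in utterances:
        words = text.lower().split()
        if not words:
            continue  # silence or noise: nothing was recognized
        if not awake:
            # Still waiting: only the wake word changes the state.
            awake = wake_word.lower() in words
            continue
        # Awake: this utterance is the question.
        speak(ask(text))
        awake = False  # assumption: wait for the wake word again


if __name__ == "__main__":
    demo = ["sarah", "what is a raspberry pi"]
    run_assistant(demo, ask=lambda q: "You asked: " + q, speak=print)
```

Passing `ask` and `speak` in as functions lets you try the logic on a desktop with plain strings before connecting the microphones and the speaker.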

This setup creates a powerful, private, and interactive voice assistant using the capabilities of ChatGPT, SaraKIT and Raspberry Pi. Dive into creating your personalized voice-operated assistant today!

 

On our website, you can discover an even more advanced version we've dubbed SaraEye, where ChatGPT activation doesn't rely on a wake word but on gaze recognition, mimicking human interaction. When we look at someone, they know we're addressing them. Similarly, here, you simply look at ChatGPT to engage, eliminating the need for constant wake prompts like "Alexa, Alexa, Alexa..." :)

Using our SaraKIT electronics, we can build a ChatGPT-enabled device in a 3D-printed housing that tracks and recognizes the user's face.
 
Our most advanced version:

Of course, for simple integration with ChatGPT, the PCB itself is enough, without cameras, motors or a special housing, but this deprives our assistant of the sense of sight.

 

The effects of this simple yet powerful script can be seen in the video below:


You can find C++ and Python code for Raspberry Pi 4 in the SaraKIT GitHub repository:
https://github.com/SaraEye
https://github.com/SaraEye/SaraKIT-Voice-ChatGPT-Raspberry-Pi

Pan/Tilt Camera (or Turret Base): https://sarakit.saraai.com/example-of-use/camera-pan-tilt