Object detection is another building block of our project, one that lets us extend our ChatGPT-powered voice assistant with many new features. The assistant not only hears but also sees, and it can ask questions based on what it sees.

SaraKIT Object Detection for Raspberry Pi 4 CM4

SaraKIT is an easy-to-use object detection solution for Raspberry Pi 4 CM4, powered by state-of-the-art algorithms based on MediaPipe from Google, specifically optimized for the Raspberry Pi 64-bit platform.

MediaPipe Object Detection

The MediaPipe Object Detector task lets you detect the presence and location of multiple classes of objects within images or videos. For example, an object detector can locate dogs within an image. Each detection result represents an object that appears within the image or video.
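
To make the idea of a detection result concrete, here is a minimal sketch of how such results might be represented and filtered by confidence. The `Detection` struct and `filterByScore` function are hypothetical names chosen for illustration; they are not the types used by MediaPipe or by SaraKIT's `mpobject::Object`, whose actual fields may differ.

```cpp
#include <string>
#include <vector>

// Hypothetical detection record (an assumption for illustration only;
// the real mpobject::Object in SaraKIT may have different fields).
struct Detection {
    std::string label;  // class name, e.g. "dog"
    float score;        // confidence in the range [0, 1]
    float x, y, w, h;   // bounding box: top-left corner and size, in pixels
};

// Return only the detections whose confidence is at least minScore.
std::vector<Detection> filterByScore(const std::vector<Detection>& all,
                                     float minScore) {
    std::vector<Detection> kept;
    for (const Detection& d : all) {
        if (d.score >= minScore) {
            kept.push_back(d);
        }
    }
    return kept;
}
```

Filtering on a score threshold like this is a common way to drop weak detections before drawing boxes or passing labels to another component; the right threshold depends on the model and the scene.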

To utilize SaraKIT for object detection, follow these steps:

  1. Clone the repository, compile the code by running the command 'make', and execute the program:
    git clone https://github.com/SaraEye/SaraKIT-MediaPipe-Object-Detection-Raspberry-Pi-64bit ObjectDetection
    cd ObjectDetection
    make
  2. The program captures frames from the camera, processes them, and sends the output.
  3. Preview the operation in your web browser by accessing the Raspberry Pi's IP address followed by port 7777 (e.g., http://raspberrypi:7777/).
  4. If you have the Linux Desktop version and want to display the image from the camera in a window, change this line:
    init_viewer(ViewMode::Camera0,ViewMode::Processed, 1, true, false);
  5. The browser preview displays one or two images side by side, and the content of each image can be customized. By default, the left image shows the camera preview, while the right image shows the processed frame with the detected objects marked. Refer to the video below for a similar visualization.

Both the standard Raspberry Pi MMAL functions and OpenCV functions (slightly slower) can be used to capture frames from the camera.

#include <iostream>
#include <signal.h>
#include <stdio.h>
#include <math.h>
#include <arm_neon.h>
#include "unistd.h"

#include "struct.hpp"
#include "lib/viewer/viewer.hpp"
#include "lib/mediapipe/Mediapipe.hpp"

using namespace std;
cv::Mat frame0, frame0Gray, frame0GrayHalf, frame0GrayHalfEdge; // cam0
cv::Mat frame1, frame1Gray, frame1GrayHalf, frame1GrayHalfEdge; // cam1
cv::Mat imgProcessed;

ViewerStatus viewStatus;

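// Ctrl-C handler: report the signal and ask the main loop to stop
void ctrlc_handler(sig_atomic_t s){
    printf("\nCaught signal %d\n",s);
    control_c = true; // assumed: control_c is the stop flag tested by the main loop
}
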
int main(int argc, char** argv){
    signal(SIGINT, ctrlc_handler); // register the Ctrl-C handler defined above

    imgProcessed=cv::Mat(camheight, camwidth, CV_8UC3);

    init_camera(0, camwidth, camheight, false, false, true, true, true);


    std::vector<mpobject::Object> objects;

    MediapipeObject mo;
    FrameResultSynchroniser objectSync(200);

    printf("Start Loop\n");
    do {
        // Get frame to frame,frameGray,frameGrayHalf
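        if (GetFrame()==0) { // 0 = no new frame (1 = new frame from cam0, 2 = from cam1, 3 = from both)
            // here you have time to do anything
            usleep(100); // assumed: brief pause before polling the camera again
            continue;    // assumed: skip processing until a new frame arrives
        }
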
        int reso=objectSync.getFrameFromId(mo.getObjects(objects),imgProcessed);

        if (reso&&objects.size()) {
            // objects now holds the detections, presumably for the frame in imgProcessed
        }

        viewStatus = viewer_refresh();
    } while (viewStatus != ViewerStatus::Exit && control_c != true);
    return 0;
}

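
The call `objectSync.getFrameFromId(mo.getObjects(objects), imgProcessed)` in the loop suggests that detection results arrive some time after a frame is captured and are then paired with the frame they were computed from. A minimal sketch of that idea follows, assuming frames are tagged with increasing ids and only the most recent ones are kept. The `FrameBuffer` class is hypothetical and is not SaraKIT's `FrameResultSynchroniser`, whose implementation is not shown here.

```cpp
#include <cstddef>
#include <deque>
#include <utility>

// Hypothetical sketch: remembers the most recent `capacity` frames by id so
// that a result arriving later can be matched with the frame it belongs to.
template <typename Frame>
class FrameBuffer {
public:
    explicit FrameBuffer(std::size_t capacity) : capacity_(capacity) {}

    // Store a copy of the frame and return the id assigned to it.
    int push(const Frame& frame) {
        int id = nextId_++;
        frames_.emplace_back(id, frame);
        // Drop the oldest frames once the buffer exceeds its capacity.
        while (frames_.size() > capacity_) {
            frames_.pop_front();
        }
        return id;
    }

    // Copy the frame with the given id into `out`.
    // Returns false if that frame was never stored or has been dropped.
    bool getFrameFromId(int id, Frame& out) const {
        for (const auto& entry : frames_) {
            if (entry.first == id) {
                out = entry.second;
                return true;
            }
        }
        return false;
    }

private:
    std::size_t capacity_;
    int nextId_ = 1;  // ids start at 1 so that 0 can mean "no result"
    std::deque<std::pair<int, Frame>> frames_;
};
```

With a buffer like this, the preview can draw detections on the exact frame they were computed from rather than on a newer one, which avoids boxes lagging behind moving objects.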
You can find C++ and Python code for Raspberry Pi 4 in the SaraKIT GitHub repository.