OrateVR

    Stage fright (or performance anxiety) is the fear, anxiety, or persistent phobia that arises when an individual is required to perform in front of an audience.

    It can be alleviated by consistent practice and repetition, but people often lack the motivation and the means to practice. There is a clear need for a way for people to practice, get feedback on themselves, and improve.

    Immersive technology like VR simulates the biggest trigger of stage fright: being in front of an audience. Along with accurate VR scenes and avatars, our solution uses sensors that track the user's anxiety levels, enabling users to assess and improve themselves.

    Device that houses the sensors, circuit board, and the battery

    Intended Users:

    Primary target users:

      • People suffering from glossophobia (fear of public speaking)
      • People with speech disorders
      • People suffering from social anxiety disorder
      • People with fluency disorders (stuttering/stammering)

    Secondary target users:

      • People learning a new language
      • People with little public speaking experience
      • People in professions that involve frequent interaction (sales, teaching, management)

    Hardware Architecture:

    The hardware architecture for OrateVR includes a PPG (photoplethysmography) sensor and a GSR (galvanic skin response) sensor connected to an Arduino Nano 33 IoT.

    Hardware diagram of sensor and circuit board configuration

    Initial Prototype:

    Initial prototype based on Raspberry Pi, Arduino Uno, and local Python scripts

    Final Prototype:

    Custom circuit boards are designed in Fusion 360 and manufactured using the OtherMill.

    PCB prototyping

    Completed PCB that houses the Arduino Nano and connects to the PPG, GSR, and the battery

    Final model with the Nano sending data to the cloud over WiFi

    Software Architecture:

    Software architecture showing the database connecting to the Python scripts running on Azure, the front-end dashboard, and the VR headsets

    The Python scripts running on Azure servers process the data from Firebase. They perform the following functions (a sketch of the filtering step follows the list):

      • Remove noise from the data using bandpass and moving-average filters
      • Use a machine learning model to predict the user's mood
      • Post the processed data and the predicted mood back to Firebase
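
    The filtering step might look like the following Python sketch using SciPy. The sampling rate, cutoff frequencies, and input file are illustrative assumptions, not the project's actual values.

        import numpy as np
        from scipy.signal import butter, filtfilt

        FS = 20.0  # assumed PPG sampling rate in Hz

        def bandpass(signal, low=0.7, high=3.5, fs=FS, order=3):
            # Keep only frequencies plausible for heartbeats (~42-210 bpm);
            # these cutoffs are illustrative.
            nyq = 0.5 * fs
            b, a = butter(order, [low / nyq, high / nyq], btype="band")
            return filtfilt(b, a, signal)

        def moving_average(signal, window=5):
            # Smooth residual jitter with a simple moving average.
            kernel = np.ones(window) / window
            return np.convolve(signal, kernel, mode="same")

        raw_ppg = np.loadtxt("ppg_samples.csv")  # placeholder: one sample per line
        clean_ppg = moving_average(bandpass(raw_ppg))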

    C# scripts running in Unity pull this information from the Firebase database and show it to the user in the VR application.

    The dashboard (built with React, Plotly Dash, and Tailwind CSS) displays the real-time sensor data and the user's mood.
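
    On the Plotly Dash side, a live view can be driven by polling. The sketch below stubs out the Firebase read with dummy data; every name in it is hypothetical.

        import dash
        from dash import dcc, html
        from dash.dependencies import Input, Output
        import plotly.graph_objects as go

        def fetch_latest_from_firebase():
            # Stub: the real app would read recent samples and the predicted
            # mood from Firebase; dummy data is returned here.
            return [72, 74, 71, 73, 75], "normal"

        app = dash.Dash(__name__)
        app.layout = html.Div([
            html.H2("OrateVR Session"),
            html.Div(id="mood-label"),
            dcc.Graph(id="ppg-graph"),
            dcc.Interval(id="poll", interval=1000, n_intervals=0),  # poll every second
        ])

        @app.callback(
            Output("ppg-graph", "figure"),
            Output("mood-label", "children"),
            Input("poll", "n_intervals"),
        )
        def refresh(_):
            samples, mood = fetch_latest_from_firebase()
            fig = go.Figure(go.Scatter(y=samples, mode="lines", name="PPG"))
            return fig, f"Current mood: {mood}"

        if __name__ == "__main__":
            app.run(debug=True)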

    Data Refinement:

    Raw signal (red) and filtered signal after noise removal (blue)

    The raw PPG data (red) is filtered using a bandpass filter, which removes frequencies above and below specified cutoffs, leaving only the band where heartbeats occur.

    A second, amplitude-based filter removes samples above and below certain thresholds, i.e., PPG values that cannot be heartbeats.
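
    A sketch of that amplitude gate; the thresholds below are placeholders and would be tuned to the sensor in practice.

        import numpy as np

        def amplitude_filter(signal, low_thresh=-0.5, high_thresh=0.5):
            # Mask samples outside the plausible pulse range and replace them
            # with the mean of the in-range samples, so spikes and dropouts
            # are not mistaken for heartbeats.
            filtered = np.asarray(signal, dtype=float).copy()
            mask = (filtered < low_thresh) | (filtered > high_thresh)
            filtered[mask] = filtered[~mask].mean()
            return filtered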

    The final heartbeats detected corresponding to the PPG signals

    Heart rates are then detected using the HeartPy library, which is built on SciPy.
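
    With HeartPy, that step is short. Continuing from the filtering sketch above, with the same assumed 20 Hz sampling rate:

        import heartpy as hp

        working_data, measures = hp.process(clean_ppg, sample_rate=20.0)
        print("Estimated heart rate: %.1f bpm" % measures["bpm"])
        # working_data["peaklist"] holds the sample indices of the detected beats.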

    ML for Mood Prediction:

    A dense neural network classifies the user's mood into three buckets: stressed, distracted, and normal. This section details how the model was arrived at.

    The training data comes from Kaggle (SWELL dataset | Kaggle), a study of 25 subjects doing typical office work while measurements were taken with various instruments: EEG, body posture, skin conductance (which corresponds to GSR), and more.
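
    To check which sensor-derived inputs actually track the labels, Pearson's correlation can be computed with pandas. In this sketch, the file and column names are assumptions, not the dataset's real schema.

        import pandas as pd

        df = pd.read_csv("swell_features.csv")  # hypothetical arranged export
        # Encode the condition label (stressed / distracted / normal) as an
        # integer code so it can enter the correlation matrix.
        df["label_code"] = df["condition"].astype("category").cat.codes
        corr = df.corr(method="pearson", numeric_only=True)["label_code"]
        print(corr.drop("label_code").sort_values(ascending=False))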

    Pearson's correlation measuring how each input parameter relates to the outputs

    Once the dataset was obtained and the relevant sensor features were arranged, the model had to be constructed and trained.

    Through trial and error, it was concluded that a combination of 18 nodes in the first hidden layer and 10 nodes in the second gave the best accuracy.
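
    A minimal Keras sketch of that network: two hidden layers of 18 and 10 nodes and a three-way softmax output. The input width, placeholder data, and training settings are assumptions.

        import numpy as np
        import tensorflow as tf

        n_features = 6  # assumed number of sensor-derived inputs
        X_train = np.random.rand(200, n_features).astype("float32")  # placeholder data
        y_train = np.random.randint(0, 3, size=200)                  # placeholder labels

        model = tf.keras.Sequential([
            tf.keras.Input(shape=(n_features,)),
            tf.keras.layers.Dense(18, activation="relu"),
            tf.keras.layers.Dense(10, activation="relu"),
            tf.keras.layers.Dense(3, activation="softmax"),  # stressed / distracted / normal
        ])
        model.compile(optimizer="adam",
                      loss="sparse_categorical_crossentropy",
                      metrics=["accuracy"])
        model.fit(X_train, y_train, epochs=50, validation_split=0.2)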

    Dense neural network with hidden nodes connecting the input and output layers

    Accuracy of the machine learning model

    The model's classifications are sent back to Firebase, to be displayed on the dashboard and to the user in VR.
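
    One possible shape for that write-back, assuming a Firebase Realtime Database reached over its REST API; the project URL and data path are placeholders.

        import requests

        DB_URL = "https://example-project.firebaseio.com"  # placeholder URL

        def post_mood(session_id, mood, heart_rate):
            # PATCH merges these fields into the session record.
            payload = {"mood": mood, "heart_rate": heart_rate}
            resp = requests.patch(f"{DB_URL}/sessions/{session_id}.json", json=payload)
            resp.raise_for_status()

        post_mood("demo-session", "stressed", 92.4)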

    OrateVR in Action:

    The following are some images of the final product:

    Oculus Application:

    The following is the interface the user sees in the Oculus. The scene is a classroom with other students, a common place where a typical user would want to practice a speech.

    The tablet on the screen moves with the Oculus controller.

    Oculus app with an internal interface that shows real-time data of the user and text to practice speech with

    Web Application:

    The web application shows the user's speech session on the dashboard in real time, displaying the sensor values and the mood inferred from them.

    Dashboard showing real-time sensor values and user mood during a speech session

    Conclusion:

    This project explored the application of emerging technology and virtual reality to speech development. Reflecting on the journey, we realized we could have enhanced the user experience by including speech and fluency analysis as part of the feedback, which we look forward to adding in upcoming projects.