Related Projects Review
What is the Tobii Eye Tracker?
- The Tobii Eye Tracker is an eye-tracking device and platform developed by Tobii Technology. It enables the monitoring and analysis of eye movements for applications such as research, usability testing, gaming, and accessibility.
Main Features:
- High Accuracy: The Tobii Eye Tracker offers high precision and accuracy in tracking eye movements, allowing for detailed analysis of gaze patterns.
- Real-time Tracking: It provides real-time tracking of eye movements, enabling immediate feedback and analysis during interaction with digital content or physical environments.
- Wide Compatibility: Tobii Eye Tracker devices are compatible with a range of hardware setups, including monitors, laptops, and VR headsets, making them versatile for different use cases.
- Software Integration: Tobii offers software development kits (SDKs) and APIs for integrating eye-tracking functionality into custom applications and software solutions; see the sketch after this list.
- User-Friendly Setup: The setup process for Tobii EyeTracker devices is relatively straightforward, with user-friendly calibration procedures for accurate tracking.
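As an illustration of the software-integration point above, the following minimal sketch subscribes to a gaze-data stream, assuming the Tobii Pro SDK's Python bindings (tobii_research). The exact package, constants, and dictionary keys vary by SDK version, so they should be checked against Tobii's documentation.

```python
# Minimal sketch: stream gaze samples via the Tobii Pro SDK's Python
# bindings. Constants and dictionary keys are assumptions based on the
# public SDK and may differ by version.
import time
import tobii_research as tr

def on_gaze(gaze_data):
    # Gaze points arrive as normalised display coordinates in [0, 1].
    print(gaze_data["left_gaze_point_on_display_area"],
          gaze_data["right_gaze_point_on_display_area"])

trackers = tr.find_all_eyetrackers()
if not trackers:
    raise RuntimeError("No Tobii eye tracker found")

tracker = trackers[0]
tracker.subscribe_to(tr.EYETRACKER_GAZE_DATA, on_gaze, as_dictionary=True)
time.sleep(5)  # receive samples asynchronously for five seconds
tracker.unsubscribe_from(tr.EYETRACKER_GAZE_DATA, on_gaze)
```

Note the callback-based design: samples arrive asynchronously at the tracker's native sampling rate, a useful contrast with the polling loop of a webcam-based approach.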
What We Can Learn from the Tobii Eye Tracker
- Algorithm Development: Understanding the algorithms and techniques used for accurate eye tracking can inform the development of our own software.
- Calibration Methods: Learning about the calibration procedures used in commercial eye-tracking systems can help in devising efficient calibration methods for webcam-based solutions; a sketch of one such method appears at the end of this subsection.
- Integration Challenges: Examining how Tobii integrates its eye-tracking technology with different hardware and software platforms can provide insights into compatibility and integration challenges.
- Use Cases and Applications: Exploring the diverse applications of the Tobii Eye Tracker can inspire potential use cases for our own webcam-based eye-tracking software.
By studying the Tobii Eye Tracker and similar commercial solutions, we can gain valuable insights to inform the development of our own eye-tracking software using a webcam.
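To make the calibration point above concrete, here is a minimal sketch of a 9-point webcam calibration. It assumes some landmark-based feature extractor is available; get_eye_features is a hypothetical placeholder, and ridge regression is one reasonable choice of mapping, not the only one.

```python
# Hypothetical 9-point calibration: map an eye-feature vector (e.g. a
# normalised iris position) to screen coordinates with ridge regression.
import numpy as np
from sklearn.linear_model import Ridge

def get_eye_features() -> np.ndarray:
    """Placeholder: return the current eye-feature vector from the webcam."""
    raise NotImplementedError

def calibrate(screen_w: int, screen_h: int) -> Ridge:
    features, targets = [], []
    # A 3x3 grid of fixation targets covering most of the screen.
    for gx in (0.1, 0.5, 0.9):
        for gy in (0.1, 0.5, 0.9):
            # In a real system: draw the dot at (gx, gy), wait for the
            # user to fixate, then average several frames of features.
            features.append(get_eye_features())
            targets.append((gx * screen_w, gy * screen_h))
    model = Ridge(alpha=1.0)
    model.fit(np.asarray(features), np.asarray(targets))
    return model  # model.predict(features) -> predicted (x, y) points
```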
Existing Solution Review
We began by examining the current repository of the existing eye gaze module within MI v3.2. This build was primarily the work of Guari Desai, whom we met with to clarify questions about her codebase. The basic version centres on the file eye_tracking_mediapipe.py, which she described as able to "track different parts of the eye accurately". The following is a breakdown of how the code in this file operates:
- The code reads configuration data from a JSON file and initialises the necessary components: the webcam, MediaPipe's FaceMesh model, and PyAutoGUI for cursor control.
- It creates a small window with a red dot at the centre of the screen, prompting the user to focus on it for calibration purposes.
- During calibration, the code tracks the movement of the user's eyes and records their average position as the default cursor centre.
- After calibration, the code continuously captures video frames from the webcam, processes them to detect facial landmarks, and calculates the position of the user's eyes.
- Depending on the user's settings, the code adjusts the cursor position based on the detected eye movement. It also implements double-click control through facial gestures such as winking or smiling.
- The code displays the processed video frames with visualisations of the detected eye landmarks and cursor movement.
- The program runs indefinitely until the user closes the window or exits the application.
Overall, this code provides a basic framework for implementing eye gaze navigation using facial landmark detection and cursor control, catering to various user preferences and interaction patterns. A condensed sketch of this loop is given below.
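The following sketch reconstructs that loop with MediaPipe FaceMesh and PyAutoGUI. It is our simplification rather than the original code: the iris landmark index (468, available when refine_landmarks=True) and the GAIN constant are illustrative assumptions, and drawing the calibration dot is omitted.

```python
# Simplified reconstruction of the basic tracking loop: FaceMesh iris
# landmarks drive the cursor via PyAutoGUI. Landmark 468 (left iris
# centre) and GAIN are illustrative assumptions, not the original values.
import cv2
import mediapipe as mp
import pyautogui

SCREEN_W, SCREEN_H = pyautogui.size()
GAIN = 2.0  # assumed sensitivity multiplier

face_mesh = mp.solutions.face_mesh.FaceMesh(max_num_faces=1,
                                            refine_landmarks=True)
cap = cv2.VideoCapture(0)

def iris_xy(frame):
    """Return the left iris centre in normalised image coordinates."""
    results = face_mesh.process(cv2.cvtColor(frame, cv2.COLOR_BGR2RGB))
    if not results.multi_face_landmarks:
        return None
    lm = results.multi_face_landmarks[0].landmark[468]
    return lm.x, lm.y

# Calibration: average the iris position while the user fixates a dot
# at the screen centre (drawing the dot is omitted here).
samples = []
while len(samples) < 30:
    ok, frame = cap.read()
    if not ok:
        raise RuntimeError("Webcam read failed")
    p = iris_xy(frame)
    if p:
        samples.append(p)
cx = sum(x for x, _ in samples) / len(samples)
cy = sum(y for _, y in samples) / len(samples)

# Main loop: move the cursor in proportion to the iris offset from the
# calibrated centre; press 'q' to quit.
while True:
    ok, frame = cap.read()
    if not ok:
        break
    p = iris_xy(frame)
    if p:
        pyautogui.moveTo(SCREEN_W * (0.5 + GAIN * (p[0] - cx)),
                         SCREEN_H * (0.5 + GAIN * (p[1] - cy)))
    cv2.imshow("eye tracking", frame)
    if cv2.waitKey(1) & 0xFF == ord("q"):
        break
cap.release()
cv2.destroyAllWindows()
```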
The advanced version operates through two files: gaze_tracking.py, "the main file to run to detect the gaze vector", and gaze.py, "the code that takes landmarks and other values to return the gaze vector."
- Starting with 'gaze_tracking.py', this script utilises the MediaPipe library to detect facial landmarks and track the user's gaze. It initialises a camera stream and processes each frame using the MediaPipe Face Mesh model to identify relevant facial landmarks. The detected landmarks are then passed to the 'gaze.py' script for gaze estimation.
- In 'gaze.py', the gaze estimation process begins by calculating the relative positions of facial landmarks and mapping them to 3D model points. These points are used to estimate the rotation and translation vectors, which describe the orientation and position of the user's head relative to the camera. By projecting the gaze vector onto the image plane, the script determines the direction of the user's gaze and visualises it by drawing a line from the pupil to the estimated gaze point on the screen; a sketch of this pose-estimation pattern follows this list.
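The rotation/translation step described above matches OpenCV's standard perspective-n-point (solvePnP) workflow. The sketch below shows that pattern in isolation; the generic 3D face model, the focal-length approximation, and the zero-distortion assumption are illustrative choices, not values taken from the repository.

```python
# Sketch of solvePnP-based head pose / gaze-direction estimation. The
# 3D model points are a common generic face model; the camera matrix
# is a rough approximation (focal length ~ frame width).
import cv2
import numpy as np

MODEL_POINTS = np.array([
    (0.0, 0.0, 0.0),           # nose tip
    (0.0, -330.0, -65.0),      # chin
    (-225.0, 170.0, -135.0),   # left eye outer corner
    (225.0, 170.0, -135.0),    # right eye outer corner
    (-150.0, -150.0, -125.0),  # left mouth corner
    (150.0, -150.0, -125.0),   # right mouth corner
])

def estimate_gaze_endpoint(image_points, frame_size):
    """image_points: 6x2 array of 2D landmarks matching MODEL_POINTS."""
    h, w = frame_size
    camera_matrix = np.array([[w, 0, w / 2],
                              [0, w, h / 2],
                              [0, 0, 1]], dtype=np.float64)
    dist_coeffs = np.zeros((4, 1))  # assume no lens distortion
    ok, rvec, tvec = cv2.solvePnP(
        MODEL_POINTS, np.asarray(image_points, dtype=np.float64),
        camera_matrix, dist_coeffs)
    if not ok:
        return None
    # Project a point 1000 units in front of the nose along the head
    # axis; the line from the nose landmark to this projection gives
    # the drawn gaze direction.
    end, _ = cv2.projectPoints(np.array([(0.0, 0.0, 1000.0)]),
                               rvec, tvec, camera_matrix, dist_coeffs)
    return tuple(end[0, 0].astype(int))
```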
While the code demonstrates a functional implementation of gaze tracking, there are several areas that warrant critical evaluation and improvement:
- Firstly, the code lacks extensive documentation and comments, making it challenging for developers to understand the underlying logic and functionality. Improved documentation would enhance code readability and facilitate easier maintenance and debugging.
- Furthermore, the gaze estimation algorithm implemented in 'gaze.py' may not be optimised for accuracy and robustness in various lighting conditions or user environments. As a critical component of an eye gaze tracking system, the accuracy of gaze estimation is paramount for reliable user interaction. Therefore, rigorous testing and validation procedures are necessary to assess the algorithm's performance across diverse scenarios and datasets.
- Additionally, the code could benefit from refactoring and modularisation to improve code organisation and maintainability. Breaking the functionality down into smaller, reusable components would enhance code reusability and facilitate future enhancements or modifications, as sketched below.
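As one possible shape for such a refactor, the skeleton below separates landmark detection, gaze estimation, and cursor control; all class and method names are our suggestions, not existing code.

```python
# One possible decomposition of the existing scripts into reusable
# components. All names here are suggestions, not existing code.
from dataclasses import dataclass

@dataclass
class GazeEstimate:
    x: float  # screen x coordinate
    y: float  # screen y coordinate

class LandmarkDetector:
    """Wraps MediaPipe FaceMesh; returns facial landmarks per frame."""
    def detect(self, frame):
        ...

class GazeEstimator:
    """Maps landmarks to a gaze estimate (solvePnP, regression, etc.)."""
    def estimate(self, landmarks) -> GazeEstimate:
        ...

class CursorController:
    """Applies smoothing and moves the OS cursor (e.g. via PyAutoGUI)."""
    def move(self, estimate: GazeEstimate):
        ...
```

A decomposition like this would also make the gaze estimator testable in isolation against recorded landmark data, which bears directly on the validation concern raised above.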
Overall, while the provided code lays the groundwork for an eye gaze tracking system, critical evaluation and refinement are essential to address potential limitations and ensure the system's effectiveness, accuracy, and usability in real-world applications.