System Design

System Architecture

High-Level System Architecture

Our project uses a centralised system architecture, which allows us to leverage Bluetooth Low Energy (BLE) capable devices across multiple workstations. The main components shown in the system diagram are:

  • Raspberry Pi — Central computing platform for proximity detection and authentication.
  • Host PC — Device to be authenticated into and where local LLM inferencing occurs in the proximity agents app.
  • ESP32 — Bluetooth-enabled microcontroller for BLE signal emitting and TOTP generation.
  • IBM Cloud Server — For secure data storage and device authentication.

Figure 1: System Architecture Diagram

Together, these components form a scalable and secure architecture that allows for future upgrades and the integration of new proximity agent iterations. By using a Raspberry Pi, we reduce our reliance on the device being authenticated into, which allows developers to easily adapt this system for other use cases (e.g. printers or whiteboard profiles).

IBM Cloud Server

Both our registration website and API are hosted on cloud services. This is the first step in getting a QPG system set up. For our server, we use serverless computing through IBM Cloud Code Engine, which allows us to deploy a flexible and scalable server. This helps keep the costs of the system reasonable while keeping the services accessible at all times. IBM Cloud Code Engine automatically scales the number of running server instances based on the number of incoming requests: if there are no incoming requests, no instances are running, and more instances are started as requests increase.

Figure 2: IBM Code Engine Architecture Diagram

https://cloud.ibm.com/docs/codeengine?topic=codeengine-architecture

By packaging our Litestar server in a Docker container, we also kept our deployment options flexible, allowing us to easily update the server or switch to another cloud provider in the future if needed. The container includes a persistent SQLite database alongside the main code for the endpoints and the image processing that produces the facial recognition encodings.

This container serves as the central backend for all our services. All other components, such as the ESP32, the Raspberry Pi and the Proximity Agents, interact with this core at least once in their lifetime. The ESP32 shares a TOTP secret key with the server, the Raspberry Pi fetches all registered ESP32s from the server, and the Proximity Agents continuously fetch and update user profiles.

ESP32/Registration Site

Our registration site is a simple frontend written with Next.js and the Chakra UI component library that communicates with our main backend on IBM Cloud. It registers a user's ESP32 in the database and records a 5-second video of the user, which is sent to the server to be processed into facial recognition encodings.

The ESP32 communicates with the registration site via the Web Serial API. This allows us to use the user's browser to read the MAC address of the ESP32 and share the secret key with the server, without the ESP32 needing to be connected directly to the server. The downside is that this API is currently only supported on Chromium-based browsers.

The ESP32 also uses its Bluetooth stack to advertise itself as an available device. It initializes itself as a BLE peripheral and broadcasts a specific service UUID that client devices can discover. When a connection is established, it serves a characteristic that provides the TOTP token, which is regenerated every 30 seconds. This TOTP is read by the Raspberry Pi to verify the user's identity.
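The 30-second token rotation described above matches the standard time-based one-time password scheme (RFC 6238). The following is a minimal sketch in Python of how such a token could be generated and checked, assuming HMAC-SHA1 and 6-digit codes; the function names `totp` and `verify_totp`, the `window` tolerance and the choice of hash are illustrative assumptions, not the project's actual firmware or Raspberry Pi code.

```python
import hashlib
import hmac
import struct
import time
from typing import Optional


def totp(secret: bytes, at: Optional[float] = None, step: int = 30, digits: int = 6) -> str:
    """Return the RFC 6238 code for `secret` at Unix time `at` (defaults to now)."""
    if at is None:
        at = time.time()
    counter = int(at) // step                     # index of the current 30 s window
    message = struct.pack(">Q", counter)          # 8-byte big-endian counter
    digest = hmac.new(secret, message, hashlib.sha1).digest()
    offset = digest[-1] & 0x0F                    # dynamic truncation (RFC 4226)
    value = struct.unpack(">I", digest[offset:offset + 4])[0] & 0x7FFFFFFF
    return str(value % 10 ** digits).zfill(digits)


def verify_totp(secret: bytes, token: str, at: Optional[float] = None,
                step: int = 30, digits: int = 6, window: int = 1) -> bool:
    """Accept the token if it matches the current window or `window` neighbours.

    The tolerance for clock drift is an assumption of this sketch.
    """
    if at is None:
        at = time.time()
    for drift in range(-window, window + 1):
        moment = max(0.0, at + drift * step)      # never ask for a negative time
        if hmac.compare_digest(totp(secret, moment, step, digits), token):
            return True
    return False
```

Both sides only need the shared secret and a reasonably accurate clock, which is why the secret is exchanged once at registration and the token itself can be served over BLE.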

Raspberry Pi

The Raspberry Pi is the key component of every QPG-enabled device. It processes incoming BLE signals, sorts them by signal strength, filters MAC addresses and performs facial recognition on incoming users. This is all done in Python, using components that we built in-house to authenticate users securely.
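The filtering and sorting step can be illustrated with a short sketch. It assumes that a scan yields (MAC address, RSSI) pairs and that the set of registered MAC addresses has already been fetched from the server; the `Advertisement` type, the `shortlist` function and the `min_rssi` cut-off are hypothetical names and values chosen for this example.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List, Set


@dataclass(frozen=True)
class Advertisement:
    mac: str
    rssi: int  # signal strength in dBm; values closer to 0 are stronger


def shortlist(ads: Iterable[Advertisement], registered: Set[str],
              min_rssi: int = -80) -> List[Advertisement]:
    """Keep registered devices only, strongest reading per device, strongest first."""
    allowed = {mac.upper() for mac in registered}
    strongest: Dict[str, Advertisement] = {}
    for ad in ads:
        mac = ad.mac.upper()                      # compare MACs case-insensitively
        if mac not in allowed or ad.rssi < min_rssi:
            continue                              # unregistered or too far away
        if mac not in strongest or ad.rssi > strongest[mac].rssi:
            strongest[mac] = Advertisement(mac, ad.rssi)
    return sorted(strongest.values(), key=lambda ad: ad.rssi, reverse=True)
```

Sorting by signal strength puts the device most likely to be closest to the workstation at the front of the list, so the later and more expensive checks can start with the most plausible candidate.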

Proximity Agents

We are pioneering Proximity Prompting through our project. Our first prototype of the Proximity Agents is a desktop application written using Tauri, a Rust-based desktop application framework. This application runs on the host PC and consists of a TypeScript-based Next.js frontend alongside a Rust backend that is invoked by the frontend. The Rust backend allows us to run local LLM inference using Ollama and to communicate with the core server to update and fetch preferences, whilst the frontend displays the chatbot and the preferences according to the user's actions.

Sequence Diagrams

Sequence diagrams illustrate how different components of our system interact over time. The following diagram shows the authentication flow for our Raspberry Pi component, demonstrating how it processes Bluetooth signals, authenticates users, and manages the login process.

Figure 3: Raspberry Pi Authentication Sequence Diagram

Our sequence diagram shows the interactions between different components in our complex authentication process. It shows how the Raspberry Pi detects Bluetooth devices, filters them based on registered MAC addresses, verifies their TOTP tokens, and then performs facial recognition for additional security. After successful authentication, the system retrieves user credentials and executes the login process on the host machine.
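The order of checks in this flow can be summarised as a small Python sketch. The function `authenticate` and its callable parameters are hypothetical stand-ins for the real components (BLE read, TOTP verification, camera capture); the sketch only shows the control flow, under the assumption that cheaper checks run before facial recognition.

```python
from typing import Callable, Iterable, Optional, Set


def authenticate(
    nearby_macs: Iterable[str],
    registered_macs: Set[str],
    totp_is_valid: Callable[[str], bool],
    face_matches: Callable[[str], bool],
) -> Optional[str]:
    """Return the MAC address of the first device that passes every check."""
    for mac in nearby_macs:                  # assumed ordered strongest signal first
        if mac not in registered_macs:       # step 1: ignore unregistered devices
            continue
        if not totp_is_valid(mac):           # step 2: token read over BLE must verify
            continue
        if face_matches(mac):                # step 3: face must match the device owner
            return mac
    return None                              # nobody passed; no login is attempted
```

Only when a MAC address is returned would the system go on to retrieve that user's credentials and execute the login on the host machine.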

Design Patterns

In software development, design patterns are solutions to common problems encountered in software design. They provide a structured approach to writing maintainable, scalable and reusable code by defining proven methodologies that can be applied to different scenarios. In our system, we leverage several design patterns to enhance the modularity and robustness of our code. Below, we explore four key design patterns, namely Client-Server, Delegate, Facade and Observer, detailing how each is implemented in our project.

Client-Server Pattern

This is the main pattern we use in our system. Most of our components rely on the centralised server to fetch up-to-date user data such as credentials, encodings and preferences. We also use this pattern in our Raspberry Pi code, where the main Bluetooth scanner program acts as a server that provides data to a graphical interface showing the shortlisted ESP32 keys and their estimated distance to the Raspberry Pi.

Figure 4: Client-Server Communication
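The text does not state how the distance shown in the graphical interface is estimated. A common approach for BLE is the log-distance path loss model, sketched below in Python as an illustration only; the function name `estimate_distance_m`, the reference RSSI at 1 m and the path loss exponent are assumptions that would need calibrating for real hardware and rooms.

```python
def estimate_distance_m(rssi: float, rssi_at_1m: float = -59.0,
                        path_loss_exponent: float = 2.0) -> float:
    """Estimate distance in metres from an RSSI reading in dBm.

    Log-distance path loss model: rssi = rssi_at_1m - 10 * n * log10(d),
    solved here for d. Both default parameters are placeholder values.
    """
    return 10 ** ((rssi_at_1m - rssi) / (10 * path_loss_exponent))
```

With these placeholder values, a reading equal to the reference RSSI maps to 1 m and every further 20 dB of attenuation multiplies the estimate by ten. RSSI is noisy indoors, so an estimate like this is better suited to ranking nearby devices than to measuring exact distances.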

ER Diagrams

For our core server, we currently rely on a single database with three tables: one identifying the device, one storing the user's accessibility preferences and one storing authentication-related fields. In the future, this will be scaled up with further tables storing other data related to the user's profile.

Figure 5: Entity Relationship Diagram

By default, we initialize a Device with a preferences JSON containing the default values for each accessibility command, which is stored in the Preferences table. The Authentication table stores a nonce that helps the server decrypt the password using a key stored elsewhere, making it difficult for bad actors to steal credentials. The TOTP timestamp and the secret agreed upon during registration are also stored in this table.
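To make the three-table layout concrete, here is a minimal sketch using Python's built-in `sqlite3` module. The table and column names, the use of the MAC address as the key and the shortened `DEFAULT_PREFERENCES` are assumptions for illustration; the actual schema is the one shown in the ER diagram.

```python
import json
import sqlite3

# Hypothetical schema: names and types are illustrative, not the project's own.
SCHEMA = """
CREATE TABLE device (
    mac_address TEXT PRIMARY KEY
);
CREATE TABLE preferences (
    mac_address TEXT PRIMARY KEY REFERENCES device(mac_address) ON DELETE CASCADE,
    settings    TEXT NOT NULL                -- preferences stored as JSON text
);
CREATE TABLE authentication (
    mac_address    TEXT PRIMARY KEY REFERENCES device(mac_address) ON DELETE CASCADE,
    nonce          BLOB NOT NULL,            -- used with a key held elsewhere
    totp_secret    BLOB NOT NULL,
    totp_timestamp INTEGER NOT NULL
);
"""

# Shortened stand-in for the full default preferences document.
DEFAULT_PREFERENCES = {"zoom": {"lower_bound": 0.5, "upper_bound": 3.0, "current": 1.0}}


def register_device(conn: sqlite3.Connection, mac: str, nonce: bytes,
                    totp_secret: bytes, registered_at: int) -> None:
    """Insert a device with default preferences and its authentication row."""
    with conn:  # single transaction: either all three rows are written or none
        conn.execute("INSERT INTO device VALUES (?)", (mac,))
        conn.execute("INSERT INTO preferences VALUES (?, ?)",
                     (mac, json.dumps(DEFAULT_PREFERENCES)))
        conn.execute("INSERT INTO authentication VALUES (?, ?, ?, ?)",
                     (mac, nonce, totp_secret, registered_at))


conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")     # SQLite leaves this off by default
conn.executescript(SCHEMA)
register_device(conn, "AA:BB:CC:DD:EE:01", b"\x00" * 12, b"shared-secret", 1700000000)
```

Writing all three rows in one transaction means a device never exists without its default preferences, which matches the "by default" behaviour described above.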

Data Storage

As explained in the IBM Cloud Server section, we have containerized our server and deployed it as a serverless application on IBM Code Engine. As our database is currently very small, we have opted to use an SQLite database that is part of the container. This approach not only simplifies the deployment of our application but also saves cost, since a separate database service is not needed.

Figure 6: SQLite Logo

In future iterations, moving to a managed database service such as IBM Cloud Databases for PostgreSQL or AWS RDS will be essential for scalability. However, due to the budget and time constraints of our project, we chose SQLite for the proof of concept.

Preferences are stored as a JSON document in the database:

{
  "zoom": {
    "lower_bound": 0.5,
    "upper_bound": 3.0,
    "current": 1.0,
    "commands": {
      "windows": "",
      "macos": "",
      "gnome": "gsettings set org.gnome.desktop.interface text-scaling-factor"
    }
  },
  "on_screen_keyboard": {
    "lower_bound": null,
    "upper_bound": null,
    "current": false,
    "commands": {
      "windows": "",
      "macos": "",
      "gnome": "gsettings set org.gnome.desktop.a11y.applications screen-keyboard-enabled"
    }
  },
  "magnifier": {
    "lower_bound": 0.1,
    "upper_bound": 32.0,
    "current": 1.0,
    "commands": {
      "windows": "",
      "macos": "",
      "gnome": "gsettings set org.gnome.desktop.a11y.magnifier mag-factor"
    }
  },
  "enable_animation": {
    "lower_bound": null,
    "upper_bound": null,
    "current": true,
    "commands": {
      "windows": "",
      "macos": "",
      "gnome": "gsettings set org.gnome.desktop.interface enable-animations"
    }
  },
  "screen_reader": {
    "lower_bound": null,
    "upper_bound": null,
    "current": false,
    "commands": {
      "windows": "",
      "macos": "",
      "gnome": "gsettings set org.gnome.desktop.a11y.applications screen-reader-enabled"
    }
  },
  "cursor_size": {
    "lower_bound": 0.0,
    "upper_bound": 128.0,
    "current": 24.0,
    "commands": {
      "windows": "",
      "macos": "",
      "gnome": "gsettings set org.gnome.desktop.interface cursor-size"
    }
  },
  "font_name": {
    "lower_bound": null,
    "upper_bound": null,
    "current": "Cantarell 11",
    "commands": {
      "windows": "",
      "macos": "",
      "gnome": "gsettings set org.gnome.desktop.interface font-name"
    }
  },
  "locate_pointer": {
    "lower_bound": null,
    "upper_bound": null,
    "current": false,
    "commands": {
      "windows": "",
      "macos": "",
      "gnome": "gsettings set org.gnome.desktop.interface locate-pointer"
    }
  }
}
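One plausible way to read this structure is that each entry pairs a value with a per-platform command prefix, and the bounds limit numeric values. The Python sketch below turns an entry and a requested value into a command string under that reading; `build_command` is a hypothetical helper, and how the Proximity Agents application actually applies preferences is not shown in this section.

```python
import shlex
from typing import Any, Dict


def build_command(entry: Dict[str, Any], platform: str, requested: Any) -> str:
    """Build the shell command that would apply `requested` for one preference."""
    template = entry["commands"].get(platform, "")
    if not template:
        raise ValueError(f"no command defined for platform {platform!r}")
    lower, upper = entry["lower_bound"], entry["upper_bound"]
    if isinstance(requested, bool):           # check bool first: bool is an int subtype
        argument = "true" if requested else "false"
    elif isinstance(requested, (int, float)):
        if lower is not None:
            requested = max(lower, requested)  # clamp to the stored bounds
        if upper is not None:
            requested = min(upper, requested)
        argument = str(requested)
    else:
        argument = shlex.quote(str(requested))  # e.g. a font name containing spaces
    return f"{template} {argument}"


zoom = {
    "lower_bound": 0.5, "upper_bound": 3.0, "current": 1.0,
    "commands": {"windows": "", "macos": "",
                 "gnome": "gsettings set org.gnome.desktop.interface text-scaling-factor"},
}
command = build_command(zoom, "gnome", 5.0)   # 5.0 is above upper_bound, so it is clamped
```

Integer-typed settings such as `cursor-size` may need the value converted to an integer before it is passed to `gsettings`; that conversion is left out of this sketch.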

APIs

Server

We defined 11 API endpoints on our server in total, covering operations such as registering a new device, managing preferences and facial recognition.

Authentication

Device Management

Preferences Management

Key Exchange Mechanism (KEM)

Face Recognition