Requirements

Project background, clients and goals

Voice assistants are becoming an increasingly common way of interacting with the technology around us; however, existing options are dominated by cloud-based voice assistants. There is a significant need for a federated, privacy-safe voice assistant framework that developers can adapt and extend in their own projects for use in a multitude of settings, for example health and social care, combatting social isolation, childcare and retail.

In partnership with our clients – Prof John McNamara (IBM), Prof Joseph Connor (NHS), Dr Dean Mohamedally (UCL), Dr Graham Roberts (UCL), Sheena Visram (UCL) and Alan Fish (Apperta Foundation) – we have been aiming to build Ask Bob: a customisable, open-source framework for developing federated, privacy-safe voice assistants.

Our primary goal has been for voice assistants built with our proof-of-concept Ask Bob framework to be able to perform all speech and data processing locally in order to help safeguard users’ privacy. We have been aiming for it to have a modular skills plugin architecture so that third-party developers in the future could package together related ‘skills’ to extend the functionality of the voice assistant framework. Moreover, we have also been aiming to assemble a proof-of-concept ‘smart speaker’-style device running Ask Bob with a few plugins for demonstrative purposes.

We also set the goal of making plugin creation more accessible to developers by accompanying the framework with a configuration generator progressive web app, which may be installed and used offline and which stores and processes plugin configuration data locally. This utility would help developers produce valid plugin configurations.

Later in the project, we set the additional goal of building a ‘skills viewer’ progressive web app, which would allow administrators to view the skills installed on a running instance of an Ask Bob voice assistant server, sorted by plugin or category.
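The grouping behaviour described for the skills viewer can be sketched in a few lines of Python. This is a minimal illustration only: the `plugin`, `category` and `description` field names and the sample skills are assumptions made for the example, not the data format an Ask Bob server actually returns.

```python
from collections import defaultdict


def group_skills(skills, key):
    """Group skill descriptions under the value of `key` ('plugin' or 'category')."""
    groups = defaultdict(list)
    for skill in skills:
        groups[skill[key]].append(skill["description"])
    # Sort group names and the descriptions within each group for stable display.
    return {name: sorted(groups[name]) for name in sorted(groups)}


# Hypothetical sample data; field names and values are illustrative only.
installed_skills = [
    {"plugin": "shop", "category": "retail", "description": "Find a product"},
    {"plugin": "clock", "category": "utilities", "description": "Tell the time"},
    {"plugin": "shop", "category": "retail", "description": "Check opening hours"},
]

by_plugin = group_skills(installed_skills, "plugin")
by_category = group_skills(installed_skills, "category")
```

Using one function for both views keeps the two sort orders consistent; only the grouping key changes.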

Requirement gathering

To identify and then prioritise our requirements, we interviewed both our clients and friends representing potential use case environments: care homes, the NHS, retail and childcare settings. We chose to conduct recorded, semi-structured interviews because they offered us greater flexibility to probe deeper and to encourage our interviewees to elaborate on crucial details than alternatives such as online surveys.

We based our interviews on a shared interview template used by the whole team. This gave us a consistent common language for collating and comparing our research findings, which allowed us to work more efficiently and effectively. After the initial round of requirement gathering, we continued to clarify, validate and update our model of users’ needs within our sprints.

We have included further detail on the requirements we initially uncovered through interviews with our clients on our development blog:

Personas and scenarios

Carol Davidson - voice assistant user in a supermarket

Jim Wilson - voice assistant administrator in a hospital

Use cases

[Figure: skills viewer use case diagram (skills_viewer_use_case.png)]

[Figure: configuration generator use case diagram (config_use_case.png)]

ID Use case
1 Defining plugin info
2 Defining entities
3 Defining slots
4 Defining intents
5 Defining synonyms
6 Defining regexes
7 Defining lookups
8 Defining responses
9 Defining skills
10 Defining stories
11 Exporting JSON
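The final use case, exporting JSON, can be illustrated with a minimal Python sketch. The section names below simply mirror the ten "defining" use cases in the table, and every key and value in the sample (such as `intent_id` and `response_id`) is hypothetical; the real Ask Bob plugin configuration schema may be structured differently.

```python
import json

# Hypothetical section names, one per use case above; the real Ask Bob
# configuration schema may use different keys and structure.
SECTIONS = ("plugin", "entities", "slots", "intents", "synonyms",
            "regexes", "lookups", "responses", "skills", "stories")


def export_plugin_config(config):
    """Return the configuration as JSON text, rejecting incomplete input."""
    missing = [name for name in SECTIONS if name not in config]
    if missing:
        raise ValueError("missing sections: " + ", ".join(missing))
    return json.dumps(config, indent=2, ensure_ascii=False)


# Illustrative values only: a single 'greet' intent with one response.
example_config = {
    "plugin": {"name": "greeter", "description": "Says hello"},
    "entities": [],
    "slots": [],
    "intents": [{"intent_id": "greet", "examples": ["hello", "hi there"]}],
    "synonyms": [],
    "regexes": [],
    "lookups": [],
    "responses": [{"response_id": "say_hello", "text": "Hello!"}],
    "skills": [{"intent": "greet", "actions": ["say_hello"]}],
    "stories": [],
}

exported_json = export_plugin_config(example_config)
```

Checking for missing sections before serialising reflects the stated aim of helping developers produce valid plugin configurations, although a real generator would presumably validate the contents of each section as well.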

MoSCoW requirements

Functional requirements

Priority Requirement
MUST Local speech and personal data processing when in interactive mode to help safeguard users’ privacy
MUST An accessible index of installed voice assistant skills
SHOULD A configuration generator progressive web app to help non-experts develop plugins
COULD A ‘skills viewer’ web app for visually inspecting the plugin skills installed on an Ask Bob server
COULD Multilingual support through the ability to use non-English language models

Non-functional requirements

Priority Requirement
MUST An open-source, federated architecture
MUST Potential operation on a proof-of-concept low-power prototype device (e.g., Intel Core i3 NUC)
MUST A plugin system for administrators to install additional voice assistant skills
SHOULD The capability for third-party plugins to interface with external APIs
SHOULD Integration with team 25’s project (FISE concierge) – providing a voice assistant interface to concierge web services within their Android app that is privacy-safe when run on a private local area network
SHOULD Integration with team 38’s project (FISE video conferencing) – providing voice assistant functionality for their ‘video conferencing lounge’ that is privacy-safe when run on a private local area network
COULD A drag-and-drop-style interface to specify simple skills in the configuration generator web app to improve usability