Group 40 - PEACH Chatbot
Systems Engineering Project
Achievements
A non-exhaustive collection of things we have done for the project:
- Evaluated the team members’ skills and created a balanced split of roles.
- Completed the tasks sent to us by the client, i.e. the MRC certificates.
- Established a basic project brief, developed an action plan, and created a Trello team/board along with documents for deliverables, stakeholders and resources.
- Created initial external architectural model.
- Researched and compared Continuous Integration tools.
- Carried out a comprehensive literature analysis on chatbot systems.
- Analysed and developed use cases and scenarios resulting in a high-level overview of interactions with the system.
- Analysed the Master's dissertation written by one of the people who had previously worked on the project.
- Defined the chatbot dialogues.
- Chose a Python web framework and completed the initial WebSocket setup with it; created a chat server using long-polling WebSockets that mirrors messages back.
- Defined a messaging JSON format so that we could work on the server-side parsing and creation of messages (an illustrative sketch appears after this list).
- Created architectural model of our system.
- Created the chatbot database schema, designed to be compatible with further adaptation and work on the system, in particular the addition of NLP elements.
- Set up continuous integration for the chatbot, and ported to Python 3.
- Integrated all API components together.
- Wrote unit tests for the API to ensure proper functionality.
- Built a chatbot CLI for testing purposes, which makes it easy to identify issues with input and can be distributed to anybody else who needs to test the messaging endpoint.
- Met all of our 'must have' requirements.
- Implemented further functionality for universal keywords such as 'help', 'restart' and 'quit'.
- Deployed the chatbot app to Azure.
- Created and edited the project website to fit the requirements, following feedback from Yun Fu, our TA.
- Defined fully functional conversation structure models; the messaging API meets all requirements.
- Completed the Summary API, which returns a summary of a patient's conversation.
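Our actual JSON message schema is documented elsewhere in the project; purely as an illustration of the idea, a message exchanged between the client and the chatbot server might look roughly like the sketch below. Every field name here is an assumption, not our real format.

```python
# Illustrative only: plausible shapes for chatbot message payloads.
# All field names are assumptions; the project's actual JSON format may differ.
example_bot_message = {
    "conversation_id": "4f7c2a1e-1234-4cde-8f00-000000000000",
    "sender": "bot",
    "type": "checkbox",                       # e.g. 'text', 'quickreply', 'checkbox'
    "text": "Which physical concerns would you like to discuss?",
    "options": ["Pain", "Fatigue", "Sleep"],  # rendered as tickable checkboxes
}

example_user_reply = {
    "conversation_id": "4f7c2a1e-1234-4cde-8f00-000000000000",
    "sender": "user",
    "type": "checkbox",
    "selected": ["Pain", "Sleep"],            # only the ticked options are sent back
}
```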
Experiment Log
After we had defined an initial database schema for dialogue templates, we carried out some experiments in Term 1 to test the robustness of our schema. The experiments are listed below.
Each entry gives the experiment, the participants in brackets, and the outcome.
- Running the Django DB shell (Christoph): received a SyntaxError and fixed the issue (a circular object reference in the Django ORM).
- Creating a "Question" instance in the Django ORM (Christoph): added a UUID to the type. From our prototype, we inferred that individual messages are best decoupled by referencing them by id.
- Creating an 'Option' (Christoph): Options are ways to answer a question. Potentially, we have to move the 'next question' field to Questions, since in the case of checkboxes the next question will differ depending on the individual option rather than the question.
- Creating a WebSocket (Faiz): we created a WebSocket during a hackathon.
- Creating the general message interface as a React component (Christoph): for a more complex application, React-Redux definitely needs to be added for a cleaner design; the components were already starting to become bloated.
- Creating the frontend part of the WebSocket as part of the React prototype (Christoph): figured out the fields needed for logging messages in our database, and a potential representation of these messages in the frontend by defining the accepted JSON.
- Chat server from scratch (Emily): a chat server that echoes the messages we send to it. This demonstrated how a user would interact with the server API with multiple channels present.
- Stream API integration experiment (Faiz): the Rocket.Chat API behaves almost like Twitter's streaming API (tweepy), so I created streaming API middleware that mimics our project against the production-ready API Twitter provides.
- Middleware integration of a chat server (Faiz): integrated the Slack streaming API with Django to study the behaviour of streaming APIs with Django and third-party libraries such as channels and socket.io.
- Database schema behaviour (Emily): a mock data flow between server and client to check the validity of the schema and how it fits into our environment.
- Creating different types of messages (Christoph): the message types created were checkboxes, quick reply and normal message. From this, we concluded that it is sufficient to send only the ticked checkboxes to the backend instead of the entire set. In addition, future work should centre on assessing whether a slider would really work (as dragging is a different interaction from clicking).
- Running the previous solution (Faiz): in the first session, we ran the previous year's solution. We deemed it not fit for use, and over-engineered.
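Purely as an illustration of the Question/Option structure these experiments explored, the models might be sketched in the Django ORM roughly as follows; all field names and types here are assumptions rather than our exact schema.

```python
import uuid

from django.db import models


class Question(models.Model):
    """One node in a dialogue template (illustrative fields only)."""
    id = models.UUIDField(primary_key=True, default=uuid.uuid4, editable=False)
    text = models.TextField()
    # e.g. 'text', 'quickreply' or 'checkbox', mirroring the message types we prototyped
    question_type = models.CharField(max_length=20, default="text")


class Option(models.Model):
    """One way of answering a Question; for checkboxes the follow-up may differ per option."""
    id = models.UUIDField(primary_key=True, default=uuid.uuid4, editable=False)
    question = models.ForeignKey(Question, related_name="options", on_delete=models.CASCADE)
    label = models.CharField(max_length=255)
    # Referencing questions by id keeps individual messages decoupled, as noted above.
    next_question = models.ForeignKey(
        Question, null=True, blank=True, related_name="+", on_delete=models.SET_NULL
    )
```

Here the follow-up reference sits on the Option, which is one possible way to let the next question differ per ticked option; our actual schema may have resolved this differently.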
Work Packages: Term 2
List of work packages completed in term 2, along with the contributors to each.
- Client Liaison: Christoph
- Website: Emily, All
- Research: All
- CLI: Emily, Faiz
- Final Report: All
- React Interface: Chris
- Tests: All
- Video: Chris
- Summary API: All
- Chatbot Database Model: Faiz
- Chatbot Controller: Faiz
- Biweekly Report 5: All
- Biweekly Report 6: All
- Biweekly Report 7: All
- Biweekly Report 8: All
Command Line Interface
CLI for testing and improving our ChatBot
We developed a command line interface to allow interaction with the ChatBot during development. This was built in Python, and interacted with the messaging API endpoint to generate users and conversations.
This was extremely useful throughout development as it allowed for testing of the API, and for the quick implementation of new features which we hadn't previously thought about.
The user can interact with the ChatBot CLI as follows:
- Run the ChatBot helper Python file in a terminal.
- Message the ChatBot saying anything to trigger a conversation.
- Answer the multiple-choice questions in the format answer1,answer2,...,answerN.
- Answer the text-input questions with any valid string.
The screenshot below shows the main code of the CLI. The comments describe the logic and flow of how it works, and its interaction with the API.
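For readers without the screenshot to hand, the sketch below gives a rough idea of what such a CLI loop could look like. The endpoint URL, payload fields and function names are assumptions for illustration and do not reflect the actual project code.

```python
import uuid

import requests

# Assumed messaging endpoint; the real project value may differ.
API_URL = "http://localhost:8000/api/messages/"


def send(conversation_id, text):
    """POST a user message to the (assumed) messaging endpoint and return the bot's reply."""
    payload = {"conversation_id": conversation_id, "sender": "user", "text": text}
    response = requests.post(API_URL, json=payload, timeout=10)
    response.raise_for_status()
    return response.json()


def main():
    conversation_id = str(uuid.uuid4())  # one conversation per CLI session
    print("Say anything to start a conversation ('quit' to exit).")
    while True:
        text = input("> ").strip()
        if not text:
            continue
        reply = send(conversation_id, text)
        # Multiple-choice questions are answered as 'answer1,answer2,...'; text questions as plain strings.
        print(reply.get("text", reply))
        if text.lower() == "quit":
            break


if __name__ == "__main__":
    main()
```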
Summary of Achievements
A table of the requested functional requirements and use cases, grouped by priority, with their completion status and contributors.
Must Have
- The user can respond to chatbot messages. ✔️ All
- The user can see the previous messages the chatbot sent within the current conversation, and their own replies. ✔️ All
- The user can send and receive messages to/from the chatbot. ✔️ All
- The user shall be asked about relevant areas of concern (e.g. Physical Concerns). ✔️ All
- The user can indicate the problem areas within each 'concern bracket', i.e. each category of concerns (physical, emotional etc.). ✔️ All
- Templates for bot response and user response (single option from list, multiple choice, text input and scale slider). ✔️ All
- Universal keywords such as 'quit' and 'restart' that the user can type to change the outcome of the conversation. ✔️ Emily
Should Have
- Integration with the Rocket.Chat messaging platform chosen by the messaging team. ❌ The team working on the messaging platform failed to set it up.
- The solution is built using CI. ✔️ Christoph
- Have a method of summarising a conversation. ✔️ Christoph
- Documentation outlining how to set up the system.
- Documentation outlining use and features of the system. ✔️ Emily
- API documentation describing parameters and usage. Faiz, Emily
Could Have
- Use sentiment analysis to gauge the severity of user concerns. ❌ Not a priority for the client.
Known Incomplete Features and Bugs
A list of features that are, as yet, incomplete, and bugs we are aware of.
- The 'restart' keyword is currently only partially functional. On the API side this feature works; when interacting with the CLI, however, there are still fixes that need to be made.
- The 'help' keyword is not functional.
Critical Evaluation
Summary API
The Summary API is designed in such a way that it returns all of the question-answer pairs as JSON. This is useful for an external database, since it is raw data and can be processed in many ways. Because the responses are not edited in the database itself, this data can easily be used for sentiment analysis in the future; this is one example of a potential extension of the bot's capabilities that our system has been built to make possible. The Summary API responses could be improved so that an external developer who wants to use the API for summarisation and meaning analysis can interpret meaning directly from the JSON structure itself; for example, the tree structure of the Questions could be embedded within the JSON (a sketch of how such a response might be consumed appears at the end of this section).

Use of templates
Templates were also a great addition to the bot, since they take away the complexity of the context-awareness problem. The only drawback is that they greatly reduce the number of ways in which you can interact with the bot. Introducing more ways to interact with the bot would make the style of conversation more flexible and would likely improve the user experience.

Simplicity of architecture
The architecture of our application is technically complex. This can make it difficult to comprehend, but it is well suited to generating a dynamic conversation from user responses, and adding new questions and placing them within the defined conversational flow is easy because of it.

Fitness for expandability
Since question generation is based only on the input, we can easily mix machine-learning-based conversations with this rule-based model.

Use of feedback
We tested the bot and gathered user feedback using two different client-side implementations: a ReactJS app and a CLI client. Both of these yielded critical feedback from the conversations users had with the bot. Several concerns were addressed in different ways, including editing the phrasing of questions, which were generated using a script. These iterations of direct testing with user feedback allowed us to improve the structure of the API both internally and on the client side.

Fitness of API
The chatbot API and both of the endpoints provided are fully functional. We have resolved all known bugs up to the day of writing.

Fitness of architecture
Since the API is built with a client-adapter-server setup, any third-party client can be integrated with the platform using a single adapter. This adapter can live in a microservice environment where it only has to translate client-server responses into the format specified in the documentation. The app can be integrated with any client because of this architecture.

Documentation
The entire code base is documented, with dedicated technical documentation and diagrams to explain design choices, and with special attention given to areas where improvements can be introduced. There is also API documentation, and the codebase provides docstrings for all functions to help with further maintenance.

Testing
We believe that our application is sufficiently unit tested. Moreover, we have performed some testing of whether our dialogue model would be acceptable to patients, after which we rephrased some questions. However, this testing was performed only once, with a limited number of people, due to time constraints. Future work should definitely include further iterations on the dialogue models; one way to make this a reality is to create a focus group at UCLH for this purpose.
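To make the Summary API description concrete, the snippet below sketches how a developer might consume such an API. The endpoint URL and the response shape are assumptions for illustration rather than the API's documented format.

```python
import requests

# Hypothetical endpoint; consult the API documentation for the real path.
SUMMARY_URL = "http://localhost:8000/api/summary/{conversation_id}/"


def print_summary(conversation_id):
    """Fetch a conversation summary and print its question-answer pairs."""
    response = requests.get(SUMMARY_URL.format(conversation_id=conversation_id), timeout=10)
    response.raise_for_status()
    summary = response.json()
    # Assumed shape: {"conversation_id": ..., "pairs": [{"question": ..., "answer": ...}, ...]}
    for pair in summary.get("pairs", []):
        print("Q:", pair["question"])
        print("A:", pair["answer"])


print_summary("4f7c2a1e-1234-4cde-8f00-000000000000")
```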
Future Work
"If we had six more months", we would be able to tackle some of the below
projects. Generally, we believe that our main achievement was to concretely envision how to tackle a chatbot project for the medical field. Our main deliverable, apart from the actual implementation of our proof-of-concept, is thus the set of ideas presented on how to further develop this chatbot, which makes the ideas below of special importance to us.
Expose the chatbot to a more diverse audience
Firstly, we would perform more representative acceptance testing: so far, our testing group consisted only of highly educated people, mostly from Computer Science or Medicine. While we did ask other people to test our application (e.g. our parents), we believe that this sample is not representative of the entire population. Within the time available, we have not been able to establish ties with 'real' patients. A longer project, or continued work on this project, should definitely tackle this issue and establish a more representative group of users for acceptance testing. This should be seen as a constant feedback process and a chance to improve the dialogue model.
Development of an interface for editing dialogue models
Secondly, we would tackle the development of a doctor interface for editing the dialogue model and viewing the results of chatbot sessions. While we believe that building the system we have delivered was an appropriate workload for the time we were given, such an interface is needed to move the project from proof-of-concept to full solution. In particular, having domain experts edit the dialogue model to enable rapid development is a novel idea. To 'sell' this vision, we have created mockups for part of the doctor interface.
Integration with a third-party messaging application
Thirdly, we would do our best to integrate with Rocket.Chat. Given an additional six months of development time, we are sure that the other team, which was responsible for hosting the PEACH messaging platform, would have been able to deploy the fork from the repository rather than only the container. Based on this assumption, we would be in the clear to finish writing our adapter to Rocket.Chat in the time window after that.
Development of a chatbot management interface
Fourthly, we would invest time in developing an interface for doctors to 'schedule' a chatbot session. While this feature was never a requirement for our work, our client was clear that, in the long-term vision of the bot, he wants at some point to simply 'prescribe' chatbot sessions, just as one would prescribe a physical check-up. Working on the project for another six months would mean implementing such a system in part, or entirely.
Interface for analysing dialogues
Lastly, we would carry out more research into knowledge extraction from the data provided by the user. With our Summary API, one is already able to start work on this right now, in line with our belief that good work means future contributors can start working right away. Yet we are unclear about the extent to which knowledge can be extracted at the moment, and how to build such a system so that, ideally, the chatbot would simply 'fill in' the patient record automatically. Concretely, we have doubts about whether, in the future, diagnosis and knowledge extraction will be handled by the same subsystems or whether they will be separate. Additionally, we believe that natural language processing is not yet accurate enough to cover the large body of medical knowledge with sufficient precision, and that any attempt to cover the entire spectrum is not possible unless significant resources are devoted to the project. One way to reduce the resources required and create the system within a shorter timeframe would be a human-in-the-loop system, such as the one described with our doctor interface, which provides chat data that humans in turn use to fill in the patient health record. Another alternative would be a hybrid system in which a human confirms or denies categorisations made by the chatbot. This is ultimately the reason why we did not invest heavily in NLP now; yet, with more time, an implementation of a hybrid solution could come into scope for our project.