Throughout the development of the HeartBot application, we tested continuously: as we wrote the system, we ran it on a variety of inputs to check that it behaved as expected. Any failing test was investigated, and the underlying problem was patched immediately. After implementing the complete functionality, we wrote unit, integration, and system tests to cover as much of the code as possible.
We wrote unit tests for our functions using Python's `unittest` module. The unit tests cover our tokenise method, our closest-match methods, our check methods, and our table-name class. By testing a range of possible inputs, we covered most of the code.
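As an illustrative sketch of this style of unit test, the code below tests a tokeniser and a closest-match helper with `unittest`. The function names `tokenise` and `find_closest_match`, and their stand-in implementations (built on the standard library's `difflib`), are assumptions for illustration, not the project's actual API.

```python
import difflib
import unittest


def tokenise(text):
    # Hypothetical tokeniser: lower-case the input and split on whitespace.
    return text.lower().split()


def find_closest_match(word, vocabulary):
    # Hypothetical closest-match helper using difflib from the stdlib;
    # returns the best fuzzy match, or None if nothing is close enough.
    matches = difflib.get_close_matches(word, vocabulary, n=1, cutoff=0.6)
    return matches[0] if matches else None


class TestHeartBotHelpers(unittest.TestCase):
    def test_tokenise_splits_and_lowercases(self):
        self.assertEqual(tokenise("Heart Disease STATS"),
                         ["heart", "disease", "stats"])

    def test_closest_match_corrects_typo(self):
        vocab = ["obesity", "smoking", "mortality"]
        self.assertEqual(find_closest_match("obesty", vocab), "obesity")

    def test_closest_match_returns_none_for_gibberish(self):
        vocab = ["obesity", "smoking", "mortality"]
        self.assertIsNone(find_closest_match("xqzw", vocab))
```

Tests in this shape are run with `python -m unittest`, which discovers and executes every `TestCase` subclass.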
We wrote system tests for the whole system, covering every category of input: greetings, FAQs, retrieval questions across the different tables, questions with no answer, and so on. Together, the system and unit tests achieve 99% coverage of our functionality.
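A system test in this spirit exercises the chatbot end-to-end, one test per input category. The `HeartBot` class and its `respond` method below are minimal stand-ins assumed for illustration; the real system's interface may differ.

```python
import unittest


class HeartBot:
    # Minimal stand-in for the real chatbot, assumed to expose a
    # respond(text) method that returns a reply string.
    FAQ = {"what is bhf": "The British Heart Foundation funds heart research."}

    def respond(self, text):
        key = text.lower().strip("?!. ")
        if key in ("hi", "hello"):
            return "Hello! How can I help you today?"
        if key in self.FAQ:
            return self.FAQ[key]
        # Fallback for questions with no matching data.
        return "Sorry, I could not find an answer to that question."


class TestHeartBotSystem(unittest.TestCase):
    def setUp(self):
        self.bot = HeartBot()

    def test_greeting_gets_greeting_back(self):
        self.assertIn("Hello", self.bot.respond("Hi"))

    def test_faq_gets_stored_answer(self):
        self.assertIn("British Heart Foundation",
                      self.bot.respond("What is BHF?"))

    def test_unanswerable_question_gets_friendly_error(self):
        self.assertIn("Sorry", self.bot.respond("What is the weather?"))
```

Each test maps to one of the input categories above (greetings, FAQs, questions with no answer), so a failure immediately identifies which category of behaviour regressed.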
Testers include:
- Charles, a 33-year-old researcher interested in key statistics on cardiovascular disease
- Michael, a 60-year-old policy maker who makes health policies concerning the prevalence of obesity
- Amelia, a BHF team member whose job includes answering people's questions about the data in the BHF compendium
- Katie, a PhD student at UCL researching data on hospital admissions for cardiac arrests
These users were chosen because they are representative potential users of HeartBot: BHF team members who need to query the compendium regularly, researchers who need statistics from it, and policy makers who use its data. None of them are technology or software experts, and they come from different fields of work.
We devised four test cases for the users to try and give feedback on. We gave the users a set of requirements to rate on a scale and asked them for comments.
Test Case 1 - The users asked HeartBot FAQs, and we checked whether they were satisfied with the responses.
Test Case 2 - The users asked HeartBot data retrieval questions about the BHF compendium, and we checked whether they received appropriate data.
Test Case 3 - The users asked questions with no matching data, or malformed questions, and we checked whether they received friendly error messages.
Test Case 4 - The users made general conversation with HeartBot (greetings etc.), and we checked whether they received appropriate responses.
| Requirement | Totally Disagree | Disagree | Neutral | Agree | Totally Agree | Comments |
|---|---|---|---|---|---|---|
| HeartBot runs to completion | 0 | 0 | 0 | 0 | 4 | |
| Responses are clear and not confusing | 0 | 1 | 0 | 0 | 3 | (+) the tables displayed as results provide all the information, so are very clear; (-) error messages are very generic |
| Correct data answers | 0 | 0 | 0 | 0 | 4 | (+) correct table displayed for questions; (+) correct data retrieved |
| Answers received | 0 | 0 | 0 | 0 | 4 | (+) chatbot answers all questions; (+) question always answered |
| Correct FAQ answers | 0 | 0 | 0 | 0 | 4 | (+) all FAQs answered correctly; (+) correct answers |
| Chatbot can make conversation | 3 | 1 | 0 | 0 | 0 | (-) chatbot cannot answer questions like "how are you"; (-) chatbot does not make conversation |
| Friendly error messages | 2 | 0 | 2 | 0 | 0 | (-) generic error messages; (+) error messages are friendly but do not state the error; (-) error messages do not explain what to do |
| Fast response | 0 | 0 | 0 | 0 | 4 | |
| Easy to use/navigate | 0 | 0 | 0 | 0 | 4 | (+) useful help button; (+) clear icons and buttons; (+) easy-to-use chat-like interface |
We were happy with the users' feedback on HeartBot. Their constructive criticism helped us improve HeartBot and make it better suited to users' needs: we made the error messages more detailed and useful, and updated the chatbot so that it can make general conversation.