Our project focused mainly on exploring the features of the TaPaS model and improving its accuracy in a clinical context. Our user interface is therefore a simple bot that can be accessed from a website. The goal of this bot is to give users a straightforward way to interact with the TaPaS model and explore its question answering capability.
The interaction starts with a welcome message from the bot. After the user's greeting, the bot invites the user to upload a relevant file containing medical data and then lets them ask questions about the uploaded information.
All the code implementing the user interface can be found in the index.html file in the source code of the project.
To focus on the research side of the project rather than on designing the interactive platform, we decided to use a React-based Chatroom component for Rasa Stack. We chose this library for a number of reasons. The first and most important is its compatibility with Rasa's REST channel; another is its support for text with Markdown formatting, images and buttons, which makes the bot-human interaction easy and straightforward.
To implement the interface we use two files from this library: chatroom.css, which styles and lays out the web page, and chatroom.js, which makes the page interactive.
To customise the imported code we set title and welcomeMessage to our own text, describing the purpose of the bot application.
At the time of writing, the bot and all of its data are stored locally and the bot can be accessed through localhost:5005. Later on we will use UCL servers as the hosting platform.
After researching the available options, we selected the Rasa package due to its open source nature. Furthermore, it comes with natural language processing (NLP) and natural language understanding (NLU) components, making it convenient to create training data and customise the bot. Most importantly, Rasa allows custom APIs to be added to fit the project's needs, which in our case is required for the TaPas integration.
For the Rasa bot to understand user inputs, a set of intents and example phrases is needed. The initial model is then trained on this input data so that it can learn from it.
For example, a “greet” intent is prepared in the NLU training data so that whenever a user greets the bot, it can accurately determine the meaning and category of the message and return the appropriate actions.
For the bot to take the next step correctly, the dialogue flow is created in the form of stories. For example, the “happy chit chat” story allows the user to have a casual chat with the bot: if the user’s mood seems great, the bot will cheer happily and encourage the user to keep up their good day.
Even though Rasa provides default actions that accommodate basic needs, we still need to extend it with custom actions to offer more functionality.
The first type is the text response, which can easily be implemented in the “domain.yml” file. Simple utterances are listed under responses in the form of an action name and the text to be returned.
For TaPaS prediction support, a custom API is needed to send the files and questions to the prediction function. Once the answers are returned, they are dispatched to the user through the Rasa Dispatcher. If errors related to the files or questions are sent back, error messages are displayed to the user and the session is restarted to reset the file and question slots.
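As an illustration of how such a custom action could look, the sketch below uses the Rasa SDK; the action name, the slot names (“file” and “question”) and the predict() helper are assumptions for the example rather than the exact project code.

```python
from typing import Any, Dict, List, Text

from rasa_sdk import Action, Tracker
from rasa_sdk.events import Restarted
from rasa_sdk.executor import CollectingDispatcher


class ActionAskTapas(Action):
    def name(self) -> Text:
        return "action_ask_tapas"

    def run(
        self,
        dispatcher: CollectingDispatcher,
        tracker: Tracker,
        domain: Dict[Text, Any],
    ) -> List[Dict[Text, Any]]:
        file_path = tracker.get_slot("file")
        question = tracker.get_slot("question")
        try:
            # predict() stands in for the TaPaS prediction function described below.
            answer = predict(file_path, question)
        except Exception as error:
            # On file or question errors, inform the user and restart the session
            # so that the file and question slots are cleared.
            dispatcher.utter_message(text=f"Sorry, something went wrong: {error}")
            return [Restarted()]

        # Dispatch the answer back to the user through the Rasa Dispatcher.
        dispatcher.utter_message(text=str(answer))
        return []
```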
Lastly, the action endpoint is specified in the “endpoints.yml” file so that the bot is aware of the correct Rasa Action server to request and query.
TaPas is a weakly supervised question-answering model that reasons over tables without generating logical forms. To predict a minimal program, TaPas selects a subset of table cells and a likely aggregation operation to be performed on top of them. As a result, TaPas can learn operations from natural language without requiring them to be expressed in some formalism. This is accomplished by extending BERT's architecture with additional embeddings that capture tabular structure, together with two classification layers for selecting cells and predicting an aggregation operator.
The extended BERT masked language model has been pre-trained on millions of tables from Wikipedia and related text segments. The model's designers also propose a pre-training technique for TaPas that is important to its final performance.
We chose the WTQ (WikiTableQuestions) variant of TaPas to support the functionality we need. The model is pre-trained with MLM and an additional step called intermediate pre-training, and then fine-tuned in a chain on SQA, WikiSQL and WTQ. Fine-tuning is done by adding a cell selection head and an aggregation head on top of the pre-trained model and then jointly training these randomly initialised classification heads. The model uses relative position embeddings, meaning it resets the position index at every cell of the table.
With the help of the Hugging Face Transformers library, we load Google's WTQ-fine-tuned TaPas model through the TapasForQuestionAnswering class. This simplifies the process, as the model weights are downloaded automatically. The TapasTokenizer is used as a utility for tokenizing the inputs into a TaPas-compatible format for processing and inference.
The tabular data is kept in a Pandas DataFrame so that the inputs (table and questions) can be tokenized. In the tokenizing process, max_length is set to 512, as that is the maximum sequence length the model can take at a time. Moreover, truncation is set to True so that the tokenizer can cut large tables down to encodings that fit within the 512-token sequence length and avoid errors. [1]
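The sketch below illustrates this loading and tokenization step; it assumes the publicly available google/tapas-base-finetuned-wtq checkpoint from the Hugging Face Hub and uses a placeholder table rather than real clinical data.

```python
import pandas as pd
from transformers import TapasForQuestionAnswering, TapasTokenizer

# The WTQ-fine-tuned TaPas checkpoint published by Google on the Hugging Face Hub.
model_name = "google/tapas-base-finetuned-wtq"
model = TapasForQuestionAnswering.from_pretrained(model_name)
tokenizer = TapasTokenizer.from_pretrained(model_name)

# Placeholder table; TapasTokenizer expects every cell to be a string.
table = pd.DataFrame(
    {"Test": ["Glucose", "Haemoglobin"], "Result": ["95", "14.1"]}
).astype(str)
queries = ["What is the glucose result?"]

# max_length=512 matches the model's maximum sequence length; truncation=True
# lets the tokenizer reduce large tables so the encoding stays within that limit.
inputs = tokenizer(
    table=table,
    queries=queries,
    padding="max_length",
    max_length=512,
    truncation=True,
    return_tensors="pt",
)
```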
Feeding the tokenized inputs into the model returns outputs in the form of logits. To make them easier to analyse, they are converted into a pair of predicted coordinates and aggregation indices, which represent the list of answer coordinates and the aggregation value respectively. Aggregation is explained in the next section.
From the figure above, it can be seen that the cell classification threshold is set to 0.7. This is raised from the default of 0.5 so that the model only returns answers with higher confidence, namely those with a prediction score of at least 0.7.
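Continuing from the tokenization sketch above, this prediction and conversion step could look roughly as follows (assuming PyTorch tensors).

```python
import torch

# Run the model on the tokenized inputs; the outputs contain cell selection
# logits and aggregation logits.
with torch.no_grad():
    outputs = model(**inputs)

# Convert the raw logits into answer coordinates and aggregation indices,
# using the raised cell classification threshold of 0.7.
predicted_coordinates, predicted_aggregation_indices = tokenizer.convert_logits_to_predictions(
    inputs,
    outputs.logits.detach(),
    outputs.logits_aggregation.detach(),
    cell_classification_threshold=0.7,
)

# Each entry in predicted_coordinates is a list of (row, column) cell
# coordinates for one query; the aggregation index says how to combine them.
answer_cells = [table.iat[row, col] for row, col in predicted_coordinates[0]]
```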
The aggregation indices are defined by the model's developers, where index 0 corresponds to NONE (no aggregation), 1 to SUM, 2 to AVERAGE and 3 to COUNT.
The method shown in the figure is used to convert the indices to their respective string equivalents.
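As a sketch, this conversion can be done with the index-to-operator mapping documented for the WTQ-fine-tuned checkpoints, applied to the indices obtained above.

```python
# Mapping from aggregation index to operator, as documented for the
# WTQ-fine-tuned TaPas checkpoints.
id2aggregation = {0: "NONE", 1: "SUM", 2: "AVERAGE", 3: "COUNT"}

# One operator string per query.
aggregation_predictions_string = [
    id2aggregation[index] for index in predicted_aggregation_indices
]
```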
To carry out the additional computation, the method shown in the figure is created. Counting simply returns the length of the coordinates list, which equals the total number of returned answers.
For average and sum, the values have to be totalled and processed accordingly. If this process fails, the answers consist of strings that do not support mathematical operations, and an error message is returned so that the user can change or rephrase their question.
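A sketch of this aggregation post-processing is shown below; the function name and the error message are illustrative rather than the exact project code.

```python
def apply_aggregation(operator, coordinates, table):
    """Apply the predicted aggregation to the selected cells (illustrative sketch)."""
    cell_values = [table.iat[row, col] for row, col in coordinates]

    if operator == "COUNT":
        # Counting is simply the number of selected answer cells.
        return str(len(cell_values))

    if operator in ("SUM", "AVERAGE"):
        try:
            numbers = [float(value) for value in cell_values]
        except ValueError:
            # The selected cells contain strings that cannot be summed or averaged.
            return "Sorry, I cannot compute that. Please rephrase your question."
        total = sum(numbers)
        return str(total if operator == "SUM" else total / len(numbers))

    # NONE: return the selected cells as they are.
    return ", ".join(cell_values)
```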
Reference: [1] How to Apply Transformers to Any Length of Text [Online] Available at: https://towardsdatascience.com/how-to-apply-transformers-to-any-length-of-text-a5601410af7f [Accessed: 28 February 2022]
Given that we are working with medical and sensitive patient data, post-processing is a fundamental feature of our chatbot pipeline to ensure there is no room for error. We have to make certain that every response returned to the user passes checks that we can cross-reference with a database. This ensures all responses are accurate enough not to mislead patients or impact their wellbeing.
With the help of Django, we can easily create a database that fits the project. Our model is kept simple and contains only the fields necessary for post-processing.
As shown in the figure, four fields are created, representing the test name, an example result, and the lower and upper bounds of the accepted range. The model’s object diagram is shown in the System Design - Data Storage section.
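A sketch of such a Django model is given below; the class name LabTest is an assumption (mirroring the “labtest/” endpoint mentioned later), while the field names follow the ‘example_val’, ‘min_val’ and ‘max_val’ fields referenced in the post-processing checks.

```python
from django.db import models


class LabTest(models.Model):
    # Name of the lab test, e.g. "Glucose".
    test_name = models.CharField(max_length=100)
    # An example result used to check the datatype of TaPaS answers.
    example_val = models.CharField(max_length=100)
    # Lower and upper bounds of the accepted range for numeric results.
    min_val = models.FloatField(null=True, blank=True)
    max_val = models.FloatField(null=True, blank=True)

    def __str__(self):
        return self.test_name
```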
To access the database endpoint and retrieve the data as a JSON object, a path for each functionality is declared in “database/urls.py”, so that visiting the respective URL calls the implemented function. For example, the hosting address ending in “admin/” shows the database administrator login page, whereas the one ending in “labtest/” serves the data.
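A possible shape of this routing in “database/urls.py” is sketched below; the view name lab_test_list is illustrative.

```python
from django.contrib import admin
from django.urls import path

from . import views

# "admin/" exposes the administrator login page, "labtest/" serves the lab test data.
urlpatterns = [
    path("admin/", admin.site.urls),
    path("labtest/", views.lab_test_list),
]
```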
For the data retrieval process, a view function is defined so that whenever a GET request is sent to the endpoint, the complete list of data in JSON representation is returned to the requester.
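A minimal sketch of such a view, assuming the LabTest model introduced above, could look like this.

```python
from django.http import JsonResponse

from .models import LabTest


def lab_test_list(request):
    # Return every LabTest row as a JSON list on a GET request.
    data = list(LabTest.objects.values())
    return JsonResponse(data, safe=False)
```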
We begin by cross-checking the datatype of the TaPas response against the type we expect, which is stored in the database as ‘example_val’. If the types do not match, we stop the response from going back to the user. If the types match and the value is a string, the answer goes back to the user. If the types match and the value is a float, the TaPas answer enters its second check.
The second check determines whether the response falls within a given range, stored in the database as ‘min_val’ and ‘max_val’. For instance, we know that glucose levels should fall anywhere between 80 and 300 mg/dl (milligrams per deciliter). If the TaPas response falls outside this range, we prevent the answer from going back to the user, as an error may have occurred.
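The sketch below combines the two checks described above; the function name and the exact handling of blocked answers are illustrative, assuming the LabTest fields introduced earlier.

```python
def validate_answer(answer, lab_test):
    """Return the answer if it passes the post-processing checks, else None."""
    # First check: does the answer's datatype match the stored example value?
    try:
        float(lab_test.example_val)
        expected_numeric = True
    except ValueError:
        expected_numeric = False

    try:
        value = float(answer)
        answer_numeric = True
    except ValueError:
        answer_numeric = False

    if answer_numeric != expected_numeric:
        return None  # Type mismatch: block the response.

    if not answer_numeric:
        return answer  # Matching string answers go straight back to the user.

    # Second check: numeric answers must fall within the accepted range,
    # e.g. 80 to 300 mg/dl for glucose.
    if lab_test.min_val <= value <= lab_test.max_val:
        return answer
    return None  # Out of range: an error may have occurred, so block the response.
```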