February

Client meetings

13th February

24th February

Progress

Most of our requirements were finished by the end of February. We discussed the best libraries and algorithms for extracting each piece of information, and decided to use a Question-Answering (QA) model to extract 9 of the 14 requirements, each phrased as a question:

What is the Country of Disaster?

What is the Operation Start Date?

What is the Operation End Date?

What is the number of people affected?

What is the number of people assisted?

What is the Glide Number?

What is the Operation n°?

What is the Operation Budget?

What is the Host National Society?

The QA model we used was the ‘RoBERTa’ model from HuggingFace. We compared its accuracy against a few other models, which we analyse in our evaluation; the RoBERTa model performed best at answering our questions.
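As a rough illustration of the approach, the sketch below asks the nine questions above against the text of a document using a HuggingFace question-answering pipeline. The checkpoint name `deepset/roberta-base-squad2`, the field names, and the helper functions are assumptions made for the example, not necessarily what the project used.

```python
from typing import Callable, Dict

# The nine questions listed above, keyed by hypothetical field names.
QUESTIONS: Dict[str, str] = {
    "country": "What is the Country of Disaster?",
    "start_date": "What is the Operation Start Date?",
    "end_date": "What is the Operation End Date?",
    "people_affected": "What is the number of people affected?",
    "people_assisted": "What is the number of people assisted?",
    "glide_number": "What is the Glide Number?",
    "operation_number": "What is the Operation n°?",
    "budget": "What is the Operation Budget?",
    "host_society": "What is the Host National Society?",
}


def build_roberta_qa(model_name: str = "deepset/roberta-base-squad2"):
    """Load an extractive QA pipeline (the checkpoint name is an assumption)."""
    # Imported lazily so the rest of the sketch works without transformers installed.
    from transformers import pipeline
    return pipeline("question-answering", model=model_name)


def extract_fields(context: str, qa: Callable[..., dict]) -> Dict[str, str]:
    """Ask every question against the document text and keep the answer span."""
    return {
        field: qa(question=question, context=context)["answer"]
        for field, question in QUESTIONS.items()
    }
```

With the document text in a string `text`, calling `extract_fields(text, build_roberta_qa())` would return one answer per field. Passing the QA callable in as an argument also makes it easy to swap in another model for comparison.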

For the next 2 requirements (Admin 1 locations and Admin 2 locations), we went back to using a spaCy model, as it performs well at extracting geopolitical entities (GPE). Admin 1 and Admin 2 locations are told apart at the next step, when each location is linked to its ISO code.
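A minimal sketch of the GPE extraction step is shown below. The model name `en_core_web_sm` and the helper names are assumptions for the example; the function simply collects entities that spaCy labels as `GPE`.

```python
from typing import List


def load_spacy(model_name: str = "en_core_web_sm"):
    """Load a spaCy pipeline (the model name is an assumption)."""
    # Imported lazily; requires spaCy and the chosen model to be installed.
    import spacy
    return spacy.load(model_name)


def gpe_names(doc) -> List[str]:
    """Return the unique GPE entity texts of a processed doc, in order of appearance."""
    seen: List[str] = []
    for ent in doc.ents:
        if ent.label_ == "GPE" and ent.text not in seen:
            seen.append(ent.text)
    return seen
```

For example, `gpe_names(load_spacy()(text))` would give the candidate locations for a document held in `text`, ready to be linked to ISO codes.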

For the 12th requirement, finding the ISO codes of the Admins and the P-code of the country, our clients gave us 2 Excel sheets listing Admins and their corresponding ISO codes. We used the ‘Pandas’ package in Python to match each extracted location against these lists and so find its correct ISO code. The comparison was done with a fuzzy matching library in Python, so that small spelling differences did not prevent a match.
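The linking step can be sketched roughly as follows. To keep the example dependency-free it uses the standard library's `difflib` as a stand-in for the fuzzy matching library, and plain dictionaries in place of the Pandas tables. The location names, the codes, the 0.8 cutoff, and the choice to try Admin 1 before Admin 2 are all placeholders and assumptions, not the clients' data or the project's actual settings.

```python
import difflib
from typing import Dict, Optional, Tuple

# Placeholder lookup tables; the real names and codes came from the clients' Excel sheets.
ADMIN1_CODES: Dict[str, str] = {"Northern Province": "XX-01", "Coastal Region": "XX-02"}
ADMIN2_CODES: Dict[str, str] = {"Riverside District": "XX-0101", "Harbour Town": "XX-0201"}


def best_match(name: str, table: Dict[str, str], cutoff: float = 0.8) -> Optional[str]:
    """Return the table key closest to `name`, or None if nothing clears the cutoff."""
    lowered = {key.lower(): key for key in table}
    hits = difflib.get_close_matches(name.lower(), list(lowered), n=1, cutoff=cutoff)
    return lowered[hits[0]] if hits else None


def link_location(name: str) -> Optional[Tuple[str, str, str]]:
    """Try the Admin 1 table first, then Admin 2; return (level, matched name, code)."""
    for level, table in (("Admin 1", ADMIN1_CODES), ("Admin 2", ADMIN2_CODES)):
        key = best_match(name, table)
        if key is not None:
            return level, key, table[key]
    return None
```

Here the Admin level falls out of which table produced the match, which mirrors the idea that Admin 1 and Admin 2 locations are distinguished when they are linked to a code.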

The 13th requirement was to extract details of the operational strategy from the DREF form. The operational strategy is how the Red Cross responded to the emergency at the time of the appeal, and the actions taken to help beneficiaries. This was a requirement the team struggled with, because the operational strategy is usually laid out in tabular form and appears in a different part of the document from form to form. We decided to put it at the bottom of our priority list, as it was a “Could have” and we had already met all other requirements, though we may still come back to it.

The final requirement was a User Interface (UI) onto which the user can drag and drop a document for extraction, with the extracted data then displayed on the following screen. We finished building the interface with Tkinter during this month. After showing it to our TA, we were given some usability feedback and guidance on how to make it more user-friendly.

By the end of this month, we had integrated the front and back end, though there were still some bugs to fix. Overall, the bulk of our project had been completed. We continued to hold weekly team meetings to keep each other on track, and planned to tie up any loose ends during March.

24th February Client Meeting minutes:

Next steps

Improve UI

Fix bugs with integrated system

Finish portfolio website

Finish project & hand over to clients