Final MoSCoW List

| Task | Priority | Contributors |
| --- | --- | --- |
| Automatically download data from DesInventar | MUST | Dekun |
| Convert XML files to CSV files for analysis | MUST | Dekun |
| Categorise hazards and combine sub-types of floods, storms and earthquakes | MUST | All |
| Decide rules to exclude any unreliable data due to report bias | MUST | All |
| Slice datasets based on the rules | MUST | All |
| Validate data to ensure each event is only included once | MUST | Dekun |
| Loss exceedance table generator | MUST | Dekun |
| Loss exceedance curve generator | MUST | Hardik, Dekun, Jucheng |
| Complete API for GO platform frontend developers | MUST | Dekun, Jucheng |
| Highlight significant points of loss exceedance curves | SHOULD | Jucheng |
| Users can choose which countries and events they want to generate data for | SHOULD | Dekun |
| Robust error handling for our system | SHOULD | All |
| Users can adjust the slicing rules of the data | COULD | Dekun |
| Users can input other datasets to generate curves and tables | COULD | Dekun |
| Users have the option of setting a time range | COULD | Hardik, Yuhang, Dekun |
| All other datasets provided will be cleaned and sliced automatically | COULD | N/A |
| UI (not needed, as IFRC is integrating with IFRC GO) | WON'T | N/A |

Percentage of key (Must & Should) functionalities completed: 100%

With this project, we have delivered all of the key functionalities that we aimed to complete.

Percentage of optional (Could) functionalities completed: 75%

Additionally, most of the optional functionalities have been completed. Users are able to input a dataset of their choice; however, we do not have cleaning and slicing rules for other datasets, so user-supplied data must already be cleaned before it is entered into our model.
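To illustrate what the model computes once data is cleaned, the sketch below derives an empirical loss exceedance table from a list of annual loss totals. This is a minimal sketch with hypothetical names and example figures, not necessarily our exact implementation; the Weibull plotting position shown is one standard way to assign empirical exceedance probabilities.

```python
from dataclasses import dataclass


@dataclass
class ExceedanceRow:
    loss: float           # loss threshold
    probability: float    # annual probability of the loss being exceeded
    return_period: float  # 1 / probability, in years


def loss_exceedance_table(annual_losses: list[float]) -> list[ExceedanceRow]:
    """Build an empirical loss exceedance table from annual loss totals."""
    n = len(annual_losses)
    rows = []
    # Rank losses from largest to smallest; the i-th largest loss is exceeded
    # with empirical probability i / (n + 1) (Weibull plotting position).
    for rank, loss in enumerate(sorted(annual_losses, reverse=True), start=1):
        p = rank / (n + 1)
        rows.append(ExceedanceRow(loss=loss, probability=p, return_period=1 / p))
    return rows


# Hypothetical annual flood losses (USD) for one country.
for row in loss_exceedance_table([1.2e6, 4.5e5, 9.8e6, 2.1e6, 7.0e5]):
    print(f"loss >= {row.loss:>12,.0f}  p = {row.probability:.2f}  "
          f"return period = {row.return_period:.1f} yr")
```

Plotting loss thresholds against their exceedance probabilities (or return periods) yields the loss exceedance curve.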

Individual Contributions (%)

| Task | Hardik | Dekun | Yuhang | Jucheng |
| --- | --- | --- | --- | --- |
| Client Liaison | 60 | 14 | 13 | 13 |
| HCI | 30 | 30 | 30 | 10 |
| Requirement Analysis | 20 | 20 | 20 | 40 |
| Pitch Presentations | 25 | 25 | 25 | 25 |
| Data Categorisation | 5 | 5 | 85 | 5 |
| Data Parsing/Downloading | 0 | 100 | 0 | 0 |
| Data Visualisation | 0 | 70 | 0 | 30 |
| Data Processing | 5 | 85 | 5 | 5 |
| Data Analysis | 25 | 25 | 25 | 25 |
| Report Writing | 52 | 20 | 8 | 20 |
| Report Website | 0 | 0 | 100 | 0 |
| Video Editing | 5 | 85 | 5 | 5 |
| Non-technical Presentation | 100 | 0 | 0 | 0 |
| Overall | 25 | 25 | 25 | 25 |

Evaluation of the project

After completing the documentation for our project, we discussed its quality and potential improvements with our client (Justin Ginnetti) and IFRC software engineers. They unanimously agreed that our documentation was very easy to follow, even for someone with a non-technical background, and found it extremely helpful. They also complimented us on including example.py files that demonstrate how to use each package. During our code handover, our clients highlighted our extensive use of docstrings and comments for each individual class and function. For someone who has not worked on our project, our documentation makes the program simple to understand and customise.
We made sure that our program is highly modular and separated into components. This ensures that when one part of the code is changed, few changes are needed elsewhere, so features can be added easily. To achieve this modularity, we used design patterns such as Controllers and Adapters to isolate the different components of our program.
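To make the Controller/Adapter separation concrete, here is a minimal sketch of the pattern; the class and method names (DataSourceAdapter, fetch_events, CurveController) and the stub record are illustrative assumptions, not our actual code.

```python
from abc import ABC, abstractmethod


class DataSourceAdapter(ABC):
    """Uniform interface the rest of the pipeline depends on."""

    @abstractmethod
    def fetch_events(self, country: str) -> list[dict]:
        ...


class DesInventarAdapter(DataSourceAdapter):
    """Wraps the DesInventar-specific download and parsing logic."""

    def fetch_events(self, country: str) -> list[dict]:
        # The real implementation would download and parse the XML export;
        # a hard-coded record stands in for it here.
        return [{"country": country, "hazard": "flood", "loss": 10_000}]


class CurveController:
    """Controller: depends only on the adapter interface, so swapping in
    another dataset requires no changes to this class."""

    def __init__(self, source: DataSourceAdapter):
        self.source = source

    def build_table(self, country: str) -> list[dict]:
        events = self.source.fetch_events(country)
        # Cleaning, slicing, and exceedance computation would follow here.
        return events


print(CurveController(DesInventarAdapter()).build_table("Nepal"))
```

Because the controller only sees the adapter interface, supporting a new dataset means writing one new adapter class rather than modifying the pipeline.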
CIDAS will run once every month, as that is the time frame in which the National Loss Databases are updated. As such, the efficiency of the program is not a significant concern, because it is not executed frequently. Furthermore, the number of events added each month is only a small fraction of the whole dataset, so the entire dataset does not need to be re-parsed; each run therefore uses very little memory and processing time.
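A minimal sketch of that incremental flow, assuming a hypothetical report_date field on each event record and an in-memory list of events:

```python
from datetime import date


def new_events(all_events: list[dict], last_run: date) -> list[dict]:
    """Keep only events reported after the previous monthly run."""
    return [e for e in all_events if e["report_date"] > last_run]


events = [
    {"id": 1, "report_date": date(2021, 3, 2), "loss": 5_000},
    {"id": 2, "report_date": date(2021, 4, 18), "loss": 12_000},
]
# Only event 2 needs processing if the pipeline last ran on 1 April 2021.
print(new_events(events, last_run=date(2021, 4, 1)))
```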

Feedback from Client

Team 5, which worked on generating estimates of impacts for high-frequency, low-impact natural hazards did a truly incredible job, and exceeded expectations in terms of the quality of the work, their ability to work autonomously and still meet deadlines, as well as their ability to explain their code and how it worked. They took on a complex assignment that required multiple steps of data acquisition, concatenation, cleaning, classification, analysis and visualisation, and they managed to do all of it with much less guidance from me than I had anticipated. They were professional throughout -- responsive to questions, proactive when needing guidance and very communicative -- and they seemed to have organised themselves very intelligently in terms of dividing up the work. To be honest, I'm sorry the project is coming to a close as I'd like to continue working with them.

—Justin Ginnetti (Senior Officer, Information Management and Risk Analysis at IFRC)

Future Work

Given that the time constraint was just over five months, we have listed below key functionalities that could be beneficial in the future.

  • Increase the number of datasets used for global coverage.
  • Implement an algorithm that compares the same events across multiple datasets to validate the accuracy of each event.
  • Use a non-parametric distribution with deep learning to model impact.
  • Add a batch_size parameter to the parsing module so that users can choose a batch size that suits their machine's constraints (see the sketch after this list).
  • Train a model that categorises different event types, rather than writing all the categorisations by hand.
  • Currently, everything is divided by country; the ability to select regions would also be useful, as regions can show impact trends similar to countries.
  • Improve error management.
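One possible shape for the proposed batch_size parameter is sketched below, using the standard-library xml.etree.ElementTree.iterparse to stream the XML and yield records in fixed-size batches, so peak memory is bounded by the batch rather than the file. The per-record tag name and record layout are illustrative assumptions, not the module's actual format.

```python
import xml.etree.ElementTree as ET
from typing import Iterator


def parse_in_batches(xml_path: str, batch_size: int = 500) -> Iterator[list[dict]]:
    """Stream an XML export and yield event records in fixed-size batches."""
    batch: list[dict] = []
    for _, elem in ET.iterparse(xml_path, events=("end",)):
        if elem.tag == "event":  # hypothetical per-record tag
            batch.append({child.tag: child.text for child in elem})
            elem.clear()  # release the parsed subtree to keep memory bounded
            if len(batch) == batch_size:
                yield batch
                batch = []
    if batch:
        yield batch


# Usage: for batch in parse_in_batches("export.xml", batch_size=1000): ...
```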