Project with ANCSSC

‘Let’s leave no one behind’: A cloud solution for analysing patterns in NGO projects with a first stage synthetic data generator

Prototype 1 website

The repository containing our code for reportQuery can be found here

The repository containing our code for the NGO Name Extractor can be found here

The repository containing code from Prototype 1 can be found here

The repository containing now-defunct code from our research can be found here

Overview


Problem Statement

‘Let’s leave no one behind’: A cloud solution for analysing patterns in NGO projects with a first stage synthetic data generator


Abstract

The aim of our project is to build a database from annual NGO reports. This involves extracting data from PDFs, most of which are in an image format, and storing it in a database hosted on the Azure cloud. The project is a first step towards future synthetic data generation, with the goal of producing a general model that can be used to help meet the UN’s Sustainable Development Goals.

As part of our project, we are also collaborating with a master’s team who are developing a web app for the ANCSSC. Our contribution is a backend in Azure to store their data. This database is a first step towards providing the ANCSSC with actionable data and predictions about the progress and efficiency of NGOs operating in the south.


Key Features

  • Our reportQuery PDF extraction system, which extracts data from NGO reports and stores it in a corresponding database structure
  • A database structure for the ANCSSC web app currently being built by a master’s team
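The extract-then-store flow described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the reportQuery implementation: the table name, the column names, the "Organisation:" line format and both helper functions are hypothetical, and Python's built-in sqlite3 stands in for the Azure-hosted database. It also assumes the report text has already been recovered from the PDF.

```python
import re
import sqlite3

# Hypothetical schema: one row per NGO per report year.
# The real reportQuery database structure is not shown in this document.
SCHEMA = """
CREATE TABLE IF NOT EXISTS ngo_report (
    id          INTEGER PRIMARY KEY,
    ngo_name    TEXT NOT NULL,
    report_year INTEGER,
    UNIQUE (ngo_name, report_year)
)
"""


def extract_fields(text):
    """Pull an NGO name and a report year out of already-extracted report text.

    Assumes (for illustration only) that the name appears on a line that
    starts with 'Organisation:' or 'Organization:'.
    """
    name_match = re.search(r"^Organi[sz]ation:\s*(.+)$", text, re.MULTILINE)
    year_match = re.search(r"\b(?:19|20)\d{2}\b", text)
    return {
        "ngo_name": name_match.group(1).strip() if name_match else None,
        "report_year": int(year_match.group(0)) if year_match else None,
    }


def store_report(conn, fields):
    """Insert one report row; skip rows without a name and ignore duplicates."""
    conn.execute(SCHEMA)
    if fields["ngo_name"] is None:
        return
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "INSERT OR IGNORE INTO ngo_report (ngo_name, report_year) "
            "VALUES (?, ?)",
            (fields["ngo_name"], fields["report_year"]),
        )


# In-memory database as a local stand-in for the cloud-hosted one.
conn = sqlite3.connect(":memory:")
sample = "Organisation: Example Relief Trust\nAnnual Report 2019\n"
fields = extract_fields(sample)
store_report(conn, fields)
store_report(conn, fields)  # a repeated report does not create a second row
rows = conn.execute("SELECT ngo_name, report_year FROM ngo_report").fetchall()
```

Using parameterised queries and a uniqueness constraint keeps repeated runs over the same report from duplicating rows; a real deployment would swap the sqlite3 connection for a driver that talks to the hosted database.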

Future work

We have discussed applications for our BERT algorithm with John Booth, the senior data steward at GOSH DRIVE, and have also spoken with Sheena. As a result, our recommendations for future work will include applications of our project to the healthcare sector. This can be read about here.

Video

This is our client video, which gives a good overview of the project.


The Development Team

Rachel Mattoo - Team Leader

rachel.mattoo.18@ucl.ac.uk

Client liaison, team management, SQL database management, programming

Mark Anson

mark.anson.18@ucl.ac.uk

SQL database management, SQL database connections, programming, website

Yansong Liu

yansong.liu.18@ucl.ac.uk

Programming including BERT development, testing