We are working with the Alliance of NGOs and CSOs for South-South Cooperation (ANCSSC) to support its efforts to assist the NGOs within its remit. The ANCSSC's mission is to "enhance civil society’s understanding of the value of South-South Cooperation in developmental, humanitarian and related spheres." It pursues this mission particularly through its focus on the Sustainable Development Goals (SDGs).
Broadly, this project aims to give the ANCSSC a way to review and compare the effectiveness and sustainability of NGOs and their projects. This would be done in two stages: first, gathering relevant data from NGOs into a database, either pulled from annual reports or entered manually through a purpose-built form; and second, an algorithm to compare and contrast this data between NGOs.
The requirement gathering for this project has been quite a complex journey. The original problem statement was left deliberately vague to allow flexibility in the direction we took. We settled on a direction only after a lot of back-and-forth with multiple sources, including our clients, machine learning specialists, and members of the Computer Science Department, notably Dean Mohamedally. Originally our project was going to be more focused on the creation of a Generative Neural Network to build a dataset with which to model the success of the NGOs. As we have continued to gather and understand our requirements, we and our clients have realised that the major obstacle for us to overcome is in fact gathering the data from the NGOs. In particular, this involves extracting relevant information from the annual reports of NGOs working under the ANCSSC, as well as building a form and corresponding database structure so that data can be added by ANCSSC employees.
We have found it a challenge to truly nail down our requirements this term, and it is expected that things could still change in term 2 as we continue to seek feedback from our clients and other sources of expertise. Nevertheless, we believe that we have reached a good understanding with our clients as to what is expected of us throughout this project.
Requirement-gathering methods such as surveys are not possible in our case. The user base of our software is incredibly niche, consisting mostly of ANCSSC employees. We hope that during term 2 we will be able to meet and interact with others from the ANCSSC in order to widen our understanding of what they do and to continue refining our understanding of how we can help them.
Initially the goal of this project was focused on building a Generative Adversarial Network (GAN) in order to create a decision-making tool for the UN. Although we would love to attempt this, after long consultations with our clients and members of staff at UCL, we have scaled back our ambitions to focus on building the dataset. Fundamentally, we want to be able to collect and store as much data about these NGOs as possible, making it possible for a future team, such as a group of UCL masters students, to come in and build the model. Building the dataset is still a fundamental part of data science and is certainly complex enough for this project.
In overview, therefore, we will build two database structures: one to contain data collected via a form submitted by NGOs, and a second to contain data scraped from NGO reports, which is essentially a task of collecting and synthesising unstructured data.
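To make the two-store idea concrete, the following is a minimal sketch only, not our actual database design. It uses Python's built-in `sqlite3` module as a local stand-in for a hosted database, and every table, column, and function name in it (`form_submissions`, `report_extracts`, `create_store`, and so on) is a hypothetical chosen for illustration.

```python
import sqlite3

# Hypothetical schema: all table and column names are illustrative
# assumptions, not the project's real database design.
SCHEMA = """
CREATE TABLE IF NOT EXISTS form_submissions (
    id INTEGER PRIMARY KEY,
    ngo_name TEXT NOT NULL,
    field_name TEXT NOT NULL,
    field_value TEXT
);
CREATE TABLE IF NOT EXISTS report_extracts (
    id INTEGER PRIMARY KEY,
    ngo_name TEXT NOT NULL,
    source_file TEXT NOT NULL,
    page_number INTEGER,
    extracted_text TEXT NOT NULL
);
"""


def create_store(path=":memory:"):
    """Open a database and create both stores if they do not exist."""
    conn = sqlite3.connect(path)
    conn.executescript(SCHEMA)
    return conn


def add_form_value(conn, ngo_name, field_name, field_value):
    """Store one answer entered through the data collection form."""
    conn.execute(
        "INSERT INTO form_submissions (ngo_name, field_name, field_value) "
        "VALUES (?, ?, ?)",
        (ngo_name, field_name, field_value),
    )
    conn.commit()


def add_report_extract(conn, ngo_name, source_file, page_number, extracted_text):
    """Store one passage of text pulled from an NGO report."""
    conn.execute(
        "INSERT INTO report_extracts "
        "(ngo_name, source_file, page_number, extracted_text) "
        "VALUES (?, ?, ?, ?)",
        (ngo_name, source_file, page_number, extracted_text),
    )
    conn.commit()


def ngo_names(conn):
    """Return the sorted names of NGOs present in either store."""
    rows = conn.execute(
        "SELECT ngo_name FROM form_submissions "
        "UNION SELECT ngo_name FROM report_extracts "
        "ORDER BY ngo_name"
    ).fetchall()
    return [row[0] for row in rows]
```

Keeping form answers as name/value rows is one possible choice that lets the form change without altering the schema; a fixed column per form field would be an equally reasonable alternative once the form layout is settled.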
ID | Requirements | Priority | Completed status | Contributors |
---|---|---|---|---|
1 | Data collection form for NGOs | MUST | Completed preliminary form layouts | Mark, Rachel |
2 | Database 1 (for collecting form data) | MUST | Database designed and created in Azure | Mark |
3 | PDF data extraction tool | MUST | "Phase one" of development progress | Yansong, Rachel, Mark |
4 | Database 2 (UN "Knowledge-base") | MUST | Todo | - |
5 | Front end access to database 1 | SHOULD | Todo | - |
6 | Front end access to database 2 | SHOULD | Todo | - |
7 | Server unit to contain PDF extractor | SHOULD | Todo | - |
8 | Statistical analysis tool | COULD | Todo | - |
9 | Build generative adversarial network (GAN) | COULD | Todo | - |
10 | Machine Learning based recommendation algorithm | COULD | Todo | - |
11 | Link database 1 & 2 into a single coherent structure | COULD | Todo | - |
12 | Build app in order for individuals in and outside of the ANCSSC to access and review our data | WOULD LIKE | Todo | - |