Requirements

Project Background

We are working with the Alliance of NGOs and CSOs for South-South Cooperation (ANCSSC) to support its efforts to assist and improve the NGOs within its remit. The ANCSSC's mission is to "enhance civil society’s understanding of the value of South-South Cooperation in developmental, humanitarian and related spheres." They achieve this particularly through their focus on the Sustainable Development Goals (SDGs).

Broadly, this project aims to give the ANCSSC a way to review and compare the effectiveness and sustainability of NGOs and their projects. This would work in two stages: first, gathering relevant data from NGOs into a database, either pulled from annual reports or entered manually through a purpose-built form; and second, an algorithm to compare and contrast this data between NGOs.


Requirement Gathering

Requirement gathering for this project has been quite a complex journey. The original problem statement was left deliberately vague to give us freedom in the direction we took. Settling on that direction took a lot of back-and-forth between ourselves and multiple sources, including our clients, machine learning specialists, and members of the Computer Science Department, notably Dean Mohamedally. Originally, our project was going to focus more on creating a Generative Neural Network to build a dataset with which to model the success of the NGOs. As we have continued to gather and understand our requirements, we and our clients have realised that the major obstacle to overcome is in fact gathering the data from the NGOs. In particular, this involves extracting relevant information from the annual reports of NGOs working under the ANCSSC, as well as building a form and corresponding database structure so that data can be added by ANCSSC employees.

We have found it a challenge to truly nail down our requirements this term, and it is expected that things could still change in term 2 as we continue to seek feedback from our clients and other sources of expertise. Nevertheless, we believe that we have reached a good understanding with our clients as to what is expected of us throughout this project.

Requirement-gathering methods such as surveys are not possible in our case. The user base of our software is incredibly niche, consisting mostly of ANCSSC employees. We hope that during term 2 we will be able to meet and interact with others from the ANCSSC, both to widen our understanding of what they do and to continue refining our understanding of how we can help them.


Persona






Project Goals

Initially, the goal of this project centred on building a Generative Adversarial Network (GAN) to create a decision-making tool for the UN. Although we would love to attempt this, after long consultations with our clients and members of staff at UCL we have scaled back our ambitions to focus on building the dataset. Fundamentally, we want to collect and store as much data about these NGOs as possible, so that a future team, such as a group of UCL master's students, can come in and build the model. Building the dataset is still a fundamental part of data science and is certainly complex enough for this project.

In overview, therefore, we will build two database structures: one to contain data collected via a form submitted by NGOs, and a second to contain data scraped from NGO reports. The latter is essentially a task of collecting and synthesising information from unstructured data.
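As a rough illustration of how the two stores could sit side by side, here is a minimal sketch. It uses SQLite from the Python standard library as a stand-in for the Azure-hosted databases, and models the two database structures as two tables in one file. Every table name, column name and helper function below is a hypothetical placeholder, not the project's actual design.

```python
import sqlite3

# Hypothetical schema: the table and column names are illustrative
# assumptions, not the project's real database design.
SCHEMA = """
CREATE TABLE IF NOT EXISTS form_submissions (
    id          INTEGER PRIMARY KEY,
    ngo_name    TEXT NOT NULL,
    field_name  TEXT NOT NULL,
    field_value TEXT
);
CREATE TABLE IF NOT EXISTS report_extracts (
    id             INTEGER PRIMARY KEY,
    ngo_name       TEXT NOT NULL,
    source_pdf     TEXT NOT NULL,
    category       TEXT NOT NULL,
    extracted_text TEXT
);
"""


def create_store(path=":memory:"):
    """Open a database and create both tables if they are missing."""
    conn = sqlite3.connect(path)
    conn.executescript(SCHEMA)
    return conn


def add_form_answer(conn, ngo_name, field_name, field_value):
    """Store one answer from a submitted form (database 1 in the text)."""
    conn.execute(
        "INSERT INTO form_submissions (ngo_name, field_name, field_value) "
        "VALUES (?, ?, ?)",
        (ngo_name, field_name, field_value),
    )
    conn.commit()


def add_report_extract(conn, ngo_name, source_pdf, category, extracted_text):
    """Store one piece of text pulled from a report (database 2 in the text)."""
    conn.execute(
        "INSERT INTO report_extracts "
        "(ngo_name, source_pdf, category, extracted_text) VALUES (?, ?, ?, ?)",
        (ngo_name, source_pdf, category, extracted_text),
    )
    conn.commit()


def records_for(conn, ngo_name):
    """Return (form rows, report rows) held for a single NGO."""
    forms = conn.execute(
        "SELECT field_name, field_value FROM form_submissions "
        "WHERE ngo_name = ? ORDER BY id",
        (ngo_name,),
    ).fetchall()
    reports = conn.execute(
        "SELECT source_pdf, category, extracted_text FROM report_extracts "
        "WHERE ngo_name = ? ORDER BY id",
        (ngo_name,),
    ).fetchall()
    return forms, reports
```

Keeping the NGO name in both tables is one simple way to leave the door open for linking the two stores later; a real design would more likely use a shared NGO identifier.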


MoSCoW List

| ID | Requirement | Priority | Completion status | Contributors |
|----|-------------|----------|-------------------|--------------|
| 1 | Data collection form for NGOs: form built in Microsoft Forms or manually; server on Azure to handle incoming form data and store it in database 1 | MUST | Completed preliminary form layouts | Mark, Rachel |
| 2 | Database 1 (for collecting form data): fields to contain information from the submitted form | MUST | Database designed and created in Azure | Mark |
| 3 | PDF data extraction tool: extract relevant information from report PDFs from NGOs; such information may include but is not limited to financial information, project details, location, staff records; using Azure's Cognitive Services | MUST | "Phase one" of development progress | Yansong, Rachel, Mark |
| 4 | Database 2 (UN "Knowledge-base"): fields to store information from the PDF extraction tool | MUST | Todo | - |
| 5 | Front-end access to database 1: through web app; hosted on Azure | SHOULD | Todo | - |
| 6 | Front-end access to database 2: through web app; hosted on Azure | SHOULD | Todo | - |
| 7 | Server unit to contain PDF extractor: mechanism to upload new PDF documents; accessible from web app; mechanism to send extracted data to database | SHOULD | Todo | - |
| 8 | Statistical analysis tool: first-stage analysis of the data we collect; could be used as part of, or be superseded by, the machine-learning-based recommendation algorithm | COULD | Todo | - |
| 9 | Build generative adversarial network (GAN): utilises both database structures to create lots of similar data; allows modelling of the data to be undertaken | COULD | Todo | - |
| 10 | Machine-learning-based recommendation algorithm: can generate synthesised UN reports; provides recommendations to the UN, particularly around which NGOs should receive more or less funding | COULD | Todo | - |
| 11 | Link databases 1 and 2 into a single coherent structure | COULD | Todo | - |
| 12 | Build app for individuals inside and outside the ANCSSC to access and review our data: includes graphs and other visual media to explain different SDGs | WOULD LIKE | Todo | - |
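To make the PDF data extraction requirement more concrete, the sketch below shows one very simple way to sort sentences from already-extracted report text into the categories named in the list above (financial information, project details, location, staff records). It is only an illustration: it uses plain keyword matching as a stand-in and is not the Azure Cognitive Services approach the project intends to use. The keyword lists and the function name are hypothetical.

```python
import re

# Hypothetical keyword lists, chosen only for illustration. A real tool
# would need far richer rules or a trained model.
CATEGORY_KEYWORDS = {
    "financial information": ["income", "expenditure", "budget", "funding"],
    "project details": ["project", "programme", "initiative"],
    "location": ["located", "region", "country"],
    "staff records": ["staff", "employees", "volunteers"],
}


def categorise_sentences(text):
    """Group the sentences of `text` by the categories whose keywords they mention.

    A sentence may appear under several categories, or under none.
    """
    # Naive sentence split: break after '.', '!' or '?' followed by whitespace.
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    result = {category: [] for category in CATEGORY_KEYWORDS}
    for sentence in sentences:
        lowered = sentence.lower()
        for category, keywords in CATEGORY_KEYWORDS.items():
            # Whole-word match so that, for example, "country" does not match "countryside".
            if any(re.search(r"\b" + re.escape(k) + r"\b", lowered) for k in keywords):
                result[category].append(sentence)
    return result
```

Calling `categorise_sentences` on the text of a report returns a dictionary keyed by category, which could then be written into the second database as one row per matched sentence.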