1.   Summary of Achievements


Achievements Table
ID | Requirement | Priority | Status | Contributors
1 | Testing to find the framework with the highest accuracy | Must have | | All
2 | An "Upload Data" page | Must have | | All
3 | A "Start Verifying" page | Must have | | All
4 | A "Home" page | Must have | | All
5 | A "Processing" page | Must have | | All
6 | A "User Verified" page | Must have | | All
7 | A "Submit" page | Must have | | All
8 | A "Consent" page | Must have | | All
9 | Literature review | Must have | | All
10 | Comparisons between different feature extractions | Must have | | All
11 | Comparisons between dimensionality reduction methods | Must have | | All
12 | Comparisons between different learning models | Must have | | All
13 | Database | Must have | | All
14 | File Uploader | Must have | | All
15 | Audio Recorder | Must have | | All
16 | Feature extraction: MFCC | Must have | | All
17 | Modeling: HMM | Must have | | All
18 | Modeling: GMM | Must have | | All
19 | Front-End: Modify with AJAX | Must have | | All
20 | Use Existing Code (libraries and APIs) | Must have | | All
21 | Comparisons between different frameworks | Should have | | All
22 | Speaker Recognition vs Biometric Feature Extraction | Should have | | All
23 | A "Recent verifications" page | Could have | In progress | All
24 | Feature extraction: PLP | Could have | In progress | All
25 | Feature extraction: LPC | Could have | In progress | All
26 | Optimisation - Check 2 or 3 digits of the number to find the user faster | Could have | x | -
27 | {To be added} | Won't have | x | -

Known Bugs List

ID | Bug Description | Priority
1 | When the sample rate of the input file is high, the output MFCCs contain some invalid values. | Medium
2 | The number of trials in which an audio sample is successfully matched to a model that does not correspond to it is higher than expected. | High
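
Bug 2 concerns trials in which an audio sample is accepted against the wrong model. One way to quantify this is the false acceptance rate at a given decision threshold. The sketch below is illustrative only: the trial format, the `error_rates` helper and the example scores are assumptions, not part of the project code.

```python
from typing import List, Tuple

# Hypothetical trial format: (match score, True if the audio really
# belongs to the speaker whose model it was compared against).
Trial = Tuple[float, bool]


def error_rates(trials: List[Trial], threshold: float) -> Tuple[float, float]:
    """Return (false acceptance rate, false rejection rate) at a threshold."""
    impostor_scores = [score for score, is_genuine in trials if not is_genuine]
    genuine_scores = [score for score, is_genuine in trials if is_genuine]

    # An impostor trial is falsely accepted when its score reaches the threshold.
    false_accepts = sum(score >= threshold for score in impostor_scores)
    # A genuine trial is falsely rejected when its score falls below it.
    false_rejects = sum(score < threshold for score in genuine_scores)

    far = false_accepts / len(impostor_scores) if impostor_scores else 0.0
    frr = false_rejects / len(genuine_scores) if genuine_scores else 0.0
    return far, frr


# Made-up example scores, for illustration only.
example_trials = [
    (0.9, True), (0.4, True), (0.8, True),
    (0.7, False), (0.2, False), (0.1, False),
]
far, frr = error_rates(example_trials, threshold=0.5)
print(f"FAR={far:.2f} FRR={frr:.2f}")
```

Raising the threshold lowers the false acceptance rate at the cost of rejecting more genuine speakers, so tracking both rates together makes the trade-off visible.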

Individual Contribution Table

Work Packages | Ruo Chen | Sabina-Maria Mitroi | Jingze Xu
Client Liaison | 33% | 33% | 33%
Biweekly Reports | 33% | 33% | 33%
Requirement Analysis | 33% | 33% | 33%
Research and Experiments | 33% | 33% | 33%
UI Design | 33% | 33% | 33%
Coding | 33% | 33% | 33%
Video Making | 33% | 33% | 33%
Report Website | 33% | 33% | 33%
Poster Design | 33% | 33% | 33%
Testing | 33% | 33% | 33%
Overall Contribution | 33% | 33% | 33%
Main roles | Researcher, Back-end developer | Front-end developer, UI designer | Researcher, Back-end developer


2.   Critical Evaluation


User Interface

The user interface of our web app has all the necessary parts and a minimalist design. However, the style varies between pages. If we had more time, we would unify it.

Functionality

The app has all must-have and should-have functionality. One could-have feature, the recent verifications search, is still unfinished.

Stability

The system is stable. We have not found any stability problems so far, and the verification process also runs reliably.

Efficiency

The MFCC + GMM pipeline is efficient enough for the client's needs. By our measurements, uploading data and training models is about 60 times faster than the minimum standard.
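
A figure such as "N times faster than the minimum standard" can be obtained by timing a task and dividing the allowed time by the measured time. The sketch below shows one way to do that. The `speedup_over_budget` helper, the placeholder task and the 10-second budget are all assumptions for illustration; the actual minimum standard and the project's measurement method are not stated here.

```python
import time
from typing import Callable


def speedup_over_budget(task: Callable[[], object],
                        budget_seconds: float,
                        repeats: int = 3) -> float:
    """Run `task` a few times and return budget / best measured time."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        task()
        best = min(best, time.perf_counter() - start)
    # Guard against a zero reading on very fast tasks.
    return budget_seconds / max(best, 1e-9)


def placeholder_training_task() -> int:
    # Stand-in for "upload data and train a model" (hypothetical workload).
    return sum(i * i for i in range(100_000))


factor = speedup_over_budget(placeholder_training_task, budget_seconds=10.0)
print(f"about {factor:.0f} times faster than the budget")
```

Taking the best of several runs reduces the influence of one-off delays; reporting the mean instead would give a more conservative figure.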

Compatibility

Our web app runs smoothly on different operating systems.

Maintainability

Our code has good maintainability. We refactored it several times, and the parameters of the algorithms can be changed easily.

Project Management

We met frequently, and before the end of every meeting we discussed and divided the workload, which helped us manage progress. We sent bi-weekly reports on time and held regular video meetings with our clients. We used GitLab for version control. To keep the codebase clean, pushing directly to origin master was not allowed: we pushed to our own branches and merged them into master when there were no conflicts.

3.   Future Work

If we had another six months, we would improve the UI design and finish the unfinished algorithms. We would add more functionality, such as the recent verifications search, and refine the front-end in more detail. The project could also grow from speaker verification into speaker identification.
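
The difference between verification and identification can be sketched as follows. Verification checks one claimed speaker's model against a threshold, while identification searches all enrolled models for the best match. The `verify` and `identify` helpers, the speaker names and the scores are hypothetical; the scores stand in for whatever per-model match score the back-end produces (for example, a GMM log-likelihood).

```python
from typing import Dict, Optional


def verify(scores: Dict[str, float], claimed_speaker: str,
           threshold: float) -> bool:
    """Verification: accept only if the claimed speaker's model scores high enough."""
    return scores.get(claimed_speaker, float("-inf")) >= threshold


def identify(scores: Dict[str, float], threshold: float) -> Optional[str]:
    """Identification: return the best-scoring enrolled speaker,
    or None if no model reaches the threshold."""
    if not scores:
        return None
    best_speaker = max(scores, key=lambda name: scores[name])
    return best_speaker if scores[best_speaker] >= threshold else None


# Made-up scores of one utterance against three enrolled models.
utterance_scores = {"alice": -41.2, "bob": -55.0, "carol": -47.9}

print(verify(utterance_scores, "bob", threshold=-45.0))
print(identify(utterance_scores, threshold=-45.0))
```

Identification has to score the utterance against every enrolled model, so its cost grows with the number of users, which verification avoids by checking a single claimed identity.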

Moreover, to make the prototype work on a phone as well as on a PC, we considered using the Java Telephony API (JTAPI), which supports telephony call control. It is an extensible API designed to scale for use in a range of domains, from first-party call control in a consumer device to third-party call control in large distributed call centers. To make this work on a phone call, we would need an app running on the phone and a JTAPI application running on a PC connected to the phone line. When the connection is made, a WAV file is generated for the user and automatically passed to the Python prototype for parsing.
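
On the Python side, picking up the generated WAV file could look roughly like the sketch below, which uses only the standard-library `wave` module. It assumes the recording is 16-bit mono PCM; the `read_wav_mono16` helper and the file path are hypothetical, and the format actually produced by the telephony side is not known here.

```python
import struct
import wave
from typing import List, Tuple


def read_wav_mono16(path: str) -> Tuple[int, List[int]]:
    """Return (sample_rate, samples) for a 16-bit mono PCM WAV file."""
    with wave.open(path, "rb") as wav_file:
        # Assumption: the call recording is mono with 2 bytes per sample.
        if wav_file.getnchannels() != 1 or wav_file.getsampwidth() != 2:
            raise ValueError("expected 16-bit mono PCM audio")
        sample_rate = wav_file.getframerate()
        raw = wav_file.readframes(wav_file.getnframes())

    # WAV PCM data is little-endian; "h" is a signed 16-bit integer.
    samples = list(struct.unpack("<%dh" % (len(raw) // 2), raw))
    return sample_rate, samples
```

A call such as `read_wav_mono16("call.wav")` would return the sample rate and the raw samples, which could then be handed to the feature extraction step. Checking the sample rate at this point is also a natural place to catch inputs outside the range the MFCC step handles well.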