ID | Requirement | Priority | Status | Contributors |
---|---|---|---|---|
1 | Testing to find the framework with the highest accuracy | Must have | √ | All |
2 | An "Upload Data" page | Must have | √ | All |
3 | A "Start Verifying" page | Must have | √ | All |
4 | A "Home" page | Must have | √ | All |
5 | A "Processing" page | Must have | √ | All |
6 | A "User Verified" page | Must have | √ | All |
7 | A "Submit" page | Must have | √ | All |
8 | A "Consent" page | Must have | √ | All |
9 | Literature review | Must have | √ | All |
10 | Comparisons between different feature extraction methods | Must have | √ | All |
11 | Comparisons between dimensionality reduction methods | Must have | √ | All |
12 | Comparisons between different learning models | Must have | √ | All |
13 | Database | Must have | √ | All |
14 | File Uploader | Must have | √ | All |
15 | Audio Recorder | Must have | √ | All |
16 | Feature extraction: MFCC | Must have | √ | All |
17 | Modeling: HMM | Must have | √ | All |
18 | Modeling: GMM | Must have | √ | All |
19 | Front-End: Modify with AJAX | Must have | √ | All |
20 | Use Existing Code (libraries and APIs) | Must have | √ | All |
21 | Comparisons between different frameworks | Should have | √ | All |
22 | Speaker recognition vs. biometric feature extraction | Should have | √ | All |
23 | A "Recent verifications" page | Could have | In Progress... | All |
24 | Feature extraction: PLP | Could have | In Progress... | All |
25 | Feature extraction: LPC | Could have | In Progress... | All |
26 | Optimisation - Check 2 or 3 digits of the number to find the user faster | Could have | x | - |
27 | {To be added} | Won't have | x | - |

ID | Bug Description | Priority |
---|---|---|
1 | When the sample rate of the input file is high, the output MFCCs have some invalid values. | Medium |
2 | The number of trials in which an audio sample is matched successfully to a model that does not correspond to it (false acceptances) is higher than expected. | High |

Work Packages | Ruo Chen | Sabina-Maria Mitroi | Jingze Xu |
---|---|---|---|
Client Liaison | 33% | 33% | 33% |
Biweekly Reports | 33% | 33% | 33% |
Requirement Analysis | 33% | 33% | 33% |
Research and Experiments | 33% | 33% | 33% |
UI Design | 33% | 33% | 33% |
Coding | 33% | 33% | 33% |
Video Making | 33% | 33% | 33% |
Report Website | 33% | 33% | 33% |
Poster Design | 33% | 33% | 33% |
Testing | 33% | 33% | 33% |
Overall Contribution | 33% | 33% | 33% |
Main roles | Researcher, Back-end developer | Front-end developer, UI designer | Researcher, Back-end developer |

The user interface of our web app has all the parts it needs and a minimalist design. However, the style varies between pages. If we had more time, we would unify it.
The app has all of the must-have and should-have functionality. One could-have feature, the recent verifications search, is still in progress.
The system is stable. We have not found any instability so far, and the verification process is also stable.
The MFCC + GMM pipeline is efficient enough for the client's needs. By our calculation, uploading and training models is about 60 times faster than the minimum standard.
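One way a speed-up figure of this kind could be obtained is to time the operation and divide the permitted time by the measured time. The sketch below is illustrative only: `speedup_vs_budget`, the stand-in workload and the 60-second budget are assumptions, not the project's actual benchmark or the client's actual threshold.

```python
import time


def speedup_vs_budget(task, budget_seconds):
    """Run `task` once and return (elapsed_seconds, budget / elapsed).

    A ratio above 1 means the task finished faster than the budget allows.
    """
    start = time.perf_counter()
    task()
    elapsed = time.perf_counter() - start
    # Guard against a zero reading on very coarse timers.
    elapsed = max(elapsed, 1e-9)
    return elapsed, budget_seconds / elapsed


def dummy_training():
    # Stand-in workload (assumption); the real task would extract MFCCs
    # and fit a GMM for one speaker.
    sum(i * i for i in range(100_000))


if __name__ == "__main__":
    elapsed, ratio = speedup_vs_budget(dummy_training, budget_seconds=60.0)
    print(f"took {elapsed:.4f} s, {ratio:.0f}x faster than the budget")
```

A single run is noisy, so in practice the measurement would be repeated and averaged before quoting a ratio.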
Our web app runs smoothly on different operating systems.
Our code is easy to maintain. We refactored it several times, and the parameters of the algorithms can be changed easily.
We met frequently, and before the end of every meeting we discussed and divided the workload, which helped us manage progress. We sent the bi-weekly reports on time and held regular video meetings with our clients. We used GitLab for version control. To keep the codebase clean, pushing directly to origin master was not allowed: we pushed to our own branches and merged them into master only if there were no conflicts.
If we had another six months, we would improve the UI design and finish the algorithms we left unfinished. We would add more functionality, such as the recent verifications search, and give the front-end more detail. The project could also become a speaker identification project rather than just a verification one.
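The difference between the two tasks can be sketched as follows. Verification scores the audio against the single model of the claimed speaker and compares the result with a threshold, whereas identification scores it against every enrolled model and picks the best one. The function names, the per-speaker score dictionary and the threshold values are hypothetical; in the prototype the scores would presumably come from each speaker's GMM log-likelihood.

```python
def verify(scores, claimed_id, threshold):
    """Verification: accept or reject a single claimed identity."""
    return scores[claimed_id] >= threshold


def identify(scores, threshold=None):
    """Identification: return the best-scoring enrolled speaker.

    If a threshold is given and even the best score falls below it,
    return None (the speaker is treated as unknown).
    """
    best_id = max(scores, key=scores.get)
    if threshold is not None and scores[best_id] < threshold:
        return None
    return best_id


if __name__ == "__main__":
    # Made-up log-likelihood scores of one utterance under three models.
    scores = {"alice": -41.2, "bob": -55.7, "carol": -48.9}
    print(verify(scores, "bob", threshold=-45.0))
    print(identify(scores))
```

Identification needs one score per enrolled speaker, so its cost grows with the size of the database, while verification only ever evaluates one model.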
Moreover, to make the prototype work on a phone as well as on a PC, we were thinking of using the Java Telephony API (JTAPI), which supports telephony call control. It is an extensible API designed to scale for use in a range of domains, from first-party call control in a consumer device to third-party call control in large distributed call centers. To make verification work during a phone call, we would need an app running on the phone and a JTAPI application running on a PC connected to the phone line. Once the connection is made, a WAV file would be generated for the user and passed automatically to the prototype in Python.
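On the Python side, picking up the generated recording could look like the sketch below, which reads a WAV file with the standard-library `wave` module. The function name `load_wav` and the file path are hypothetical, and the sketch assumes uncompressed PCM audio; how JTAPI would actually deliver the file is left open.

```python
import wave


def load_wav(path):
    """Read a PCM WAV file and return its raw frames plus basic metadata."""
    with wave.open(path, "rb") as wav_file:
        info = {
            "channels": wav_file.getnchannels(),
            "sample_width_bytes": wav_file.getsampwidth(),
            "sample_rate_hz": wav_file.getframerate(),
            "n_frames": wav_file.getnframes(),
        }
        frames = wav_file.readframes(info["n_frames"])
    return frames, info


if __name__ == "__main__":
    # Hypothetical path; in the envisaged setup the call recorder would write it.
    frames, info = load_wav("call_recording.wav")
    duration = info["n_frames"] / info["sample_rate_hz"]
    print(f"{duration:.2f} s of audio at {info['sample_rate_hz']} Hz")
```

Reading the sample rate up front also makes it possible to check or resample the audio before feature extraction, since telephone recordings typically use a different rate from microphone uploads.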