The Front-End is implemented by HTML, CSS and javaScript.
Django is in charge of handling the requests and
communicating with the database. Machine learning
techniques are used to conduct the speaker verification.
There are three main components of the algorithms,
feature extraction, speaker modeling and feature matching.
Feature extraction is used both in entering and matching
processes, where entering is to train a voice model using
as much audio as possible from one single person, whose
features are identified and extracted by some machine
learning algorithms like MFCC.
Those extracted features are represented in high-dimensional
vectors, which would be simplified in terms of dimension and
modeled using statistical model such as HMM. The result is
stored in database. Feature matching process requires specific
algorithms as well, which produces similarity between the
uploaded audio and the voice model in the database. Decision
of accepted or rejected is made according to the similarity
number generated. In the database, user details and paths to
models files are stored.
1. File uploader with Drag-and-Drop support
2. Auto type convertor for any type of audio file
3. Audio Recorder (recording, replay, download and upload)
4. Data uploading for training models
5. Speaker verification using MFCC and GMM algorithms