March
Progress
This month we optimised user data collection, created Python modules for implementing and testing ranking algorithms, and ran the algorithm tests. We also optimised search performance.
User data
We now back up user data to local storage, which improves performance.
Privacy and security are not the main concerns of our application, since it is only a prototype and not meant for commercial deployment. Still, we are trying to develop a safe application that respects user data. Therefore, we now encrypt user login credentials on the backend.
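As an illustration of protecting stored credentials, here is a minimal sketch. It is not our actual backend code: it assumes salted password hashing with PBKDF2 from the Python standard library, a common alternative to reversible encryption for login credentials. The function names and the iteration count are hypothetical.

```python
import hashlib
import hmac
import os
from typing import Optional, Tuple

# Assumed work factor for illustration; tune it for the deployment hardware.
ITERATIONS = 100_000


def hash_password(password: str, salt: Optional[bytes] = None) -> Tuple[bytes, bytes]:
    """Return (salt, derived_key), which is stored instead of the plain password."""
    if salt is None:
        salt = os.urandom(16)  # fresh random salt per user
    key = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)
    return salt, key


def verify_password(password: str, salt: bytes, expected_key: bytes) -> bool:
    """Re-derive the key from a login attempt and compare in constant time."""
    _, key = hash_password(password, salt)
    return hmac.compare_digest(key, expected_key)
```

With this approach the backend never stores the password itself, only the salt and the derived key.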
Testing algorithms
We created a dedicated Python module for testing Elasticsearch ranking algorithms. We also created a Python module for implementing Learn-To-Rank algorithms with RankLib and integrated it into the testing module.
Using both modules, we tested several classic and Learn-To-Rank ranking algorithms on labelled query data from the Spotify dataset.
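To make the idea of testing on labelled query data concrete, here is a small sketch of how a ranking can be scored against relevance labels. It assumes nDCG@k as the metric, which is a common choice but not necessarily the one used in our report. The function names and the input layout (one list of relevance labels per query, in ranked order) are hypothetical. For simplicity, the ideal ranking is computed from the retrieved list only.

```python
import math
from typing import Mapping, Sequence


def dcg_at_k(relevances: Sequence[float], k: int) -> float:
    """Discounted cumulative gain over the top-k results (linear gain)."""
    return sum(rel / math.log2(rank + 2) for rank, rel in enumerate(relevances[:k]))


def ndcg_at_k(relevances: Sequence[float], k: int) -> float:
    """DCG normalised by the DCG of the same labels in ideal (descending) order."""
    ideal = dcg_at_k(sorted(relevances, reverse=True), k)
    return dcg_at_k(relevances, k) / ideal if ideal > 0 else 0.0


def mean_ndcg(runs: Mapping[str, Sequence[float]], k: int) -> float:
    """Average nDCG@k over all queries; `runs` maps query id to ranked labels."""
    if not runs:
        return 0.0
    return sum(ndcg_at_k(labels, k) for labels in runs.values()) / len(runs)
```

Comparing `mean_ndcg` across algorithms on the same set of labelled queries gives one number per algorithm to rank them by.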
We integrated an LM Dirichlet search engine into the main project with Elasticsearch.
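A minimal sketch of what enabling LM Dirichlet in Elasticsearch can look like: the helper below builds an index-creation body that registers a custom similarity of type `LMDirichlet` and assigns it to a text field. The helper name, the similarity name `lm_dirichlet` and the field name `text` are hypothetical. The `mu` value of 2000 is the Elasticsearch default smoothing parameter, not a value taken from our tests.

```python
def lm_dirichlet_index_body(field: str = "text", mu: int = 2000) -> dict:
    """Build an index-creation body that scores `field` with LM Dirichlet."""
    return {
        "settings": {
            "index": {
                "similarity": {
                    # Custom similarity; "lm_dirichlet" is an arbitrary label.
                    "lm_dirichlet": {"type": "LMDirichlet", "mu": mu}
                }
            }
        },
        "mappings": {
            "properties": {
                # Use the custom similarity instead of the default BM25.
                field: {"type": "text", "similarity": "lm_dirichlet"}
            }
        },
    }
```

The resulting dictionary is the body that would be sent when creating the index. Queries against that field are then scored with LM Dirichlet without any change to the query itself.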
We are also writing a report on the findings from our testing, which we will include in the final project submission.
Optimisations
We observed that it always took several seconds for search results to load on the frontend. Taking a closer look, we noticed that most of this waiting time resulted from the API calls to the X5GON API for retrieving additional document data. We performed API calls one after the other, and with one API call taking 0.5 seconds, this added up to a long delay.
We implemented parallel calls to the X5GON API using the requests-futures Python module. This made search results load 3x faster than before, significantly improving the application's performance.
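The difference between the two approaches can be sketched without touching the real API. The example below is an illustration, not our production code: it uses a thread pool from the standard library's `concurrent.futures` (the same executor machinery that requests-futures builds on) and replaces the X5GON request with a hypothetical `fetch_document` stand-in that just sleeps. The latency is shortened so the example runs quickly.

```python
import time
from concurrent.futures import ThreadPoolExecutor

# Stand-in for the roughly 0.5 s per real API call, shortened for the example.
SIMULATED_LATENCY = 0.05


def fetch_document(doc_id: int) -> dict:
    """Hypothetical placeholder for one API request for additional document data."""
    time.sleep(SIMULATED_LATENCY)
    return {"id": doc_id, "title": f"Document {doc_id}"}


def fetch_sequential(doc_ids):
    """One call after the other: total time grows as len(doc_ids) * latency."""
    return [fetch_document(doc_id) for doc_id in doc_ids]


def fetch_parallel(doc_ids, max_workers: int = 8):
    """Issue the calls concurrently; results come back in the input order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(fetch_document, doc_ids))


if __name__ == "__main__":
    ids = list(range(8))
    for fetch in (fetch_sequential, fetch_parallel):
        start = time.perf_counter()
        fetch(ids)
        print(f"{fetch.__name__}: {time.perf_counter() - start:.2f} s")
```

Because the calls spend almost all of their time waiting on the network, threads are enough to overlap them. The speed-up seen in practice depends on the number of workers and on how the remote API handles concurrent requests.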
Deployment
We containerised each application of our system using Docker. We then deployed these Docker images on a Kubernetes cluster. The application is now reachable on the web via its IP address.
Plans
The project has concluded and we will submit everything in the coming days.