⛷️

February

Progress

This month we implemented search filters, added user login to the frontend, and integrated Elasticsearch into the application.

Search filters

We implemented filters that let the user choose which kinds of documents they want to see. There are two filter types: language and file type.
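As an illustration of how the two filters could be turned into a search request, here is a minimal sketch that builds an Elasticsearch bool query as a plain dictionary. The field names `content`, `language` and `file_type` and the helper `build_search_query` are assumptions made for this example, not the names used in our index, and the `terms` clauses assume the two filter fields are mapped as keyword fields.

```python
def build_search_query(text, languages=None, file_types=None):
    """Build an Elasticsearch query body with optional language/file type filters.

    Field names ("content", "language", "file_type") are hypothetical.
    """
    bool_query = {"must": [{"match": {"content": text}}]}

    # Filter clauses restrict the result set without affecting the score.
    filters = []
    if languages:
        filters.append({"terms": {"language": list(languages)}})
    if file_types:
        filters.append({"terms": {"file_type": list(file_types)}})
    if filters:
        bool_query["filter"] = filters

    return {"query": {"bool": bool_query}}


if __name__ == "__main__":
    print(build_search_query("jazz history", languages=["en"], file_types=["pdf", "html"]))
```

Putting the filters in the `filter` context rather than in `must` keeps them out of the relevance score, so the ranking only depends on the text query.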

User login

One requirement is to implement “user account functionalities”. This month, we implemented login and registration routes for the website: new users can register for an account and existing users can use their credentials to log in.

We authenticate users with JSON Web Tokens (JWT). User data is stored in a database with the schema shown below.
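As a side note on the authentication step (not the schema), the sketch below shows roughly how an HS256-signed JWT can be issued at login and checked on later requests, using only the Python standard library. It is not our production code: the function names, the claims used (`sub`, `iat`, `exp`) and the one-hour lifetime are assumptions for illustration, and a real deployment would normally rely on a maintained JWT library.

```python
import base64
import hashlib
import hmac
import json
import time


def _b64url(data: bytes) -> str:
    # JWTs use URL-safe base64 without "=" padding.
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def _b64url_decode(text: str) -> bytes:
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def issue_token(user_id: str, secret: bytes, ttl_seconds: int = 3600, now=None) -> str:
    """Return a signed token for user_id (hypothetical helper)."""
    now = int(time.time()) if now is None else now
    header = {"alg": "HS256", "typ": "JWT"}
    payload = {"sub": user_id, "iat": now, "exp": now + ttl_seconds}
    signing_input = ".".join(
        _b64url(json.dumps(part, separators=(",", ":")).encode("utf-8"))
        for part in (header, payload)
    )
    signature = hmac.new(secret, signing_input.encode("ascii"), hashlib.sha256).digest()
    return signing_input + "." + _b64url(signature)


def verify_token(token: str, secret: bytes, now=None):
    """Return the payload if the token is valid and unexpired, else None."""
    try:
        header_b64, payload_b64, signature_b64 = token.split(".")
        expected = hmac.new(
            secret, f"{header_b64}.{payload_b64}".encode("ascii"), hashlib.sha256
        ).digest()
        # Constant-time comparison avoids leaking signature bytes via timing.
        if not hmac.compare_digest(_b64url_decode(signature_b64), expected):
            return None
        if json.loads(_b64url_decode(header_b64)).get("alg") != "HS256":
            return None
        payload = json.loads(_b64url_decode(payload_b64))
    except ValueError:
        # Covers malformed tokens, bad base64, bad JSON and non-ASCII input.
        return None
    now = int(time.time()) if now is None else now
    return payload if payload.get("exp", 0) > now else None


if __name__ == "__main__":
    token = issue_token("alice", b"demo-secret")
    print(verify_token(token, b"demo-secret"))
```

The server only needs the secret to validate a token, so no session state has to be stored per logged-in user.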

Elasticsearch

We implemented BM25, LM Dirichlet and other ranking algorithms in Python and Elasticsearch. We also integrated Elasticsearch into our program, and the search engine frontend now shows results from the new Elasticsearch-backed search.
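For readers unfamiliar with BM25, the following is a minimal sketch of the classic Okapi BM25 scoring function in Python. It is an illustration rather than our actual implementation: the function name, the pre-tokenized input format and the example documents are assumptions, and the defaults `k1=1.2` and `b=0.75` are chosen to match Elasticsearch's default BM25 parameters.

```python
import math
from collections import Counter


def bm25_scores(query_terms, docs, k1=1.2, b=0.75):
    """Score each tokenized document in docs against query_terms with BM25."""
    n_docs = len(docs)
    if n_docs == 0:
        return []
    avg_len = sum(len(doc) for doc in docs) / n_docs

    # Document frequency: number of documents containing each term.
    doc_freq = Counter()
    for doc in docs:
        doc_freq.update(set(doc))

    scores = []
    for doc in docs:
        term_freq = Counter(doc)
        score = 0.0
        for term in query_terms:
            tf = term_freq[term]
            if tf == 0:
                continue
            # Smoothed IDF that stays positive even for very common terms.
            idf = math.log(1 + (n_docs - doc_freq[term] + 0.5) / (doc_freq[term] + 0.5))
            # Length normalization: longer-than-average documents are penalized.
            norm = tf + k1 * (1 - b + b * len(doc) / avg_len)
            score += idf * tf * (k1 + 1) / norm
        scores.append(score)
    return scores


if __name__ == "__main__":
    docs = [
        ["podcast", "about", "music"],
        ["music", "music", "theory"],
        ["cooking", "show"],
    ]
    print(bm25_scores(["music"], docs))
```

In this example the second document scores highest because it contains the query term twice at the same length as the first, and the third scores zero because it does not contain the term at all.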

We will soon start testing the different ranking algorithms. To prepare for this, we did the following.

We received access to the Spotify podcast dataset. We downloaded it and cleaned the data, setting it up for testing.

We also started writing efficiency tests for the search engine using Elasticsearch's built-in modules.
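To give an idea of what an efficiency test measures, here is a generic client-side timing harness in Python. Note that this is a stand-in technique: it does not use the Elasticsearch built-in modules mentioned above, and `measure_latency`, `search_fn` and the placeholder search function are hypothetical names. In practice `search_fn` would wrap whatever call sends a query to the search engine.

```python
import statistics
import time


def measure_latency(search_fn, queries, repeats=3):
    """Run every query `repeats` times and summarize wall-clock latency in ms."""
    if not queries or repeats < 1:
        raise ValueError("need at least one query and one repeat")

    timings_ms = []
    for query in queries:
        for _ in range(repeats):
            start = time.perf_counter()
            search_fn(query)
            timings_ms.append((time.perf_counter() - start) * 1000.0)

    return {
        "runs": len(timings_ms),
        "mean_ms": statistics.mean(timings_ms),
        "median_ms": statistics.median(timings_ms),
        "max_ms": max(timings_ms),
    }


if __name__ == "__main__":
    # Placeholder search function; replace with a real search call.
    def fake_search(query):
        return [word for word in query.split() if word]

    print(measure_latency(fake_search, ["true crime", "jazz history"], repeats=5))
```

Repeating each query several times and reporting the median alongside the mean makes the numbers less sensitive to one-off spikes such as cold caches.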

Plans

Next month is the final month of the project, so we need to wrap up all remaining work.

We are planning to