❄️

January

Meetings 8-13

Progress

We spent this month finishing the MUST requirements of the software engineering part, mainly by improving the search UI. We also built a simple search engine and linked it to the frontend.

Search Results UI

We improved the search results UI incrementally. The first iteration showed each document’s title and keywords in a rudimentary UI.

We then used CSS Grid and other CSS styling to tidy up the results and display additional information such as the language and document ID. The colourful icon is meant to indicate the file type, but at this stage it was just a placeholder.

We improved the search bar design. Also, the file icon now changes depending on the file type (top result: video, bottom result: spreadsheet). In the X5GON dataset that we were given, there was no link to access the resource and no description of the contents other than the keywords. That is why we implemented a call to the X5GON API to fetch the additional metadata for each resource. Now the document title is a hyperlink to the resource and there is a description for every document.
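As a rough illustration of the enrichment step, the sketch below merges fetched metadata into a search result. It is not our actual code: the function name, the `fetch_metadata` callable and the field names (`doc_id`, `url`, `description`) are all assumptions, and the real X5GON API call is deliberately left out and replaced by a stand-in.

```python
from typing import Callable


def enrich_result(result: dict, fetch_metadata: Callable[[int], dict]) -> dict:
    """Return a copy of `result` with a link and description added.

    `fetch_metadata` is a hypothetical callable that takes a document ID
    and returns a metadata dict; in practice it would wrap the API call.
    """
    meta = fetch_metadata(result["doc_id"])
    enriched = dict(result)  # copy so the original result is not mutated
    enriched["url"] = meta.get("url")
    # Fall back to a fixed text if the metadata has no description.
    enriched["description"] = meta.get("description") or "No description available."
    return enriched


def fake_fetch(doc_id: int) -> dict:
    """Stand-in for the API call, returning made-up metadata for the demo."""
    return {"url": f"https://example.org/resource/{doc_id}", "description": ""}


example = enrich_result({"doc_id": 7, "title": "Linear Algebra"}, fake_fetch)
print(example)
```

Passing the fetch function in as a parameter keeps the merging logic testable without any network access.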

We then added a message underneath the search bar that reminds the user of the search query.

When the search engine does not find any results, the UI displays an error message.

Search Engine

We implemented a basic search engine in Python. The ranking model is based on BM25, which is simple but effective, and we implemented it using Pandas dataframes. The search engine is reasonably fast, but the results are not particularly accurate. For example, the query “Math” returns multiple results, whereas “Maths” returns none. In the next iteration we need to add some error tolerance for search terms.
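To show the idea, here is a minimal BM25 sketch on a Pandas dataframe. It is a simplified illustration and not our production code: the column names (`doc_id`, `text`), the whitespace tokenizer, the parameter values `k1` and `b`, and the non-negative IDF variant are all assumptions made for this example. Because terms are matched exactly, it also reproduces the “Math” versus “Maths” problem.

```python
import math

import pandas as pd


def tokenize(text: str) -> list[str]:
    """Lowercase and split on whitespace; no stemming or spell correction."""
    return text.lower().split()


def bm25_search(docs: pd.DataFrame, query: str,
                k1: float = 1.5, b: float = 0.75) -> pd.DataFrame:
    """Score every row of `docs` against `query` and return the matching rows."""
    tokens = docs["text"].map(tokenize)
    doc_len = tokens.map(len)
    avg_len = doc_len.mean()
    n_docs = len(docs)

    scores = pd.Series(0.0, index=docs.index)
    for term in set(tokenize(query)):
        # Term frequency of `term` in each document.
        tf = tokens.map(lambda doc_tokens: doc_tokens.count(term))
        df = int((tf > 0).sum())  # number of documents containing the term
        if df == 0:
            continue  # an unseen term contributes nothing to any score
        idf = math.log((n_docs - df + 0.5) / (df + 0.5) + 1.0)
        length_norm = k1 * (1 - b + b * doc_len / avg_len)
        scores = scores + idf * tf * (k1 + 1) / (tf + length_norm)

    ranked = docs.assign(score=scores)
    return ranked[ranked["score"] > 0].sort_values("score", ascending=False)


docs = pd.DataFrame({
    "doc_id": [1, 2, 3],
    "text": ["math for engineers", "introduction to biology", "discrete math and logic"],
})

print(bm25_search(docs, "Math"))   # matches the documents containing "math"
print(bm25_search(docs, "Maths"))  # empty: "maths" never appears as a token
```

Stemming or fuzzy matching in `tokenize` would be one place to add the error tolerance mentioned above.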

Linking the search engine to the frontend was relatively straightforward thanks to our API-style architecture.
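The sketch below shows one way such an API-style handoff could look, assuming the backend returns JSON to the frontend. The function name and the payload fields (`query`, `results`, `error`) are hypothetical and only serve to illustrate how the query reminder and the no-results message can be driven by the same response.

```python
import json


def build_search_response(query: str, results: list[dict]) -> str:
    """Serialise a search outcome as a JSON string for the frontend."""
    payload = {"query": query, "results": results, "error": None}
    if not results:
        # The frontend can show this text instead of a result list.
        payload["error"] = f"No results found for '{query}'."
    return json.dumps(payload)


print(build_search_response("Math", [{"doc_id": 1, "title": "Math for Engineers"}]))
print(build_search_response("Maths", []))
```

Echoing the query back in the response lets the frontend display it underneath the search bar without keeping its own copy.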

We are now familiarising ourselves with Elasticsearch, which we want to use for the next iteration of our search engine.

Plans

Next month we want to