Implementation Methodology

At the beginning of the project, we held several interviews with our client, James Baker (digital curator at the British Library), to make sure that we understood the problem thoroughly before beginning implementation. Once we understood the problem, we broke it down and discussed the priorities of the proposed system (the requirements and their priorities can be seen in the requirements section). Throughout the project we followed an Agile approach, splitting development into many iterations. At the beginning of each iteration, we discussed the progress of the project and what needed to be done next. This approach suited the project because it let us check our work against the client’s requirements at every iteration. Our client also changed his requirements many times during the course of the project, which is another reason why an Agile approach was useful. Had we used the traditional waterfall model, we might have delivered a system that the client did not like. An example of a changing requirement is the design of the system: we had to modify the user interface many times before he was satisfied.


Technologies Used

Node.js

Node.js is an asynchronous, event-driven JavaScript runtime. We used it as the back-end for the main part of our project. We chose Node.js as our back-end for several reasons, including:

  1. Node.js uses JavaScript, a dynamic language that runs on a virtual machine. Web applications built this way can be very fast, and we expected Node.js to be quicker than alternatives such as Python or Ruby.
  2. Node.js allows us to use JavaScript in the browser as well as on the web server. This allows code to be shared between the client and the server, so we avoid duplicating code.
  3. Our project is event-driven - many of the features rely on certain events being triggered, and Node.js is built around an event-driven model. Therefore, it is a good fit for our project.
  4. Node.js can handle many simultaneous connections within a single process - our project relies on handling thousands of concurrent connections, so it is useful to have a platform designed for this.
  5. Our team already knew JavaScript, so it was easier for us to use Node.js. As a result, we were able to deliver the project more efficiently than if we had used a different language.


Heroku

Our client, James Baker, wanted us to deploy the application to a URL so that it would be accessible to the British Library. In order to do this, we used Heroku. Our team used Heroku because:

  1. It is free - the British Library did not have a budget for us to spend on this project.
  2. No time limit - many of the alternatives offer only a time-limited free trial, whereas Heroku's free plan has no such time limit.
  3. Easy deployment - deploying applications to Heroku is straightforward, and there is extensive support online for using the platform.


Bootstrap

Bootstrap is a very powerful framework for front-end development. Its built-in features improved both the user interface and the user experience of the website. Bootstrap gave our project a responsive design and consistent user interface elements, both of which were crucial to the development of our project.


jQuery

jQuery is a JavaScript library. We used it because the JavaScript components of the Bootstrap framework require it.


MongoDB

All of the images that we processed throughout the project are stored on Flickr. However, we had to tag each image and store the tags associated with it. We used MongoDB to store all of the tags of the images that we dealt with. One of the main reasons why we chose MongoDB over alternatives such as MySQL is that the data-set we dealt with during this project is very large (we store information about thousands of images). MongoDB is a NoSQL database, which can be more efficient than MySQL at dealing with big data-sets.


AlchemyAPI

AlchemyAPI is a company that was founded in 2005. Their API uses machine learning in order to allow users to do tasks such as image tagging. After conducting research on various machine learning APIs that do image tagging, we found that AlchemyAPI is one of the most accurate and it is compatible with Node.js.


Imagga API

Our team used two APIs for image tagging, AlchemyAPI and Imagga API, so that tags correctly assigned by one API but missed by the other would still be captured. Imagga API was chosen because it provides more tags than AlchemyAPI and its tags are generally quite different.
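
The idea of combining the output of two taggers can be sketched as follows. This is only a minimal illustration in Python, not the project's actual code: the function name `merge_tags`, the `(tag, confidence)` input shape, the source labels and the example values are all assumptions made for this sketch, and no real API calls are made.

```python
def merge_tags(alchemy_tags, imagga_tags):
    """Combine two lists of (tag, confidence) pairs into one dict.

    Tags are compared case-insensitively. When both services return the
    same tag, the higher confidence is kept and both sources are recorded.
    """
    merged = {}
    for source, tags in (("alchemy", alchemy_tags), ("imagga", imagga_tags)):
        for tag, confidence in tags:
            key = tag.strip().lower()
            entry = merged.setdefault(key, {"confidence": 0.0, "sources": []})
            entry["confidence"] = max(entry["confidence"], confidence)
            if source not in entry["sources"]:
                entry["sources"].append(source)
    return merged


# Hypothetical outputs from the two services for a single image.
alchemy = [("Map", 0.91), ("city", 0.40)]
imagga = [("map", 0.75), ("atlas", 0.62), ("paper", 0.30)]

combined = merge_tags(alchemy, imagga)
for tag in sorted(combined):
    print(tag, combined[tag])
```

Recording which service produced each tag keeps the two sources distinguishable later, for example when comparing how often they agree.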


Testing Utilities for Node.js

  1. Mocha is a JavaScript test framework that runs in the browser as well as on Node.js. There is a lot of support available for Mocha since it is well-tested and maintained. We used it to run unit tests against specific parts of the application.

  2. Should.js and Supertest are libraries that complement Mocha in testing the different features of the project. Each has a different purpose:

    • Supertest was used to test the router.
    • Should.js is an assertion library that makes assertions easier to write during the testing phase.


Python

There were two main parts that were implemented during this project. The part of the project that was dedicated to machine learning was written in Python. This is because:

  1. Python is powerful, so the processing for the machine learning algorithms is quick.
  2. Many of the libraries that we needed at the beginning of the project rely on Python. Therefore, we were able to take advantage of their features.
  3. There is a lot of support available for potential problems that occur in Python.


NumPy and SciPy

NumPy is the successor to earlier libraries such as Numeric and Numarray. We used it for the numerical analysis that the image processing depends on. SciPy is an open-source library, built on NumPy, that is mainly used for scientific calculations; most of our scientific numerical calculations were performed using SciPy.


OpenCV

OpenCV is another library that we utilised for the implementation of the project. OpenCV is a very popular library that includes functionality to handle image processing. The specific features that we used for OpenCV are:

  1. Reading and writing images so that they can be handled in Python.
  2. The use of Gabor filters - these Gabor filters were mainly used to detect edges.
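
To illustrate how a Gabor filter responds to an edge, the following sketch builds a small Gabor kernel and applies it at two positions of a toy image. It is written in plain Python for readability and is not the project's code; in practice OpenCV provides `cv2.getGaborKernel` and `cv2.filter2D` for this. The function names, parameter values and the toy image are assumptions chosen for the example.

```python
import math


def gabor_kernel(size, sigma, theta, wavelength, gamma=0.5, psi=0.0):
    """Return a size x size Gabor kernel (real part) as a list of rows.

    size should be odd so that the kernel has a centre pixel.
    """
    half = size // 2
    kernel = []
    for y in range(-half, half + 1):
        row = []
        for x in range(-half, half + 1):
            # Rotate the coordinates by theta.
            xr = x * math.cos(theta) + y * math.sin(theta)
            yr = -x * math.sin(theta) + y * math.cos(theta)
            # Gaussian envelope multiplied by a cosine carrier.
            envelope = math.exp(-(xr ** 2 + (gamma * yr) ** 2) / (2 * sigma ** 2))
            carrier = math.cos(2 * math.pi * xr / wavelength + psi)
            row.append(envelope * carrier)
        kernel.append(row)
    return kernel


def filter_response(image, kernel, cx, cy):
    """Sum of kernel weights times image values, centred on pixel (cx, cy)."""
    half = len(kernel) // 2
    total = 0.0
    for ky, row in enumerate(kernel):
        for kx, weight in enumerate(row):
            total += weight * image[cy + ky - half][cx + kx - half]
    return total


# Toy image: 7 rows x 10 columns, dark on the left, bright on the right,
# so there is a vertical edge at column 5.
image = [[0.0] * 5 + [1.0] * 5 for _ in range(7)]

# psi = pi/2 gives an odd-symmetric kernel, which reacts to edges
# and gives (almost) no response on flat regions.
kernel = gabor_kernel(size=5, sigma=2.0, theta=0.0, wavelength=4.0, psi=math.pi / 2)

edge = filter_response(image, kernel, cx=5, cy=3)  # window straddles the edge
flat = filter_response(image, kernel, cx=7, cy=3)  # window lies in the bright area
print("edge:", edge, "flat:", flat)
```

The magnitude of the response is large where the window straddles the edge and close to zero in the uniform region, which is why a bank of such filters at different orientations is useful for edge detection.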


Weka

Weka is a machine learning library that was utilised throughout the implementation of the project. We used it mainly to work with the feature vectors computed for each image stored on Flickr, in order to classify each image with a particular tag.
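
Weka reads data sets in its ARFF text format, so one plausible way to hand feature vectors to Weka is to write them out as an ARFF file. The sketch below shows this under stated assumptions: the helper `to_arff`, the relation name, the feature names and the example values are all hypothetical, and the report does not say how the project actually passed its data to Weka.

```python
def to_arff(relation, feature_names, rows, class_values):
    """Build ARFF text from (features, label) rows for loading into Weka."""
    lines = ["@relation " + relation, ""]
    # One numeric attribute per feature.
    for name in feature_names:
        lines.append("@attribute " + name + " numeric")
    # The class attribute is nominal: it lists every allowed tag.
    lines.append("@attribute tag {" + ",".join(class_values) + "}")
    lines.append("")
    lines.append("@data")
    for features, label in rows:
        if len(features) != len(feature_names):
            raise ValueError("feature vector has the wrong length")
        values = [repr(float(v)) for v in features]
        lines.append(",".join(values + [label]))
    return "\n".join(lines) + "\n"


# Hypothetical feature vectors for two images, each with a known tag.
rows = [
    ([0.12, 0.80], "map"),
    ([0.55, 0.10], "portrait"),
]
arff_text = to_arff("image_tags", ["gabor_mean", "gabor_std"], rows, ["map", "portrait"])
print(arff_text)
```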


Flickrpy

All of the images that we had to process and tag during this project were stored on Flickr. Therefore, we needed to access information about all of the images using the Flickr API. The following information was extracted using the API:

  1. The URL of each image
  2. The author of the book (that the image is contained in)
  3. The date of publication of the book (that the image is contained in)
  4. The number of pages of the book (that the image is contained in).
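
The four fields listed above could be held in a small record type such as the one sketched below. This is an illustration only: the class `ImageRecord`, the helper `build_record`, the dictionary keys and the example values are assumptions for this sketch, and they do not reflect the actual shape of a Flickr API response.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ImageRecord:
    url: str
    author: str
    publication_date: str
    page_count: Optional[int]  # None when the page count is missing or invalid


def build_record(raw):
    """Turn a dict of raw string fields into an ImageRecord."""
    pages = raw.get("pages", "").strip()
    return ImageRecord(
        url=raw["url"],
        author=raw.get("author", "").strip(),
        publication_date=raw.get("date", "").strip(),
        page_count=int(pages) if pages.isdigit() else None,
    )


# Hypothetical raw metadata for one image.
record = build_record({
    "url": "https://example.org/image.jpg",
    "author": " Example Author ",
    "date": "1850",
    "pages": "312",
})
print(record)
```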


Collaboration

Git and GitHub

We used Git to share our code and collaborate on it during the project, with GitHub as the host. There were several reasons why we decided to use GitHub:


Prototype Design - Development of User Interface

First Prototype

The following image is for the search by tags page:

Description of the first screen of the prototype:

The second screen of the same prototype is for browsing the data-set:

Description of the browsing page of the first prototype:

The last page of our first prototype is the “Similar Images” page, where users can view images that are similar to the one that they selected:

Description of this page:

Evaluation of First Prototype

When this prototype was tested on users, we received the following feedback:

Second Prototype

We made a second design prototype. Its first page is the main page, where users can search for images. The following image is the design for this page:

The reasons for this choice are:

The reasons for this choice of UI design for the results page are:

The last page of this prototype is the image details page, which appears when the user clicks on a specific image. The following image shows this:

The reasons for choosing this UI design for the image details page are:

Evaluation of Second Prototype

We evaluated this design prototype and we received the following feedback:


Third Prototype

There was one more design prototype. Its first page is similar to those of the previous two, but it also has a feature that allows users to retrieve unknown images and tag them automatically.

The first page is the main page, which allows users to search for images and to retrieve untagged images:

The second page of this prototype retrieves unknown images. It appears when the user clicks on the button labelled “Retrieve”, and it returns a set of images that have not yet been tagged:

The third page of this prototype is the search page. It appears when the user types search criteria into the main page and clicks on “Search”:

The last page of this prototype is used to view an individual image. When any image is clicked on the “Search Images” page, the user is redirected to this page:

Evaluation of Third Prototype


Heuristic Evaluation

We combined our UI ideas into a single UI prototype for our application. Afterwards, we did a heuristic evaluation on the prototype to improve its design and usability before producing the fourth prototype.
The 10 usability heuristics are outlined below:

  1. Visibility of system status - The system should always keep users informed about what is going on, through appropriate feedback within reasonable time.
  2. Match between system and the real world - The system should speak the users' language, with words, phrases and concepts familiar to the user, rather than system-oriented terms. Follow real-world conventions, making information appear in a natural and logical order.
  3. User control and freedom - Users often choose system functions by mistake and will need a clearly marked "emergency exit" to leave the unwanted state without having to go through an extended dialogue. Support undo and redo.
  4. Consistency and standards - Users should not have to wonder whether different words, situations, or actions mean the same thing. Follow platform conventions.
  5. Error prevention - Even better than good error messages is a careful design which prevents a problem from occurring in the first place. Either eliminate error-prone conditions or check for them and present users with a confirmation option before they commit to the action.
  6. Recognition rather than recall - Minimize the user's memory load by making objects, actions, and options visible. The user should not have to remember information from one part of the dialogue to another. Instructions for use of the system should be visible or easily retrievable whenever appropriate.
  7. Flexibility and efficiency of use - Accelerators -- unseen by the novice user -- may often speed up the interaction for the expert user such that the system can cater to both inexperienced and experienced users. Allow users to tailor frequent actions.
  8. Aesthetic and minimalist design - Dialogues should not contain information which is irrelevant or rarely needed. Every extra unit of information in a dialogue competes with the relevant units of information and diminishes their relative visibility.
  9. Help users recognize, diagnose, and recover from errors - Error messages should be expressed in plain language (no codes), precisely indicate the problem, and constructively suggest a solution.
  10. Help and documentation - Even though it is better if the system can be used without documentation, it may be necessary to provide help and documentation. Any such information should be easy to search, focused on the user's task, list concrete steps to be carried out, and not be too large.

The heuristic evaluation of our combined prototype is outlined below:

| Interface | Issue | Heuristic(s) |
|---|---|---|
| Search box on the main page of the UI | Initially, the search button was removed to make the search box neater. As a result, the UI lost usability when the user tried to proceed to the results page. | Visibility of system status; Help users recognize, diagnose, and recover from errors |
| British Library logo that goes back to the home page | The British Library logo in the corner does not redirect the user to the home page, unlike on many search engines. The user has to go back several pages to return to the home page. | Recognition rather than recall; Flexibility and efficiency of use |
| Individual image page | The page looks empty, as there are two large white regions on the sides that contain no UI elements. | Aesthetic and minimalist design |
| Tags on individual image | The words on the tags and tag categories are very small and hard to see. | Aesthetic and minimalist design; Visibility of system status |
| Tag categories on individual result page | The user found it hard to categorise the tags. | Recognition rather than recall; Match between system and the real world |
| Individual image page | The user could not find the button to find similar images, forcing them to go back to the results page each time. | Visibility of system status; Flexibility and efficiency of use |
| Similar image page | The user could not find the button to find the tags for the image they selected, forcing them to go back to the results page each time. | Visibility of system status; Flexibility and efficiency of use |


Fourth Prototype

After evaluating the combined prototype, we came up with the final design prototype. It has a clean search feature that allows a user to search for images based on a tag, as well as a feature to find images that are similar to a specific one. The index page (main page) for this prototype is:

When the search button is clicked on the home page, the user is presented with a list of images based on the search query that was typed. The layout of that page is familiar because it is similar to the Google Images search page, which makes the user experience more enjoyable and more user-friendly. The following image is a screenshot of what the list of images looks like:

There is also a feature to find images similar to each individual image. If a user wants to find a similar image, they click “Find Similar Images” and are presented with the following page:

If an individual image is selected, the user is redirected to a page where all they see is that image. The following is a screenshot of it:

We chose this final design prototype because it combines the clear interface of the second prototype with the “similar images” feature of the first prototype, which improves the user experience.

To view the fully working implementation of our prototype, please refer to the implementation page or visit the app at http://blbigdata.herokuapp.com.