
Bi-Weekly Report 8

March 4, 2016

15/2/2016

Time 14-15

The main purpose of this meeting was to plan the development of our system during reading week. The team members would not be able to meet during that week, and we wanted to ensure that the project would not fall behind schedule. We decided that each team member would work on a set of tasks that are not strictly dependent on each other, to avoid delays caused by the difficulty of coordinating team members during reading week. At the end of the meeting, we decided to work on the following features:

  • Visualise data using graphs (should have): scatter plots, time-series plots
  • Show the unique values and their counts for every column of the dataset
  • Export data frame as CSV
  • Categorical feature encoding using the one-hot method
  • Delete selected columns

22/2/2016

Time 11-13

This meeting was held to discuss what each team member achieved during reading week and to plan what should be achieved during the week that follows. During the discussion, it became apparent that plotly.js, the JavaScript library we were using for the graphing feature, could not handle large datasets. Furthermore, when searching for potential replacement libraries, we found that most graphing libraries that support user interaction cannot handle large datasets either. As interactive graphs were not a requirement, we decided to use the matplotlib library to generate the graphs as images in the backend and send them to the frontend when requested.

26/2/2016

Time 14-15

The main purpose of this meeting was to plan the development of the system for the rest of the term. We decided that the team should aim to finish all essential elements of the project in the next 4 weeks. More precisely, the team should finish all essential features of the system by 11/3 and should not start development on any new feature that cannot be completed by that date. Between 12/3 and 25/3, the team should mainly focus on testing the system and completing all necessary documentation, including the system manual and user manual, as well as updating the project website.

Even though the deadline for this project is not until 27/4, the team felt that it would be difficult to collaborate efficiently between the end of term and the deadline, so it would be beneficial to make sure all the essential elements of the project are done before the end of term.

Shivam Dhall

Over the course of the last two weeks, my teammates and I carried on with the development of our should-have requirements. We also worked on optimising the speed of operations within the application. One of the features I worked on was a frequency table for every column. This involved getting the frequency of every value in a particular column and displaying the results in a table in the analyses tab. Another feature I worked on allows the user to save the manipulated dataset as either a CSV or a JSON file on their local filesystem. This involved initiating a file download from our Flask backend by creating a response message and sending it to the front-end. Setting the correct HTTP headers on the response was quite tricky, as each header affects how a particular browser handles the download. I also implemented a feature that allows the user to download the visualisation they have generated; to implement it, I used the new download attribute defined in the HTML5 specification. As our code base gets larger, I have also continued to carry out regression testing and to add more tests to the existing test suite.
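To illustrate the frequency table and the CSV download described above, here is a minimal sketch. It is not the project's actual code: the sample data in `ROWS`, the route `/export/csv`, the file name `dataset.csv`, and the helper names are all assumptions made for the example, and the dataset is held as a plain list of dictionaries rather than a data frame.

```python
import csv
import io
from collections import Counter

from flask import Flask, Response

app = Flask(__name__)

# Hypothetical in-memory dataset standing in for the manipulated data frame.
ROWS = [
    {"city": "London", "year": "2015"},
    {"city": "Paris", "year": "2015"},
    {"city": "London", "year": "2016"},
]


def frequency_table(rows, column):
    """Return (value, count) pairs for one column, most frequent first."""
    counts = Counter(row[column] for row in rows)
    return counts.most_common()


def rows_to_csv(rows):
    """Serialise a list of dicts to CSV text with a header row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=list(rows[0].keys()))
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


@app.route("/export/csv")  # hypothetical route name
def export_csv():
    # "attachment" asks the browser to save the body as a file instead of
    # rendering it; "filename" suggests the name of the saved file.
    return Response(
        rows_to_csv(ROWS),
        mimetype="text/csv",
        headers={"Content-Disposition": 'attachment; filename="dataset.csv"'},
    )
```

In this sketch the `Content-Disposition` header is what turns an ordinary response into a download. A JSON export could presumably follow the same pattern with a different MIME type and file name.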

Gordon Cheng

During the last two weeks, I primarily worked on completing some of the should-have requirements for our system. I worked on two features: categorical feature encoding using the one-hot method, and deleting selected columns. In both cases, I worked on the frontend and the backend. This involved completing the UI for both features in the sidebar and the relevant Python modules that perform these functions in the backend. Besides developing features, I also refined the user interface. One of these refinements was to combine the table navigation bar and the quick inspector bar into one status bar displayed at the bottom of the table, with a button to toggle between the two bars.
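For readers unfamiliar with the one-hot method, the sketch below shows the idea together with column deletion. It uses only the Python standard library and represents the dataset as a list of dictionaries. The function names and the `column_value` naming of the indicator columns are assumptions for illustration, not the project's backend modules.

```python
def one_hot_encode(rows, column):
    """Replace `column` with one 0/1 indicator column per distinct value."""
    # Sort the categories so the new columns appear in a stable order.
    categories = sorted({row[column] for row in rows})
    encoded = []
    for row in rows:
        # Copy every field except the categorical column being encoded.
        new_row = {key: value for key, value in row.items() if key != column}
        for category in categories:
            new_row[f"{column}_{category}"] = 1 if row[column] == category else 0
        encoded.append(new_row)
    return encoded


def delete_columns(rows, columns):
    """Return copies of the rows without the named columns."""
    dropped = set(columns)
    return [
        {key: value for key, value in row.items() if key not in dropped}
        for row in rows
    ]
```

With a `colour` column containing `red` and `blue`, `one_hot_encode` would produce the columns `colour_blue` and `colour_red`, and exactly one of them is set to 1 in each row.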

Bandi Enkh-Amgalan

The last two weeks coincided with reading week and scenario week, so we were not able to make as much progress with the project as we had hoped, particularly during scenario week. I primarily worked on reimplementing the data visualization features, as we had discovered earlier that our current plotting library does not scale well with large datasets. I decided that we should drop interactivity in favor of performance, generate the charts on the backend using matplotlib, and serve static images to the front-end. I was able to create a demo of this configuration, with the backend serving images of frequency histograms generated by matplotlib to the front-end. With the Vagrant virtual machine rebuilt with an installation of matplotlib and a working demo of the new visualization system, I should be able to finish reimplementing all of the visualization features within the week.
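The backend rendering approach described above can be sketched roughly as follows. This is an illustrative example rather than the demo itself: the function name `histogram_png`, the figure size, and the axis labels are assumptions. It selects matplotlib's non-interactive `Agg` backend so that figures can be rendered on a server without a display.

```python
import io

import matplotlib

matplotlib.use("Agg")  # non-interactive backend: render without a display
import matplotlib.pyplot as plt


def histogram_png(values, bins=10, title="Frequency histogram"):
    """Render a histogram of `values` and return it as PNG bytes."""
    fig, ax = plt.subplots(figsize=(6, 4))
    ax.hist(values, bins=bins)
    ax.set_title(title)
    ax.set_xlabel("Value")
    ax.set_ylabel("Frequency")

    # Write the image into memory instead of onto disk.
    buffer = io.BytesIO()
    fig.savefig(buffer, format="png", dpi=100)
    # Close the figure so a long-running server does not accumulate figures.
    plt.close(fig)
    return buffer.getvalue()
```

The returned bytes could then be sent to the front-end, for example as the body of an HTTP response with the MIME type `image/png`. Because only a fixed-size image crosses the network, the payload does not grow with the number of rows in the dataset.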