Evaluation

Achievement Table


| ID | Requirement | Priority | State | Contributors |
|----|-------------|----------|-------|--------------|
| 1 | Extract 3D tennis shot pose animations from video | Must | Yes | Jin, Prithvi |
| 2 | Recognise which shots are being performed | Must | Yes | Prithvi, Morgane |
| 3 | View analysis in 3D and add annotations on video | Should | Yes | Prithvi, Morgane |
| 4 | Create web application | Should | Yes | All |
| 5 | Provide metrics such as shot speed | Should | Yes | Prithvi |
| 6 | Provide API for external developers | Should | Yes | Prithvi |
| 7 | Attach 3D models to extracted shot animations | Could | Yes | Prithvi |
| 8 | Extract 3D player position in relation to court | Could | No | Jin |

Key functionalities (must have and should have): 100% completed
Optional functionalities (could have): 50% completed

Known Bug List

| ID | Bug Description | Priority |
|----|-----------------|----------|
| 1 | Some joints behave erratically in the converted .bvh pose animation format. | Low |

Individual Contribution Table

| Work Package | Prithvi | Morgane | Jin |
|--------------|---------|---------|-----|
| Project partners liaison | 30% | 20% | 50% |
| Requirement analysis | 33% | 33% | 33% |
| Research and experiments | 33% | 33% | 33% |
| UI design | 0% | 60% | 40% |
| Coding | 50% | 25% | 25% |
| Testing | 80% | 10% | 10% |
| Report website | 33% | 33% | 33% |
| Presentation/video editing | 33% | 33% | 33% |
| Overall contribution | 36% | 32% | 32% |
| Main roles | Backend developer, researcher, tester | Front-end developer, UI designer, back-end developer | Back-end developer, report editor, client liaison |

Critical Evaluation

UI/UX: our front-end provides a simple interface to every feature of our analysis pipeline. Because it is web-based, nothing needs to be downloaded: the user simply uploads a video of their choosing to the webapp. Furthermore, since we implement responsive design with the Bootstrap framework and render 3D content with Three.js, which runs natively in browsers via WebGL, the webapp is mobile-friendly. A user can record tennis play on their phone, navigate to the website, upload the video, and view the analysis. The result is a highly accessible and simple UI/UX flow.

Functionality: we successfully completed all key functionalities and half of the optional ones. The system is functional both from a developer's perspective, at the level of the analysis pipeline API, and from a user's perspective, at the front-end webapp level.

Stability: our shot detection heuristic performs quite well and detects most shots, although it fails to detect some shots or detects only parts of them. Our shot recognition model distinguishes backhands from non-backhands very well, but sometimes confuses forehands with smashes or serves. Furthermore, since the model is trained on full-body videos, the analysis pipeline only works on footage in which all joints are visible, including the legs and feet. Overall, the system is mostly stable, with good results for a prototype, though there is room for improvement.
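The full-body constraint above can be enforced with a simple guard before analysis. The sketch below is illustrative only: the joint names and the shape of the `pose` mapping are assumptions, not the project's real data model.

```python
# Hypothetical guard: reject pose estimates that are missing any joint,
# since the recognition model was trained on full-body footage.
REQUIRED_JOINTS = {
    "head", "neck", "left_shoulder", "right_shoulder",
    "left_elbow", "right_elbow", "left_wrist", "right_wrist",
    "left_hip", "right_hip", "left_knee", "right_knee",
    "left_ankle", "right_ankle",
}

def has_full_body(pose: dict) -> bool:
    """Return True only if every required joint was detected in this frame."""
    return REQUIRED_JOINTS <= pose.keys()

# Example: a pose with the ankles cropped out of frame is rejected.
full = {j: (0.0, 0.0) for j in REQUIRED_JOINTS}
partial = {j: (0.0, 0.0) for j in REQUIRED_JOINTS - {"left_ankle", "right_ankle"}}
```

Frames failing the check could be skipped or surfaced to the user as a "film the whole body" warning.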

Efficiency: unfortunately, we could only run TensorFlow on a CPU, so analysis takes around 50 seconds for a 1-minute video. Running our code on a CUDA-enabled GPU should speed up TensorFlow, along with any other libraries, such as SciPy, that can exploit a GPU, reducing analysis times. Since this prototype only targets videos of up to about a minute, this level of efficiency proved sufficient.
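For context, the figures above imply the CPU-only pipeline already runs slightly faster than real time. The sketch below just restates that arithmetic; the linear-scaling assumption for longer clips is ours, not a measured result. (On a CUDA-enabled machine, TensorFlow 2.x would report usable GPUs via `tf.config.list_physical_devices('GPU')`.)

```python
# Back-of-the-envelope throughput for the CPU-only prototype:
# a 60-second clip takes about 50 seconds to analyse.
VIDEO_SECONDS = 60.0
ANALYSIS_SECONDS = 50.0

# Real-time factor below 1.0 means analysis keeps up with playback.
realtime_factor = ANALYSIS_SECONDS / VIDEO_SECONDS

def estimated_analysis_seconds(clip_seconds: float) -> float:
    """Estimate analysis time, assuming (speculatively) linear scaling."""
    return clip_seconds * realtime_factor
```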

Compatibility: our entire system is highly compatible and platform-independent. The backend is written entirely in Python and uses standard Python libraries and frameworks, and since the front-end is a webapp, it can be accessed on any web-enabled device. No plugins are required for 3D shot reconstruction, as Three.js runs natively in the browser using WebGL. The analysis pipeline also functions across a variety of contexts, camera setups, and users of different ages. Thus, the entire end-to-end system, from user to analysis pipeline, is highly compatible.

Maintainability: we provide a modular API for our analysis pipeline that, together with our documentation, can be used to extend the system. Our code is straightforward and concise, as it builds on standard Python libraries and frameworks, and the project is fully open source and publicly available on GitHub. Overall, the codebase is maintainable and readily extensible through the API.
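The modularity claim can be pictured as a chain of independently replaceable stages. This is a minimal sketch of the idea only; the stage names and `run_pipeline` helper are illustrative, not the project's actual API.

```python
# Illustrative sketch of a modular analysis pipeline: each stage is a
# callable, so a developer can insert, swap, or extend stages without
# touching the others.
from typing import Callable, List

Stage = Callable[[object], object]

def run_pipeline(stages: List[Stage], video: object) -> object:
    """Feed the video through each stage in order, returning the final result."""
    data = video
    for stage in stages:
        data = stage(data)
    return data

# Hypothetical stages, stubbed as string transforms for demonstration.
stages: List[Stage] = [
    lambda v: f"poses({v})",   # pose extraction
    lambda p: f"shots({p})",   # shot detection and recognition
]
result = run_pipeline(stages, "match.mp4")
```

A third-party developer could append, say, a custom metrics stage to the list without modifying the existing ones.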

Project Management: we kept a flexible schedule throughout the project, which allowed us to take a highly agile approach with a strong emphasis on reflection and self-organisation. Due to our lack of experience, what we had independently set out to do sometimes did not work out as we had hoped. Reflecting on this and adapting quickly and effectively to changing requirements and designs proved especially important for successfully delivering a product in these circumstances.


Future Work

First, to refine the project, we plan to fix the bug affecting the converted .bvh files and to release an executable that non-developers can use to try out the prototype.

It would also be valuable to provide a wider range of shot metrics, notably ranking shots qualitatively or quantitatively, along with more advanced metrics and visualisations. This would improve the user experience, as players could extract more data from the application alone. Since we already possess most of the core data needed for this additional information, it would be the first item we tackle.
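One such metric, shot speed, can in principle be derived from the 3D pose data we already extract. The sketch below is a hypothetical illustration, not the shipped implementation: it estimates racket-hand speed by finite differences on consecutive wrist positions, and the units are whatever scale the 3D reconstruction uses, since monocular pose is scale-ambiguous.

```python
# Illustrative sketch: estimate racket-hand speed from two consecutive
# 3D wrist positions sampled one video frame apart.
import math

def joint_speed(p0: tuple, p1: tuple, fps: float) -> float:
    """Speed (reconstruction units per second) between two 3D positions
    one frame apart, at the given video frame rate."""
    return math.dist(p0, p1) * fps

# Wrist moves 0.5 units in one frame at 30 fps -> 15 units/s.
speed = joint_speed((0.0, 1.0, 0.0), (0.3, 1.0, 0.4), fps=30.0)
```

In practice one would smooth over several frames and take the peak around the moment of impact rather than a single frame-to-frame difference.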

As the project progressed, we identified that users might want to analyse their position relative to the court boundaries, which would yield data such as court coverage and allow entire plays to be reconstructed in 3D. However, as this is a complex and quite different area of computer vision, we lacked the experience to tackle it within the available timeframe. In the future, we would therefore like to extract 3D skeleton positions in relation to the court boundaries, from which additional useful metrics and visualisations could be derived.