Research

By conducting thorough research on Microsoft Teams and remote MotionInput solutions, we gained valuable insights into the current landscape and identified key areas for improvement in our project.

Project Research Overview

Our project aims to provide a comprehensive solution for remote physiotherapy, enabling patients to access physiotherapy sessions from the comfort of their own homes. To achieve this, we knew that we needed to undertake extensive research to identify the best methods and tools to make our solution effective and user-friendly.

Our research focused on four key areas: the front-end Teams app, our remote solution options (NDI), the games we can use and modify, and our data tracking component. We explored various sources to understand each area and how it could contribute to our solution. By analyzing and comparing different approaches, we aimed to identify the methods that would give the best outcomes for our users.

Technology Review


Teams/Front-End

When we started building our Microsoft Teams application, the Teams Toolkit offered two language options: JavaScript or TypeScript. Given our objectives, we compared the two on popularity and performance.

  • Usage levels and popularity mattered to us because the more developers use JavaScript, the more learning resources are available online. These resources are pivotal to building our front end.

    A larger JavaScript community also means more community answers to development questions. By choosing JavaScript, we are more likely to find that an issue we encounter has already been posted and solved online, which saves time when debugging. [1]

    JavaScript vs TypeScript Usage Levels
  • In Teams Toolkit, front-end libraries are prominent, with JavaScript development done in React and TypeScript development in Angular. We compared the performance of these two frameworks to establish which would provide the best experience within Microsoft Teams. [2]

    Memory
    Memory allocation in MBs ± standard deviation

    The lower memory usage of the React framework is attractive. Part of our target demographic is elderly patients, who may join a Teams meeting on an old machine or mobile phone. Lower memory usage will help these patients use our application comfortably.

    Speed
    Duration in milliseconds ± standard deviation (Slowdown = Duration / Fastest)

    Our NDI-In solution has been blocked for some time, so live collaboration has not been possible for the last few weeks. However, we can now use the Live Share SDK for Teams, which uses Microsoft's Fluid service over the network to share state between participants in a Teams meeting.

    State is usually shared as JSON, so standard data manipulation must be fast enough that latency does not hinder the collaborative game experience.

  • From the evidence gathered, it is clear that JavaScript will give us the best support when getting started, during development, and when debugging. The availability of previous projects gives us examples to learn from and makes it easier to find open-source games to integrate into our app and render in Microsoft Teams. React's lower initial memory usage also benefits app users.

    Although Angular is slightly quicker at data formatting and manipulation, React with JavaScript has the more important advantages explained above, and the only Live Share SDK example code we found uses JavaScript. The Live Share SDK (Software Development Kit) for Teams is a new feature released by Microsoft that works by managing Fluid state, and it will be a valuable addition to our app. We did not find a similar SDK for Angular.
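
The concern about JSON state-sharing speed can be illustrated with a rough timing sketch. It is written in Python purely for illustration (the app itself runs in JavaScript), and the state layout, field names, and repeat count are hypothetical rather than taken from our implementation. It only measures local serialization and parsing, not network latency.

```python
import json
import time


def round_trip(state: dict) -> dict:
    """Serialize a state dict to JSON text and parse it back."""
    return json.loads(json.dumps(state))


def mean_round_trip_ms(state: dict, repeats: int = 1000) -> float:
    """Return the mean duration of one JSON round trip in milliseconds."""
    start = time.perf_counter()
    for _ in range(repeats):
        round_trip(state)
    elapsed = time.perf_counter() - start
    return elapsed * 1000 / repeats


# Hypothetical shared state for a two-player ping pong game.
example_state = {
    "ball": {"x": 0.42, "y": 0.58, "vx": -0.01, "vy": 0.02},
    "paddles": {"patient": 0.35, "physiotherapist": 0.61},
    "score": [3, 2],
}

if __name__ == "__main__":
    print(f"mean round trip: {mean_round_trip_ms(example_state):.4f} ms")
```

A measurement like this gives a lower bound on per-update cost; the time spent in the state-sharing service itself would come on top of it.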


Remote Solutions

In developing our remote solution, we recognized the importance of finding a technology that could seamlessly stream high-quality audio and video between patients and physiotherapists. After research, we found that NDI technology and PSI (Platform for Situated Intelligence) with Microsoft Graph were the two most viable options for our project. Therefore, we researched these two options in more depth.

  • PSI (Platform for Situated Intelligence) is an open and flexible framework developed by Microsoft to support the development of solutions that process and stream data such as audio and video. Microsoft Graph, in turn, is a developer platform that integrates multiple Microsoft services and devices, enabling developers to build apps on top of them. [3]

    As a potential solution, we explored the idea of using Microsoft Graph to connect to a Teams tenant and the calls made through that tenant. By doing so, we could access live video and audio feeds that could be managed using the PSI framework. Our team looked into a solution where we combine these two with an Azure bot that can be summoned back into a Teams call. This would allow us to create live video feeds by merging member video streams and games, which can then be called back into a Teams call with Azure to display them in the call itself. [4]

  • NDI, or Network Device Interface, is a protocol developed by NewTek that allows for the transmission of multiple high-quality video and audio feeds over a network. By utilizing this technology, we can capture and stream the audio and video feeds from a Teams call in real time. To explore its potential, we considered two possible paths. The first option was to integrate NDI with OBS (Open Broadcaster Software) to create a customized build. This would involve taking video and audio feeds from a Teams call and drawing a graphics overlay, including games such as ping pong, on top of them. We would then use NDI-In technology to relay this overlay back into the Teams call. [5]

    Alternatively, we explored building an NDI solution directly in Python. By integrating NDI receivers and finders into the MotionInput codebase, we could replace the standard desktop webcam as the video source for the motion software with a selected video stream broadcast over NDI from the Teams call. [6]

  • In conclusion, we explored two potential approaches to our remote solution: the PSI bot with Microsoft Graph, and NDI. While the PSI bot and Microsoft Graph approach seemed promising, it would have involved a high level of complexity and numerous dependencies on Microsoft Graph and Azure. This would have made it difficult for patients and physiotherapists to use, and for institutions or hospitals to set up at scale, which goes against our goal of providing a user-friendly and enjoyable experience for all.

    On the other hand, the NDI solution showed a lot of potential, particularly when integrated with custom OBS overlays to create visually appealing graphics in the Teams call. However, this solution was halted because Microsoft currently blocks NDI-In software.

    As a result, we went ahead with the NDI solution built directly into MotionInput, which still allows remote MotionInput control from any device, anywhere, and provides a practical solution for our needs. While the blocking of the NDI-In software was a setback, we remain optimistic about the technologies we have in place.
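
The idea of swapping the webcam for a network stream can be sketched as a small source abstraction. This is a minimal illustration under stated assumptions, not the MotionInput code: `FrameSource`, `WebcamSource`, `NdiSource`, and `count_processed_frames` are hypothetical names, frames are plain nested lists, and no real NDI library is called. The stream lookup is simulated with a dictionary.

```python
from typing import Dict, Iterator, List, Optional, Protocol

# Placeholder frame type: a real frame would be an image array.
Frame = List[List[int]]


class FrameSource(Protocol):
    """Anything the motion pipeline can pull frames from."""

    def read(self) -> Optional[Frame]:
        """Return the next frame, or None when the source has ended."""
        ...


class WebcamSource:
    """Hypothetical stand-in for the local desktop webcam."""

    def __init__(self, frames: List[Frame]) -> None:
        self._frames: Iterator[Frame] = iter(frames)

    def read(self) -> Optional[Frame]:
        return next(self._frames, None)


class NdiSource:
    """Hypothetical stand-in for a receiver bound to one named network stream."""

    def __init__(self, stream_name: str, streams: Dict[str, List[Frame]]) -> None:
        if stream_name not in streams:
            raise ValueError(f"no stream named {stream_name!r}")
        self.stream_name = stream_name
        self._frames: Iterator[Frame] = iter(streams[stream_name])

    def read(self) -> Optional[Frame]:
        return next(self._frames, None)


def count_processed_frames(source: FrameSource) -> int:
    """Drain a source the way a tracking loop would, returning the frame count."""
    processed = 0
    while (frame := source.read()) is not None:
        # Pose tracking on `frame` would happen here.
        processed += 1
    return processed


# Simulated streams discovered on the network, keyed by participant name.
fake_streams = {"Patient A": [[[0]], [[1]], [[2]]], "Physiotherapist": [[[9]]]}
```

Because the tracking loop depends only on `read()`, the same loop runs unchanged whether frames come from the local webcam or from a selected participant's stream.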


Data Collection

In developing our remote physiotherapy solution, we recognized the importance of accurately tracking the movements and positions of patients in real-time, and providing personalized feedback on their progress.

After researching various technologies and tools, we identified MediaPipe as a promising option for pose estimation, which could help us accurately track patients' body positions and movements during remote sessions. To visualize and analyze the data collected through MediaPipe, we decided to use scatter plots.

  • We chose Matplotlib to draw these plots. Matplotlib is a popular Python library that provides a wide range of visualization options, including scatter plots, bar charts, and line charts. By plotting the data collected through MediaPipe on a scatter plot, we could easily see the relationship between joint angle and movement frequency.

    In a scatter plot, each point represents a single observation, and the location of the point reflects the value of the two variables for that observation. This allows you to easily see patterns in the data and identify any outliers or unusual observations.

    The arm angle data is plotted on the x-axis and the frequency data on the y-axis. Each point on the graph represents a single measurement of arm angle and frequency, and its location reflects the values of these two variables for that measurement.

    For arm angle and frequency data, a scatter plot is a better choice than other graph types because each data point is a single, discrete measurement of angle and frequency, and the observations are not necessarily ordered or measured over time.

  • MediaPipe is a powerful open-source framework developed by Google that uses machine learning-based tracking algorithms to accurately detect and track human body joints in real time.

    MediaPipe offers a good balance of accuracy, ease of use, customizability, and cross-platform compatibility. Additionally, it offers a wide range of pre-built models for pose estimation, face detection, and other computer vision tasks, which can save time and resources when implementing pose estimation in a new application. It also has a large and active user community, which means that there are many resources and tutorials available to help users learn how to use the library and troubleshoot any issues they may encounter. [7]

    Other pose estimation technologies exist, such as OpenPose and AlphaPose. Both are open-source libraries for real-time pose estimation that use deep learning algorithms to detect and track human body joints. Although they have achieved high accuracy in benchmark tests and offer a range of customization options, they are complex to implement and require significant computational resources, which may be a limitation for some users. [8]
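
The angle-versus-frequency data described above could be prepared roughly as follows. This is a sketch under assumptions, not our tracking code: the three points stand for shoulder, elbow, and wrist coordinates such as a pose estimator might return, the 10-degree bin width is arbitrary, and `joint_angle`, `angle_frequencies`, and `plot_angle_frequency` are hypothetical helper names. Matplotlib is imported only inside the plotting helper.

```python
import math
from collections import Counter
from typing import Iterable, List, Tuple

Point = Tuple[float, float]


def joint_angle(a: Point, b: Point, c: Point) -> float:
    """Return the angle in degrees at point b between segments b->a and b->c."""
    v1 = (a[0] - b[0], a[1] - b[1])
    v2 = (c[0] - b[0], c[1] - b[1])
    norm = math.hypot(*v1) * math.hypot(*v2)
    if norm == 0:
        raise ValueError("joint_angle needs three distinct points")
    dot = v1[0] * v2[0] + v1[1] * v2[1]
    # Clamp to [-1, 1] to guard against floating-point drift before acos.
    cosine = max(-1.0, min(1.0, dot / norm))
    return math.degrees(math.acos(cosine))


def angle_frequencies(angles: Iterable[float], bin_width: int = 10) -> List[Tuple[int, int]]:
    """Count how often angles fall into each bin; return (bin_start, count) pairs."""
    counts = Counter(int(angle // bin_width) * bin_width for angle in angles)
    return sorted(counts.items())


def plot_angle_frequency(pairs: List[Tuple[int, int]]) -> None:
    """Draw the (angle, frequency) pairs as a scatter plot with Matplotlib."""
    import matplotlib.pyplot as plt  # imported here so the helpers above need only the stdlib

    fig, ax = plt.subplots()
    ax.scatter([angle for angle, _ in pairs], [count for _, count in pairs])
    ax.set_xlabel("Arm angle (degrees)")
    ax.set_ylabel("Frequency")
    plt.show()
```

For example, `angle_frequencies([5, 12, 18, 95])` groups the two angles between 10 and 20 degrees into one point, so the resulting plot shows which arm positions the patient reached most often.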

Solution Summary

As a team, we conducted extensive research on various remote MotionInput solutions for our project. After careful consideration, we decided to use NDI as our remote MotionInput solution, as it offered the most robust and reliable way to transmit video and audio feeds over a network. Additionally, we chose JavaScript for our web application, as it is a widely used language that offers a lot of flexibility and support for web development.

Our research also highlighted the importance of user experience and the need to ensure that our application is user-friendly and easy to use. We are confident that our choices for remote MotionInput and programming language will allow us to develop a high-quality, user-friendly application that meets the needs of our users.

Literature Review

References

  • [1] Barot, S. (2022) TypeScript vs JavaScript: Check the difference in 2023, Aglowid IT Solutions. Available at: https://aglowiditsolutions.com/blog/typescript-vs-javascript/
  • [2] Interactive results, stefan_krause. Available at: https://stefankrause.net/js-frameworks-benchmark8/table.html
  • [3] Microsoft. microsoft/psi: Platform for Situated Intelligence, GitHub. Available at: https://github.com/microsoft/psi
  • [4] Microsoft Graph. microsoftgraph/microsoft-graph-comms-samples, GitHub. Available at: https://github.com/microsoftgraph/microsoft-graph-comms-samples/tree/master/Samples/PublicSamples/PsiBot#teams-bots-with-platform-for-situated-intelligence
  • [5] About NDI - Network Device Interface (no date). Available at: https://www.ndi.tv/about-ndi/
  • [6] Broadcasting audio and video from Teams with NDI and SDI technology, Microsoft Support. Available at: https://support.microsoft.com/en-us/office/broadcasting-audio-and-video-from-teams-with-ndi-and-sdi-technology-e91a0adb-96b9-4dca-a2cd-07181276afa3
  • [7] Pose - mediapipe. Available at: https://google.github.io/mediapipe/solutions/pose
  • [8] Siva, L. (2022) An easy guide for pose estimation with Google's MediaPipe, Medium. MLearning.ai. Available at: https://medium.com/mlearning-ai/an-easy-guide-for-pose-estimation-with-googles-mediapipe-a7962de0e944



Literature Review References

  • [1] A variable resistance virtual exercise platform for physiotherapy rehabilitation. Neils De Ruiter, Sam Nees, Raymond Benjamin, Matthew Nagel, XiaoQi Chen, Marcus King, 2009. https://ieeexplore.ieee.org/document/4749588 . https://www.inderscienceonline.com/doi/abs/10.1504/IJISTA.2010.030204
  • [2] Reliability and validity analyzes of Kinect V2 based measurement system for shoulder motions. Burakhan Çubukçu, Uğur Yüzgeç, Raif Zileli, Ahu Zileli, 2019. https://www.sciencedirect.com/science/article/pii/S1350453319302462
  • [3] Streaming audio and video from Teams using NDI and SDI Technology. Microsoft Teams. https://support.microsoft.com/en-us/office/broadcasting-audio-and-video-from-teams-with-ndi-and-sdi-technology-e91a0adb-96b9-4dca-a2cd-07181276afa3
  • [4] Stream Smart: P2P video streaming for smartphones through the cloud. Alessandro Gaeta, Sokol Kosta, Julinda Stefa, Alessandro Mei, 2013. https://ieeexplore.ieee.org/abstract/document/6644983/authors#authors
  • [5] Development of a Remote Object Webcam Controller (ROWC) with CORBA and JMF. Frank McCown, 2006. https://www.academia.edu/26534142/Development_of_a_remote_object_webcam_controller_rowc_with_corba_and_jmf