Achievement Status

Current status of project requirements and achievements

Category | Requirement | Status | Contributors
Must have | Implement pedestrian crossing recognition | Completed | Aiden, Kostas
Must have | Create a standardized output format that is easily integrable with other mapping systems | Completed | Edward, Arif
Must have | Develop a georeferencing system to convert pixel coordinates to precise geographical coordinates (latitude/longitude) | Completed | Edward, Arif
Should have | Performance optimisation methods | Completed | Kostas
Should have | Implement support for multiple input formats (.tif, .jgw) to ensure compatibility with various data sources | Completed | Edward
Should have | Platform integration with Team 14's platform | Completed | Arif
Could have | Filtering and processing the output data from our machine learning models | Completed | Edward
Could have | User-friendly WebApp interface with drag-and-drop features, allowing non-technical users to easily use our system | Completed | Aiden
Could have | Maximise reliability over accuracy by adjusting confidence hyperparameters in our predictions to minimise the false positive rate | Completed | Kostas, Edward
Could have | Extend the system to recognize additional types of accessibility features beyond crosswalks | Not completed due to budget constraints | N/A
Won't have | Support for indoor accessibility mapping and navigation features | Out of scope | N/A
Won't have | Implementation of general hazard detection in satellite imagery | Out of scope | N/A
Won't have | Real-time updates and live detection capabilities | Out of scope | N/A
Won't have | Natural language processing and voice interface integration | Out of scope | N/A

Must Have Requirements

100% Complete

Should Have Requirements

100% Complete

Known Bugs

Development approach and resolved issues

Development Approach

As our system aims to provide data for accessible systems supporting people with visual and mobility impairments, preventing bugs was important not merely as a matter of inconvenience but as a matter of potential danger: left unaddressed, bugs could compromise the safety of users. For this reason, we chose a development approach that prioritised tackling bugs before growing the system in scale, and each feature was rigorously checked before integration. As a result, to the best of our knowledge no bugs remain in the system. Instead, we can provide a history of some key bugs we encountered over the course of development.

Category | Issue Description | Resolution | Status
Detection | Duplicate crossroad detections being saved in the output | Implemented IoU (Intersection over Union) filtering to remove redundant detections | Fixed
Detection | False positives in areas with patterns similar to crosswalks, due to overfitting | Enhanced model training with negative samples and increased confidence thresholds | Fixed
Testing | API tests failing due to timeout issues | Adjusted timeout parameters and implemented retry mechanisms | Fixed
Testing | Test success rates affected by network conditions | Added network strength checks and conditional test execution | Fixed
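The IoU filtering mentioned in the table can be sketched as follows. This is a minimal illustration assuming axis-aligned boxes in (x1, y1, x2, y2) form and an illustrative 0.5 threshold; the real system works with rotated boxes and its own tuned threshold:

```python
def iou(a, b):
    """Intersection-over-Union of two axis-aligned boxes (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter) if inter else 0.0

def deduplicate(detections, threshold=0.5):
    """Keep the highest-confidence box from each cluster of overlapping boxes."""
    kept = []
    # Process highest-confidence detections first, greedily suppressing overlaps
    for box, conf in sorted(detections, key=lambda d: -d[1]):
        if all(iou(box, k[0]) < threshold for k in kept):
            kept.append((box, conf))
    return kept

dets = [((0, 0, 10, 10), 0.9), ((1, 1, 11, 11), 0.8), ((50, 50, 60, 60), 0.7)]
print(deduplicate(dets))
```

The two near-identical boxes collapse into one, while the distant detection survives; this is the same greedy suppression idea used in standard non-maximum suppression.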

System Limitations

Current constraints and technical limitations

Data Limitations

  • Although our dataset is the most expansive open-source rotated object detection dataset available, it is limited in the scope of locations it includes. Because crossing designs are localised, with regulations differing between countries, and because our clients' primary focus is within Europe, our data is limited to the United Kingdom and a subset of countries in Central Europe.
  • Our models are trained on clear, unobstructed data due to the data annotation policy we chose. This results in a low detection rate for crosswalks obstructed by obstacles such as cars, as well as lower accuracy on occluded crosswalks, such as those covered by significant shadow.

Processing Speed vs Accuracy Optimisation

  • In our system, there is a trade-off between processing speed and accuracy. Following the advice of our clients and professors, we opted to prioritize processing accuracy over faster speed. However, we acknowledge that this may not be suitable for all use cases.
  • Although alternative methods for bounding box and classification detection can deliver higher accuracy, they demand more processing power, which would impact our system's performance. We chose YOLO for its optimal balance of speed and accuracy, although methods like Rotated Faster R-CNN provide better accuracy. Further speed gains could have been achieved by incorporating quantised models into the classification layer; however, we were unable to integrate Quantisation-Aware Training into our training setup, and post-training quantisation resulted in an excessive decrease in accuracy for our system.
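To illustrate what post-training quantisation does, and why it can cost accuracy, here is a minimal pure-Python sketch of symmetric int8 weight quantisation. The weight values are made up for illustration; the actual experiments used the model framework's quantisation tooling, not this hand-rolled version:

```python
def quantize_int8(weights):
    """Symmetric post-training quantisation: map floats onto int8 levels."""
    scale = max(abs(w) for w in weights) / 127 or 1.0  # guard the all-zero case
    q = [round(w / scale) for w in weights]
    return q, scale

def dequantize(q, scale):
    """Recover approximate float weights from the int8 representation."""
    return [v * scale for v in q]

# Made-up weights purely for illustration
weights = [0.82, -1.27, 0.05, 0.64, -0.33]
q, scale = quantize_int8(weights)
restored = dequantize(q, scale)
max_err = max(abs(w - r) for w, r in zip(weights, restored))
print(q, max_err)
```

Storing 8-bit integers instead of 32-bit floats shrinks the model roughly fourfold, but every weight is snapped to one of 255 levels; the accumulated rounding error across a full network is what degraded our accuracy after post-training quantisation.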

Server Limitations

  • File management on the server has suboptimal performance due to resource constraints: limited memory means that storing a completed dataset for retrieval reduces the processing speed of subsequent datasets.
  • Limited server quota affects overall system performance and processing capability; however, limitations in budget and cost feasibility for charitable organisations meant this was not something we were able to change.

WebApp Limitations

  • We are currently using a basic waiting page instead of a progress bar during processing, due to time constraints and the complexity of implementation.
  • Implementing real-time progress tracking would require substantial changes to the backend and additional development time.

Individual Contributions

Distribution of key contributions across team members

Important Note: Although these individual contribution figures do not average out to the final overall contribution values, we agreed on an even overall contribution because different subsections had different levels of difficulty and importance, and were therefore weighted differently.

System Artifact Contributions


Work packages | Aiden | Arif | Edward | Kostas
Main Role | ML Engineer & Webapp Developer | Technical Writer & QA Engineer | Backend Developer & System Architect | ML Engineer & Research Lead
Client Liaison | 20% | 20% | 20% | 40%
Research & Planning | 25% | 25% | 25% | 25%
Model Development | 35% | 15% | 15% | 35%
Frontend Development | 50% | 20% | 20% | 10%
Backend & Integration | 15% | 20% | 45% | 20%
Testing & QA | 10% | 35% | 30% | 25%
Documentation | 20% | 40% | 20% | 20%
Overall contribution | 25% | 25% | 25% | 25%

Website Contributions

Work packages | Aiden | Arif | Edward | Kostas
Main Role | Frontend Developer & UI Designer | Content Writer & Testing Lead | Technical Lead & Developer | Research Lead & Developer
Website Template and Setup | 40% | 20% | 20% | 20%
Home | 35% | 20% | 20% | 25%
Video | 25% | 25% | 25% | 25%
Requirement | 20% | 20% | 40% | 20%
Research | 20% | 25% | 20% | 35%
Algorithm (if applicable) | 20% | 20% | 40% | 20%
UI Design (if applicable) | 40% | 20% | 20% | 20%
System Design | 25% | 25% | 25% | 25%
Implementation | 20% | 20% | 40% | 20%
Testing | 20% | 40% | 20% | 20%
Evaluation and Future Work | 20% | 40% | 20% | 20%
User and Deployment Manuals | 20% | 20% | 20% | 40%
Legal Issues | 25% | 20% | 20% | 35%
Blog and Monthly Video | 20% | 35% | 20% | 25%
Overall contribution | 25% | 25% | 25% | 25%

Critical Evaluation

Comprehensive analysis of system performance and implementation

User Interface / User Experience

The project successfully implemented a web-based interface with an intuitive design. Key features include drag-and-drop functionality and visual feedback through a loading bar, enhancing usability. Additionally, comprehensive documentation is available on sightlinks.org, providing users with necessary guidance.

Moreover, our source code has clearly documented README files for more technical users, and users will find our system easy to install. We verified this by providing our source code to our partners at Team 14, as well as to our clients and several mock users within our class, and instructing them to follow the README to build the project without any additional help. Positive feedback was received from all parties, with everybody able to successfully deploy our application.

Functionality

The system demonstrates strong functional performance, achieving over 90% overall accuracy. The classification layer achieved 99.7% accuracy on unseen satellite data, significantly higher than comparable implementations in the literature. The object detection layer successfully applied rotated object detection techniques, which had not previously been used in the transport feature detection literature, and achieved sub-50 cm precision using 15 cm resolution data. This is a significant improvement over comparable solutions in the literature.

The system is compatible with multiple common geospatial file formats, including the most common .tif and .jgw formats, while maintaining standardised outputs. This makes the system immediately and easily integrable with common GIS services, which we verified through consultation with an expert at the Environmental Systems Research Institute (Esri) working on the ArcGIS service.
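As an illustration of how a .jgw world file drives georeferencing, here is a minimal sketch that parses the file's six affine parameters and maps a pixel coordinate to projected coordinates. The world file contents below are hypothetical, and converting the resulting projected coordinates to latitude/longitude additionally requires the dataset's coordinate reference system:

```python
def parse_world_file(text):
    """Parse the six affine parameters of a .jgw/.tfw world file.

    Standard line order: A (x pixel size), D (row rotation), B (column
    rotation), E (negative y pixel size), C and F (x, y of the centre of
    the upper-left pixel).
    """
    a, d, b, e, c, f = (float(line) for line in text.split())
    return a, d, b, e, c, f

def pixel_to_geo(col, row, params):
    """Affine pixel -> projected coordinate transform."""
    a, d, b, e, c, f = params
    x = a * col + b * row + c
    y = d * col + e * row + f
    return x, y

# Hypothetical world file: 0.15 m pixels, origin at easting/northing (530000, 180000)
jgw = "0.15\n0.0\n0.0\n-0.15\n530000.0\n180000.0\n"
params = parse_world_file(jgw)
print(pixel_to_geo(100, 200, params))
```

Note the negative E parameter: image rows grow downwards while northings grow upwards, so moving down the image decreases the y coordinate.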

Stability

The processing pipeline is robust, consisting of multiple well-defined modular stages:

  • Image segmentation to divide images into meaningful parts
  • Filtering to remove noise and improve detection accuracy
  • Crossing detection to identify key landmarks
  • Georeferencing for precise spatial alignment

These stages have clearly defined input and output endpoints to ensure a stable pipeline, helping guarantee consistent and reliable performance across different datasets without requiring adjustments to the code. Each stage has unit tests, endpoint tests and integration tests, so that errors are identified immediately and clearly during development and before deployment.
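The contract-based staging described above can be sketched as follows. The stage bodies are toy stand-ins rather than the real segmentation and detection logic, but the structure, pure functions with documented inputs and outputs that can be tested in isolation, mirrors the pipeline design:

```python
# Each stage is a pure function with a documented input/output contract,
# so every stage can be unit-tested in isolation and swapped independently.
# The bodies are toy stand-ins for the real segmentation/detection logic.

def segment(image):
    """Split an image into tiles (here, simply its rows)."""
    return list(image)

def filter_tiles(tiles):
    """Drop empty tiles before the expensive detection stage."""
    return [t for t in tiles if any(t)]

def detect(tiles):
    """Stand-in detector: keep tiles whose pixel sum exceeds a threshold."""
    return [t for t in tiles if sum(t) > 2]

def georeference(detections, origin=(0.0, 0.0)):
    """Stand-in georeferencing: attach a coordinate to each detection."""
    return [{"detection": d, "coord": origin} for d in detections]

def run_pipeline(image):
    return georeference(detect(filter_tiles(segment(image))))

image = [[0, 0, 0], [1, 1, 1], [0, 1, 0]]
print(run_pipeline(image))
```

Because each stage only depends on the shape of its input, a stage can be replaced (for example, swapping the detector model) without touching the rest of the pipeline.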

To the best of our knowledge, the final deployed version does not contain any major bugs for standard inputs. The website interface incorporates input validation, ensuring that only properly formatted data adhering to a standardised geospatial dataset format is accepted. Any error occurring during the processing of data on the server is handled gracefully and returns the user to the input screen.
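A minimal sketch of the kind of input validation described here; the accepted extension list is illustrative, not the system's actual whitelist:

```python
# Illustrative whitelist; the deployed system defines its own accepted set.
VALID_EXTENSIONS = {".tif", ".tiff", ".jgw", ".jpg", ".zip"}

def validate_upload(filename):
    """Return (ok, message): accept only extensions the pipeline can process."""
    dot = filename.rfind(".")
    ext = filename[dot:].lower() if dot != -1 else ""
    if ext not in VALID_EXTENSIONS:
        return False, f"Unsupported file type: {ext or 'no extension'}"
    return True, "OK"

print(validate_upload("tile_001.TIF"))
print(validate_upload("notes.txt"))
```

Rejecting malformed input at the boundary like this is what lets the rest of the pipeline assume well-formed geospatial data.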

Efficiency

Significant optimizations have been achieved to enhance system efficiency:

  • Efficiency Improvements:
    • Initially, the classification screening layer used a VGG16 model, as this was the approach outlined in the research paper on which our system was originally based [1].
    • Throughout the development process, various models were evaluated and integrated to enhance performance. The final architecture, MobileNetV3_small, achieved an 8.85× improvement in processing speed, reducing inference time from 0.185 seconds (VGG16) to 0.021 seconds. All performance evaluations were conducted on an NVIDIA 4070 GPU, with each test repeated three times and averaged to mitigate anomalies caused by background subprocesses.
    • In terms of hardware-independent statistics, the model size was reduced from 528 MB to 21.8 MB, while the FLOP count per inference was reduced from 16 billion to 60 million.
    • Model progression:
      • Initial model: VGG16 (528 MB)
      • Intermediate model: ResNet50 (98 MB)
      • Quantised MobileNetV3 (2.8 MB, not adopted due to accuracy loss)
      • Final model: MobileNetV3 (21.8 MB)

Compatibility

Our system is compatible with several standardised file formats for both input and output, and integrates directly with the platform of Team 14, who are our primary users.

Our system offers multiple access methods to accommodate varying levels of technical expertise and different requirements for development granularity, ranging from complete abstraction from the underlying code to the tools needed to retrain the layers to recognise features other than crosswalks:

  • The core implementation, open-sourced on GitHub
  • A WebApp interface with complete abstraction from the implementation, backed by a server on Azure
  • A robust and well-documented API for easy integration into users' own systems
  • An open-source development toolkit, with proper documentation for recreating our results or retraining for a different feature set

These fulfill the needs of our three clients: an open-source implementation for the Soundscape Community, an easily integrable API for GDI-hub, and a non-technical WebApp for the Wheelchair Alliance.

Maintainability

The codebase follows best practices for maintainability, featuring:

  • A modular pipeline design, ensuring flexibility for future updates.
  • Clear separation of concerns, improving code readability and manageability.

We have very thorough documentation throughout our codebase, including:

  • API documentation on sightlinks.org, for users who wish to use our deployed system
  • Clearly written README files in our codebase, detailing usage examples and installation methods

Project Management

Our project was, in both our opinion and the words of our clients, effectively managed, enabling us to meet our core MoSCoW requirements well ahead of schedule. We consistently updated our progress and received recommendations from clients and experts throughout the development process, ensuring that expectations were clearly defined and aligned.

Key aspects of our project management approach:

  • Collaborative Development: Every team member actively participated in all stages of development, facilitating continuous improvement and group support. We maintained transparency by holding two meetings each week: one during lab hours on Tuesdays and another through our weekly client calls.
  • Quality Assurance: Implemented thorough code review processes, testing protocols and regular refactoring to maintain high code quality.
  • Documentation: Maintained comprehensive documentation throughout development, making it easier for team members to understand and contribute to different parts of the project, as well as allow our clients to offer relevant feedback at each stage.
  • Risk Management: Proactively identified and addressed potential issues early in the development cycle. Inputs and outputs to each function are clearly documented and covered by unit tests. We designed our entire system before any implementation began, and although we added additional features later, we did not deviate from this strategy.
  • Timeline Management: All critical deadlines were met. Our structured approach provided flexibility to address emerging issues and refine the system beyond initial targets without compromising on our original intended functionality.

Our management approach prioritized:

  • Regular communication with clients and partners to maximize transparency and incorporate feedback from people who are experts in their field.
  • Clear, but flexible, task delegation and responsibility assignment to optimize efficiency while accommodating people's personal needs.
  • Systematic bug tracking and resolution to maintain system stability and performance, and maximise user safety.

Future Work

Potential improvements and future development directions

Dataset Expansion

The primary challenge in improving the accuracy and precision of our system has consistently been the limited availability of datasets to train the machine learning models. Due to budget constraints and the scarcity of suitable datasets, we developed our own dataset, which is narrowly tailored to meet our immediate needs, thus limiting scalability. The dataset we collected primarily contains European crosswalk features and lacks substantial data on occluded and obstructed crossings. While our system and pipeline are designed for global functionality, the current dataset restricts its effectiveness largely to Europe, with accuracy diminishing outside this region.

Key areas for dataset expansion include:

  • Geographic Diversity:
    • Crosswalks from different continents and regions
    • A variety of urban planning styles and road layouts
    • Accounting for diverse crossing designs across countries
    • Annotations for occluded and partially obstructed crosswalk data
  • Environmental Conditions:
    • Capturing different lighting conditions and times of day
    • Various weather conditions that affect visibility
    • Seasonal changes that alter crosswalk appearances

Future improvements could be achieved with minimal financial investment by:

  • Partnering with international and governmental mapping organizations that have access to proprietary datasets
  • Exploring automated data annotation methods like pseudo-labelling or self-training
  • Developing a community-driven data contribution platform, similar to the one created by our partners at Team 14.
  • Manually annotating data with a small team to introduce more diversity and representation into the training set without significantly increasing dataset size
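Pseudo-labelling, one of the automated annotation methods listed above, can be sketched as follows. The model and sample pool here are toy stand-ins: the current model labels unlabelled images, and only predictions above a confidence threshold are kept as new training data:

```python
def pseudo_label(unlabelled, model, confidence_threshold=0.9):
    """Keep model predictions above the threshold as new training labels."""
    new_labels = []
    for sample in unlabelled:
        label, confidence = model(sample)
        if confidence >= confidence_threshold:
            new_labels.append((sample, label))
    return new_labels

# Toy stand-in model: "detects" a crosswalk when a sample name contains stripes.
def toy_model(sample):
    striped = "stripes" in sample
    return ("crosswalk" if striped else "background", 0.95 if striped else 0.6)

pool = ["stripes_a", "plain_b", "stripes_c"]
print(pseudo_label(pool, toy_model))
```

The threshold is the key hyperparameter: set too low, the model's own mistakes get recycled into the training set; set too high, very little new data is gained.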

Investigation of Applying Pre-Processing Techniques

Although the pre-processing techniques we applied did not significantly improve the accuracy of our system, this was largely due to the limitations of our dataset, as previously mentioned. Obstructed and occluded crosswalks were not addressed by our system. However, if the dataset were to include these features, the application of pre-processing techniques like edge detection could greatly enhance detection coverage, particularly in lower-income cities where road repainting is less frequent.
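Edge detection of the kind mentioned here can be sketched with a 3×3 Sobel filter in pure Python. The input is a toy stripe boundary rather than real satellite data, but it shows how the filter responds strongly along a paint/asphalt transition, which is why it could help with faded markings:

```python
def sobel_magnitude(img):
    """Approximate gradient magnitude with 3x3 Sobel kernels (pure Python)."""
    kx = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]  # horizontal gradient kernel
    ky = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]  # vertical gradient kernel
    h, w = len(img), len(img[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            gx = sum(kx[j][i] * img[y + j - 1][x + i - 1]
                     for j in range(3) for i in range(3))
            gy = sum(ky[j][i] * img[y + j - 1][x + i - 1]
                     for j in range(3) for i in range(3))
            out[y][x] = (gx * gx + gy * gy) ** 0.5
    return out

# A vertical stripe boundary: the edge response peaks along the transition.
img = [[0, 0, 9, 9]] * 4
edges = sobel_magnitude(img)
print(edges[1])
```

In practice this would run as a pre-processing pass over the imagery, with the edge map fed to the detector alongside, or instead of, the raw pixels.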

Expanded Accessibility Features

By incorporating multiple datasets, our system could be trained to simultaneously recognize various objects beyond just crossroads. All of our models are designed to support multi-class predictions, allowing us to identify several transport features, such as pelican crossings, tactile pavements, and others. This multi-class prediction approach eliminates the need to train the models separately for each feature set, enhancing both generalization and performance, as the system would only need to process a geographical area once.