Achievement Status

Current status of project requirements and achievements

Category | Requirement | Status | Contributors
Must have | Implement pedestrian crossing recognition | Completed | Aiden, Kostas
Must have | Create a standardized output format that is easily integrable with other mapping systems | Completed | Edward, Arif
Must have | Develop a georeferencing system to convert pixel coordinates to precise geographical coordinates (latitude/longitude) | Completed | Edward, Arif
Should have | Performance optimisation methods | Completed | Kostas
Should have | Implement support for multiple input formats (.tif, .jgw) to ensure compatibility with various data sources | Completed | Edward
Should have | Platform integration with Team 14's platform | Completed | Arif
Could have | Filtering and processing the output data from our machine learning models | Completed | Edward
Could have | User-friendly WebApp interface with drag-and-drop features, allowing non-technical users to easily use our system | Completed | Aiden
Could have | Maximise reliability over accuracy by adjusting confidence hyperparameters in our predictions to minimise the false positive rate | Completed | Kostas, Edward
Could have | Extend the system to recognize additional types of accessibility features beyond crosswalks | Not completed due to budget constraints | N/A
Won't have | Support for indoor accessibility mapping and navigation features | Out of scope | N/A
Won't have | Implementation of general hazard detection in satellite imagery | Out of scope | N/A
Won't have | Real-time updates and live detection capabilities | Out of scope | N/A
Won't have | Natural language processing and voice interface integration | Out of scope | N/A

Must Have Requirements

100% Complete

Should Have Requirements

100% Complete

Known Bugs

Development approach and resolved issues

Development Approach

As our system aims to provide data for accessible systems supporting people with visual and mobility impairments, preventing bugs was important not merely as a matter of inconvenience but as a matter of potential danger: left unaddressed, bugs could compromise the safety of users. For this reason, we chose a development approach that prioritised tackling bugs before growing the system in scale, and each feature was rigorously checked before integration. As a result, to the best of our knowledge no bugs remain in the system. Instead, we can provide a history of some key bugs we encountered over the course of development.

Category | Issue Description | Resolution | Status
Detection | Duplicate crossroad detections being saved in the output | Implemented IoU (Intersection over Union) filtering to remove redundant detections | Fixed
Detection | False positives in areas with patterns similar to crosswalks, due to overfitting | Enhanced model training with negative samples and increased confidence thresholds | Fixed
Testing | API tests failing due to timeout issues | Adjusted timeout parameters and implemented retry mechanisms | Fixed
Testing | Test success rates affected by network conditions | Added network strength checks and conditional test execution | Fixed
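The IoU filtering mentioned in the table can be sketched as follows. This is a minimal illustration assuming axis-aligned boxes in (x1, y1, x2, y2) form and an illustrative 0.5 threshold; the real system works with rotated boxes and its own tuned threshold:

```python
def iou(a, b):
    """Intersection-over-Union of two axis-aligned boxes (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter) if inter else 0.0

def deduplicate(detections, threshold=0.5):
    """Keep the highest-confidence box from each cluster of overlapping boxes."""
    kept = []
    # Process highest-confidence detections first, greedily suppressing overlaps
    for box, conf in sorted(detections, key=lambda d: -d[1]):
        if all(iou(box, k[0]) < threshold for k in kept):
            kept.append((box, conf))
    return kept

dets = [((0, 0, 10, 10), 0.9), ((1, 1, 11, 11), 0.8), ((50, 50, 60, 60), 0.7)]
print(deduplicate(dets))
```

The two near-identical boxes collapse into one, while the distant detection survives; this is the same greedy suppression idea used in standard non-maximum suppression.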

System Limitations

Current constraints and technical limitations

Data Limitations

  • Although our dataset is the most expansive open-source rotated object detection dataset available, it is limited in the scope of locations it includes. Because crossing designs are localised, with regulations differing between countries, and because our clients' primary focus is within Europe, our data is limited to the United Kingdom and a subset of countries in Central Europe.
  • Our models are trained on clear, unobstructed data due to the data annotation policy we chose. This results in a low detection rate for crosswalks obstructed by obstacles such as cars, as well as lower accuracy on occluded crosswalks, such as those covered by significant shadow.

Processing Speed vs Accuracy Optimisation

  • In our system, there is a trade-off between processing speed and accuracy. Following the advice of our clients and professors, we opted to prioritize processing accuracy over faster speed. However, we acknowledge that this may not be suitable for all use cases.
  • Although alternative methods for bounding box and classification detection can deliver higher accuracy, they demand more processing power, which would impact our system's performance. We chose YOLO for its optimal balance of speed and accuracy, although methods like Rotated Faster R-CNN provide better accuracy. Further speed gains could have been achieved by incorporating quantised models into the classification layer; however, we were unable to integrate Quantisation-Aware Training into our training setup, and post-training quantisation resulted in an excessive decrease in accuracy for our system.
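To illustrate what post-training quantisation does, and why it can cost accuracy, here is a minimal pure-Python sketch of symmetric int8 weight quantisation. The weight values are made up for illustration; the actual experiments used the model framework's quantisation tooling, not this hand-rolled version:

```python
def quantize_int8(weights):
    """Symmetric post-training quantisation: map floats onto int8 levels."""
    scale = max(abs(w) for w in weights) / 127 or 1.0  # guard the all-zero case
    q = [round(w / scale) for w in weights]
    return q, scale

def dequantize(q, scale):
    """Recover approximate float weights from the int8 representation."""
    return [v * scale for v in q]

# Made-up weights purely for illustration
weights = [0.82, -1.27, 0.05, 0.64, -0.33]
q, scale = quantize_int8(weights)
restored = dequantize(q, scale)
max_err = max(abs(w - r) for w, r in zip(weights, restored))
print(q, max_err)
```

Storing 8-bit integers instead of 32-bit floats shrinks the model roughly fourfold, but every weight is snapped to one of 255 levels; the accumulated rounding error across a full network is what degraded our accuracy after post-training quantisation.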

Server Limitations

  • File management on the server has suboptimal performance due to resource constraints: limited memory means that storing a completed dataset for retrieval reduces the processing speed of subsequent datasets.
  • Limited server quota affects overall system performance and processing capability; however, limitations in budget and cost feasibility for charitable organisations meant this was not something we were able to change.

WebApp Limitations

  • We are currently using a basic waiting page instead of a progress bar during processing, due to time constraints and the complexity of implementation.
  • Implementing real-time progress tracking would require substantial changes to the backend and additional development time.

Individual Contributions

Distribution of key contributions across team members

Important Note: Although these individual contribution figures do not average out to the final overall contribution values, we agreed on an even overall contribution because different subsections had different levels of difficulty and importance, and were therefore weighted differently.

System Artifact Contributions


Work packages | Aiden | Arif | Edward | Kostas
Main Role | ML Engineer & Webapp Developer | Technical Writer & QA Engineer | Backend Developer & System Architect | ML Engineer & Research Lead
Client Liaison | 20% | 20% | 20% | 40%
Research & Planning | 25% | 25% | 25% | 25%
Model Development | 35% | 15% | 15% | 35%
Frontend Development | 50% | 20% | 20% | 10%
Backend & Integration | 15% | 20% | 45% | 20%
Testing & QA | 10% | 35% | 30% | 25%
Documentation | 20% | 40% | 20% | 20%
Overall contribution | 25% | 25% | 25% | 25%

Website Contributions

Work packages | Aiden | Arif | Edward | Kostas
Main Role | Frontend Developer & UI Designer | Content Writer & Testing Lead | Technical Lead & Developer | Research Lead & Developer
Website Template and Setup | 40% | 20% | 20% | 20%
Home | 35% | 20% | 20% | 25%
Video | 25% | 25% | 25% | 25%
Requirement | 20% | 20% | 40% | 20%
Research | 20% | 25% | 20% | 35%
Algorithm (if applicable) | 20% | 20% | 40% | 20%
UI Design (if applicable) | 40% | 20% | 20% | 20%
System Design | 25% | 25% | 25% | 25%
Implementation | 20% | 20% | 40% | 20%
Testing | 20% | 40% | 20% | 20%
Evaluation and Future Work | 20% | 40% | 20% | 20%
User and Deployment Manuals | 20% | 20% | 20% | 40%
Legal Issues | 25% | 20% | 20% | 35%
Blog and Monthly Video | 20% | 35% | 20% | 25%
Overall contribution | 25% | 25% | 25% | 25%

Critical Evaluation

Comprehensive analysis of system performance and implementation

User Interface / User Experience

The project successfully implemented a web-based interface with an intuitive design. Key features include drag-and-drop functionality and visual feedback through a loading bar, enhancing usability. Additionally, comprehensive documentation is available on sightlinks.org, providing users with necessary guidance.

Moreover, our source code has clearly documented README files for more technical users, and users will find our system easy to install. We verified this by providing our source code to our partners at Team 14, as well as to our clients and several mock users within our class, and instructing them to follow the README to build the project without any additional help. Positive feedback was received from all parties, with everybody able to successfully deploy our application.

Functionality

The system demonstrates strong functional performance, achieving over 90% overall accuracy. The classification layer achieved 99.7% accuracy on unseen satellite data, significantly higher than comparable implementations in the literature. The object detection layer successfully applied rotated object detection techniques, which had not previously been used in the transport feature detection literature, and achieved sub-50 cm precision using 15 cm resolution data. This is a significant improvement over comparable solutions in the literature.

The system is compatible with multiple common geospatial file formats, including the most common .tif and .jgw formats, while maintaining standardised outputs. This makes the system immediately and easily integrable with common GIS services, which we verified through consultation with an expert at the Environmental Systems Research Institute (Esri) working on the ArcGIS service.
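As an illustration of how a .jgw world file drives georeferencing, here is a minimal sketch that parses the file's six affine parameters and maps a pixel coordinate to projected coordinates. The world file contents below are hypothetical, and converting the resulting projected coordinates to latitude/longitude additionally requires the dataset's coordinate reference system:

```python
def parse_world_file(text):
    """Parse the six affine parameters of a .jgw/.tfw world file.

    Standard line order: A (x pixel size), D (row rotation), B (column
    rotation), E (negative y pixel size), C and F (x, y of the centre of
    the upper-left pixel).
    """
    a, d, b, e, c, f = (float(line) for line in text.split())
    return a, d, b, e, c, f

def pixel_to_geo(col, row, params):
    """Affine pixel -> projected coordinate transform."""
    a, d, b, e, c, f = params
    x = a * col + b * row + c
    y = d * col + e * row + f
    return x, y

# Hypothetical world file: 0.15 m pixels, origin at easting/northing (530000, 180000)
jgw = "0.15\n0.0\n0.0\n-0.15\n530000.0\n180000.0\n"
params = parse_world_file(jgw)
print(pixel_to_geo(100, 200, params))
```

Note the negative E parameter: image rows grow downwards while northings grow upwards, so moving down the image decreases the y coordinate.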

Stability

The processing pipeline is robust, consisting of multiple well-defined modular stages:

  • Image segmentation to divide images into meaningful parts
  • Filtering to remove noise and improve detection accuracy
  • Crossing detection to identify key landmarks
  • Georeferencing for precise spatial alignment

These stages have clearly defined input and output endpoints to ensure a stable pipeline, helping guarantee consistent and reliable performance across different datasets without requiring adjustments to the code. Each stage has unit tests, endpoint tests and integration tests, so that errors are identified immediately and clearly during development and before deployment.
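The contract-based staging described above can be sketched as follows. The stage bodies are toy stand-ins rather than the real segmentation and detection logic, but the structure, pure functions with documented inputs and outputs that can be tested in isolation, mirrors the pipeline design:

```python
# Each stage is a pure function with a documented input/output contract,
# so every stage can be unit-tested in isolation and swapped independently.
# The bodies are toy stand-ins for the real segmentation/detection logic.

def segment(image):
    """Split an image into tiles (here, simply its rows)."""
    return list(image)

def filter_tiles(tiles):
    """Drop empty tiles before the expensive detection stage."""
    return [t for t in tiles if any(t)]

def detect(tiles):
    """Stand-in detector: keep tiles whose pixel sum exceeds a threshold."""
    return [t for t in tiles if sum(t) > 2]

def georeference(detections, origin=(0.0, 0.0)):
    """Stand-in georeferencing: attach a coordinate to each detection."""
    return [{"detection": d, "coord": origin} for d in detections]

def run_pipeline(image):
    return georeference(detect(filter_tiles(segment(image))))

image = [[0, 0, 0], [1, 1, 1], [0, 1, 0]]
print(run_pipeline(image))
```

Because each stage only depends on the shape of its input, a stage can be replaced (for example, swapping the detector model) without touching the rest of the pipeline.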

To the best of our knowledge, the final deployed version does not contain any major bugs for standard inputs. The website interface incorporates input validation, ensuring that only properly formatted data adhering to a standardised geospatial dataset format is accepted. Any error occurring during the processing of data on the server is handled gracefully and returns the user to the input screen.
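A minimal sketch of the kind of input validation described here; the accepted extension list is illustrative, not the system's actual whitelist:

```python
# Illustrative whitelist; the deployed system defines its own accepted set.
VALID_EXTENSIONS = {".tif", ".tiff", ".jgw", ".jpg", ".zip"}

def validate_upload(filename):
    """Return (ok, message): accept only extensions the pipeline can process."""
    dot = filename.rfind(".")
    ext = filename[dot:].lower() if dot != -1 else ""
    if ext not in VALID_EXTENSIONS:
        return False, f"Unsupported file type: {ext or 'no extension'}"
    return True, "OK"

print(validate_upload("tile_001.TIF"))
print(validate_upload("notes.txt"))
```

Rejecting malformed input at the boundary like this is what lets the rest of the pipeline assume well-formed geospatial data.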

Efficiency

Significant optimizations have been achieved to enhance system efficiency:

  • Efficiency Improvements:
    • Initially, the classification screening layer used a VGG16 model, as this was the approach outlined in the research paper on which our system was originally based [1].
    • Throughout the development process, various models were evaluated and integrated to enhance performance. The final architecture, MobileNetV3_small, achieved an 8.85× improvement in processing speed, reducing inference time from 0.185 seconds (VGG16) to 0.021 seconds. All performance evaluations were conducted on an NVIDIA 4070 GPU, with each test repeated three times and averaged to mitigate anomalies caused by background subprocesses.
    • In terms of hardware-independent statistics, the model size was reduced from 528 MB to 21.8 MB, while the FLOP count per inference was reduced from 16 billion to 60 million.
    • Model progression:
      • Initial model: VGG16 (528 MB)
      • Intermediate model: ResNet50 (98 MB)
      • Quantised MobileNetV3 (2.8 MB, not adopted due to accuracy loss)
      • Final model: MobileNetV3 (21.8 MB)

Compatibility

Our system is compatible with several standardised file formats for both input and output, and integrates directly with the platform of Team 14, who are our primary users.

Our system offers multiple access methods to accommodate varying levels of technical expertise and different requirements for development granularity, ranging from complete abstraction from the underlying code to the tools needed to retrain the layers to recognise features other than crosswalks:

  • The core implementation, open-sourced on GitHub
  • A WebApp interface with complete abstraction from the implementation, backed by a server on Azure
  • A robust and well-documented API for easy integration into users' own systems
  • An open-source development toolkit, with proper documentation for recreating our results or retraining for a different feature set

These fulfill the needs of our three clients: an open-source implementation for the Soundscape Community, an easily integrable API for GDI-hub, and a non-technical WebApp for the Wheelchair Alliance.

Maintainability

The codebase follows best practices for maintainability, featuring:

  • A modular pipeline design, ensuring flexibility for future updates.
  • Clear separation of concerns, improving code readability and manageability.

We have very thorough documentation throughout our codebase, including:

  • API documentation on sightlinks.org, for users who wish to use our deployed system
  • Clearly written README files in our codebase, detailing usage examples and installation methods

Project Management

Our project was, in both our opinion and the words of our clients, effectively managed, enabling us to meet our core MoSCoW requirements well ahead of schedule. We consistently updated our progress and received recommendations from clients and experts throughout the development process, ensuring that expectations were clearly defined and aligned.

Key aspects of our project management approach:

  • Collaborative Development: Every team member actively participated in all stages of development, facilitating continuous improvement and group support. We maintained transparency by holding two meetings each week: one during lab hours on Tuesdays and another through our weekly client calls.
  • Quality Assurance: Implemented thorough code review processes, testing protocols and regular refactoring to maintain high code quality.
  • Documentation: Maintained comprehensive documentation throughout development, making it easier for team members to understand and contribute to different parts of the project, as well as allow our clients to offer relevant feedback at each stage.
  • Risk Management: Proactively identified and addressed potential issues early in the development cycle. Inputs and outputs to each function are clearly documented and covered by unit tests. We designed our entire system before any implementation began, and although we added additional features later, we did not deviate from this strategy.
  • Timeline Management: All critical deadlines were met. Our structured approach provided flexibility to address emerging issues and refine the system beyond initial targets without compromising on our original intended functionality.

Our management approach prioritized:

  • Regular communication with clients and partners to maximize transparency and incorporate feedback from people who are experts in their field.
  • Clear, but flexible, task delegation and responsibility assignment to optimize efficiency while accommodating people's personal needs.
  • Systematic bug tracking and resolution to maintain system stability and performance, and maximise user safety.

Future Work

Potential improvements and future development directions

Dataset Expansion

The primary challenge in improving the accuracy and precision of our system has consistently been the limited availability of datasets to train the machine learning models. Due to budget constraints and the scarcity of suitable datasets, we developed our own dataset, which is narrowly tailored to meet our immediate needs, thus limiting scalability. The dataset we collected primarily contains European crosswalk features and lacks substantial data on occluded and obstructed crossings. While our system and pipeline are designed for global functionality, the current dataset restricts its effectiveness largely to Europe, with accuracy diminishing outside this region.

Key areas for dataset expansion include:

  • Geographic Diversity:
    • Crosswalks from different continents and regions
    • A variety of urban planning styles and road layouts
    • Accounting for diverse crossing designs across countries
    • Annotations for occluded and partially obstructed crosswalk data
  • Environmental Conditions:
    • Capturing different lighting conditions and times of day
    • Various weather conditions that affect visibility
    • Seasonal changes that alter crosswalk appearances

Future improvements could be achieved with minimal financial investment by:

  • Partnering with international and governmental mapping organizations that have access to proprietary datasets
  • Exploring automated data annotation methods like pseudo-labelling or self-training
  • Developing a community-driven data contribution platform, similar to the one created by our partners at Team 14.
  • Manually annotating data with a small team to introduce more diversity and representation into the training set without significantly increasing dataset size
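Pseudo-labelling, one of the automated annotation methods listed above, can be sketched as follows. The model and sample pool here are toy stand-ins: the current model labels unlabelled images, and only predictions above a confidence threshold are kept as new training data:

```python
def pseudo_label(unlabelled, model, confidence_threshold=0.9):
    """Keep model predictions above the threshold as new training labels."""
    new_labels = []
    for sample in unlabelled:
        label, confidence = model(sample)
        if confidence >= confidence_threshold:
            new_labels.append((sample, label))
    return new_labels

# Toy stand-in model: "detects" a crosswalk when a sample name contains stripes.
def toy_model(sample):
    striped = "stripes" in sample
    return ("crosswalk" if striped else "background", 0.95 if striped else 0.6)

pool = ["stripes_a", "plain_b", "stripes_c"]
print(pseudo_label(pool, toy_model))
```

The threshold is the key hyperparameter: set too low, the model's own mistakes get recycled into the training set; set too high, very little new data is gained.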

Investigation of Applying Pre-Processing Techniques

Although the pre-processing techniques we applied did not significantly improve the accuracy of our system, this was largely due to the limitations of our dataset, as previously mentioned. Obstructed and occluded crosswalks were not addressed by our system. However, if the dataset were to include these features, the application of pre-processing techniques like edge detection could greatly enhance detection coverage, particularly in lower-income cities where road repainting is less frequent.
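Edge detection of the kind mentioned here can be sketched with a 3×3 Sobel filter in pure Python. The input is a toy stripe boundary rather than real satellite data, but it shows how the filter responds strongly along a paint/asphalt transition, which is why it could help with faded markings:

```python
def sobel_magnitude(img):
    """Approximate gradient magnitude with 3x3 Sobel kernels (pure Python)."""
    kx = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]  # horizontal gradient kernel
    ky = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]  # vertical gradient kernel
    h, w = len(img), len(img[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            gx = sum(kx[j][i] * img[y + j - 1][x + i - 1]
                     for j in range(3) for i in range(3))
            gy = sum(ky[j][i] * img[y + j - 1][x + i - 1]
                     for j in range(3) for i in range(3))
            out[y][x] = (gx * gx + gy * gy) ** 0.5
    return out

# A vertical stripe boundary: the edge response peaks along the transition.
img = [[0, 0, 9, 9]] * 4
edges = sobel_magnitude(img)
print(edges[1])
```

In practice this would run as a pre-processing pass over the imagery, with the edge map fed to the detector alongside, or instead of, the raw pixels.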

Expanded Accessibility Features

By incorporating multiple datasets, our system could be trained to simultaneously recognize various objects beyond just crossroads. All of our models are designed to support multi-class predictions, allowing us to identify several transport features, such as pelican crossings, tactile pavements, and others. This multi-class prediction approach eliminates the need to train the models separately for each feature set, enhancing both generalization and performance, as the system would only need to process a geographical area once.