The Taskar Center for Accessible Technology aims to alleviate mobility challenges faced by people with disabilities. One initiative towards this goal is iOSPointMapper, an application designed for standardized data collection that supports reliable sidewalk mapping. The data collected through the application can eventually be consumed by other applications that provide guidance for accessible navigation.
iOSPointMapper runs on the iPhone 13 Pro and later, and on the iPad Pro (11-inch, 4th generation and later; 12.9-inch, 6th generation and later). The app uses the device's camera and LiDAR sensor to record the surrounding environment. The recording is used to identify points of interest that either facilitate or hinder accessibility for individuals with disabilities. To do this, the app runs on-device Artificial Intelligence (AI) that performs semantic segmentation and object identification.
Points of Interest (POI)
A Point of Interest is a specific location or feature within the environment that is relevant to accessibility. Such points can either facilitate or hinder mobility for individuals with disabilities, and they are critical for understanding and enhancing the navigability of spaces.
In the current iteration, iOSPointMapper aims to identify the following points of interest:
- Pedestrian Sidewalks: These are pathways designated for pedestrian use, crucial for safe and accessible navigation.
- Walls and Fences: These structures can impact navigation by defining the boundaries of walkable areas.
- Poles and Traffic Lights: These vertical structures are significant for providing guidance and safety to pedestrians.
Artificial Intelligence for POI Identification
POI identification primarily relies on an on-device Convolutional Neural Network (CNN) model developed in PyTorch, a widely used deep learning framework whose models can be converted to run on iOS. The CNN has been trained for semantic segmentation on Cityscapes, a large-scale image dataset for urban scene understanding.
When an image is processed by the model, it generates a segmentation mask that delineates the POIs within the image and assigns each one a category. These segments are then used to determine the locations of the Points of Interest.
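As an illustration, the following is a minimal Swift sketch of how the on-device segmentation step could be invoked, assuming the PyTorch network has been converted to a Core ML model (named SegmentationModel here, a hypothetical name) that outputs a per-pixel class-index array:

```swift
import Vision
import CoreML

/// Runs the converted segmentation model on a camera frame and returns the
/// per-pixel class-index mask. `SegmentationModel` is a hypothetical name for
/// the Core ML model generated from the PyTorch network; the output format is assumed.
func segment(pixelBuffer: CVPixelBuffer) throws -> MLMultiArray? {
    let coreMLModel = try SegmentationModel(configuration: MLModelConfiguration()).model
    let visionModel = try VNCoreMLModel(for: coreMLModel)

    var mask: MLMultiArray?
    let request = VNCoreMLRequest(model: visionModel) { request, _ in
        // Assumes the model emits a single multi-array of class indices per pixel.
        if let observation = request.results?.first as? VNCoreMLFeatureValueObservation {
            mask = observation.featureValue.multiArrayValue
        }
    }
    request.imageCropAndScaleOption = .scaleFill  // match the model's training resolution

    let handler = VNImageRequestHandler(cvPixelBuffer: pixelBuffer, options: [:])
    try handler.perform([request])
    return mask
}
```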
POI Location Detection
iOSPointMapper uses the LiDAR sensor present in recent iterations of the iPhone Pro and iPad Pro. The LiDAR sensor provides a depth map aligned with the image captured by the camera. Once the image segments are produced by the CNN model, they are correlated with the depth map to determine the relative distances of the identified Points of Interest (POIs) from the device.
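The depth lookup could, for instance, be performed per segment using ARKit's scene-depth data. The sketch below assumes the caller has already mapped a segment's centroid into depth-map pixel coordinates:

```swift
import ARKit

/// Looks up the LiDAR depth (in metres) at a given pixel of the current frame.
/// `point` is expressed in depth-map pixel coordinates; mapping a segmentation
/// centroid into these coordinates is assumed to be done by the caller.
func depth(at point: CGPoint, in frame: ARFrame) -> Float? {
    guard let depthMap = frame.sceneDepth?.depthMap else { return nil }

    CVPixelBufferLockBaseAddress(depthMap, .readOnly)
    defer { CVPixelBufferUnlockBaseAddress(depthMap, .readOnly) }

    let width = CVPixelBufferGetWidth(depthMap)
    let height = CVPixelBufferGetHeight(depthMap)
    let x = min(max(Int(point.x), 0), width - 1)
    let y = min(max(Int(point.y), 0), height - 1)

    // Scene depth stores one 32-bit float distance (in metres) per pixel.
    let rowBytes = CVPixelBufferGetBytesPerRow(depthMap)
    guard let base = CVPixelBufferGetBaseAddress(depthMap) else { return nil }
    let rowPointer = base.advanced(by: y * rowBytes)
    return rowPointer.assumingMemoryBound(to: Float32.self)[x]
}
```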
Additionally, the application uses the iOS CoreLocation framework to obtain the location and orientation of the device. By combining the device's location and orientation with the relative distances of the POIs, the coordinates of the Points of Interest are calculated using the Haversine formula.
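Under the stated assumptions (a spherical Earth, and a bearing already derived from the device heading and the POI's offset within the frame), the coordinate projection might look like this sketch:

```swift
import CoreLocation

/// Projects a POI's coordinate from the device position, the bearing towards the
/// POI (degrees clockwise from true north), and the LiDAR-derived distance in metres.
/// This is the standard spherical-earth destination formula from the haversine family;
/// deriving the bearing from the device heading and in-frame offset is assumed.
func poiCoordinate(from device: CLLocationCoordinate2D,
                   bearingDegrees: Double,
                   distanceMetres: Double) -> CLLocationCoordinate2D {
    let earthRadius = 6_371_000.0                     // mean Earth radius in metres
    let angularDistance = distanceMetres / earthRadius
    let bearing = bearingDegrees * .pi / 180
    let lat1 = device.latitude * .pi / 180
    let lon1 = device.longitude * .pi / 180

    let lat2 = asin(sin(lat1) * cos(angularDistance)
                    + cos(lat1) * sin(angularDistance) * cos(bearing))
    let lon2 = lon1 + atan2(sin(bearing) * sin(angularDistance) * cos(lat1),
                            cos(angularDistance) - sin(lat1) * sin(lat2))

    return CLLocationCoordinate2D(latitude: lat2 * 180 / .pi,
                                  longitude: lon2 * 180 / .pi)
}
```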
Application User Guide
Users choose which points of interest they wish to capture on the application's landing screen, the SetupView.
Once the desired points of interest are set up, the user can start capturing the surroundings from the CameraView screen. This screen displays the camera feed (photo mode) in the top frame and the corresponding point-of-interest segmentation, generated by the computer vision model, in the bottom frame. When the user identifies what they want to capture in the CameraView, they tap the 'Capture' button, which takes them to the AnnotationView.
The AnnotationView is the screen used to vet the captured segmentations of the points of interest. The user can accept or reject each segmentation, or suggest an alternative tag for the point of interest (for example, correcting a traffic light that was misidentified as a pole). Once all the segmented points of interest have been validated by the user, their details (such as location and type) are sent to an external server for post-processing.
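The exact upload schema is defined by the post-processing server; the sketch below only illustrates the idea, with assumed field names and a placeholder endpoint URL:

```swift
import Foundation

/// Shape of one validated POI record. The field names and the endpoint URL are
/// illustrative assumptions, not the server's actual schema.
struct POIRecord: Codable {
    let type: String          // e.g. "sidewalk", "pole", "traffic_light"
    let latitude: Double
    let longitude: Double
    let capturedAt: Date
    let userTag: String?      // alternative tag suggested in the AnnotationView, if any
}

/// Posts the validated records as JSON to a placeholder endpoint.
func upload(_ records: [POIRecord]) async throws {
    let url = URL(string: "https://example.org/poi/upload")!   // placeholder endpoint
    var request = URLRequest(url: url)
    request.httpMethod = "POST"
    request.setValue("application/json", forHTTPHeaderField: "Content-Type")

    let encoder = JSONEncoder()
    encoder.dateEncodingStrategy = .iso8601
    request.httpBody = try encoder.encode(records)

    let (_, response) = try await URLSession.shared.data(for: request)
    guard let http = response as? HTTPURLResponse, (200..<300).contains(http.statusCode) else {
        throw URLError(.badServerResponse)
    }
}
```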
Future Directions
As future directions, the Taskar Center aims to address the following:
- POI Altitude: Calculation of the altitude of the Points of Interest captured by the camera.
- Speech-to-Text Annotation Vetting: Instead of only offering an input form in the AnnotationView, allow users to provide spoken input about the annotation.
- Location Vetting: Incorporate a map view so users can also vet the accuracy of the objects' calculated locations.
The Taskar Center for Accessible Technology thus aspires to alleviate traversal barriers for people with disabilities by contributing to standardized data collection through iOSPointMapper.