Detecting Speed Limit Signs with YOLOv5 on Android Phone
A friend of mine remarked that, to his surprise, he couldn’t find any mobile applications for camera-based speed limit detection. Detecting speed limit signs and tracking the current limit is a pretty handy advanced driver assistance feature, preventing scenarios where the driver accidentally speeds because they missed a change in the limit. The technology is already somewhat established in modern vehicles (although it works with varying levels of reliability). So, motivated by this gaping hole in the machine vision app market, I decided to make an app of my own for detecting speed limit signs. Admittedly, I never properly checked whether such apps actually exist; I mainly wanted a good excuse to try my hand at deploying machine vision models on Android.
Model Development and Deployment
Many general-purpose object detection models can detect traffic signs out of the box, largely because stop signs are among the classes in the COCO dataset. Typically, traffic sign analysis pipelines apply an object detection model to localise the signs and a separate classification model to predict the exact type of each sign. Such a two-stage approach would also work for recognising different speed limit signs. However, I wanted to cut some corners and train a single object detection model to recognise the different speed limit signs directly, mostly because PyTorch provides a straightforward demo for deploying object detection models on Android.
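For reference, a two-stage pipeline would look roughly like the sketch below. This is only an illustration of the idea, not code from the demo: the classifier file name, its input size, and the class layout are all hypothetical placeholders.

```python
import torch
from torchvision import transforms
from PIL import Image

# Hypothetical two-stage pipeline: a generic sign detector followed by a
# dedicated classifier for the sign contents.
detector = torch.hub.load("ultralytics/yolov5", "yolov5s")  # pretrained COCO detector
classifier = torch.jit.load("sign_classifier.pt")           # hypothetical TorchScript classifier

preprocess = transforms.Compose([
    transforms.Resize((64, 64)),  # classifier input size (assumed)
    transforms.ToTensor(),
])

def read_speed_limits(image: Image.Image):
    """Detect signs in the image, then classify each cropped sign region."""
    detections = detector(image).xyxy[0]  # rows of (x1, y1, x2, y2, conf, class)
    limits = []
    for x1, y1, x2, y2, conf, cls in detections.tolist():
        crop = image.crop((int(x1), int(y1), int(x2), int(y2)))
        logits = classifier(preprocess(crop).unsqueeze(0))  # add batch dimension
        limits.append(int(logits.argmax(dim=1)))            # predicted limit class
    return limits
```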
For training, I downloaded the Mapillary traffic sign dataset, which features an impressive collection of speed limit signs from around the world, all labelled with the exact value of the limit. A total of 3,500 images were available for training. As the detection model, I chose YOLOv5-S, since it is the model used in the PyTorch Android demo. It is also close to state-of-the-art, offering an excellent speed/accuracy trade-off. To convert the Mapillary data into a format suitable for training YOLOv5-S, I applied this code I found with some minor modifications. I trained the model for 60 epochs with a batch size of 4 and an image resolution of 1280x1280 pixels. The high resolution was necessary, as most speed limit signs appear fairly small in the images. All other parameters I kept at their default values.
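The conversion essentially boils down to mapping each Mapillary bounding box to YOLO’s normalised (class, x_center, y_center, width, height) label format. A minimal sketch, with the bounding box field names assumed from the Mapillary JSON annotations:

```python
def mapillary_to_yolo(box, img_w, img_h, class_id):
    """Convert a Mapillary-style pixel bounding box to a YOLO label line.

    `box` is assumed to hold pixel coordinates as xmin/ymin/xmax/ymax;
    YOLO expects a class id plus centre/size normalised to [0, 1].
    """
    x_center = (box["xmin"] + box["xmax"]) / 2 / img_w
    y_center = (box["ymin"] + box["ymax"]) / 2 / img_h
    width = (box["xmax"] - box["xmin"]) / img_w
    height = (box["ymax"] - box["ymin"]) / img_h
    return f"{class_id} {x_center:.6f} {y_center:.6f} {width:.6f} {height:.6f}"

# Training then uses the standard YOLOv5 entry point, roughly:
#   python train.py --img 1280 --batch 4 --epochs 60 \
#       --data mapillary_speed_limits.yaml --weights yolov5s.pt
# (the dataset yaml name is a placeholder)
```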
With some minor modifications to the source code, and with the trained YOLOv5-S model exported as a TorchScript file, the PyTorch object detection Android demo was adapted for speed limit sign detection. I used Android Studio to upload the app to an old Android phone of mine.
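For reference, YOLOv5 weights can be exported with the repository’s export.py (roughly `python export.py --weights best.pt --include torchscript`), and the resulting TorchScript model can be optimised for the mobile runtime along these lines (file names are placeholders):

```python
import torch
from torch.utils.mobile_optimizer import optimize_for_mobile

# Load the TorchScript model exported from YOLOv5 (file name is a placeholder).
model = torch.jit.load("best.torchscript")
model.eval()

# Optimise the scripted model for mobile inference and save it in a format
# the PyTorch Android (Lite) runtime can load.
optimized = optimize_for_mobile(model)
optimized._save_for_lite_interpreter("speed_limit_yolov5s.ptl")
```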
Results
To test my new app, I took a virtual drive around Finland in Google Maps on my computer, hunting for speed limit signs. With the signs displayed on the computer monitor, I tried to detect them with my “SpeedLimit” app on the Android phone. The old Motorola G7 struggled a bit running the model, taking several seconds to process a single frame. A montage of the results is shown below.
Discussion
Overall, the YOLOv5-S model was excellent at detecting the speed limit signs and placing the bounding boxes. However, the classification of the exact speed limit readings on the signs was a bit lacklustre. For some specific readings, most notably 80 and 120, the model was consistently unable to classify the signs correctly: signs with limits of 80 were often misclassified as 60, and signs with limits of 120 were often misclassified as 20. These are fairly understandable mistakes, as the visual appearance of the numbers is notably similar.
This outcome illustrates why it is generally considered a good idea to train the traffic sign detection and sign content classification models separately. The YOLO object detection model is trained with a multi-component loss function, combining separate terms for detection (bounding box fitting), objectness, and classification. This can make it harder for the model to reach an optimal fit, as improvements in the classification term may come at the cost of a worse detection term. Training separate models for the two tasks would allow for a more streamlined optimisation/learning process. On the other hand, the amount of training data was a bit limited, and it could be argued that the shortcomings of the model could be fixed with additional data.
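Schematically, the YOLOv5 training objective is a weighted sum of these terms,

L_total = λ_box · L_box + λ_obj · L_obj + λ_cls · L_cls,

so gradient updates that reduce the classification term L_cls can simultaneously push the box term L_box in the wrong direction when both tasks share the same backbone, which is the tension described above.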