The Athena Blog

Is it possible to locate & classify VRUs using just one infrared camera?

Written by Marketing Team on 2 February 2023

In a word, YES (in all conditions, to boot).

It is possible to locate and classify VRUs in the dark, using just one infrared camera. 

Owl AI has done this using the thermal signatures of the vulnerable road users (VRUs) themselves, one infrared camera, and carefully trained convolutional neural networks to detect, recognize, and identify them in all conditions. We call this our Thermal Ranger™, and our Monocular 3D Thermal Ranger Computer Vision for ADAS & Autonomous Vehicles Evaluation Kit is now available.

This is really exciting in light of increasing regulatory efforts to provide greater protection for VRUs against vehicle collisions.

Detection alone is not enough to support decision-making: each VRU must be classified by type and tagged with a distance from the vehicle. Currently, no commercial sensor incorporates all the features needed to supply the operating modes and the data quality demanded by the requirements above, so Owl undertook the design of a custom sensor.

Let’s talk about what makes this possible.

Owl AI’s Thermal Ranger uses a set of convolutional neural networks (CNNs) that can extract, from a single thermal image, all the information required for automatic emergency braking decisions.

Convolutional neural networks are computer simulations of pliable groups of neurons that operate by configuring themselves to produce a positive response when they detect a strong correlation between objects in a new image and objects in a series of images that they have processed in training. CNN training is similar to the method used to train humans for recognition tasks. In training radiologists to read x-rays, for instance, the saying is, “show interns 50,000 images and tell them what they mean and then they will be able to quickly decide what any new images show”.
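
To make the "strong correlation" idea concrete, here is a minimal Python sketch of what a single convolutional filter does: it slides a small pattern over an image and responds most strongly where the image matches that pattern. The image, the kernel values, and the function names are all made up for illustration. In a real CNN the kernel weights are learned during training rather than written by hand, and this is not a description of Owl's implementation.

```python
def cross_correlate(image, kernel):
    """Slide `kernel` over `image` (no padding) and return the response map."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[r + i][c + j] * kernel[i][j]
                for i in range(kh) for j in range(kw))
            for c in range(out_w)
        ]
        for r in range(out_h)
    ]


def relu(response):
    """Keep positive responses and zero out the rest."""
    return [[max(0, v) for v in row] for row in response]


# Hypothetical 5x5 "thermal" image: a warm vertical bar on a cool background.
image = [[0, 0, 9, 0, 0] for _ in range(5)]

# Hypothetical hand-written kernel that matches a bright vertical bar.
kernel = [
    [-1, 2, -1],
    [-1, 2, -1],
    [-1, 2, -1],
]

response = relu(cross_correlate(image, kernel))
# The response is positive only in the middle column, where the kernel lines up with the bar.
```

Training adjusts many such kernels at once, so that the patterns they respond to end up being the ones that distinguish the objects in the training images.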

The trick is to pick the right set of images for training to minimize both missed pathology and over-interpretation.

This is where supervision comes in.

Supervision is applied during training to inform a CNN what it has found, equipping the CNN to report findings in real terms. The Thermal Ranger system needs two types of scene information to formulate its report on the scene contents: the identification of each VRU with its location in the image, and the distance from the camera to each identified object. Both of these determinations are performed by CNNs.

[Figure: inference pipeline]

The Thermal Ranger’s process, as seen here, runs in four stages: it corrects lens distortion, applies two CNNs (recognition to identify objects and ranging to supply depth perception), fuses their outputs, and applies colorization.

This entire set of processes combining CNNs and conventional computation is called an inference pipeline (IP). The pedestrian thermal images can be seen with their associated 2D bounding boxes and range labels.
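
As a rough illustration of how such a pipeline might be wired together, here is a minimal Python sketch. Everything in it is an assumption made for the example: the function and class names are hypothetical, the two CNNs are replaced by stubs that return made-up values, the fusion rule (median depth inside each bounding box) is one plausible choice rather than Owl's documented method, and colorization is interpreted here as color-coding each object by range.

```python
from dataclasses import dataclass
from statistics import median


@dataclass
class Detection:
    label: str   # e.g. "pedestrian"
    box: tuple   # (row0, col0, row1, col1), end-exclusive pixel coordinates


@dataclass
class RangedObject:
    label: str
    box: tuple
    range_m: float   # estimated distance from the camera, in meters
    color: str       # display color chosen from the range


def correct_lens_distortion(frame):
    # Placeholder: a real system would remap pixels using calibrated lens parameters.
    return frame


def stub_recognition_cnn(frame):
    # Stand-in for the recognition network: returns one made-up detection.
    return [Detection("pedestrian", (1, 1, 3, 3))]


def stub_ranging_cnn(frame):
    # Stand-in for the ranging network: a made-up per-pixel depth map in meters,
    # 12 m inside the pedestrian's box and 40 m everywhere else.
    return [
        [12.0 if 1 <= r < 3 and 1 <= c < 3 else 40.0 for c in range(len(frame[0]))]
        for r in range(len(frame))
    ]


def fuse(detections, depth_map):
    # Assumed fusion rule: an object's range is the median depth inside its box.
    fused = []
    for det in detections:
        r0, c0, r1, c1 = det.box
        depths = [depth_map[r][c] for r in range(r0, r1) for c in range(c0, c1)]
        fused.append((det, median(depths)))
    return fused


def colorize(range_m, near_m=20.0):
    # Assumed colorization rule: red for near objects, green otherwise.
    return "red" if range_m < near_m else "green"


def run_pipeline(frame, recognize=stub_recognition_cnn, estimate_depth=stub_ranging_cnn):
    frame = correct_lens_distortion(frame)   # conventional computation
    detections = recognize(frame)            # CNN 1: what and where
    depth_map = estimate_depth(frame)        # CNN 2: how far
    return [
        RangedObject(det.label, det.box, rng, colorize(rng))
        for det, rng in fuse(detections, depth_map)
    ]


frame = [[0] * 4 for _ in range(4)]   # dummy 4x4 thermal frame
results = run_pipeline(frame)
```

Passing the two networks in as arguments keeps the conventional steps (distortion correction, fusion, colorization) separate from the learned ones, so either stub could be swapped for a trained model without touching the rest of the sketch.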


But there's so much more to detecting and ranging VRUs from a moving automobile:

  • avoiding size comparisons
  • selecting the right camera
  • detection based on a thermal signature
  • fusing objects
  • continuous operation

These are all challenges and features that Owl has addressed.

[Video: 3D labeled and sized]

To learn more and start exploring whether the Monocular 3D Thermal Ranging Computer Vision platform can help you improve visibility in all conditions for your ADAS suite, access our white paper “CNN for Thermal Imaging”.

Topics: OwlAI, Thermal Ranging Solutions