Taimouri, V., Cordonnier, M., Lee, K., and Goodman, B., "Distance Map Estimation of Stereoscopic Images Using Deep Neural Networks for Autonomous Vehicle Driving," SAE Technical Paper 2017-01-0071, 2017, doi:10.4271/2017-01-0071.
Whether a vehicle operates in autonomous or occupant-piloted mode, an array of sensors, including stereo cameras, can be used to guide it. State-of-the-art distance map estimation algorithms, e.g., stereo matching, usually detect corresponding features in the stereo images and estimate disparities to compute the distance map of a scene. However, depending on image size, content, and quality, the feature extraction process can become inaccurate, unstable, and slow. In contrast, we employ deep convolutional neural networks and propose two architectures to estimate distance maps from stereo images. The first architecture is a simple, generic network that identifies which features to extract and how to combine them in a multi-resolution framework. The second architecture is more specialized: it extracts local similarity information from the two images, which is used for stereo feature matching, and fuses it at multiple resolutions to generate the distance map. We generate several synthetic data sets for end-to-end training of the networks. Note that pavements are excluded from the ground truth distance maps in the synthetic data sets. This helps the networks put more emphasis on the surrounding objects than on the pavement and yields more accurate predictions for those objects, which is crucial for path planning and active safety systems in autonomous vehicles. Evaluation on 500 stereo images shows that both networks estimate the distance to surrounding objects accurately in close to real time, a pivotal characteristic for autonomous vehicle operation. However, the second network is notably more accurate than the first for distant objects.
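As a rough illustration of two ideas mentioned in the abstract, the sketch below shows the classical pinhole relation that stereo matching uses to turn a disparity into a distance (Z = f · B / d for a rectified pair), and an error measure that skips pixels without ground truth, as might be done for excluded pavement. It is not the authors' implementation: the function names, the use of `None` to mark excluded pixels, and the camera values in the usage note are all assumptions made for illustration.

```python
from typing import List, Optional


def disparity_to_distance(disparity_px: float,
                          focal_px: float,
                          baseline_m: float) -> Optional[float]:
    """Classical stereo relation Z = f * B / d for a rectified image pair.

    Returns None when the disparity is not positive (no valid match).
    """
    if disparity_px <= 0:
        return None
    return focal_px * baseline_m / disparity_px


def masked_mean_abs_error(pred: List[List[float]],
                          truth: List[List[Optional[float]]]) -> float:
    """Mean absolute distance error over pixels that have ground truth.

    Assumption: excluded pixels (e.g. pavement) are marked with None in
    `truth`, so they contribute nothing to the error.
    """
    total = 0.0
    count = 0
    for pred_row, truth_row in zip(pred, truth):
        for p, t in zip(pred_row, truth_row):
            if t is None:
                continue  # skip excluded pixels
            total += abs(p - t)
            count += 1
    return total / count if count else 0.0
```

For example, with a hypothetical focal length of 700 pixels and a 0.5 m baseline, a disparity of 35 pixels corresponds to a distance of 10 m. Because distance is inversely proportional to disparity, small disparity errors translate into large distance errors for far objects, which is one reason distant objects are harder to estimate.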