CGCANet: Context-Guided Cost Aggregation Network for Robust Stereo Matching
DOI:
https://doi.org/10.31577/cai_2024_2_505
Keywords:
Stereo matching, cost computation, cost aggregation, disparity refinement
Abstract
Stereo matching methods based on convolutional neural networks (CNNs) have achieved significant progress in recent years. However, they still generalize poorly across diverse datasets because of their limited robustness. To address this, we aim to enhance robustness in the three main steps of stereo matching, namely cost computation, cost aggregation, and disparity refinement. For cost computation, we propose an atrous pyramid grouping convolution (APGC) module, which combines local context information with multi-scale features generated by the CNN backbone to obtain a more discriminative feature representation. For cost aggregation, we propose a multi-scale cost aggregation (MSCA) module, which fuses cost volumes at three different scales into the 3D hourglass networks to improve the initial disparity estimate. For disparity refinement, we present a disparity refinement (DR) module that uses color guidance from the left input image and several convolutional residual blocks to obtain a more accurate disparity estimate. With these three modules, we propose an end-to-end context-guided cost aggregation network (CGCANet) for robust stereo matching. To evaluate the proposed modules and CGCANet, we conduct comprehensive experiments on the challenging SceneFlow, KITTI 2015, and KITTI 2012 datasets, which show consistent and competitive improvement over existing stereo matching methods.
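For readers unfamiliar with the terms "cost computation" and "cost aggregation", the following is a minimal classical sketch on a single scanline. It is not the APGC, MSCA, or DR modules of CGCANet, which are learned CNN components. It assumes an absolute-difference matching cost, a box-filter aggregation, and winner-take-all disparity selection. All function names, the `INVALID_COST` penalty, and the toy intensity values are hypothetical and chosen only for illustration.

```python
# Illustrative only: a hand-crafted 1D analogue of the stereo pipeline,
# NOT the learned modules described in the abstract.

INVALID_COST = 255.0  # assumed penalty where the right-image pixel is out of range


def cost_volume(left, right, max_disp):
    """Cost computation: cost[d][x] = |left[x] - right[x - d]|."""
    width = len(left)
    return [
        [
            abs(left[x] - right[x - d]) if x - d >= 0 else INVALID_COST
            for x in range(width)
        ]
        for d in range(max_disp)
    ]


def aggregate(cost, radius):
    """Cost aggregation: average each disparity slice over a window along x."""
    aggregated = []
    for row in cost:
        smoothed = []
        for x in range(len(row)):
            window = row[max(0, x - radius): x + radius + 1]
            smoothed.append(sum(window) / len(window))
        aggregated.append(smoothed)
    return aggregated


def winner_take_all(cost):
    """Disparity selection: pick the lowest-cost disparity at every pixel."""
    width = len(cost[0])
    return [min(range(len(cost)), key=lambda d: cost[d][x]) for x in range(width)]


# Toy scanlines: the left row equals the right row shifted by 2 pixels.
right = [10, 50, 90, 30, 70, 20, 60, 40]
left = [0, 0, 10, 50, 90, 30, 70, 20]

raw_cost = cost_volume(left, right, max_disp=4)
disparity = winner_take_all(aggregate(raw_cost, radius=1))
print(disparity)
```

Away from the left border, where the penalty for out-of-range pixels dominates the window, the recovered disparity matches the 2-pixel shift. Learned approaches such as the one summarized above replace the hand-crafted cost and filter with CNN features and 3D convolutions, and add a refinement stage on top of the initial estimate.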