“…It has three SGA layers, two LGA layers and fifteen 3D convolutional layers for cost aggregation.

(9,12), (7,14), (5,16), (3,18), (17,20), (15,22), (13,24), (11,26), …

No.        Layer                                     Output size
(6)        from (3), 3×3 conv                        1/3H×1/3W×32
(7)        3×3 conv (no bn & relu)                   1/3H×1/3W×640
(8)        split, reshape, normalize                 4×1/3H×1/3W×5×32
(9)-(11)   from (6), repeat (6)-(8)                  4×1/3H×1/3W×5×32
(12)       from (1), 3×3 conv                        H×W×16
(13)       3×3 conv (no bn & relu)                   H×W×75
(14)       split, reshape, normalize                 H×W×75
(15)-(17)  from (12), repeat (12)-(14)               H×W×75

Cost Aggregation
input      4D cost volume                            1/3H×1/3W×48×64
[1]        3×3×3, 3D conv                            1/3H×1/3W×48×32
[2]        SGA layer: weight matrices from (5)       1/3H×1/3W×48×32
[3]        3×3×3, 3D conv                            1/3H×1/3W×48×32
output     3×3×3, 3D to 2D conv, upsampling          H×W×193
           softmax, regression, loss weight: 0.2     H×W×1
[4]        3×3×3, 3D conv, stride 2                  1/6H×1/6W×48×48
[5]        3×3×3, 3D conv, stride 2                  1/12H×1/12W×48×64
[6]        3×3×3, 3D deconv, stride 2                1/6H×1/6W×48×48
[7]        3×3×3, 3D conv                            1/6H×1/6W×48×48
[8]        3×3×3, 3D deconv, stride 2                1/3H×1/3W×48×32
[9]        3×3×3, 3D conv                            1/3H×1/3W×48×32
[10]       SGA layer: weight matrices from (8)       1/3H×1/3W×48×32
output     3×3×3, 3D to 2D conv, upsampling          H×W×193
           softmax, regression, loss weight: 0.6     H×W×1
[11]       3×3×3, 3D conv...…”
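The "split, reshape, normalize" rows turn a 640-channel tensor into four groups of 5×32 weights (640 = 4 × 5 × 32). The sketch below shows that bookkeeping for a single pixel in plain Python. The function name `split_reshape_normalize`, the channel ordering within the 640-vector, and the choice of L1 normalization across the five weights are assumptions made for illustration; the excerpt only says "normalize".

```python
def split_reshape_normalize(channels, groups=4, weights=5, features=32):
    """Split one pixel's channel vector into `groups` blocks, reshape each
    block to weights x features, and normalize across the weights axis.

    Assumptions (not stated in the excerpt): channels are laid out
    group-major, then weight-major, and normalization is L1.
    """
    block_size = weights * features
    if len(channels) != groups * block_size:
        raise ValueError("expected %d channels" % (groups * block_size))

    out = []
    for g in range(groups):
        block = channels[g * block_size:(g + 1) * block_size]
        # Reshape the flat block into a weights x features matrix.
        mat = [list(block[w * features:(w + 1) * features])
               for w in range(weights)]
        # L1-normalize the `weights` values belonging to each feature channel.
        for f in range(features):
            norm = sum(abs(mat[w][f]) for w in range(weights)) or 1.0
            for w in range(weights):
                mat[w][f] /= norm
        out.append(mat)
    return out


# Hypothetical input: one pixel of the 1/3H×1/3W×640 tensor.
pixel = [float(i % 7 + 1) for i in range(640)]
guided = split_reshape_normalize(pixel)
```

For a full tensor the same operation would run at every spatial location, giving the 4×1/3H×1/3W×5×32 shape listed in the table.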
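The "softmax, regression" rows reduce the H×W×193 output to a single disparity per pixel (H×W×1). A minimal sketch of that step for one pixel is given below, written as the usual expected-value (soft-argmax) regression. It assumes the 193 values are scores where higher means more likely, and that the candidates are the integer disparities 0 to 192; both the sign convention and the sample scores are assumptions, not details taken from the excerpt.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def regress_disparity(scores):
    """Expected disparity: sum over d of d * softmax(scores)[d]."""
    probs = softmax(scores)
    return sum(d * p for d, p in enumerate(probs))


NUM_DISPARITIES = 193  # matches the H×W×193 output row

# Hypothetical per-pixel scores with a sharp, symmetric peak at disparity 40.
scores = [-5.0 * abs(d - 40) for d in range(NUM_DISPARITIES)]
disparity = regress_disparity(scores)
```

Because the result is a probability-weighted average rather than an argmax, it is differentiable and can take sub-pixel values, which is what allows a regression loss to be applied to each of the weighted outputs.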