“…Different voxel grids are selected for the various datasets; specifically, 2048, 512, 512, and 512 are chosen for the DVS128-Gait-Day, ASL-DVS, N-MNIST, and HARDVS datasets, respectively. Considering the spatiotemporal discrepancy across datasets, we set the scale (v_h, v_w, v_t) of the voxel grid to (10, 10, 10) for ASL-DVS, (4, 4, 4) for DVS128-Gait-Day, (20, 2, 2) for N-MNIST, and (50, 30, 20) for HARDVS. When building graphs for the voxel branch, the threshold R is set to 2.…”
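The settings quoted above can be collected into a small configuration, followed by a minimal sketch of how such a voxel graph might be built. This is not the authors' implementation. It assumes that the per-dataset counts (2048 or 512) are the number of voxels retained, that the densest voxels are the ones kept, that events arrive as (y, x, t) rows already expressed in units compatible with the scale, and that R is a Euclidean radius measured in voxel-index units. The names `VOXEL_CONFIG`, `voxelize`, and `radius_edges` are hypothetical.

```python
import numpy as np

# Values quoted in the text: (number of voxels, (v_h, v_w, v_t)) per dataset.
VOXEL_CONFIG = {
    "DVS128-Gait-Day": (2048, (4, 4, 4)),
    "ASL-DVS": (512, (10, 10, 10)),
    "N-MNIST": (512, (20, 2, 2)),
    "HARDVS": (512, (50, 30, 20)),
}
R = 2  # graph-building threshold for the voxel branch


def voxelize(events, scale, num_voxels):
    """Bin events into voxels and keep the most populated ones.

    events: (N, 3) array of (y, x, t) coordinates (layout is an assumption).
    Returns the integer voxel indices and event counts of the kept voxels.
    Keeping the densest voxels is an assumption, not stated in the text.
    """
    idx = np.floor(events / np.asarray(scale)).astype(np.int64)
    coords, counts = np.unique(idx, axis=0, return_counts=True)
    order = np.argsort(-counts, kind="stable")[:num_voxels]
    return coords[order], counts[order]


def radius_edges(coords, radius):
    """Connect voxel pairs whose index-space distance is at most `radius`.

    Uses a dense pairwise distance matrix, so memory grows quadratically
    with the number of voxels. Returns a (2, E) array of directed edges.
    """
    diff = coords[:, None, :] - coords[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    src, dst = np.nonzero((dist <= radius) & (dist > 0))  # no self-loops
    return np.stack([src, dst])
```

For example, `voxelize(events, *reversed(VOXEL_CONFIG["ASL-DVS"]))` would not work because the tuple order is (count, scale); unpack explicitly instead, as in `n, scale = VOXEL_CONFIG["ASL-DVS"]` followed by `voxelize(events, scale, n)` and `radius_edges(coords, R)`.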