This work describes the development of a vision-based tactile sensor system that uses images from the tactile sensor, together with input loads applied under various motions, to train a neural network that estimates tactile contact position, contact area, and force distribution. The study also addresses practical aspects such as the choice of thickness and material for the tactile fingertips and their surface tendency. The vision-based tactile sensor equipment interacts with an actuating motion controller, a force gauge, and a control PC (personal computer) running LabVIEW software. Images were acquired with a compact stereo camera setup mounted inside the elastic body to observe and measure the deformation caused by the motion and input load. The vision-based tactile sensor test bench was used to collect the output contact position, angle, and force distribution produced by randomly selected input loads for translational motion in the X, Y, and Z directions and rotational motion about Rx and Ry. The retrieved image information, contact position, contact area, and force distribution from different input loads with specified 3D position and angle are used for deep learning. A VGG-16 convolutional neural network classification model has been modified into a regression network model, and transfer learning was applied to suit the regression task of estimating contact position and force distribution. Several experiments were carried out using thick and thin tactile sensors of various shapes, such as circle, square, and hexagon, to better validate the predicted contact position, contact area, and force distribution.