Stochastic gradient descent (SGD) and its variants are among the most successful approaches for solving large-scale optimization problems. At each iteration, SGD employs an unbiased estimator of the full gradient computed from a single randomly selected data point. Hence, it scales well with problem size, is very attractive for handling truly massive datasets, and holds significant potential for solving large-scale inverse problems. In this work, we rigorously establish its regularizing property under an a priori early stopping rule for linear inverse problems, and prove convergence rates under the canonical sourcewise condition. This is achieved by combining tools from classical regularization theory and stochastic analysis. Further, we analyze its preasymptotic weak and strong convergence behavior in order to explain the fast initial convergence typically observed in practice. The theoretical findings shed light on the performance of the algorithm, and are complemented by illustrative numerical experiments.
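To make the setting concrete, the following is a minimal sketch (not the paper's implementation) of SGD applied to a linear inverse problem $Ax = y^\delta$, where each step uses one randomly selected row of $A$ and the iteration is terminated by an a priori stopping index. The problem sizes, the step-size schedule, and the names (`eta0`, `k_max`, `delta`) are illustrative assumptions, not quantities taken from the paper.

```python
import numpy as np

# Illustrative sketch: SGD for a linear inverse problem A x = y^delta,
# using one randomly chosen data point per step and an a priori
# early-stopping index k_max. All parameter choices are assumptions.

rng = np.random.default_rng(0)

n, d = 500, 100                       # number of data points / unknowns
A = rng.standard_normal((n, d)) / np.sqrt(n)
x_true = rng.standard_normal(d)
delta = 1e-2                          # noise level (assumed known a priori)
y = A @ x_true + delta * rng.standard_normal(n)

x = np.zeros(d)                       # initial guess
eta0 = 1.0                            # base step size (illustrative choice)
k_max = 2000                          # a priori stopping index, e.g. tied to delta

for k in range(1, k_max + 1):
    i = rng.integers(n)               # single randomly selected data point
    eta = eta0 / np.sqrt(k)           # decaying step sizes (one common schedule)
    residual = A[i] @ x - y[i]        # per-sample residual
    x -= eta * residual * A[i]        # SGD update with the per-sample gradient

print("relative error:", np.linalg.norm(x - x_true) / np.linalg.norm(x_true))
```

The per-sample gradient $(a_i^\top x - y_i)\, a_i$ is, up to a fixed scaling, an unbiased estimator of the full gradient of the least-squares functional when the index $i$ is drawn uniformly at random; stopping at a finite `k_max` chosen in advance plays the role of the a priori early stopping rule discussed in the abstract.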