Aftershock occurrence is characterized by scaling behaviors with nearly universal exponents. At the same time, deviations from universality have been proposed as a tool to discriminate aftershocks from foreshocks. Here we show that the change in the rheological behavior of the crust, from velocity weakening to velocity strengthening, is a viable mechanism to explain the statistical features of both aftershocks and foreshocks. More precisely, we present a model in which the seismic fault is described as a velocity-weakening elastic layer coupled to a velocity-strengthening visco-elastic layer. We show that the model reproduces the statistical properties of aftershocks in instrumental catalogs quantitatively, and largely independently of the values of the model parameters. We also find that large earthquakes are often preceded by a preparatory phase characterized by the occurrence of foreshocks. The foreshock magnitude distribution is significantly flatter than that of aftershocks, in agreement with recent results for forecasting tools based on foreshocks.
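As an illustration of what a "flatter" magnitude distribution means in practice, the following minimal sketch compares the Gutenberg-Richter b-value of two synthetic catalogs using the Aki maximum-likelihood estimator. It is not the model or the analysis of this work: the function names, the completeness magnitude of 2.0, and the b-values 1.0 and 0.7 are assumptions chosen only for illustration, and the catalogs are random draws rather than real or simulated seismicity.

```python
import math
import random


def b_value(magnitudes, m_c, dm=0.0):
    """Aki maximum-likelihood b-value for events with magnitude >= m_c.

    dm is the magnitude bin width (Utsu correction); use 0.0 for
    continuous magnitudes, as in the synthetic catalogs below.
    """
    sample = [m for m in magnitudes if m >= m_c]
    if len(sample) < 2:
        raise ValueError("need at least two events above m_c")
    mean_m = sum(sample) / len(sample)
    return math.log10(math.e) / (mean_m - (m_c - dm / 2.0))


def synthetic_magnitudes(b, m_c, n, rng):
    """Draw n magnitudes from a Gutenberg-Richter law, log10 N = a - b M.

    Under this law, M - m_c is exponentially distributed with rate b * ln(10).
    """
    beta = b * math.log(10.0)
    return [m_c + rng.expovariate(beta) for _ in range(n)]


# Hypothetical catalogs: the b-values are illustrative assumptions,
# not results reported in this work.
rng = random.Random(0)
aftershocks = synthetic_magnitudes(b=1.0, m_c=2.0, n=20000, rng=rng)
foreshocks = synthetic_magnitudes(b=0.7, m_c=2.0, n=20000, rng=rng)

b_after = b_value(aftershocks, m_c=2.0)
b_fore = b_value(foreshocks, m_c=2.0)

# A flatter magnitude distribution corresponds to a smaller b-value.
print(f"aftershock b = {b_after:.2f}, foreshock b = {b_fore:.2f}")
```

A smaller estimated b-value means that large events are relatively more frequent, which is the sense in which a magnitude distribution is flatter.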