On-line and batch learning of a perceptron in a discrete weight space, where each weight can take 2L + 1 different values, are examined analytically and numerically. The learning algorithm is based on training a continuous perceptron while predicting with its clipped weights. The learning is described by a new set of order parameters, composed of the overlaps between the teacher and the continuous/clipped students. Several scenarios are examined, among them on-line learning with discrete/continuous transfer functions and off-line Hebb learning. The generalization error of the clipped weights decays asymptotically as exp(−Kα²)/exp(−e^{|λ|α}) in the case of on-line learning with binary/continuous activation functions, respectively, where α is the number of examples divided by N, the size of the input vector, and K is a positive constant that decays linearly with 1/L. For finite N and L, perfect agreement between the discrete student and the teacher is obtained for α ∝ L ln(NL). A crossover to a generalization error ∝ 1/α, characteristic of continuous weights with binary output, occurs for synaptic depth L > O(√N).
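The following is a minimal numerical sketch of the clip-and-predict idea described above: a student perceptron is trained on-line on its continuous weights (here with a simple Hebb-like update), while predictions and the generalization error are computed from the clipped, 2L + 1-level copy of those weights, together with the overlap order parameters between the teacher and the continuous/clipped students. The teacher distribution, the clipping rescaling, and the learning rate are illustrative assumptions, not the paper's exact prescription.

```python
import numpy as np

rng = np.random.default_rng(0)

N = 500          # input dimension
L = 2            # synaptic depth: clipped weights take the 2L+1 values -L, ..., L
alpha_max = 40   # number of examples in units of N
P = alpha_max * N

def clip_weights(w, L):
    """Map continuous weights onto the 2L+1 discrete levels -L,...,L (illustrative rule)."""
    # rescale so the typical weight magnitude spans the discrete range, then round
    scale = L / (np.sqrt(np.mean(w**2)) + 1e-12)
    return np.clip(np.round(w * scale), -L, L)

# teacher with discrete weights (assumption: uniform over the 2L+1 levels)
w_teacher = rng.integers(-L, L + 1, size=N).astype(float)

# continuous student, trained on-line; predictions use the clipped copy
w_student = np.zeros(N)
eta = 1.0  # learning rate (illustrative)

for mu in range(P):
    x = rng.standard_normal(N)        # random input example
    y = np.sign(w_teacher @ x)        # teacher's binary label
    # Hebb-like on-line update of the *continuous* weights
    w_student += (eta / np.sqrt(N)) * y * x

    if (mu + 1) % (5 * N) == 0:
        w_clip = clip_weights(w_student, L)
        # overlap order parameters: teacher vs. continuous and clipped students
        R_cont = (w_teacher @ w_student) / (np.linalg.norm(w_teacher) * np.linalg.norm(w_student))
        R_clip = (w_teacher @ w_clip) / (np.linalg.norm(w_teacher) * (np.linalg.norm(w_clip) + 1e-12))
        # generalization error of a binary perceptron with overlap R: eg = arccos(R)/pi
        eg_clip = np.arccos(np.clip(R_clip, -1.0, 1.0)) / np.pi
        print(f"alpha={(mu + 1) / N:5.1f}  R_cont={R_cont:.4f}  R_clip={R_clip:.4f}  eg_clip={eg_clip:.5f}")
```

In this sketch the generalization error of the clipped student is obtained from its overlap R with the teacher via eg = arccos(R)/π, the standard relation for a binary-output perceptron with random Gaussian inputs.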