In this article we study the problem of credal learning, a general form of weakly supervised learning in which instances are associated with credal sets (i.e., closed, convex sets of probabilities), which are assumed to represent the partial knowledge of an annotating agent about the true conditional label distribution. A variety of algorithms have been proposed in this setting, chief among them generalized risk minimization, a class of algorithms that extends empirical risk minimization. Despite its popularity and promising empirical results, however, the theoretical properties of this approach (as well as those of credal learning more generally) have not been previously studied. In this article we address this gap by studying the problem of credal learning from the learning-theoretic and complexity-theoretic perspectives. We provide, in particular, three main contributions: 1) we show that, under weak assumptions about the accuracy of the annotating agent, credal learning is learnable in the convex learning setting, and we provide effective risk bounds; 2) we study the properties of generalized risk minimization and, in particular, identify the optimal instance of this approach, which we call trade-off risk minimization; 3) we study the computational complexity of generalized risk minimization, presenting effective algorithms based on gradient descent and providing necessary and sufficient conditions for their computational efficiency.
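To fix intuition, one common instance of generalized risk minimization replaces the unobserved pointwise loss with its upper expectation over the credal set attached to each instance. The display below is an illustrative sketch in our own notation (with $Q_i$ the credal set for instance $x_i$, $\mathcal{H}$ a hypothesis class, and $\ell$ a loss function), not necessarily the exact objective analyzed in this article:
\[
\hat{h} \;\in\; \operatorname*{arg\,min}_{h \in \mathcal{H}} \; \frac{1}{n} \sum_{i=1}^{n} \, \max_{p \in Q_i} \, \mathbb{E}_{Y \sim p}\!\left[ \ell\big(h(x_i), Y\big) \right].
\]
Other instances of the same scheme are obtained by replacing the inner maximum with a different aggregation over $Q_i$ (e.g., a minimum or a convex combination of the two).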