In the classical team orienteering problem (TOP), a fixed fleet of vehicles is employed, each of them with a limited driving range. The manager has to decide about the subset of customers to visit, as well as the visiting order (routes). Each customer offers a different reward, which is gathered the first time that it is visited. The goal is then to maximize the total reward collected without exceeding the driving range constraint. This paper analyzes a more realistic version of the TOP in which the driving range limitation is considered as a soft constraint: every time that this range is exceeded, a penalty cost is triggered. This cost is modeled as a piece-wise function, which depends on factors such as the distance of the vehicle to the destination depot. As a result, the traditional reward-maximization objective becomes a non-smooth function. In addition, a second objective, regarding the design of balanced routing plans, is considered as well. A mathematical model for this non-smooth and bi-objective TOP is provided, and a biased-randomized algorithm is proposed as a solving approach.